NVIDIA is pushing robotics forward with the launch of Project GR00T, a multimodal AI project aimed at powering future humanoid robots with a general-purpose foundation model.
At the GTC conference at the San Jose McEnery Convention Center, NVIDIA showcased Project GR00T. The project is built around a general-purpose foundation model that lets humanoid robots take text, speech, video, and live demonstrations as inputs and translate them into general actions. It is supported by NVIDIA's Isaac robotics platform tools, including the new Isaac Lab for reinforcement learning.
"Building foundational models for general humanoid robots is one of the most exciting challenges in the field of artificial intelligence today," said Jensen Huang, CEO of NVIDIA, in a statement. "Now, the technologies driving the advancement of robotics are converging, enabling robotics experts from around the world to take huge strides towards achieving general artificial intelligence robots."
To run GR00T on the robots themselves, NVIDIA announced Jetson Thor, a chip designed specifically for humanoid robots. The company also shared significant progress on AI-driven industrial manipulation and on robots capable of navigating unstructured environments.
What does NVIDIA's Project GR00T offer?
Although the name may sound like Marvel's Groot, it actually stands for Generalist Robot 00 Technology. According to NVIDIA, the model is designed to understand natural-language text, speech, video, and live demonstrations, mimic human skills such as coordination and dexterity, and generate general actions for navigating, adapting to, and interacting with the real world.
This not only expands what humanoid robots can do but also makes them far simpler to develop and deploy. Essentially, anyone with the appropriate access can program a robot using text and demonstrations as inputs.
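NVIDIA has not published GR00T's architecture or programming interface, but the description above suggests a simple "multimodal observations in, whole-body actions out" contract. The Python sketch below is purely illustrative under that assumption; every class and method name is hypothetical and none of it reflects an actual NVIDIA API.

```python
# Illustrative only: NVIDIA has not published GR00T's architecture or API.
# This sketch shows the general shape of a multimodal "observations in,
# actions out" policy interface; every name here is hypothetical.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Observation:
    """One multimodal input bundle for a single control step."""
    instruction: Optional[str] = None          # natural-language text
    speech_transcript: Optional[str] = None    # ASR output of a spoken command
    camera_frames: List[bytes] = field(default_factory=list)   # encoded RGB frames
    demonstration: Optional[List[List[float]]] = None          # recorded joint trajectories


@dataclass
class Action:
    """A generic whole-body action for a humanoid."""
    joint_targets: List[float]     # target positions for each actuated joint
    gripper_command: float         # 0.0 = open, 1.0 = closed


class GeneralistPolicy:
    """Hypothetical stand-in for a foundation model such as GR00T."""

    def act(self, obs: Observation) -> Action:
        # A real model would fuse all modalities with a learned backbone;
        # here we just return a neutral pose as a placeholder.
        return Action(joint_targets=[0.0] * 28, gripper_command=0.0)


if __name__ == "__main__":
    policy = GeneralistPolicy()
    obs = Observation(instruction="pick up the red cup and place it on the tray")
    print(policy.act(obs))
```

The point of such an interface is that the same policy object could be driven by a typed command, a spoken instruction, or a recorded demonstration without changing the downstream control code.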
In the GTC keynote, Jensen Huang showed several GR00T-powered humanoid robots performing various tasks, including machines from Agility Robotics, Apptronik, Fourier Intelligence, and Unitree Robotics. Deepu Talla, NVIDIA's vice president of robotics and edge computing, who introduced GR00T to the press, noted that the project builds on the latest research in generative AI and transformers but did not share further details about its full capabilities.
It is worth noting that OpenAI, a leader in generative AI, is also pursuing embodied AI and has backed two startups in the field: 1X Technologies and Figure. Figure recently released a video of its robot performing everyday household tasks, such as picking up trash, driven by a vision language model (VLM) from OpenAI, the research lab led by Sam Altman. Both companies have confirmed collaborations with NVIDIA.
Talla said the company cannot yet share details about the model's internal architecture but plans to reveal more about its capabilities in the future. He also noted that for now only a handful of humanoid robot developers, including the companies mentioned above, have early access to the model, though NVIDIA intends to extend it to more humanoid robots and other embodied forms soon.
To ensure that humanoid robots can run complex multimodal models like GR00T, NVIDIA also introduced Jetson Thor, a computing platform for humanoid robots. It is built around NVIDIA's Thor SoC, which pairs a high-performance CPU cluster with a next-generation GPU based on NVIDIA's Blackwell architecture, whose transformer engine delivers 800 teraflops of 8-bit floating-point (FP8) AI performance.
Talla said in the presentation that the system delivers eight times the GPU performance of the previous-generation Jetson Orin and 2.6 times its CPU performance.
The new Isaac robotics tools at the core of GR00T
To realize Project GR00T, NVIDIA leans on its own Isaac robotics platform, which gives developers an end-to-end toolchain for developing, simulating, and deploying AI-driven robots.
Specifically, the company says it uses the all-new Isaac Lab, built on Isaac Sim, to test and train models through parallel simulation in GPU-accelerated virtual environments. It also uses the OSMO compute orchestration service to manage training and simulation workloads concurrently across NVIDIA DGX and OVX systems.
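Isaac Lab's own APIs are not detailed in the announcement, so the snippet below only illustrates the core idea behind GPU-parallel simulation for reinforcement learning: stepping thousands of environment copies in lockstep as one batched operation. The toy vectorized environment is hypothetical and runs on NumPy/CPU purely to show the pattern; it is not Isaac Lab code.

```python
# A minimal sketch of the idea behind GPU-parallel simulation for RL:
# thousands of environment instances are stepped together as one batch.
# This is NOT the Isaac Lab API; the toy vectorized env is hypothetical.
import numpy as np


class ToyVectorEnv:
    """Steps N independent copies of a trivial reach task as one batch."""

    def __init__(self, num_envs: int, obs_dim: int = 8, act_dim: int = 4):
        self.num_envs, self.obs_dim, self.act_dim = num_envs, obs_dim, act_dim
        self.state = np.zeros((num_envs, obs_dim), dtype=np.float32)

    def reset(self) -> np.ndarray:
        self.state = np.random.randn(self.num_envs, self.obs_dim).astype(np.float32)
        return self.state

    def step(self, actions: np.ndarray):
        # All environments advance together; on a GPU this whole batch
        # would be handled by a handful of kernel launches per substep.
        self.state[:, : self.act_dim] += 0.1 * actions
        rewards = -np.linalg.norm(self.state, axis=1)   # closer to origin = better
        dones = rewards > -0.5
        return self.state, rewards, dones


if __name__ == "__main__":
    env = ToyVectorEnv(num_envs=4096)   # massive parallelism is the point
    obs = env.reset()
    for _ in range(100):
        actions = np.random.uniform(-1, 1, (env.num_envs, env.act_dim)).astype(np.float32)
        obs, rewards, dones = env.step(actions)
    print("mean reward after 100 steps:", rewards.mean())
```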
In addition to these capabilities, the Isaac robot platform has also introduced two products for specific use cases: Isaac Manipulator and Isaac Perceptor.
Talla explained that Isaac Manipulator provides GPU-accelerated libraries and purpose-built foundation models to help robotic-arm manufacturers improve their products with state-of-the-art motion and dexterity. It includes models for detecting objects, estimating their 6D poses, tracking them, and even producing dense grasp predictions.
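NVIDIA has not published the Isaac Manipulator interfaces here, so the following is only a hedged sketch of the pipeline the description implies: detect objects, estimate their 6D poses, then rank dense grasp predictions. All functions and types below are hypothetical placeholders, not the Isaac Manipulator API.

```python
# Illustrative pipeline only: the stages mirror what the article describes
# (detect objects, estimate 6D pose, predict grasps), but every function
# and type here is a hypothetical placeholder.
from dataclasses import dataclass
from typing import List


@dataclass
class Pose6D:
    position: tuple      # (x, y, z) in meters, camera frame
    quaternion: tuple    # (w, x, y, z) orientation


@dataclass
class Grasp:
    pose: Pose6D
    width: float         # gripper opening in meters
    score: float         # predicted grasp quality, 0..1


def detect_objects(rgb_image) -> List[dict]:
    """Stand-in for a detection model; returns labeled 2D bounding boxes."""
    return [{"label": "mug", "bbox": (120, 80, 200, 160)}]


def estimate_pose(rgb_image, depth_image, detection: dict) -> Pose6D:
    """Stand-in for a 6D pose estimator operating on a detected region."""
    return Pose6D(position=(0.4, 0.0, 0.2), quaternion=(1.0, 0.0, 0.0, 0.0))


def predict_grasps(depth_image, pose: Pose6D) -> List[Grasp]:
    """Stand-in for a dense grasp predictor, ranked by quality score."""
    return [Grasp(pose=pose, width=0.06, score=0.92)]


def pick_best_grasp(rgb_image, depth_image) -> Grasp:
    detections = detect_objects(rgb_image)
    poses = [estimate_pose(rgb_image, depth_image, d) for d in detections]
    grasps = [g for p in poses for g in predict_grasps(depth_image, p)]
    return max(grasps, key=lambda g: g.score)


if __name__ == "__main__":
    print(pick_best_grasp(rgb_image=None, depth_image=None))
```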
Isaac Perceptor, meanwhile, provides multi-camera, 360-degree vision for 3D perception and situational awareness using GPU-accelerated AI algorithms, helping robots navigate unstructured environments. NVIDIA offers the technology through its Nova Orin DevKit and is working with partners including ArcBest, BYD, and KION Group to advance autonomous mobile robots in manufacturing and fulfillment.
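Again as a rough illustration rather than the Isaac Perceptor API, the sketch below shows one common way multi-camera 3D perception feeds navigation: depth points from several cameras are transformed into a shared robot-centric frame and fused into an occupancy grid. The camera poses and data here are invented for the example.

```python
# A rough sketch of multi-camera 3D perception for navigation: depth points
# from several cameras are fused into one robot-centric occupancy grid.
# Illustrative only; not the Isaac Perceptor API.
import numpy as np

GRID_SIZE = 200          # 200 x 200 cells
CELL_M = 0.05            # 5 cm per cell -> 10 m x 10 m map


def fuse_into_occupancy(point_clouds, camera_poses):
    """point_clouds: list of (N, 3) arrays in each camera's frame.
    camera_poses: list of (R, t) transforming camera frame -> robot frame."""
    grid = np.zeros((GRID_SIZE, GRID_SIZE), dtype=np.uint8)
    for points, (R, t) in zip(point_clouds, camera_poses):
        robot_frame = points @ R.T + t                              # into robot frame
        near_ground = robot_frame[np.abs(robot_frame[:, 2]) < 1.0]  # obstacle band
        cells = (near_ground[:, :2] / CELL_M + GRID_SIZE // 2).astype(int)
        valid = ((cells >= 0) & (cells < GRID_SIZE)).all(axis=1)
        grid[cells[valid, 1], cells[valid, 0]] = 1                  # mark occupied
    return grid


if __name__ == "__main__":
    # Two fake cameras, one facing forward and one rotated 180 degrees.
    clouds = [np.random.uniform(-2, 2, (500, 3)) for _ in range(2)]
    poses = [(np.eye(3), np.zeros(3)), (np.diag([-1.0, -1.0, 1.0]), np.zeros(3))]
    occ = fuse_into_occupancy(clouds, poses)
    print("occupied cells:", int(occ.sum()))
```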
Michael Newcity, Chief Innovation Officer and President of ArcBest Technologies, said in a statement, "Using the Isaac Perceptor platform in our Vaux Smart Autonomy AMR forklifts and stackers enables better perception, semantic-aware navigation, and 3D mapping to detect obstacles in material handling processes in warehouses, distribution centers, and manufacturing facilities."
The new Isaac platform features are expected to be released in the second quarter of this year, while Project GR00T is still in the early access stage. NVIDIA is accepting applications to grant more humanoid robot developers access to the technology, but a broader public release schedule is currently unclear.