NVIDIA Unveils LATTE3D: Transform Text into 3D Shapes Instantly

2024-03-22

NVIDIA researchers have unveiled LATTE3D, a model that converts text prompts into high-quality 3D shapes within milliseconds. That speed simplifies the creative process: designers can iterate quickly on an idea instead of starting from scratch or searching through asset libraries.

Sanja Fidler, Vice President of AI Research at NVIDIA, stated, "Just a year ago, AI models took an hour to generate this level of quality in 3D visual effects, and even the state-of-the-art techniques took around 10 to 12 seconds. We can now produce results that are an order of magnitude faster than before, allowing creators to convert text into 3D graphics in near real-time to meet the needs of various industries."

The model generates multiple 3D shape options for each text prompt, giving creators a range of choices. A selected object can then be optimized within minutes to improve its quality and exported to graphics applications and platforms such as NVIDIA Omniverse.

Although LATTE3D was trained on datasets of animals and everyday objects, the same architecture can be trained on other types of data. For example, a version trained on 3D plants could help landscape designers quickly populate garden renderings, while a version trained on household items could generate 3D objects for home simulations used to train personal assistant robots.

LATTE3D was trained on NVIDIA A100 Tensor Core GPUs using diverse text prompts generated with ChatGPT, which improves the model's ability to handle the many ways users may describe a 3D object. The paper also reports improved robustness through 3D priors, shape regularization, and model initialization. A two-stage pipeline that combines volume-based and surface-based rendering allows for the rapid generation of meshes with detailed textures.

With LATTE3D, NVIDIA is pushing the boundaries of generative AI, making it faster and more accessible for creators across industries to present their ideas in 3D form. As the technology matures, more applications and use cases are likely to emerge.