Not long ago, creating 3D assets was a demanding task: it required intricate wireframe modeling, professional software, and powerful hardware. Today, that is changing.
Stability AI recently unveiled Stable Fast 3D, a generative AI model that turns a single image into a lifelike 3D asset. According to the company, the model can generate a 3D asset in under half a second.
That is a major leap in efficiency over earlier models, which needed minutes for comparable results. In March of this year, Stability AI's Stable Video 3D (SV3D) took roughly 10 minutes to generate a 3D asset; Stable Fast 3D is about 1,200 times faster.
Stability AI expects the technology to prove valuable across design, architecture, retail, virtual reality, and game development. Users can try it through Stability AI's Stable Assistant chatbot and its API, and the model weights are shared on Hugging Face under Stability AI's community license.
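For developers, a minimal sketch of an API call might look like the following. The endpoint path, form field, and binary GLB response reflect Stability AI's public API documentation at the time of writing and should be verified against the current reference; the API key and file names are placeholders:

```python
# Minimal sketch of calling Stable Fast 3D via Stability AI's REST API.
# Endpoint and field names are assumptions based on the public docs at the
# time of writing -- check https://platform.stability.ai for the current API.
import requests

API_KEY = "sk-..."  # placeholder: your Stability AI API key

response = requests.post(
    "https://api.stability.ai/v2beta/3d/stable-fast-3d",
    headers={"Authorization": f"Bearer {API_KEY}"},
    files={"image": open("input.png", "rb")},  # the single input photo
)
response.raise_for_status()

# The API returns the finished 3D asset as a binary glTF (.glb) file.
with open("asset.glb", "wb") as f:
    f.write(response.content)
```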
So how does Stable Fast 3D generate 3D assets this quickly?
The model is an optimized evolution of TripoSR, which Stability AI built in collaboration with 3D-modeling company Tripo AI. That partnership, announced in March of this year, was already aimed at fast 3D asset generation.
In an accompanying research paper, Stability AI's team details how the model rapidly reconstructs high-quality 3D meshes from a single image, combining several techniques that address long-standing challenges in fast 3D reconstruction while improving output quality.
At its core, Stable Fast 3D uses an enhanced transformer network to generate high-resolution triplanes from the input image: three orthogonal 2D feature planes that together represent the 3D volume. The network is designed to process high-resolution data without a corresponding rise in computational cost, capturing finer detail and reducing aliasing artifacts.
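To make the triplane idea concrete, here is a minimal, illustrative sketch of how a feature vector for a 3D point is assembled from three orthogonal 2D planes. The grid resolution, channel count, and nearest-neighbor lookup are simplifying assumptions for clarity, not Stability AI's implementation, which would use learned planes and bilinear interpolation:

```python
import numpy as np

# A triplane stores 3D information in three orthogonal 2D feature grids
# (XY, XZ, YZ). Hypothetical sizes: 96x96 grids with 32 feature channels,
# filled with random values here in place of a trained network's output.
RES, C = 96, 32
planes = {name: np.random.randn(RES, RES, C) for name in ("xy", "xz", "yz")}

def to_index(u: float) -> int:
    """Map a coordinate in [-1, 1] to a grid index (nearest neighbor;
    a real model would interpolate bilinearly)."""
    return int(np.clip((u + 1) / 2 * (RES - 1), 0, RES - 1))

def triplane_features(x: float, y: float, z: float) -> np.ndarray:
    """Project the 3D point onto each plane, look up a feature vector
    from each, and sum them into one descriptor for that point."""
    f_xy = planes["xy"][to_index(x), to_index(y)]
    f_xz = planes["xz"][to_index(x), to_index(z)]
    f_yz = planes["yz"][to_index(y), to_index(z)]
    return f_xy + f_xz + f_yz  # a small MLP would decode this into geometry/color

print(triplane_features(0.1, -0.4, 0.7).shape)  # (32,)
```

Because each plane is only 2D, raising the resolution grows memory quadratically rather than cubically, which is what makes high-resolution triplanes tractable.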
The team also made notable progress on material and lighting estimation: the material-estimation network takes a novel probabilistic approach to predicting global metallic and roughness values, improving the realism and consistency of the rendered results.
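The intuition behind a probabilistic formulation can be sketched as follows: rather than regressing one number per property, the network predicts a distribution over discretized values and decodes a single global estimate from it. The bin count, random logits, and expectation-based decoding below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

# Instead of regressing a single metallic/roughness number, the network
# predicts a distribution over discretized candidate values for the whole
# object. Random logits stand in for the network output here.
N_BINS = 16
bin_values = np.linspace(0.0, 1.0, N_BINS)  # candidate material values
logits = np.random.randn(2, N_BINS)         # row 0: metallic, row 1: roughness

# Softmax turns the logits into per-property probability distributions.
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Decode one global value per property as the expectation under the
# predicted distribution; taking the argmax bin would be another option.
metallic, roughness = probs @ bin_values
print(f"metallic={metallic:.3f} roughness={roughness:.3f}")
```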
Notably, Stable Fast 3D packages everything a 3D model needs, including the mesh, UV-mapped textures, and material properties, into a single compact, ready-to-use asset.
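The asset is delivered as a standard glTF binary (.glb) file, so any glTF-aware tool can consume it directly. As an illustration, the open-source trimesh library (a third-party package, installed with pip install trimesh) can inspect the hypothetical asset.glb saved in the API sketch above:

```python
# Inspecting a Stable Fast 3D output with the open-source `trimesh` library.
# `asset.glb` is the hypothetical file saved in the earlier API sketch.
import trimesh

scene = trimesh.load("asset.glb")  # .glb files load as a trimesh Scene
for name, mesh in scene.geometry.items():
    # Mesh geometry, texture, and PBR material (base color, metallic,
    # roughness) all travel inside the same self-contained file.
    print(name, mesh.vertices.shape, mesh.faces.shape)
    print(type(mesh.visual.material))
```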
From 2D to 4D, Stability AI keeps expanding the boundaries of generative AI. While Stable Diffusion made the company known for 2D text-to-image generation, it has been active in 3D since November 2023, when it introduced Stable 3D. Stable Video 3D, which debuted in March of this year, then added basic camera-motion control to improve the quality of 3D generation.
Stability AI's exploration does not stop at 3D: the company recently announced Stable Video 4D, which adds the time dimension by generating short videos of 3D objects, once again pushing the perceived limits of generative AI.