Latte3D generates 3D shapes from text prompts.

Nvidia unveils Latte3D to instantly generate 3D shapes from text

Nvidia has unveiled Latte3D, which uses generative AI to instantly generate 3D shapes from text. The text-to-3D model can produce high-quality 3D shapes in milliseconds.

Crafted by Nvidia’s AI lab team in Toronto, Latte3D represents a significant advancement in the field of artificial intelligence, offering near-real-time generation of 3D objects and animals from simple text prompts, Nvidia said.

Sanja Fidler, vice president of AI Research at Nvidia, hailed Latte3D as a game-changer for creators across industries.

“We can now produce results an order of magnitude faster, putting near-real-time text-to-3D generation within reach for creators across industries,” Fidler said in a statement.

The heart of Latte3D lies in its ability to transform text prompts into detailed 3D representations, akin to a virtual 3D printer. Utilizing a single graphics processing unit (GPU), such as the Nvidia RTX A6000, the model can generate intricate 3D shapes instantly, eliminating the need for time-consuming rendering processes.

Instead of laboriously designing objects from scratch or sifting through 3D asset libraries, creators can now rely on Latte3D to bring their ideas to life with speed and efficiency. The model offers multiple shape options based on each text input, allowing users to select the most suitable design for their needs.

Having talked to a lot of experts in this space, I've heard them worry that it's really hard to modify a generated image so that it becomes exactly what you want it to be. It's easy to generate a concept, but molding that concept with words into something you really want is the tough challenge.

Latte3D’s versatility extends beyond its initial training datasets, which include animals and everyday objects. Developers have the flexibility to train the model on different types of data, enabling applications in various domains such as landscape design and robotics.

For landscape designers, Latte3D could expedite the process of filling out garden renderings with lifelike foliage, while robotics developers could use the model to simulate household environments for training personal assistant robots.

Powered by Nvidia A100 Tensor Core GPUs and trained on diverse text prompts generated using ChatGPT, Latte3D demonstrates Nvidia’s commitment to advancing AI-driven content creation tools. The model’s ability to handle a wide range of text descriptions ensures accurate and responsive shape generation, tailored to the user’s needs.

As part of Nvidia Research’s ongoing efforts to push the boundaries of AI and computer graphics, Latte3D stands as a testament to the company’s dedication to innovation. With hundreds of scientists and engineers worldwide, Nvidia continues to drive progress in AI, computer vision, self-driving cars, and robotics.


Dean Takahashi

Dean Takahashi is editorial director for GamesBeat at VentureBeat. He has been a tech journalist since 1988, and he has covered games as a beat since 1996. He was lead writer for GamesBeat at VentureBeat from 2008 to April 2025. Prior to that, he wrote for the San Jose Mercury News, the Red Herring, the Wall Street Journal, the Los Angeles Times, and the Dallas Times-Herald. He is the author of two books, "Opening the Xbox" and "The Xbox 360 Uncloaked." He organizes the annual GamesBeat Next, GamesBeat Summit and GamesBeat Insider Series: Hollywood and Games conferences and is a frequent speaker at gaming and tech events. He lives in the San Francisco Bay Area.