Nvidia Nemotron and Cosmos Reasoning models enable smarter enterprise agents

AI agents are poised to deliver as much as $450 billion from revenue gains and cost
savings by 2028, according to Capgemini. Developers building these agents are turning to
higher-performing reasoning models to improve AI agent platforms and physical AI
systems.

At the Siggraph computer graphics conference in Vancouver, Canada, Nvidia announced an expansion of two model families with reasoning capabilities, Nvidia Nemotron and Nvidia Cosmos, that leaders across industries are using to drive productivity via teams of AI agents and humanoid robots, said Kari Briski, vice president of generative AI software for enterprise at Nvidia, in a blog post.

CrowdStrike, Uber, NetApp and Zoom are among some of the enterprises tapping into
these model families.

“With Nemotron, we take the best of the open source models, including from our own research teams, and make them better,” said Rev Lebaredian, vice president of Omniverse and simulation technology at Nvidia, in a press briefing. “Today, we’re announcing two new additions to the Nemotron family, empowering AI agents to reason better and make smarter decisions.”

He added, “These open models offer the highest accuracy in their size classes for agentic tasks, and are trained on open and transparent data with a new hybrid model architecture and configurable thinking budget feature. They give AI agents the ability to think more deeply and work more efficiently, exploring broader options and delivering smarter results within set time boundaries.”

For example, Nvidia Nemotron Nano 2 can generate up to six times more tokens in a given time frame and lower reasoning costs by 60% compared to the other open models in its class, Lebaredian said.

Lebaredian said, “We are well on our way to becoming an agentic AI driven workforce.”

As Lebaredian said, new Nvidia Nemotron Nano 2 and Llama Nemotron Super 1.5 models offer the highest accuracy in their size categories for scientific reasoning, math, coding, tool-calling, instruction-following and chat. These new models give AI agents the power to think more deeply and work more efficiently — exploring broader options, speeding up research and delivering smarter results within set time limits.

Think of the model as the brain of an AI agent — it provides the core intelligence. But to
make that brain useful for a business, it must be embedded into an agent that understands specific workflows, in addition to industry and business jargon, and operates
safely. Nvidia helps enterprises bridge that gap with leading libraries and AI blueprints for
onboarding, customizing and governing AI agents at scale.

Cosmos Reason is a new reasoning vision language model (VLM) for physical AI applications that excels in understanding how the real world works, using structured
reasoning to understand concepts like physics, object permanence and space-time
alignment.

Cosmos Reason is purpose-built to serve as the reasoning backbone of a robot vision language action (VLA) model, to critique and caption training data for robotics and autonomous vehicles, and to equip runtime visual AI agents with spatial-temporal understanding and reasoning about physical operations, such as those in factories or cities.

Nemotron: Highest accuracy and efficiency for agentic enterprise AI

As enterprises develop AI agents to tackle complex, multistep tasks, models that can provide strong reasoning accuracy with efficient token generation enable intelligent,
autonomous decision-making at scale.

Nvidia Nemotron is a family of advanced open reasoning models that use leading models,
Nvidia-curated open datasets and advanced AI techniques to provide an accurate and
efficient starting point for AI agents.

The latest Nemotron models deliver leading efficiency in three ways: a new hybrid model architecture, compact quantized models and a configurable thinking budget that provides
developers with control over token generation, resulting in 60% lower reasoning costs.
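
To make the idea of a thinking budget concrete, here is a minimal Python sketch of capping how many reasoning tokens are kept before an answer is produced. It is an illustration only, not Nemotron's actual interface: the function name `apply_thinking_budget` and the token lists are hypothetical, and a real model would enforce the limit during generation rather than afterward.

```python
from typing import Iterable, List, Tuple


def apply_thinking_budget(
    reasoning_tokens: Iterable[str], budget: int
) -> Tuple[List[str], bool]:
    """Keep at most `budget` reasoning tokens (hypothetical helper).

    Returns the kept tokens and a flag saying whether the reasoning
    trace was cut short because the budget ran out.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")

    kept: List[str] = []
    truncated = False
    for token in reasoning_tokens:
        if len(kept) >= budget:
            # Budget exhausted while tokens remain: stop "thinking" here.
            truncated = True
            break
        kept.append(token)
    return kept, truncated


if __name__ == "__main__":
    # Toy reasoning trace standing in for model output.
    trace = ["check", "inputs", "compare", "options", "decide"]
    kept, truncated = apply_thinking_budget(trace, budget=3)
    print(kept, truncated)
```

A smaller budget means fewer generated tokens per request, which is the lever the article describes for controlling reasoning cost.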

This combination lets the models reason more deeply and respond faster, without needing more time or computing power. This means better results at a lower cost. Nemotron Nano 2 provides as much as six times higher token generation compared with other leading models of its size.

Llama Nemotron Super 1.5 achieves leading performance and the highest reasoning accuracy in its class, empowering AI agents to reason better, make smarter decisions and
handle complex tasks independently. It’s now available in NVFP4, or 4-bit floating point, which delivers as much as six times higher throughput on Nvidia B200 GPUs compared with Nvidia H100 GPUs.

The chart above shows that the Nemotron model delivers top reasoning accuracy in the same timeframe and on the same compute budget, yielding the highest accuracy per dollar.

Along with the two new Nemotron models, Nvidia is also announcing its first open VLM training dataset, Llama Nemotron VLM dataset v1, with three million samples of optical character recognition, visual QA and captioning data that power the previously released Llama 3.1 Nemotron Nano VL 8B model.

In addition to accurate reasoning models, agents also rely on retrieval-augmented generation to fetch the latest and most relevant information from connected data across disparate sources to make informed decisions. The recently released Llama 3.2 NeMo Retriever embedding model, which helps boost agentic system accuracy, tops three visual document retrieval leaderboards: ViDoRe V1, ViDoRe V2 and MTEB VisualDocumentRetrieval.
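
The retrieval step in retrieval-augmented generation can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the NeMo Retriever API: the two-dimensional vectors are made-up stand-ins for real embeddings, and the helper names `cosine_similarity` and `top_k` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat an all-zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], docs: Dict[str, Sequence[float]], k: int = 1
) -> List[str]:
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(
        docs,
        key=lambda name: cosine_similarity(query, docs[name]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy embeddings; a real system would get these from an embedding model.
    docs = {
        "invoice": [1.0, 0.0],
        "manual": [0.0, 1.0],
        "memo": [0.7, 0.7],
    }
    print(top_k([0.9, 0.1], docs, k=2))
```

In a full pipeline, the documents returned by the retrieval step would be passed to the reasoning model as context for its answer.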

Using these reasoning and information retrieval models, a deep research agent built using
the AI-Q Nvidia Blueprint is currently No. 1 for open and portable agents on DeepResearch Bench.

Nvidia NeMo and Nvidia NIM microservices support the entire AI agent lifecycle — from development and deployment to monitoring and optimization of the agentic systems.

Cosmos Reason: A breakthrough in Physical AI

Nvidia’s AI models. Source: Nvidia

VLMs marked a breakthrough for computer vision and robotics, empowering machines to
identify objects and patterns. However, nonreasoning VLMs lack the ability to understand
and interact with the real world — meaning they can’t handle ambiguity or novel experiences, nor solve complex multistep tasks.

Nvidia Cosmos Reason is a new open, customizable, seven-billion-parameter reasoning VLM for physical AI and robotics. Cosmos Reason lets robots and vision AI agents reason like humans, using prior knowledge, physics understanding and common sense to understand and act in the physical world.

Cosmos Reason enables advanced capabilities across robotics and physical AI applications
such as training data critiquing and captioning, robot decision-making and video analytics
AI agents.

It can help automate the curation and annotation of large, diverse training datasets, accelerating the development of high-accuracy AI models. It can also serve as a sophisticated reasoning engine for robot planning, parsing complex instructions into
actionable steps for VLA models, even in new environments.

It also powers video analytics AI agents built on the Nvidia Blueprint for video search and
summarization (VSS), enabled by the Nvidia Metropolis platform, gleaning valuable insights from massive volumes of stored or live video data. These visually perceptive and
interactive AI agents can help streamline operations in factories, warehouses, retail stores,
airports, traffic intersections and more by spotting anomalies.

Nvidia’s robotics research team uses Cosmos Reason for data filtration and curation, and
as the “System 2” reasoning VLM behind VLA models such as the next versions of GR00T
NX.

Now serving: Nvidia reasoning models for AI agents and robots

Diverse enterprises and consulting leaders are adopting Nvidia’s latest reasoning models.
Leaders spanning cybersecurity to telecommunications are among those working with
Nemotron to build enterprise AI agents.

Zoom plans to harness Nemotron reasoning models with Zoom AI Companion to make
decisions and manage multistep tasks to take action for users across Zoom Meetings,
Zoom Chat, and Zoom documents.

CrowdStrike is currently testing Nemotron models to enable its Charlotte AI agents to
write queries on the CrowdStrike Falcon platform.

Amdocs is using Nvidia Nemotron models in its amAIz Suite to drive AI agents to handle complex, multistep automation spanning care, sales, network, and customer support. EY is adopting Nemotron Nano 2, given its high throughput, to support agentic AI in large
organizations for tax, risk management and finance use cases.

NetApp is currently testing Nemotron reasoning models so that AI agents can search and
analyze business data. DataRobot is working with Nemotron models for its Agent Workforce Platform for end-to-end agent lifecycle management.

Tabnine is working with Nemotron models to suggest and automate coding tasks on
behalf of developers.

Automation Anywhere and Dataiku are among the additional agentic AI software developers integrating Nemotron models into their platforms.

Leading companies across transportation, safety and AI intelligence are using Cosmos Reason to advance autonomous driving, video analytics, and road and workplace safety. Uber is exploring Cosmos Reason to analyze autonomous vehicle behavior. In addition, Uber is post-training Cosmos Reason to summarize visual data and analyze scenarios, such as pedestrians walking across highways, to perform quality analysis and inform autonomous driving behavior.

Centific is testing Cosmos Reason to enhance its AI-powered video intelligence platform. The VLM enables the platform to process complex video data into actionable insights,
helping reduce false positives and improve decision-making efficiency.

VAST is advancing real-time urban intelligence using Nvidia Cosmos Reason with its AI operating system to process massive video streams at scale. With the VSS Blueprint,
VAST can build agents that can identify incidents and trigger responses, turning video
streams and metadata into actionable, proactive public safety tools.

Ambient.ai is working with Cosmos Reason’s temporal, physics-aware reasoning to enable automated detection of missing personal protective equipment and monitoring of hazardous conditions, helping enhance Environmental Health & Safety (EHS) across construction, manufacturing, logistics, and other industrial settings.

These models are expected to be available as Nvidia NIM microservices for secure, reliable deployment on any Nvidia-accelerated infrastructure for maximum privacy and control.
They are planned to be available soon through Amazon Bedrock and Amazon SageMaker AI for Nemotron models, as well as through Azure AI Foundry, Oracle Data Science Platform and Google Vertex AI.

Try Cosmos Reason on build.nvidia.com or download it from Hugging Face.