Nemotron 3 - A family of open source AI models released by NVIDIA
What is Nemotron 3?
Nemotron 3 is a family of open source AI models released by NVIDIA in Nano, Super, and Ultra sizes. It uses a latent Mixture-of-Experts (MoE) architecture to dramatically improve inference efficiency and reduce operating costs. Nemotron 3 Nano has 30 billion total parameters with roughly 3 billion active per token, and is optimized for tasks such as software debugging, content summarization, AI assistant workflows, and low-cost information retrieval. Nemotron 3 Super and Ultra have roughly 100 billion and 500 billion parameters, respectively, targeting multi-agent applications and complex AI scenarios.

Features of Nemotron 3
- Model architecture: A hybrid Mixture-of-Experts (MoE) design combines Mamba layers, Transformer layers, and an MoE routing mechanism to achieve efficient long-sequence processing, high-precision reasoning, and scalable computational efficiency. The architecture supports large-scale multi-agent systems and dynamically routes each token to a small subset of "expert" networks, reducing computation cost and increasing throughput.
- Model sizes: Three sizes are available:
  - Nano: 30 billion total parameters with roughly 3 billion active, for lightweight, efficient tasks such as edge-device deployment.
  - Super: roughly 100 billion parameters, designed for collaborative multi-agent applications with an emphasis on high-precision reasoning.
  - Ultra: roughly 500 billion parameters, for complex scenarios such as scientific computing and long-document analysis.
- Extremely long context: Supports a 1-million-token context window that can hold a full task context, history, and complex plans, reducing information fragmentation.
- Multi-token prediction: Generates multiple tokens per step, improving responsiveness in tasks such as long-sequence reasoning and code generation.
- Low memory overhead: Reduces memory footprint while maintaining performance through architectural optimizations and quantization techniques such as NVFP4.
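The expert-routing idea behind the MoE layer described above can be sketched in a few lines of NumPy. This is a toy illustration only: the expert count, hidden dimension, linear "experts", and top-2 routing are illustrative assumptions, not Nemotron 3's actual configuration.

```python
import numpy as np

def top_k_router(router_logits, k=2):
    """Pick the k highest-scoring experts per token, then softmax their
    scores so each token's expert weights sum to 1."""
    top_idx = np.argsort(router_logits, axis=-1)[:, -k:]              # (tokens, k)
    top_scores = np.take_along_axis(router_logits, top_idx, axis=-1)
    w = np.exp(top_scores - top_scores.max(axis=-1, keepdims=True))
    return top_idx, w / w.sum(axis=-1, keepdims=True)

def moe_layer(x, experts, router_logits, k=2):
    """x: (tokens, d); experts: (num_experts, d, d) toy linear experts.
    Each token is processed by only k experts, not all of them."""
    idx, w = top_k_router(router_logits, k)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for j in range(k):
            out[t] += w[t, j] * (x[t] @ experts[idx[t, j]])
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))           # 4 tokens, hidden dimension 8
experts = rng.normal(size=(6, 8, 8))  # 6 toy experts
logits = rng.normal(size=(4, 6))      # router score per token per expert
y = moe_layer(x, experts, logits)
print(y.shape)  # (4, 8): same shape as input, but only 2 of 6 experts ran per token
```

Because each token activates only `k` of the experts, compute per token scales with `k` rather than with the total parameter count, which is the source of the "30 billion total, ~3 billion active" efficiency claim.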
Core Benefits of Nemotron 3
- Hybrid architecture innovation: A latent Mixture-of-Experts (MoE) design combines Mamba layers with Transformer layers to optimize computational efficiency and improve model performance.
- Improved inference efficiency: Nemotron 3 Nano delivers 4x higher throughput than its predecessor and a 60% increase in token-generation efficiency, significantly reducing inference costs.
- Strong long-text processing: The Nano model supports a 1-million-token context window, enabling efficient processing of long documents and more accurate correlation of information across them.
- Multiple sizes for different needs: The Nano, Super, and Ultra sizes are each optimized for different application scenarios, from lightweight tasks to complex multi-agent applications.
- Open source and customizable: The model weights are released under the NVIDIA Open Model License, and developers can access detailed training and post-training recipes on GitHub for customization and optimization.
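The responsiveness benefit of multi-token prediction can be illustrated with a toy count of decode steps. The 4-tokens-per-step figure below is an illustrative assumption, not a published Nemotron 3 number:

```python
def decode_steps(seq_len, tokens_per_step):
    """Number of forward passes needed to emit seq_len tokens when the
    model proposes tokens_per_step tokens per pass."""
    steps, emitted = 0, 0
    while emitted < seq_len:
        emitted += tokens_per_step
        steps += 1
    return steps

print(decode_steps(1000, 1))  # 1000: classic one-token-at-a-time decoding
print(decode_steps(1000, 4))  # 250: predicting 4 tokens per pass cuts passes 4x
```

Since each forward pass has a roughly fixed latency, fewer passes per generated sequence translates directly into lower time-to-completion for long outputs such as code generation.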
What is the official website for Nemotron 3?
- Project website: https://nvidianews.nvidia.com/news/nvidia-debuts-nemotron-3-family-of-open-models
- Hugging Face model library: https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
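A minimal sketch of loading the Nano checkpoint with Hugging Face transformers. The model id comes from the link above, but a transformers version with Nemotron 3 support, the `accelerate` package (for `device_map="auto"`), sufficient GPU memory, and the helper name `generate_text` are all assumptions:

```python
# Sketch: load the Nano FP8 checkpoint via Hugging Face transformers.
MODEL_ID = "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8"

def generate_text(prompt: str, max_new_tokens: int = 128) -> str:
    # Imported lazily so this sketch only needs transformers when called.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tok.decode(out[0], skip_special_tokens=True)
```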
Who Nemotron 3 is for
- AI developers and researchers: Nemotron 3 provides open source models and detailed training recipes for developers and researchers who wish to build on existing models for custom development or research.
- Enterprise technical teams: For organizations that need efficient, low-cost inference, Nemotron 3's high throughput and low inference cost make it a strong foundation for building agents for business scenarios such as software debugging and content summarization.
- Multi-agent application developers: The range of model sizes, especially the Super and Ultra versions, lends itself to multi-agent application scenarios such as complex human-computer interaction systems and automated workflows.
- AI assistant developers: The Nano version's efficient inference and long-text processing make it well suited to applications such as smart assistants and chatbots, delivering a smoother user experience.
- Educational and academic institutions: The open source models and flexible customization capabilities make it suitable for educational institutions to use for teaching and research, helping students and researchers better understand and apply the latest AI technologies.
© Copyright notice
This article is copyright of AI Sharing Circle; please do not reproduce without permission.