James Ding
Jun 06, 2025 10:02
NVIDIA introduces the Nemotron-H Reasoning model family, delivering significant throughput gains and versatile applications in reasoning-intensive tasks, according to NVIDIA’s blog.
NVIDIA has announced the Nemotron-H Reasoning model family, designed to raise throughput without compromising accuracy. The models target reasoning-intensive tasks, particularly math and science, where output lengths have grown sharply and can now reach tens of thousands of tokens.
The release comprises the Nemotron-H-47B-Reasoning-128K and Nemotron-H-8B-Reasoning-128K models, both also available in FP8 quantized variants. They are derived from the Nemotron-H-47B-Base-8K and Nemotron-H-8B-Base-8K foundation models.
The Nemotron-H-47B-Reasoning model, the most capable in this family, delivers nearly four times greater throughput than comparable transformer models such as the Llama-Nemotron Super 49B V1.0. It supports 128K token contexts and excels in accuracy for reasoning-heavy tasks. Similarly, the Nemotron-H-8B-Reasoning-128K model shows significant improvements over the Llama-Nemotron Nano 8B V1.0.
The Nemotron-H models introduce a flexible operational feature, allowing users to choose between reasoning and non-reasoning modes. This adaptability makes them suitable for a wide range of real-world applications. NVIDIA has released the models under an open research license, encouraging the research community to explore and build on them.
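In practice, this kind of mode switch is usually driven by the system prompt. The sketch below assumes the "detailed thinking on/off" control string that NVIDIA uses for its Llama-Nemotron models; the exact convention for Nemotron-H is not stated in this article, so treat the string as a hypothetical and check the model card.

```python
def build_messages(question, reasoning=True):
    """Build a chat message list that toggles reasoning mode via the
    system prompt. The 'detailed thinking on/off' control string is an
    assumption borrowed from NVIDIA's Llama-Nemotron family; consult the
    Nemotron-H model card for the documented convention."""
    mode = "on" if reasoning else "off"
    return [
        {"role": "system", "content": f"detailed thinking {mode}"},
        {"role": "user", "content": question},
    ]

# Same question, two modes: only the system prompt changes.
print(build_messages("What is 17 * 24?", reasoning=True)[0]["content"])
print(build_messages("What is 17 * 24?", reasoning=False)[0]["content"])
```

The message list would then be passed to the model's chat template (e.g. `tokenizer.apply_chat_template`) as with any chat-tuned checkpoint.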
The training of these models involved supervised fine-tuning (SFT) with examples that included explicit reasoning traces. This comprehensive training approach, which spanned over 30,000 steps for math, science, and coding, has resulted in consistent improvements on internal STEM benchmarks. A subsequent training phase focused on instruction following, safety alignment, and dialogue, further enhancing the model’s performance across diverse tasks.
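An SFT example with an explicit reasoning trace simply places the intermediate reasoning in the training target ahead of the final answer. The `<think>…</think>` delimiters below are an illustrative assumption, not NVIDIA's documented format:

```python
def format_sft_example(reasoning_trace, answer):
    """Render one supervised fine-tuning target that exposes intermediate
    reasoning before the final answer. The <think> delimiters are an
    assumed, illustrative convention."""
    return f"<think>\n{reasoning_trace}\n</think>\n{answer}"

target = format_sft_example(
    "12 + 30: add the tens (10 + 30 = 40), then the ones (40 + 2 = 42).",
    "42",
)
print(target)
```

Training on targets shaped like this teaches the model to emit its reasoning first and its answer last, which is what makes the reasoning mode separable from the plain-answer mode.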
To support 128K-token contexts, the models were trained using synthetic sequences up to 256K tokens, which improved their long-context attention capabilities. Additionally, reinforcement learning with Group Relative Policy Optimization (GRPO) was applied to refine skills such as instruction following and tool use, enhancing the model’s overall response quality.
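The distinguishing idea in GRPO is that, instead of a learned value baseline, each sampled completion's reward is normalized against the mean and standard deviation of its own group of completions for the same prompt. A minimal sketch of that group-relative advantage computation (simplified, omitting the policy-update step):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: normalize each completion's reward by the
    mean and (population) standard deviation of its group, so completions
    are compared only against siblings sampled from the same prompt."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero variance
    return [(r - mean) / std for r in rewards]

# Four completions for one prompt: two scored 1.0, two scored 0.0.
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
# → [1.0, -1.0, 1.0, -1.0]
```

These advantages then weight the policy-gradient update in place of a critic's value estimate, which is what removes the separate value network from the training loop.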
In benchmarks against Llama-Nemotron Super 49B V1.0 and Qwen3 32B, the Nemotron-H-47B-Reasoning-128K model demonstrated superior accuracy while sustaining roughly four times the throughput of comparable transformer-based models, a significant advance in model efficiency.
Overall, the Nemotron-H Reasoning models represent a versatile and high-performing foundation for applications requiring precision and speed, offering significant advancements in AI reasoning capabilities.
For more detailed information, please refer to the official announcement on the NVIDIA blog.
Image source: Shutterstock