NVIDIA Blackwell Leads InferenceMAX™ v1 Benchmarks




Luisa Crawford
Oct 10, 2025 02:52

NVIDIA’s Blackwell architecture demonstrates significant performance and efficiency gains in SemiAnalysis’s InferenceMAX™ v1 benchmarks, setting new standards for AI hardware.





SemiAnalysis has introduced InferenceMAX™ v1, an open source benchmark initiative for evaluating inference hardware across a broad range of workloads. According to NVIDIA, the published results show its latest GPUs, particularly the Blackwell series, leading in inference performance across those workloads.

Performance Breakthroughs with NVIDIA Blackwell

NVIDIA Blackwell delivers a 15-fold performance improvement over its predecessor, the Hopper generation, which translates into a significant revenue opportunity. The gain is largely attributed to NVIDIA’s hardware-software co-design, which includes support for the NVFP4 low-precision format, fifth-generation NVIDIA NVLink, and inference software such as NVIDIA TensorRT-LLM and Dynamo.

Because InferenceMAX v1 is open source, the AI community can reproduce NVIDIA’s results and use the benchmark to validate performance across a variety of AI inference scenarios.

Key Features of InferenceMAX v1

InferenceMAX v1 distinguishes itself with continuous, automated testing, publishing results daily. These benchmarks encompass single-node and multi-node configurations, covering a wide range of models, precisions, and sequence lengths to reflect real-world deployment scenarios.

The benchmarks provide insights into latency, throughput, and batch size performance, crucial metrics for AI applications involving reasoning tasks, document processing, and chat scenarios.
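These three metrics are linked. As a rough illustration only (a simplified model, not the InferenceMAX methodology), assume every decode step emits one token for each request in the batch. The function name and all numbers below are hypothetical.

```python
from typing import Tuple


def decode_metrics(batch_size: int, step_latency_s: float) -> Tuple[float, float]:
    """Return (tokens/s per user, aggregate tokens/s) for a simplified decode loop.

    Assumption: each decode step takes `step_latency_s` seconds and produces
    exactly one token for every request in the batch.
    """
    if batch_size <= 0 or step_latency_s <= 0:
        raise ValueError("batch_size and step_latency_s must be positive")
    per_user = 1.0 / step_latency_s      # interactivity seen by one user
    aggregate = batch_size * per_user    # throughput across the whole batch
    return per_user, aggregate


# Hypothetical inputs, chosen only to show the trade-off.
for batch, latency in [(1, 0.010), (8, 0.020), (64, 0.080)]:
    per_user, aggregate = decode_metrics(batch, latency)
    print(f"batch={batch:3d}  per-user={per_user:6.1f} tok/s  aggregate={aggregate:7.1f} tok/s")
```

In this toy model, larger batches raise aggregate throughput while lowering the speed each user sees, which is why the two figures are usually reported together.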

NVIDIA’s Generational Leap

The move from the NVIDIA Hopper HGX H200 to the Blackwell DGX B200 and GB200 NVL72 platforms marks a significant increase in efficiency and cost-effectiveness. Blackwell’s architecture, featuring fifth-generation Tensor Cores and higher NVLink bandwidth, offers superior compute-per-watt and memory bandwidth, considerably lowering the cost per million tokens.

These architectural gains are complemented by continuous software optimizations that improve performance over time. Notably, improvements in the TensorRT-LLM stack have led to substantial throughput gains for large language models such as gpt-oss-120b.

Cost Efficiency and Scalability

GB200 NVL72 sets a new standard in AI cost efficiency, offering significantly lower total cost of ownership compared to previous generations. It achieves this by delivering higher throughput and maintaining low costs per million tokens, even at high interactivity levels.
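The cost-per-million-tokens figure follows from simple arithmetic on hourly cost and throughput. The sketch below shows that arithmetic under stated assumptions: the function name, the hourly prices, and the throughput values are all hypothetical and are not taken from the benchmark results.

```python
def cost_per_million_tokens(gpu_hourly_cost_usd: float,
                            tokens_per_second_per_gpu: float) -> float:
    """Cost in USD to generate one million tokens on a single GPU.

    Assumption: the GPU runs at a steady `tokens_per_second_per_gpu`
    and is billed at a flat `gpu_hourly_cost_usd`.
    """
    if gpu_hourly_cost_usd < 0 or tokens_per_second_per_gpu <= 0:
        raise ValueError("cost must be non-negative and throughput positive")
    tokens_per_hour = tokens_per_second_per_gpu * 3600
    return gpu_hourly_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical comparison: a pricier GPU with three times the throughput.
baseline = cost_per_million_tokens(2.0, 1000.0)
faster = cost_per_million_tokens(3.0, 3000.0)
print(f"baseline: ${baseline:.3f} per million tokens")
print(f"faster:   ${faster:.3f} per million tokens")
```

With these made-up inputs, the more expensive GPU still comes out cheaper per token, because its throughput grows faster than its hourly cost.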

The innovative design of GB200 NVL72, combined with Dynamo and TensorRT-LLM, maximizes the performance of Mixture of Experts (MoE) models, enabling efficient GPU use and high throughput under various SLA constraints.

Collaborative Advancements

NVIDIA’s collaboration with open source projects like SGLang and vLLM has further enhanced the performance and efficiency of Blackwell. These partnerships have led to the development of new kernels and optimizations, ensuring that NVIDIA’s hardware can fully leverage open source inference frameworks.

With these advancements, NVIDIA continues to push the boundaries of AI hardware and software, setting new benchmarks for performance and efficiency in the industry.




