NVIDIA published results late last week showing its GB300 NVL72 platform runs 20 times more concurrent agents per megawatt than the older HGX H200 system, according to the NVIDIA company blog. The metric comes from AgentPerf, a benchmark developed by Artificial Analysis that measures how many simultaneous agentic tasks a platform can sustain while meeting defined response-rate thresholds.
The benchmark uses DeepSeek V4 Pro, a large mixture-of-experts model, as its workload proxy. Rather than measuring single-prompt latency, AgentPerf simulates real coding-agent trajectories: file reads, writes, command execution, and context growth across dozens of chained calls.
Several caveats apply: the benchmark is new, the results were published on NVIDIA’s own blog, and Artificial Analysis has commercial relationships with infrastructure providers. No independent replication has been reported. The 20x figure compares the platform against NVIDIA’s own prior generation, not against AMD, Intel, or cloud-provider alternatives.
For teams sizing agent infrastructure, throughput per megawatt determines how many agents can run once power capacity, rather than chip count, becomes the ceiling. Teams planning rack purchases for 2026 agent deployments should request AgentPerf figures from every vendor before committing, and verify that the workload profile matches their own agent call patterns.
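As a rough illustration of the sizing arithmetic, the sketch below multiplies a facility's power budget by an agents-per-megawatt figure. The function name, the 2 MW budget, and the baseline of 100 agents per megawatt are hypothetical placeholders, since the absolute AgentPerf numbers are not given here. Only the 20x ratio comes from the reported results.

```python
import math


def agents_within_power_budget(power_budget_mw: float, agents_per_mw: float) -> int:
    """Whole number of concurrent agents a power budget can sustain.

    Hypothetical helper: assumes capacity scales linearly with power,
    which a real deployment may not do.
    """
    if power_budget_mw < 0 or agents_per_mw < 0:
        raise ValueError("inputs must be non-negative")
    return math.floor(power_budget_mw * agents_per_mw)


# Placeholder inputs, NOT published figures.
budget_mw = 2.0                  # assumed facility power budget
baseline_agents_per_mw = 100.0   # assumed figure for the older system
reported_ratio = 20.0            # the 20x ratio from the reported results

baseline_agents = agents_within_power_budget(budget_mw, baseline_agents_per_mw)
newer_agents = agents_within_power_budget(
    budget_mw, baseline_agents_per_mw * reported_ratio
)

print(f"older system: {baseline_agents} agents in {budget_mw} MW")
print(f"newer system: {newer_agents} agents in {budget_mw} MW")
```

Swapping in vendor-supplied AgentPerf figures for the placeholders gives a first-pass comparison at a fixed power budget. It does not account for cooling overhead or for workloads that differ from the benchmark's profile.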
Source: NVIDIA company blog, published June 13, 2026.