Nvidia Unveils Rubin CPX Amidst Chart-Topping Blackwell Ultra MLPerf Results

19 hours ago


NVIDIA's GB300 Blackwell Ultra GPUs set new AI inference performance records in MLPerf benchmarks.
New techniques, such as the NVFP4 number format and expert and data parallelism, were key to the performance gains.
Disaggregated serving splits AI inference between separate GPU pools for the context (prefill) and generation (decode) phases, boosting throughput per GPU by about 1.5x.
NVIDIA unveiled Rubin CPX, a GPU optimized for massive-context inference, using GDDR7 and featuring faster exponent operations.
Rubin CPX delivers 30 petaFLOPS of tensor compute in NVFP4 format and includes video encoders for generative AI.
New rack solutions, such as the Vera Rubin NVL144 CPX and dual-rack configurations, deliver up to 8 exaFLOPS of compute.
Standard Vera Rubin NVL144 systems are expected in H2 2026, with Rubin Ultra NVL576 coming in 2027.
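
To make the disaggregated serving idea concrete, here is a minimal Python sketch of the routing pattern: each request is first handled by a worker from a context pool, which processes the prompt, and is then handed to a worker from a separate generation pool, which produces the output tokens. This is a toy illustration, not NVIDIA's implementation. The names `Request`, `Worker`, and `serve_disaggregated`, the round-robin scheduling, and the dictionary standing in for a KV cache are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Request:
    request_id: int
    prompt_tokens: int   # context to process (typically compute-bound)
    output_tokens: int   # tokens to generate (typically memory-bandwidth-bound)


@dataclass
class Worker:
    name: str
    handled: list = field(default_factory=list)


def serve_disaggregated(requests, context_pool, generation_pool):
    """Route each request through a context worker, then a generation worker.

    Hypothetical sketch: scheduling is plain round-robin, and the "KV cache"
    is a small dict handed from one pool to the other.
    """
    next_context = cycle(context_pool)
    next_generation = cycle(generation_pool)
    trace = []
    for req in requests:
        # Phase 1: a context-pool worker processes the whole prompt.
        ctx = next(next_context)
        ctx.handled.append(req.request_id)
        kv_cache = {"request_id": req.request_id, "tokens": req.prompt_tokens}

        # Phase 2: the cache is handed off to a generation-pool worker,
        # which extends it by one entry per generated token.
        gen = next(next_generation)
        gen.handled.append(req.request_id)
        kv_cache["tokens"] += req.output_tokens

        trace.append((req.request_id, ctx.name, gen.name, kv_cache["tokens"]))
    return trace


# Example: two context workers feed a single generation worker.
context_pool = [Worker("ctx-0"), Worker("ctx-1")]
generation_pool = [Worker("gen-0")]
requests = [Request(0, 1000, 50), Request(1, 200, 20), Request(2, 4000, 10)]

trace = serve_disaggregated(requests, context_pool, generation_pool)
for entry in trace:
    print(entry)
```

Because the two phases stress hardware differently, keeping them in separate pools lets each pool be sized and tuned for its own phase. A real system would also have to account for the cost of moving the cache between pools, which this sketch ignores.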

Hasty Briefs (beta)