NVIDIA Unveils Rubin CPX Amidst Chart-Topping Blackwell Ultra MLPerf Results
19 hours ago
- #AI
- #GPU
- #NVIDIA
- NVIDIA's GB300 Blackwell Ultra GPUs set new AI inference performance records in MLPerf benchmarks.
- New techniques like NVFP4 format and parallelism methods (expert and data parallelism) were key to performance gains.
- Disaggregated serving splits the context (prefill) and generation (decode) phases of AI inference across separate GPU pools, boosting throughput per GPU by 1.5x.
- NVIDIA unveiled Rubin CPX, a GPU optimized for massive-context inference, using GDDR7 and featuring faster exponent operations.
- Rubin CPX delivers 30 petaFLOPS of tensor compute in NVFP4 format and includes video encoders for generative AI.
- New rack-scale systems such as Vera Rubin NVL144 CPX, along with dual-rack configurations, deliver up to 8 exaFLOPS of compute.
- Standard Vera Rubin NVL144 systems are expected in H2 2026, with Rubin Ultra NVL576 coming in 2027.
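
To make the disaggregated serving idea from the summary concrete, here is a minimal sketch of how requests might move between two worker pools. It is a toy model, not NVIDIA's implementation: the `Request` and `DisaggregatedScheduler` names, the queue-based hand-off, and the one-token-per-step decode loop are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical request record; field names are illustrative only.
    request_id: int
    prompt_tokens: int
    max_new_tokens: int
    generated: int = 0
    kv_cache_ready: bool = False


class DisaggregatedScheduler:
    """Toy scheduler with one pool for context (prefill) and one for generation (decode)."""

    def __init__(self):
        self.prefill_queue = deque()  # requests waiting for prompt processing
        self.decode_queue = deque()   # requests actively generating tokens
        self.finished = []

    def submit(self, request):
        self.prefill_queue.append(request)

    def step_prefill(self):
        """Process one prompt on the prefill pool, then hand it to the decode pool."""
        if not self.prefill_queue:
            return
        request = self.prefill_queue.popleft()
        # Stands in for building the KV cache and transferring it to a decode GPU.
        request.kv_cache_ready = True
        self.decode_queue.append(request)

    def step_decode(self):
        """Generate one token for every request currently in the decode pool."""
        still_running = deque()
        for request in self.decode_queue:
            request.generated += 1
            if request.generated >= request.max_new_tokens:
                self.finished.append(request)
            else:
                still_running.append(request)
        self.decode_queue = still_running

    def run(self):
        # Alternate the two pools until no work remains in either.
        while self.prefill_queue or self.decode_queue:
            self.step_prefill()
            self.step_decode()
        return self.finished


def demo():
    scheduler = DisaggregatedScheduler()
    scheduler.submit(Request(1, prompt_tokens=4000, max_new_tokens=3))
    scheduler.submit(Request(2, prompt_tokens=100, max_new_tokens=1))
    return [r.request_id for r in scheduler.run()]
```

In this sketch a long-prompt request never blocks token generation for requests already in the decode pool, which is the intuition behind giving each phase its own hardware. A real system would also have to move the KV cache between GPUs and size the two pools to match the workload.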