Modular 25.4: One Container, AMD and Nvidia GPUs, No Lock-In

a year ago

Modular Platform 25.4 introduces official support for AMD GPUs, including MI300X and MI325X, enabling seamless portability and performance.
Performance gains include up to 53% better throughput on prefill-heavy BF16 workloads and up to 32% better throughput for decode-heavy BF16 workloads compared to vLLM on AMD MI300X.
Expanded model support includes GGUF quantized Llamas, Qwen3, OLMo2, and Gemma3 models, enhancing the platform's versatility.
Enhanced documentation and developer experience with a unified navigation system and new Python-Mojo bindings for easier integration.
Open sourcing of over 450k lines of Mojo kernel and serving code, inviting community contributions to the MAX AI kernel library.
Special events include the Modular Hack Weekend and the launch of the comic 'GPU Whisperers' to engage the developer community.

Hasty Briefsbeta