Virtualizing NVIDIA HGX B200 GPUs with Open Source
- #Open-Source
- #GPU Virtualization
- #NVIDIA B200
- Ubicloud now enables GPU VMs on NVIDIA's HGX B200 machines, which are trickier to virtualize than H100s.
- The HGX B200 uses SXM GPU modules and NVLink for high-bandwidth GPU-to-GPU connectivity, which makes virtualization challenging.
- Three virtualization models: Full Passthrough Mode, vGPU, and Shared NVSwitch Multitenancy Mode.
- Shared NVSwitch Multitenancy Mode supports 1-, 2-, 4-, and 8-GPU VMs with full NVLink bandwidth.
- Host preparation involves binding the GPUs to the vfio-pci driver and enabling IOMMU support (first sketch after this list).
- Matching driver versions between host and VM is critical for Shared NVSwitch Multitenancy Mode.
- A PCI topology mismatch between host and guest can cause CUDA initialization failures; QEMU can recreate the correct PCIe hierarchy inside the VM (second sketch below).
- Large-BAR stalls during VM boot can be resolved by upgrading QEMU or by disabling BAR mmap (third sketch below).
- Fabric Manager controls GPU partitions and enforces isolation in Shared NVSwitch Multitenancy Mode (final sketch below).
- The open-source implementation is available in Ubicloud, with components for GPU allocation and VM launch.
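The bullets above compress several hands-on steps; the sketches below illustrate them under stated assumptions rather than reproducing Ubicloud's actual code. First, host preparation: a minimal Python sketch (run as root) that rebinds one GPU to vfio-pci through sysfs. The PCI address is hypothetical, and the IOMMU must already be enabled on the kernel command line (e.g. `intel_iommu=on iommu=pt`).

```python
from pathlib import Path

# Hypothetical PCI address of one B200 GPU; list real ones with `lspci -d 10de:`.
PCI_ADDR = "0000:18:00.0"
DEV = Path("/sys/bus/pci/devices") / PCI_ADDR

# Unbind the device from whatever driver currently owns it, if any.
if (DEV / "driver").exists():
    (DEV / "driver" / "unbind").write_text(PCI_ADDR)

# Tell the kernel to prefer vfio-pci for this device, then ask it to reprobe.
(DEV / "driver_override").write_text("vfio-pci")
Path("/sys/bus/pci/drivers_probe").write_text(PCI_ADDR)
```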
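Second, for the PCI topology issue, a sketch (hypothetical addresses and IDs throughout) of a QEMU invocation that attaches the passed-through GPU behind an emulated pcie-root-port on a q35 machine, so the guest sees a PCIe hierarchy similar to the host's instead of a device sitting flat on the root bus:

```python
import subprocess

gpu = "0000:18:00.0"  # hypothetical host PCI address of the GPU

qemu_args = [
    "qemu-system-x86_64",
    "-machine", "q35,accel=kvm",  # q35 provides a PCIe root complex
    "-m", "64G", "-smp", "16",
    # Emulated root port; the GPU hangs off it, mirroring the host hierarchy.
    "-device", "pcie-root-port,id=rp0,bus=pcie.0,chassis=1,slot=0",
    "-device", f"vfio-pci,host={gpu},bus=rp0",
]
subprocess.run(qemu_args, check=True)
```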
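Third, for the large-BAR boot stall, one way to disable BAR mmap is QEMU's experimental `x-no-mmap` property on the vfio-pci device, which traps BAR accesses in QEMU instead of mmap()ing the large regions. It costs MMIO performance, so upgrading QEMU, as the bullet notes, is the better fix where possible:

```python
# Continuing the sketch above: avoid the large-BAR mmap path at boot.
# x-no-mmap is an experimental QEMU property; prefer a QEMU upgrade if you can.
qemu_args[-1] = f"vfio-pci,host={gpu},bus=rp0,x-no-mmap=on"
```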
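Finally, Shared NVSwitch Multitenancy Mode is selected through Fabric Manager's configuration file; the path and the `FABRIC_MODE` key below follow NVIDIA's Fabric Manager user guide (0 = full passthrough, 1 = Shared NVSwitch multitenancy, 2 = vGPU), but should be checked against your driver release. A sketch that flips the mode, leaving per-VM partition activation to Fabric Manager's partition API:

```python
from pathlib import Path

# Default config location per NVIDIA's Fabric Manager user guide.
CFG = Path("/usr/share/nvidia/nvswitch/fabricmanager.cfg")

# FABRIC_MODE=1 selects Shared NVSwitch multitenancy mode.
lines = [
    "FABRIC_MODE=1" if line.startswith("FABRIC_MODE=") else line
    for line in CFG.read_text().splitlines()
]
CFG.write_text("\n".join(lines) + "\n")

# The nvidia-fabricmanager service must be restarted for the mode to apply,
# e.g. `systemctl restart nvidia-fabricmanager`.
```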