Hasty Briefsbeta

Bilingual

LithOS: An Operating System for Efficient Machine Learning on GPUs

a year ago
  • #GPU
  • #Machine Learning
  • #Operating System
  • LithOS is introduced as an operating system designed for efficient machine learning on GPUs.
  • It features a TPC Scheduler for spatial scheduling at individual TPC granularity.
  • Includes transparent kernel atomization to reduce head-of-line blocking.
  • Offers lightweight hardware right-sizing to determine minimal TPC resources per atom.
  • Implements transparent power management to reduce consumption based on workload behavior.
  • LithOS is implemented in Rust and shows significant improvements in GPU efficiency.
  • Reduces tail latencies by up to 13x compared to NVIDIA's MPS in inference stacking.
  • Improves aggregate throughput by 1.6x over state-of-the-art solutions.
  • Provides quarter GPU capacity savings with under 4% performance hit via right-sizing.
  • Achieves quarter GPU energy savings with a 7% performance hit through power management.