This 2025 CMU paper introduces **LithOS**, a novel operating system designed to improve the efficiency and utilization of Graphics Processing Units (GPUs) for machine learning (ML) workloads in data centers. The authors argue that current GPU management solutions, such as NVIDIA's MPS and MIG, are too coarse-grained, leading to low utilization and high latency in multi-tenant environments. LithOS proposes a transparent, OS-level approach featuring a **TPC Scheduler** for fine-grained resource control, a **Kernel Atomizer** that breaks up monolithic kernels to reduce head-of-line blocking, and mechanisms for **hardware right-sizing** and **transparent power management** (DVFS). Evaluation results demonstrate that LithOS significantly reduces tail latencies (up to 13× compared to MPS) and improves aggregate throughput in both inference-only and hybrid inference/training scenarios while achieving substantial capacity and energy savings. Overall, the work establishes a foundation for developing true operating systems for GPUs to address the growing efficiency crisis in ML infrastructure.
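To make the head-of-line blocking point concrete, here is a minimal sketch of the idea behind atomizing a kernel. It is not the paper's implementation. It assumes a kernel is a 1-D grid of thread blocks, that an "atom" is a contiguous range of those blocks, and that the device runs queued work in order without preemption. The names `Atom`, `atomize`, and `blocks_before_start`, and the atom size of 100 blocks, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Atom:
    """A contiguous slice of a kernel's thread blocks (hypothetical model)."""
    kernel: str
    first_block: int
    num_blocks: int


def atomize(kernel: str, grid_blocks: int, atom_blocks: int) -> List[Atom]:
    """Split a grid of `grid_blocks` thread blocks into atoms of at most `atom_blocks`."""
    if grid_blocks < 0 or atom_blocks <= 0:
        raise ValueError("grid_blocks must be >= 0 and atom_blocks must be > 0")
    return [
        Atom(kernel, start, min(atom_blocks, grid_blocks - start))
        for start in range(0, grid_blocks, atom_blocks)
    ]


def blocks_before_start(queue: List[Atom], kernel: str) -> int:
    """Count thread blocks queued ahead of the first atom of `kernel`.

    Assumes strictly in-order, non-preemptive execution of the queue.
    """
    waited = 0
    for atom in queue:
        if atom.kernel == kernel:
            return waited
        waited += atom.num_blocks
    raise ValueError(f"kernel {kernel!r} not in queue")


# Monolithic launch: the small inference kernel waits behind all 1000 training blocks.
monolithic = atomize("train", 1000, 1000) + atomize("infer", 10, 10)

# Atomized launch: a scheduler can slot the inference kernel in after one training atom.
train_atoms = atomize("train", 1000, 100)
interleaved = train_atoms[:1] + atomize("infer", 10, 10) + train_atoms[1:]

print(blocks_before_start(monolithic, "infer"))   # 1000
print(blocks_before_start(interleaved, "infer"))  # 100
```

In this toy model the total work is unchanged; only the granularity at which a scheduler can reorder it differs, which is why the short kernel's wait drops.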
Source:
https://www.cs.cmu.edu/~dskarlat/publications/lithos_sosp25.pdf