Hasty Briefs (beta)

Light Sleep: Waking VMs in 200ms with eBPF and snapshots

8 days ago
  • #virtualization
  • #serverless
  • #eBPF
  • Koyeb introduced Light Sleep to reduce cold starts to around 200ms for CPU workloads.
  • Transitioned from Firecracker to Cloud Hypervisor for broader hardware support, including GPUs.
  • Integrated Kata Containers for flexibility in swapping between different VMM backends.
  • Implemented snapshotting with pause_with_snapshot and resume_from_snapshot endpoints.
  • Encountered and resolved issues with virtio-fs and network restoration during snapshotting.
  • Used eBPF for kernel-level idle detection and to ignore health check traffic.
  • Developed scaletozero-agent to monitor and manage VM sleep and wake cycles.
  • Proxied health checks to prevent Nomad from restarting paused services.
  • Achieved near-instant wakeups by leveraging TCP retries and eBPF signaling.
  • Plans to extend snapshotting to GPU-based services, addressing VRAM preservation challenges.
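
The idle-detection and wake-up behavior summarized above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not Koyeb's implementation: the summary says the real detection runs in the kernel via eBPF, while `IdleTracker`, `Packet`, the `is_health_check` flag, and the 60-second timeout here are hypothetical names and values chosen for illustration. The state changes merely stand in for the `pause_with_snapshot` and `resume_from_snapshot` calls mentioned in the summary.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    RUNNING = "running"
    SLEEPING = "sleeping"


@dataclass
class Packet:
    timestamp: float       # seconds on a monotonic clock
    is_health_check: bool  # hypothetical flag; the real classification is done in eBPF


class IdleTracker:
    """Toy model of a scale-to-zero agent's sleep/wake decision."""

    def __init__(self, idle_timeout: float, started_at: float = 0.0) -> None:
        self.idle_timeout = idle_timeout  # assumed value, not taken from the source
        self.last_activity = started_at
        self.state = State.RUNNING

    def observe(self, packet: Packet) -> None:
        # Health checks must not keep the VM awake, nor wake it up.
        if packet.is_health_check:
            return
        self.last_activity = max(self.last_activity, packet.timestamp)
        if self.state is State.SLEEPING:
            # Stand-in for resume_from_snapshot.
            self.state = State.RUNNING

    def tick(self, now: float) -> State:
        # Put the VM to sleep once no real traffic has been seen for idle_timeout.
        idle_for = now - self.last_activity
        if self.state is State.RUNNING and idle_for >= self.idle_timeout:
            # Stand-in for pause_with_snapshot.
            self.state = State.SLEEPING
        return self.state


tracker = IdleTracker(idle_timeout=60.0)
tracker.observe(Packet(timestamp=10.0, is_health_check=False))  # real request
tracker.observe(Packet(timestamp=65.0, is_health_check=True))   # ignored
state_after_idle = tracker.tick(now=70.0)                       # 60s since last real request
tracker.observe(Packet(timestamp=71.0, is_health_check=True))   # does not wake the VM
state_after_probe = tracker.state
tracker.observe(Packet(timestamp=72.0, is_health_check=False))  # real request wakes it
state_after_request = tracker.state
```

In this model the health check at 65 seconds does not reset the idle clock, so the tracker goes to sleep at 70 seconds and only a non-health-check packet wakes it again. The model leaves out the client side of a wake-up, which the summary attributes to TCP retries.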