A tale about fixing eBPF spinlock issues in the Linux kernel

6 hours ago
  • #eBPF
  • #Spinlocks
  • #Linux Kernel
  • Superluminal, a CPU profiler, encountered periodic system freezes on Linux during captures.
  • The issue was traced to a complex interaction between eBPF programs handling sampling and context switch events.
  • Debugging revealed a recursive spinlock acquisition scenario involving non-maskable interrupts (NMIs).
  • The resilient queued spinlock (rqspinlock) in the Linux kernel had a race condition during deadlock detection.
  • Fixes involved reordering lock acquisition steps and improving deadlock checks to handle NMIs correctly.
  • The issue affects only kernels 6.15 and later, where eBPF ring buffers began using rqspinlock.
  • Patches were backported to kernels 6.17 and 6.18, with a workaround implemented for older affected kernels.
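
The summary does not say which steps were reordered, so the following is only a user-space sketch of one plausible reading. It assumes that deadlock detection relies on a per-CPU table of held locks, and that the race is an NMI arriving after the lock is taken but before it is recorded in that table. All names here (`toy_lock`, `record_held`, `nested_attempt`, `lock_with_interrupt`) are hypothetical and are not kernel APIs; the NMI is simulated by a plain function call placed in the window between the two steps.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 4

/* Toy lock: 1 when taken, 0 when free. Not the kernel's rqspinlock. */
struct toy_lock { int locked; };

/* Hypothetical stand-in for a per-CPU table of locks this CPU holds. */
static struct toy_lock *held[MAX_HELD];
static int held_cnt;

enum nested_result {
    NESTED_ACQUIRED,          /* lock was free, nested context could take it */
    NESTED_DEADLOCK_DETECTED, /* table shows this CPU already owns the lock */
    NESTED_WOULD_SPIN         /* lock is taken but the table has no entry */
};

static void record_held(struct toy_lock *l)
{
    if (held_cnt < MAX_HELD)
        held[held_cnt++] = l;
}

static bool is_recorded(const struct toy_lock *l)
{
    for (int i = 0; i < held_cnt; i++)
        if (held[i] == l)
            return true;
    return false;
}

/* What a nested context (the simulated NMI) observes for the same lock. */
static enum nested_result nested_attempt(const struct toy_lock *l)
{
    if (is_recorded(l))
        return NESTED_DEADLOCK_DETECTED; /* can fail fast with an error */
    if (l->locked)
        return NESTED_WOULD_SPIN;        /* would wait on its own CPU */
    return NESTED_ACQUIRED;
}

/*
 * Take the lock in one of two orders and fire the simulated NMI in the
 * window between the two steps. Returns what the nested attempt saw.
 */
static enum nested_result lock_with_interrupt(struct toy_lock *l,
                                              bool record_first)
{
    enum nested_result seen;

    /* Start from a clean state so calls are independent. */
    held_cnt = 0;
    l->locked = 0;

    if (record_first) {
        record_held(l);           /* step 1: announce intent */
        seen = nested_attempt(l); /* simulated NMI lands here */
        l->locked = 1;            /* step 2: take the lock */
    } else {
        l->locked = 1;            /* step 1: take the lock */
        seen = nested_attempt(l); /* simulated NMI lands here */
        record_held(l);           /* step 2: record it, too late */
    }
    return seen;
}
```

Calling `lock_with_interrupt(&l, false)` returns `NESTED_WOULD_SPIN`: the nested context sees a taken lock with no table entry, so under this model it would wait on a lock that its own CPU cannot release until the nested context returns. Calling `lock_with_interrupt(&l, true)` returns `NESTED_DEADLOCK_DETECTED`, because the table entry is already visible when the interrupt arrives. The cost of recording first is that the nested context may report a deadlock for a lock that has not actually been taken yet, which is safe but conservative.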