The fastest Linux timestamps
16 hours ago
- #Linux Timestamps
- #Performance Optimization
- #Low-Latency Systems
- The author optimized timestamping for distributed tracing in a low-latency pipeline, working within a budget of 50-100 ns per span.
- Initial benchmarks of standard C++ clocks showed overheads of 46-49 ns, consuming most of the time budget.
- The TSC (timestamp counter) on x86 provides fast, invariant timestamps, but requires serialization and cycle-to-nanosecond conversion.
- The vDSO allows efficient clock_gettime() calls by reading kernel-shared data, but still involves multiple TSC reads and conversions.
- Custom timers like TscTimer and VdsoTimer reduced median latency by up to 57%, with VdsoTimer achieving 20.5 ns.
- Kernel updates to the vDSO data page cause tail-latency spikes exceeding 200 ns, caused by cache misses or seqlock spins.
- Caching timer data (e.g., VdsoCacheTimer, TscCacheTimer) eliminated tail latencies, offering stable performance around 20 ns.
- TscCacheTimer is preferred for portability, while VdsoCacheTimer offers better accuracy by tracking kernel frequency adjustments.
- The main takeaway is that predictable latency requires understanding the abstractions underneath a clock call, not just reading average benchmark numbers.
- Benchmarks were conducted on an Intel Core i7-8565U with isolated cores and performance tuning for consistency.
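
The cached-TSC idea summarized above can be illustrated with a minimal sketch. This is not the author's TscCacheTimer: the names `TscCalibration`, `read_counter`, `calibrate`, and `cycles_to_ns` are hypothetical, the 24-bit shift and 10 ms calibration window are arbitrary choices, and the `unsigned __int128` type assumes GCC or Clang. The sketch calibrates the counter once against `clock_gettime(CLOCK_MONOTONIC)`, caches a multiplier and shift, and then converts each later counter read with one multiply and one shift, without touching the vDSO data page on the hot path. Because the calibration is never refreshed, it does not follow later kernel frequency adjustments.

```cpp
#include <cstdint>
#include <ctime>
#if defined(__x86_64__)
#include <x86intrin.h>
#endif

// Hypothetical cached calibration: ns = base_ns + ((cycles - base_cycles) * mult >> shift)
struct TscCalibration {
    uint64_t base_cycles;  // counter value captured at calibration time
    uint64_t base_ns;      // CLOCK_MONOTONIC nanoseconds captured at the same moment
    uint64_t mult;         // fixed-point nanoseconds per cycle, scaled by 2^shift
    uint32_t shift;        // fixed-point scale
};

inline uint64_t monotonic_ns() {
    timespec ts{};
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<uint64_t>(ts.tv_sec) * 1000000000ull +
           static_cast<uint64_t>(ts.tv_nsec);
}

inline uint64_t read_counter() {
#if defined(__x86_64__)
    _mm_lfence();      // keep RDTSC from executing ahead of earlier instructions
    return __rdtsc();
#else
    // Assumption: on non-x86 targets, fall back to the monotonic clock so the
    // sketch still compiles; this path gains nothing over clock_gettime().
    return monotonic_ns();
#endif
}

// One-time calibration: sample the clock and the counter twice, sleep_ns apart.
inline TscCalibration calibrate(long sleep_ns = 10000000) {
    const uint64_t ns0 = monotonic_ns();
    const uint64_t c0 = read_counter();
    timespec req{0, sleep_ns};
    nanosleep(&req, nullptr);
    const uint64_t ns1 = monotonic_ns();
    const uint64_t c1 = read_counter();

    const uint64_t delta_cycles = (c1 != c0) ? (c1 - c0) : 1;  // avoid division by zero
    TscCalibration cal{};
    cal.base_cycles = c1;
    cal.base_ns = ns1;
    cal.shift = 24;
    cal.mult = ((ns1 - ns0) << cal.shift) / delta_cycles;
    return cal;
}

// Hot path: pure arithmetic on cached values; the 128-bit product avoids
// overflow for large cycle deltas.
inline uint64_t cycles_to_ns(const TscCalibration& cal, uint64_t cycles) {
    __extension__ typedef unsigned __int128 u128;
    const uint64_t delta = cycles - cal.base_cycles;
    return cal.base_ns + static_cast<uint64_t>((static_cast<u128>(delta) * cal.mult) >> cal.shift);
}
```

A caller would run `calibrate()` once at startup, keep the result in a thread-local or read-only global, and stamp each span with `cycles_to_ns(cal, read_counter())`. Periodic recalibration off the hot path would be one way to limit drift, at the cost of reintroducing shared mutable state.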