The Limits of NTP Accuracy on Linux
- #Time Synchronization
- #Linux
- #NTP
- The article explores the limits of NTP (Network Time Protocol) accuracy on Linux systems, aiming for synchronization within microseconds.
- GPS receivers exhibit inherent inaccuracies: offsets of up to 200 ns were observed between devices, even though datasheets quote jitter of only about 5 ns for high-end modules.
- Network complexity introduces systematic errors of 200–300 ns, with asymmetric paths and varying NIC (network interface card) performance affecting timing accuracy.
- Different NICs show varying suitability for sub-microsecond timing, with Intel E810 and X710 performing well, while Realtek cards are less reliable.
- Linux itself can introduce significant timing error: stalls caused by SMIs (System Management Interrupts) or power management can pause the system for hundreds of microseconds.
- The test setup includes multiple GPS-backed NTP servers and eight identical Linux servers, with Chrony used for time synchronization and metrics stored in Prometheus.
- Network topology, including redundant links and ECMP (Equal-Cost Multi-Path) routing, contributes to timing inconsistencies due to asymmetric traffic paths.
- Chrony reports median offsets of 25–110 ns, but real-world measurements show larger discrepancies driven by GPS drift and network asymmetry.
- Cross-server synchronization tests reveal offsets up to 207 ns between servers, highlighting limitations in achieving nanosecond-level accuracy with current hardware.
- Adjustments to TAI (International Atomic Time) offsets were necessary to align NTP sources, reducing inconsistencies from microseconds to nanoseconds.
- Jitter measurements show that desktop systems with high-quality NICs and GNSS modules perform best (~1.01 μs), while Raspberry Pi-based NTP servers exhibit higher jitter (~2.02 μs).
- Factors degrading time synchronization include NICs without hardware timestamps, network tunnels, packet coalescing, and software processing delays.
- The author concludes that achieving ~10 ns accuracy is impractical with current hardware, but sub-microsecond synchronization (~200–500 ns) is feasible across the network.
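
A GPS-backed stratum-1 setup with NIC hardware timestamping, as described in the summary, is typically expressed in `chrony.conf` along these lines; the device paths and interface name below are placeholders, not details from the article:

```conf
# PPS pulse from the GNSS module: precise, but only marks second boundaries,
# so it is locked to the NMEA source for second numbering.
refclock PPS /dev/pps0 lock NMEA refid PPS
# NMEA time via gpsd shared memory: noisy, so excluded from selection.
refclock SHM 0 offset 0.5 delay 0.2 refid NMEA noselect
# Enable NIC hardware timestamping (requires driver support,
# e.g. the Intel E810/X710 cards noted above).
hwtimestamp eth0
```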
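
The asymmetric-path errors mentioned above follow directly from how NTP computes offset: it assumes the outbound and return path delays are equal, so any asymmetry is silently folded into the clock offset. A minimal sketch with invented timestamps (not figures from the article):

```python
# Classic NTP offset/delay calculation from the four timestamps of one
# client-server exchange (RFC 5905 notation):
#   t1 = client transmit, t2 = server receive,
#   t3 = server transmit, t4 = client receive.
def ntp_offset_delay(t1, t2, t3, t4):
    offset = ((t2 - t1) + (t3 - t4)) / 2  # assumes symmetric path delay
    delay = (t4 - t1) - (t3 - t2)         # round trip minus server hold time
    return offset, delay

# Hypothetical exchange where both clocks agree perfectly, but the return
# path is 400 ns slower than the outbound path: half the asymmetry (200 ns)
# is misattributed to clock offset, on the order of the 200-300 ns
# systematic errors described above.
t1 = 0.0
t2 = 10_000e-9            # outbound one-way delay: 10.0 us
t3 = t2 + 100e-9          # server processing: 100 ns
t4 = t3 + 10_400e-9       # return one-way delay: 10.4 us

offset, delay = ntp_offset_delay(t1, t2, t3, t4)
print(f"apparent offset: {offset * 1e9:.0f} ns")   # nonzero despite equal clocks
print(f"round-trip delay: {delay * 1e9:.0f} ns")
```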
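
The TAI adjustment mentioned above reduces to the accumulated leap-second offset between TAI and UTC, which has been 37 s since the leap second of January 2017. A minimal illustration (the function name is mine, and the constant must track any future leap seconds):

```python
from datetime import datetime, timedelta, timezone

# TAI-UTC offset: 37 s since 2017-01-01; no leap seconds have been
# added since, but this constant is not future-proof.
TAI_UTC_OFFSET = timedelta(seconds=37)

def utc_to_tai(utc_dt: datetime) -> datetime:
    """Convert a UTC timestamp to TAI by adding the leap-second offset."""
    return utc_dt + TAI_UTC_OFFSET

utc = datetime(2024, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
print(utc_to_tai(utc).time())  # 12:00:37
```

A source publishing TAI instead of UTC (or vice versa) without this correction appears to be off by a full 37 seconds, which is why aligning the offsets collapsed the observed inconsistencies from microseconds down to nanoseconds.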