Hasty Briefsbeta

Bilingual

How much slower is random access, really?

10 months ago
  • #memory-access
  • #benchmarking
  • #performance
  • Random access performance is significantly impacted by cache locality, with first-to-last order being faster than random order.
  • Performance differences become noticeable when arrays exceed the L3 cache size (around 1 million elements).
  • First-to-last order averages about 1 nanosecond per element on a MacBook and 0.5 nanoseconds on a Linux desktop.
  • Random order is 4x slower on MacBook and 8-16x slower on Linux for arrays fitting in RAM but exceeding L3 cache.
  • For arrays larger than RAM, performance degrades sharply, with random order becoming 50x slower on Linux.
  • Fisher-Yates shuffle is inefficient for large datasets; a two-pass shuffle is recommended.
  • Memory-mapped files do not significantly improve performance for large datasets, with macOS and Linux showing different behaviors.
  • Direct file reading can be more efficient than memory-mapping, especially on Linux.