Hasty Briefs (beta)

Memory is slow, Disk is fast – Part 2

6 days ago
  • #hardware
  • #optimization
  • #performance
  • Sourcing data directly from disk can be faster than caching in memory due to hardware scaling trends.
  • Disk bandwidth is growing exponentially, while memory access latency has stagnated, challenging traditional caching dogma.
  • An experiment counting occurrences of the number 10 in a dataset shows that optimized disk reads can outperform memory access.
  • Vectorized instructions and loop unrolling significantly improve processing speed by leveraging CPU capabilities.
  • Reading directly from disk with io_uring and an optimized processing pipeline can beat mmap(), since it avoids the page-fault and memory-latency overhead that mmap() incurs.
  • Latency-bound memory access patterns, such as the page faults behind mmap(), can bottleneck performance even when raw memory bandwidth exceeds disk bandwidth.
  • Scaling performance requires streaming data efficiently, whether from disk or memory, to leverage bandwidth over latency.
  • Modern hardware trends suggest that traditional latency-oriented access patterns leave much of the available bandwidth unused, motivating bandwidth-oriented designs.
  • The experiment demonstrates that with careful optimization, disk-based solutions can match or exceed in-memory performance for certain workloads.
  • Future hardware advancements may further blur the lines between memory and disk performance, emphasizing the need for adaptive strategies.