Hashed sorting is typically faster than hash tables
2 days ago
- #algorithms
- #optimization
- #performance
- Hashed sorting is typically faster than hash tables for counting unique values in large arrays of mostly-unique uint64s.
- Benchmark results show that tuned sorting outperforms hash tables by ~1.5× on non-tiny sizes and up to 4× faster than Rust's standard library hash tables.
- Sorting wins due to more efficient memory bandwidth usage, with radix sort making productive use of cache lines compared to hash tables.
- Radix sort's performance can degrade on non-uniform data distributions, but using an invertible hash function (like MulSwapMul) mitigates this issue.
- Hashed radix sort is viable where batching is possible, outperforming hash tables when keys are accessed a few times, but hash tables win with many repeat accesses per key.
- Parallel implementations of both radix sort and hash tables exist, with radix sort showing good parallel performance in benchmarks.
- Tuning techniques for radix sort include diverting LSD sort, fused hashing and histogramming passes, and inlined insertion sort for small buckets.
- Hash table tuning involves optimizing for specific use cases, such as using a single table instead of metadata+data tables and employing prefetching instructions.