Optimization of 32-bit Unsigned Division by Constants on 64-bit Targets
a day ago
- #64-bit-performance
- #compiler-optimization
- #division-by-constants
- Proposes a method for optimizing 32-bit unsigned division by constants on 64-bit CPUs, improving on the existing Granlund-Montgomery (GM) method.
- Highlights a limitation of current compiler-generated code (e.g., for x/7): it does not take full advantage of 64-bit capabilities.
- Reports performance improvements: 1.67x speedup on Intel Xeon w9-3495X and 1.98x on Apple M4 in microbenchmarks.
- Notes that patches implementing the method exist for LLVM and GCC; the LLVM patch has already been merged into the main branch.
- Contextualizes work within existing literature, citing contributions from Granlund, Montgomery, and others.
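
To make the x/7 case concrete, here is a minimal sketch in C. It compares the multiply-and-fixup sequence compilers typically emit for 32-bit division by 7 with a single 64-bit high multiply. The summary does not spell out the proposed method, so the second function is an assumption about its general shape, not the authors' exact code. The function names are hypothetical, and `unsigned __int128` is a GCC/Clang extension.

```c
#include <stdint.h>

/* Classic GM-style sequence for x / 7 using 32-bit pieces.
 * The exact magic constant ceil(2^35 / 7) = 0x124924925 needs 33 bits,
 * so only its low 32 bits are multiplied and the implicit top bit is
 * restored with a subtract, shift, and add fixup. */
static inline uint32_t div7_gm32(uint32_t x) {
    uint32_t t = (uint32_t)(((uint64_t)x * UINT32_C(0x24924925)) >> 32);
    return (((x - t) >> 1) + t) >> 2;
}

/* Assumed 64-bit variant: pre-shift the 33-bit magic left by 64 - 35 = 29
 * bits so that the high 64 bits of the 64x64 product equal
 * (x * 0x124924925) >> 35, with no fixup steps. */
static inline uint32_t div7_mulhi64(uint32_t x) {
    const uint64_t magic = UINT64_C(0x124924925) << 29; /* fits in 62 bits */
    return (uint32_t)(((unsigned __int128)x * magic) >> 64);
}
```

Both functions agree with `x / 7` for every 32-bit `x`, because 7 * 0x124924925 = 2^35 + 3 and the resulting error term 3 * x stays below 2^35. On a 64-bit target the second form can reduce to one multiply that yields the high half of the product, which is the kind of saving the bullets above describe.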