Binary GCD
10 hours ago
- #optimization
- #algorithms
- #gcd
- Euclid's algorithm is a method for finding the GCD of two numbers, based on a recursive formula.
- The standard implementation involves integer division, which is slow, taking about 90% of the execution time in assembly.
- The binary GCD algorithm uses shifts, comparisons, and subtractions instead of division, making it potentially faster.
- Initial implementation of binary GCD is inefficient due to excessive branching, so optimizations are introduced.
- Optimizations include using __builtin_ctz for power-of-two divisions, minimizing iterations, and reducing branching.
- Final optimized version runs in 91ns, almost 2x faster than std::gcd, which takes 198ns.
- Further micro-optimizations are possible, such as manual assembly tweaks or lookup tables, to reduce time further.