Building Your Own Efficient uint128 in C++
3 days ago
- #Optimization
- #C++
- #Fixed-width Arithmetic
- Implementation of a fixed-width uint128 in C++ using two u64 limbs for efficient arithmetic operations.
- Use of intrinsics like _addcarry_u64, _subborrow_u64, and _mulx_u64 for direct mapping to x64 instructions.
- Comparison of generated assembly with builtin __uint128_t showing identical performance for basic operations.
- Focus on unsigned arithmetic, fixed width, and modern x64 architecture with notes on MSVC and AArch64.
- Explanation of multiplication strategy using base \(2^{64}\) digits and discarding unnecessary high bits.
- Optimized comparison using hardware borrow flag for minimal branching.
- Extension possibilities to larger integer types and signed variants.
- Platform-specific notes on PowerPC, GCC codegen quirks, and the impracticality of division for large integers.
- Mention of _BitInt(N) type with performance caveats for widths greater than 128 bits.