Building Your Own Efficient uint128 in C++

3 days ago

Copy Link

Implementation of a fixed-width uint128 in C++ using two u64 limbs for efficient arithmetic operations.
Use of intrinsics like _addcarry_u64, _subborrow_u64, and _mulx_u64 for direct mapping to x64 instructions.
Comparison of generated assembly with builtin __uint128_t showing identical performance for basic operations.
Focus on unsigned arithmetic, fixed width, and modern x64 architecture with notes on MSVC and AArch64.
Explanation of multiplication strategy using base \(2^{64}\) digits and discarding unnecessary high bits.
Optimized comparison using hardware borrow flag for minimal branching.
Extension possibilities to larger integer types and signed variants.
Platform-specific notes on PowerPC, GCC codegen quirks, and the impracticality of division for large integers.
Mention of _BitInt(N) type with performance caveats for widths greater than 128 bits.

Hasty Briefsbeta