Bijou64: A variable-length integer encoding
2 hours ago
- #performance-optimization
- #canonical-data-formats
- #varint-encoding
- The article introduces bijou64, a varint encoding that is canonical by design: each integer has exactly one representation, whereas LEB128 allows multiple encodings of the same number.
- Bijou64 uses a first-byte tag to indicate the encoded length, so a decoder knows the total size in O(1) after reading one byte. This allows up-front allocation and decoding without continuation-bit scanning; the article reports benchmark speedups of 2-10x over LEB128.
- The encoding offsets payloads so that no value has a redundant, longer representation. It is not always the most compact option, but its size is comparable to LEB128 in realistic workloads, and its canonicality is an advantage in security-sensitive applications such as signed data.
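
To make the canonicality problem concrete, here is a minimal sketch of a plain unsigned LEB128 decoder (the function name `leb128_decode` is an illustrative choice, not taken from the article). It shows that a permissive decoder accepts a padded encoding of the same number, so two different byte strings decode to one value.

```python
def leb128_decode(data: bytes) -> int:
    """Decode one unsigned LEB128 integer: 7 payload bits per byte, low bits first."""
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:  # continuation bit clear: this is the last byte
            return value
        shift += 7
    raise ValueError("truncated LEB128 input")


minimal = bytes([0x01])       # the shortest encoding of 1
padded = bytes([0x81, 0x00])  # same value with a redundant zero group

# Both byte strings decode to 1, so a signature or hash over the bytes
# would differ even though the decoded value is identical.
assert leb128_decode(minimal) == leb128_decode(padded) == 1
```

A strict LEB128 decoder can reject padded forms with an extra check, but that is a rule added on top of the format rather than a property of the format itself.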
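
The tag and offset ideas in the summary can be sketched with a toy format. The layout below is an assumption made for illustration only and is not claimed to be the actual bijou64 wire format: first bytes 0 to 247 encode themselves, and first bytes 248 to 255 announce 1 to 8 big-endian payload bytes. Each length class starts counting where the previous one ended, so no value can be written in a longer form. The names `INLINE_MAX`, `OFFSETS`, `toy_encode`, and `toy_decode` are hypothetical.

```python
INLINE_MAX = 248  # toy choice: first bytes 0..247 are the value itself

# OFFSETS[k] is the smallest value stored with k + 1 payload bytes.
# Each class begins right after the previous class ends.
OFFSETS = [INLINE_MAX]
for n in range(1, 8):
    OFFSETS.append(OFFSETS[-1] + 256 ** n)


def toy_encode(value: int) -> bytes:
    """Encode a u64 in the toy tag-plus-offset layout (not real bijou64)."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("value out of u64 range")
    if value < INLINE_MAX:
        return bytes([value])
    # Pick the length class whose range contains the value.
    k = max(i for i, off in enumerate(OFFSETS) if off <= value)
    payload = (value - OFFSETS[k]).to_bytes(k + 1, "big")
    return bytes([INLINE_MAX + k]) + payload


def toy_decode(data: bytes):
    """Return (value, bytes_consumed); the length comes from the first byte alone."""
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if first < INLINE_MAX:
        return first, 1
    k = first - INLINE_MAX
    length = k + 2  # one tag byte plus k + 1 payload bytes
    if len(data) < length:
        raise ValueError("truncated input")
    value = OFFSETS[k] + int.from_bytes(data[1:length], "big")
    if value >= 2 ** 64:
        raise ValueError("value out of u64 range")
    return value, length


# A zero payload no longer means zero: it means the first value of its class.
assert toy_encode(247) == b"\xf7"
assert toy_encode(248) == b"\xf8\x00"
assert toy_decode(b"\xf8\x00") == (248, 1 + 1)
```

Because the value ranges of the length classes are disjoint, every accepted byte string in this toy layout decodes to a value whose encoding is that same byte string. No separate "reject overlong forms" check is needed.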