Parsing Protobuf Like Never Before
10 months ago
- #Performance
- #Protobuf
- #Go
- The author has worked on high-performance Protobuf projects, including C++ and Rust runtimes and the integration of UPB, which the author describes as the fastest Protobuf runtime.
- hyperpb is a new library that brings UPB optimizations to Go, outperforming existing Go Protobuf parsers like Protobuf Go and vtprotobuf.
- UPB is a dynamic, table-driven Protobuf parser: a VM-like interpreter walks data tables that describe each message, which avoids the drawbacks of traditional generated parsers.
- Because calls through Go's C FFI (cgo) are expensive, hyperpb is written in pure Go to avoid that overhead.
- hyperpb leverages Go's unique features like its register ABI, lack of undefined behavior, and robust reflection system for optimizations.
- The library includes a JIT compiler for Protobuf parsers and can use profiles collected at runtime to re-optimize them (online PGO).
- hyperpb's API is simple and targets read-only use cases such as validation; a message type is compiled up front, much like a call to regexp.Compile.
- Key optimizations in hyperpb include zero-copy strings, repeated preloads, map optimizations, arena reuse, and true oneof unions.
- The parser VM is designed to maximize register usage and minimize stack spills, with a focus on indirect branch prediction for performance.
- Future optimizations could include SIMD for varint parsing, smarter parser scheduling, and inline allocation of small submessages.
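
To make the table-driven and zero-copy ideas above concrete, here is a minimal sketch in Go that uses only the standard library. It is not hyperpb's or UPB's implementation or API: `parserTable`, `fieldEntry`, and `parse` are hypothetical names invented for illustration, and the sketch handles only varint and length-delimited fields. The table stands in for a "compiled" message type, and a single interpreter loop decodes any message the table describes.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unsafe"
)

// fieldKind tells the interpreter loop how to store a field's payload.
type fieldKind uint8

const (
	kindVarint fieldKind = iota
	kindString
)

// fieldEntry is one row of the hypothetical parser table.
type fieldEntry struct {
	name string
	kind fieldKind
}

// parserTable maps field numbers to entries. It is built once per message
// type and then reused for every message of that type.
type parserTable map[uint64]fieldEntry

// parse walks Protobuf wire-format bytes and lets the table, not generated
// code, decide what each field means. Unknown fields are skipped.
func parse(table parserTable, data []byte) (map[string]any, error) {
	out := make(map[string]any)
	for len(data) > 0 {
		tag, n := binary.Uvarint(data)
		if n <= 0 {
			return nil, errors.New("malformed tag")
		}
		data = data[n:]
		num, wire := tag>>3, tag&7
		entry, known := table[num]

		switch wire {
		case 0: // varint
			v, n := binary.Uvarint(data)
			if n <= 0 {
				return nil, errors.New("malformed varint")
			}
			data = data[n:]
			if known && entry.kind == kindVarint {
				out[entry.name] = v
			}
		case 2: // length-delimited
			l, n := binary.Uvarint(data)
			if n <= 0 || l > uint64(len(data)-n) {
				return nil, errors.New("malformed length")
			}
			payload := data[n : n+int(l)]
			data = data[n+int(l):]
			if known && entry.kind == kindString {
				// Zero-copy: the string aliases the input buffer, so the
				// caller must not modify data while the result is in use.
				out[entry.name] = unsafe.String(unsafe.SliceData(payload), len(payload))
			}
		default:
			return nil, fmt.Errorf("unsupported wire type %d", wire)
		}
	}
	return out, nil
}

func main() {
	table := parserTable{
		1: {name: "id", kind: kindVarint},
		2: {name: "name", kind: kindString},
	}
	// Field 1 = 150 (varint), field 2 = "hi" (length-delimited).
	data := []byte{0x08, 0x96, 0x01, 0x12, 0x02, 'h', 'i'}
	msg, err := parse(table, data)
	if err != nil {
		panic(err)
	}
	fmt.Println(msg["id"], msg["name"]) // prints: 150 hi
}
```

Building the table once and reusing it for many messages mirrors the regexp.Compile analogy in the summary. A real runtime would replace the map lookups and `any` boxing used here with compact, cache-friendly tables and typed storage.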