Internals of a Key-Value Store Built on SQLite's B-Tree Engine
a day ago
- #B-tree
- #key-value store
- #SQLite
- SNKV is a key-value store built directly on SQLite's B-tree engine, bypassing SQLite's SQL and VM layers.
- SNKV interacts with SQLite through the kvstore_* public API, with pages flowing through the pager's cache and journal to disk via the VFS abstraction.
- SNKV opens the B-tree via sqlite3BtreeOpen rather than sqlite3_open, which bypasses SQLite's default WAL auto-checkpoint hook.
- SNKV requires manual transaction lifecycle management, as it skips the SQL/VDBE layer which normally handles this automatically.
- Core data structures include KVStore, KVColumnFamily, and KVIterator, which manage B-tree handles, column families, and iteration state respectively.
- Key-value pairs are stored in BLOBKEY B-tree tables. Each entry starts with a 4-byte big-endian prefix holding the key length, followed by the key bytes and then the value bytes.
- Column family metadata is stored in an INTKEY B-tree, with each CF entry identified by an FNV-1a hash of its name.
- SNKV uses a persistent read transaction for the lifetime of an open store, reducing overhead for read operations.
- If no explicit transaction is active, each write operation runs in its own single-operation transaction: it commits the persistent read, begins a write transaction, performs the operation, and then restores the persistent read.
- SNKV employs two-level recursive mutex locking (CF-level and store-level) to protect state and ensure thread safety.
- SNKV has to trigger WAL checkpoints itself, because SQLite's default auto-checkpoint hook is bypassed.
- SNKV provides functions for incremental vacuuming, which reclaims space from freed pages and requires an exclusive write transaction.
- The complete call chain for a kvstore_put operation traverses all four layers of the stack: kvstore.c, btree.c, pager.c, and os.c.
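
The record layout described above (4-byte big-endian key length, then key bytes, then value bytes) can be sketched in a few lines. This is only an illustration of the byte layout: the function names `encode_record` and `decode_record` are hypothetical, and how SNKV maps these bytes onto a B-tree cell is not covered here.

```python
import struct


def encode_record(key: bytes, value: bytes) -> bytes:
    """Pack a record as: 4-byte big-endian key length | key bytes | value bytes."""
    return struct.pack(">I", len(key)) + key + value


def decode_record(record: bytes) -> tuple[bytes, bytes]:
    """Split a packed record back into (key, value) using the length prefix."""
    if len(record) < 4:
        raise ValueError("record is shorter than the 4-byte length prefix")
    (key_len,) = struct.unpack_from(">I", record, 0)
    if 4 + key_len > len(record):
        raise ValueError("key length prefix exceeds the record size")
    key = record[4:4 + key_len]
    value = record[4 + key_len:]  # everything after the key is the value
    return key, value


if __name__ == "__main__":
    packed = encode_record(b"user:42", b"alice")
    print(packed)
    print(decode_record(packed))
```

Because the prefix gives the key length explicitly, the value needs no length field of its own: it is whatever remains after the key.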
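
To make the column family lookup concrete, here is a minimal FNV-1a sketch that turns a column family name into an integer usable as an INTKEY row key. The summary does not say which hash width SNKV uses, so the 32-bit variant is an assumption, and `cf_meta_key` is a hypothetical helper name. How hash collisions between two names are handled is also not specified.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF  # keep the result within 32 bits
    return h


def cf_meta_key(name: str) -> int:
    """Hypothetical helper: integer key for a column family's metadata entry."""
    return fnv1a_32(name.encode("utf-8"))


if __name__ == "__main__":
    print(hex(cf_meta_key("default")))
```

The XOR-then-multiply order is what distinguishes FNV-1a from FNV-1, which multiplies first.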
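
The transaction lifecycle described above can be modeled with a toy class. This is not SNKV's code: `ToyStore`, its methods, and the step names in `log` are all hypothetical, a dictionary stands in for the B-tree, and a single `threading.RLock` stands in for the store-level recursive mutex. The sketch only shows the order of steps, including the commit of the write transaction that a single-operation write implies.

```python
import threading


class ToyStore:
    """Toy model of the read/write transaction hand-off; not SNKV's implementation."""

    def __init__(self):
        self._lock = threading.RLock()  # stand-in for the store-level recursive mutex
        self._data = {}                 # stand-in for the B-tree
        self.txn = "read"               # a read transaction is held while the store is open
        self.explicit = False           # True between begin() and commit()
        self.log = []                   # ordered record of lifecycle steps

    def begin(self):
        with self._lock:
            self.log += ["end read", "begin write"]
            self.txn = "write"
            self.explicit = True

    def commit(self):
        with self._lock:
            self.log += ["commit write", "begin read"]
            self.txn = "read"
            self.explicit = False

    def put(self, key, value):
        with self._lock:
            auto = not self.explicit
            if auto:
                self.begin()   # re-acquires the lock, hence the recursive mutex
            self._data[key] = value
            self.log.append("put")
            if auto:
                self.commit()  # a real store would roll back on failure instead

    def get(self, key):
        with self._lock:
            return self._data.get(key)  # served under the persistent read


if __name__ == "__main__":
    store = ToyStore()
    store.put(b"k", b"v")
    print(store.log)
```

Calling `put` inside an explicit `begin()`/`commit()` pair skips the hand-off, so a batch of writes pays for the read-to-write switch only once.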