Removing fsync from our local storage engine
2 days ago
- #storage-engine
- #SSD-performance
- #fsync-optimization
- Designed a single-node KV storage engine eliminating fsync for PUT and DELETE operations to improve performance.
- Leveraged fixed-size pre-allocation, O_DIRECT writes, and 4KB atomic journal commits for crash consistency without fsync.
- Achieved ~191k obj/s on 4KB random writes, about 1.6x the throughput of ext4 using O_DIRECT plus fsync.
- Targeted SSDs only, offering per-operation atomicity behind a simple KV API, so the engine is not suited to HDDs or general-purpose use.
- Used an in-memory index (Fractal ART), a journal for crash recovery, and engine-controlled data area allocation.
- Demonstrated 34% higher throughput and 33% lower latency in end-to-end tests compared to a MinIO-like engine.
- Acknowledged limitations including dependency on SSD durability contracts and lack of general transaction support.
- Avoided filesystem metadata updates on the write path by pre-zeroing the file after fallocate, which kept tail latency stable.
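
The journal idea in the list above (one 4KB commit per operation, replayed after a crash to rebuild the in-memory index) can be sketched roughly as follows. This is an illustration only, not the engine's actual format: the record layout, the CRC32 checksum, the sequence numbering, and the names `encode_record`, `decode_record`, and `replay` are all assumptions made for the example, and a plain dict stands in for the index.

```python
import struct
import zlib

BLOCK = 4096  # one journal commit occupies exactly one 4 KB block

# Hypothetical record layout (not taken from the post):
# sequence number, opcode, key length, data-area offset, data length.
FIELDS = struct.Struct("<QBHQI")
OP_PUT, OP_DELETE = 1, 2


def encode_record(seq, op, key, data_off=0, data_len=0):
    """Pack one operation into a checksummed 4 KB journal block."""
    body = FIELDS.pack(seq, op, len(key), data_off, data_len) + key
    if len(body) > BLOCK - 4:
        raise ValueError("record does not fit in one journal block")
    body = body.ljust(BLOCK - 4, b"\0")
    # The CRC covers the whole padded body, so a torn block fails to verify.
    return struct.pack("<I", zlib.crc32(body)) + body


def decode_record(block):
    """Return (seq, op, key, data_off, data_len), or None if the block is invalid."""
    if len(block) != BLOCK:
        return None
    (crc,) = struct.unpack_from("<I", block, 0)
    body = block[4:]
    if zlib.crc32(body) != crc:
        return None
    seq, op, key_len, data_off, data_len = FIELDS.unpack_from(body, 0)
    key = bytes(body[FIELDS.size:FIELDS.size + key_len])
    return seq, op, key, data_off, data_len


def replay(journal):
    """Rebuild a key -> (offset, length) map, stopping at the first bad block."""
    index = {}
    expected_seq = 1
    for off in range(0, len(journal) - BLOCK + 1, BLOCK):
        rec = decode_record(journal[off:off + BLOCK])
        if rec is None or rec[0] != expected_seq:
            break  # torn or never-written block: everything after it is ignored
        _, op, key, data_off, data_len = rec
        if op == OP_PUT:
            index[key] = (data_off, data_len)
        elif op == OP_DELETE:
            index.pop(key, None)
        expected_seq += 1
    return index
```

In this sketch an operation counts as committed only if its whole block verifies, which mirrors the operation-atomic behavior described above. Whether the device really persists a 4KB write atomically is the SSD durability contract the summary lists as a limitation.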
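
The pre-allocation step from the list above could look something like the sketch below: reserve the data file once with fallocate, overwrite it with zeros so the blocks are already written, and afterwards only overwrite fixed-size slots in place. The function names, the chunk size, and the one-time fsync at setup are assumptions for illustration. The sketch also leaves out O_DIRECT, which in Python would need page-aligned buffers (for example from `mmap`).

```python
import os

BLOCK = 4096  # fixed slot size, matching the 4 KB objects in the summary


def preallocate_zeroed(path, size, chunk=1 << 20):
    """Reserve `size` bytes for the data area, then pre-zero every block."""
    if size % BLOCK:
        raise ValueError("size must be a multiple of the block size")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)  # reserve space up front
        zeros = bytes(chunk)
        off = 0
        while off < size:
            n = min(chunk, size - off)
            off += os.pwrite(fd, zeros[:n], off)  # write real zeros over the reservation
        # Assumption: a single fsync at setup time, never on the PUT path.
        os.fsync(fd)
    finally:
        os.close(fd)


def write_slot(fd, slot, data):
    """Overwrite one pre-allocated slot in place; the file size never changes."""
    if len(data) != BLOCK:
        raise ValueError("data must be exactly one block")
    if os.pwrite(fd, data, slot * BLOCK) != BLOCK:
        raise OSError("short write")
```

Because every slot already exists and has been written once, a later `write_slot` call does not grow the file or allocate new blocks, which is the property the last bullet credits for stable tail latency.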