Exposing ZFS volumes over the network via NVMe-oF
19 hours ago
- #Postgres
- #NVMe-oF
- #distributed-storage
- Xatastor is a distributed storage system designed to support millions of Postgres databases and branches, focusing on cost efficiency and scalability.
- It targets the problem of managing a very large number of mostly idle volumes, whereas SPDK-based solutions are optimized for a smaller number of busy volumes.
- Xatastor uses ZFS zvols for block storage, leveraging features like snapshots, clones, thin provisioning, and compression.
- The system includes a custom user-space implementation of NVMe over Fabrics (NVMe-oF), written in Rust and built on monoio and io_uring for high performance.
- A Kubernetes operator serves as the control plane, using CRDs for volume management and integrating with standard Kubernetes APIs.
- Redundancy is handled at the Postgres level with read replicas rather than through complex storage-layer replication, which simplifies the system and reduces costs.
- Performance benchmarks show Xatastor matches SPDK-based solutions while using fewer hardware resources, with compression offering significant cost savings.
- The system is tailored for use cases like database-per-tenant architectures, coding agents, and platforms requiring scale-to-zero capabilities.
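
The ZFS features listed above (thin provisioning, compression, snapshots, clones) map onto a small set of standard `zfs` commands. The sketch below only builds those command lines and prints them; it does not run them. The dataset names (`tank/pg/main`, `tank/pg/feature-x`), the snapshot name, and the idea that a database branch corresponds to a clone of a snapshot are illustrative assumptions, not details confirmed by the Xatastor description.

```python
import shlex


def zvol_create_cmd(dataset: str, size: str, compression: str = "lz4") -> list[str]:
    """Build argv for a sparse (thin-provisioned) zvol with compression enabled.

    -V makes the dataset a volume of the given size; -s skips the space
    reservation, so blocks are only allocated as they are written.
    """
    return ["zfs", "create", "-s", "-V", size,
            "-o", f"compression={compression}", dataset]


def snapshot_cmd(dataset: str, snap: str) -> list[str]:
    """Build argv for a point-in-time snapshot of a dataset."""
    return ["zfs", "snapshot", f"{dataset}@{snap}"]


def clone_cmd(dataset: str, snap: str, target: str) -> list[str]:
    """Build argv for a writable clone that shares blocks with the snapshot."""
    return ["zfs", "clone", f"{dataset}@{snap}", target]


def branch_cmds(parent: str, branch: str, snap: str) -> list[list[str]]:
    """Hypothetical 'branch' workflow: snapshot the parent, then clone it."""
    return [snapshot_cmd(parent, snap), clone_cmd(parent, snap, branch)]


if __name__ == "__main__":
    # Hypothetical dataset names, for illustration only.
    commands = [zvol_create_cmd("tank/pg/main", "10G")]
    commands += branch_cmds("tank/pg/main", "tank/pg/feature-x", "base")
    for argv in commands:
        print(shlex.join(argv))
```

Because a clone initially shares all of its blocks with the parent snapshot, a new branch consumes almost no extra space until it diverges, which fits a workload dominated by many mostly idle volumes.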