Apache Iggy: thread-per-core with io_uring in Rust
14 hours ago
- #thread-per-core
- #io_uring
- #performance
- Apache Iggy migrated to a thread-per-core architecture powered by io_uring for better performance and scalability.
- The previous architecture, built on tokio, ran into problems with block device I/O and thread pool limitations.
- Thread-per-core architecture improves scalability by reducing lock contention and improving cache locality.
- io_uring was chosen for its completion-based I/O model, which is more efficient than the readiness-based (poll/epoll) model used by tokio.
- Three async runtimes were evaluated: monoio, glommio, and compio, with compio being the final choice due to its active maintenance and broad io_uring feature coverage.
- Challenges included interior mutability and the need for a better API to handle shared state.
- The solution involved dividing resources into shared, strongly consistent resources and sharded, eventually consistent ones, using a single-writer principle.
- Performance benchmarks showed significant improvements in throughput and latency, especially with higher loads and strong consistency mode.
- The post also discusses the state of the Rust async runtime ecosystem, highlighting the lack of a Rust equivalent to the Seastar framework and the limitations of POSIX-compliant APIs.
- Future work includes clustering using Viewstamped Replication and further optimizations.
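
The single-writer principle for sharded resources mentioned above can be illustrated with a minimal sketch. This is not Iggy's code: the names `Command`, `Router`, `spawn_shard`, and `shard_for` are hypothetical, routing by modulo is an assumption, and plain `std` threads with channels stand in for a compio runtime pinned to each core. The point is only that each shard's state is owned by exactly one thread, so no locks or interior mutability are needed.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Hypothetical command type for this sketch; not Iggy's actual API.
enum Command {
    Append { partition: u32, payload: String, reply: Sender<u64> },
    Len { partition: u32, reply: Sender<usize> },
}

// Map a partition to its owning shard. Plain modulo is an assumption.
fn shard_for(partition: u32, shards: usize) -> usize {
    partition as usize % shards
}

// Spawn one shard thread. It is the single writer for its partitions.
// A real thread-per-core server would also pin the thread to a core.
fn spawn_shard() -> Sender<Command> {
    let (tx, rx) = channel::<Command>();
    thread::spawn(move || {
        // Owned exclusively by this thread: no Mutex, no RefCell.
        let mut logs: HashMap<u32, Vec<String>> = HashMap::new();
        // The loop ends when every sender has been dropped.
        for cmd in rx {
            match cmd {
                Command::Append { partition, payload, reply } => {
                    let log = logs.entry(partition).or_default();
                    log.push(payload);
                    // Reply with the offset of the message just appended.
                    let _ = reply.send((log.len() - 1) as u64);
                }
                Command::Len { partition, reply } => {
                    let _ = reply.send(logs.get(&partition).map_or(0, Vec::len));
                }
            }
        }
    });
    tx
}

// Routes every request to the shard that owns the target partition.
struct Router {
    shards: Vec<Sender<Command>>,
}

impl Router {
    fn new(shard_count: usize) -> Self {
        Router { shards: (0..shard_count).map(|_| spawn_shard()).collect() }
    }

    // Append a message and return its offset within the partition.
    fn append(&self, partition: u32, payload: &str) -> u64 {
        let (reply, rx) = channel();
        let shard = &self.shards[shard_for(partition, self.shards.len())];
        shard
            .send(Command::Append { partition, payload: payload.to_string(), reply })
            .expect("shard thread is alive");
        rx.recv().expect("shard replied")
    }

    // Number of messages currently stored in the partition.
    fn len(&self, partition: u32) -> usize {
        let (reply, rx) = channel();
        let shard = &self.shards[shard_for(partition, self.shards.len())];
        shard
            .send(Command::Len { partition, reply })
            .expect("shard thread is alive");
        rx.recv().expect("shard replied")
    }
}
```

With `Router::new(4)`, two calls to `append(7, ...)` both land on shard 3 and return offsets 0 and 1. Because all mutation of a partition is funneled through its owning thread, ordering within a partition follows from the channel's FIFO delivery rather than from a lock.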