Jetrelay: A high-performance ATproto relay in 500 LOC
a year ago
- #linux
- #pubsub
- #performance
- Jetrelay is a pub/sub server compatible with Bluesky’s 'jetstream' data feed. It gets its efficiency from Linux kernel features.
- The server can saturate a 10 Gbps network connection with just 8 CPU cores by avoiding unnecessary work.
- Jetrelay emulates multicast over TCP-based websockets: every client receives identical data, so the server prepares the data once and sends the same bytes to all of them.
- Key techniques include `sendfile()` to bypass userspace for zero-copy data transfer, `io_uring` to handle many clients with minimal syscalls, and `fallocate()` with `FALLOC_FL_PUNCH_HOLE` to discard old data without disrupting clients.
- The server maintains an in-memory index (`BTreeMap`) to quickly locate data for clients requesting backfill based on timestamps.
- Testing showed Jetrelay can handle up to 9,000 clients, saturating a 10 Gbps connection, with performance scaling linearly with CPU cores until bottlenecked by the network.
- Compared to the official jetstream server, Jetrelay achieves higher throughput for the 'no filtering' use case but lacks features like per-client filtering.
- The post discusses the broader context of push-based internet protocols and describes ATproto's improvements over RSS, such as signed records and a filesystem-like data model.
- Jetrelay is a tech demo; production use would require additional features such as backfill on startup, better websocket compliance, and security measures.
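
The timestamp index mentioned above can be sketched roughly as follows. This is a minimal illustration, not Jetrelay's actual code: the `BackfillIndex` type, its method names, and the choice of microsecond timestamps mapped to byte offsets are all assumptions made for the example. Only the use of a `BTreeMap` comes from the post.

```rust
use std::collections::BTreeMap;

/// Hypothetical index: event timestamp (microseconds) -> byte offset of that
/// event in the backing file. Key and value types are assumptions.
#[derive(Default)]
pub struct BackfillIndex {
    offsets: BTreeMap<u64, u64>,
}

impl BackfillIndex {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record that the event with this timestamp starts at `offset`.
    pub fn record(&mut self, timestamp_us: u64, offset: u64) {
        self.offsets.insert(timestamp_us, offset);
    }

    /// Offset of the first event at or after `cursor_us`.
    /// Returns `None` when the cursor is newer than every indexed event,
    /// in which case a client would simply start from the live tail.
    pub fn seek(&self, cursor_us: u64) -> Option<u64> {
        self.offsets
            .range(cursor_us..)
            .next()
            .map(|(_, &offset)| offset)
    }

    /// Forget entries older than `cutoff_us`, for example after the
    /// corresponding region of the file has been discarded.
    pub fn trim_before(&mut self, cutoff_us: u64) {
        // `split_off` returns the entries with keys >= cutoff; keep only those.
        self.offsets = self.offsets.split_off(&cutoff_us);
    }
}
```

Because a `BTreeMap` keeps its keys sorted, `range(cursor_us..)` finds the first matching entry in logarithmic time, which is what makes "start this client from timestamp T" cheap even when many events are indexed.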