Scaling Akvorado BMP RIB with Sharding
2 days ago
- #sharding
- #scalability
- #BMP
- Akvorado uses BMP to associate routing information, such as AS paths and BGP communities, with flows. The RIB holding this data must scale to tens of millions of routes.
- The previous implementation guarded a prefix tree and a route map with a single global read/write lock, which caused contention between Kafka workers, BMP updates, and route flushing.
- RIB sharding splits the routing database into shards, each with its own lock and intern pools, reducing contention and improving performance.
- The first step introduced basic sharding with shard-specific locks and route maps, and was confirmed to improve BMP receiver stability.
- The second step introduced lock-free reads, using copy-on-write for the prefix tree and generation numbers to avoid stale prefix indexes during concurrent updates.
- Benchmarks show reduced read and write latencies, although a high writer count can degrade read performance. The second step further improves read latency.