Client-side load balancing at a million requests per second
a day ago
- #microservices
- #load-balancing
- #performance-optimization
- Client-side load balancing (CSLB) was built to replace the shared edge ingress load balancer for internal fan-out traffic, and it handles over a million requests per second.
- Key features added include N-ring fade-in for gradual scale-up, occupancy-based bounded load for even distribution, and AZ-aware routing with latency health factors.
- Migration from Skipper to CSLB reduced latency spikes, improved stability, and cut costs by scaling down the Skipper fleet and reducing pod count through better load management.
- Implementation involved hash parity with Skipper using xxHash64, Kubernetes discovery via watch-based informer, and a hardened pipeline for faster, safer deployments.
- Hardening measures included retry policies, FIFO buffers, and enhanced logging, which revealed and mitigated node-level network freezes without causing incidents.
- Lessons learned: owning routing decisions provides valuable telemetry but adds complexity; cache locality conflicts with traffic isolation; and fast deployment pipelines reduce risk.
- Future work includes resuming AZ-aware routing trials to evaluate cost savings versus cache fragmentation, especially during peak traffic events.
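
The summary names occupancy-based bounded load on a consistent-hash ring but does not spell out the mechanics, so the following is only a minimal sketch of the general technique, not the implementation described above. Everything in it is an assumption made for illustration: the class `BoundedLoadRing`, the `vnodes` and `load_factor` parameters, the in-flight counter used as the occupancy signal, and the use of `hashlib.blake2b` as a standard-library stand-in for xxHash64.

```python
import bisect
import hashlib
import math
from collections import defaultdict


def hash64(key: str) -> int:
    # Stand-in 64-bit hash. The summary mentions xxHash64, which is not in
    # the Python standard library, so blake2b truncated to 8 bytes is used.
    digest = hashlib.blake2b(key.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class BoundedLoadRing:
    """Hypothetical consistent-hash ring with an occupancy cap per endpoint."""

    def __init__(self, endpoints, vnodes=100, load_factor=1.25):
        self.endpoints = list(endpoints)
        self.load_factor = load_factor      # assumed headroom over the mean
        self.in_flight = defaultdict(int)   # occupancy: open requests per endpoint
        # Each endpoint is placed on the ring at `vnodes` hashed positions.
        self.ring = sorted(
            (hash64(f"{ep}#{i}"), ep) for ep in self.endpoints for i in range(vnodes)
        )
        self.points = [point for point, _ in self.ring]

    def _cap(self) -> int:
        # Cap = load_factor * mean occupancy, counting the request being placed.
        total = sum(self.in_flight.values()) + 1
        return math.ceil(self.load_factor * total / len(self.endpoints))

    def pick(self, key: str) -> str:
        cap = self._cap()
        start = bisect.bisect_left(self.points, hash64(key)) % len(self.ring)
        # Walk clockwise from the key's position and take the first endpoint
        # whose occupancy is still below the cap.
        for step in range(len(self.ring)):
            ep = self.ring[(start + step) % len(self.ring)][1]
            if self.in_flight[ep] < cap:
                self.in_flight[ep] += 1
                return ep
        # Defensive guard; not expected to trigger when load_factor >= 1.
        raise RuntimeError("no endpoint below capacity")

    def release(self, ep: str) -> None:
        # Call when the request completes so occupancy reflects open requests.
        self.in_flight[ep] -= 1


ring = BoundedLoadRing(["pod-a", "pod-b", "pod-c"])
hot = [ring.pick("hot-key") for _ in range(9)]  # nine concurrent requests, one key
print(sorted(ring.in_flight.items()))
```

In this sketch, a key keeps landing on the same endpoint while load is light, which preserves cache locality, and spills to the next endpoint on the ring only once its usual target reaches the cap. A higher `load_factor` favors locality, a lower one favors even distribution.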