How reverse proxies handle concurrent requests at scale (ATS, HAProxy, Envoy)
- #reverse-proxy
- #scalability
- #concurrency
- Reverse proxies must hold tens of thousands of concurrent connections efficiently, without blocking, memory bloat, or dropped requests.
- The primary constraint is concurrent connection count rather than throughput: a proxy must hold idle keepalive connections and long-lived WebSocket streams that consume resources while doing almost no work.
- The 'thin layer constraint' emphasizes minimizing per-connection resource usage to avoid performance degradation during load spikes.
- Thread-per-connection models (e.g., Apache httpd's classic worker MPM) scale poorly: each connection pins a thread stack and adds scheduling and context-switch overhead, even while the connection sits idle.
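- A minimal sketch of that pattern (hypothetical helper names, Python stdlib only) makes the cost visible: every accepted client gets a thread that mostly blocks.

```python
import socket
import threading

def handle(conn: socket.socket) -> None:
    # The thread blocks inside recv() whenever the client is quiet,
    # yet still holds its full stack and a scheduler slot the whole time.
    with conn:
        while data := conn.recv(4096):
            conn.sendall(data)  # simple echo stand-in for proxy work

def serve(listener: socket.socket) -> None:
    # One OS thread per accepted connection: 10k mostly-idle clients
    # means 10k stacks and 10k threads for the scheduler to juggle.
    while True:
        conn, _addr = listener.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()
```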
- Event loops (epoll on Linux, kqueue on BSD/macOS) separate holding a connection from doing work on it, letting a handful of threads service many thousands of connections.
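- A sketch of that separation using Python's `selectors` module, which sits on epoll/kqueue; the three socket pairs stand in for held connections, and only the active one wakes the loop.

```python
import selectors
import socket

# One selector watches every connection; a single thread is woken
# only for sockets that actually have work, while idle ones cost
# little more than a file descriptor and a small kernel entry.
sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS

pairs = [socket.socketpair() for _ in range(3)]
for i, (server_side, _client_side) in enumerate(pairs):
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ, data=i)

pairs[1][1].sendall(b"ping")  # only connection 1 becomes readable

ready = sel.select(timeout=1)
# Only the active connection is handed to the worker logic.
results = [(key.data, key.fileobj.recv(4)) for key, _mask in ready]
```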
- Apache Traffic Server (ATS) uses event threads and continuations for CDN-scale caching but struggles with plugin isolation and complexity.
- HAProxy runs a single-process event loop for predictable latency and low memory usage; later versions scale across cores with multiple processes and then native threads, using SO_REUSEPORT to shard incoming connections across accept loops.
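- The SO_REUSEPORT mechanism can be shown directly in a small sketch (Linux 3.9+ or modern BSDs; the port is picked by the kernel here):

```python
import socket

def reuseport_listener(host: str, port: int) -> socket.socket:
    # With SO_REUSEPORT set, several sockets may bind the same
    # address:port, and the kernel spreads incoming connections
    # across them -- letting each worker run its own accept loop
    # with no shared lock.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((host, port))
    s.listen()
    return s

first = reuseport_listener("127.0.0.1", 0)      # port 0: kernel picks one
port = first.getsockname()[1]
second = reuseport_listener("127.0.0.1", port)  # same port binds cleanly
```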
- Envoy's thread-per-core model with worker isolation suits service meshes, offering deep L7 programmability and dynamic configuration via xDS.
- Graceful restarts, circuit breaking, and connection draining are critical for maintaining service during failures and updates.
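- Connection draining reduces to a small amount of bookkeeping; a sketch with hypothetical names (`Drainer`, `try_start`), assuming a threaded worker model:

```python
import threading
import time

class Drainer:
    """Stop admitting new requests, then wait for in-flight ones."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight = 0
        self._accepting = True

    def try_start(self) -> bool:
        # Called at the front of request handling; refuses new work
        # once draining has begun.
        with self._lock:
            if not self._accepting:
                return False
            self._inflight += 1
            return True

    def finish(self) -> None:
        with self._lock:
            self._inflight -= 1

    def drain(self, timeout: float) -> bool:
        # Flip to not-accepting, then poll until in-flight work
        # completes or the deadline passes (after which a real proxy
        # would force-close what remains).
        with self._lock:
            self._accepting = False
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with self._lock:
                if self._inflight == 0:
                    return True
            time.sleep(0.01)
        return False
```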
- Choose ATS for HTTP caching, HAProxy for low-overhead L4/L7 load balancing, and Envoy for dynamic, programmable service mesh sidecars.