How reverse proxies handle concurrent requests at scale (ATS, HAProxy, Envoy)
- #reverse-proxy
- #scalability
- #concurrency
- Reverse proxies must hold tens of thousands of concurrent connections efficiently, without blocking, memory bloat, or dropped requests.
- The primary constraint is concurrent connection count rather than throughput: a proxy must hold idle keepalive connections and long-lived WebSocket streams that consume resources while doing almost no work.
- The 'thin layer constraint' emphasizes minimizing per-connection resource usage to avoid performance degradation during load spikes.
- Thread-per-connection models (e.g., Apache httpd's classic worker MPM) scale poorly: each connection pins a thread stack and adds scheduling and context-switch overhead, even while the connection sits idle.
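- A minimal sketch of that pattern (hypothetical helper names, Python stdlib only) makes the cost visible: every accepted client gets a thread that mostly blocks.

```python
import socket
import threading

def handle(conn: socket.socket) -> None:
    # The thread blocks inside recv() whenever the client is quiet,
    # yet still holds its full stack and a scheduler slot the whole time.
    with conn:
        while data := conn.recv(4096):
            conn.sendall(data)  # simple echo stand-in for proxy work

def serve(listener: socket.socket) -> None:
    # One OS thread per accepted connection: 10k mostly-idle clients
    # means 10k stacks and 10k threads for the scheduler to juggle.
    while True:
        conn, _addr = listener.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()
```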
- Event loops (epoll on Linux, kqueue on BSD/macOS) separate holding a connection from doing work on it, letting a handful of threads service many thousands of connections.
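- A sketch of that separation using Python's `selectors` module, which sits on epoll/kqueue; the three socket pairs stand in for held connections, and only the active one wakes the loop.

```python
import selectors
import socket

# One selector watches every connection; a single thread is woken
# only for sockets that actually have work, while idle ones cost
# little more than a file descriptor and a small kernel entry.
sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS

pairs = [socket.socketpair() for _ in range(3)]
for i, (server_side, _client_side) in enumerate(pairs):
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ, data=i)

pairs[1][1].sendall(b"ping")  # only connection 1 becomes readable

ready = sel.select(timeout=1)
# Only the active connection is handed to the worker logic.
results = [(key.data, key.fileobj.recv(4)) for key, _mask in ready]
```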
- Apache Traffic Server (ATS) uses event threads and continuations for CDN-scale caching but struggles with plugin isolation and complexity.
- HAProxy runs a single-process event loop for predictable latency and low memory usage; later versions scale across cores with multiple processes and then native threads, using SO_REUSEPORT to shard incoming connections across accept loops.
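- The SO_REUSEPORT mechanism can be shown directly in a small sketch (Linux 3.9+ or modern BSDs; the port is picked by the kernel here):

```python
import socket

def reuseport_listener(host: str, port: int) -> socket.socket:
    # With SO_REUSEPORT set, several sockets may bind the same
    # address:port, and the kernel spreads incoming connections
    # across them -- letting each worker run its own accept loop
    # with no shared lock.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((host, port))
    s.listen()
    return s

first = reuseport_listener("127.0.0.1", 0)      # port 0: kernel picks one
port = first.getsockname()[1]
second = reuseport_listener("127.0.0.1", port)  # same port binds cleanly
```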
- Envoy's thread-per-core model with worker isolation suits service meshes, offering deep L7 programmability and dynamic configuration via xDS.
- Graceful restarts, circuit breaking, and connection draining are critical for maintaining service during failures and updates.
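- Connection draining reduces to a small amount of bookkeeping; a sketch with hypothetical names (`Drainer`, `try_start`), assuming a threaded worker model:

```python
import threading
import time

class Drainer:
    """Stop admitting new requests, then wait for in-flight ones."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight = 0
        self._accepting = True

    def try_start(self) -> bool:
        # Called at the front of request handling; refuses new work
        # once draining has begun.
        with self._lock:
            if not self._accepting:
                return False
            self._inflight += 1
            return True

    def finish(self) -> None:
        with self._lock:
            self._inflight -= 1

    def drain(self, timeout: float) -> bool:
        # Flip to not-accepting, then poll until in-flight work
        # completes or the deadline passes (after which a real proxy
        # would force-close what remains).
        with self._lock:
            self._accepting = False
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with self._lock:
                if self._inflight == 0:
                    return True
            time.sleep(0.01)
        return False
```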
- Choose ATS for HTTP caching, HAProxy for low-overhead L4/L7 load balancing, and Envoy for dynamic, programmable service mesh sidecars.