How to Make Things Slower So They Go Faster
17 days ago
- #performance optimization
- #load management
- #system design
- Synchronized demand occurs when many clients act simultaneously, overwhelming service capacity.
- Usable headroom (H) is the difference between service capacity (μ) and background load (λ₀).
- Peak loads can lead to queues, timeouts, retries, and major incidents.
- Natural alignment comes from clocks, defaults, and state transitions like deployments or cache flushes.
- Adversarial alignment includes DDoS attacks and flash crowds.
- Failure depends on which constraint binds first, such as connection pools or CPU saturation.
- Feedback loops amplify the overload: errors trigger retries, which add load and cause more errors.
- Mitigation spreads actions over a window (W), which lowers peak load at the cost of added delay.
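- Uniform jitter over the window gives the lowest peak expected arrival rate (N/W for N clients) and the same expected delay for every client, which makes it fair.
- Operational bounds include the headroom requirement (the added peak N/W must fit within H) and Little's Law, which ties concurrency to arrival rate multiplied by time in the system.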
- Server hints like Retry-After headers help manage load.
- Prevention involves randomizing TTLs, splaying work, and using jittered backoff.
- Recovery involves draining backlogs safely with pacing and server-side controls.
- Implementation requires forecasting headroom and pacing admissions to match capacity.
- Verification involves tracking metrics like peak-to-average ratios and tail latency.
- Common errors include underestimating demand or overestimating capacity.
- Jitter distributes the added delay equitably across clients while reducing overload risk.
- Queue when no user is waiting, jitter for fairness, reject if delay is unacceptable, and scale when possible.
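
The headroom and window points above can be made concrete with a small sketch. It assumes N clients that would otherwise all fire at once, each drawing a uniform random offset in [0, W], and it sizes W so that the expected added rate N/W fits within H = μ − λ₀. The function names (`min_window`, `jittered_offsets`, `peak_per_second`) and the numbers in the demo are illustrative and not taken from the post.

```python
import random


def min_window(n_clients: int, capacity: float, background: float) -> float:
    """Smallest window W (seconds) for which N / W <= H, where H = capacity - background."""
    headroom = capacity - background
    if headroom <= 0:
        raise ValueError("no usable headroom: capacity must exceed background load")
    return n_clients / headroom


def jittered_offsets(n_clients: int, window: float, rng: random.Random) -> list[float]:
    """Draw one uniform start offset in [0, window] per client."""
    return [rng.uniform(0.0, window) for _ in range(n_clients)]


def peak_per_second(offsets: list[float]) -> int:
    """Count arrivals in one-second buckets and return the busiest bucket."""
    buckets: dict[int, int] = {}
    for t in offsets:
        second = int(t)
        buckets[second] = buckets.get(second, 0) + 1
    return max(buckets.values())


if __name__ == "__main__":
    # Hypothetical numbers: 10,000 clients, capacity 500 req/s, background 300 req/s.
    n, mu, lam0 = 10_000, 500.0, 300.0
    w = min_window(n, mu, lam0)
    offsets = jittered_offsets(n, w, random.Random(0))
    print(f"window: {w:.1f} s, expected added rate: {n / w:.1f} req/s")
    print(f"busiest second with jitter: {peak_per_second(offsets)} requests")
    print(f"busiest second without jitter: {n} requests")
```

Because the offsets are random, the busiest second will usually land somewhat above N/W, so in practice it may be worth choosing a window a little larger than the minimum this sketch computes.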
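
The points on jittered backoff and server hints could be combined on the client side roughly as follows. This is a minimal sketch, not the post's own method: it assumes the "full jitter" variant of exponential backoff (a uniform draw between zero and the exponential ceiling) and assumes the `Retry-After` value has already been parsed into seconds. The function name `backoff_delay` and its defaults are hypothetical.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 60.0,
    retry_after: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> float:
    """Return how long to wait (seconds) before retry number `attempt` (0-based)."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    uniform = rng.uniform if rng is not None else random.uniform
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    ceiling = min(cap, base * (2 ** min(attempt, 32)))
    # Full jitter (assumed variant): uniform over [0, ceiling].
    delay = uniform(0.0, ceiling)
    if retry_after is not None:
        # Treat the server hint as a floor and add the jitter on top, so clients
        # that received the same hint do not all return at the same instant.
        delay += retry_after
    return delay


if __name__ == "__main__":
    rng = random.Random(42)
    for attempt in range(5):
        print(attempt, round(backoff_delay(attempt, rng=rng), 3))
    print("with hint:", round(backoff_delay(2, retry_after=10.0, rng=rng), 3))
```

Adding jitter on top of the hint, instead of sleeping for exactly the hinted time, is a design choice made here to avoid re-creating the synchronized demand the hint was meant to relieve.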