Backpressure in Distributed Systems
6 months ago
- #backpressure
- #performance
- #distributed-systems
- Handling backpressure is crucial in distributed systems. Handled incorrectly, it can cause out-of-memory (OOM) errors, dropped messages, low throughput, wasted network bandwidth, increased latency, and blocked producers.
- Backpressure occurs when the message production rate exceeds the consumption rate, overwhelming the system.
- There are four strategies for handling backpressure: slowing down producers, dropping existing (already queued) messages, dropping incoming messages, and adding consumers.
- In a real-time leaderboard example, dropping existing messages was chosen because the final state mattered more than intermediate states.
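- TCP handles backpressure through flow control (the receiver's advertised sliding window limits how much the sender may have in flight) and congestion control (the sender's congestion window adapts to network capacity).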
- Backpressure is a common theme in systems like Kafka, gRPC streaming, and Sidekiq.
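
The strategies above can be sketched with a small bounded buffer. This is a minimal illustration, not the implementation from the original post: the `BoundedBuffer` class, its `offer` method, and the policy names `drop_oldest` and `drop_newest` are hypothetical names chosen for this example. Slowing down the producer is shown with the standard library's `queue.Queue`, whose `put` blocks when the queue is full.

```python
import queue
from collections import deque


class BoundedBuffer:
    """Hypothetical fixed-capacity buffer with a simple overflow policy."""

    def __init__(self, capacity, policy):
        if policy not in ("drop_oldest", "drop_newest"):
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.items = deque()
        self.dropped = 0  # number of messages discarded so far

    def offer(self, message):
        """Try to enqueue a message; return True if it was accepted."""
        if len(self.items) < self.capacity:
            self.items.append(message)
            return True
        self.dropped += 1
        if self.policy == "drop_oldest":
            # Drop an existing message: evict the stalest one, keep the new one.
            self.items.popleft()
            self.items.append(message)
            return True
        # drop_newest: drop the incoming message, keep what is already queued.
        return False


# Dropping existing messages (the leaderboard choice): latest state wins.
latest = BoundedBuffer(capacity=3, policy="drop_oldest")
for score in range(1, 6):
    latest.offer(score)

# Dropping incoming messages: earlier messages are preserved.
earliest = BoundedBuffer(capacity=3, policy="drop_newest")
for score in range(1, 6):
    earliest.offer(score)

# Slowing down the producer: put() blocks while the queue is full.
# A short timeout is used here only so the example terminates.
blocking = queue.Queue(maxsize=3)
for score in range(1, 4):
    blocking.put(score)
try:
    blocking.put(4, timeout=0.01)
    producer_blocked = False
except queue.Full:
    producer_blocked = True
```

With a capacity of 3 and five offered scores, the `drop_oldest` buffer ends up holding the three most recent scores, which matches the leaderboard reasoning that the final state matters more than intermediate states. The fourth strategy, adding consumers, is a deployment change rather than a buffer policy, so it is not shown here.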