Eliminating Cold Starts 2: shard and conquer
3 days ago
- #Cloudflare
- #Performance
- #Serverless
- Cloudflare previously introduced a technique that pre-warms Workers during the TLS handshake, hiding cold starts from the first request.
- As Workers have grown to host more complex applications, cold starts can now take longer than a TLS handshake, so pre-warming alone no longer hides them.
- Cloudflare deployed 'Worker sharding', which uses a consistent hash ring to route requests to a server where the Worker is already running, so fewer requests trigger a cold start.
- Cold starts involve fetching, compiling, and initializing Worker scripts, which have become more time-consuming.
- Worker sharding also reduces memory usage and eviction rates by coalescing requests onto fewer instances.
- Load shedding lets overloaded Workers be handled gracefully, without returning errors.
- Cap’n Proto RPC is used for efficient cross-instance communication and handling nested Worker invocations.
- Worker sharding cut the global eviction rate tenfold and raised the warm request rate from 99.9% to 99.99%, meaning cold starts fell from 0.1% to 0.01% of requests.
- Because traffic follows a power-law distribution, sharding only a small percentage of requests yields a significant efficiency gain.
- Future goals include reducing the cold start rate further to reach five nines (99.999%) of warm requests.
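
The consistent hash ring mentioned in the summary can be illustrated with a minimal Python sketch. This is not Cloudflare's implementation: the `HashRing` class, the `vnodes` parameter, the server names, and the script IDs are all hypothetical, chosen only to show how a script ID maps to one stable server, and why removing a server remaps only the scripts that server owned.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch only)."""

    def __init__(self, servers, vnodes=64):
        self._vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        # Each server is placed at several positions to even out the load.
        for i in range(self._vnodes):
            point = _hash(f"{server}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove(self, server):
        for i in range(self._vnodes):
            point = _hash(f"{server}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def lookup(self, script_id):
        """Return the server owning script_id: the first position clockwise."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(script_id))
        return self._owner[self._points[idx % len(self._points)]]


# Hypothetical data: 8 servers in one location, 1000 Worker script IDs.
ring = HashRing([f"server-{n}" for n in range(8)])
scripts = [f"worker-script-{n}" for n in range(1000)]

before = {s: ring.lookup(s) for s in scripts}
ring.remove("server-3")
after = {s: ring.lookup(s) for s in scripts}

# Only scripts that lived on the removed server change owner.
moved = [s for s in scripts if before[s] != after[s]]
assert all(before[s] == "server-3" for s in moved)
```

Because every request for a given script hashes to the same position, all servers in the ring agree on the owner without coordination, which is the property that lets requests be coalesced onto an instance that is already warm.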