S3 scales to petabytes a second on top of slow HDDs
- #Distributed Systems
- #AWS S3
- #Cloud Storage
- AWS S3 operates at a massive scale with 400+ trillion objects, 150 million requests per second, and over 1 petabyte per second of peak traffic.
- S3 achieves high throughput and low latency despite using slow HDDs by leveraging massive parallelism, erasure coding, and efficient load balancing.
- Hard Disk Drives (HDDs) are slow due to mechanical movements (seek and rotation), but they are cost-effective for large-scale storage; see the back-of-the-envelope latency math after this list.
- S3 uses a 5-of-9 erasure coding scheme (any 5 of an object's 9 shards can rebuild it), which provides durability and flexibility while storing only 1.8x the original data; a toy version is sketched below.
- Parallelism is key: S3 spreads data across millions of disks, uses multipart uploads/downloads (see the boto3 example below), and balances load across front-end servers and storage nodes.
- To avoid hot spots, S3 randomizes data placement, continuously rebalances data, and benefits from workload decorrelation at scale.
- S3's architecture includes shuffle sharding, power-of-two random choices for load balancing (simulated below), and hedged requests to reduce tail latency.
- The system's predictability improves with scale, as independent workloads smooth out bursty demand patterns; the last sketch below illustrates the effect.
- S3's cost efficiency comes from multi-tenancy, erasure coding, and economies of scale, making it a backbone for modern data infrastructure.
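How slow is "slow"? A rough calculation, assuming a typical 7200 RPM drive with about 8 ms average seek time (the drive parameters and fleet size are illustrative assumptions, not figures from the post): each random I/O pays a seek plus roughly half a rotation, which caps a single spindle at around 80 random IOPS.

```python
# Back-of-the-envelope HDD math. The drive parameters and fleet size
# below are illustrative assumptions, not figures from the post.

avg_seek_ms = 8.0                        # move the head to the target track
avg_rotation_ms = 0.5 * 60_000 / 7200    # wait half a revolution at 7200 RPM (~4.2 ms)
transfer_ms = 0.1                        # read a small block once positioned

ms_per_random_io = avg_seek_ms + avg_rotation_ms + transfer_ms
iops_per_disk = 1000 / ms_per_random_io  # roughly 80 random IOPS per spindle

# One disk is slow; millions of them in aggregate are not.
disks = 10_000_000                       # assumed fleet size for illustration
print(f"{iops_per_disk:.0f} IOPS per disk, "
      f"{iops_per_disk * disks / 1e9:.2f} billion IOPS across {disks:,} disks")
```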
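A toy 5-of-9 erasure code, to make the 1.8x figure concrete. Production systems typically use Reed-Solomon coding over byte shards; this sketch uses polynomial evaluation and Lagrange interpolation over a prime field purely for readability, and the data values are placeholders. The property it demonstrates is the one that matters: 9 shards are stored (9/5 = 1.8x the data), and any 5 of them reconstruct the object, so up to 4 shard losses are tolerated.

```python
# Toy 5-of-9 erasure code: encode 5 data symbols into 9 shards so that
# ANY 5 surviving shards reconstruct the original. Storage overhead is
# 9/5 = 1.8x. Arithmetic is over a prime field for readability.

P = 2**31 - 1  # prime modulus for all arithmetic

def poly_mul_linear(poly, a):
    """Multiply a polynomial (low-to-high coefficients) by (x - a) mod P."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - a * c) % P
        out[i + 1] = (out[i + 1] + c) % P
    return out

def encode(data, n=9):
    """Treat the k data symbols as coefficients of a degree-(k-1) polynomial
    and evaluate it at x = 1..n to produce n shards."""
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(data)) % P)
            for x in range(1, n + 1)]

def decode(shards, k=5):
    """Lagrange-interpolate the polynomial from any k shards and return
    its coefficients, i.e. the original data symbols."""
    pts = shards[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(pts):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(pts):
            if j != i:
                basis = poly_mul_linear(basis, xj)   # multiply by (x - xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % P
    return coeffs

data = [11, 22, 33, 44, 55]                 # 5 data symbols (placeholders)
shards = encode(data)                       # 9 shards: 1.8x the original size
survivors = [shards[0], shards[2], shards[4], shards[6], shards[8]]  # any 4 lost
assert decode(survivors) == data            # any 5 of 9 rebuild the object
```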
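On the client side, multipart transfers are how a single large object gets spread across many of those parallel resources. A minimal sketch using boto3's managed transfer layer; the bucket, key, and file names are placeholders, and the thresholds and concurrency level are illustrative rather than tuned recommendations.

```python
# Client-side parallelism via multipart upload/download with boto3.
# Names and sizes below are placeholders for illustration.
import boto3
from boto3.s3.transfer import TransferConfig

config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # switch to multipart above 64 MiB
    multipart_chunksize=16 * 1024 * 1024,   # split the object into 16 MiB parts
    max_concurrency=16,                     # transfer 16 parts at a time
    use_threads=True,
)

s3 = boto3.client("s3")

# Parts are uploaded concurrently, so one large object fans out across
# many front-end servers and disks instead of serializing on one path.
s3.upload_file("big-dataset.bin", "example-bucket",
               "datasets/big-dataset.bin", Config=config)

# Downloads parallelize the read path the same way.
s3.download_file("example-bucket", "datasets/big-dataset.bin",
                 "big-dataset-copy.bin", Config=config)
```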
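A minimal simulation of the power-of-two-random-choices idea the post mentions: rather than routing each request to one random server, sample two servers and send it to the less loaded of the pair. The server and request counts here are arbitrary.

```python
# Power of two random choices vs. plain random assignment.
import random

random.seed(0)
servers, requests = 1_000, 100_000

single = [0] * servers
for _ in range(requests):
    single[random.randrange(servers)] += 1      # one random choice

two_choice = [0] * servers
for _ in range(requests):
    a, b = random.randrange(servers), random.randrange(servers)
    two_choice[a if two_choice[a] <= two_choice[b] else b] += 1  # best of two

mean = requests / servers
print(f"mean load {mean:.0f}, "
      f"hottest server (1 choice) {max(single)}, "
      f"hottest server (best of 2) {max(two_choice)}")
# The hottest server ends up only slightly above the mean with the second
# choice, which is the hot-spot-avoidance effect the post describes.
```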
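Finally, a sketch of why aggregation smooths demand: the sum of many independent bursty workloads has relative variability that falls roughly as 1/sqrt(N). The workload shape and tenant counts are invented for illustration.

```python
# Relative variability of total demand shrinks as independent tenants
# are aggregated. Workload model and numbers are assumptions.
import random
import statistics

random.seed(0)

def bursty_demand():
    """One tenant: usually at baseline, occasionally bursting to 50x."""
    return 50.0 if random.random() < 0.02 else 1.0

def relative_variability(n_tenants, samples=500):
    """stdev/mean of total demand across `samples` time steps."""
    totals = [sum(bursty_demand() for _ in range(n_tenants))
              for _ in range(samples)]
    return statistics.stdev(totals) / statistics.mean(totals)

for n in (1, 10, 100, 1_000, 10_000):
    print(f"{n:>6} tenants: stdev/mean of total demand = "
          f"{relative_variability(n):.3f}")
# The ratio falls roughly as 1/sqrt(N): at very large tenant counts the
# aggregate is far more predictable than any individual workload.
```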