S3 scales to petabytes a second on top of slow HDDs
- #Distributed Systems
- #AWS S3
- #Cloud Storage
- AWS S3 operates at a massive scale with 400+ trillion objects, 150 million requests per second, and over 1 petabyte per second of peak traffic.
- S3 achieves high throughput and low latency despite using slow HDDs by leveraging massive parallelism, erasure coding, and efficient load balancing.
- Hard Disk Drives (HDDs) are slow due to mechanical movements (seek and rotation), but they are cost-effective for large-scale storage; see the back-of-the-envelope latency math after this list.
- S3 uses a 5-of-9 erasure coding scheme (any 5 of an object's 9 shards can rebuild it), which provides durability and flexibility while storing only 1.8x the original data; a toy version is sketched below.
- Parallelism is key: S3 spreads data across millions of disks, uses multipart uploads/downloads (see the boto3 example below), and balances load across front-end servers and storage nodes.
- To avoid hot spots, S3 randomizes data placement, continuously rebalances data, and benefits from workload decorrelation at scale.
- S3's architecture includes shuffle sharding, power-of-two random choices for load balancing (simulated below), and hedged requests to reduce tail latency.
- The system's predictability improves with scale, as independent workloads smooth out bursty demand patterns; the last sketch below illustrates the effect.
- S3's cost efficiency comes from multi-tenancy, erasure coding, and economies of scale, making it a backbone for modern data infrastructure.
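How slow is "slow"? A rough calculation, assuming a typical 7200 RPM drive with about 8 ms average seek time (the drive parameters and fleet size are illustrative assumptions, not figures from the post): each random I/O pays a seek plus roughly half a rotation, which caps a single spindle at around 80 random IOPS.

```python
# Back-of-the-envelope HDD math. The drive parameters and fleet size
# below are illustrative assumptions, not figures from the post.

avg_seek_ms = 8.0                        # move the head to the target track
avg_rotation_ms = 0.5 * 60_000 / 7200    # wait half a revolution at 7200 RPM (~4.2 ms)
transfer_ms = 0.1                        # read a small block once positioned

ms_per_random_io = avg_seek_ms + avg_rotation_ms + transfer_ms
iops_per_disk = 1000 / ms_per_random_io  # roughly 80 random IOPS per spindle

# One disk is slow; millions of them in aggregate are not.
disks = 10_000_000                       # assumed fleet size for illustration
print(f"{iops_per_disk:.0f} IOPS per disk, "
      f"{iops_per_disk * disks / 1e9:.2f} billion IOPS across {disks:,} disks")
```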
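A toy 5-of-9 erasure code, to make the 1.8x figure concrete. Production systems typically use Reed-Solomon coding over byte shards; this sketch uses polynomial evaluation and Lagrange interpolation over a prime field purely for readability, and the data values are placeholders. The property it demonstrates is the one that matters: 9 shards are stored (9/5 = 1.8x the data), and any 5 of them reconstruct the object, so up to 4 shard losses are tolerated.

```python
# Toy 5-of-9 erasure code: encode 5 data symbols into 9 shards so that
# ANY 5 surviving shards reconstruct the original. Storage overhead is
# 9/5 = 1.8x. Arithmetic is over a prime field for readability.

P = 2**31 - 1  # prime modulus for all arithmetic

def poly_mul_linear(poly, a):
    """Multiply a polynomial (low-to-high coefficients) by (x - a) mod P."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - a * c) % P
        out[i + 1] = (out[i + 1] + c) % P
    return out

def encode(data, n=9):
    """Treat the k data symbols as coefficients of a degree-(k-1) polynomial
    and evaluate it at x = 1..n to produce n shards."""
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(data)) % P)
            for x in range(1, n + 1)]

def decode(shards, k=5):
    """Lagrange-interpolate the polynomial from any k shards and return
    its coefficients, i.e. the original data symbols."""
    pts = shards[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(pts):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(pts):
            if j != i:
                basis = poly_mul_linear(basis, xj)   # multiply by (x - xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % P
    return coeffs

data = [11, 22, 33, 44, 55]                 # 5 data symbols (placeholders)
shards = encode(data)                       # 9 shards: 1.8x the original size
survivors = [shards[0], shards[2], shards[4], shards[6], shards[8]]  # any 4 lost
assert decode(survivors) == data            # any 5 of 9 rebuild the object
```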
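On the client side, multipart transfers are how a single large object gets spread across many of those parallel resources. A minimal sketch using boto3's managed transfer layer; the bucket, key, and file names are placeholders, and the thresholds and concurrency level are illustrative rather than tuned recommendations.

```python
# Client-side parallelism via multipart upload/download with boto3.
# Names and sizes below are placeholders for illustration.
import boto3
from boto3.s3.transfer import TransferConfig

config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # switch to multipart above 64 MiB
    multipart_chunksize=16 * 1024 * 1024,   # split the object into 16 MiB parts
    max_concurrency=16,                     # transfer 16 parts at a time
    use_threads=True,
)

s3 = boto3.client("s3")

# Parts are uploaded concurrently, so one large object fans out across
# many front-end servers and disks instead of serializing on one path.
s3.upload_file("big-dataset.bin", "example-bucket",
               "datasets/big-dataset.bin", Config=config)

# Downloads parallelize the read path the same way.
s3.download_file("example-bucket", "datasets/big-dataset.bin",
                 "big-dataset-copy.bin", Config=config)
```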
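A minimal simulation of the power-of-two-random-choices idea the post mentions: rather than routing each request to one random server, sample two servers and send it to the less loaded of the pair. The server and request counts here are arbitrary.

```python
# Power of two random choices vs. plain random assignment.
import random

random.seed(0)
servers, requests = 1_000, 100_000

single = [0] * servers
for _ in range(requests):
    single[random.randrange(servers)] += 1      # one random choice

two_choice = [0] * servers
for _ in range(requests):
    a, b = random.randrange(servers), random.randrange(servers)
    two_choice[a if two_choice[a] <= two_choice[b] else b] += 1  # best of two

mean = requests / servers
print(f"mean load {mean:.0f}, "
      f"hottest server (1 choice) {max(single)}, "
      f"hottest server (best of 2) {max(two_choice)}")
# The hottest server ends up only slightly above the mean with the second
# choice, which is the hot-spot-avoidance effect the post describes.
```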
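Finally, a sketch of why aggregation smooths demand: the sum of many independent bursty workloads has relative variability that falls roughly as 1/sqrt(N). The workload shape and tenant counts are invented for illustration.

```python
# Relative variability of total demand shrinks as independent tenants
# are aggregated. Workload model and numbers are assumptions.
import random
import statistics

random.seed(0)

def bursty_demand():
    """One tenant: usually at baseline, occasionally bursting to 50x."""
    return 50.0 if random.random() < 0.02 else 1.0

def relative_variability(n_tenants, samples=500):
    """stdev/mean of total demand across `samples` time steps."""
    totals = [sum(bursty_demand() for _ in range(n_tenants))
              for _ in range(samples)]
    return statistics.stdev(totals) / statistics.mean(totals)

for n in (1, 10, 100, 1_000, 10_000):
    print(f"{n:>6} tenants: stdev/mean of total demand = "
          f"{relative_variability(n):.3f}")
# The ratio falls roughly as 1/sqrt(N): at very large tenant counts the
# aggregate is far more predictable than any individual workload.
```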