Flat Datacenter Networks at Scale at Amazon
6 days ago
- #network-design
- #data-center-networks
- #random-graph-theory
- In theory, random graphs are near-optimal expanders, which inspired network designs that rely on simple random wiring for strong connectivity.
- The fat-tree topology became the industry standard in data centers; it randomizes how traffic is spread but retains a hierarchical structure.
- The Jellyfish proposal applied random graphs to data center networks but left practical issues such as routing and cabling unresolved.
- AWS researchers developed RNG, addressing the key challenges through Spraypoint routing, ShuffleBox cabling, and operational models.
- RNG outperforms fat-trees in resilience, efficiency, and scalability, with successful deployment in Amazon data centers by 2026.
- RNG has limitations: it adds operational complexity and its guarantees are stochastic. Both are mitigated through specialized tools and explicit design.
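
To make the random-wiring idea concrete, here is a minimal Python sketch of a Jellyfish-style random regular topology, along with a hop-count measurement. It is an illustration only and not the RNG design: the function names (`random_regular_graph`, `hop_counts`, `path_stats`), the pairing-model construction, and the parameters (32 switches, 4 inter-switch ports each) are all assumptions chosen for the example. Spraypoint routing and ShuffleBox cabling are not modeled.

```python
import random
from collections import deque


def random_regular_graph(n, d, seed=0, max_tries=10_000):
    """Wire n switches with d inter-switch ports each, uniformly at random.

    Hypothetical helper using the pairing (configuration) model: shuffle all
    port "stubs", pair them up, and retry if the result has a self-loop or a
    duplicate link. Practical for small d only.
    """
    if d >= n or (n * d) % 2 != 0:
        raise ValueError("need d < n and n * d even")
    rng = random.Random(seed)
    for _ in range(max_tries):
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        adj = {v: set() for v in range(n)}
        simple = True
        for a, b in zip(stubs[::2], stubs[1::2]):
            if a == b or b in adj[a]:
                simple = False  # self-loop or parallel link: reject this wiring
                break
            adj[a].add(b)
            adj[b].add(a)
        if simple:
            return adj
    raise RuntimeError("no simple wiring found; try a smaller d or more tries")


def hop_counts(adj, src):
    """Breadth-first search: hop distance from src to every reachable switch."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_stats(adj):
    """Return (connected, mean hop count, diameter) over reachable ordered pairs."""
    n = len(adj)
    total, pairs, diameter, connected = 0, 0, 0, True
    for src in adj:
        dist = hop_counts(adj, src)
        connected = connected and len(dist) == n
        total += sum(dist.values())
        pairs += len(dist) - 1
        diameter = max(diameter, max(dist.values()))
    mean = total / pairs if pairs else 0.0
    return connected, mean, diameter


if __name__ == "__main__":
    topology = random_regular_graph(n=32, d=4, seed=1)
    connected, mean_hops, diameter = path_stats(topology)
    print(f"connected={connected} mean_hops={mean_hops:.2f} diameter={diameter}")
```

Running the script with different seeds gives a feel for the stochastic guarantees mentioned in the summary: each wiring is a different graph, so properties like connectivity and diameter hold with high probability, not by construction.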