Flat Datacenter Networks at Scale at Amazon (perspectives.mvdirona.com)
83 points by tanelpoder 21 hours ago | 17 comments
epistasis 4 hours ago
Oh man, James Hamilton blog posts, I love these things!

(Edit: for more concrete details, the arXiv paper linked from the blog post is here https://arxiv.org/pdf/2604.15261 and the amazon.science link has a higher-level view of the details https://www.amazon.science/blog/how-flat-is-replacing-fat-in... )

> The results were striking: compared to traditional fat-tree networks, RNG (Resilient Network Graphs) uses 69% fewer routers, delivers 33% higher throughput, cuts network power by 40%, and lowers operating costs by 27. In early 2026, RNG became the default design for most newly built Amazon data centers globally.

> For cabling, they developed the ShuffleBox—a passive optical device whose internal wiring combined with randomized ShuffleBox-to-ShuffleBox cabling yields “quasi-random” graphs that behave like truly random graphs.

This is pretty incredible: random network layouts that have better properties on average... I'm really curious about the long tail of performance though. What is the worst-case scenario here? And are there some better-case scenarios? Uniformity in Clos networks is pretty great, but many loads don't need uniformity, and if these RNG-based networks have non-uniformity, perhaps that has operational characteristics that can be helpful or harmful.
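One way to get a feel for the average-versus-tail question is to sample a few random graphs and compare mean and worst-case hop counts. The sketch below is only an illustration under loud assumptions: it builds a generic random regular graph with the textbook pairing model, which is not Amazon's RNG construction and does not model the ShuffleBox. The node count `n=64`, degree `d=4`, and all function names are hypothetical choices made for the example.

```python
import random
from collections import deque


def random_regular_graph(n, d, rng):
    """Pairing model: give every node d port stubs, shuffle them, and wire
    them up in pairs. Retry from scratch on a self-loop or duplicate link.
    Assumption: small d, so retries stay cheap."""
    assert (n * d) % 2 == 0, "n * d must be even"
    while True:
        stubs = [node for node in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        adj = {node: set() for node in range(n)}
        ok = True
        for a, b in zip(stubs[::2], stubs[1::2]):
            if a == b or b in adj[a]:
                ok = False
                break
            adj[a].add(b)
            adj[b].add(a)
        if ok:
            return adj


def hop_counts(adj, src):
    """Breadth-first search: hop count from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def summarize(adj):
    """Return (mean hops, worst-case hops) over all reachable ordered pairs."""
    total = pairs = worst = 0
    for src in adj:
        for dst, hops in hop_counts(adj, src).items():
            if dst != src:
                total += hops
                pairs += 1
                worst = max(worst, hops)
    return total / pairs, worst


if __name__ == "__main__":
    # Each seed is one "random cabling"; compare the spread across seeds.
    for seed in range(5):
        adj = random_regular_graph(n=64, d=4, rng=random.Random(seed))
        mean, worst = summarize(adj)
        print(f"seed={seed} mean_hops={mean:.2f} worst_hops={worst}")
```

Hop count is only a rough proxy; it says nothing about throughput under a particular traffic pattern, which is where any non-uniformity would actually matter.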
socketcluster 4 hours ago
Interesting reading this because this is essentially the principle behind https://socketcluster.io/ scalability; the sharding of channels across available brokers is pseudo-random. It uses a hash function for determinism but the distribution appears to be random and that was also the best way I could find to distribute load evenly between available nodes. It is key to its embarrassingly parallel design. It's interesting to see it being done at the data centre level as well.
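The idea of deterministic but random-looking sharding can be sketched roughly as follows. This is not SocketCluster's actual code: the function name `broker_for`, the choice of SHA-256, and the channel naming are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def broker_for(channel: str, broker_count: int) -> int:
    """Map a channel name to a broker index (hypothetical helper).
    The hash makes the mapping deterministic, while its output is spread
    close to uniformly across brokers."""
    digest = hashlib.sha256(channel.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % broker_count


if __name__ == "__main__":
    # Assumed workload: 10,000 channel names spread over 4 brokers.
    channels = [f"chat/room-{i}" for i in range(10_000)]
    load = Counter(broker_for(c, 4) for c in channels)
    for broker, count in sorted(load.items()):
        print(f"broker {broker}: {count} channels")
```

One design caveat with plain modulo sharding: changing `broker_count` remaps most channels, so systems that resize often tend to reach for consistent or rendezvous hashing instead.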
protocolture an hour ago
I get the feeling I am missing some info (like what is meant here by Data Center Networks) that's preventing me from understanding what's happening here. I am guessing that this falls outside of the traditional rack/colo paradigm and has more to do with hyperscalers.

fdr 2 hours ago
I always like these randomized/semi-randomized network papers. Here's a little-known one you might enjoy if you liked this one. https://repositorio.unican.es/xmlui/handle/10902/23594

kev009 4 hours ago
It's not that dissimilar to how the Internet works. You have some steering, like IX peering switches and social/economic factors, but on the whole it is fairly random.
wofo 2 hours ago
The win in operating costs is impressive, bordering on the unbelievable (27x). Does anyone have a clue about where the win comes from?

mino 2 hours ago
Good 6-minute video explainer: https://youtube.com/watch?v=yDoRYRRPOA0

cyberax 2 hours ago
One interesting consequence of this is that it's now (for the first time!) possible to get unlimited AWS egress. It's not cheap, and it's limited to `us-east-1`, but it's at least _possible_ now via AWS Interconnect: https://aws.amazon.com/interconnect/lastmile/pricing/