cowsandmilk 5 hours ago

That is 100k vs 130k for Google’s new announcement. I can’t speak as to whether the additional 30k presented new challenges though.

Cthulhu_ 2 hours ago

I want to believe that this is an order-of-magnitude kind of problem, that is, if 100K is fine then 500K is also fine.

I only skimmed the article, but I'm confident that it's more of a physical hardware, time, space and electricity problem than a software / orchestration one. The article mentions that a cluster that size already needs to span multiple datacenters given the sheer power requirements (2700 watts for one GPU in a single node).