Cthulhu_ 2 hours ago
I want to believe that this is an order-of-magnitude kind of problem, that is, if 100K is fine then 500K is also fine. I only skimmed the article, but I'm confident that it's more a physical hardware, time, space and electricity problem than a software / orchestration one; the article mentions that a cluster that size already needs to be multi-datacenter given the sheer power requirements (2700 watts for one GPU in a single node).