Remix.run Logo
yorwba 6 hours ago

If by "they" you mean DeepSeek, they're not saying this, since you might not actually be able to rent a cluster of 512 H800s wired together with high-bandwidth interconnects at that GPU-hour price point. If you rent smaller groups of GPUs piecemeal in different locations and try to transfer weight updates between them over the internet, it'll kill your throughput.