brianwawok 4 hours ago
So many more efficiencies are possible at scale, though. I cannot keep a local model 98% utilized 24/7, at least not with my current workload. A big cloud can. I can’t power my servers with DC; I have this AC-to-DC conversion nonsense. The list goes on.
visarga 4 hours ago
Besides fill factor being hard to match, there is also scaling: you can't scale local inference 10x for a spike, but you can with cloud inference.