fnordpiglet · 8 hours ago
The problem is that even if an OSS effort had the resources (massive data centers the size of NYC, packed with top-end custom GPU kit) to produce the weights, you'd still need enormous VRAM-laden farms of GPUs to run inference on a model like Opus 4.6. Unless the very math of frontier LLMs changes, don't expect an on-par frontier OSS model to be practical.
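For a sense of scale, here's a minimal back-of-envelope sketch. The parameter count is a pure assumption (Anthropic doesn't publish sizes for the Opus line); the point is just that at frontier scale, even aggressively quantized weights exceed any single GPU by a wide margin.

    # Back-of-envelope VRAM estimate for serving a frontier-scale model.
    # The parameter count is a hypothetical placeholder: Anthropic does
    # not publish the size of the Opus models.
    params = 1.5e12          # assumed total parameters (illustrative)
    bytes_per_param = 1      # FP8 quantization: 1 byte per weight
    gpu_vram_gb = 80         # e.g. an 80 GB H100-class card

    weights_gb = params * bytes_per_param / 1e9
    min_gpus = weights_gb / gpu_vram_gb

    print(f"Weights alone: ~{weights_gb:,.0f} GB")
    print(f"GPUs just to hold the weights: ~{min_gpus:,.0f}")
    # KV cache, activations, and serving overhead push the real
    # requirement well above this floor.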
lukeschlather · 7 hours ago
I feel like you're overstating the resources required by a couple of orders of magnitude. You do need a GPU farm for training, but probably on the order of $100M, maybe $1B, of GPUs. Yes, that's a lot of GPUs, but they'll fit in a single datacenter, and even in dollar terms, plenty of individual buildings in NYC cost more than that.
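As a sanity check on those dollar figures, a quick sketch; the per-GPU price is an illustrative assumption, not a quote:

    # Rough sanity check of the "$100M to $1B of GPUs" claim.
    # The price below is an assumed street price for an H100-class
    # GPU, purely for illustration.
    gpu_price = 30_000  # USD per GPU (assumed)

    for budget in (100e6, 1e9):
        print(f"${budget / 1e6:,.0f}M buys roughly {budget / gpu_price:,.0f} GPUs")
    # ~3,300 GPUs at $100M and ~33,000 at $1B -- either count fits
    # comfortably inside a single datacenter.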
palmotea · 8 hours ago
> you'd still need enormous VRAM-laden farms of GPUs to run inference on a model like Opus 4.6.

It's probably a trade secret, but what's the actual per-user resource requirement to run the model?
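Nobody outside the lab can answer precisely, but the shape of the answer is that weights are shared across a whole batch of concurrent requests, so the per-user cost is a thin slice of a serving node. A hedged sketch, every number an assumption:

    # Hypothetical per-user share of an inference node. With continuous
    # batching, weight memory and compute are amortized across all
    # concurrent users; these figures are assumptions for illustration.
    node_gpus = 16            # assumed GPUs in one serving node
    gpu_cost_per_hour = 3.0   # assumed cloud rate per GPU-hour, USD
    concurrent_users = 200    # assumed requests batched on the node

    node_cost = node_gpus * gpu_cost_per_hour
    print(f"Per-user share: ~${node_cost / concurrent_users:.3f}/hour")
    # ~$0.24 per user-hour under these assumptions -- batching, not
    # per-user hardware, is what sets the economics.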
supern0va · 7 hours ago
There's already an ecosystem of essentially undifferentiated infrastructure providers selling cheap inference on open-weights models at pretty tight margins. If the open-weights models are good, there will be people looking to sell commodity access to them, much like a cloud provider selling you compute.