bigbinary 8 hours ago
On-premise LLMs are also getting better, and that likely won't stop; as costs rise with technical improvements, I'd expect cost-saving methods to improve alongside them.
horsawlarway 7 hours ago
I still think it's basically unavoidable that most people who might pay for API access will end up on-prem. Fixed costs, exact model pinning, outage resistance, enshittification resistance, better security, better privacy, etc. There are just so many compelling reasons to be on-prem instead of dependent on a third party hoovering up all your data and prompts and selling you overpriced tokens (which they eventually MUST be, because these companies have to make a profit at some point).

If the only counterbalance is "well, the API is cheaper than buying my own hardware"... that's a short-term problem. Hardware costs are going to drop over time, and capabilities are going to keep improving. It's already pretty insane how good a model I can run locally on two old RTX 3090s. Is it as good as modern Claude? No. Is it as good as Claude was 18 months ago? Yes.

Give it a decade for companies to really push into the "diminishing returns" of scaling and new models, combined with new hardware built with these workloads in mind, and I think on-prem is the pretty clear winner.
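The break-even math behind that counterbalance is easy to sketch. Here's a back-of-the-envelope comparison of marginal API token pricing against amortized local hardware; every number (hardware price, power draw, throughput, API rate) is an illustrative assumption, not a figure anyone in the thread quoted:

    # Back-of-the-envelope break-even: API tokens vs. on-prem hardware.
    # All constants are illustrative assumptions.

    HARDWARE_COST = 1600.0        # two used RTX 3090s, USD (assumed)
    POWER_DRAW_KW = 0.7           # both cards under load (assumed)
    ELECTRICITY_PER_KWH = 0.15    # USD per kWh (assumed)
    TOKENS_PER_SECOND = 40        # local generation throughput (assumed)
    API_PRICE_PER_MTOK = 10.0     # USD per million API tokens (assumed)

    def breakeven_mtok() -> float:
        # Marginal on-prem cost per million tokens is just electricity.
        seconds_per_mtok = 1_000_000 / TOKENS_PER_SECOND
        kwh_per_mtok = POWER_DRAW_KW * seconds_per_mtok / 3600
        onprem_per_mtok = kwh_per_mtok * ELECTRICITY_PER_KWH
        # Millions of tokens needed for the saving to cover the hardware.
        return HARDWARE_COST / (API_PRICE_PER_MTOK - onprem_per_mtok)

    if __name__ == "__main__":
        print(f"break-even after ~{breakeven_mtok():.0f}M tokens")

Under those assumptions the cards pay for themselves after roughly 170 million tokens; move any input (cheaper API tiers, pricier electricity, faster local throughput) and the break-even point shifts accordingly.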