Havoc 2 hours ago

This gives me hope that when the subsidization circus ends and everyone is on pure usage-based pricing, it won't be entirely exclusionary to mere mortals who don't have $200/month budgets.

542458 2 hours ago | parent | next [-]

IMO there are two things that make me optimistic that we won’t see a big rug pull where price-to-capability ratio skyrockets relative to today:

* As you’ve noted, people keep finding ways of slamming more intelligence into smaller models, meaning that a given hardware spec delivers more model capability over time.

* Hardware will continue to improve and supply will catch up to demand, meaning that a dollar will deliver more hardware spec over time.

I hope that one day we’ll look back on the current model of “accessing AI through provider APIs” the same way we now look back on “everyone connecting to the company mainframe.”

spacebanana7 an hour ago | parent [-]

I also hope that we’ll find effective ways to distribute load between small local models and heavyweight remote models. Sort of like what Apple tried to do in iOS.

So much of what I ask codex to do doesn’t require full GPT 5 intelligence, and if 75% of the tokens were generated locally that’d save a massive amount of cost.
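The routing idea above can be sketched in a few lines: score each prompt's difficulty and only escalate the hard ones to the remote model. This is purely illustrative — the heuristic, threshold, and backend names are my own assumptions, not any vendor's actual API; a real router would likely use a small classifier model rather than keyword matching.

```python
# Hypothetical local-first routing sketch: easy prompts go to a small
# local model, hard ones escalate to a heavyweight remote model.
# All names and the scoring heuristic here are illustrative assumptions.

def estimate_complexity(prompt: str) -> float:
    """Crude stand-in heuristic: longer prompts containing certain
    keywords are treated as harder. Returns a score in [0, 1]."""
    hard_markers = ("refactor", "architecture", "prove", "debug")
    score = min(len(prompt) / 2000, 1.0)
    score += 0.3 * sum(marker in prompt.lower() for marker in hard_markers)
    return min(score, 1.0)

def route(prompt: str, threshold: float = 0.5) -> str:
    """Decide which backend would handle this prompt."""
    return "remote" if estimate_complexity(prompt) >= threshold else "local"
```

With a router like this, a quick rename request stays local while a large refactoring task escalates, which is how most of the cheap tokens could be kept off the metered API.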

100ms an hour ago | parent | prev [-]

By the time the dust settles, I wouldn't be surprised if personal interactive usage couldn't even be had for under $200. I can't reconcile my modelling of the serving costs of these things with any public reporting, even the more bearish estimates.

Havoc an hour ago | parent [-]

Comes down to what you mean by interactive usage. Most chat and, say, openclaw usage is already within self-hosting range, so there's no need to spend $200 a month on that.

High-end SOTA coding is harder, but even there I suspect a mix of usage-based strong models and self-hosted small ones is viable if necessary.