mpyne 4 hours ago
They're responding to the people doing things like buying the most expensive Mac they can find specifically to run local inference for their AI agents. Some do it to keep control over their ability to use AI. Some do it because they think it will be cheaper than paying a SaaS to generate tokens for them. For those in the latter camp, though, it seems it's not actually cheaper after all, at least at current prices. And I don't expect prices to jump drastically, given how much competition there is in model development.
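The break-even arithmetic behind "cheaper than paying a SaaS" can be sketched roughly. Every number below is a placeholder assumption for illustration, not a real hardware price, API rate, or usage figure:

```python
# Rough break-even sketch: buying local hardware vs. paying an API per token.
# All numbers are hypothetical placeholders, not actual prices.

hardware_cost = 5000.0    # hypothetical: up-front cost of a high-end Mac for local inference
api_cost_per_mtok = 3.0   # hypothetical: blended $/million tokens from a SaaS provider
tokens_per_month = 50e6   # hypothetical: monthly token usage of a heavy agent workload

# What the same workload would cost on the API each month
monthly_api_bill = api_cost_per_mtok * tokens_per_month / 1e6

# Months of API spend needed to equal the hardware purchase
months_to_break_even = hardware_cost / monthly_api_bill

print(f"API bill: ${monthly_api_bill:.0f}/month")
print(f"Break-even: {months_to_break_even:.1f} months "
      f"(ignoring electricity, depreciation, and resale value)")
```

Under these made-up numbers the hardware pays for itself in under three years, but the result is very sensitive to the token rate and usage volume, which is why falling API prices flip the conclusion.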
datadrivenangel 3 hours ago
It's worth paying a premium for the privacy (assuming llama.cpp and ollama aren't sending my sessions back to the cloud regardless...), and to avoid the risk of a surprise bill.
dcrazy 2 hours ago
You also have control over your costs. It's reasonable to expect tokens to cost significantly more in the near-to-medium term as the market consolidates and subsidies decline.