i think we have the causation backwards here. llms aren't expensive because they have to be — they're expensive because we keep reaching for the expensive model instead of putting any effort into making the cheap one good enough.

a surprisingly large fraction of production workloads can be handled by smaller models with the right scaffolding. it's often easier to switch to a larger model than to engineer that scaffolding, so many teams never bother.

my intuition is that a lot of the current "ai cost crisis" is really an orchestration problem rather than a model pricing problem. before asking whether frontier pricing is sustainable, i'd first ask how much of that spend is simple tasks being sent to the smartest available model by default.

my bet for the next few years is that the model itself stops being where the value is. frontier models will become more like commodities, and the real difference will be the layer around them: routing each task to the cheapest model that can do it well, verifying the output, and only escalating when needed.
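to make that concrete, here's a rough sketch of what i mean by route, verify, escalate. everything in it is made up for illustration: the `Tier` class, the `cascade` function, the tier names, the relative costs, and the toy `nonempty` verifier are all assumptions, and the lambdas are stand-ins for real model calls. it's not any vendor's api, just the shape of the loop.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tier:
    name: str                       # hypothetical label, not a real model id
    cost_per_call: float            # made-up relative cost, not real pricing
    generate: Callable[[str], str]  # stand-in for an actual model call


def cascade(
    task: str,
    tiers: List[Tier],
    verify: Callable[[str, str], bool],
) -> Tuple[str, str, float]:
    """Try tiers from cheapest to priciest; stop at the first verified output.

    Returns (tier name used, output, total cost spent). If no tier passes
    verification, the output of the most expensive tier is returned as-is.
    """
    if not tiers:
        raise ValueError("need at least one tier")
    spent = 0.0
    used, output = "", ""
    for tier in sorted(tiers, key=lambda t: t.cost_per_call):
        output = tier.generate(task)
        spent += tier.cost_per_call   # failed attempts still cost money
        used = tier.name
        if verify(task, output):
            break                     # good enough, no need to escalate
    return used, output, spent


# toy stand-ins: the "small" model gives up on long inputs, the "large" never does
small = Tier("small", 1.0, lambda t: t.upper() if len(t) < 20 else "")
large = Tier("large", 20.0, lambda t: t.upper())


def nonempty(task: str, output: str) -> bool:
    # placeholder check; a real verifier might be a schema check, a test run, or a grader
    return bool(output)


if __name__ == "__main__":
    print(cascade("hi", [large, small], nonempty))
    print(cascade("x" * 30, [large, small], nonempty))
```

the thing to notice is that a failed cheap attempt still gets billed, so this only saves money if the cheap tier passes verification often enough and the verifier itself is cheap. that tradeoff is the orchestration problem.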

eventually, asking "which model do you use?" will sound a bit like asking "which cpu do you use?" the engine still matters, but the system built around it matters a lot more.

byzantinegene 10 hours ago | parent [-]

Unfortunately, the economics of what you suggest do not justify the trillion-dollar valuations of OpenAI and Anthropic.

	arbayi 10 hours ago | parent [-]
		[flagged]