b65e8bee43c2ed0 | 4 hours ago
Productivity (tokens per second per hardware unit) increases at the cost of output quality, but the price stays the same. Both Anthropic and OpenAI quantize their models a few weeks after release. They'd never admit it out loud, but it's more or less common knowledge now. No one has enough compute.
|
sthimons | 4 hours ago | parent
Pretty bold claim. Do you have a source for that?
Rapzid | 3 hours ago | parent
There is no evidence, to my knowledge, that model accuracy changes due to release cycles or capacity issues; only latency does. Both Anthropic and OpenAI have stated they don't do any inference-compute shenanigans in response to load or as post-release optimization. There are tons of conspiracy theories and accusations, but I've never seen any compelling studies (or even raw data) to back any of it up.
|
|
cebert | 4 hours ago | parent
| Do you have a source for that claim? |
b65e8bee43c2ed0 | 3 hours ago | parent
My source is that people have been noticing this since the GPT-4 days: https://arxiv.org/pdf/2307.09009. But of course, this isn't a written statement by a corporate spokesperson. I don't think breweries make such statements when they water down their beer either.
|