Yeah, happy to be more specific. No intention of making any technically true but misleading statements.

The following are true:

- In our API, we don't change model weights or model behavior over time (e.g., by time of day, or weeks/months after release)

- Tiny caveats include: there is a bit of non-determinism in batched non-associative math that can vary by batch / hardware, bugs or API downtime can obviously change behavior, heavy load can slow down speeds, and this of course doesn't apply to the 'unpinned' models that are clearly supposed to change over time (e.g., xxx-latest). But we don't do any quantization or routing gimmicks that would change model weights.

- In ChatGPT and Codex CLI, model behavior can change over time (e.g., we might change a tool, update a system prompt, tweak default thinking time, run an A/B test, or ship other updates); we try to be transparent with our changelogs (listed below) but to be honest not every small change gets logged here. But even here we're not doing any gimmicks to cut quality by time of day or intentionally dumb down models after launch. Model behavior can change though, as can the product / prompt / harness.

ChatGPT release notes: https://help.openai.com/en/articles/6825453-chatgpt-release-...

Codex changelog: https://developers.openai.com/codex/changelog/

Codex CLI commit history: https://github.com/openai/codex/commits/main/

▲

jychang an hour ago | parent | next [-]

What about the juice variable?

https://www.reddit.com/r/OpenAI/comments/1qv77lq/chatgpt_low...

	▲	tedsanders an hour ago \| parent \| next [-]
		Yep, we recently sped up default thinking times in ChatGPT, as now documented in the release notes: https://help.openai.com/en/articles/6825453-chatgpt-release-... The intention was purely making the product experience better, based on common feedback from people (including myself) that wait times were too long. Cost was not a goal here. If you still want the higher reliability of longer thinking times, that option is not gone. You can manually select Extended (or Heavy, if you're a Pro user). It's the same as at launch (though we did inadvertently drop it last month and restored it yesterday after Tibor and others pointed it out).
	▲	tgrowazay an hour ago \| parent \| prev [-]
		Isn’t that just how many steps at most a reasoning model should do?

▲

ComplexSystems an hour ago | parent | prev [-]

Do you ever replace ChatGPT models with cheaper, distilled, quantized, etc ones to save cost?

	▲	jghn an hour ago \| parent [-]
		He literally said no to this in his GP post