taurath 5 hours ago
These two sentences right next to each other stood out to me:

> a new step towards becoming a reliable coding partner

> GPT‑5.1-Codex-Max is built for long-running, detailed work

Does this not sound contradictory? It's been the shorter-form work that has built what little confidence I have in these models as a coding partner - a model that goes off and does work without supervision is not a partner to me.
causal 5 hours ago
Absolutely contradictory. The long-running tendency of Codex is why I cannot understand the hype around it: if you bother to watch what it does and read its code, the approaches it takes are horrifying. It would rather rewrite a TLS library from scratch than bother to ask you if the network is available.
embirico 5 hours ago
(Disclaimer: Am on the Codex team.)

We're basically trying to build a teammate that can do short, iterative work with you, and then, as you build trust (and configuration), you can delegate longer tasks to it. The "# of model-generated tokens per response" chart in [the blog introducing gpt-5-codex](https://openai.com/index/introducing-upgrades-to-codex/) shows an example of how we're making the model good at both.
ntonozzi 5 hours ago
If you haven't, give Cursor's Composer model a shot. It might not be quite as good as the top models, but in my experience it's almost as good, and the lightning-fast feedback is more than worth the tradeoff. You can give it a task, wait ten seconds, and evaluate the results. It's quite common for the result to not be good enough, but it's no worse than Sonnet, and if it doesn't work you've wasted 30 seconds instead of 10 minutes.