I think you should try an OpenAI model like GPT 5.5. It is better at following instructions and respecting the boundaries set in the prompt. It feels like a more capable "agent assistant" than the Claude models, without any loss of intelligence.

Most of my work involves "agentic engineering" rather than fire-and-forget. I like to stay involved during planning as well as review, and I ask the agent a lot more questions than I've seen others do. In a way, I'm using the agent in a sort of "hyper auto-complete" mode to fill in the blanks (rather big blanks) once I've set out the requirements, scope and design (sometimes down to specific module boundaries). This works best for me.


ifwinterco 7 hours ago

I prefer GPT 5.5 to Opus, but both are absurdly expensive token hogs. I can't afford to use either as my main model at $work with the monthly spend cap we have.

I use Composer (since we use Cursor) or GPT 5.3-codex as my workhorse models and only break out the big guns when I have a genuinely difficult problem to solve.

IMO, somewhat weirdly, 5.3-codex might be the best overall coding model OpenAI have ever released. It's 90% as good as 5.5 and costs about 20% as much, since it's both cheaper per token and uses fewer tokens for the same task.

I'll miss it when they inevitably deprecate it, but hopefully I can use Kimi K2.7 by then.

skeptic_ai 2 hours ago

Buy 5 accounts at $20 each. That's $100 total, and it lasts decently on single-threaded work.

m3h 7 hours ago

I didn't realize GPT 5.3 Codex was that good. OpenAI claims its new Terra model is as good as GPT 5.5 at half the cost per unit of intelligence. Hopefully this will bring it closer to the price you're expecting (or even better, considering GPT models have good acceptance/success rates according to benchmarks).