mkozlows 13 hours ago
I'm not remotely cutting edge (just switched from Cursor to Codex CLI, have no fancy tooling infrastructure, am not even vaguely considering git worktrees as a means of working), but Opus 4.5 and 5.2 Codex are both so clearly more competent than previous models that I've started just telling them to do high-level things rather than trying to break things down and give them subtasks. If people are really set in their ways, maybe they won't try anything beyond what old models can do, and won't notice a difference, but who's had time to get set in their ways with this stuff?
christophilus 12 hours ago
I mostly agree, but today Opus 4.5 via Claude Code did some pretty dumb stuff in my codebase: N queries where one would do, deep array comparison where a reference equality check would suffice, a very complex web of nested conditionals that a competent developer would never have written, some edge cases where the backend endpoints didn’t properly verify user permissions before overwriting data, etc. It’s still hit or miss. The product “worked” when I tested it as a black box, but the code had a lot of rot in it already. Maybe that stuff no longer matters. Maybe it does. Time will tell.
nineteen999 6 hours ago
Also not a cutting-edge user, but I do run my own LLMs at home and have been spending a lot of time with Claude CLI over the last few months. It's fine if you want Claude to design your APIs without any input, but you'll have less control, and when you dig down into the weeds you'll realise it's created a mess. I like to take both a top-down and a bottom-up approach: design the low-level API with Claude fleshing out how it's supposed to work, then design the high-level functionality, and then tell it to stop implementing when it hits a problem reconciling the two and the lower-level API needs revision. That's at least for things I'd like to stand the test of time; if it's just a throwaway script or tool, I care much less, as long as it gets the job done.