cmrdporcupine 8 hours ago
Do this -- take your coworker's PRs that they've clearly written in Claude Code, and have Codex/GPT 5.4 review them. Or have Codex review your own Claude Code work. It then becomes clear just how "sloppy" CC is. I wouldn't mind having Opus around in my back pocket to yeet out whole net-new greenfield features. But I can't trust it to produce well-engineered things to my standards. Not that anybody should trust an LLM to that level, but there are matters of degree here.
kevinsync 7 hours ago
I've been using Claude and Codex in tandem ($100 CC, $20 Codex), and have made heavy use of claude-co-commands [0] to make them talk. Outside of the last 1-2 weeks (we now have confirmation YET AGAIN that Claude shits the fucking bed in the run-up to a new model release), I usually put Claude on max + /plan to gin up a fever dream to implement. When the plan is presented, I tell it to /co-validate with Codex, which tends to fill in many implementation gaps. Claude then codes the amended plan and commits, and I have a Codex skill that reviews the commit for gaps, missed edge cases, incorrect implementation, missed optimizations, etc., and fixes them. This had been working quite well up until the beginning of the month, when Claude more or less got CTE; after a week of that I swapped to $100 Codex, $20 CC plans. Now I'm using co-validation a lot less and just driving primarily via Codex. When Claude works, it provides some good collaborative insights and counterpoints, but Codex at the very least is consistently predictable (for text-oriented, data-oriented stuff -- I don't use either for designing or implementing frontend / UI / etc.). As always, YMMV!
afavour 8 hours ago
> It then becomes clear just how "sloppy" CC is.

Have you done the reverse? In my experience, models will always find something to criticize in another model's work.
woadwarrior01 8 hours ago
It cuts both ways. What I usually do these days is let Codex write the code, then use Claude Code's /simplify, have both Codex and Claude Code review the PR, and finally manually review and fix things up myself. It's still ~2x faster than doing everything myself.