▲ Asyne 2 hours ago
+1 to this. I've found GPT/Codex models consistently stronger on engineering tasks (such as debugging complex cross-system issues, concurrency problems, etc.). I use both OpenAI and Anthropic models, though for different purposes. What surprises me is how underrated GPT still feels (or, alternatively, how overhyped Anthropic models can be) given how capable it is in these scenarios. There also seems to be relatively little recognition of this in the broader community (like your recent YouTube video). My guess is that demand skews toward general codegen rather than the kind of deep debugging and systems work where these differences really show.
▲ mediaman 2 hours ago | parent | next [-]
It's surprising to me how much LLM "personality" seems to matter to people, more than actual capability. I do turn to Anthropic for ideation and non-tech things, but I find little reason to use it over Codex for engineering tasks. Sometimes for planning, but even there, 5.4 is more critical of my questionable ideas and will often come up with simpler ways to do things (especially when prompted), which I appreciate. And I don't do hard-tech things! I've chosen a B2B field where I can provide competent products for an underserved niche where long-term relationships matter, simply because I'm not some brilliant engineer who can completely reinvent how something is done. I'm not writing kernels or complex ML stacks. So I don't really understand what everyone is building where they don't see the limits of Opus. Maybe small greenfield projects with few users.
▲ dvfjsdhgfv an hour ago | parent | prev [-]
I use Codex for cleaning up after Claude, and it always finds so many bugs, some of them quite obvious.