| ▲ | johnfn 5 hours ago | ||||||||||||||||||||||||||||||||||
I've been using a lot of Claude and Codex recently. One huge difference I notice between Codex and Claude code is that, while Claude basically disregards your instructions (CLAUDE.md) entirely, Codex is extremely, painfully, doggedly persistent in following every last character of them - to the point that i've seen it work for 30 minutes to convolute some solution that was only convoluted because of some sentence I threw in the instructions I had completely forgotten about. I imagine Codex as the "literal genie" - it'll give you exactly what you asked for. EXACTLY. If you ask Claude to fix a test that accidentally says assert(1 + 1 === 3), it'll say "this is clearly a typo" and just rewrite the test. Codex will rewrite the entire V8 engine to break arithmetic. Both these tools have their uses, and I don't think one approach is universally better. Because Claude just hacks its way to a solution, it is really fast, so I like using it for iterate web work, where I need to tweak some styles and I need a fast iterative loop. Codex is much worse at that because it takes like 5 minutes to validate everything is correct. Codex is much better for longer, harder tasks that have to be correct -- I can just write some script to verify that what it did work, and let it spin for 30-40 minutes. | |||||||||||||||||||||||||||||||||||
| ▲ | hadlock 4 hours ago | parent | next [-] | ||||||||||||||||||||||||||||||||||
I've been really impressed with codex so far. I have been working on a flight simulator hobby project for the last 6 months and finally came to the conclusion that I need to switch from floating origin, which my physics engine assumes with the coordinate system it uses, to a true ECEF coordinate system (what underpins GPS). This involved a major rewrite of the coordinate system, the physics engine, even the graphics system and auxilary stuff like asset loading/unloading etc. that was dependent on local X,Y,Z. It even rewrote the PD autopilot to account for the changes in the coordinate system. I gave it about a paragraph of instructions with a couple of FYIs and... it just worked! No major graphical glitches except a single issue with some minor graphical jitter, which it fixed on the first try. In total took about 45 minutes but I was very impressed. I was unconvinced it had actually, fully ripped out the floating origin logic, so I had it write up a summary and then used that as a high level guide to pick through the code and it had, as you said, followed the instructions to the letter. Hugely impressive. In march of 2023 OpenAI's products struggled to draw a floating wireframe cube. | |||||||||||||||||||||||||||||||||||
| ▲ | nico 4 hours ago | parent | prev | next [-] | ||||||||||||||||||||||||||||||||||
> Claude basically disregards your instructions (CLAUDE.md) entirely A friend of mine tells Claude to always address him as “Mr Tinkleberry”, he says he can tell when Claude is not paying attention to the instructions on CLAUDE.md when Claude stops calling him “Mr Tinkleberry” consistently | |||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||
| ▲ | causal 4 hours ago | parent | prev | next [-] | ||||||||||||||||||||||||||||||||||
> Codex will rewrite the entire V8 engine to break arithmetic. This isn't an exaggeration either. Codex acts as if it is the last programmer on Earth and must accomplish its task at all costs. This is great for anyone content to treat it like a black box, but I am not content to do that. I want a collaborator with common sense, even if it means making mistakes or bad assumptions now and then. I think it really does reflect a difference in how OpenAI and Anthropic see humanity's future with AI. | |||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||
| ▲ | ramoz 18 minutes ago | parent | prev | next [-] | ||||||||||||||||||||||||||||||||||
Ultimately, relying on system level instructions is unreliable over time. Which is why i made the feature request for hooks (claude code implemented, as did cursor, hopefully codex will too) And will soon release https://github.com/eqtylab/cupcake | |||||||||||||||||||||||||||||||||||
| ▲ | sinatra 4 hours ago | parent | prev | next [-] | ||||||||||||||||||||||||||||||||||
In my AGENTS.md (which CLAUDE.md et al soft link to), I instruct them to "On phase completion, explicitly write that you followed these guidelines." This text always shows up on Codex and very rarely on Claude Code (TBF, Claude Code is showing it more often lately). | |||||||||||||||||||||||||||||||||||
| ▲ | sunaookami 3 hours ago | parent | prev | next [-] | ||||||||||||||||||||||||||||||||||
Agreed 100%, that's why I would recommend Codex for e.g. logfile analysis. Had some annoying php warnings in the logs from a WordPress plugin because I've used another plugin in the past (like... over 10 years ago) that wrote invalid metadata for every media file into the database and it didn't annoy me THAT much that I wanted to invest much time into it. So I gave codex the logfile and my WordPress dir and access to the WP-CLI command and it correctly identified the issue and wrote scripts to delete the old metadata (I did check it & make backups of course). Codex took a LOT of time though, it's veeeeeeery slow as you said. But I could do other things in the meantime. | |||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||
| ▲ | tekacs 2 hours ago | parent | prev | next [-] | ||||||||||||||||||||||||||||||||||
Yeah, Gemini 2.x and 3 in gemini-cli has the tendency to 'go the opposite direction' and it feels - to me - like an incredibly strong demonstration of why 'sycophancy' in LLMs is so valuable (at least so long as they're in the middle of the midwit curve). I'll give Gemini direction, it'll research... start trying to solve it as I've told it to... and then exclaim, "Oh! It turns out that <X> isn't what <user> thought!" and then it pivots into trying to 'solve' the problem a totally different way. The issue however... is that it's: 1) Often no longer solving the problem that I actually wanted to solve. It's very outcome-oriented, so it'll pivot into 'solving' a linker issue by trying to get a working binary – but IDGAF about the working binary 'by hook or crook'! I'm trying to fix the damn linker issue! 2) Just... wrong. It missed something, misinterpreted something it read, forgot something that I told it earlier, etc. So... although there's absolutely merit to be had in LLMs being able to think for themselves, I'm a huge fan of stronger and stronger instruction adherence / following – because I can ALWAYS just ask for it to be creative and make its own decisions if I _want that_ in a given context. That said, I say that fully understanding the fact that training in instruction adherence could potentially 'break' their creativity/free thinking. Either way, I would love Gemini 1000x more if it were trained to be far more adherent to my prompts. | |||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||
| ▲ | aerhardt 3 hours ago | parent | prev | next [-] | ||||||||||||||||||||||||||||||||||
Well surely that's a good thing. In my experience, for some reason adherence is not even close to 100%. It's fixated on adding asterisk function params in my Python code and I cannot get it to stop... Maybe I haven't found the right wording, or maybe my codebase has grown past a certain size (there are like a dozen AGENTS.md files dancing around). I'm still very happy with the tool, though. | |||||||||||||||||||||||||||||||||||
| |||||||||||||||||||||||||||||||||||
| ▲ | bugglebeetle 2 hours ago | parent | prev | next [-] | ||||||||||||||||||||||||||||||||||
The solution to this if you want less specification in advance is to simply ask Codex a series of leading questions about a feature of fix. I typically start with something like “it seems like X could be improved with the addition of Y? Can you review the relevant parts of the codebase in a, b, and c to assess?” It will then do so and come back with a set of suggestions that follow this guidance, which you can revise and selectively tell it to implement. In my experience, this fills the context with the appropriate details to then let it make more of its own decisions in a generally correct way without as much handholding. | |||||||||||||||||||||||||||||||||||
| ▲ | energy123 3 hours ago | parent | prev [-] | ||||||||||||||||||||||||||||||||||
GPT-5 is like that | |||||||||||||||||||||||||||||||||||