aerhardt 3 hours ago

> the quality differential versus the competitor is night and day.

This myth about the inferiority of ChatGPT and Codex is becoming a meme.

I have active subscriptions to both. I am throwing all kinds of data engineering, web development and machine learning problems at Codex, and I had been working on non-tech tasks in the "Karpathy Obsidian Wiki" [1] style before he posted about it.

Not only does Codex crush Claude on cost, it's also significantly better at adherence and overall quality. Claude is there on my Mac, gathering dust, to the point that I am thinking of not renewing the sub.

There are plenty of fellow HNers here who feel the same from what I read in the flamewars. I suspect none of us really has a horse in this race and many are half-competent (in other threads, they mention they do things like embedded programming, distributed DL systems, etc.)

I'm starting to suspect a vast majority of people pushing the narrative that Claude is vastly better haven't even tried the 5.3 / 5.4 models and are doing it out of sheer tribalism.

[1] https://gist.github.com/karpathy/442a6bf555914893e9891c11519...

selectively 2 hours ago

I have access to effectively infinite API tokens for all models from Anthropic as well as OpenAI. The differential in performance on complex tasks is vast and strongly in favor of Opus, in my experience. I do not use the official harnesses for either model, though, as they are not to my taste.

Codex is closer to my taste, as it is at least a native app and not TypeScript slop. But the model is just not up to snuff.

benjiro3000 an hour ago

[dead]