taormina 7 days ago

I've used a wide variety of the "best" models, and I've mostly settled on Opus 4 and Sonnet 4 with Claude Code, but they don't ever actually get better. Grok 3-4 and GPT-4 were worse, but at a certain point you don't get brownie points for not tripping over a bar that is set this low.

generalizations 7 days ago

People have actually been basing their assertions on 4o. The bar is really low and people are still completely missing it.