taormina 7 days ago
I've used a wide variety of the "best" models, and I've mostly settled on Opus 4 and Sonnet 4 with Claude Code, but they don't ever actually get better. Grok 3-4 and GPT-4 were worse, but at a certain point you don't get brownie points for not tripping over a bar that's set this low.
generalizations 7 days ago
People have actually been basing their assertions on 4o. The bar is really low and people are still completely missing it.