pbowyer 7 hours ago
> I feel like this benchmark reiterates my disbelief that anyone uses the latest Anthropic models for any productive work. They seem to be the best at burning tokens and spawning unnecessary subagents even for well-defined and tightly scoped tasks.

I keep Claude around for some specific tasks:

- Linked up to Figma MCP to implement front-end stuff
- Data analysis, in the "Connect AI to a data source and ask questions" way. I've tried both Opus 4.8 high and GPT 5.5 high for this, and Opus is stronger because it gets the intent of the question better.

I used to keep it around for planning too, but the 4.8 plans have had more holes than Swiss cheese.