| ▲ | persedes 8 hours ago | |
Interesting that the MCP-Atlas score for Opus 4.6 jumped to 75.8% compared to 59.5%: https://www.anthropic.com/news/claude-opus-4-6 There are other small single-digit differences, but I doubt that the benchmark is that unreliable...? | ||
| ▲ | usaar333 7 hours ago | |
The page has been updated to state: "MCP-Atlas: The Opus 4.6 score has been updated to reflect revised grading methodology from Scale AI." | ||