| ▲ | raincole 12 hours ago | |
It has always been like this. We actually know that model performance has been mostly steady[0], but you cannot beat the notion of "evil companies secretly serving us worse models." The meme value is too strong.
| ▲ | mnicky 4 hours ago | |
Hmm, today's pass rate rose to 73%. Interesting, are they A/B testing some new model? This is too high for Opus 4.7.