simonw 5 days ago
Anthropic have stated on the record several times that they do not update the model weights once they have been deployed without also changing the model ID.
jjani 5 days ago
No, they do change deployed models. How can I be so sure? Evals. There was a point where Sonnet 3.5 v2 would happily output 40k+ tokens in one message if asked. Then one day it started, with 99% consistency, outputting "Would you like me to continue?" after far fewer tokens than that. We'd been running the same set of evals throughout, so we could definitively confirm the change. Googling will also turn up many reports of this. Whatever they did, in practice they lied: the API behavior of a deployed model changed.

Another one: differing performance (not latency, but output on the same prompt) over 100+ runs between AWS Bedrock-hosted Sonnet and direct Anthropic API Sonnet, same model version. The difference was too statistically significant to be explained by random chance.

Don't take what model providers claim at face value.
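A comparison of this kind can be sketched as a two-proportion z-test on how often a marker, such as the "Would you like me to continue?" prompt, shows up in each batch of runs. This is only a minimal sketch and not the eval setup described above: the function names `asks_to_continue` and `two_proportion_z_test` are hypothetical, and the normal approximation assumes a reasonably large number of runs per batch.

```python
import math


def asks_to_continue(output: str) -> bool:
    """Hypothetical marker check: does the output contain the continue prompt?"""
    return "would you like me to continue" in output.lower()


def two_proportion_z_test(hits_a: int, n_a: int, hits_b: int, n_b: int):
    """Return (z, two-sided p-value) for H0: both batches share one marker rate.

    hits_a / n_a: marker count and run count for batch A (e.g. an older eval run)
    hits_b / n_b: marker count and run count for batch B (e.g. a newer eval run,
                  or the same prompt sent to a different hosting provider)
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each batch needs at least one run")

    rate_a = hits_a / n_a
    rate_b = hits_b / n_b

    # Pooled rate under the null hypothesis that nothing changed.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    if std_err == 0:
        # Every run agreed in both batches, so there is no difference to test.
        return 0.0, 1.0

    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With made-up counts, `two_proportion_z_test(2, 100, 99, 100)` yields a p-value far below any usual threshold, while identical counts in both batches yield a p-value of 1. A small p-value only says the two batches behave differently; it does not say why.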