rishabhaiover 7 hours ago

Boris gaslighted us for weeks about all the quality-related incidents, never acknowledging these problems.

throwaway2027 6 hours ago | parent [-]

Maybe he didn't know, or they were still figuring it out, which is fine. They're still engineers who can get things wrong sometimes. But the communication felt lackluster, and being on the receiving end sucks when you had a reliable setup that then degrades. There is a reason people don't upgrade software and why people say "if it works, don't fix it," but obviously that's not an option for Anthropic when they want to keep improving the product. So they need good measurement tools and quick rollbacks, even if properly "benchmarking" LLMs could prove difficult.