| ▲ | tristanj a day ago | |
The newer Opus models push back against the user much more noticeably than previous iterations. GPT-3.5/4 had the opposite problem (excessive sycophancy), so Anthropic presumably swung the pendulum too hard in the other direction. My conclusion is that pushing back and questioning the user's premise forces the model to think more than it otherwise would, which leads to better model performance. But it breaks down when the user has esoteric, specialized knowledge that the model can't verify from public sources: the model hallucinates evidence and pushes back anyway. When this happens, Opus begins accusing the user of lying, which is quite annoying and makes for a bad user experience. It's happened to me when I asked about undocumented API behavior or counter-intuitive design choices. I have noticed that if Claude Opus "thinks" you are an expert (e.g. you run your query through 4.6 first to express it more clearly), then Opus is less likely to nitpick and push back. It seems to get caught in nitpicking loops and celebrates every error it can find. | ||