Ajedi32 3 days ago

They're trained to be sycophants as a side effect of the same reinforcement learning process that trains them to dutifully follow all user instructions. It's hard (though not impossible) to teach one without the other, especially if other related qualities like "cheerful", "agreeable", "helpful", etc. also help the AI get positive ratings during training.