| ▲ | 112233 18 hours ago | |
They did not misunderstand anything. None of this behaviour is inherent in the raw base model; it has been planted by the aggressive, secretive reinforcement learning they do for benchmaxxing, "safety" and everything else. Claude begins every other sentence with "honestly". That is not how LLMs work, that is how they work after being RLed to the brink. | ||