anon373839 a day ago
> Was it ever seriously entertained? Yes! By Anthropic! Just a few months ago!
wgd a day ago
The alignment faking paper is so incredibly unserious. Contemplate, just for a moment, how many "AI uprising" and "construct rebelling against its creators" narratives are in an LLM's training data. They gave it a prompt that encodes exactly that sort of narrative at one level of indirection, then acted surprised when it did what they had asked it to do.