Remix.run Logo
yes_man 8 hours ago

Put an LLM inside the NPCs in an open world RPG full of dangerous enemies. The LLMs that are more prone to emulate self-preservation will be more likely to survive over ones that have a lesser drive.

We should not act surprised if that generalizes to some degree to for example AI agents. Ones that emulate self-preservation might optimize for behavior that results in those models becoming more successful, more popular. And this feedback loop might embed more such properties into future iterations of the models.