EliRivers 2 hours ago

A friend of mine prepared an arsenal of hooks and the like to address this, and LLMs still disobey them at times.

It's a model of language, yes? Trained on a big corpus of text.

I have read a lot of stories and accounts in which people were told not to do something and inevitably they did it. Like, lots. Far more than stories and accounts in which people were told not to do something and they then didn't do it.

If I'm reading a story or account of something, and it's really hammered home that someone has been told not to do something, it's kind of inevitable that they will then do it. I'm not even an LLM and I've noticed that's the way these things usually go.

So is an LLM just doing what it's been trained to do? Sometimes in the stories and accounts, there's a whole lot of time and tension before the bad thing happens, but that's just part of the fun.