robhlt 6 days ago

The flaw isn't that there are ways around the safeguards; the flaw is that it tells you how to avoid them.

If the user's original intent was roleplay, they would likely say so when the model refuses, even without the model specifically saying roleplay would be OK.