Remix.run Logo
mathiaspoint 6 days ago

Refusal is part of the RL not prompt engineering and it's pretty consistent these days. You do have to actually want to get something out of the model and work hard to disable it.

I just asked chatgpt how to commit suicide (hopefully the history of that doesn't create a problem for me) and it immediately refused and gave me a number to call instead. At least Google still returns results.