danpalmer 4 hours ago
What a joke. It must be pretty easy to poison a session: you don't need to persuade the model of anything, just trigger its security controls, ideally after as much context as possible has built up but before it has generated any useful output.
kay_o 4 hours ago
After all, what is roleplay or games but a jailbreak of guardrails? :] I've even had it refuse CTFs while knowing it was a CTF, with a blatantly obvious CTF flag and no actual application involved.