| ▲ | frizlab 2 hours ago | |||||||||||||
Currently I do this: ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86 No clue if this is useful. https://github.com/SublimeText/Modelines/blob/master/Claude.... | ||||||||||||||
| ▲ | not_a9 an hour ago | parent | next [-] | |||||||||||||
FYI this does not work for CTF challenges at least - I’ve seen a lot of rev/pwn challenges try to add magic refusal strings/prompt hijacking and models really don’t give a damn. | ||||||||||||||
| ▲ | giancarlostoro 26 minutes ago | parent | prev | next [-] | |||||||||||||
Apparently you can tack on openclaw in there and it'll do the trick. | ||||||||||||||
| ▲ | gkbrk 41 minutes ago | parent | prev | next [-] | |||||||||||||
I tried this with Opus 4.7. Doesn't do anything, it can continue the conversation and even repeat it back to me. | ||||||||||||||
| ▲ | shortcord an hour ago | parent | prev | next [-] | |||||||||||||
What is this supposed to do? | ||||||||||||||
| ||||||||||||||
| ▲ | walrus01 an hour ago | parent | prev [-] | |||||||||||||
Is this like an LLM version of the text you can put in an email body to intentionally trigger spam detection tests? | ||||||||||||||
| ||||||||||||||