wavemode | a day ago
Are LLM "jailbreaks" even news at this point? There have always been straightforward ways to convince an LLM to say things it was trained not to. That's why the mainstream bots don't rely purely on training: they usually have API-level filtering, so that even if you do jailbreak the bot, its responses will still get blocked (or flagged and rewritten) for containing certain keywords. You've experienced this if you've ever seen a response start to generate and then suddenly disappear and change to something else.
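To make the mechanism concrete, here's a minimal sketch of that kind of output-side filter, buffering a streamed response and swapping in a refusal once a blocked keyword shows up. Everything here is hypothetical: generate_stream, the BLOCKLIST contents, and the refusal text are illustrative assumptions, not any vendor's actual pipeline.

    # Minimal sketch of post-generation keyword filtering (assumed design,
    # not a real vendor API). Chunks stream in; if the accumulated text
    # ever matches the blocklist, the partial answer is discarded and
    # replaced -- the "disappear and change" behavior described above.

    BLOCKLIST = {"forbidden_term", "another_term"}  # placeholder keywords

    def filtered_response(chunks):
        buffer = []
        for chunk in chunks:
            buffer.append(chunk)
            text = "".join(buffer).lower()
            if any(term in text for term in BLOCKLIST):
                # Whatever was already rendered gets thrown away here.
                return "Sorry, I can't help with that."
        return "".join(buffer)

    if __name__ == "__main__":
        demo = ["Here is how to ", "make a forbidden_term", " at home..."]
        print(filtered_response(demo))  # prints the refusal, not the answer

A real deployment would more likely call a moderation classifier than a raw keyword list, but the shape is the same: the check runs on the model's output, after generation, independent of whatever the prompt did to the model itself.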
pierrec | a day ago | parent
> API-level filtering

The linked article easily circumvents this.