Retr0id 4 hours ago

The fact that LLMs are "smarter" is also their weakness. An old-school classifier is far from foolproof, but you won't get past it by telling it about your grandma's bedtime story routine.

reassess_blind 2 hours ago

Fairly hard to bypass the latest LLMs with grandma's bedtime story these days, to be fair.

Retr0id 2 hours ago

That specific trick, yes, but the general concept still applies.

reassess_blind 2 hours ago

It does, but it's certainly not trivial. In fact there's an unclaimed $1000 bounty on prompt injecting OpenClaw: https://hackmyclaw.com/