kubb 3 days ago

I’m sorry, what kind of rule is that? How does it guarantee security?

It sounds like we’re making things up at this point.

bawolff 3 days ago | parent | next [-]

It kind of sounds like a weak version of airgapping. If you can't persist state, access private data, or exfiltrate data, there's not much point in jailbreaking the LLM.

However, it's deeply unsatisfying in the same way that securing your laptop by never turning it on is.

imtringued 3 days ago | parent | prev [-]

Yeah, it's nonsense: the author has described the standard "read, process, write" flow of computation and decided that if you remove one of the three, everything is safe.

The correct solution is to mechanically decouple the system prompt from untrustworthy data, the same way CSP (Content Security Policy) decouples trusted script from injected markup to stop XSS, and named parameters decouple query structure from user input to stop SQL injection.
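As a minimal sketch of that decoupling idea in the SQL case (my own illustrative example, not from the thread): with named parameters, the query's structure is fixed when the code is written, and untrusted input can only ever fill a value slot, never rewrite the statement itself.

```python
import sqlite3

# In-memory database with one row of "trusted" data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

# Attacker-controlled string that would break out of a
# naively interpolated query.
untrusted = "alice' OR '1'='1"

# Unsafe pattern (shown only as a comment): string formatting
# mixes untrusted data into the query structure.
#   f"SELECT role FROM users WHERE name = '{untrusted}'"

# Safe pattern: the named parameter keeps the data out of the
# parse tree, so the injection payload is just a literal string.
rows = conn.execute(
    "SELECT role FROM users WHERE name = :name",
    {"name": untrusted},
).fetchall()
print(rows)  # → [] — no user is literally named "alice' OR '1'='1"
```

The analogy to prompts is that today's LLMs have no such parse-time boundary: instructions and data arrive in the same token stream, which is why the decoupling has to be built mechanically around the model rather than inside it.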

simonw 3 days ago | parent [-]

That's difficult but not impossible - the CaMeL paper from Google DeepMind describes a way of achieving that: https://simonwillison.net/2025/Apr/11/camel/