SAI_Peregrinus 6 hours ago

LLMs don't make a distinction between prompt & data. There's no equivalent of an "NX bit", and AFAIK nobody has figured out how to create one. And even if we had one it wouldn't stop all security issues, just as adding the NX bit to CPUs didn't stop all remote code execution attacks. So the best options we have right now tend to be based on applying existing security mechanisms to the LLM agent process: run it as a dedicated user so regular filesystem permissions restrict which files it can read or write, and use other mechanisms (cgroups, firewall rules, etc.) to restrict access to other resources like outgoing network connections and hardware. But as long as untrusted data can contain instructions, the LLM's output can contain secret data, and if the human using the LLM doesn't notice & copies that output somewhere public, the exfiltration still succeeds.
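To make the "run it as a dedicated user" idea concrete, here's a minimal Python sketch of a launcher that spawns an agent process under a low-privilege account with a few crude resource caps. The user name "llm-agent" and the command "agent.py" are placeholders, and the launcher itself would need to start as root to drop privileges; outbound-network restrictions aren't shown and would need firewall rules, namespaces, or cgroups on top of this.

    import os
    import pwd
    import resource
    import subprocess

    def drop_privileges():
        # Switch the child to a dedicated unprivileged account so ordinary
        # filesystem permissions bound what the agent can touch.
        pw = pwd.getpwnam("llm-agent")   # hypothetical dedicated user
        os.setgroups([])                 # drop supplementary groups
        os.setgid(pw.pw_gid)
        os.setuid(pw.pw_uid)
        # Crude blast-radius limits; not a substitute for real sandboxing.
        resource.setrlimit(resource.RLIMIT_NOFILE, (256, 256))
        resource.setrlimit(resource.RLIMIT_NPROC, (64, 64))

    subprocess.run(
        ["python3", "agent.py"],          # placeholder agent command
        preexec_fn=drop_privileges,       # runs in the child just before exec
        env={"PATH": "/usr/bin"},         # minimal environment, no inherited secrets
        cwd="/",                          # start outside any sensitive directory
        check=True,
    )

This only limits what the agent process itself can do; as the rest of the comment notes, it does nothing about secret data that ends up in the model's output and is then copied somewhere public by the human.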

boothby 5 hours ago

> AFAIK nobody has figured out how to create such an equivalent.

I'm curious whether anybody has even attempted it, or whether there's even training data for this. Compartmentalization is a natural aspect of cognition in social creatures. I've even known dogs not to demonstrate knowledge of a food supply until they think they're not being observed. As a working professional with children, I need to compartmentalize: my social life, sensitive IP knowledge, my kids' private information, knowledge my kids aren't developmentally ready for, my internal thoughts, information I've gained from disreputable sources, and more. Intelligence may be important, but this is wisdom -- something that doesn't seem to be a first-class consideration when dogs and toddlers are still in the lead.