| ▲ | sigmoid10 3 hours ago | |
I think "prompt injection prevention" systems fall into the same category as "llm writing detection" systems. I.e. reality is always a step ahead and you shouldn't trust either one for anything remotely important. | ||
| ▲ | kirtivr an hour ago | parent [-] | |
Yeah, the problem reduces to trying to restrict a motivated model which is trying to exfiltrate data. That's a problem we are just now wrapping our minds around. It's not as simple as prompt sanitization. The model is the interpreter, and we don't yet have the right tools to guide it. | ||