verdverm 4 hours ago
https://simonwillison.net/2025/Nov/2/new-prompt-injection-pa...

Meta wrote a post that went through the various scenarios and called it the "Rule of Two"

---

At a high level, the Agents Rule of Two states that until robustness research allows us to reliably detect and refuse prompt injection, agents must satisfy no more than two of the following three properties within a session to avoid the highest impact consequences of prompt injection.

[A] An agent can process untrustworthy inputs
[B] An agent can have access to sensitive systems or private data
[C] An agent can change state or communicate externally

It's still possible that all three properties are necessary to carry out a request. If an agent requires all three without starting a new session (i.e., with a fresh context window), then the agent should not be permitted to operate autonomously and at a minimum requires supervision, via human-in-the-loop approval or another reliable means of validation.

---
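The Rule of Two quoted above is easy to misread as a static property of an agent, when the text actually describes something that accumulates within a session: each tool call can confer one or more of [A], [B], [C], and the call that would make the session hold all three is the one that needs supervision. The following is an added example, not code from Meta's post or from anyone in this thread, of a per-session gate that implements that reading. The tool names, the tool-to-property mapping and the `approve` hook are assumptions made for the illustration.

```python
"""Illustrative per-session gate for the 'Agents Rule of Two'.

Added example for this thread; not code from Meta, Simon Willison or
Tim Kellogg. TOOL_PROPS and the approve() hook are assumptions.
"""
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional, Set, Tuple


class Prop(Enum):
    UNTRUSTED_INPUT = "A"   # [A] processes untrustworthy inputs
    SENSITIVE_ACCESS = "B"  # [B] touches sensitive systems or private data
    EXTERNAL_EFFECT = "C"   # [C] changes state or communicates externally


# Assumed mapping from tool name to the properties that invoking it confers
# on the session. In a real agent this would come from the tool registry.
TOOL_PROPS = {
    "fetch_url": {Prop.UNTRUSTED_INPUT},
    "read_inbox": {Prop.UNTRUSTED_INPUT, Prop.SENSITIVE_ACCESS},
    "read_secrets": {Prop.SENSITIVE_ACCESS},
    "send_email": {Prop.EXTERNAL_EFFECT},
    "write_file": {Prop.EXTERNAL_EFFECT},
}


class Decision(Enum):
    ALLOW = "allow"                    # proceeds autonomously
    NEEDS_APPROVAL = "needs_approval"  # would trip the rule; escalate
    APPROVED = "approved"              # proceeded after human approval
    DENY = "deny"                      # unknown tool, or approval refused


@dataclass
class Session:
    """Tracks which of the three properties this session has acquired.

    A fresh Session models the 'new session / fresh context window' case
    in the quoted text: nothing has been acquired yet.
    """
    acquired: Set[Prop] = field(default_factory=set)


def evaluate(session: Session, tool: str) -> Tuple[Decision, Set[Prop]]:
    """Decide whether `tool` may run, and compute the property set afterwards.

    The session may hold at most two of the three properties while operating
    autonomously; a call that would make it hold all three needs supervision.
    """
    if tool not in TOOL_PROPS:
        return Decision.DENY, set(session.acquired)
    after = session.acquired | TOOL_PROPS[tool]
    if after == set(Prop):
        return Decision.NEEDS_APPROVAL, after
    return Decision.ALLOW, after


def invoke(session: Session, tool: str,
           approve: Optional[Callable[[str, Set[Prop]], bool]] = None) -> Decision:
    """Gate a tool call and update session state if the call proceeds.

    `approve` is the assumed human-in-the-loop hook. It is consulted only
    when the call would trip the rule; with no approver the call is denied.
    """
    decision, after = evaluate(session, tool)
    if decision is Decision.DENY:
        return decision
    if decision is Decision.NEEDS_APPROVAL:
        if approve is None or not approve(tool, after):
            return Decision.DENY
        decision = Decision.APPROVED
    session.acquired = after
    return decision


def demo() -> list:
    """Walk one session through the rule; returns the decisions in order."""
    s = Session()
    out = [invoke(s, "read_inbox")]                  # A+B: allowed
    out.append(invoke(s, "send_email"))              # would add C: denied, no approver
    out.append(invoke(s, "send_email", approve=lambda t, p: True))  # approved by human
    out.append(invoke(Session(), "send_email"))      # fresh session: C alone is fine
    return out


if __name__ == "__main__":
    for d in demo():
        print(d.value)
```

The gate is stateful: once a session has read untrusted input and touched private data, every externally visible action for the rest of that session goes through the approver, and the only way back to autonomous operation is a fresh `Session`. That mirrors the "fresh context window" clause in the quoted text, and it is also why ordering does not help: reading secrets first and then fetching a URL lands in the same state as the reverse.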
verdverm 3 hours ago | parent
Simon and Tim have a good thread about this on Bsky: https://bsky.app/profile/timkellogg.me/post/3m4ridhi3ps25

Tim also wrote about this topic: https://timkellogg.me/blog/2025/11/03/colors