▲ | tptacek 3 days ago | ||||||||||||||||
On the first two paragraphs: we agree. (I just think that's both more obvious and less fundamental to the model than current writing on this suggests). On the latter two paragraphs: my point is that there's nothing fundamental to the concept of an agent that requires you to mix untrusted content with sensitive tool calls. You can confine untrusted content to its own context window, and confine sensitive tool calls to "sandboxed" context windows; you can feed raw context from both to a third context window to summarize or synthesize; etc. | |||||||||||||||||
▲ | simonw 3 days ago | parent [-] | ||||||||||||||||
Right - that's more or less the idea behind https://simonwillison.net/2023/Apr/25/dual-llm-pattern/ and the DeepMind CaMeL paper: https://simonwillison.net/2025/Apr/11/camel/ The challenge is that you have to implement really good taint tracking (as seen in old school Perl) - you need to make sure that the output of a model that was exposed to untrusted data never gets fed into some other model that has access potentially harmful tool calls. I think that is possible to build, but I haven't seen any convincing implementation of the pattern yet. Hopefully soon! | |||||||||||||||||
|