▲ | samcat116 7 hours ago | |
> I can’t see a way around that except to have some kind of sandboxing or a concept of untrusted or tainted input rather than treating all tokens as the same. Maybe a way of detecting if the response of a tool is within a threshold of acceptability within the definition of the MCP (which is easier with structured output), which is used to force a manual confirmation or straight up rejection if it’s deemed to be unusual or unsafe.

I think we are starting to see remote agent environments where each agent session gets its own sandbox to run things in. I bet that's where this is going.
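For what it's worth, the "threshold of acceptability" check from the quoted comment could be sketched roughly like this. It is only an illustration, not anything from the MCP spec or an SDK: `Verdict`, `WEATHER_SCHEMA`, `SUSPICIOUS`, and `check_tool_output` are all made-up names, and the phrase list is a crude stand-in for whatever "unusual or unsafe" detection you would actually use.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"    # pass the output to the model as-is
    CONFIRM = "confirm"  # pause and ask a human first
    REJECT = "reject"    # drop the output entirely


# Hypothetical declared output shape for one tool: field name -> expected type.
WEATHER_SCHEMA = {"city": str, "temp_c": float}

# Hypothetical phrases that look like instructions rather than data.
SUSPICIOUS = ("ignore previous", "system prompt", "run the following")


def check_tool_output(output, schema, max_str_len=200):
    """Classify a structured tool result as accept / confirm / reject."""
    # Wrong shape (not a dict, missing or extra fields) is rejected outright.
    if not isinstance(output, dict) or set(output) != set(schema):
        return Verdict.REJECT
    # Every field must have the declared type.
    for key, expected in schema.items():
        if not isinstance(output[key], expected):
            return Verdict.REJECT
    # Well-formed but unusual strings are escalated to a human.
    for value in output.values():
        if isinstance(value, str):
            lowered = value.lower()
            if len(value) > max_str_len or any(p in lowered for p in SUSPICIOUS):
                return Verdict.CONFIRM
    return Verdict.ACCEPT
```

The point of the three-way result is that structured output makes the "reject" case cheap to decide, while the fuzzier "is this string trying to steer the model" case gets pushed to a manual confirmation instead of being trusted. It complements a sandbox rather than replacing one.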