| ▲ | DanMcInerney 10 hours ago | |
Sandboxing is a great security step for agents, just like guardrails are. I can't help but feel like it's all soft defense though. The real danger comes from the agent being able to read third-party data, get prompt-injected, and then change or exfiltrate sensitive data. A sandbox does not prevent an email-reading agent from reading a malicious email, getting prompt-injected, and then sending the contents of your inbox to a malicious email address. It does help in implementing network-layer controls though, like applying a policy that says this Linux-based sandbox is only allowed to visit [whitelisted] URLs. This kind of architectural whitelisting is the only hard defense we have for agents at the moment. Unfortunately, it will also hamper their utility if used to the greatest extent possible. | ||
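To make the whitelisting idea concrete, here is a minimal sketch of the kind of check an egress policy might apply to each outbound URL. The host names and the `is_allowed` helper are hypothetical, made up for illustration; a real sandbox would enforce this at the network layer rather than in agent-side code.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; these hosts are placeholders, not a real policy.
ALLOWED_HOSTS = {"api.example.com", "mail.example.com"}


def is_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose exact host is allowlisted."""
    try:
        parts = urlsplit(url)
        host = parts.hostname or ""
    except ValueError:
        # Malformed URLs are rejected rather than guessed at.
        return False
    if parts.scheme not in ("http", "https"):
        return False
    # Exact match only: "api.example.com.evil.net" must not pass.
    return host.lower().rstrip(".") in allowed_hosts
```

Matching on the exact parsed host, rather than on a substring or prefix of the URL, is what stops lookalikes such as `api.example.com.evil.net` or `api.example.com@evil.net` from slipping through. It also shows the utility tradeoff: anything not on the list is simply unreachable.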
| ▲ | jingkai_he 10 hours ago | |
Creator here. Agreed, sandboxing by itself doesn't solve prompt injection. If the agent can read and send emails, no sandbox can tell a legit send from an exfiltration. matchlock does have the network-layer controls you mentioned, such as domain whitelisting and secret protection toward designated hosts, so a rogue agent can't just POST your API key to some random endpoint. The unsafe tool call/HTTP request problem probably needs to be solved at a different layer, possibly through the network interception layer of matchlock or through entirely different software. | ||
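For illustration only, one way "secret protection toward designated hosts" could work in general is placeholder substitution at an interception layer: the agent only ever sees a placeholder, and the real value is swapped in when the request is bound for a designated host. This is a sketch under that assumption, not matchlock's actual implementation or API; `SECRETS`, `inject_secrets`, the placeholder syntax, and the host names are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical mapping: placeholder -> (real secret, hosts allowed to receive it).
SECRETS = {
    "{{API_KEY}}": ("sk-real-value", {"api.example.com"}),
}


def inject_secrets(url: str, headers: dict[str, str]) -> dict[str, str]:
    """Swap placeholders for real secrets only when the destination host is designated."""
    host = (urlsplit(url).hostname or "").lower()
    out = {}
    for name, value in headers.items():
        for placeholder, (secret, hosts) in SECRETS.items():
            if placeholder in value and host in hosts:
                value = value.replace(placeholder, secret)
        # For any other host the placeholder is forwarded unchanged.
        out[name] = value
    return out
```

With this shape, a prompt-injected agent that POSTs its "key" to some random endpoint only leaks the string `{{API_KEY}}`. It does nothing about data the agent can legitimately read, which is the part that needs a different layer.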