Agreed, sandboxing is only part of agent security. Authorization (what data the agent can access and what tools it can execute) is also a big part of it.

I've found that this primer on agent sandboxes [0] is a great reference on sandboxing options and their trade-offs.

For agents there's a tension between level of restriction and utility. I think a large part of OpenClaw's popularity is that the lack of restriction by default has helped people see the potential utility of agents. But any agent that isn't just for trying things out requires deciding what it should and should not be able to do, and from there choosing the best combination of sandboxing and authorization.

At work, we've found it helpful to distinguish coding agents vs product agents. Coding agents have the ability to add new execution paths by pulling in external code or writing their own code to run. Product agents have a strictly defined set of tools and the runtime prevents them from executing anything beyond that definition. This distinction helps us reason about what sandboxing is required.
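To make the product-agent side concrete, here's a minimal Python sketch of a runtime that dispatches only tools registered ahead of time. The names (`ToolRegistry`, `ToolNotAllowed`, `lookup_order`) are hypothetical and not from any particular framework; it illustrates the idea and is not our actual runtime.

```python
from typing import Any, Callable, Dict


class ToolNotAllowed(Exception):
    """Raised when the agent requests a tool outside the fixed set."""


class ToolRegistry:
    """Hypothetical product-agent runtime: only registered tools can run."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Any:
        # The model only supplies a tool name and arguments. Anything not
        # registered ahead of time is rejected; nothing is ever eval'd.
        if name not in self._tools:
            raise ToolNotAllowed(name)
        return self._tools[name](**kwargs)


def lookup_order(order_id: str) -> dict:
    # Illustrative tool; a real one would query a backend service.
    return {"order_id": order_id, "status": "shipped"}


registry = ToolRegistry()
registry.register("lookup_order", lookup_order)
```

A coding agent breaks this model because it can write and run new code, so the set of reachable execution paths is no longer the set of registered names.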

For data permissions it's trickier. MCP uses OAuth for authorization, but each server can have different expectations for access to the external service. Some servers let you use a service account where you can narrow the scope of access, but others assume a token minted from an admin account, which means the MCP server might have access to things beyond what the agent using the server should.

So for that, we have an MCP proxy that lets us define custom permissions for every tool and resource, and at runtime it makes permission checks to ensure the agent only gets access to the subset of things we define ahead of time. (We're using SpiceDB to implement the authorization logic and checks.) This works well for product agents because they can't add new execution paths. For coding agents, we've tinkered with plugins/skills to try to do the same, but ultimately they can build their way around authorization layers that aren't part of the runtime system, so it's something we're still trying to figure out.
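Roughly, the proxy's check looks like the Python sketch below. It's a simplified illustration under stated assumptions: the grants are an in-memory set standing in for the authorization service (this is not SpiceDB's API), and `McpProxy`, `PermissionDenied`, and the ticket tools are hypothetical names.

```python
from typing import Any, Callable, Dict, Set, Tuple

# (agent, permission, resource) tuples defined ahead of time.
# In-memory stand-in for an authorization service; NOT SpiceDB's API.
Grant = Tuple[str, str, str]


class PermissionDenied(Exception):
    """Raised when an agent lacks a grant for the requested tool."""


class McpProxy:
    """Hypothetical proxy between an agent and upstream MCP tools."""

    def __init__(
        self, grants: Set[Grant], upstream: Dict[str, Callable[..., Any]]
    ) -> None:
        self._grants = grants
        self._upstream = upstream

    def check(self, agent: str, permission: str, resource: str) -> bool:
        return (agent, permission, resource) in self._grants

    def call_tool(self, agent: str, tool: str, **kwargs: Any) -> Any:
        # The check runs on every call, before the upstream server (which
        # may hold a broader token) is ever contacted.
        if not self.check(agent, "call", f"tool:{tool}"):
            raise PermissionDenied(f"{agent} may not call {tool}")
        return self._upstream[tool](**kwargs)


# Upstream exposes two tools, but the agent is granted only one of them.
upstream = {
    "read_ticket": lambda ticket_id: {"id": ticket_id, "title": "Login bug"},
    "delete_ticket": lambda ticket_id: {"id": ticket_id, "deleted": True},
}
grants = {("support-agent", "call", "tool:read_ticket")}
proxy = McpProxy(grants, upstream)
```

The point is that the upstream token can be over-privileged while the agent still sees only the subset in `grants`, as long as it has no path to the upstream server that bypasses the proxy.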

---

[0] https://www.luiscardoso.dev/blog/sandboxes-for-ai

LelouBil 3 hours ago

Sandboxing is great, and stricter authorization policies are great too, but with this kind of software, my biggest fear (and the reason I am not trying it out now) is prompt injection.

It just seems unsolvable if you want the agent to do anything remotely useful.

	samkim 44 minutes ago
		Ultimately, a prompt injection attack is trying to get the agent to do something it wasn't intended to do. If you have the appropriate sandboxing and authorization in place, a compromised agent won't be able to actually execute the exploits.