Remix.run Logo
senko 3 hours ago

File system access is not one of OpenClaw's biggest security issues. If that were so, running it in a VM or another computer (I hear Mac Minis are popular!) would solve it.

If you need it to do anything useful[0], you have to connect it to your data and give it action capabilities. All the dragons are there.

If you play it careful and don't expose your data, comm channels, etc., then it's much like the other AI assistants out there.[1]

---

[0] for your definition of useful

[1] I do appreciate the self-modification and heartbeat aspects, and don't want to downplay how technically impressive it is. The comment is purely from POV of an end-user product.

Someone an hour ago | parent | next [-]

I think the only sane way, if there is one, is to sandbox your LLM behind a fixed set of MCP servers that severely limit what it can do.

Reading your mail, WhatsApp and bank transactions? May be OK if your LLM runs locally, but even then, if it has any way to send data to the outside world without you checking it, maybe not even. You don’t want your LLM to send your private mail (including photos) or bank statements to somebody who uses prompt injection to get that data.

Thinking of prompt injection: we need LLMs with a Harvard architecture (https://en.wikipedia.org/wiki/Harvard_architecture), so that there is no way for LLM data inputs to be treated as instructions.

samkim 2 hours ago | parent | prev | next [-]

Agreed, sandboxing is only part of agent security. Authorization (what data the agent can access and what tools it can execute) is also a big part of it.

I've found primer on agent sandboxes [0] is a great reference on sandboxing options and the trade-offs

For agents there's a tension between level of restriction and utility. I think a large part of OpenClaw's popularity is that the lack of restriction by default has helped people see the potential utility of agents. But any agent that isn't just for trying things out requires consideration of what it should and should not be able to do and from there the decision around the best combination of sandboxing and authorization.

At work, we've found it helpful to distinguish coding agents vs product agents. Coding agents have the ability to add new execution paths by pulling in external code or writing their own code to run. Product agents have a strictly defined set of tools and the runtime prevents them from executing anything beyond that definition. This distinction helps us reason about what sandboxing is required.

For data permissions it's trickier. MCP uses OAuth for authentication but each server can have different expectations for access to the external service. Some servers let you use a service account where you can narrow the scope of access but others assume a token minted from an admin account which means the MCP server might have access to things beyond what the agent using the server should.

So for that, we have an MCP proxy that lets us define custom permissions for every tool and resource, and at runtime makes permission checks to ensure the agent only gets access to the subset of things we define ahead of time. (We're using SpiceDB to implement the authorization logic and checks) This works well for product agents because they can't add new execution paths. For coding agents, we've tinkered with plugins/skills to try to do the same but ultimately they can build their way around authorization layers that aren't part of the runtime system so it's something we're still trying to figure out.

---

[0] https://www.luiscardoso.dev/blog/sandboxes-for-ai

LelouBil 2 hours ago | parent [-]

Sandboxing is great, and stricter Authorization policies are great too, but with these kinds of software, my biggest fear (and that's why I am not trying them out now) is prompt injection.

It just seems unsolvable if you want the agent to do anything remotely useful

stevepike 2 hours ago | parent | prev [-]

Reminds me of https://xkcd.com/1200/