alphazard 4 days ago

Does this sandbox the agents? All I want is a way to keep the agents from writing to and reading from arbitrary places on the filesystem. I want that enforced using operating system primitives rather than a pinky promise with an LLM.

It already worries me that the Cursor agents occasionally try to perform operations with full absolute paths, which they wouldn't be able to know if they were properly sandboxed to the current directory.

diegof79 4 days ago | parent | next [-]

Perhaps this helps: https://container-use.com/introduction

resonious 4 days ago | parent | prev | next [-]

I think OpenAI's Codex does this. Not sure to what degree, but sandboxing seems to be a priority for that project. Possibly to their detriment, since the last time I tried it, it was not nearly as good as Claude Code.

dgunay 4 days ago | parent [-]

Codex-cli does use macOS sandboxing by default. Unfortunately, it causes issues for my workflow because the agent is very restricted in what it is allowed to do (for example, it can't read or write the Go build cache), and its command whitelisting configurability is currently nonexistent. I'm looking into using containers to allow the agent more autonomy within its environment.

muratsu 4 days ago | parent | prev | next [-]

I’ve recently built something that helps you run cc in cloud sandboxes. Maybe that would be helpful: https://www.devfleet.ai

johnfn 4 days ago | parent | prev | next [-]

You can solve this yourself with a little elbow grease with Docker + a devcontainer. I did this and I’m very happy with the results - Claude can do anything it wants, but it can’t push to prod.
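For anyone who wants to try the plain Docker route, here is a minimal sketch in Python of the kind of `docker run` invocation this implies. It is not the setup described above, just an illustration: the image name `my-agent-image` and the `/workspace` mount point are placeholders, and a devcontainer would normally express the same thing in its own config file. The idea is that only the project directory is bind-mounted, so host credentials and the rest of the filesystem are simply not visible inside the container.

```python
import os
import shlex


def docker_sandbox_argv(project_dir, image, agent_cmd):
    """Build a `docker run` command that exposes only project_dir to the agent."""
    project_dir = os.path.abspath(project_dir)
    return [
        "docker", "run", "--rm", "-it",
        # Bind-mount only the project; nothing else from the host is mounted.
        "-v", f"{project_dir}:/workspace",
        # Start the agent inside the mounted project directory.
        "-w", "/workspace",
        # Drop all Linux capabilities; the agent should not need any.
        "--cap-drop", "ALL",
        image,  # placeholder image name, assumed to have the agent installed
        *agent_cmd,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(docker_sandbox_argv(".", "my-agent-image", ["claude"])))
```

Anything the agent needs beyond the project (package caches, API keys) has to be mounted or passed in explicitly, which is the point: the default is no access.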

throwaway-0001 4 days ago | parent [-]

Every dev container needs to log in again, it can’t use the browser MCP, and CPU usage is high. Still a few issues.

anoek 4 days ago | parent | prev | next [-]

I wrote https://github.com/anoek/sandbox for that exact purpose. It uses overlayfs to protect your system from LLMs making unwanted changes, and it optionally masks out places you don't want it to be able to read from.

__MatrixMan__ 4 days ago | parent | prev | next [-]

Why not just run the assistant as a user with limited permissions? Your OS likely supplies all the handcuffs you're going to need.
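A rough sketch of what that could look like, assuming a dedicated account (called `agent` here, a made-up name) already exists and only has file permissions on the project directory. It uses `sudo -u` to switch users; whether that fits depends on how sudo is configured on your machine.

```python
import subprocess


def limited_user_argv(user, agent_cmd):
    """Wrap agent_cmd so that sudo runs it as the given unprivileged user."""
    # `--` stops sudo from parsing the agent's own flags as sudo options.
    return ["sudo", "-u", user, "--", *agent_cmd]


def run_as_limited_user(user, workdir, agent_cmd):
    """Run the agent as `user`, starting in workdir. Returns the exit code."""
    # cwd sets the starting directory; the kernel's ordinary file permissions
    # for `user` decide what the agent can actually read and write.
    result = subprocess.run(limited_user_argv(user, agent_cmd), cwd=workdir)
    return result.returncode


if __name__ == "__main__":
    # "agent" is a hypothetical account name; print rather than execute.
    print(limited_user_argv("agent", ["claude"]))
```

One caveat: a separate user can still read anything that is world-readable, so this limits writes more reliably than reads unless permissions elsewhere are tightened too.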

gorbypark 4 days ago | parent | prev | next [-]

You could try sandbox-exec. It’s kind of deprecated, but I think it was more or less designed for this exact use case. It’s too bad Apple doesn’t really support it anymore (although it still works in my limited testing!)
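As a hedged illustration, this Python sketch builds a small sandbox profile and the matching `sandbox-exec -p` command line. The profile is an assumption on my part, not something taken from Apple documentation or from Codex: it allows everything by default, then denies file writes except under one directory, relying on later rules taking precedence. It does not restrict reads, and a real profile would likely need extra allow rules (temp directories, device files) before an agent runs comfortably.

```python
import os


def write_only_inside_profile(writable_dir):
    """Return a sandbox profile that denies writes outside writable_dir."""
    root = os.path.realpath(writable_dir)
    # Escape backslashes and quotes so the path is a valid string literal.
    escaped = root.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "(version 1)",
        "(allow default)",
        # Deny all file writes, then re-allow them under the project root.
        "(deny file-write*)",
        f'(allow file-write* (subpath "{escaped}"))',
    ])


def sandbox_exec_argv(writable_dir, agent_cmd):
    """Build the argv that runs agent_cmd under the profile above (macOS only)."""
    return ["sandbox-exec", "-p", write_only_inside_profile(writable_dir), *agent_cmd]


if __name__ == "__main__":
    # Print the profile for inspection instead of launching anything.
    print(write_only_inside_profile("."))
```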

xixixao 4 days ago | parent | prev | next [-]

Too bad OSs have such lock-in. Having a macOS with great per-folder sandboxing + an OS capability to avoid the docker hellscape would be awesome. Probably not gonna happen until we can oneshot an OS rewrite :)

MrDarcy 4 days ago | parent [-]

Sarcasm? macOS and iOS have all this today.

nuker 4 days ago | parent [-]

Apple Containers!! Very new! https://youtube.com/watch?v=CYd9MC3L11o

steveklabnik 4 days ago | parent | prev [-]

If you use VS Code, Anthropic provides an example DevContainer.