Remix.run Logo
xyzzy_plugh 12 hours ago

This is just a wrapper around sandbox-exec. It's nice that there are a ton of presets that have been thought out, since 90% of wielding sandbox-exec is correctly scoping it to whatever the inner environment requires (the other 90% is figuring out how sandbox-exec works).

I like that it's just a shell script.

I do wish that there was a simple way to sandbox programs with an overlay or copy-on-write semantics (or better yet bind mounts). I don't care if, in the process of doing some work, an LLM agent modifies .bashrc -- I only care if it modifies _my_ .bashrc

kstenerud 3 hours ago | parent | next [-]

I took a more paranoid approach to sandboxing agents. They can do whatever they want inside their container, and then I choose which of their changes to apply outside as commits:

    ┌─ YOLO shell ──────────────────────┬─ Outer shell ─────────────────────┐
    │                                   │                                   │
    │ yoloai new myproject . -a         │                                   │
    │                                   │                                   │
    │ # Tell the agent what to do,      │                                   │
    │ # have it commit when done.       │                                   │
    │                                   │ yoloai diff myproject             │
    │                                   │ yoloai apply myproject            │
    │                                   │ # Review and accept the commits.  │
    │                                   │                                   │
    │ # ... next task, next commit ...  │                                   │
    │                                   │ yoloai apply myproject            │
    │                                   │                                   │
    │                                   │ # When you have a good set of     │
    │                                   │ # commits, push:                  │
    │                                   │ git push                          │
    │                                   │                                   │
    │                                   │ # Done? Tear it down:             │
    │                                   │ yoloai destroy myproject          │
    └───────────────────────────────────┴───────────────────────────────────┘
Works with Docker, Seatbelt, and Tart backends (I've even had it build an iOS app inside a seatbelt container).

https://github.com/kstenerud/yoloai

e1g 11 hours ago | parent | prev | next [-]

Thanks, I picked Bash because I’m scared of all Go and Rust binaries out there!

Re “overlay FS” - I too wish this was possible on Macs, but the closest I got was restricting agents to be read-only outside of CWD which, after a few turns, bullies them into working in $TMP. Not the same though.

dbmikus 10 hours ago | parent | prev | next [-]

I've been working on an OSS project, Amika[1], to quickly spin up local or remote sandboxes for coding workloads. We support copy-on-write semantics locally (well, "copy-and-then-write" for now... we just copy directories to a temp file-tree).

It's tailored to play nicely with Git: spin up sandboxes form CLI, expose TCP/UDP ports of apps to check your work, and if running hosted sandboxes, share the sandbox URLs with teammates. I basically want running sandboxed agents to be as easy as `git clone ...`.

Docs are early and edges are rough. This week I'm starting to dogfood all my dev using Amika. Feedback is super appreciated!

FYI: we are also a startup, but local sandbox mgmt will stay OSS.

[1]: https://github.com/gofixpoint/amika

xyzzy_plugh 10 hours ago | parent [-]

This is just a thin wrapper over Docker. It still doesn't offer what I want. I can't run macOS apps, and if I'm doing any sort of compilation, now I need a cross-compile toolchain (and need to target two platforms??).

Just use Docker, or a VM.

The other issue is that this does not facilitate unpredictable file access -- I have to mount everything up front. Sometimes you don't know what you need. And even then copying in and out is very different from a true overlay.

dbmikus 8 hours ago | parent [-]

Appreciate the deets!

It sounds like a big part of your use case is to safely give an agent control of your computer? Like, for things besides codegen?

We're probably not going to directly support that type of use case, since we're focused on code-gen agents and migrating their work between localhost and the cloud.

We are going to add dynamic filesystem mounting, for after sandbox creation. Haven't figured out the exact implementation yet. Might be a FUSE layer we build ourselves. Mutagen is pretty interesting as well here.

divmain 11 hours ago | parent | prev | next [-]

This is what I was going for with Treebeard[0]. It is sandbox-exec, worktrees, and COW/overlay filesystem. The overlay filesystem is nice, in that you have access to git-ignored files in the original directory without having to worry about those files being modified in the original (due to the COW semantics). Though, truthfully, I haven’t found myself using it much since getting it all working.

[0] https://github.com/divmain/treebeard

xyzzy_plugh 10 hours ago | parent [-]

This approach is too complex for what is provided. You're better off just making a copy of the tree and simply using sandbox-exec. macFUSE is a shitshow.

The main issue I want to solve is unexpected writes to arbitrary paths should be allowed but ultimately discarded. macOS simply doesn't offer a way to namespace the filesystem in that way.

divmain 5 hours ago | parent [-]

Completely agree; my approach was not the most practical. I mostly wanted to know how hard it would be and, as I said, haven’t used it much since. Yes, macFUSE is messy to rely upon. I feel as though the right abstraction is simply unavailable on macOS. Something akin to chroot jails — I don’t feel like I need a particularly hardened sandbox for agentic coding. I just need something that will prevent the stupid mistakes that are particularly damaging.

tuananh 7 hours ago | parent | prev [-]

isn't sandbox-exec already deprecated?

e1g 7 hours ago | parent [-]

Yes, for about a decade. But it’s available everywhere, and still works - and protects us - like brand new!

rvz 5 hours ago | parent [-]

It's quite naive to assume that. There is a reason why it is deprecated by Apple.

Apple is likely preparing to remove it for a secure alternative and all it takes is someone to find a single or a bunch of multiple vulnerabilities in sandbox-exec to give a wake up call to everyone why were they using it in the first place.

I predict that there is a CVE lurking in sandbox-exec waiting to be discovered.

TheTon 3 hours ago | parent | next [-]

On the other hand, the underlying functionality for sandboxing is used heavily throughout the OS, both for App Sandboxes and for Apple’s own system processes. My guess is sandbox-exec is deprecated more because it never was adequately documented rather than because it’s flawed in some way.

rvz 2 hours ago | parent [-]

> the underlying functionality for sandboxing is used heavily throughout the OS, both for App Sandboxes and for Apple’s own system processes.

The security researchers will leverage every part of the OS stack to bypass the sandbox in XNU which they have done multiple times.

Now, there is a good reason for them to break the sandbox thanks to the hype of 'agents'. It could even take a single file to break it. [0]

> My guess is sandbox-exec is deprecated more because it never was adequately documented rather than because it’s flawed in some way.

You do not know that. I am saying that it has been bypassed before and having it being used all over the OS doesn't mean anything. It actually makes it worse.

[0] https://the-sequence.com/crashone-cve-2025-24277-macos-sandb...

JimDabell 3 hours ago | parent | prev [-]

As I understand it, Chrome, Claude Code, and OpenAI Codex all use sandbox-exec. I’m not sure Apple could remove it even if they were sufficiently motivated to.

rvz 2 hours ago | parent [-]

> As I understand it, Chrome, Claude Code, and OpenAI Codex all use sandbox-exec.

Apple can still decide to change it for any reason, regardless of who uses it, since it is undocumented for their use anyway.

> I’m not sure Apple could remove it even if they were sufficiently motivated to.

It can take multiple security issues for them to remove it.