kjok 10 hours ago

Genuine question: why is everyone rolling out their own sandbox wrappers around VMs/Docker for agents?

borenstein 10 hours ago | parent | next [-]

I know, right? The day I initially thought about posting this, there was another one called `yolo-box`. (That attempt--my very first post--got me instantly shadow-banned due to being on a VPN, which led to an unexpected conversation with @dang, which led to some improvements, which led to it being a week later.)

I think it's the convergence of two things. First, the agents themselves make it easier to get exactly what you want; and second, the OEM solutions to these things really, really aren't good enough. CC Cloud and Codex are sort of like this, except they're opaque and locked down, and they work for you or they don't.

It reminds me a fair bit of 3D printer modding, but with higher stakes.

catlifeonmars 10 hours ago | parent | prev | next [-]

Because of findings like this

https://www.anthropic.com/research/small-samples-poison

(The headline, to save you a click: a small number of samples can poison LLMs of any size.)

The way I think of it is, coding agents are power tools. They can be incredibly useful, but can also wreak a lot of havoc. Anthropic (et al) is marketing them to beginners and inevitably someone is going to lose their fingers.

kjok 9 hours ago | parent [-]

I understand the need, but I don't understand why a VM or Docker is not enough. Why are people creating custom wrappers around VMs/containers?

borenstein 8 hours ago | parent [-]

Docker isn't virtualization; it's not that hard to infiltrate the underlying system if you really want to. But as for VMs--they are enough! They're also a lot of boilerplate to set up, manage, and interact with. yolo-cage is that boilerplate.

derpsteb 10 hours ago | parent | prev | next [-]

My experience is that neither has a good UX for what I usually try to do with coding agents. The main problem I see is setup/teardown of the boxes and managing tools inside them.

m-hodges 10 hours ago | parent | prev | next [-]

It all feels like temporary workflow fixes until The Agent Companies just ship their own opinionated, good-enough way to do it.

odie5533 6 hours ago | parent | next [-]

They've already suggested using Dev Containers. https://code.claude.com/docs/en/devcontainer

borenstein 9 hours ago | parent | prev [-]

It probably is. Some of this stuff will hang around because power users want control. Some of it will evolve into more sophisticated solutions that get turned into products and become easier to acquihire than to build in-house. A lot of it will become obsolete when the OEMs crib the concept. But IMO all of those are acceptable outcomes if what you really want is the thing itself.

dist-epoch 8 hours ago | parent | prev [-]

Because people want to run agents in yolo mode without worrying that it's going to delete the whole computer.

And once you put the agent in a VM/container it's much easier to run 10 of them in parallel without mutual interference.
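
A minimal sketch of that idea, assuming Docker is the runtime: give each agent its own container name and its own mounted workspace so the runs cannot touch each other's files. The image name, the `run-agent` command, and the `/tmp/agents` layout are all hypothetical placeholders, not anything from a real tool.

```python
from pathlib import PurePosixPath


def agent_run_command(index, base_dir, image, agent_cmd):
    """Build a `docker run` argv for one agent (hypothetical layout)."""
    # Each agent gets a separate host directory, mounted at /work.
    workspace = PurePosixPath(base_dir) / f"agent-{index}"
    return [
        "docker", "run", "--rm",       # remove the container when it exits
        "--name", f"agent-{index}",    # unique name per agent
        "-v", f"{workspace}:/work",    # per-agent workspace mount
        "-w", "/work",                 # start the agent inside its workspace
        image, *agent_cmd,
    ]


def parallel_agent_commands(count, base_dir, image, agent_cmd):
    """Return one independent `docker run` argv per agent."""
    return [agent_run_command(i, base_dir, image, agent_cmd) for i in range(count)]


# Placeholder image and command; nothing is launched here.
commands = parallel_agent_commands(10, "/tmp/agents", "my-agent-image", ["run-agent"])
```

Actually launching them would be something like one `subprocess.Popen(cmd)` per entry. Note this only separates the agents from each other; how well it protects the host depends on the mounts and flags you choose.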

borenstein 8 hours ago | parent [-]

On that note, yolo-cage is pretty heavyweight. There are much lighter tools if your main concern is "don't nuke my laptop." yolo-box was trending on HN last week: https://news.ycombinator.com/item?id=46592344