Some of these don’t really seem like they bypassed any kind of sandbox, like hallucinating an npm package. You acknowledge that the install will fail if someone tries to reinstall from the lock file. Are you not doing that in CI? Same with curl: you’ve explained how the agent saw a hallucinated error code, but not how a network request would have bypassed the sandbox. These just sound like examples of friction introduced by the sandbox.

themafia 6 hours ago

> These just sound like examples of friction introduced by the sandbox.

The whole idea of putting "agentic" LLMs inside a sandbox sounds like rubbing two pieces of sandpaper together in the hopes a house will magically build itself.

embedding-shape 5 hours ago

> The whole idea of putting "agentic" LLMs inside a sandbox

What is the alternative? Given that you're running a language model and have it connected to editing capabilities, I'd very much like it to be disconnected from the rest of my system. Seems like a no-brainer.

AdieuToLogic 3 hours ago

>> The whole idea of putting "agentic" LLMs inside a sandbox sounds like rubbing two pieces of sandpaper together in the hopes a house will magically build itself.

> What is the alternative?

Don't expect to get a house from rubbing two pieces of sandpaper together?

jazzyjackson 5 hours ago

Trouble is it occasionally works

themafia 3 hours ago

Lots of dumb things occasionally work. The question the market strives to answer is "is it actually competitive?"

formerly_proven 5 hours ago

That’s some good house-building sandpaper then.