The real security issue is around the use of ‘YOLO mode’ where you just let the agent invoke tools in a completely unattended manner. It’s not much different than slapping sudo in front of every shell command or running as root.

People are going to keep doing that because these agentic tasks can take a while to run, and having to check in and approve commands that often becomes an annoyance.

I can’t see a way around that except to have some kind of sandboxing or a concept of untrusted or tainted input rather than treating all tokens as the same. Maybe a way of detecting if the response of a tool is within a threshold of acceptability within the definition of the MCP (which is easier with structured output), which is used to force a manual confirmation or straight up rejection if it’s deemed to be unusual or unsafe.
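To make that last idea a bit more concrete, here is a rough sketch of what such a gate could look like. It is not part of MCP or any real client: `FieldSpec`, `Verdict`, `check_tool_output` and the `SUSPICIOUS` phrase list are all made-up names, and a substring check like this is only a placeholder for whatever "unusual" detection you would actually use.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ALLOW = "allow"      # pass the output to the agent unattended
    CONFIRM = "confirm"  # pause and ask a human
    REJECT = "reject"    # drop the output entirely


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical per-field expectation declared alongside a tool."""
    type_: type
    max_len: Optional[int] = None  # only checked for strings


# Illustrative markers only; a real filter would need far more than this.
SUSPICIOUS = ("ignore previous instructions", "sudo ", "curl ")


def check_tool_output(output: dict, spec: dict) -> Verdict:
    """Treat tool output as tainted and compare it against the declared spec."""
    # Missing or unexpected fields: the response is outside the definition.
    if set(output) != set(spec):
        return Verdict.REJECT

    verdict = Verdict.ALLOW
    for name, field in spec.items():
        value = output[name]
        # Wrong type is a structural violation, so reject outright.
        if not isinstance(value, field.type_):
            return Verdict.REJECT
        if isinstance(value, str):
            # Well-formed but unusual content escalates to a human.
            if field.max_len is not None and len(value) > field.max_len:
                verdict = Verdict.CONFIRM
            if any(marker in value.lower() for marker in SUSPICIOUS):
                verdict = Verdict.CONFIRM
    return verdict


spec = {"title": FieldSpec(str, max_len=80), "stars": FieldSpec(int)}
print(check_tool_output({"title": "Fix bug", "stars": 3}, spec))
```

The point is only the three-way split: structural violations get rejected, well-formed but odd-looking output forces a confirmation, and everything else runs unattended.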

samcat116 an hour ago

> I can’t see a way around that except to have some kind of sandboxing or a concept of untrusted or tainted input rather than treating all tokens as the same. Maybe a way of detecting if the response of a tool is within a threshold of acceptability within the definition of the MCP (which is easier with structured output), which is used to force a manual confirmation or straight up rejection if it’s deemed to be unusual or unsafe.

I think we are starting to see remote agent environments where each agent session gets its own sandbox to run things in. I bet that's where this is going.

alvis 8 hours ago

It's indeed an issue. I love that Codex contains everything in a sandbox and lets me review what has changed. That's the proper approach, and I have a much better idea of what's going on.

That said, I ditched Codex for Claude Code... Sorry, OpenAI. No MCP support and no way to interact during execution are huge drawbacks.

	wunderwuzzi23 7 hours ago
		ChatGPT Codex has had internet access for a few weeks now. It's highly configurable in terms of where it can connect to.

anuramat 7 hours ago

Anthropic provides a custom devcontainer for sandboxing, but I have fallen in love with bubblewrap: it's a single command, and I get to keep all the infrastructure, e.g. it can do Nix flakes without duplicating every derivation.
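For anyone curious what that single command might look like, here is a minimal sketch that just builds the `bwrap` argument list. The helper name `bwrap_argv` and the example path are made up, and the exact set of binds is an assumption about a typical setup rather than a recommended configuration. The host filesystem (including `/nix`) is mounted read-only and only the project directory is writable.

```python
import shlex


def bwrap_argv(project_dir: str, command: list) -> list:
    """Build a bubblewrap invocation that runs `command` inside a sandbox."""
    return [
        "bwrap",
        "--ro-bind", "/", "/",               # host filesystem visible, read-only
        "--dev", "/dev",                     # fresh minimal /dev
        "--proc", "/proc",                   # fresh /proc for the new PID namespace
        "--tmpfs", "/tmp",                   # private scratch space
        "--bind", project_dir, project_dir,  # the only writable host path
        "--unshare-all",                     # new user, pid, ipc, uts, net... namespaces
        "--share-net",                       # but keep network access for the agent
        "--die-with-parent",                 # kill the sandbox if the launcher exits
        "--chdir", project_dir,
        *command,
    ]


argv = bwrap_argv("/home/me/project", ["bash"])
print(shlex.join(argv))  # paste the printed line into a shell, or pass argv to subprocess.run
```

Because `/` is bound read-only, existing toolchains stay usable inside the sandbox without copying them in, which is the "keep all the infrastructure" part.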