My codex just uses python to write files around the sandbox when I ask it to patch a sdk outside its path.

valleyer 2 hours ago | parent | next [-]

Is it asking you permission to run that python command? If so, then that's expected: commands that you approve get to run without the sandbox.

The point is that Codex can (by default) run commands on its own, without approval (e.g., running `make` on the project it's working on), but they're subject to the imposed OS sandbox.

This is controlled by the `--sandbox` and `--ask-for-approval` arguments to `codex`.

▲

Sharlin 4 hours ago | parent | prev [-]

It's definitely not a sandbox if you can just "use python to write files" outside of it o_O

▲

chongli 2 hours ago | parent [-]

Hence the article’s security theatre remark.

I’m not sure why everyone seems to have forgotten about Unix permissions, proper sandboxing, jails, VMs etc when building agents.

Even just running the agent as a different user with minimal permissions and jailed into its home directory would be simple and easy enough.

▲

embedding-shape 2 hours ago | parent | next [-]

I'm just guessing, but seems the people who write these agent CLIs haven't found a good heuristic for allowing/disallowing/asking the user about permissions for commands, so instead of trying to sit down and actually figure it out, someone had the bright idea to let the LLM also manage that allowing/disallowing themselves. How that ever made sense, will probably forever be lost on me.

`chroot` is literally the first thing I used when I first installed a local agent, by intuition (later moved on to a container-wrapper), and now I'm reading about people who are giving these agents direct access to reply to their emails and more.

▲

valleyer 2 hours ago | parent [-]

Here's OpenAI's docs page on how they sandbox Codex: https://developers.openai.com/codex/security/

Here's the macOS kernel-enforced sandbox profile that gets applied to processes spawned by the LLM: https://github.com/openai/codex/blob/main/codex-rs/core/src/...

I think skepticism is healthy here, but there's no need to just guess.

▲

ZeroGravitas 21 minutes ago | parent | next [-]

If I'm following this it means you need to audit all code that the llm writes though as anything you run from another terminal window will be run as you with full permissions.

▲

chongli an hour ago | parent | prev [-]

That still doesn't seem ideal. Run the LLM itself in a kernel-enforced sandbox, lest it find ways to exploit vulnerabilities in its own code.

	▲	valleyer 40 minutes ago \| parent [-]
		The LLM inference itself doesn't "run code" per se (it's just doing tensor math), and besides, it runs on OpenAI's servers, not your machine.

▲

an hour ago | parent | prev [-]

[deleted]