Preventing Python "Sandbox" Escape?

3 points by high_byte a year ago | 14 comments

I'm using python's exec(code, globals, locals)

I disable __builtins__ so no imports, exec, eval, open, etc. inside that context

but it seems you can still always do object.__subclasses__() and find every system method (eg. open())

it can't be overwritten, but looking at the interpreter code it seems like it's possible to hack a workaround for this specific case.

are there other known ways to escape exec()?
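For anyone who wants to see the problem concretely, here is a minimal sketch of the reach described above. It assumes an exec() call with an empty `__builtins__` dict; the names `sandbox_globals` and `untrusted` are only for illustration. It does nothing harmful and just lists class names.

```python
# Globals for the exec'ed code, with every builtin name removed.
sandbox_globals = {"__builtins__": {}}

untrusted = """
# No builtin names are used here: a tuple literal leads to `object`.
root = ().__class__.__bases__[0]
names = [cls.__name__ for cls in root.__subclasses__()]
"""

exec(untrusted, sandbox_globals)

names = sandbox_globals["names"]
print(len(names), "classes reachable without any builtins")
```

Every class the interpreter has loaded shows up in that list, which is why removing names from `__builtins__` does not, on its own, remove access to the objects behind them.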

▲ zahlman a year ago | parent | next [-]

Relevant:

https://discuss.python.org/t/extending-subinterpreters-with-...

https://stackoverflow.com/questions/3068139

https://wiki.python.org/moin/SandboxedPython

https://github.com/jailctf/pyjailbreaker

https://healeycodes.com/running-untrusted-python-code

https://lwn.net/Articles/574215/

▲ Terr_ a year ago | parent | prev | next [-]

I would infer that it's insecure, since if it were that easy there wouldn't be various abandoned projects trying to sandbox Python.

It's the curse of any sufficiently useful language. Well, maybe not Lua, but that was specifically designed for embedding. Java also began with that intention back when applets were ahead of their time, though IIRC secure sandboxing is no longer really a feature.

▲ high_byte a year ago | parent [-]

but all those projects also wanted system apis like filesystem and sockets and such. for me I just want to hijack the interpreter so I don't have to write my own. no imports, no sockets.

▲ eesmith a year ago | parent | prev | next [-]

Don't do it. Really, really don't do it. People have tried for decades to develop such a sandbox, and it does not work.

▲ high_byte a year ago | parent [-]

as I mentioned in another reply, all those projects also wanted system apis like filesystem and sockets and such.

for me I just want to hijack the interpreter so I don't have to write my own. no imports, no sockets.

▲ eesmith a year ago | parent [-]

No, I'm not.

I'm talking about the history behind why rexec and Bastion, and restricted execution, were removed from Python in the 2.x days. See https://python.readthedocs.io/en/v2.7.2/library/restricted.h... , "In Python 2.3 these modules have been disabled due to various known and not readily fixable security holes."

They started because back in the 1.x days there was a Python web browser called Grail, and the hope was to support restricted Python applets in Grail.

Or from 10 years ago, read https://lwn.net/Articles/574215/ about the failure of 'pysandbox' where one of the ways to break out was to "[use] a traceback object to unwind the stack frame to one in the trusted namespace, then use the f_globals attribute to retrieve a global object." ... "Stinner's tale should serve as a cautionary one to anyone considering a CPython-based solution".
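A rough sketch of that traceback trick, not the actual pysandbox exploit: it assumes the host hands the exec'ed code one trusted function (the hypothetical `trusted_callback`) and leaves `Exception` available so the code can catch errors.

```python
def trusted_callback():
    # Host-side function; its frame's globals are the host module's globals.
    raise ValueError("boom")

sandbox_globals = {
    "__builtins__": {"Exception": Exception},  # assumption: only this name is allowed
    "callback": trusted_callback,
}

untrusted = """
try:
    callback()
except Exception as exc:
    tb = exc.__traceback__
    # Walk to the innermost frame, which belongs to the trusted function.
    while tb.tb_next is not None:
        tb = tb.tb_next
    leaked = tb.tb_frame.f_globals
"""

exec(untrusted, sandbox_globals)

leaked = sandbox_globals["leaked"]
print(leaked is trusted_callback.__globals__)
```

Once the exec'ed code holds the host's globals dict, everything the host module can name is available to it, including the real builtins.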

You might consider RestrictedPython at https://restrictedpython.readthedocs.io/en/latest/ which supports only a subset of Python, via AST-walking to limit what the code can do. I have no experience with it.
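To give a feel for what "AST-walking" means, here is a small sketch using only the standard `ast` module. This is not RestrictedPython's API or its actual rule set; the class `UnderscoreRejector` and the function `check_source` are made up for illustration, and the rules (no imports, no underscore-prefixed names or attributes) are an assumption.

```python
import ast

class UnderscoreRejector(ast.NodeVisitor):
    """Collect reasons to refuse a piece of source before it is ever run."""

    def __init__(self):
        self.violations = []

    def visit_Attribute(self, node):
        # Blocks things like ().__class__ or exc.__traceback__.
        if node.attr.startswith("_"):
            self.violations.append(f"line {node.lineno}: attribute {node.attr!r}")
        self.generic_visit(node)

    def visit_Name(self, node):
        if node.id.startswith("_"):
            self.violations.append(f"line {node.lineno}: name {node.id!r}")
        self.generic_visit(node)

    def visit_Import(self, node):
        self.violations.append(f"line {node.lineno}: import statement")

    def visit_ImportFrom(self, node):
        self.violations.append(f"line {node.lineno}: import statement")

def check_source(source):
    """Return a list of violations; an empty list means nothing was flagged."""
    checker = UnderscoreRejector()
    checker.visit(ast.parse(source))
    return checker.violations

print(check_source("x = 1 + 2"))
print(check_source("root = ().__class__.__bases__[0]"))
```

A check like this rejects code before execution instead of trying to contain it afterwards. It is far from complete, so treat it as an illustration of the approach rather than something to rely on.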

▲ high_byte a year ago | parent | next [-]

I didn't use RestrictedPython. I did manage to patch the __subclasses__() escape with a hack. if only I can patch the exceptions traceback too I think it will be good enough :)

edit: here are my silly little patches: https://github.com/hananbeer/cpython-toy-sandbox/commit/fa3f...

this is of course assuming exec(globals={..}) without certain builtins and is, again, not expected to use system apis like files or sockets or anything.

▲ eesmith a year ago | parent [-]

As a reminder, in case you didn't consider it, some code in your exec string might be run after the exec has finished, due to garbage collection.

    d = {"__builtins__": {"print": print}}
    exec("""
    def delay_until_gc():
        try:
            try:
                yield 1
            finally:
                print((1).__class__.__bases__[0].__subclasses__()[:3])
        except:
            raise

    it = delay_until_gc()
    it.__next__()
    """, d, d)
    del d
    print("Finished exec.")

The output for this on my system is

    Finished exec.
    [<class 'type'>, <class 'async_generator'>, <class 'bytearray_iterator'>]

which means you'll need to ensure those dictionaries are cleared and garbage collected before you can clear your toybox state, something like:

    import gc
    toybox(1)
    exec(..., d, d)
    del d
    gc.collect()
    toybox(0)

The "del" is not good enough due to the cyclical reference as the iterator function's globals contain the active iterator.

If you allow any mutable object into the globals or locals dictionary, such that the exec'ed code can attach something to it, then you can't even use gc.collect() to ensure the exec'ed code can no longer be executed.
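A small sketch of that last point about mutable objects, assuming the host passes a plain list into the exec'ed code. The names `shared` and `log` are hypothetical, and `log` stands in for whatever side effect the code wants to run late.

```python
import gc

log = []      # records when the exec'ed code's cleanup actually runs
shared = []   # a mutable host object handed to the exec'ed code

d = {"__builtins__": {}, "shared": shared, "log": log}
exec("""
def delay_until_freed():
    try:
        yield 1
    finally:
        log.append("ran after exec")

shared.append(delay_until_freed())   # park the generator on the host's list
shared[0].__next__()                 # advance it into the try block
""", d, d)

del d
gc.collect()
ran_before_clear = list(log)   # the generator is still alive via `shared`

shared.clear()                 # the host drops the object much later...
gc.collect()
ran_after_clear = list(log)    # ...and the finally block runs at that point

print(ran_before_clear, ran_after_clear)
```

Here gc.collect() cannot help, because the generator is kept alive by an object the host still owns. The exec'ed code gets to run whenever the host eventually lets go of it.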

▲ high_byte a year ago | parent | prev [-]

thanks, RestrictedPython looks like it could work for me!

▲ billpg a year ago | parent | prev | next [-]

I'm interested in an answer. Is there a way, by design, to run code from an untrusted source in a restricted manner? So the worst the code could do is call me rude names.

▲ eesmith a year ago | parent | next [-]

Not staying in Python. Python's run-time is not built for sandboxing.

If you set up a new runtime environment, like a FreeBSD jail, with no access to anything and a short CPU limit, then you could start a Python subprocess in that environment, where the only thing that gets out is data via a pipe to call you names.

An operating system like FreeBSD is built to run code in a restricted manner.
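The shape of that setup, as a rough Unix-only sketch: a separate Python process, a CPU limit, and a pipe as the only channel back. The function `run_untrusted` and its parameters are made up for illustration. Note the big assumption: this is NOT a jail. The child can still touch the filesystem and network, so the real isolation would have to come from the operating system (a jail, container, or similar) around it.

```python
import resource
import subprocess
import sys

def run_untrusted(source, cpu_seconds=1, wall_seconds=5):
    """Run `source` in a child interpreter; return (returncode, stdout text)."""

    def limit_cpu():
        # Runs in the child just before exec: cap its CPU time.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    proc = subprocess.run(
        [sys.executable, "-I", "-c", source],  # -I: isolated mode
        stdin=subprocess.DEVNULL,
        capture_output=True,                   # output comes back via pipes only
        text=True,
        timeout=wall_seconds,                  # wall-clock backstop
        preexec_fn=limit_cpu,
    )
    return proc.returncode, proc.stdout

code, out = run_untrusted("print('you are rude')")
print(code, out)
```

A child that spins forever gets killed by the kernel once it uses up its CPU budget, so the host only ever sees an exit status and whatever text came through the pipe.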
▲ high_byte a year ago | parent | prev [-]

I believe that's basically docker which uses linux seccomp, but there are also sandboxes for language specific applications.

ps. browsers basically do that with javascript

▲ a year ago | parent | prev | next [-]

[deleted]

▲ PixelNomad_123 a year ago | parent | prev [-]

I agree with eesmith. DON'T DO IT. I guess you got your answer: RestrictedPython.