eesmith 7 months ago

Don't do it. Really, really don't do it. People have tried for decades to develop such a sandbox, and it does not work.

high_byte 7 months ago | parent [-]

as I mentioned in another reply, all those projects also wanted system apis like filesystem and sockets and such.

for me I just want to hijack the interpreter so I don't have to write my own. no imports, no sockets.

eesmith 7 months ago | parent [-]

No, I'm not.

I'm talking about the history behind why rexec and Bastion, and restricted execution, were removed from Python in the 2.x days. See https://python.readthedocs.io/en/v2.7.2/library/restricted.h... , "In Python 2.3 these modules have been disabled due to various known and not readily fixable security holes."

They started because back in the 1.x days there was a Python web browser called Grail, and the hope was to support restricted Python applets in Grail.

Or from 10 years ago, read https://lwn.net/Articles/574215/ about the failure of 'pysandbox' where one of the ways to break out was to "[use] a traceback object to unwind the stack frame to one in the trusted namespace, then use the f_globals attribute to retrieve a global object." ... "Stinner's tale should serve as a cautionary one to anyone considering a CPython-based solution".
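The traceback trick quoted above can be reproduced in a few lines. This is a minimal sketch, not pysandbox's actual code: `run_untrusted` and `UNTRUSTED` are hypothetical names, and it assumes the host exposes only `Exception` to the exec'ed code.

```python
# Hypothetical untrusted source: it needs no imports and no I/O builtins.
UNTRUSTED = """
try:
    1 / 0
except Exception as exc:
    # tb_frame is the frame of this sandboxed code; f_back is the host
    # frame that called exec(), and f_globals is the host's namespace.
    escaped = exc.__traceback__.tb_frame.f_back.f_globals
"""


def run_untrusted(source):
    # Assumption: the only builtin the host hands over is Exception.
    env = {"__builtins__": {"Exception": Exception}}
    exec(source, env, env)
    return env


result = run_untrusted(UNTRUSTED)
# On CPython the leaked dict is the very namespace run_untrusted lives in.
print(result["escaped"] is run_untrusted.__globals__)
```

Once the sandboxed code holds the host's globals, every module and function the host imported is reachable from it.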

You might consider RestrictedPython at https://restrictedpython.readthedocs.io/en/latest/ which supports only a subset of Python, via AST-walking to limit what the code can do. I have no experience with it.

high_byte 7 months ago | parent | next [-]

I didn't use RestrictedPython. I did manage to patch the __subclasses__() escape with a hack. if only I can patch the exception traceback too I think it will be good enough :)

edit: here are my silly little patches: https://github.com/hananbeer/cpython-toy-sandbox/commit/fa3f...

this is of course assuming exec(globals={..}) without certain builtins and is, again, not expected to use system apis like files or sockets or anything.
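For context, here is a small sketch of what stripping builtins from an exec() namespace does and does not block on an unpatched CPython. The variable names are illustrative only, and this says nothing about how the patches linked above behave.

```python
# An empty __builtins__ mapping hides names such as open and __import__.
blocked = False
try:
    exec("open('some-file')", {"__builtins__": {}})
except NameError:
    blocked = True

# Attribute access needs no names at all: start from a literal,
# climb to `object`, and list every class loaded in the interpreter.
env = {"__builtins__": {}}
exec("""
classes = ().__class__.__base__.__subclasses__()
names = [cls.__name__ for cls in classes]
""", env, env)

print(blocked)                 # the open() call was refused by name lookup
print("type" in env["names"])  # but the class hierarchy is still reachable
```

The first check shows that name lookup is restricted. The second shows why `__subclasses__()` has to be dealt with separately.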

eesmith 7 months ago | parent [-]

As a reminder, in case you didn't consider it, some code in your exec string might be run after the exec has finished, due to garbage collection.

    d = {"__builtins__": {"print": print}}

    exec("""

    def delay_until_gc():
      try:
        try:
          yield 1
        finally:
          print((1).__class__.__bases__[0].__subclasses__()[:3])
      except:
        raise

    it = delay_until_gc()
    it.__next__()

    """, d, d)
    del d
    print("Finished exec.")

The output for this on my system is:

  Finished exec.
  [<class 'type'>, <class 'async_generator'>, <class 'bytearray_iterator'>]

which means you'll need to ensure those dictionaries are cleared and garbage collected before you can clear your toybox state, something like:

  import gc
  toybox(1)
  exec(..., d, d)
  del d
  gc.collect()
  toybox(0)

The "del" alone is not enough because of a reference cycle: the generator function's globals contain the suspended generator, and the generator in turn references those globals, so only the cycle collector can free them.
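The difference between `del` and `gc.collect()` can be observed directly. This is a sketch assuming CPython; `events` and `record` are hypothetical names used only to log when the `finally` block runs.

```python
import gc

gc.disable()  # keep automatic collection from firing in the middle of the demo

events = []
env = {"__builtins__": {}, "record": events.append}
exec("""
def delay_until_gc():
    try:
        yield 1
    finally:
        record("finally ran")

it = delay_until_gc()
it.__next__()
""", env, env)

del env                      # drops our reference, but the cycle keeps the dict alive
after_del = list(events)

gc.collect()                 # the cycle collector finalizes the suspended generator
after_collect = list(events)

gc.enable()
print(after_del, after_collect)
```

Nothing is logged after the `del`. The `finally` block runs only once the cycle collector gets to the generator.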

If you allow any mutable object into the globals or locals dictionary, such that the exec'ed code can attach something to it, then you can't even use gc.collect() to ensure the exec'ed code can no longer be executed.
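A short sketch of that last point, with a hypothetical `Config` class standing in for any mutable object the host passes in:

```python
import gc


class Config:
    """Hypothetical host-side object shared with the exec'ed code."""


shared = Config()
env = {"__builtins__": {}, "shared": shared}
exec("""
def hook():
    return "sandboxed code still running"

# Attach a function defined inside the sandbox to the host's object.
shared.on_event = hook
""", env, env)

del env
gc.collect()  # does not help: `shared` keeps the function and its globals alive

print(shared.on_event())
```

The host still holds `shared`, so the attached function survives the collection and runs whenever the host calls it.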

high_byte 7 months ago | parent | prev [-]

thanks, RestrictedPython looks like it could work for me!