| ▲ | MCP Run Python (github.com) |
| 173 points by xrd 7 months ago | 29 comments |
| |
|
| ▲ | behnamoh 6 months ago | parent | next [-] |
| So their method of sandboxing Python code is to spin up a JS runtime (Deno), run Pyodide on it, and then run the Python code in Pyodide. Seems like a lot of work to me. Is this really the best way to create and run Python sandboxes? |
| |
| ▲ | simonw 6 months ago | parent | next [-] | | I've been trying to find a good option for this for ages. The Deno/Pyodide one is genuinely one of the top contenders: https://til.simonwillison.net/deno/pyodide-sandbox I'm hoping some day to find a recipe I really like for running Python code in a WASM container directly inside Python. Here's the closest I've got, using wasmtime: https://til.simonwillison.net/webassembly/python-in-a-wasm-s... | | |
| ▲ | 5rest 6 months ago | parent [-] | | The demo looks really appealing. I have a real-world use case in mind: analyzing an Excel file and asking questions about its contents. The current approach (https://github.com/pydantic/pydantic-ai/blob/main/mcp-run-py...) seems limited to running standalone scripts—it doesn't support reading and processing files. Is there an extension or workaround to enable file input and processing? |
| |
| ▲ | kodablah 6 months ago | parent | prev | next [-] | | There just aren't good Python sandboxing approaches. There are subinterpreters, but they can be slow to start from scratch. There are higher-level sandboxing approaches like microVMs, but they have setup overhead and are not easy to use from inside Python. At Temporal, we required a sandbox but didn't have any security requirement, so we wrote it from scratch with eval/exec and a custom importer [0]. It is not a foolproof sandbox, but it does a good job at isolating state, intercepting and preventing illegal calls we don't like, and allowing some imports to "pass through" from the outside instead of being reloaded, for performance reasons. [0] - https://github.com/temporalio/sdk-python?tab=readme-ov-file#... | |
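A minimal sketch of the exec-plus-custom-importer idea described in the comment above. This is not Temporal's implementation: the allow-list, the trimmed builtins, and the names `ALLOWED_MODULES`, `restricted_import`, and `run_isolated` are all hypothetical, chosen only to illustrate state isolation and import interception. As the comment says of the real thing, it is not a security boundary.

```python
import builtins

# Hypothetical allow-list; anything else is rejected by the import hook.
ALLOWED_MODULES = {"math", "datetime", "json"}


def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Import hook: only absolute imports of allow-listed modules pass through."""
    if level != 0 or name.split(".")[0] not in ALLOWED_MODULES:
        raise ImportError(f"import of {name!r} is blocked in this sandbox")
    # Allowed modules are the host's own module objects ("pass through"),
    # not fresh copies reloaded inside the sandbox.
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_isolated(source):
    """Run source in a fresh namespace with a trimmed set of builtins."""
    safe_names = ("abs", "float", "int", "len", "max", "min",
                  "print", "range", "str", "sum")
    safe_builtins = {name: getattr(builtins, name) for name in safe_names}
    safe_builtins["__import__"] = restricted_import
    namespace = {"__builtins__": safe_builtins, "__name__": "__sandbox__"}
    exec(source, namespace)  # state lives only in this namespace dict
    return namespace


if __name__ == "__main__":
    ns = run_isolated("import math\nx = math.floor(2.7)")
    print(ns["x"])
    try:
        run_isolated("import os")
    except ImportError as err:
        print("blocked:", err)
```

Each call gets its own globals, so variables from one run are not visible to the next, while allow-listed modules are shared with the host. Code running this way can still reach the real interpreter through object introspection, which is why this only suits cases with no security requirement.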
| ▲ | anentropic 6 months ago | parent | prev | next [-] | | It's what ChatGPT does apparently... https://simonwillison.net/2024/Dec/10/chatgpt-canvas/ | |
| ▲ | pseudosavant 6 months ago | parent | prev | next [-] | | If there is a WASM build of the project, that is going to be the easiest and safest way to run that with untrusted user content. And Deno happens to be really good at hosting WASM itself. So, these are the two easiest tools to do this with. I was looking into using WASM in Python yesterday for some image processing. It requires pulling in a full WASM runtime like wasmtime. Still better than calling out to native binaries like ImageMagick, but definitely more complicated than doing it in Deno. If I was writing it myself I'd do Deno, but LLMs are so good at writing Python. | |
| ▲ | redleader55 6 months ago | parent | prev | next [-] | | The author states: > The code is executed using Pyodide in Deno and is therefore isolated from the rest of the operating system. To me personally, the premise is a bit naive - it assumes that Deno's WASM VM doesn't have exploits, that Pyodide doesn't have bugs, etc. It might as well ask the LLM to produce JavaScript code and run it under Deno, which would be simpler. In the end, the problem is one of risk budget. If you're running this in a VM you control and it's only you running your own prompts on it, maybe it's "good enough". If, on the other hand, you want to sell this service to others who will attack your infrastructure, then no - it's not even close to being enough. Your question is a bit vague because it doesn't explain what "best way" means for you. Cheap, secure, implementable by a person over a weekend? | |
| ▲ | jjuliano 6 months ago | parent | prev | next [-] | | I am nowhere near as big or as popular as Pydantic, but this is my solution - https://kdeps.com/getting-started/resources/python.html | |
| ▲ | samuel 6 months ago | parent | prev | next [-] | | I spin up a docker container using the docker API. I haven't used gvisor because I don't expect the model to try kernel level exploits. If it were the case, we're already doomed. | |
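A rough sketch of the container approach from the comment above. The commenter uses the Docker API; this version shells out to the `docker` CLI instead, and the image name, resource limits, and the helper names `docker_argv` and `run_in_container` are assumptions made for illustration only.

```python
import subprocess


def docker_argv(code, image="python:3.12-slim"):
    """Build a `docker run` command line for a throwaway, locked-down container."""
    return [
        "docker", "run", "--rm",   # remove the container when it exits
        "--network", "none",       # no network access
        "--memory", "256m",        # cap memory (assumed limit)
        "--pids-limit", "64",      # cap process count (assumed limit)
        "--read-only",             # read-only root filesystem
        image, "python", "-c", code,
    ]


def run_in_container(code, timeout=30):
    """Run model-generated code in the container and capture its output."""
    return subprocess.run(
        docker_argv(code), capture_output=True, text=True, timeout=timeout
    )


if __name__ == "__main__":
    # Only builds and prints the command; call run_in_container() where Docker is installed.
    print(" ".join(docker_argv("print(2 + 2)")))
```

As the comment notes, a plain container shares the host kernel, so this guards against accidents and casual misuse rather than kernel-level exploits.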
| ▲ | pansa2 6 months ago | parent | prev | next [-] | | It might be. CPython doesn't support sandboxing Python code, so the only option is to run the whole interpreter within a sandbox. | |
| ▲ | jacob019 6 months ago | parent | prev | next [-] | | Indeed. Whatever happened to User-mode Linux? | |
| ▲ | ridruejo 6 months ago | parent | prev | next [-] | | It’s one of the best ways, at least on the sandboxing front. Hard to beat Wasm at that | |
| ▲ | kissgyorgy 6 months ago | parent | prev [-] | | Not at all. |
|
|
| ▲ | simonw 6 months ago | parent | prev | next [-] |
| I hacked around with this a bit and figured out a way to get it to spit out logging of the prompts and responses to the server: https://gist.github.com/simonw/54fc42ef9a7fb8f777162bbbfbba4...

Short-ish version:

ANTHROPIC_API_KEY="$(llm keys get anthropic)" \
uv run --with devtools --with pydantic-ai python -c '
import asyncio
from devtools import pprint
from pydantic_ai import Agent, capture_run_messages
from pydantic_ai.mcp import MCPServerStdio

# Launch the MCP server as a Deno subprocess that talks over stdio
server = MCPServerStdio(
    "deno",
    args=[
        "run",
        "-N",
        "-R=node_modules",
        "-W=node_modules",
        "--node-modules-dir=auto",
        "jsr:@pydantic/mcp-run-python",
        "stdio",
    ],
)
agent = Agent("claude-3-5-haiku-latest", mcp_servers=[server])

async def main():
    # Record every message exchanged during the run
    with capture_run_messages() as messages:
        async with agent.run_mcp_servers():
            result = await agent.run("How many days between 2000-01-01 and 2025-03-18?")
    pprint(messages)
    print(result.output)

asyncio.run(main())'

Output here: https://gist.github.com/simonw/54fc42ef9a7fb8f777162bbbfbba4...

I got it running against Mistral Small 3.1 running locally too - notes on that here: https://simonwillison.net/2025/Apr/18/mcp-run-python/ |
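For reference, the question in that prompt can be checked by hand with the standard library. This is only a sketch of the kind of code a model might send to the sandbox, not the code from the linked gist; the helper name `days_between` is made up here.

```python
from datetime import date


def days_between(start_iso, end_iso):
    """Number of days from start to end, both given as ISO dates (YYYY-MM-DD)."""
    return (date.fromisoformat(end_iso) - date.fromisoformat(start_iso)).days


if __name__ == "__main__":
    print(days_between("2000-01-01", "2025-03-18"))  # 9208
```

The arithmetic: 25 full years with 7 leap days (2000 through 2024) give 25 × 365 + 7 = 9132 days up to 2025-01-01, plus 31 + 28 + 17 = 76 days to reach March 18, for 9208 in total.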
|
| ▲ | evacchi 6 months ago | parent | prev | next [-] |
| cool!! you might also want to check out https://www.mcp.run/dylibso/eval-py It's open source too :) https://github.com/dylibso/mcp.run-servlets/tree/main/servle... We also use Wasm to sandbox all our servlets https://docs.mcp.run/blog/2025/04/07/mcp-run-security (I work at Dylibso) |
|
| ▲ | _pdp_ 6 months ago | parent | prev | next [-] |
| Bookmarked it. We took another approach, which provides more flexibility but at the cost of slower spin-up. Basically, we use a Firecracker VM. We mount the attachments and everything else into the VM so that the agent can run tools on them (anything on the OS), and we destroy the machine at the very end. It works! It is also as secure as Firecracker goes. But I like using WASM, especially in a hosted environment like Deno. It feels like a more scalable solution and probably less maintenance too, with the downside that we won't be able to run just any command. I am happy to provide more details and point to the tool if anyone is interested. It is not open-source but you can play with it for free. |
| |
|
| ▲ | yahoozoo 6 months ago | parent | prev | next [-] |
| All of these agent frameworks are already overwhelming. Insert joke about parallels to the JavaScript ecosystem. Which agent framework is truly the top dog? Or is it just a matter of working with the big model providers’ native frameworks, such as OpenAI’s Agents SDK? |
|
| ▲ | m3047 6 months ago | parent | prev | next [-] |
| Having watched the repeated immolation of blissful innocence since smart email clients would run whatever smart (OLE? Smart? I'm kidding.) document was delivered, this is going to be so much fun in a trainwreck kind of way. |
|
| ▲ | bigbuppo 6 months ago | parent | prev | next [-] |
| I keep seeing this MCP thing and I'm really happy that people are getting into Burroughs mainframes rather than that stupid AI crap. |
| |
| ▲ | snoman 6 months ago | parent [-] | | That’s a pretty obscure/dated reference to the Master Control Program that ran on Burroughs mainframes. I suspect the downvotes are for “… stupid AI crap.” |
|
|
| ▲ | someguy101010 6 months ago | parent | prev | next [-] |
| Nice! I'm working on a way to do this for javascript using v8 https://github.com/r33drichards/mcp-js. Right now this works but there is some significant jank. |
|
| ▲ | Cluelessidoit 6 months ago | parent | prev | next [-] |
| Hi, I don’t really know anything, honestly, but I do remember an AI I was running on my laptop using xpip or xpython as a contained environment. I think it’s a single instance. Would that work, or is that close? |
|
| ▲ | jamesralph8555 6 months ago | parent | prev | next [-] |
| How secure is this? I tried building something similar, but it was taking too long to set up a fully virtualized solution like Kata Containers or Firecracker. |
|
| ▲ | singularity2001 6 months ago | parent | prev | next [-] |
| Why not Pyodide directly in python? |
| |
| ▲ | simonw 6 months ago | parent [-] | | I haven't found a supported, documented way to do that yet. I'd love to find one. |
|
|
| ▲ | turnsout 6 months ago | parent | prev | next [-] |
| Woof, use with care |
|
| ▲ | neuroelectron 6 months ago | parent | prev | next [-] |
| Crap but it's mcp so being good isn't the point anyway |
|
| ▲ | mountainriver 6 months ago | parent | prev [-] |
| Cool! |