datadrivenangel 3 days ago:
With your definition of agents as "running tools in a loop," do you have high hopes for multi-tool agents being feasible from a security perspective? Seems like they'll need to be locked down.
backflippinbozo a day ago:
No doubt, this toy demo will break your system if the research repo runs unsecured code. We thought this through as we built a system that goes beyond running the quickstart: it implements the core methods of arXiv papers as draft PRs against YOUR target repo. Running a quickstart in a sandbox is practically useless.

To limit the attack surface we added PR#1929 to AG2 so we could pass API keys to the DockerCommandLineCodeExecutor and use egress whitelisting to limit an agent's ability to reach out to a compromised server: https://github.com/ag2ai/ag2/pull/1929

We'd been talking publicly about this for at least a month before this publication, and along the way we've built up nearly 1K Docker images for arXiv paper code: https://hub.docker.com/u/remyxai

We're close to seeing these images linked from the arXiv papers once PR#908 is merged: https://github.com/arXiv/arxiv-browse/pull/908

And we're actually doing a technical deep-dive with the AG2 team on our work tomorrow at 9am PST: https://calendar.app.google/3soCpuHupRr96UaF8
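The egress-whitelisting idea above can be sketched roughly like this: before an agent-issued request leaves the sandbox, resolve the target host and allow it only if it appears on an explicit allowlist. This is a minimal illustration of the concept, not the AG2 implementation; the function name and hostnames are hypothetical.

```python
# Sketch of egress whitelisting for a sandboxed agent (illustrative only).
# The real enforcement would sit at the network layer (e.g. a proxy or
# firewall rules on the container), not in Python.
from urllib.parse import urlparse

# Hypothetical allowlist: only hosts the agent legitimately needs.
ALLOWED_HOSTS = {"api.openai.com", "pypi.org", "files.pythonhosted.org"}

def egress_allowed(url: str) -> bool:
    """Return True only if the URL's hostname is on the allowlist."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_HOSTS
```

Anything not on the list, including a compromised server an injected prompt might point the agent at, is denied by default.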
simonw 3 days ago:
I think the rule still applies that you should consider any tools as being under the control of anyone who manages to sneak instructions into your context. Which is a pretty big limitation in terms of things you can safely use them for!
eric-burel 3 days ago:
That's a known problem in the industry. Current LLM frameworks sadly don't give enough structure when it comes to agent authorization, but it will come.