simonw 3 days ago

I went looking for how they define "agent" in the paper:

> AI agents are autonomous systems that can reason about tasks and act to achieve goals by leveraging external tools and resources [4]. Modern AI agents are typically powered by large language models (LLMs) connected to external tools or APIs. They can perform reasoning, invoke specialized models, and adapt based on feedback [5]. Agents differ from static models in that they are interactive and adaptive. Rather than returning fixed outputs, they can take multi-step actions, integrate context, and support iterative human–AI collaboration. Importantly, because agents are built on top of LLMs, users can interact with agents through human language, substantially reducing usage barriers for scientists.

So more-or-less an LLM running tools in a loop. I'm guessing "invoke specialized models" is achieved here by running a tool call against some other model.
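
If that's right, the "specialized model" tool can be as thin as a function that forwards to another model's API. A minimal sketch of what I mean (the endpoint, function names, and tool registry here are made up for illustration, not anything from the paper):

    import requests  # hypothetical dependency for the example

    # Hypothetical "specialized model": just another service behind an HTTP API.
    def fold_protein(sequence: str) -> dict:
        resp = requests.post("https://example.com/fold", json={"sequence": sequence})
        resp.raise_for_status()
        return resp.json()

    # The agent's tool registry; the LLM only ever picks a name and arguments.
    TOOLS = {"fold_protein": fold_protein}

    def run_tool(tool_name: str, arguments: dict) -> dict:
        # Executed by the harness whenever the LLM requests a tool call.
        return TOOLS[tool_name](**arguments)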

backflippinbozo a day ago | parent | next [-]

Yeah, probably pretty simple compared to the methods we've publicly discussed for months before this publication.

Here's the last time we showed our demo on HN: https://news.ycombinator.com/item?id=45132898

We'll actually be presenting on this tomorrow at 9am PST https://calendar.app.google/3soCpuHupRr96UaF8

Besides ReAct, we use AG2's two-agent pattern, with a Code Writer and a Code Executor running in the DockerCommandLineCodeExecutor.

We also use hardware monitors and an LLM-as-a-Judge to assess task completion.
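
For anyone curious, the wiring looks roughly like this (a simplified sketch based on AG2's documented ConversableAgent / DockerCommandLineCodeExecutor API; the model name and prompts are placeholders, not our production config):

    from pathlib import Path
    from autogen import ConversableAgent
    from autogen.coding import DockerCommandLineCodeExecutor

    # All generated code runs inside a throwaway container, not on the host.
    executor = DockerCommandLineCodeExecutor(
        image="python:3-slim", timeout=300, work_dir=Path("coding"),
    )

    code_writer = ConversableAgent(
        "code_writer",
        system_message="Write Python to run the paper's quickstart.",
        llm_config={"config_list": [{"model": "gpt-4o"}]},  # placeholder model
        code_execution_config=False,
    )

    code_executor = ConversableAgent(
        "code_executor",
        llm_config=False,
        code_execution_config={"executor": executor},
        human_input_mode="NEVER",
    )

    # The executor agent runs whatever code blocks the writer proposes.
    code_executor.initiate_chat(code_writer, message="Reproduce the quickstart.")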

It's how we've built nearly 1K Docker images for arXiv papers over the last couple months: https://hub.docker.com/u/remyxai

And how we'll support computational reproducibility by linking Docker images to the arXiv paper publications: https://github.com/arXiv/arxiv-browse/pull/908

eric-burel 3 days ago | parent | prev | next [-]

LLM running tools in a loop is the core idea of ReAct agents, and it's indeed one of the most effective ways to extract value from generative AI. Ironically, it's not really about generation at all: we use the model's classification skills to pick tools and its text-processing skills to take the context into account.
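
To make that concrete, a ReAct turn is just the model emitting a Thought and an Action, and the harness executing the action and feeding the Observation back into the context. A bare-bones sketch (call_llm and the tool set are placeholders the caller supplies):

    import re

    def parse_action(reply: str):
        # Expects a line like "Action: search[transformer scaling laws]"
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", reply)
        return match.group(1), match.group(2)

    def react_loop(question, call_llm, tools, max_steps=5):
        transcript = f"Question: {question}\n"
        for _ in range(max_steps):
            reply = call_llm(transcript)  # model returns "Thought: ...\nAction: ..."
            transcript += reply + "\n"
            if "Final Answer:" in reply:
                return reply.split("Final Answer:", 1)[1].strip()
            name, arg = parse_action(reply)  # classification: which tool, with what input?
            transcript += f"Observation: {tools[name](arg)}\n"
        return transcript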

ijk 2 days ago | parent [-]

I tend to find that using LLMs for interpretation and classification is often more useful for a given business task than wholesale generation.

datadrivenangel 3 days ago | parent | prev [-]

With your definition of agents as running tools in a loop, do you have high hopes for multi-tool agents being feasible from a security perspective? It seems like they'll need to be locked down.

backflippinbozo a day ago | parent | next [-]

No doubt, a toy demo like this will break your system if it runs the research repo's code unsecured.

We thought this through as we built a system that goes beyond running the quickstart to implement the core methods of arXiv papers as draft PRs for YOUR target repo.

Just running the quickstart in a sandbox is practically useless.

To limit the attack surface, we added PR#1929 to AG2 so we could pass API keys to the DockerCommandLineCodeExecutor and use egress whitelisting to limit an agent's ability to reach out to a compromised server: https://github.com/ag2ai/ag2/pull/1929
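
The general shape of that setup, stripped of AG2 specifics, looks something like this (a docker-py sketch of the pattern, not the actual PR code; the proxy image and allowlist are placeholders):

    import docker

    client = docker.from_env()

    # An "internal" network has no route out of the host.
    client.networks.create("agent-sandbox", driver="bridge", internal=True)

    # The only path out is a proxy that enforces the egress whitelist.
    proxy = client.containers.run(
        "allowlist-proxy:latest",   # placeholder image with the whitelist baked in
        name="egress-proxy", detach=True,
    )
    client.networks.get("agent-sandbox").connect(proxy)

    # The code-execution container gets its keys via env vars and can only
    # reach the outside world through the whitelisted proxy.
    client.containers.run(
        "python:3-slim",
        command="sleep infinity",
        detach=True,
        network="agent-sandbox",
        environment={
            "OPENAI_API_KEY": "<injected at runtime>",   # never baked into the image
            "HTTPS_PROXY": "http://egress-proxy:3128",
        },
    )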

We've been talking publicly about this for at least a month before this publication, and along the way we've built up nearly 1K Docker images for arXiv paper code: https://hub.docker.com/u/remyxai

We're close to seeing these images linked to the arXiv papers after PR#908 is merged: https://github.com/arXiv/arxiv-browse/pull/908

And we're actually doing a technical deep-dive with the AG2 team on our work tomorrow at 9am PST: https://calendar.app.google/3soCpuHupRr96UaF8

simonw 3 days ago | parent | prev | next [-]

I think the rule still applies that you should consider any tools as being under the control of anyone who manages to sneak instructions into your context.

Which is a pretty big limitation in terms of things you can safely use them for!

backflippinbozo a day ago | parent [-]

We built agents to test the GitHub repo quickstarts associated with arXiv papers a couple of months before this paper was published, and wrote about it publicly here: https://remyxai.substack.com/p/self-healing-repos

We've been pushing it further to implement draft PRs in your target repo, and published that a month before this preprint: https://remyxai.substack.com/p/paperswithprs

To limit the attack surface, we added PR#1929 to AG2 so we could pass API keys to the DockerCommandLineCodeExecutor while also using egress whitelisting to block an agent from reaching a compromised server: https://github.com/ag2ai/ag2/pull/1929

Since then, we've been scaling this with Ray workers on Kubernetes so we can run it in the cloud and build images for the hundreds of papers published daily.
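
The fan-out itself is the easy part once each paper's build is containerized. Roughly (a sketch; build_image_for_paper and the IDs stand in for our actual build task):

    import ray

    ray.init(address="auto")  # connect to the Ray cluster running on k8s

    @ray.remote(num_cpus=2)
    def build_image_for_paper(arxiv_id: str) -> str:
        # Stand-in for the real job: clone the repo, run the agent, push the image.
        return f"remyxai/{arxiv_id}"

    todays_paper_ids = ["2501.00001", "2501.00002"]  # placeholder IDs
    # One task per paper; Ray spreads them across the worker pods.
    images = ray.get([build_image_for_paper.remote(pid) for pid in todays_paper_ids])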

By running in Docker, constraining the network interface, deploying in the cloud, and ultimately keeping humans in the loop through PR review, it's hard to see where a prompt-injection attack comes into play from testing the code.

Would love to get feedback from an expert on this, can you imagine an attack scenario, Simon?

I'll need to work out a check for the case where someone publishes a paper whose code instructs my agent to push keys to a public HF repo for others to exfiltrate.
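
Probably something like a pre-PR scan over whatever the agent wants to push (a naive sketch; the patterns are illustrative, and a real check would lean on a dedicated scanner like gitleaks or detect-secrets):

    import re

    # Naive patterns for common credential formats; a real check would use a
    # dedicated scanner rather than hand-rolled regexes.
    SECRET_PATTERNS = [
        re.compile(r"sk-[A-Za-z0-9]{20,}"),   # OpenAI-style keys
        re.compile(r"hf_[A-Za-z0-9]{30,}"),   # Hugging Face tokens
        re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key IDs
    ]

    def diff_leaks_secrets(diff_text: str) -> bool:
        return any(p.search(diff_text) for p in SECRET_PATTERNS)

    # Gate the draft PR (and any outbound upload) on the scan.
    proposed_diff = open("agent_changes.diff").read()  # whatever the agent wants to push
    if diff_leaks_secrets(proposed_diff):
        raise RuntimeError("Possible secret exfiltration detected; blocking the PR.")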

eric-burel 3 days ago | parent | prev [-]

That's a problem being discussed across the industry. Currently, LLM frameworks sadly don't give enough structure when it comes to agent authorization. But it will come.