simonw 3 days ago
I went looking for how they define "agent" in the paper:

> AI agents are autonomous systems that can reason about tasks and act to achieve goals by leveraging external tools and resources [4]. Modern AI agents are typically powered by large language models (LLMs) connected to external tools or APIs. They can perform reasoning, invoke specialized models, and adapt based on feedback [5]. Agents differ from static models in that they are interactive and adaptive. Rather than returning fixed outputs, they can take multi-step actions, integrate context, and support iterative human–AI collaboration. Importantly, because agents are built on top of LLMs, users can interact with agents through human language, substantially reducing usage barriers for scientists.

So more-or-less an LLM running tools in a loop. I'm guessing "invoke specialized models" is achieved here by running a tool call against some other model.
backflippinbozo a day ago
Yeah, probably pretty simple compared to the methods we've publicly discussed for months before this publication.

Here's the last time we showed our demo on HN: https://news.ycombinator.com/item?id=45132898

We'll actually be presenting on this tomorrow at 9am PST: https://calendar.app.google/3soCpuHupRr96UaF8

Besides ReAct, we use AG2's 2-agent pattern, with a Code Writer and a Code Executor in the DockerCommandLineCodeExecutor. We also use hardware monitors and LLM-as-a-Judge to assess task completion.

It's how we've built nearly 1K Docker images for arXiv papers over the last couple of months: https://hub.docker.com/u/remyxai

And it's how we'll support computational reproducibility by linking Docker images to the arXiv paper publications: https://github.com/arXiv/arxiv-browse/pull/908
eric-burel 3 days ago
An LLM running tools in a loop is the core idea of ReAct agents, and is indeed one of the most effective ways to extract value from generative AI. Ironically, it's not about generation at all: we use the model's classification skills to pick tools and its text-processing skills to take the context into account.
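For anyone who wants to see the "tools in a loop" idea concretely, here is a minimal sketch in Python. It is not the paper's implementation or any framework's API: `scripted_model` is a hypothetical stand-in for a real LLM call, and the `TOOLS` registry and the `call <tool> <arg>` / `final <answer>` text protocol are made up for illustration.

```python
from typing import Callable

# Hypothetical tool registry; the names and functions are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split())),
    "upper": lambda arg: arg.upper(),
}


def scripted_model(history: list[str]) -> str:
    """Stand-in for an LLM: chooses the next action from the history so far."""
    if not any(line.startswith("observation:") for line in history):
        # No tool result yet, so "decide" to call a tool.
        return "call add 2 3"
    # A tool result is available, so produce the final answer from it.
    return "final The sum is " + history[-1].removeprefix("observation: ")


def run_agent(task: str, model=scripted_model, max_steps: int = 5) -> str:
    """Ask the model for an action, run the chosen tool, feed the result back."""
    history = ["task: " + task]
    for _ in range(max_steps):
        action = model(history)
        kind, _, rest = action.partition(" ")
        if kind == "final":
            return rest
        # Otherwise treat the action as "call <tool> <argument>".
        name, _, arg = rest.partition(" ")
        tool = TOOLS.get(name)
        result = tool(arg) if tool else f"unknown tool {name!r}"
        history.append("observation: " + result)
    return "stopped: step limit reached"


print(run_agent("What is 2 + 3?"))
```

The model only ever picks an action and reads the observation back, which is the classification-plus-context behaviour described above. The `max_steps` cap is there so a confused model cannot loop forever.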
datadrivenangel 3 days ago
With your definition of agents as running tools in a loop, do you have high hopes for multi-tool agents being feasible from a security perspective? Seems like they'll need to be locked down.
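One possible reading of "locked down" is a per-agent allowlist in front of the tool registry. The sketch below assumes that reading; `ToolGate`, `read_file`, and `delete_file` are hypothetical names, and the tools are stubs that touch nothing on disk. It is not a complete security model, since it does nothing about prompt injection or about what an allowed tool does with its arguments.

```python
from typing import Callable


# Hypothetical stub tools; they only return strings and touch no real files.
def read_file(path: str) -> str:
    return f"(contents of {path})"


def delete_file(path: str) -> str:
    return f"deleted {path}"


ALL_TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": read_file,
    "delete_file": delete_file,
}


class ToolGate:
    """Exposes only an explicit allowlist of tools to one agent."""

    def __init__(self, tools: dict[str, Callable[[str], str]], allowed: set[str]):
        # Keep only the allowed tools, so a disallowed one cannot be reached by name.
        self._tools = {name: fn for name, fn in tools.items() if name in allowed}

    def call(self, name: str, arg: str) -> str:
        if name not in self._tools:
            raise PermissionError(f"tool {name!r} is not allowed for this agent")
        return self._tools[name](arg)


gate = ToolGate(ALL_TOOLS, allowed={"read_file"})
print(gate.call("read_file", "notes.txt"))
try:
    gate.call("delete_file", "notes.txt")
except PermissionError as err:
    print("blocked:", err)
```

The agent loop would call `gate.call(...)` instead of looking tools up directly, so the set of reachable tools is fixed before the model sees any input.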