Remix.run Logo
BenderV 4 days ago

I'm trying to understand what does it got to do with LLM size? Imho, right tools allow small models to perform better than undirected tool like bash to do everything. But I understand that this code is to show people how function calling is just a template for LLM.

diminish 4 days ago | parent [-]

Mini swe agent, as an academic tool, can be easily tested aimed to show the power of a simple idea against any LLM. You can go and test it with different LLMs. Tool calls didn't work fine with smaller LLM sizes usually. I don't see many viable alternatives less than 7GB, beyond Qwen3 4B for tool calling.

> right tools allow small models to perform better than undirected tool like bash to do everything.

Interesting enough the newer mini swe agent was refutation of this hypothesis for very large LLMs from the original swe agent paper (https://arxiv.org/pdf/2405.15793) assuming that specialized tools work better.

BenderV 3 days ago | parent [-]

Thanks for your answer.

I guess that it's only a matter of finetuning.

LLM have lots of experience with bash so I get they figure out how to work with it. They don't have experience with custom tools you provide it.

And also, LLM "tools" as we know it need better design (to show states, dynamic actions).

Given both, AI with the right tools will outperform AI with generic and uncontrolled tool.