codazoda 2 hours ago

I can't get Codex CLI or Claude Code to do tool calling with small local models. This is because those CLIs use XML for tool calls while the small local models have JSON tool use baked into them. No amount of prompting can fix it.

In a day or two I'll release my answer to this problem. But I'm curious: have you had a different experience, where tool use works in one of these CLIs with a small local model?
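For illustration, here's a rough sketch of the mismatch being described, assuming a simplified Anthropic-style XML tool block on one side and an OpenAI-style JSON tool call on the other (the exact formats the CLIs use internally may differ); translating between them is mechanical:

```python
import json
import xml.etree.ElementTree as ET

# Simplified XML-style tool call, roughly what the parent describes the CLIs emitting.
xml_call = """
<tool_use>
  <name>read_file</name>
  <input>{"path": "main.py"}</input>
</tool_use>
"""

# OpenAI-style JSON tool call, the shape most small local models are tuned on.
json_call = {
    "type": "function",
    "function": {
        "name": "read_file",
        "arguments": json.dumps({"path": "main.py"}),
    },
}

# Converting the XML form into the JSON form is a few lines.
root = ET.fromstring(xml_call.strip())
translated = {
    "type": "function",
    "function": {
        "name": root.findtext("name"),
        "arguments": root.findtext("input"),
    },
}
print(translated["function"]["name"])  # read_file
```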

zackify an hour ago | parent | next

I'm using this model right now in Claude Code with LM Studio, perfectly, on a MacBook Pro.

codazoda an hour ago | parent

You mean Qwen3-Coder-Next? I haven't tried that model yet because I assume it's too big for me. I have a modest 16GB MacBook Air, so I'm restricted to really small stuff. I'm thinking about buying a machine with a GPU to run some of these.

Anyway, maybe I should try some other models. The ones that haven't worked for tool calling, for me, are:

Llama3.1

Llama3.2

Qwen2.5-coder

Qwen3-coder

All of these in 7B, 8B, or sometimes (painfully) 30B variants.

I should also note that I'm typically using Ollama. Maybe LM Studio or llama.cpp somehow improves on this?

regularfry 2 hours ago | parent | prev

Surely the answer is a very small proxy server between the two?
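The core of such a proxy would mostly be a schema rewrite. A minimal sketch of that step, assuming Anthropic-style tool definitions (`input_schema`) coming in and OpenAI-style `tools` entries going out; the surrounding HTTP plumbing is omitted:

```python
def anthropic_tools_to_openai(tools):
    """Rewrite Anthropic-style tool definitions into OpenAI-style ones."""
    return [
        {
            "type": "function",
            "function": {
                "name": t["name"],
                "description": t.get("description", ""),
                "parameters": t["input_schema"],
            },
        }
        for t in tools
    ]

# Example tool definition in the Anthropic shape.
example = [{
    "name": "read_file",
    "description": "Read a file from disk",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}]
converted = anthropic_tools_to_openai(example)
print(converted[0]["function"]["name"])  # read_file
```

A real proxy would also need to rewrite the responses (tool calls and tool results) in the opposite direction, but the idea is the same: pure, mechanical translation between two JSON shapes.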

codazoda 2 hours ago | parent

That might work, but I keep seeing people talk about this, so there must be a simple solution that I'm overlooking. My solution is to write my own minimal, experimental CLI that talks JSON tools.
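The heart of a CLI that "talks JSON tools" is a short loop: send the conversation plus tool definitions, and whenever the model returns `tool_calls`, run them and append the results. A sketch assuming an OpenAI-compatible local endpoint, with the hypothetical `chat_completion` callable standing in for the actual HTTP request:

```python
import json

def run_tool_loop(chat_completion, messages, tools, handlers, max_turns=10):
    """Drive an OpenAI-style JSON tool-calling loop.

    chat_completion: callable (messages, tools) -> assistant message dict;
      in practice this wraps a POST to a local OpenAI-compatible server.
    handlers: dict mapping tool name -> Python function to execute it.
    """
    for _ in range(max_turns):
        reply = chat_completion(messages, tools)
        messages.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            return reply.get("content")  # model produced a final answer
        for call in calls:
            fn = call["function"]
            args = json.loads(fn["arguments"])
            result = handlers[fn["name"]](**args)
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    raise RuntimeError("tool loop did not converge")

# Demo with a stubbed model: first reply requests a tool, second is final.
replies = iter([
    {"role": "assistant", "content": None,
     "tool_calls": [{"id": "1", "function": {
         "name": "add", "arguments": '{"a": 2, "b": 3}'}}]},
    {"role": "assistant", "content": "5"},
])
out = run_tool_loop(lambda msgs, tools: next(replies),
                    [], [], {"add": lambda a, b: a + b})
print(out)  # 5
```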