Show HN: First Claude Code client for Ollama local models (github.com)
30 points by SerafimKorablev 7 hours ago | 14 comments
Just to clarify the background a bit. This project wasn't planned as a big standalone release at first. On January 16, Ollama added support for an Anthropic-compatible API, and I was curious how far this could be pushed in practice. I decided to try plugging local Ollama models directly into a Claude Code-style workflow and see if it would actually work end to end. Here is the release note from Ollama that made this possible: https://ollama.com/blog/claude

Technically, what I do is pretty straightforward:

- Detect which local models are available in Ollama.
- When internet access is unavailable, the client automatically switches to Ollama-backed local models instead of remote ones.
- From the user's perspective, it is the same Claude Code flow, just backed by local inference.

In practice, the best-performing model so far has been qwen3-coder:30b. I also tested glm-4.7-flash, which was released very recently, but it struggles with reliably following tool-calling instructions, so it is not usable for this workflow yet.
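For anyone curious what the detect-and-fall-back step amounts to, here is a rough Python sketch. This is not the project's actual code: Ollama's /api/tags listing endpoint and default port 11434 are documented, but the reachability check, the preferred-model choice, and the remote model name are placeholders.

```python
# Rough sketch of the fallback idea described above -- not the project's actual code.
import json
import urllib.error
import urllib.request

OLLAMA = "http://localhost:11434"  # Ollama's default local address

def anthropic_reachable(timeout: float = 2.0) -> bool:
    """Return True if the remote Anthropic API answers at all."""
    try:
        urllib.request.urlopen("https://api.anthropic.com", timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True   # got an HTTP response, so the network path works
    except OSError:
        return False  # DNS failure, timeout, no route: treat as offline

def local_models() -> list[str]:
    """List models already pulled into the local Ollama instance."""
    with urllib.request.urlopen(f"{OLLAMA}/api/tags", timeout=2.0) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

def pick_backend() -> tuple[str, str]:
    """Return (base_url, model): remote by default, local Ollama when offline."""
    if anthropic_reachable():
        return "https://api.anthropic.com", "claude-sonnet-4-5"  # placeholder remote model
    models = local_models()
    preferred = next((m for m in models if m.startswith("qwen3-coder")), None)
    return OLLAMA, preferred or models[0]

if __name__ == "__main__":
    base_url, model = pick_backend()
    print(f"Routing Claude Code-style requests to {base_url} with model {model}")
```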
oceanplexian 3 hours ago
The Anthropic API was already supported by llama.cpp (the project Ollama ripped off, and one Ollama typically trails in features by 3-6 months), which already works perfectly fine with Claude Code by setting a simple environment variable.
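For reference, the "simple environment variable" setup is roughly this kind of thing. It is a guess rather than a verified recipe: the variable names, the dummy token, and llama-server's default port 8080 are assumptions.

```python
# Sketch of the env-var approach mentioned above -- names and port are assumptions.
import os
import subprocess

env = dict(os.environ)
env["ANTHROPIC_BASE_URL"] = "http://localhost:8080"  # point Claude Code at a local llama-server
env["ANTHROPIC_AUTH_TOKEN"] = "dummy"                 # a local server needs no real key (assumption)

subprocess.run(["claude"], env=env)  # launch the Claude Code CLI against the local endpoint
```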
horacemorace 2 hours ago
I was trying to get Claude Code to work with llama.cpp but could never figure out anything functional. It always insisted on a phone-home login for first-time setup. In Cline I'm getting better results with glm-4.7-flash than with qwen3-coder:30b.
dsrtslnd23 an hour ago
What hardware are you running the 30b model on? I guess it needs at least 24GB VRAM for decent inference speeds.
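As a back-of-envelope check on that figure (the quantization level and the overhead are assumptions, not measurements of qwen3-coder:30b specifically):

```python
# Rough VRAM estimate for a ~30B-parameter model at 4-bit quantization.
params = 30e9
bytes_per_weight = 0.5   # Q4-style quantization, assumption
weights_gb = params * bytes_per_weight / 1e9
overhead_gb = 4          # KV cache, activations, runtime buffers -- rough guess
print(f"~{weights_gb:.0f} GB weights + ~{overhead_gb} GB overhead ≈ {weights_gb + overhead_gb:.0f} GB")
# ~15 GB weights + ~4 GB overhead ≈ 19 GB, which is why a 24 GB card is comfortable
```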
eli 3 hours ago
There are already various proxies that translate between OpenAI-style APIs (local or otherwise) and an Anthropic-style endpoint that Claude Code can talk to. Is the advantage here just one less piece of infrastructure to worry about?
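For context, such a proxy is conceptually small. Below is a minimal sketch of the translation step, deliberately simplified (no streaming, no tool calls, no error handling); the listen port and default model are placeholders, and Ollama's OpenAI-compatible /v1/chat/completions endpoint is assumed as the backend.

```python
# Minimal Anthropic-messages -> OpenAI-chat translation proxy (sketch, not production code).
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

OPENAI_ENDPOINT = "http://localhost:11434/v1/chat/completions"  # e.g. Ollama's OpenAI-compatible API

class Proxy(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))

        # Anthropic keeps the system prompt in a top-level field; OpenAI expects a message.
        messages = []
        if isinstance(body.get("system"), str):
            messages.append({"role": "system", "content": body["system"]})
        for m in body.get("messages", []):
            content = m["content"]
            if isinstance(content, list):  # Anthropic content blocks -> plain text
                content = "".join(block.get("text", "") for block in content)
            messages.append({"role": m["role"], "content": content})

        openai_req = {
            "model": body.get("model", "qwen3-coder:30b"),  # placeholder default
            "messages": messages,
            "max_tokens": body.get("max_tokens", 1024),
        }
        req = urllib.request.Request(
            OPENAI_ENDPOINT,
            data=json.dumps(openai_req).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            openai_resp = json.load(resp)

        # Re-wrap the completion in Anthropic's message shape.
        text = openai_resp["choices"][0]["message"]["content"]
        anthropic_resp = {
            "id": openai_resp.get("id", "msg_local"),
            "type": "message",
            "role": "assistant",
            "content": [{"type": "text", "text": text}],
            "stop_reason": "end_turn",
        }
        out = json.dumps(anthropic_resp).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(out)))
        self.end_headers()
        self.wfile.write(out)

if __name__ == "__main__":
    HTTPServer(("localhost", 8089), Proxy).serve_forever()  # placeholder port
```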
dosinga 3 hours ago
this is cool. not sure it is the first Claude Code-style coding agent that runs against Ollama models though. goose, opencode and others have been able to do that for a while, no?
d0100 3 hours ago
Does this UI work with Open Code?
mchiang 4 hours ago
hey, thanks for sharing. I had to go to the Twitter feed to find the GitHub link: