jasonjmcghee 4 days ago

I think the best models around right now that most people can fit (at some quantization) on their computer, whether it's an Apple Silicon Mac or a gaming PC, would be:

For non-coding: Qwen3-30B-A3B-Instruct-2507 (or the thinking variant, depending on use case)

For coding: Qwen3-Coder-30B-A3B-Instruct

---

If you have a bit more VRAM: GLM-4.5-Air, or the full GLM-4.5.

all2 4 days ago | parent [-]

Note that Qwen3 and DeepSeek are hobbled in Ollama; they cannot use tools because the tool portion of the system prompt is missing.

Recommendation: use something else to run the model. Ollama is convenient, but insufficient for tool use for these models.
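To make the failure mode concrete, here is a minimal sketch of the kind of OpenAI-style tool-calling request that runtimes like LM Studio or llama-server accept on `/v1/chat/completions`. The model id and the `get_weather` function are hypothetical placeholders; the point is that the serving runtime must render the `tools` list into the model's chat template, and if it doesn't, the model never sees the tool definitions at all.

```python
import json

# Hypothetical tool-calling request payload (OpenAI-compatible schema).
# Model id and get_weather are placeholders for illustration only.
payload = {
    "model": "qwen3-30b-a3b-instruct",  # placeholder model id
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# The runtime is responsible for injecting the `tools` array into the
# model's chat template; skipping that step is what breaks tool use.
body = json.dumps(payload)
print(json.loads(body)["tools"][0]["function"]["name"])
```

If the runtime renders this correctly, the model can respond with a `tool_calls` entry naming `get_weather`; if the tool block is dropped from the prompt, it will just answer in plain text.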

theshrike79 4 days ago | parent [-]

Could you give a recommendation that works instead of saying what doesn't work?

simonw 4 days ago | parent | next [-]

Try LM Studio or llama-server: https://simonwillison.net/2025/Aug/19/gpt-oss-with-llama-cpp...

all2 3 days ago | parent | prev [-]

I would, but I haven't found a working solution.