nl 2 days ago

I've been super impressed by qwen3:0.6b (yes, 0.6B) running in Ollama.

If you have very specific, constrained tasks it can do quite a lot. It's not perfect though.

https://tools.nicklothian.com/llm_comparator.html?gist=fcae9... is an example conversation where I took OpenAI's "Natural language to SQL" prompt[1], sent it to qwen3:0.6b in Ollama, and then asked Gemini Flash 3 to compare what qwen3:0.6b did vs. what Flash did.

Flash was clearly correct, but the qwen3:0.6b errors are interesting in themselves.
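
The call itself is roughly this with the Ollama Python client (the schema and prompt wording below are placeholders, not the exact OpenAI example):

    # Sketch only: send an NL-to-SQL style prompt to a local qwen3:0.6b.
    # Assumes `pip install ollama` and a running Ollama server with the model pulled.
    import ollama

    system_prompt = (
        "Given the following SQL tables, your job is to write queries "
        "given a user's request.\n\n"
        "CREATE TABLE Orders (OrderID int, CustomerID int, OrderDate datetime);\n"
        "CREATE TABLE Customers (CustomerID int, CustomerName varchar(50));"
    )
    question = "Write a query that lists each customer's most recent order."

    response = ollama.chat(
        model="qwen3:0.6b",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    )
    print(response["message"]["content"])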

[1] https://platform.openai.com/docs/examples/default-sql-transl...

Aurornis 2 days ago | parent [-]

I’ve experimented with several of the really small models. It’s impressive that they can produce anything at all, but in my experience the output is basically useless for anything of value.

nl 2 days ago | parent [-]

Yes, I thought that too! But qwen3:0.6b (and to some extent gemma 1b) has made me reevaluate.

They still aren't useful in the way large LLMs are, but for summarization and other tasks where you can give them structure but want the sheen of natural language, they're much better than things like the Phi series were.

redman25 2 days ago | parent | next [-]

That's interesting. For what projects would you want the "sheen of natural language" though?

nl a day ago | parent [-]

Say I want to auto-bookmark a bunch of tabs and need a summary of each one. Using the title is a mechanical solution, but a nice prompt and a small model can summarize the title and contents into something much more useful.
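
Roughly the kind of thing I mean, using the Ollama Python client (the prompt and helper here are just an illustration, not my actual code):

    # Sketch: one-line bookmark summaries from a tab's title and contents,
    # via a small local model served by Ollama.
    import ollama

    def summarize_tab(title: str, page_text: str, model: str = "qwen3:0.6b") -> str:
        prompt = (
            "Summarize this web page in one short sentence suitable as a "
            f"bookmark description.\n\nTitle: {title}\n\nContents:\n{page_text[:2000]}"
        )
        response = ollama.chat(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response["message"]["content"].strip()

    # e.g. summarize_tab(tab.title, extracted_page_text) for each open tab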

nunodonato a day ago | parent | prev [-]

The qwen3 family, mostly the 4B and 8B, are absolutely amazing. The VL versions even more so.