hacker_homie 2 days ago

run local models

mark_l_watson a day ago

I experiment a lot with local models: great results for engineering tasks, less so for coding agents.

I have used the following on a 32 GB Mac mini to help write useful code:

    ollama launch claude --model qwen3.6:27b-coding-nvfp4
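
If you want to hit the same locally served model outside the agent wrapper, here's a minimal sketch against Ollama's local HTTP API. It assumes a default install listening on localhost:11434 and that the model tag above (or whatever you've actually pulled) is available; the prompt is just a placeholder:

    import requests  # pip install requests

    # One-off completion from the locally served model (no agent loop).
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "qwen3.6:27b-coding-nvfp4",  # use whatever tag you have pulled locally
            "prompt": "Write a Python function that deduplicates a list while preserving order.",
            "stream": False,  # return a single JSON object instead of a token stream
        },
        timeout=600,  # a local ~27B model can take a while on modest hardware
    )
    print(resp.json()["response"])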

The problem is that running local models (except for engineering tasks like data munging) is slow. With the above setup I kick off a task (with no user-verification prompts), then go for a walk and wait for results that my Gemini Ultra plan would produce in 10 seconds.
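
If you want to quantify "slow", the /api/generate response above includes token counts and timings, so you can compute throughput directly (a sketch, reusing the resp object from the earlier snippet; eval_duration is reported in nanoseconds):

    data = resp.json()
    # eval_count = tokens generated, eval_duration = generation time in nanoseconds
    tokens_per_sec = data["eval_count"] / (data["eval_duration"] / 1e9)
    print(f"{data['eval_count']} tokens at {tokens_per_sec:.1f} tok/s")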

SchemaLoad 2 days ago

You need massively expensive hardware to run them, and they still aren't as good. It's pretty clear the true cost of AI tools is far higher than what we're being charged right now.

pixelpoet a day ago

I wouldn't call my $2k Strix Halo machine "massively expensive". It runs e.g. Qwen 3.6 27b brilliantly with tons of memory to spare, and it's a full x86 powerhouse pulling 120 W at absolute max.
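
Rough back-of-the-envelope on why the memory works out (assuming roughly 4-bit quantization; KV cache and context overhead add more on top, and the exact figures vary by quant format):

    params = 27e9            # ~27B parameters
    bytes_per_param = 0.5    # ~4 bits per weight
    weights_gb = params * bytes_per_param / 1e9
    print(f"~{weights_gb:.0f} GB of weights")  # ~14 GB, leaving plenty free on a Strix Halo box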

IMO the programming world is far too myopic about, and insistent on, using laptops, especially MacBooks. Just because a crappy deal exists doesn't mean everyone is forced to take it. Local AI is a high-performance-computing problem, and laptops are fundamentally a crappy form factor for it; buy an efficient desktop computer and be surprised at what's possible even at today's crazy prices.