> ... on my Macbook Max M5 128 GB

Local development for who? How many of y'all are rocking 128GB of memory? Am I reading Apple's site correctly that it's a $10,000 laptop?

kllrnohj 7 hours ago | parent | next [-]

You don't need nearly that much RAM to run Qwen 3.6 27B, though. qwen3.6:27b-q4_K_M is only 17GB, for example.

	DanHulton 6 hours ago | parent [-]
		This is what I run on an M5 MacBook Air 32GB. Works great. I’m not having it build whole features from scratch, though. I give it pretty explicit instructions closer to the class or function level, and it still saves me an immense amount of time, while I’m very connected to the code that’s written. Definitely the sweet spot for me.

__s 7 hours ago | parent | prev | next [-]

I'm on a 128GB Strix Halo. I bought a Framework Desktop for a few thousand CAD back when everyone was calling it overpriced.

rhdunn 7 hours ago | parent | prev | next [-]

A 27B model can fit easily on a 32GB VRAM card (e.g. 5090) or a 32GB computer in RAM at FP8/Q8 (unsloth have 28.6GB Q8 files).

For 24GB VRAM cards (e.g. 4090) you can use Q6_K (22.5GB) or Q5_K_M (19.5GB) quants, possibly offloading some of the weights to RAM.
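The sizes above can be sanity-checked with some quick arithmetic. This is a rough sketch, not a sizing tool: it assumes exactly 27e9 parameters, treats 1 GB as 1e9 bytes, and ignores KV cache and runtime overhead unless you pass some headroom. The helper names (`bits_per_weight`, `quants_that_fit`) are made up for illustration, and the file sizes are the ones quoted in this thread.

```python
# File sizes quoted in this thread for a 27B-parameter model.
# Assumption: GB means 1e9 bytes.
QUANT_SIZES_GB = {
    "Q4_K_M": 17.0,
    "Q5_K_M": 19.5,
    "Q6_K": 22.5,
    "Q8": 28.6,
}
PARAMS = 27e9  # assumption: exactly 27 billion weights


def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Average bits stored per weight, implied by the file size."""
    return size_gb * 1e9 * 8 / params


def quants_that_fit(budget_gb: float, headroom_gb: float = 0.0) -> list[str]:
    """Quants whose file size plus optional headroom fits in the memory budget."""
    return [
        name
        for name, size in QUANT_SIZES_GB.items()
        if size + headroom_gb <= budget_gb
    ]


if __name__ == "__main__":
    for name, size in QUANT_SIZES_GB.items():
        print(f"{name}: {size} GB, about {bits_per_weight(size):.1f} bits/weight")
    print("fits in 24 GB:", quants_that_fit(24))
    print("fits in 32 GB:", quants_that_fit(32))
```

Passing something like `headroom_gb=2` is a crude way to leave room for context. With that, Q6_K no longer fits in 24GB, which is where offloading some weights to RAM comes in.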

	jboss10 4 hours ago | parent [-]
		For the 35B model, offloading to RAM doesn't slow it down much. If you have a nice CPU and a weak GPU, it will be fast enough to use.

wpm 7 hours ago | parent | prev | next [-]

It wasn't $10k a month ago

bahmboo 2 hours ago | parent | prev | next [-]

I work with a lot of 3D graphics and geo stuff, so I can hit the ceiling with my 48GB Mac. It's not all LLM work. I prioritized storage over RAM with my budget. Being able to run local LLMs has greatly helped me understand how they work. For day-to-day dev I pay for Gemini or Claude.

mr_mitm 6 hours ago | parent | prev | next [-]

Think commercial. My company invested in a local rig since privacy is important to our customers and sometimes I want to use these models on private data.

	Gigachad an hour ago | parent [-]
		Even in that case it would make more sense to put the hardware in a server rack shared with everyone rather than inside macbooks. At any rate it makes a stolen backpack or spilled drink a lot less damaging.

scotty79 4 hours ago | parent | prev | next [-]

Qwen3.6 runs great on a GPU with 24GB VRAM. You could get a used 3090 for it.

spike021 7 hours ago | parent | prev [-]

Certainly won't work on my M4 Pro with 24GB lol

	MatthiasPortzel 6 hours ago | parent | next [-]
		I’m using it on a 48GB machine and it causes some lag, so it might be worse on 24, but it should run. Unsloth recommends 18GB of RAM for Qwen3.6-27B (for their version of the model). https://unsloth.ai/docs/models/qwen3.6
	whynotmaybe 7 hours ago | parent | prev [-]
		I feel you! Sent from my 8GB M2 Mac mini.