dangus 8 hours ago

Then idk why they say that most laptops are bad at running LLMs. Apple has a huge market share in the laptop market, and even their cheapest laptops are capable in that realm. And their PC competitors are more likely to be generously specced in terms of included memory.

> However, for the average laptop that’s over a year old, the number of useful AI models you can run locally on your PC is close to zero.

This straight up isn’t true.

literalAardvark 7 hours ago | parent | next [-]

Apple has a 10-18% market share for laptops. That's significant but it certainly isn't "most".

Most laptops can run at best a 7-14B model, even if you buy one with a high-spec graphics chip. These are not useful models unless you're writing spam.

Most desktops have a decent amount of system memory, but that can't feed an LLM at a useful speed since inference is memory-bandwidth-bound, and the stuff you could run in 32-64GB of RAM would need lots of interaction and hand-holding anyway.
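
As a rough back-of-envelope sketch (my own illustrative numbers, not benchmarks): decode speed is roughly memory bandwidth divided by the bytes of weights streamed per generated token, which is why ordinary desktop DDR5 falls over on bigger models while wide unified memory does not.

    # Rough decode-speed estimate for a dense model at batch size 1:
    # every weight has to be read from memory once per generated token.
    def tokens_per_second(params_billion, bits_per_weight, bandwidth_gb_s):
        weight_gb = params_billion * bits_per_weight / 8  # GB streamed per token
        return bandwidth_gb_s / weight_gb

    print(tokens_per_second(70, 4, 80))   # 70B @ 4-bit on ~80 GB/s dual-channel DDR5: ~2 tok/s
    print(tokens_per_second(70, 4, 400))  # same model on ~400 GB/s unified memory: ~11 tok/s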

And that's for the easy part, inference. Training is much more expensive.

seanmcdirmid 5 hours ago | parent | next [-]

A Max chip can run 30B models quantized, and definitely has the RAM to fit them in memory. The base and Pro chips will be compute/bandwidth limited. Of course, the Ultra is even better than the Max, but it doesn't come in laptops yet.
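
Rough footprint math (my own approximation; real runtimes add KV cache and runtime overhead on top of this):

    def weights_gb(params_billion, bits_per_weight):
        # Memory needed just to hold the quantized weights.
        return params_billion * bits_per_weight / 8

    print(weights_gb(30, 4))  # ~15 GB at 4-bit
    print(weights_gb(30, 8))  # ~30 GB at 8-bit
    # Either fits in the 32-128GB unified memory configurations Max machines
    # ship with, which is why a 30B quant is comfortable on that chip.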

nunodonato 6 hours ago | parent | prev [-]

My laptop is 4 years old and I only have 6GB of VRAM. I run mostly 4B and 8B models. They are extremely useful in a variety of situations. Just because you can't replicate what you do in ChatGPT doesn't mean they don't have their use cases. It seems to me you know very little about what these models can do. Not to mention models fine-tuned for specific use cases, or even smaller models like functiongemma or TTS/ASR models. (BTW, I've trained models using my 6GB of VRAM too.)
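
For anyone curious what that looks like in practice, here's a minimal sketch using llama-cpp-python with partial GPU offload on a small card (the GGUF path, layer count, and prompt are placeholders, not my exact setup):

    from llama_cpp import Llama

    llm = Llama(
        model_path="models/some-4b-instruct-q4_k_m.gguf",  # placeholder: any small quantized GGUF
        n_gpu_layers=24,  # offload only as many layers as fit in ~6GB of VRAM
        n_ctx=4096,
    )
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Extract the action items from this meeting note: ..."}],
        max_tokens=200,
    )
    print(out["choices"][0]["message"]["content"])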

reactordev 3 hours ago | parent | next [-]

I’ll chime in and say I run LM Studio on my 2021 MacBook Pro M1 with no issues.

I have 16GB of RAM. I use Unsloth-quantized models like Qwen3 and gpt-oss. I have some MCP servers, like Context7 and Fetch, that make sure the models have up-to-date information. I use continue.dev in VS Code or OpenCode Agent with LM Studio and write C++ code against Vulkan.

It’s more than capable. Is it fast? Not necessarily. Does it get stuck? Sometimes. Does it keep getting better? With every model release on Hugging Face.

Total monthly cost: $0
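
If anyone wants to poke at a similar setup from a script: LM Studio exposes an OpenAI-compatible server on localhost, so the standard openai client works against it. A minimal sketch (the port is LM Studio's default; the model name is a placeholder for whatever you've loaded):

    from openai import OpenAI

    # LM Studio's local server speaks the OpenAI API; the key is ignored.
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    resp = client.chat.completions.create(
        model="qwen3-14b",  # placeholder: whichever model is loaded in LM Studio
        messages=[{"role": "user", "content": "Sketch Vulkan instance creation in C++."}],
    )
    print(resp.choices[0].message.content)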

literalAardvark 3 hours ago | parent | prev [-]

A few examples of useful tasks would be appreciated. I do suffer from a sad lack of imagination.

nunodonato 3 hours ago | parent [-]

I suggest taking a look at /r/LocalLLaMA to see all sorts of cool things people do with small models.

andai 8 hours ago | parent | prev | next [-]

So I hear about a lot of people running LLMs on Apple hardware. But is there actually anything useful you can run? Does it run at a usable speed? And is it worth the cost? Because the last time I checked, the answer to all three questions appeared to be no.

Though maybe it depends on what you're doing? (Although if you're doing something simple like embeddings, then you don't need the Apple hardware in the first place.)

anonzzzies 2 hours ago | parent | next [-]

I was sitting on an airplane next to a guy on a MacBook Pro something who was coding in Cursor with a local LLM. We got talking, and he said there are obviously differences, but for his style of "English coding" (he basically described what code to write and which files to change, but in English, and more loosely than code, obviously, otherwise he would just write the code) it works really well. And indeed that's what he was able to demo. The model (which was gpt-oss, I believe) did pretty well in his Next.js project, and it was fast too.

sueders101 2 hours ago | parent | prev | next [-]

I've tried out gpt-oss:20b on a MacBook Air (via Ollama) with 24GB of RAM. In my experience its output is comparable to what you'd get out of older models, and the OpenAI benchmarks seem accurate (https://openai.com/index/introducing-gpt-oss/). Definitely a usable speed: not instant, but ~5 tokens per second of output if I had to guess.
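
For exact numbers instead of a guess, Ollama's local API reports decode stats. A quick sketch (the endpoint and field names are Ollama defaults; the prompt is arbitrary):

    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "gpt-oss:20b", "prompt": "Explain unified memory in one paragraph.", "stream": False},
    ).json()

    # eval_count = tokens generated, eval_duration = decode time in nanoseconds
    print(resp["eval_count"] / (resp["eval_duration"] / 1e9), "tokens/s")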

fhsm 6 hours ago | parent | prev | next [-]

This paper shows a use case running on Apple silicon that’s theoretically valuable:

https://pmc.ncbi.nlm.nih.gov/articles/PMC12067846/

Who cares if the result is right or wrong, etc., as it will all be different in a year … it's just interesting to see a test of desktop-class hardware go OK.

seanmcdirmid 5 hours ago | parent | prev | next [-]

I have an M3 Max MBP with 64GB of RAM, and I can run a lot at a useful speed (LLMs run fine; diffusion image models run OK, although not as fast as they would on a 3090). My laptop isn't typical, though; it isn't a standard MBP with a base or Pro processor.

jki275 6 hours ago | parent | prev | next [-]

I can definitely write code with a local model like Devstral Small, a quantized Granite, or a quantized DeepSeek on an M1 Max with 64GB of RAM.

DANmode 7 hours ago | parent | prev [-]

Of course it depends on what you’re doing.

Do you work offline often?

Essential.

fancyfredbot 7 hours ago | parent | prev | next [-]

Most laptops have 16GB of RAM or less. A little more than a year ago, I think the base-model Mac laptop had 8GB of RAM, which really isn't fantastic for running LLMs.

layer8 8 hours ago | parent | prev | next [-]

By “PC”, they mean non-Apple devices.

Also, macOS only has around 10% desktop market share globally.

dangus an hour ago | parent [-]

It's actually closer to 20% globally. Apple now outsells Lenovo:

https://www.mactech.com/2025/03/18/the-mac-now-has-14-8-of-t...

DANmode 7 hours ago | parent | prev [-]

> Apple has a huge marketshare in the laptop market

Hello from outside of California!

dangus an hour ago | parent [-]

Global Mac market share is actually higher than in the US: https://www.mactech.com/2025/03/18/the-mac-now-has-14-8-of-t...

DANmode an hour ago | parent [-]

Less than 1 in 5 doesn’t feel like huge market share,

but it’s more than I have!