seanmcdirmid a day ago

I’ve been running LLMs on my laptop (M3 Max 64GB) for a year now and I think they are ready, especially with how good mid sized models are getting. I’m pretty sure unified memory and energy efficient GPUs will be more than just a thing on Apple laptops in the next few years.

noman-land 7 hours ago | parent | next [-]

You doing code completion and agentic stuff successfully with local models? Got any tips? I've been out of the game for [checks watch] a few months and am behind on the latest. Is Cline the move?

seanmcdirmid 5 hours ago | parent [-]

I haven't bothered doing code completion locally yet, though it's something I want to try with the Qwen model. I'm mostly using it to generate/fix code CLI-style.

allovertheworld a day ago | parent | prev [-]

Only because of Apple's unified memory architecture. The groundwork is there, we just need memory to be cheaper so we can fit 512+GB now ;)

zmmmmm 5 hours ago | parent | next [-]

In the end, there's not much point in having more memory than you can compute over in a reasonable time. So I think the useful amount probably tops out around 128GB, where you can still run a 70B model and get a useful token rate out of it.
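The "useful token rate" intuition can be made concrete with a back-of-envelope calculation: decoding a dense model is roughly memory-bandwidth-bound, since each generated token has to stream approximately all of the weights through the memory bus once. The bandwidth and quantization figures below are illustrative assumptions, not measurements of any specific machine:

```python
# Rough upper bound on decode speed for a dense LLM, assuming decoding is
# memory-bandwidth-bound: tokens/sec ~ bandwidth / size-of-weights.

def est_tokens_per_sec(params_billion: float, bits_per_weight: float,
                       bandwidth_gb_s: float) -> float:
    """Estimate decode tokens/sec from model size and memory bandwidth."""
    weight_size_gb = params_billion * bits_per_weight / 8  # weights in GB
    return bandwidth_gb_s / weight_size_gb

# Example: a 70B model quantized to 4 bits (~35 GB of weights) on a
# unified-memory machine with an assumed ~400 GB/s of bandwidth:
print(round(est_tokens_per_sec(70, 4, 400), 1))  # ~11.4 tokens/sec
```

Under these assumptions a 70B model lands in the ~10 tokens/sec range, which is why piling on much more memory than bandwidth can feed stops being useful.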

seanmcdirmid a day ago | parent | prev [-]

Memory prices will rise short term and generally fall long term; even with the current supply hiccup, the answer is just to build out more capacity (which will happen if there is healthy competition). I mean, I expect the other mobile chip providers to adopt unified memory architectures and beefy on-chip GPU cores, with lots of bandwidth connecting them to memory (at the Max or Ultra level, at least). I think AMD is already doing unified memory?

spwa4 15 hours ago | parent [-]

> Memory prices will rise short term and generally fall long term, even with the current supply hiccup the answer is to just build out more capacity (which will happen if there is healthy competition)

Don't worry! Sam Altman is on it. Making sure there never is healthy competition that is.

https://www.mooreslawisdead.com/post/sam-altman-s-dirty-dram...

seanmcdirmid 9 hours ago | parent [-]

We’ve been through multiple scarcity/surplus DRAM cycles in the last couple of decades. Why do we think it will be different now?

re-thc 8 hours ago | parent [-]

> Why do we think it will be different now?

Margins. AI usage can pay a lot more. Even if they sell less, they can still be more profitable.

In the past there wasn’t a high margin usage. Servers didn’t charge such a high premium.

seanmcdirmid 5 hours ago | parent | next [-]

Do you not think that some DRAM producer is going to see the high margins as a signal to create more capacity and get ahead of the other DRAM producers? That's how it has always worked before, but somehow it's different this time?

zozbot234 7 hours ago | parent | prev [-]

High margins are exactly what should create a strong incentive to build more capacity. But that dynamic has been tamped down so far because we're all scared of a possible AI bubble that might pop at any moment.