I think a major incentive could be to sell hardware. If Apple is able to get their hands on a local LLM capable of covering a significant % of what people use ChatGPT for, the pitch they can offer is:

"Free, private, offline ChatGPT so long as your laptop has X GB of RAM"

Beyond that, I wouldn't underestimate the incentive of "because I can". The "secret sauce" you refer to is effectively just a DB & a while loop that feeds text to a bunch of tensors. If an indie dev decides they want to release something that dismantles the OpenAI & Anthropic moats, there really isn't all that big of a technical barrier stopping them.

▲

bigyabai an hour ago | parent [-]

LLM inference decode is heavily dependent on memory speed, not just having lots of memory. You can't say "X amount of ram" because the memory bandwidth on an M1 is 68.3 GB/s versus the 614 GB/s of an M5 Max, or a 4090's 1.01 TB/s over GDDR6X.

This basically creates a bottleneck at the oldest/cheapest Apple Silicon machines, which are already crippled for context prefill.
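
As a back-of-envelope sketch of why bandwidth sets the ceiling: during decode, each generated token has to stream roughly all of the active weights from memory once, so tokens/s is bounded above by bandwidth divided by the size of the weights. The model size (8B parameters) and the 4-bit quantization (0.5 bytes per parameter) below are assumptions picked for illustration, and the function name is made up. Real throughput will be lower because of KV cache reads, compute, and overhead.

```python
# Rough upper bound on decode speed, assuming decode is purely
# memory-bandwidth bound and every token reads all weights once.

def decode_tokens_per_sec(bandwidth_gb_s: float,
                          params_billions: float,
                          bytes_per_param: float) -> float:
    """Ceiling on tokens/s = memory bandwidth / bytes of weights read per token."""
    weight_gb = params_billions * bytes_per_param  # e.g. 8B * 0.5 B = 4 GB
    return bandwidth_gb_s / weight_gb

# Bandwidth figures quoted in the comment above (GB/s).
BANDWIDTH_GB_S = {"M1": 68.3, "M5 Max": 614.0, "RTX 4090": 1010.0}

if __name__ == "__main__":
    # Assumed model: 8B parameters at 4-bit quantization (hypothetical example).
    for name, bw in BANDWIDTH_GB_S.items():
        ceiling = decode_tokens_per_sec(bw, params_billions=8, bytes_per_param=0.5)
        print(f"{name}: ~{ceiling:.0f} tok/s ceiling")
```

With those assumptions the M1 tops out at about 17 tok/s while the M5 Max is around 150, even if both machines have enough RAM to hold the model.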

▲

h14h an hour ago | parent [-]

Thanks for clarifying -- I was oversimplifying.

But honestly, obsoleting a huge number of otherwise great Apple Silicon machines is something Apple would consider a major "pro" of building a compelling local AI stack.

Given how much speculation there is about the difficulty Apple has had getting people to upgrade from M1, I'm sure they'd jump at such an opportunity.

▲

bijowo1676 an hour ago | parent [-]

This might be a way for Apple to milk product revenue for many years:

- Please buy our new MacBook Pro M5 that gives you 20 tokens/s on a local 80B LLM
- Next year: please buy our new MacBook Pro M6 that gives you 25 tokens/s on a local 80B LLM

Milking product revenue in perpetuity by offering meaningful marginal improvements while keeping the same architecture will be the golden goose for Apple. Plus, if it lets them segment the market by wallet size into poor/middle/rich classes, that's even better.