I have some macro opinions about Apple - not sure if I'm correct, but tell me what you think.

Apple has always seen RAM as an economic advantage for their platform: Make the development effort to ensure that the OS and apps work well with minimal memory and save billions every year in hardware costs. In 2026, iPhones still come with 8Gb of RAM, Pro/Max come with 12Gb.

The problem is that AI (ML/LLM training and inference) are areas where you can't get around the need for copious amounts of fast working memory. (Thus the critical shortage of RAM at the moment as AI data centers consume as many memory chips as possible.)

Unless there's something I don't know (which is more than possible) Apple can't code their way around this problem, nor create specialized SoCs with ML cores that obviate the need for lots and lots of RAM.

So, it's going to be interesting whether they accept this reality and we start seeing the iPhones in the future with 16Gb, 32Gb or more as standard in order to make AI performant. And if they give up on adding AI to the billions of iPhones with minimal RAM already out there.

As a side note, 8Gb of RAM hasn't been enough for a decade. It prevents basic tasks like keeping web tabs live in the background. My pet peeve is having just a few websites open, and having the page refresh when swapping between them because of aggressive memory management.

To me, Apple's obvious strength is pushing AI to the edge as much as possible. While other companies are investing in massive data centers which will have millions of chips that will be outdated within the next couple years, Apple will be able to incrementally improve their ML/AI features by running on the latest and greatest chips every year. Apple has a huge advantage in that they can design their chips with a mega high speed bus, which is just as important as the quantity of RAM.

But all that depends on Apple's willingness to accept that RAM isn't an area they can skimp on any more, and I'm not sure they will.

Sorry for the brain dump. I'd love to be educated on this in case I'm totally off base.

▲

mlsu 3 hours ago | parent | next [-]

Models on the phone is never going to make sense.

If you're loading gigabytes of model weights into memory, you're also pushing gigabytes through the compute for inference. No matter how you slice it, no matter how dense you make the chips, that's going to cost a lot of energy. It's too energy intensive, simple as.

"On device" inference (for large LLM I mean) is a total red herring. You basically never want to do it unless you have unique privacy considerations and you've got a power cable attached to the wall. For a phone maybe you would want a very small model (like 3B something in that size) for Siri-like capabilities.

On a phone, each query/response is going to cost you 0.5% of your battery. That just isn't tenable for the way these models are being used.

Try this for yourself. Load a 7B model on your laptop and talk to it for 30 minutes. These things suck energy like a vacuum, even the shitty models. A network round trip costs gets you hundreds of tokens from a SOTA model and costs 1 joule. By contrast, a single forward pass (one token) of a shitty 7b model costs 1 joule. It's just not tenable.

	▲	russellbeattie 2 hours ago \| parent [-]
		Huh, I hadn't thought of battery limitations. Good call. My initial reaction is that bigger/better batteries, hyper fast recharge times and more efficient processors might address this issue, but I need to learn more about it. That said, power consumption is one of the reasons I think pushing this stuff to the edge is the only real path for AI in terms of a business model. It basically spreads the load and passes the cost of power to the end user, rather than trying to figure out how to pay for it at the data center level.

▲

ecshafer 4 hours ago | parent | prev | next [-]

In a recent episode of Dwarkesh the guest who is a semiconductor industry analyst predicted that an iPhone will increase in price by about $250 for the same stuff due to increased ram/chip costs from AI. Apple will not be able to afford to put a bunch more RAM into the phones and still sell them.

	▲	alwillis 2 hours ago \| parent [-]
		> In a recent episode of Dwarkesh the guest who is a semiconductor industry analyst predicted that an iPhone will increase in price by about $250 for the same stuff due to increased ram/chip costs from AI. Apple will not be able to afford to put a bunch more RAM into the phones and still sell them. Apple recently stated on an earnings call they signed contracts with RAM vendors before prices got out of control, so they should be good for a while. Nvidia also uses TSMC for their chips, which may affect A series and M series chip production. Yes, TSMC has a plant in Arizona but my understanding is they can't make the cutting edge chips there; at least not yet.

▲

zozbot234 4 hours ago | parent | prev | next [-]

RAM is just too expensive. We need to bring back non-DRAM persistent memory that doesn't have the wearout issues of NAND.

	▲	anemll 3 hours ago \| parent [-]
		multiple NAND, and apple already used it in Mac Studio. Plus better cooling

▲

big_toast 3 hours ago | parent | prev | next [-]

I think this is roughly true, but instead RAM will remain a discriminator even moreso. If the scaling laws apple has domain over are compute and model size, then they'll pretty easily be able to map that into their existing price tiers.

Pros will want higher intelligence or throughput. Less demanding or knowledgeable customers will get price-funneled to what Apple thinks is the market premium for their use case.

It'll probably be a little harder to keep their developers RAM disciplined (if that's even still true) for typical concerns. But model swap will be a big deal. The same exit vs voice issues will exist for apple customers but the margin logic seems to remain.

▲

GTP 3 hours ago | parent | prev | next [-]

> nor create specialized SoCs with ML cores that obviate the need for lots and lots of RAM

Why do you say they can't do this?

▲

ottah 4 hours ago | parent | prev [-]

Possibly this just isn't the generation of hardware to solve this problem in? We're like, what three or four years in at most, and only barely two in towards AI assisted development being practical. I wouldn't want to be the first mover here, and I don't know if it's a good point in history to try and solve the problem. Everything we're doing right now with AI, we will likely not be doing in five years. If I were running a company like Apple, I'd just sit on the problem until the technology stabilizes and matures.

▲

bigyabai 4 hours ago | parent [-]

If I was running a company like Apple, I'd be working with Khronos to kill CUDA since yesterday. There are multiple trillions of dollars that could be Apple's if they sign CUDA drivers on macOS, or create a CUDA-compatible layer. Instead, Apple is spinning their wheels and promoting nothingburger technology like the NPU and MPS.

It's not like Apple's GPU designs are world-class anyways, they're basically neck-and-neck with AMD for raster efficiency. Except unlike AMD, Apple has all the resources in the world to compete with Nvidia and simply chooses to sit on their ass.

▲

zozbot234 4 hours ago | parent [-]

CUDA is not the real issue, AMD's HIP offers source-level compatibility with CUDA code, and ZLUDA even provides raw binary compatibility. nVidia GPUs really are quite good, and the projected advantages of going multi-vendor just aren't worth the hassle given the amount of architecture-specificity GPUs are going to have.

▲

bigyabai 4 hours ago | parent [-]

Okay, then don't kill CUDA, just sign CUDA drivers on macOS instead and quit pretending like MPS is a world-class solution. There are trillions on the table, this is not an unsolvable issue.

	▲	atultw 2 hours ago \| parent [-]
		Admittedly, my use of CUDA and Metal is fairly surface-level. But I have had great success using LLMs to convert whole gaussian splatting CUDA codebases to Metal. It's not ideal for maintainability and not 1:1, but if CUDA was a moat for NVIDIA, I believe LLMs have dealt a blow to it.