The 128GB M5 Max MBP I ordered at launch was $7049 and is now $9849 for the same configuration. That's nearly a 40% price increase and a bump of more than $2000. Over the same period, from launch to now, I have seen local LLMs get significantly better, to the point that I wish more people had hardware like this so they could run their workloads locally. I can't help but think society is moving in the wrong direction with this technology: it is centralizing further in hyperscalers and damaging the hardware market, which makes strong general-purpose computing even harder for individuals to obtain. The right direction would be democratizing both the hardware and the software so that most workloads can run locally.
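
For anyone who wants to check the size of the increase, here is a quick sketch using only the two prices quoted in the comment above. The function name is purely illustrative. It reports the increase both relative to the launch price and as a share of the current price, since the two give noticeably different percentages.

```python
def price_change(old: float, new: float) -> tuple[float, float, float]:
    """Return (absolute increase, % of the old price, % of the new price)."""
    diff = new - old
    return diff, 100 * diff / old, 100 * diff / new


# Prices taken from the comment above: launch price and current price.
diff, pct_of_old, pct_of_new = price_change(7049, 9849)

print(f"increase: ${diff:.0f}")
print(f"relative to launch price: {pct_of_old:.1f}%")
print(f"as a share of current price: {pct_of_new:.1f}%")
```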

▲

kamranjon 12 hours ago | parent | next [-]

I really think this is a coordinated effort to restrict computing capacity for individuals and force adoption of centralized AI. I think there is already evidence of this in the moves OpenAI has made to lock up the memory and GPU markets.

▲

aroman 11 hours ago | parent [-]

Who exactly is “coordinating” that effort? Surely everyone except the datacenter builders and the big hosted AI models has exactly the opposite incentive.

▲

kamranjon 9 hours ago | parent | next [-]

I think one of the more ominous things to see in recent years was all of the tech execs at the presidential inauguration, after having collectively donated several million dollars to the inauguration fund. So if we go with that list, which happens to overlap with many of the circular deals we’ve seen in the AI space recently, you’d have people like: Sam Altman, Jeff Bezos, Elon Musk, Mark Zuckerberg, Tim Cook, Sundar Pichai and Sergey Brin

I also wouldn’t be surprised if memory providers were intimately involved, as they’ve been caught price fixing in the past: https://en.wikipedia.org/wiki/DRAM_price_fixing_scandal

▲

bigyabai 8 hours ago | parent [-]

We have to get real, here - most people are not replacing GPT or Claude with local inference, even on M5. If you can afford to do that (RAM shortage or not), then you are in the minority of customers.

Alleviating the memory constraint would only really make Nvidia a danger to cloud margins, and their consumer sales are neutered while they focus on the datacenter segment. It feels facetious to insinuate that people would be doing inference on their Macbook Neo or Wintel laptop if they only had a gorbillion gigabytes of memory and a 400W accelerator card plugged into the wall outlet.

▲

kamranjon 8 hours ago | parent [-]

You’re out of the loop if you don’t think M-series chips with unified memory are one of the best platforms for running local inference.

▲

bigyabai 8 hours ago | parent [-]

They aren't. Apple Silicon is unusable for interactive prefill and decode speeds in agentic workflows and SOTA LLMs.

▲

kamranjon 7 hours ago | parent [-]

You’re just out of the loop, and that’s fine but it’s worth learning about.

There is a pretty large and growing community of us using entirely local models for our agentic flows: from GLM 4.7 Flash on 32GB machines at >60 tok/s, to Gemma and Qwen dense and MoE models on 64GB machines, all the way up to Deepseek V4 Flash on 128GB machines with 450 tok/s prefill and 25-30 tok/s decode.

I use DS4 on the daily - it’s become my main model.

I know it’s in fashion to talk trash about Apple, but their hardware outperforms other options like DGX Spark when it comes to local inference. They’ve got the unified memory, the memory bandwidth and the GPU cores to actually be useful in a way that most other hardware just isn’t.

▲

aroman 7 hours ago | parent | next [-]

My hardware isn't powerful enough to try, so I'm asking out of genuine curiosity, not to push back: what do you use DS4 for? Did it replace e.g. Claude Code with Opus for you, or was it replacing something else?

▲

kamranjon 3 hours ago | parent [-]

I use it as my main coding agent: it's running the DS4 server on my 128GB MBP, and I run the pi coding agent on my other machine, which calls out to it. Mostly Go and TypeScript work. I also use it in local agent mode if I'm coding directly on the machine, which is nice because you can save sessions and resume them, so for personal projects and training-related stuff it's been great. I even got an autoresearch loop going where the agent looks at the previous run, adjusts parameters and code if needed, and then hands off training to another script (so full system resources are available for training), ad infinitum. It works really well. What antirez has done with that project is pretty incredible.

▲

johncalvinyoung 5 hours ago | parent | prev | next [-]

Isn't Deepseek V4 Flash still like 150+ GB even at Q4?

▲

bigyabai 5 hours ago | parent | prev [-]

> From GLM 4.7 flash

GLM 4.7 Flash is a 30B model that was far behind SOTA at launch, and I know that because I pay for z.ai inference and have run the model locally. Qwen and Deepseek V4 Flash have the same issue, and raise the question: are you really going to process a 64k agentic context at 450 tok/s? That's 2+ minutes that you spend waiting for the first token to generate! Of course nobody can sell that as competitive inference, and it only gets worse with larger models. We're talking about non-interactive speeds, here.

If you're satisfied with small local models, more power to you. It puts you in the same barrel as Strix Halo enthusiasts or the guys that bought 2x3090s on Reddit. You are completely ignoring the market if you think that any of those SoCs are unprecedented or unparalleled for inference workloads, though. The free DS4 API is faster at prefill and decode; you could not give away Mac inference at zero cost and compete with what China provides for free. That's how far behind Macs are for local inference, to put things into perspective.
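
The "2+ minutes" figure above follows from dividing context length by prefill speed. Here is a rough sketch of that arithmetic, using the 450 tok/s prefill and 25-30 tok/s decode numbers quoted upthread. The assumptions are mine: "64k" is taken as 64,000 tokens, the whole context is processed from scratch with no cached prefix, and the 500-token reply length is a made-up illustration. The function names are hypothetical.

```python
def prefill_seconds(context_tokens: int, prefill_tok_per_s: float) -> float:
    """Time spent processing the prompt before the first output token."""
    return context_tokens / prefill_tok_per_s


def total_seconds(context_tokens: int, prefill_tok_per_s: float,
                  output_tokens: int, decode_tok_per_s: float) -> float:
    """Prefill time plus the time to generate the reply."""
    decode = output_tokens / decode_tok_per_s
    return prefill_seconds(context_tokens, prefill_tok_per_s) + decode


# Assumption: "64k" means 64,000 tokens, all processed from scratch.
ttft = prefill_seconds(64_000, 450)
print(f"time to first token: {ttft:.0f} s ({ttft / 60:.1f} min)")

# Assumption: a 500-token reply at the low end of the quoted decode range.
total = total_seconds(64_000, 450, 500, 25)
print(f"total for a 500-token reply: {total:.0f} s")
```

If the serving stack reuses a cached prefix between agent turns, only the new tokens need prefill, so the per-turn wait can be much shorter than this worst case.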

▲

fsflover 10 hours ago | parent | prev | next [-]

https://news.ycombinator.com/item?id=48673500

▲

varispeed 11 hours ago | parent | prev | next [-]

The rich.

▲

angoragoats 11 hours ago | parent | prev [-]

> Who exactly is “coordinating” that effort?

The datacenter builders and the big hosted AI models. The person you're replying to even mentions OpenAI by name.

▲

jnwatson 10 hours ago | parent | prev | next [-]

I would get nervous carrying around a $10k laptop.

▲

tristor 10 hours ago | parent [-]

I get more nervous not carrying it around when I travel. It's a lot easier to steal things that aren't on your person. That said, I get what you mean. I cover my photography gear with insurance, and the computer, since it is used for my photography (in addition to local LLMs), is covered under that insurance too.

▲

sixothree 11 hours ago | parent | prev | next [-]

I had one in my cart last night. It seems far less appealing today.

There are two things that would prevent people from using local models - pricing and regulations. And we're seeing moves from both of those fronts lately.

▲

tristor 12 hours ago | parent | prev [-]

Related, I just realized that Apple uses the same numeric price in multiple regions but just changes the currency. At current prices, you'd save $3149 USD flying from London to New York City (minus travel costs) to buy a maxed out 14" MBP vs buying it in London, since the price is 9849 GBP vs 9849 USD...
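
As a sanity check on the $3149 figure, here is a small sketch that backs out the exchange rate it implies. It assumes the USD price is 9849, matching the price in the top comment and the "same numeric price" observation. The function names are illustrative, and no live exchange rate is used.

```python
def implied_usd_per_gbp(gbp_price: float, usd_price: float,
                        saving_usd: float) -> float:
    """Exchange rate at which buying in USD saves `saving_usd`."""
    return (usd_price + saving_usd) / gbp_price


def saving_usd(gbp_price: float, usd_price: float, usd_per_gbp: float) -> float:
    """USD saved by paying the USD price instead of the GBP price."""
    return gbp_price * usd_per_gbp - usd_price


# Assumption: both prices are 9849 in their respective currencies.
rate = implied_usd_per_gbp(9849, 9849, 3149)
print(f"implied rate: {rate:.4f} USD per GBP")
print(f"saving at that rate: ${saving_usd(9849, 9849, rate):.0f}")
```

The comparison ignores travel costs, sales tax on the US side and any duties on the way back, all of which are discussed in the replies.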

▲

jorvi 11 hours ago | parent | next [-]

The EU price includes the warranty, which is at least 2 years but is officially for "the expected life of the product", which in the case of a $10,000 laptop would probably be a decade plus.

▲

medvezhenok 4 hours ago | parent | next [-]

You can get AppleCare+ in the U.S. for $149/year, which is just as good as (or better than) any warranty.

▲

freediddy 10 hours ago | parent | prev [-]

Do you really think the warranty justifies that price differential? A warranty only protects against manufacturer defects.

▲

stockresearcher 11 hours ago | parent | prev | next [-]

> you'd save $3149 USD flying from London to New York City

Hey, Infantino was ahead of the curve! For the same price as an English MBP, you can get an American one and see the Three Lions disappoint against Panama!

▲

orlp 11 hours ago | parent | prev [-]

You save a lot less after paying import duties.

▲

tristor 11 hours ago | parent [-]

Do you pay import duties in the UK on items purchased for personal use? The situation is changing constantly in the US, but generally speaking you only pay duties over a certain dollar amount in value if you intend to keep the item in the country after importation (and a MBP would be over that amount), and it's a fairly small percentage (around $400 in duties on the $3149 saved here). I'm not sure how it'd work in the UK.

▲

orlp 8 hours ago | parent [-]

It seems like there aren't extra duties (anymore), but then again it's all very confusing and hard to navigate so who knows.