| ▲ | gigatexal 4 days ago |
| You’ll pry the ARM M series chips of my Mac from my cold dead hands. They’re a game changer in the space and one of the best reasons to use a Mac. I am not a chip expert; it’s just night-and-day different using a Mac with an ARM chip compared to an Intel one, from thermals to performance to battery life and everything in between. Intel isn’t even in the same ballpark imo. But competition is good, and let’s hope they both do well -- Intel and AMD -- because the consumer wins. |
|
| ▲ | mort96 4 days ago | parent | next [-] |
| I have absolutely no doubt in my mind that if Apple's CPU engineers got half a decade and a mandate from the higher ups, they could make an amazing amd64 chip too. |
| |
|
| ▲ | kccqzy 4 days ago | parent | prev | next [-] |
| That's not mostly because of a better ISA. If Intel and Apple had a chummier relationship you could imagine Apple licensing the Intel x86 ISA and the M series chips would be just as good but running x86. However I suspect no matter how chummy that relationship was, business is business and it is highly unlikely that Intel would give Apple such a license. |
| |
| ▲ | FlyingAvatar 4 days ago | parent | next [-] | | It's pretty difficult to imagine. Apple did a ton of work on the power efficiency of iOS on their own ARM chips for iPhone for a decade before introducing the M1. Since iOS and macOS share the same code base (even when they were on different architectures) it makes much more sense to simplify to a single chip architecture that they already had major expertise with and total control over. There would be little to no upside for cutting Intel in on it. | | |
| ▲ | jopsen 3 days ago | parent [-] | | Isn't it also easier to license ARM, because that's the whole point of the ARM Corporation? It's not like Intel or AMD are known for letting others customize their existing chip designs. | |
| ▲ | rahkiin 3 days ago | parent | next [-] | | Apple was a very early investor in ARM and is one of the few with a perpetual license of ARM tech | | |
| ▲ | nly 3 days ago | parent | next [-] | | And an architectural license that lets them design their own cores, I believe |
| ▲ | 3 days ago | parent | prev [-] | | [deleted] |
| |
| ▲ | mandevil 3 days ago | parent | prev [-] | | Intel and AMD both sell quite a lot of customized chips, at least in the server space. As one example, any EC2 R7i or R7a instance you have is not running on a Sapphire Rapids or EPYC processor that you could buy, but instead on one customized for AWS. I would presume that other cloud providers have similar deals worked out. |
|
| |
| ▲ | x0x0 4 days ago | parent | prev [-] | | > That's not mostly because of a better ISA Genuinely asking -- what is it due to? Because, like the person you're replying to said, the M* processors are simply better: desktop-class perf on battery that hangs with chips that have a 250 watt TDP. I have to assume that AMD and Intel would like similar chips, so why don't they have them, if not due to the instruction set? And AMD is using TSMC, so that can't be the difference. | |
| ▲ | toast0 4 days ago | parent | next [-] | | I think the fundamental difference between an Apple CPU and an Intel/AMD CPU is that Apple does not play in the megahertz war. The Apple M1 chip, launched in 2020, clocks at 3.2 GHz; Intel and AMD can't sell a flagship mobile processor that clocks that low. Zen+ mobile Ryzen 7s released in Jan 2019 have a boost clock of 4 GHz (ex: 3750H, 3700U); mobile Zen 2 from Mar 2020 clocks even higher (ex: 4900H at 4.4, 4800H at 4.2). Intel Tiger Lake was hitting 4.7 GHz in 2020 (ex: 1165G7). If you don't care to clock that high, you can reduce space and power requirements at all clocks; AMD does that for the Zen4c and Zen5c cores, but they don't (currently) ship an all-compact-core mobile processor. Apple can sell a premium branded CPU where there's no option to burn a lot of power to get a little faster; AMD and Intel just can't -- people may say they want efficiency, but having higher clocks is what makes an x86 processor premium. In addition to the basic efficiency improvements you get by having a clock limit, Apple also utilizes wider execution: they can run more things in parallel. This is enabled to some degree by the lower clock rates, but also by the commitment to higher memory bandwidth via on-package memory; being able to count on higher bandwidth means you can expect more operations to be waiting on execution rather than waiting on memory, so wider execution has more benefits. IIRC, Intel released some chips with on-package memory, but they can't easily just drop a couple more integer units onto an existing core. The weaker memory model of ARM does help as well. The M series chips have a much wider out-of-order window, because they don't need to spend as much effort on ordering constraints (except when running in the x86 support mode); this also helps justify wider execution, because they can keep those units busy. I think these three things are listed in order of impact, but I'm just an armchair computer architecture philosopher. | |
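(A rough back-of-the-envelope on why the "wide and slow" trade works -- these are just the textbook first-order CMOS relations, not figures from the comment above; α is the activity factor, C the switched capacitance, V the supply voltage, f the clock:)

    P_dynamic ≈ α · C · V² · f        (switching power)
    perf_1T   ≈ IPC · f               (single-thread throughput)

Because V generally has to rise along with f near the top of a part's voltage/frequency curve, power grows much faster than linearly with clock at the high end, while performance only grows linearly. So a wider core (higher IPC) at a lower clock can, to first order, match a high-clocked core's throughput at a fraction of the power -- paid for in die area instead of watts.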
| ▲ | fluoridation 3 days ago | parent | next [-] | | Does anyone actually care at all about frequencies? I care if my task finishes quickly. If it can finish quickly at a low frequency, fine. If the clock runs fast but the task doesn't, how is that a benefit? My understanding is that both Intel and AMD are pushing high clocks not because it's what consumers want, but because it's the only lever they have to pull to get more gains. If this year's CPU is 2% faster than your current CPU, why would you buy it? So after they have their design they cover the rest of the target performance gain by cranking the clock, and that's how you get 200 W desktop CPUs. >the commitment to higher memory bandwidth via on package memory; being able to count on higher bandwidth means you can expect to have more operations that are waiting on execution rather than waiting on memory, so wider execution has more benefits. I believe you could make a PC (compatible) with unified memory and a 256-bit memory bus, but then you'd have to make the whole thing. Soldered motherboard, CPU/GPU, and RAM. I think at the time the M1 came out there weren't any companies making hardware like that. Maybe now that x86 handhelds are starting to come out, we may see laptops like that. | | |
| ▲ | Yizahi 3 days ago | parent [-] | | It's only recently that consumer software has become truly multithreaded; historically there were major issues with that. Remember the Bulldozer fiasco? AMD bet on parallel execution more than Intel did at the time, e.g. the same-price Intel chip was 4 cores while AMD had 8 (in the consumer market). Single-thread performance had been the deciding factor for decades. Even today, AMD's outlier SKUs with a lot of cores and slightly lower frequencies (like 500 MHz lower or so) are not the topic of the day in any media or forum community. People talk about either the top-of-the-line SKU or something with a low core count but clocking high enough to be reasonable for lighter use. Releasing a low frequency, high core count part for consumers would be greeted with questions like "what is this CPU for?". | |
| ▲ | fluoridation 3 days ago | parent | next [-] | | Are we just going to pretend that frequency = single-thread performance? I'm fine with making that replacement mentally; I just want to confirm we're all on the same page here. > Releasing a low frequency, high core count part for consumers would be greeted with questions like "what is this CPU for?" It's for homelab and SOHO servers. It won't get the same attention as the sexy parts... because it's not a sexy part. It's something put in a box and stuffed in a corner to chug away for ten years without looking at it again. |
| ▲ | wmf 3 days ago | parent | prev | next [-] | | > low frequency high core count part for consumers That's not really what we're talking about. Apple's cores are faster yet lower clocked. (Not just faster per clock but absolutely faster.) So some people are wondering if Intel/AMD targeting 6 GHz actually reduced performance. |
| ▲ | gigatexal 3 days ago | parent | prev [-] | | But the OS has been able to take advantage of it since Snow Leopard, with Grand Central Dispatch (I could be wrong on the code name). This makes doing parallel things very easy. But most every OS can do this now. | |
| ▲ | astrange 3 days ago | parent [-] | | Parallelism is actually very difficult and libdispatch is not at all perfect for it. Swift concurrency is a newer design and gets better performance by being /less/ parallel. (This is mostly because resolving priority inversions turns out to be very important on a phone, and almost no one designs for this properly because it's not important on servers.) |
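(A minimal sketch of the two APIs being contrasted here, for anyone who hasn't used them -- the function name and the work inside the closures are made up for illustration, not taken from either comment:)

    import Dispatch

    // Grand Central Dispatch: fan a fixed-size loop out across the
    // global concurrent queue and wait for all iterations to finish.
    DispatchQueue.concurrentPerform(iterations: 8) { chunk in
        // ... process chunk ...
    }

    // Swift structured concurrency: child tasks run on a cooperative
    // pool sized to the core count, and priority propagates through
    // the task tree (the priority-inversion point made above).
    func sumOfSquares(_ values: [Int]) async -> Int {
        await withTaskGroup(of: Int.self) { group in
            for v in values {
                group.addTask { v * v }   // placeholder work
            }
            return await group.reduce(0, +)
        }
    }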
|
|
| |
| ▲ | cosmic_cheese 3 days ago | parent | prev [-] | | > Apple can sell a premium branded CPU where there's no option to burn a lot of power to get a little faster; AMD and Intel just can't -- people may say they want efficiency, but having higher clocks is what makes an x86 processor premium. I think this is very context dependent. Is this a big, heavy 15”+ desktop-replacement notebook where battery life was never going to be a selling point in the first place? One of those with a power brick that could be used as a dumbbell? Sure, push those clocks. In a machine that’s more balanced or focused on portability, however, high clock speeds do nothing but increase the likelihood of my laptop sounding like a jet and chewing through battery. In that situation higher clocks make a laptop feel less premium, because it’s worse at its core use case for practically no gain in exchange. |
| |
| ▲ | exmadscientist 4 days ago | parent | prev | next [-] | | > I have to assume that amd and intel would like similar chips They historically haven't. They've wanted the higher single-core performance and frequency and they've pulled out all the stops to get it. Everything had been optimized for this. (Also, they underinvested in their uncores, the nastiest part of a modern processor. Part of the reason AMD is beating Intel right now despite being overall very similar is their more recent and more reliable uncore design.) They are now realizing that this was, perhaps, a mistake. AMD is only now in a position to afford to invest otherwise (they chose quite well among the options actually available to them, in my opinion), but Intel has no such excuse. | | |
| ▲ | x0x0 4 days ago | parent [-] | | Not arguing, but I would think there is (and always has been) very wide demand for the fastest single-core perf. From all the usual suspects? Thank you. | |
| ▲ | MBCook 3 days ago | parent | next [-] | | Oh there certainly is. And there’s a reason Apple works hard for really fast single-core performance. For a lot of tasks it still matters. I suspect one of the issues is that pushing the clock is a really easy way to get an extra 2% so you can claim the crown of fastest or try to win benchmarks. It’s easy to fall into a trap of continuing to do that over and over. But we know the long-term result. You end up blasting out a ton of heat and drawing a ton of power, even though you may only be 10% faster than a competitor who did things differently. Or worse, you try to optimize for ever-increasing clocks and get stuck like the Pentium 4. As said upthread, no one really compares Apple CPU speeds with megahertz. That’s partially because Apple doesn’t talk about it or emphasize it, which makes it more difficult, and partially because it’s not like you have a choice anyway. It would never happen, but it would be interesting to see how things would develop if it was possible to simply ban talking about clock speeds somehow. What would that do to the market? |
| ▲ | exmadscientist 3 days ago | parent | prev [-] | | Only Intel and AMD actually attempt to deliver the fastest single-thread performance. Apple has made the decision that almost-but-not-quite-the-fastest is good enough for them. And that has made all the difference. | |
| ▲ | aurareturn 3 days ago | parent [-] | | You’ve been saying that this whole thread but you’ve not provided any evidence. |
|
|
| |
| ▲ | bryanlarsen 4 days ago | parent | prev | next [-] | | What's it due to? At least this, probably more:
- A more advanced silicon process. Apple spends billions to get access to the latest generation a couple of years before AMD.
- A world-class team, with ~25 years of experience building high-speed, low-power chips. (Apple bought PA Semi to make these chips, which was originally the team that built the DEC StrongARM.) And then paid & treated them properly, unlike Intel & AMD.
- A die budget to spend transistors on performance: the M chips are generally quite large compared to the competition.
- ARM's weak memory model also helps, but it's very minor IMO compared to the above 3. | |
| ▲ | aurareturn 3 days ago | parent | next [-] | | > a die budget to spend transistors on performance: the M chips are generally quite large compared to the competition
This is a myth. Apple chips are no bigger than the competition. For example, base M4 is smaller than Lunar Lake but is more efficient and 35% faster. M4 Pro is smaller than Strix Halo by a large margin but generally matches/exceeds the performance. Only the M4 Max is very large but it has no equivalent in the x86 world. | |
| ▲ | x0x0 4 days ago | parent | prev | next [-] | | interesting, ty re: apple getting exclusive access to the best fab stuff: https://appleinsider.com/articles/23/08/07/apple-has-sweethe... . Interesting. | | |
| ▲ | MBCook 3 days ago | parent [-] | | At the same time they have a guaranteed customer who will buy the chips. How many other companies would be willing to try a process with a 30% success rate? I think Apple helps them with money (loan?) to get some of the equipment or build the new lines. In exchange they get first shot at buying capacity. And of course Apple is certainly paying for the privilege of the best process. At least more than other companies are willing. And they must buy a pretty tremendous volume across a couple of sizes. It benefits both companies, otherwise they wouldn’t do it. |
| |
| ▲ | astrange 3 days ago | parent | prev | next [-] | | > And then paid & treated them properly, unlike Intel & AMD Relatively properly. Nothing like the pampering software people get. I've heard Mr. Srouji is very strict about approving promotions personally etc. (…by heard I mean I read Blind posts) | |
| ▲ | gigatexal 4 days ago | parent | prev [-] | | How many of those engineers remain? Didn't a lot go to Nuvia, which was then bought by Qualcomm? | |
| ▲ | bryanlarsen 4 days ago | parent | next [-] | | Sure, but they were there long enough to train and instill culture into the others. And of course, since the acquisition in 2008 they've had access to the top new grads and experienced engineers. If you're coming out top of your class at an Ivy or similar you're going to choose Apple over Intel or AMD both because of rep and the fact that your offer salary is much better. P.S. hearsay and speculation, not direct experience. I haven't worked at Apple and anybody who has is pretty closed lip. You have to read between the lines. P.P.S. It's sort of a circular argument. I say Apple has the best team because they have the best chip && they have the best chip because they have the best team. But having worked (briefly) in the field, I'm very confident that their success is much more likely due to having the best team rather than anything else. | |
| ▲ | MBCook 3 days ago | parent | prev [-] | | And isn’t that the reason people think some of the most recent Qualcomm chips are so much better? |
|
| |
| ▲ | ThrowawayR2 3 days ago | parent | prev | next [-] | | Intel and AMD are after the very high profit margins of the enterprise server market. They have much less motivation to focus on power efficient mobile chips which are less profitable for them. Apple's primary product is consumer smartphones and tablets so they are solely focused on power efficient mobile chips. | |
| ▲ | bsder 3 days ago | parent | prev [-] | | > Genuinely asking -- what is it due to? Mostly the memory/cache subsystem. Apple was willing to spend a lot of transistors on cache because they were optimizing the chips purely for mobile and could bury the extra cost in their expensive end products. You will note that after the initial wins from putting stonking amounts of cache and memory bandwidth in place, Apple has not had any significant performance jump beyond the technology node improvements. | |
| ▲ | x0x0 3 days ago | parent | next [-] | | I still don't understand though. Given their profit margins, the fact that they're shipping M chips in e.g. $1k computers means it's a $150 part. There are tons of people who would pay $300+ for an x86 competitor with equivalent perf and heat. |
| ▲ | astrange 3 days ago | parent | prev | next [-] | | They aren't aiming for performance in the first place. It's a coincidence that it has good performance. They're aiming for high performance/power ratios. | |
| ▲ | MBCook 3 days ago | parent | prev [-] | | Wasn’t the M3 a reasonable increase and the M4 much more significant than that? The M2 certainly wasn’t an amazing jump. |
|
|
|
|
| ▲ | pengaru 4 days ago | parent | prev | next [-] |
| Your Intel Mac was stuck in the past while everyone paying attention on the PC side was already enjoying TSMC 7nm silicon in the form of AMD Zen processors. Apple Silicon Macs are far less impressive if you came from an 8c/16t Ryzen 7 laptop, especially if you consider the Apple parts are consistently one TSMC node ahead of AMD (e.g. 5nm (M1) vs. 7nm (Zen 2)). What's _really_ impressive is how badly Intel fell behind and how TSMC has been absolutely killing it. |
| |
| ▲ | jeswin 3 days ago | parent | next [-] | | > Your Intel Mac was stuck in the past while everyone paying attention on the PC side was already enjoying TSMC 7nm silicon in the form of AMD Zen processors. This is basically it. Coming from dated Intel CPUs, Mac users got a shockingly good upgrade when the M-series computers were released. That amplified Apple's claims of Macs being the fastest computers, even when some key metrics (such as disk performance) were significantly behind PC parts in reality. Yes, they're still better in performance/watt - but the node difference largely explains it, like you were saying. |
| ▲ | gigatexal 4 days ago | parent | prev [-] | | Whatever that Ryzen laptop chip can do, the Apple chip will just do it at a higher perf/watt... and on a laptop that's a key metric. | |
| ▲ | tracker1 4 days ago | parent [-] | | And 20% or so of that difference is purely the fab node difference, not anything to do with the chip design itself. Strix Halo is a much better comparison, though Apple's M4 models do very well against it, often besting it at the most expensive end. On the flip side, if you look at servers... compare a 128+ core AMD server CPU vs a large-core-count ARM option and AMD's perf/watt is much better. | |
| ▲ | gigatexal 3 days ago | parent [-] | | Wait, are you saying the diff in perf per watt from Apple ARM to x86 is purely down to fab leading-edge-ness? | |
| ▲ | Jensson 3 days ago | parent [-] | | Basically yeah, if you compare CPUs from the same fab then it's basically the same. It's just that Apple buys next-gen fab capacity while AMD and Intel have to be on the last gen, so the M computers people compare are always one fab gen ahead. It has very little to do with CPU architecture. They do have some cool stuff in their CPUs, but the thing most people laud them for has to do with fabs. | |
| ▲ | addaon 3 days ago | parent | next [-] | | There's another difference -- willingness to actually pay for silicon. The M1 Max is a 432 mm^2 laptop chip built on a 5 nm process. Contrast that to AMD's "high end" Ryzen 7 8845HS at 178 mm^2 on a 4 nm process. Even the M1 Pro at 245 mm^2 is bigger than this. More area means not just more peak performance, but the ability to use wider paths at lower speeds to maintain performance at lower power. 432 mm^2 is friggin' huge for a laptop part, and it's really hard to compete with what that can do on any metric besides price. | | |
| ▲ | MindSpunk 3 days ago | parent | next [-] | | Comparing the M1 Max to a Ryzen 7 8845HS is not a fair comparison because the M1 chip also includes a _massive_ GPU tile, unlike the 8845HS which has a comparatively tiny iGPU because most vendors taking that part are pairing them with a separate dGPU package. A better comparison is to take the total package area of the AI Max+ 395 that includes a 16 core CPU + a massive GPU tile and you get ~448mm^2 across all 3 chiplets. | |
| ▲ | tracker1 3 days ago | parent | prev | next [-] | | Apple's SoC does a bit more than AMD's, such as including the SSD controller. I don't know if Apple is grafting different nodes together for chiplets, etc. compared to AMD on desktop. The area has nothing to do with peak performance... based on the node, it has to do with the amount of components you can cram into a given space. The CRAY-1 CPU was massive compared to both of your examples, but doesn't come close to either in terms of performance. Also, the Ryzen AI Max+ 395 is top dog on the AMD mobile CPU front and is around 308mm^2 combined. | |
| ▲ | addaon 3 days ago | parent [-] | | > The area has nothing to do with peak performance... based on the node, it has to do with the amount of components you can cram into a given space. Of course it does. For single-threaded performance, the knobs I can turn are clockspeed (minimal area impact for higher speed standard cells, large power impact), core width (significant area impact for decoder, execution resources, etc, smaller power impact), and cache (huge area impact, smaller power impact). So if I want higher single-threaded performance on a power budget, area helps. And of course for multi-threaded performance the knobs I have are number of cores, number of memory controllers, and last-level cache size, all of which drive area. There's a reason Moore's law was so often interpreted as talking about performance and not transistor count -- transistor count gives you performance. If you're willing to build a 432 mm^2 chip instead of a 308 mm^2 chip iso-process, you're basically gaining a half-node of performance right there. | | |
| ▲ | tracker1 3 days ago | parent [-] | | Transistor count does not equal performance. More transistors isn't necessarily going to speed up any random single-threaded bottleneck. Again, the CRAY-1 CPU is around 42000 mm^2, so I'm guessing you'd rather run that today, right? |
|
| |
| ▲ | gigatexal 3 days ago | parent | prev [-] | | True, the M1 Pro and Max chips were capable of 200GB/s and 400GB/s of bandwidth between the chip and the integrated memory. No desktop chips had anything like that at the time, I think. |
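(Quick sanity check on those figures, assuming -- as widely reported -- a 256-bit LPDDR5-6400 interface on the M1 Pro and a 512-bit one on the M1 Max:)

    256 bit = 32 bytes/transfer × 6400 MT/s ≈ 204.8 GB/s   (M1 Pro)
    512 bit = 64 bytes/transfer × 6400 MT/s ≈ 409.6 GB/s   (M1 Max)

For comparison, a typical dual-channel DDR4-3200 desktop of the era works out to 2 × 8 bytes × 3200 MT/s ≈ 51.2 GB/s.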
| |
| ▲ | aurareturn 3 days ago | parent | prev | next [-] | | > Basically yeah, if you compare CPUs from the same fab then it's basically the same.
This isn't true. If you compare N5 Apple to N5 AMD chips, Apple chips still come out far ahead in efficiency. | |
| ▲ | gigatexal 3 days ago | parent | prev [-] | | Man, that either hella discounts all the amazing work Apple’s CPU engineers are doing or hypes up what AMD’s have done. Idk | |
| ▲ | Jensson 3 days ago | parent | next [-] | | Isn't it you who is hyping up Apple here, when you don't even compare the two on a similar node? Compare a 5nm AMD low-power laptop CPU to the Apple M1 and the M1 no longer looks that much better at all. | |
| ▲ | gigatexal 3 days ago | parent [-] | | Why are we talking about the M1, which came out eons (in computer time) ago? That the M1 is the benchmark is just sad when the M4 is running circles around competing x86 processors and the M5 is on the horizon, with who knows what in store. |
| |
| ▲ | tracker1 3 days ago | parent | prev [-] | | I wouldn't discount what Apple has done... they've created and integrated some really good niche stuff in their CPUs to do more than typical ARM designs. The graphics cores are pretty good in their own right, even. Not to mention the OS/software integration, including accelerated x86 emulation and unified memory usage in practice. AMD has done a LOT for parallelization and their server options are impressive... I mean, you're still talking 500W+ under full load, but that's across 128+ cores. Strix Halo scales down impressively to the ~10-15W range under common usage -- not as low as Apple does under similar loads, but impressive in its own way. |
|
|
|
|
|
|
|
| ▲ | KingOfCoders 3 days ago | parent | prev [-] |
| I think everything depends on circumstances. I used laptops for 15+ years (transitioned from a Mac Cube to a white MacBook, MacBook Pro, etc.) but migrated to a desktop some years ago (first an iMac Pro, now AMD), as I work at my desk, and when I'm not at my desk I'm not working. Some years ago I got a 3900X and a 2080 Ti. They still work fine, I don't have performance problems, and although I've thought of getting PCIe 5/NVMe with a 9950X3D/395+ (or a Threadripper), I just don't need it. I've upgraded the SSDs several times for speed and capacity (now at the PCIe 4/M.2 limit and I don't want to go into RAID), and added solar panels and a battery pack for energy usage, but I'm fine otherwise. Indeed I want to buy a new CPU and GPU, but I can't find enough reasons (though I might get a Mac Studio for local AI). But I understand your point if you need a laptop; I just decided I no longer need one, and I get more power and faster compiling for less money. |