Remix.run Logo
ryao 3 hours ago

On Zen 3, I am able to use nearly the full 51.2GB/sec from a single CPU core. I have not tried using two as I got so close to 51.2GB/sec that I had assumed that going higher was not possible. Off the top of my head, I got 49-50GB/sec, but I last measured a couple years ago.

By the way, if the cores were able to load things at full speed, they would be able to use 640GB/sec each. That is 2 AVX-512 loads per cycle at 5GHz. Of course, they never are able to do this due to memory bottlenecks. Maybe Intel’s Xeon Max series with HBM can, but I would not be surprised to see an unadvertised internal bottleneck there too. That said, it is so expensive and rare that few people will ever run code on one.

buildbot 10 minutes ago | parent [-]

People have studied the Xeon Max! Spoiler - yes, it's limited to ~23GB/s per core. It can't achieve anywhere close to the theoretical bandwidth of the HBM even, with all cores active. It's a pretty bad design in my opinion.

https://www.ixpug.org/images/docs/ISC23/McCalpin_SPR_BW_limi...