| ▲ | elorant 5 hours ago |
| 1.2TB/s memory bandwidth from a CPU. Oh boy. What the fuck is Intel doing all this time that it can’t deliver equivalent performance? |
|
| ▲ | justincormack 4 hours ago | parent | next [-] |
| This is exactly what Apple has done, but it does mean soldered memory, as socketed memory at these speeds still hasn't happened. In the server market that is pretty unpopular (even the hyperscalers are apparently reusing DDR4 with CXL in newer machines). DDR6 apparently has twice the memory bandwidth of DDR5, so that will bring it back in line, to around 1TB/s for 12 channels: comparable, but still with standard memory sticks. |
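As a rough sanity check on the "around 1TB/s for 12 channels" figure, here is a minimal sketch of the usual peak-bandwidth arithmetic (channels × transfer rate × bus width). The function name, the choice of DDR5-6400 as the baseline, and the 64-bit data width per channel are assumptions for illustration; the "doubling" simply applies the claim in the comment above and is not taken from any DDR6 specification. These are theoretical peaks, not sustained rates.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bits: int = 64) -> float:
    """Theoretical peak bandwidth in GB/s (decimal units).

    channels             -- number of memory channels (hypothetical example value: 12)
    mega_transfers_per_s -- transfer rate in MT/s (assumed DDR5-6400 -> 6400)
    bus_bits             -- data width per channel in bits (assumed 64)
    """
    bytes_per_transfer = bus_bits // 8
    # MT/s * bytes gives MB/s; divide by 1000 to get GB/s.
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


# Assumed baseline: 12 channels of DDR5-6400.
ddr5 = peak_bandwidth_gbs(12, 6400)

# Applying the "twice DDR5" claim from the comment, not a spec value.
doubled = 2 * ddr5

print(f"12ch DDR5-6400 peak: {ddr5:.1f} GB/s")
print(f"Doubled per channel: {doubled:.1f} GB/s")
```

Under these assumptions the baseline comes out to roughly 614 GB/s and the doubled figure to roughly 1.2 TB/s, which is in the same range as the number quoted above.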
| |
| ▲ | cduzz 2 hours ago | parent | next [-] | | Is that 1TB/s over 12 channels a continuous streaming read/write rate? I assume big wide DDR memory, for random "IO", is much slower than HBM. I feel like at a certain point there are just going to be big SoC packages with 128GB of RAM and stacks of cores (each with their own "local" cache), and the 128GB of "local" on-package HBM will just be the 4th or 5th level cache, and big server boards will have 4 of those and CXL elsewhere for "main" memory. And things like the VAST stuff also blur the lines between high-speed local storage and less performant SAN or bulk commodity storage. The old memory / storage hierarchies are getting mixed up (again). Interesting times. | |
| ▲ | veber-alex 4 hours ago | parent | prev | next [-] | | Vera doesn't use soldered memory, it uses SOCAMM2. | |
| ▲ | elorant 4 hours ago | parent | prev | next [-] | | Mac Ultras would be great if they offered PCIe slots to put NVMe drives on and ideally an InfiniBand network card. | | |
| ▲ | cduzz 43 minutes ago | parent | next [-] | | They've got Thunderbolt 5, which is pretty good, and every port has a dedicated controller. You can even network Macs together just over Thunderbolt. Not as fast as raw PCIe slots inside the chassis, but I don't think the ghost of Steve Jobs cares much. He'd tell you that if you've got no taste at all you can put his beautiful machine into ugly junk like the stuff from sonnettech and you can go do ugly things elsewhere. I think Apple's happy taking the market they've got and they'll leave the big-guns HPC market to Nvidia. The margins look great for Nvidia right now, but I suspect Nvidia's path will look more like DRAM's boom/bust cycle than Apple's continuous "premium tool" brand positioning. | |
| |
| ▲ | 15155 4 hours ago | parent | prev [-] | | https://en.wikipedia.org/wiki/CAMM_(memory_module) | | |
| ▲ | cduzz 32 minutes ago | parent [-] | | High Bandwidth Memory uses thousands of interconnects for the data bus. DDR-style memory typically transfers in the neighborhood of 64 bits at a time. HBM tends to be integrated onto the package (board, multi-chip module, die) because there are really tight signaling and wire routing constraints that make "modularity" impossible. I remember back in the day you could get motherboards for your 286, 386, and sometimes even 486 with external L1 / L2 / L3 cache -- you'd buy a bunch of static RAM DIPs, populate the sockets next to the CPU with them, and set a BIOS option or DIP switch to enable it. These days that's just not practical because there are too many wires interconnecting the cache to the dies and cache coherence logic, and the speed of light is just too slow and electricity is too messy to put the cache "external" to the die/chip/package, even if the packaging issues could be addressed. HBM is similar -- it's not practical to make a generic interconnect that'd actually work reliably enough to provide field-replaceable memory modules as you can with DDR-style DIMMs. EDIT: Apparently I'm totally wrong, in that these "SOCAMM2" modules have thousands of pads (like a CPU socket) and can in fact run with the same data bus width (1024 bits wide!) as "local" HBM. Very cool. And please ignore my out-of-date blatherings above. It's still not quite as fast as if you put the HBM in the package, but it's way faster than the DDR-style setup. |
|
|
|
| ▲ | mimd an hour ago | parent | prev | next [-] |
| New to the field? Intel did Xeon Max in the last gen, with 64GB HBM2e, 1.6TB/s per 56 cores. It was not a great success. |
|
| ▲ | PaulKeeble 4 hours ago | parent | prev | next [-] |
| If you don't have to worry about replaceable sticks and users choosing their own memory manufacturer, speed, and size, then you can shorten the traces and improve connectivity, including the bus width and its latency. I can't help but think the DIMM format is coming to an end. |
| |
| ▲ | elorant 4 hours ago | parent [-] | | I don't expect them to change their entire pipeline. But surely they could offer a unified RAM lineup that caters to specific needs. Not everyone needs that fast RAM access but for those who do it could be nice to have an option. The writing is on the wall for years now. | | |
| ▲ | pizza234 2 hours ago | parent [-] | | > Not everyone needs that fast RAM access but for those who do it could be nice to have an option. The writing is on the wall for years now. There is an option already, at least from AMD, in the HEDT segment - Threadripper/Pro has 4/8 channels (although the bandwidth is not as high as Apple chips). |
|
|
|
| ▲ | baal80spam 5 hours ago | parent | prev [-] |
| > What the fuck is Intel doing all this time The thing is, it doesn't have to do anything. It is busy getting bailed out, I guess. |