cduzz an hour ago

High Bandwidth Memory uses thousands of interconnects, with a data bus on the order of 1024 bits wide per stack. DDR style memory typically transfers in the neighborhood of 64 bits at a time.
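To put rough numbers on the width difference, here is a minimal sketch of the usual peak-bandwidth arithmetic (bus width times per-pin data rate). The function name and the 6.4 Gbit/s per-pin rate are assumptions chosen for illustration, not figures from any particular part's datasheet.

```python
def peak_bandwidth_gb_per_s(bus_width_bits: int, gbit_per_s_per_pin: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    Every data pin moves `gbit_per_s_per_pin` gigabits per second,
    so the total is width * rate, divided by 8 to convert bits to bytes.
    """
    return bus_width_bits * gbit_per_s_per_pin / 8


# Assumed, illustrative per-pin rate; real parts vary by generation and speed bin.
PIN_RATE_GBIT_S = 6.4

ddr_style = peak_bandwidth_gb_per_s(64, PIN_RATE_GBIT_S)     # one 64-bit DIMM
hbm_style = peak_bandwidth_gb_per_s(1024, PIN_RATE_GBIT_S)   # one 1024-bit stack

print(f"64-bit bus:   {ddr_style:.1f} GB/s")
print(f"1024-bit bus: {hbm_style:.1f} GB/s")
print(f"ratio:        {hbm_style / ddr_style:.0f}x")
```

At the same per-pin rate the gap is purely the 16x difference in width, which is why the wire count matters so much.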

HBM tends to be integrated onto the package (board, multi chip module, die) because there are really tight signaling and wire routing constraints that make "modularity" impossible.

I remember back in the day you could get motherboards for your 286, 386, and sometimes even 486 with external L1 / L2 / L3 cache -- you'd buy a bunch of static RAM DIPs, plug them into sockets next to the CPU, and set a BIOS option or DIP switch to enable them. These days that's just not practical: there are too many wires connecting the cache to the dies and the cache coherence logic, the speed of light is just too slow, and electricity is too messy to put the cache "external" to the die/chip/package, even if the packaging issues could be addressed.
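The "speed of light is too slow" point can be sketched as back-of-the-envelope arithmetic: how far can a signal travel in one clock period? The velocity factor of 0.5 (signals moving at roughly half of c along a board trace) and the two clock speeds are assumptions picked for illustration.

```python
SPEED_OF_LIGHT_CM_PER_NS = 29.98  # speed of light in vacuum, cm per nanosecond


def reach_per_cycle_cm(clock_ghz: float, velocity_factor: float = 0.5) -> float:
    """One-way distance (cm) a signal covers during a single clock period.

    velocity_factor is the assumed fraction of c at which the signal
    propagates along the trace; 0.5 is a rough illustrative value.
    """
    period_ns = 1.0 / clock_ghz
    return SPEED_OF_LIGHT_CM_PER_NS * velocity_factor * period_ns


# Assumed clocks: a 25 MHz 386-era board vs. a 4 GHz modern core.
for label, ghz in [("25 MHz", 0.025), ("4 GHz", 4.0)]:
    print(f"{label}: about {reach_per_cycle_cm(ghz):.1f} cm per cycle, one way")
```

Under these assumptions a 25 MHz system has meters of slack per cycle, while a 4 GHz core gets a few centimeters one way, and a cache access is a round trip, so the usable distance is half that before any logic delay is counted.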

HBM is similar -- it's not practical to make a generic interconnect that'd work reliably enough to provide field-replaceable memory modules, as you can with DDR style DIMMs.

EDIT:

Apparently I'm totally wrong, in that these "SOCAMM2" modules have thousands of pads (like a CPU socket) and can in fact run with the same data bus width (1024 bits wide!) as "local" HBM. Very cool. Please ignore my out-of-date blatherings above. It's still not quite as fast as putting the HBM in the package, but it's way faster than the DDR style setup.