gchadwick 16 hours ago

> However, Groq’s architecture relies on SRAM (Static RAM). Since SRAM is typically built in logic fabs (like TSMC) alongside the processors themselves, it theoretically shouldn't face the same supply chain crunch as HBM.

It's true that SRAM comes with your logic: you get a TSMC N3 (or N6 or whatever) wafer, you get SRAM. Unfortunately SRAM just doesn't have the capacity, so you have to augment it with DRAM, which you see companies like D-Matrix and Cerebras doing. Perhaps you can use cheaper/more available LPDDR or GDDR (Nvidia have done this themselves with Rubin CPX), but that also has supply issues.

Note it's not really about parameter storage (which you can amortize over multiple users); it's KV cache storage that gets you, and that scales with the user count.
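
To put rough numbers on that, here is a back-of-the-envelope sketch. The model shape (80 layers, 8 KV heads, head dimension 128, 16-bit values, 70B parameters) and the 8192-token context are illustrative assumptions, not figures for any particular deployment, and the function names are made up for this example.

```python
def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes of KV cache added per generated or prompt token.

    Each layer stores one key vector and one value vector per KV head,
    hence the leading factor of 2.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def memory_footprint_bytes(n_params, n_users, context_len, per_token_bytes,
                           bytes_per_param=2):
    """Return (weight_bytes, kv_bytes) for a given number of concurrent users.

    Weights are stored once and shared by every user; the KV cache is
    per user and grows with each user's context length.
    """
    weights = n_params * bytes_per_param
    kv = n_users * context_len * per_token_bytes
    return weights, kv


# Hypothetical 70B-parameter model with grouped-query attention (assumed shape).
per_token = kv_cache_bytes_per_token(n_layers=80, n_kv_heads=8, head_dim=128)

for users in (1, 8, 64):
    weights, kv = memory_footprint_bytes(70_000_000_000, users, 8192, per_token)
    print(f"{users:>3} users: weights {weights / 1e9:.0f} GB, KV cache {kv / 1e9:.1f} GB")
```

Under these assumptions the weights stay fixed at 140 GB however many users you serve, while the KV cache goes from under 3 GB for one user to more than the weights at 64 users.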

Now Groq does appear to be going for a pure SRAM play, but if the easily available pure-SRAM option comes at some multiple of the capital cost of the DRAM option, it's not a simple escape hatch from DRAM availability.

jsheard 16 hours ago | parent | next [-]

SRAM scaling also hit a wall a while ago, so you can't really count on new processes allowing for significantly higher density in the future. That's more of a longer-term issue with the SRAM gambit that'll come into play after the DRAM shortage is over though - logic and DRAM will keep improving while SRAM probably stays more or less where it is now.

zozbot234 16 hours ago | parent [-]

You can still scale SRAM by stacking it in 3D layers, similar to the common approach now used with NAND flash. I think HBM DRAM is also directly stacked on-die to begin with; apparently that's the best approach to scaling memory bandwidth too.

It'll be interesting to see if we get any kind of non-NAND persistent memory in the near future, that might beat some performance metrics of both DRAM and NAND flash.

wtallis 16 hours ago | parent [-]

NAND is built with dozens of layers on one die. HBM DRAM is a dozen-ish dies stacked and interconnected with TSVs, but only one layer of memory cells per die. AMD's X3D CPUs have a single SRAM die stacked on top of the regular CPU+SRAM, with TSVs in the L3 cache to connect to the extra SRAM. I'm not aware of anyone shipping a product that stacks multiple SRAM dies; the tech definitely exists but it may not be economically feasible for any mass-produced product.

arcticbull 16 hours ago | parent | next [-]

The issue is size: SRAM is 6 transistors per bit while DRAM is 1 transistor and a capacitor. Anyone who wants density starts with DRAM. There’s never been motivation to stack.
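
As a rough illustration of that per-bit gap, the sketch below counts only the transistors in the storage cells for 1 GiB of each memory type. It ignores sense amplifiers, decoders, and other peripheral circuitry, and it says nothing about actual cell area, which depends on the process. The constant and function names are invented for this example.

```python
SRAM_TRANSISTORS_PER_BIT = 6  # classic 6T SRAM cell
DRAM_TRANSISTORS_PER_BIT = 1  # 1T1C cell: one transistor plus one capacitor


def cell_transistors(capacity_bytes, transistors_per_bit):
    """Transistors needed for the storage cells alone (no peripheral logic)."""
    return capacity_bytes * 8 * transistors_per_bit


one_gib = 2**30
sram = cell_transistors(one_gib, SRAM_TRANSISTORS_PER_BIT)
dram = cell_transistors(one_gib, DRAM_TRANSISTORS_PER_BIT)

print(f"1 GiB SRAM cells: {sram:,} transistors")
print(f"1 GiB DRAM cells: {dram:,} transistors (plus as many capacitors)")
```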

toast0 16 hours ago | parent | prev | next [-]

> AMD's X3D CPUs have a single SRAM die stacked on top of the regular CPU+SRAM, with TSVs in the L3 cache to connect to the extra SRAM.

Just FYI, the latest X3D flipped the stack; the cache die is now on the bottom. This helps transfer heat from the compute die to the heatsink more effectively. In armchair silicon designer mode, one could imagine this setup also adds potential for stacking multiple cache dies: since they already interpose all the signals, why not add a second one? But I'm sure it's not that simple. For one, AMD wants the package z-heights to be consistent between the X3D and normal chips.

mattaw2001 15 hours ago | parent | prev | next [-]

I agree with your description and conclusion. Additionally, the companies that can make chip stacks like HBM in volume are the HBM manufacturers. As they are bottlenecked by the packaging/stacking right now (while also furiously building new plant capacity), I can't see them diverting manufacturing to stacking a new SRAM tech.

LargoLasskhyfv 14 hours ago | parent | prev [-]

Every time I read about D|S-RAM scaling I'm reminded of https://www.besang.com/

Ever heard of them? What do you think? Vaporware?

rayiner 15 hours ago | parent | prev [-]

Is Groq different from Grok?

grandmczeb 15 hours ago | parent [-]

They're unrelated. Groq = chip company, Grok = model by x.ai.

dhosek 15 hours ago | parent [-]

The similarity in names is likely to Groq’s detriment.

dpe82 8 hours ago | parent | next [-]

Maybe, but they'd been operating under that name for 7 years before Elon came along and decided he needed a name for his model.
