binyu 11 hours ago

> they scale out in 2D, we scale up in 3D.

This actually helps a lot, thanks.

> Instead of spreading SRAM across a wafer, we stack DRAM on top of the logic

Is this done with current manufacturing technologies? Does it require a special process?

> no streaming, no off-chip memory at all. ~1 kW, not 23 kW

Is this for an individual compute unit? Compared to Cerebras, what's the ratio of power used vs compute output?

minkowsky 10 hours ago | parent [-]

I think you are asking about energy per token. Cerebras is 12.8J, Sophon is 25.8mJ. That is close to three orders of magnitude of difference.
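As a quick sanity check on that comparison, here is a minimal sketch that takes the two quoted per-token figures at face value and computes their ratio. The numbers are the ones stated in this comment, not independently verified, and the variable names are only illustrative.

```python
import math

# Figures as quoted in the thread, converted to joules per token.
cerebras_j_per_token = 12.8      # 12.8 J
sophon_j_per_token = 25.8e-3     # 25.8 mJ (millijoules)

# How many times more energy per token the larger figure represents.
ratio = cerebras_j_per_token / sophon_j_per_token

# The same gap expressed in orders of magnitude (base-10 logarithm).
orders_of_magnitude = math.log10(ratio)

print(f"ratio: {ratio:.0f}x")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

On these figures the gap is a bit under 500x, so "three orders" is a rounding up of roughly 2.7.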

binyu 9 hours ago | parent [-]

so Sophon is less efficient than Cerebras?

Edit: is that joules vs. millijoules? I need better glasses

> Cerebras is 12.8J, Sophon is 25.8mJ

Are your figures hypothetical or do you have a working prototype?