| ▲ | Smaug123 7 hours ago | |
By the way, you've seen Cerebras? It's not gone as far as what you described - loads of cores and RAM but you still load up the weights onto it as software and they need to be streamed into the chip for large models - but it is a whole wafer. | ||
| ▲ | trouve_search 6 hours ago | |
Cerebras is a whole lot of SRAM, basically a ton more L1/L2 cache, hence increasing throughput. They're pretty supply constrained right now though and their production costs seem prohibitive. The interesting players at the moment are from Toronto: Taalas (print the model onto the silicon) and Tenstorrent (dataflow-programming-based hardware). | ||
| ▲ | londons_explore 5 hours ago | |
There is a huge downside to weights being modifiable - it means you need to have multipliers (not simply adders), and SRAM to store those weights. I suspect for equal performance, that's probably a 5x increase in silicon area (and therefore cost). | ||
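To illustrate the adders-versus-multipliers point, here is a minimal sketch, assuming non-negative integer weights that are known at design time. The function names are hypothetical and not taken from any vendor's toolchain. When the weight is a constant, the product reduces to a fixed set of shifts (which are just wiring in hardware) feeding adders, one per set bit of the weight; a modifiable weight needs a general multiplier plus storage for the weight itself.

```python
def shift_add_terms(weight: int) -> list[int]:
    """Return the bit positions that are set in a non-negative constant weight."""
    if weight < 0:
        raise ValueError("this sketch only handles non-negative weights")
    return [i for i in range(weight.bit_length()) if (weight >> i) & 1]


def multiply_by_constant(x: int, weight: int) -> int:
    """Compute x * weight using only shifts and additions."""
    total = 0
    for shift in shift_add_terms(weight):
        # Each term is x shifted left by a fixed amount, then accumulated.
        total += x << shift
    return total


# Example: a weight of 10 (binary 1010) needs two shifted copies of x.
print(shift_add_terms(10))
print(multiply_by_constant(7, 10))
```

The number of additions depends on how many bits are set in each weight, so the real area saving depends on the weight encoding and precision, which this sketch does not model.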