Neywiny 9 hours ago

When you can do an ASIC, yes, do an ASIC. But my point was that there was a lot of GPU comparison, and GPUs are also not ASICs relative to AI workloads.

QuadmasterXLII 8 hours ago | parent

They’re close; they’re basically matmul ASICs.

Neywiny 8 hours ago | parent

Arguably so are the DSP-heavy FPGAs. And the unused logic will have minimal static power draw relative to the unused but still-clocked graphics-only parts of the GPU.

daxfohl 5 hours ago | parent

I have to imagine Google considered this and decided against it. I assume it's that all the high-perf matmul stuff needs to be ASIC'd out to get max performance, good heat dissipation, etc., and for anything reconfigurable, a CPU-based controller or logic chip is sufficient and easier to maintain.

FPGAs kind of sit in a very niche middle ground. Yes, you can optimize your logic so the FPGA does exactly the thing your use case needs, so your hardware maps more precisely to your use case than a generic TPU or GPU would. But what you gain in logic efficiency, you'll lose several times over in raw throughput to a generic TPU or GPU, at least for AI work, which is almost all matrix math.
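
A rough back-of-envelope comparison illustrates the gap. Every number here is an assumed, ballpark figure (DSP count and fabric clock for a large FPGA, BF16 tensor-core peak for an A100-class GPU), not a vendor spec:

    # Back-of-envelope dense-matmul peak; all numbers are illustrative assumptions
    fpga_dsps = 12_000        # DSP slices on a large FPGA (assumed)
    fpga_clock_hz = 500e6     # fabric clock (assumed)
    fpga_flops = fpga_dsps * 2 * fpga_clock_hz  # one MAC = 2 FLOPs per cycle
    gpu_flops = 300e12        # ~300 TFLOP/s BF16 tensor-core peak (assumed)
    print(f"FPGA ~{fpga_flops / 1e12:.0f} TFLOP/s, "
          f"GPU ~{gpu_flops / 1e12:.0f} TFLOP/s, "
          f"ratio ~{gpu_flops / fpga_flops:.0f}x")

Even granting the FPGA perfect DSP utilization, the GPU comes out more than an order of magnitude ahead on dense matmul.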

Plus, getting that efficiency isn't easy: FPGAs have a steeper learning curve and a much slower dev cycle than writing TPU or GPU apps, and designs take far longer to compile and test than CUDA code, especially once they get dense and you have to start working around timing constraints and such. It's easy to reach a point where even a tiny change exceeds some timing constraint and you have to rewrite a whole subsystem to get the design to close timing again.
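
To make that failure mode concrete, here's a purely illustrative slack calculation (the clock rate and path delays are made up):

    clock @ 250 MHz           ->  period = 1 / 250 MHz = 4.0 ns
    critical path = 3.9 ns    ->  slack = +0.1 ns  (meets timing)
    tiny change adds +0.3 ns  ->  path = 4.2 ns, slack = -0.2 ns  (fails)

A 0.3 ns regression on a single path is enough to fail the whole build, and each attempted fix costs another multi-hour place-and-route run.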