fooblaster 2 hours ago

Show me a single FPGA that can outperform a B200 at matrix multiplication (or even come close) at any usable precision.

B200 can do 10 peta-ops per second at FP8, theoretically.

I do agree memory bandwidth is also a problem for most FPGA setups, but Xilinx ships HBM with some SKUs, and even those are not competitive at inference as far as I know.