tucnak 7 hours ago:
It all comes down to memory and fabric bandwidth. For example, the state-of-the-art developer-friendly (PCIe 5.0) FPGA platform is the Alveo V80, which rocks four 200G NICs. Basically, Alveo currently occupies a niche as the only platform on the market that allows programmable in-network compute. However, what's available in terms of bandwidth lags behind even pathetic platforms like BlueField. Those in the know are aware of the challenges of actually saturating it for inference in practical designs. I think Xilinx is super well-positioned here, but without some solid hard IP it's still a far cry from purpose-built silicon.
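To put rough numbers on the bandwidth gap, here's a back-of-envelope sketch; every figure in it is an assumption for illustration, not a measured number:

  # Can 4x200G NICs keep an FPGA busy for weight-streaming
  # inference? All figures below are illustrative assumptions.

  nic_count = 4
  nic_gbps = 200                          # per-NIC line rate, Gbit/s
  fabric_GBps = nic_count * nic_gbps / 8  # aggregate ~100 GB/s

  # Hypothetical model: 70B parameters quantized to 8 bits
  params = 70e9
  bytes_per_param = 1                     # int8
  weights_GB = params * bytes_per_param / 1e9

  # If the full weight set must cross the fabric once per token,
  # fabric bandwidth caps throughput at:
  tokens_per_s = fabric_GBps / weights_GB
  print(f"aggregate fabric: {fabric_GBps:.0f} GB/s")
  print(f"throughput ceiling: {tokens_per_s:.2f} tok/s")  # ~1.4 tok/s

Against the multiple TB/s of HBM on current datacenter accelerators, ~100 GB/s of fabric is the whole story: without hard memory IP, the NICs simply can't feed the math.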
mrinterweb 7 hours ago:
As far as I understand, all the purpose-built inference silicon out there is kept in-house rather than sold to competitors: Google's TPU, Amazon's Inferentia (horrible name), Microsoft's Maia, Meta's MTIA. It seems that custom inference silicon is a huge part of the AI game. I doubt GPU-based inference will stay relevant/competitive for much longer.