nomel 6 hours ago

> It seems that custom inference silicon is a huge part of the AI game.

Is there any public info about % inference on custom vs GPU, for these companies?

mrinterweb 6 hours ago | parent

Gemini is likely the most widely used gen AI model in the world, given its integration into Search, Android, and countless other parts of the Google ecosystem. Gemini runs on Google's custom TPU chips, so I would say a large share of inference is already running on ASICs. https://cloud.google.com/tpu