mips_avatar 4 days ago
I'm pretty sure xAI exclusively uses Nvidia H100s for Grok inference, but I could be wrong. I agree that I don't see why TPUs would necessarily explain latency.
danpalmer 4 days ago
To be clear, I'm only suggesting that hardware is a factor here; it's far from the only reason. The parent commenter corrected their comment to say that they were actually thinking of Groq, not Grok, and I believe they are correct about that, as Groq is doing something similar to TPUs to accelerate inference.