mips_avatar 4 days ago

I'm pretty sure xAI exclusively uses Nvidia H100s for Grok inference, but I could be wrong. I agree that I don't see why TPUs would necessarily explain the latency.

danpalmer 4 days ago | parent [-]

To be clear, I'm only suggesting that hardware is a factor here; it's far from the only reason. The parent commenter corrected themselves: they were actually thinking of Groq, not Grok. I believe they're right about that, since Groq builds custom inference accelerators that are similar in spirit to TPUs.