Remix.run Logo
b0a04gl 9 hours ago

tpu's predictable latency under scale. when you control the compiler, the runtime, the interconnect and the chip, you eliminate so much variance that you can actually schedule jobs efficiently at data center scale. so the obvious question why haven't we seen anyone outside Google replicate this full vertical stack yet? is it because the hardware's hard or because no one has nailed the compiler/runtime contract at that scale?

transpute 7 hours ago | parent | next [-]

Groq, https://en.wikipedia.org/wiki/Groq & https://news.ycombinator.com/item?id=44345738

kevindamm 7 hours ago | parent | prev [-]

you mean other than NVidia and AMD?