▲ | b0a04gl 7 hours ago | |
tpu's predictable latency under scale. when you control the compiler, the runtime, the interconnect and the chip, you eliminate so much variance that you can actually schedule jobs efficiently at data center scale. so the obvious question why haven't we seen anyone outside Google replicate this full vertical stack yet? is it because the hardware's hard or because no one has nailed the compiler/runtime contract at that scale? | ||
▲ | transpute 5 hours ago | parent | next [-] | |
Groq, https://en.wikipedia.org/wiki/Groq & https://news.ycombinator.com/item?id=44345738 | ||
▲ | kevindamm 5 hours ago | parent | prev [-] | |
you mean other than NVidia and AMD? |