martinald 2 hours ago

I was thinking about that (I definitely agree with you on the software and data angle).

But when you think about it, it's actually a bit more complex. Right now OpenAI (e.g.) buys GPUs from NVidia (e.g.), which buys HBM from Samsung and fabs the card at TSMC.

Google instead designs the chip, with (I assume) a significant amount of assistance from Broadcom, at least in terms of manufacturing. Broadcom then buys the HBM from the same supplier(s) and fabs the card with TSMC.

So I'm not entirely sure the margin savings are that huge. I assume Broadcom charges a fair bit to manage the manufacturing process on behalf of Google. Almost certainly a lot less than NVidia would charge in terms of gross profit margin, but Google also has to pay for a lot of engineers to do the work that would otherwise be done at NVidia.

No doubt it is a saving overall - otherwise they wouldn't do it. But I wonder how dramatic it is.

Obviously Google has significant upside in the ability to customise their chips exactly how they want them, but NVidia (and, to a lesser extent, AMD) can probably source more customer workflows/issues from their broader set of clients.

I think "Google makes its own TPUs" makes a lot of people think that the entire operation is in house, but in reality they're just doing more design work than the other players. There's still a lot of margin "leaking" through Broadcom, memory suppliers and TSMC, so I wonder how dramatic it really is.

coredog64 2 hours ago

My take is it's the inference efficiency. It's one thing to have a huge GPU cluster for training, but come inference time you don't need nearly so much. Having the TPU (and models purpose built for TPU) allows for best cost in serving at hyperscale.

martinald 19 minutes ago

Yes potentially - but the OG TPUs were actually very poorly suited for LLM usage - designed for far smaller models with more parallelism in execution.

They've obviously adapted the design, but optimising in hardware like that is a risk: if there is another jump in model architecture, a narrow, specialised set of hardware may not generalise enough.

zozbot234 16 minutes ago

Prefill has a lot of parallelism, and so does decode with a larger context (very common with agentic tasks). People like to say "old inference chips are no good for LLM use" but that's not really true.
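
A rough way to see this, as a sketch only: count the independent query-key dot products in a single attention head. The function names and context lengths below are made up for illustration, and this ignores the MLP layers, batching and memory bandwidth, all of which matter a lot in practice.

```python
def prefill_pairs(prompt_tokens: int) -> int:
    """Query-key dot products per head when a whole prompt is processed
    in one pass under a causal mask (token i attends to tokens 1..i)."""
    return prompt_tokens * (prompt_tokens + 1) // 2


def decode_pairs(context_tokens: int) -> int:
    """Query-key dot products per head for one decode step:
    a single new query against every key in the context."""
    return context_tokens


# Illustrative context lengths, not measurements from any real system.
for context in (128, 4096, 131072):
    print(context, prefill_pairs(context), decode_pairs(context))
```

Each of those dot products is independent of the others, so the amount of parallel work in a decode step grows linearly with context length, and prefill grows quadratically.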

flyinglizard 2 hours ago

NVidia is operating with what, 70% gross margin? That’s what Google saves. Plus, Broadcom may be in for the design but I’m not sure they’re involved in the manufacturing of TPUs.
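
To put rough numbers on that, here is a back-of-the-envelope sketch. The 70% is the guess above; the unit cost, the design partner's margin and the per-unit engineering cost are all hypothetical placeholders, not real figures for anyone.

```python
def price_from_gross_margin(unit_cost: float, gross_margin: float) -> float:
    """Selling price implied by a unit cost and a gross margin,
    where gross margin = (price - cost) / price."""
    if not 0 <= gross_margin < 1:
        raise ValueError("gross_margin must be in [0, 1)")
    return unit_cost / (1.0 - gross_margin)


def in_house_unit_cost(unit_cost: float, partner_margin: float,
                       engineering_per_unit: float) -> float:
    """Cost of going in-house: the partner's price on the same unit cost,
    plus the buyer's own amortised engineering spend per unit."""
    return price_from_gross_margin(unit_cost, partner_margin) + engineering_per_unit


# All inputs are hypothetical and chosen only to show the arithmetic.
UNIT_COST = 100.0  # HBM + wafer + packaging, in arbitrary units
vendor_price = price_from_gross_margin(UNIT_COST, 0.70)
in_house = in_house_unit_cost(UNIT_COST, partner_margin=0.30,
                              engineering_per_unit=40.0)
saving_fraction = 1.0 - in_house / vendor_price

print(f"vendor price:  {vendor_price:.1f}")
print(f"in-house cost: {in_house:.1f}")
print(f"saving:        {saving_fraction:.0%}")
```

A 70% gross margin means paying roughly 3.3x the unit cost. How much of that gap survives depends entirely on the partner margin and engineering cost you plug in, which is the open question in this thread.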

lizknope 2 hours ago

Broadcom does the physical design and sources a huge amount of the IP like serdes blocks. TSMC manufactures the chips.

dyauspitr 2 hours ago

What a wild situation to have a significant part of Earth’s major economies be directly reliant, not on one country, but on one building in the world.

collingreen an hour ago

Yeah this is a bummer. If it goes south everyone in power will also have perfect hindsight and say they saw it coming because obviously you shouldn't have this much built on such a small footprint. And yet...

palmotea 31 minutes ago

> Yeah this is a bummer. If it goes south everyone in power will also have perfect hindsight and say they saw it coming because obviously you shouldn't have this much built on such a small footprint. And yet...

It'll be true: everyone does see it coming (just like with rare earth minerals). But market-infected Western society doesn't have the maturity to do anything about it. Businesses won't because they're expected to optimize for short-term financial returns; government won't because it's hobbled by biases against it (e.g. any failure becomes a political embarrassment, and there's a lot of pressure to stay out of areas where businesses operate and not interfere with them).

America needs a lot more strategic government control of the economy, to kick businesses out of their short-term shareholder-focused thinking. If it can't manage that, it will decline into irrelevance.