sbarre 6 hours ago

A question I don't see addressed in all these articles: what prevents Nvidia from doing the same thing and iterating on their more general-purpose GPU towards a more focused TPU-like chip as well, if that turns out to be what the market really wants.

timmg 6 hours ago | parent | next [-]

They will, I'm sure.

The big difference is that Google is both the chip designer *and* the AI company. So they get both sets of profits.

Both Google and Nvidia contract TSMC to fab their chips. Nvidia then sells them at a huge profit, and OpenAI (for example) buys them at that inflated rate and then puts them into production.

So while Nvidia is "selling shovels", Google is making their own shovels and has their own mines.

pzo 2 hours ago | parent | next [-]

On top of that, Google is also a cloud infrastructure provider, unlike OpenAI, which needs someone like Azure to rack those GPUs and host the servers.

1980phipsi 6 hours ago | parent | prev | next [-]

Aka vertical integration.

m4rtink 4 hours ago | parent | prev [-]

So when the bubble pops, the companies making the shovels (TSMC, NVIDIA) might still have the money they got for their products, and some of the ex-AI companies might at least be able to sell standards-compliant GPUs on the wider market.

And Google will end up with lots of useless super specialized custom hardware.

skybrian 2 hours ago | parent | next [-]

It seems unlikely that large matrix multipliers will become useless. If nothing else, Google uses AI extensively internally. It already did in ways that weren’t user-visible long before the current AI boom. Also, they can still put AI overviews on search pages regardless of what the stock market does. They’re not as bad as they used to be, and I expect they’ll improve.

Even if TPUs weren't all that useful, Google still owns the data centers and can upgrade the equipment, or not. They paid for the hardware out of their large pile of cash, so there's no debt overhang.

Another issue is loss of revenue. Google cloud revenue is currently 15% of their total, so still not that much. The stock market is counting on it continuing to increase, though.

If the stock market crashes, Google’s stock price will go down too, and that could be a very good time to buy, much like it was in 2008. There’s been a spectacular increase since then, the best investment I ever made. (Repeating that is unlikely, though.)

timmg 3 hours ago | parent | prev | next [-]

> And Google will end up with lots of useless super specialized custom hardware.

If it gets to the point where this hardware is useless (I doubt it), yes, Google will have it sitting there. But that hardware will have cost Google less to build than what any of the companies who built on Nvidia paid.

UncleOxidant 2 hours ago | parent | next [-]

Right, and the inevitable bubble pop will just slow things down for a few years. It's not like those TPUs will suddenly be useless: Google will still have them deployed; they'll just stay on the older ones longer instead of upgrading to a newer TPU. It seems like Google will face far fewer repercussions when the bubble pops than Nvidia, OpenAI, Anthropic, Oracle, etc., as they're largely staying out of the circular money flows between those companies.

immibis 3 hours ago | parent | prev [-]

aka Google will have less of a pile of money than Nvidia will

kolbe 2 hours ago | parent [-]

Alphabet is one of the most profitable companies in the world. For all the criticisms you can throw at Google, lacking a pile of money isn't one of them.

nutjob2 an hour ago | parent | prev | next [-]

How could Google's custom hardware become useless? They've used it for their business for years now and will do so for years into the future. It's not like their hardware is LLM-specific. Google can't really lose, given their vast infrastructure.

Meanwhile, OpenAI et al., dumping GPUs while everyone else is doing the same, will get pennies on the dollar. It's exactly the opposite of what you describe.

I hope that comes to pass, because I'll be ready to scoop up cheap GPUs and servers.

qcnguy 12 minutes ago | parent [-]

The same way cloud hardware always risks becoming useless: the newer hardware is so much better that you can't afford not to upgrade. For example, an algorithmic improvement that runs on CUDA devices but not on existing TPUs could change the economics of AI.

acoustics 4 hours ago | parent | prev [-]

I think people are confusing the bubble popping with AI being over. When the dot-com bubble popped, it's not like internet infrastructure immediately became useless and worthless.

iamtheworstdev 3 hours ago | parent [-]

that's actually not all that true... a lot of fiber that had been laid went dark, or was never lit, and was hoarded by telecoms to keep supply intentionally constrained and drive up the usage cost of what was lit.

pksebben an hour ago | parent | next [-]

If it was hoarded by anyone, then by definition it was not useless OR worthless. Also, you are currently on the internet if you're reading this, so the point kinda stands.

ithkuil 2 hours ago | parent | prev | next [-]

Are you saying that the internet business didn't grow a lot after the bubble popped?

bryanlarsen 2 hours ago | parent | prev [-]

And then they sold it to Google who lit it up.

Workaccount2 6 hours ago | parent | prev | next [-]

DeepMind gets to work directly with the TPU team on custom modifications and designs specifically for DeepMind projects. They get to make pickaxes tailored exactly to the mine they're working in.

Everyone using Nvidia hardware has a lot of overlap in requirements, but they also all have enough architectural differences that they won't be able to match Google.

OpenAI announced they will be designing their own chips, exactly for this reason, but that also becomes another extremely capital-intensive investment for them.

This also doesn't get into the fact that Google already has S-tier datacenters and datacenter construction/management capabilities.

wood_spirit 3 hours ago | parent [-]

Isn't there a suspicion that OpenAI buying custom chips from another Sam Altman venture is just graft? Wasn't that one of the things that came up when the board tried to oust him?

HarHarVeryFunny 6 hours ago | parent | prev | next [-]

It's not that the TPU is better than an Nvidia GPU; it's just cheaper, since it doesn't have a fat Nvidia markup applied, and better vertically integrated, since it was designed/specified by Google for Google.

UncleOxidant 2 hours ago | parent [-]

TPUs are also cheaper because GPUs need to be more general-purpose, whereas TPUs are designed with a focus on LLM workloads, meaning there's no wasted silicon: nothing's there that doesn't need to be there. The potential downside would be if a significantly different architecture arose that was difficult for TPUs to handle and easier for GPUs (given their generality). But even then, Google could probably pivot fairly quickly to a different TPU design.

fooker 6 hours ago | parent | prev | next [-]

That's exactly what Nvidia is doing with tensor cores.

bjourne 6 hours ago | parent [-]

Except the native width of a Tensor Core is about 8-32 (depending on scalar type), whereas the width of a TPU's systolic array is up to 256. The difference in scale is massive.
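
A rough back-of-the-envelope in Python (illustrative tile shapes, not vendor specs; actual sizes vary by generation and dtype, and a GPU packs many tensor cores, so this compares one unit of each design per step):

    # MACs per step for one unit of each design (illustrative only).
    def macs(m, n, k):
        # A dense m x n x k matrix-multiply tile is m*n*k multiply-accumulates.
        return m * n * k

    tensor_core_tile = macs(16, 16, 16)   # e.g. a WMMA-style 16x16x16 tile
    systolic_step    = macs(256, 256, 1)  # one cycle of a 256x256 systolic array

    print(tensor_core_tile)  # 4096
    print(systolic_step)     # 65536, i.e. 16x more per unit per step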

LogicFailsMe 6 hours ago | parent | prev | next [-]

That's pretty much what they've been doing incrementally with the data center line of GPUs versus GeForce since 2017. The data center GPUs now have up to 6 times the matrix-math performance of the GeForce chips, and much more memory. Nvidia has managed to stay one tape-out away from addressing any competitor so far.

The real challenge is getting the TPU to do more general-purpose computation. But that doesn't make for as good a story. And the point about Google arbitrarily raising prices as soon as they think they have the upper hand is good old-fashioned capitalism in action.

jauntywundrkind 2 hours ago | parent | prev | next [-]

Nvidia doesn't have the software stack to do a TPU.

They could make a systolic array TPU and software, perhaps. But it would mean abandoning 18 years of CUDA.

The top post right now is talking about the TPU's colossal advantage in scaling & throughput. Ironwood is already massively bigger & faster than what Nvidia is shooting for, and that's a huge advantage. But imo that is a replicable win: throw gobs more at networking and scaling, and Nvidia could do something similar with their architecture.

The architectural win of the TPU is more interesting. Google sort of has a working, super-powerful Connection Machine CM-1. The systolic array is a lot of (semi-)independent units that communicate with their nearest neighbors, and there's incredible work going on to figure out how to map problems onto these arrays.

Whereas on a GPU, main memory is used to transfer intermediate results. It doesn't really matter who picks up the work; there are lots of worklets with equal access time to that bit of main memory. The actual situation is a little more nuanced (even in consumer GPUs there are really multiple different memories, which creates some locality), but there's much less need for data locality on a GPU. The whole premise of the TPU, by contrast, is to exploit data locality: sending data to a neighbor is cheap, while storing and retrieving data from main memory is slower and much more energy-intensive.

CUDA takes advantage of, and relies strongly on, the GPU's main memory being (somewhat) globally accessible. There are plenty of workloads folks do in CUDA that would never work on a TPU, on these much more specialized data-passing systolic arrays. That's why TPUs are so amazing: they are much more constrained devices that require much more careful workload planning to get the work to flow across the 2D array of the chip.
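
To make the data-passing idea concrete, here's a toy NumPy sketch of an output-stationary systolic matmul: each PE (i, j) keeps a local accumulator and only ever consumes operands arriving from its left and top neighbors on a skewed schedule. Purely illustrative, not Google's actual microarchitecture:

    import numpy as np

    def systolic_matmul(A, B):
        # PE (i, j) accumulates C[i, j] locally; no shared memory in the
        # inner loop, only values handed over from adjacent PEs.
        n, k = A.shape
        k2, m = B.shape
        assert k == k2
        C = np.zeros((n, m))
        # Skewed schedule: at time t, PE (i, j) receives A[i, t-i-j] from
        # the left and B[t-i-j, j] from above.
        for t in range(n + m + k):
            for i in range(n):
                for j in range(m):
                    kk = t - i - j
                    if 0 <= kk < k:
                        C[i, j] += A[i, kk] * B[kk, j]
        return C

    A = np.arange(12.0).reshape(3, 4)
    B = np.arange(8.0).reshape(4, 2)
    assert np.allclose(systolic_matmul(A, B), A @ B)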

Google's work on projects like XLA and IREE is a wonderful & glorious general pursuit of how to map these big crazy machine learning pipelines down onto specific hardware. Nvidia could make their own or join forces here. And perhaps they will. But the CUDA moat would have to be left behind.
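
As a small illustration of that mapping pipeline (a sketch assuming a recent JAX install; the printed IR varies by version), jax.jit hands the computation to XLA, which then decides how to tile it onto whichever backend is present, be it TPU systolic arrays, GPU tensor cores, or CPU vector units:

    import jax
    import jax.numpy as jnp

    def layer(x, w):
        # matmul + elementwise op: exactly the kind of pair XLA likes to fuse
        return jax.nn.relu(x @ w)

    x = jnp.ones((128, 512))
    w = jnp.ones((512, 256))

    # Lower without executing: this is the hardware-agnostic IR that XLA
    # later specializes for the chosen backend.
    lowered = jax.jit(layer).lower(x, w)
    print(lowered.as_text()[:400])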

blibble 6 hours ago | parent | prev | next [-]

the entire organisation has been built over the last 25 years to produce GPUs

turning a giant lumbering ship around is not easy

sbarre 6 hours ago | parent [-]

For sure, I did not mean to imply they could do it quickly or easily, but I have to assume that internally at Nvidia there's already work happening to figure out "can we make chips that are better for AI and cheaper/easier to make than GPUs?"

coredog64 2 hours ago | parent [-]

Isn't that a bit like Kodak knowing that digital cameras were a thing but not wanting to jeopardize their film business?

sojuz151 6 hours ago | parent | prev | next [-]

They'd lose their competitive advantage: they'd have nothing to offer beyond what Google already has in-house.

numbers_guy 6 hours ago | parent | prev | next [-]

Nothing in principle. But Huang probably doesn't believe in hyper-specializing their chips at this stage, because it's unlikely that the compute demands of 2035 are something we can predict today. For a counterpoint, Jim Keller took Tenstorrent in the opposite direction: their chips are also very efficient, but even more general-purpose than NVIDIA's.

mindv0rtex 4 hours ago | parent [-]

How is Tenstorrent h/w more general purpose than NVIDIA chips? TT hardware is only good for matmuls and some elementwise operations, and plain sucks for anything else. Their software is abysmal.

llm_nerd 6 hours ago | parent | prev | next [-]

For users buying H200s for AI workloads, the "ASIC" tensor cores deliver the overwhelming bulk of performance. So they already do this, and have been since Volta in 2017.

To put it into perspective, the tensor cores deliver about 2,000 TFLOPs of FP8, and half that for FP16, all of it tensor FMA/MAC (which comprises the bulk of compute for AI workloads). The CUDA cores -- the rest of the GPU -- deliver on the order of 70 TFLOPs.
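
Taking those figures at face value (the numbers above, not official specs), the ratio works out roughly like this:

    fp8_tensor  = 2000.0          # TFLOPs from tensor cores at FP8 (per above)
    fp16_tensor = fp8_tensor / 2  # about 1000 TFLOPs at FP16
    cuda_cores  = 70.0            # TFLOPs from the general-purpose CUDA cores

    print(fp8_tensor / cuda_cores)   # ~29x
    print(fp16_tensor / cuda_cores)  # ~14x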

So if data centres are buying Nvidia hardware for AI, they already are buying focused TPU-like chips that almost incidentally include some other hardware that can do other stuff.

I mean, GPUs still have a lot of non-tensor general uses in the sciences, finance, etc, and TPUs don't touch that, but yes a lot of nvidia GPUs are being sold as a focused TPU-like chip.

sorenjan 6 hours ago | parent [-]

Is it the CUDA cores that run the vertex/fragment/etc. shaders in normal GPUs? Where do the ray tracing units fit in? How much of a modern Nvidia GPU is general-purpose vs. specialized to graphics pipelines?

qcnguy 9 minutes ago | parent [-]

A datacenter GPU has next to nothing left related to graphics; you can't use one to render graphics. It's purely a machine for running compute kernels.

sofixa 6 hours ago | parent | prev [-]

> what prevents Nvidia from doing the same thing and iterating on their more general-purpose GPU towards a more focused TPU-like chip as well, if that turns out to be what the market really wants.

Nothing prevents them per se, but it would risk cannibalising their highly profitable (IIRC 50% margin) higher-end cards.