▲ llm_nerd 6 hours ago
For users buying H200s for AI workloads, the "ASIC" tensor cores deliver the overwhelming bulk of the performance. So Nvidia already does this, and has been doing it since Volta in 2017. To put it into perspective, the tensor cores deliver about 2,000 TFLOPS of FP8, and half that for FP16, and this is all tensor FMA/MAC (which comprises the bulk of compute for AI workloads). The CUDA cores -- the rest of the GPU -- deliver something more in the 70 TFLOPS range. So if data centres are buying Nvidia hardware for AI, they are already buying focused TPU chips that almost incidentally have some other hardware that can do some other stuff. I mean, GPUs still have a lot of non-tensor general uses in the sciences, finance, etc., and TPUs don't touch those, but yes, a lot of Nvidia GPUs are being sold as a focused TPU-like chip.
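To make the contrast concrete, here's a minimal CUDA sketch of the two paths: a warp-wide 16x16x16 matrix multiply-accumulate on the tensor cores via the wmma API, versus the same tile computed as plain per-thread FMAs on the CUDA cores. The kernel names and the single-tile setup are illustrative only; launch the first with one warp (32 threads) and the second with a 16x16 thread block.

    #include <cuda_fp16.h>
    #include <mma.h>
    using namespace nvcuda;

    // Tensor-core path: one warp computes D = A*B + C for a full
    // 16x16x16 tile in a handful of MMA instructions. This is the
    // operation behind the ~1,000 TFLOPS FP16 tensor figure.
    __global__ void tensor_core_mma(const half *A, const half *B, float *D) {
        wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a;
        wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b;
        wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

        wmma::fill_fragment(acc, 0.0f);
        wmma::load_matrix_sync(a, A, 16);   // leading dimension 16
        wmma::load_matrix_sync(b, B, 16);
        wmma::mma_sync(acc, a, b, acc);     // the tensor FMA/MAC
        wmma::store_matrix_sync(D, acc, 16, wmma::mem_row_major);
    }

    // CUDA-core path: the same 16x16 tile as scalar FMAs, one output
    // element per thread. This is what the ~70 TFLOPS non-tensor
    // figure is measuring, scaled across the whole chip.
    __global__ void cuda_core_mma(const half *A, const half *B, float *D) {
        int row = threadIdx.y, col = threadIdx.x;   // 16x16 block
        float acc = 0.0f;
        for (int k = 0; k < 16; ++k)
            acc = fmaf(__half2float(A[row * 16 + k]),
                       __half2float(B[k + col * 16]),  // B is col-major
                       acc);
        D[row * 16 + col] = acc;
    }

Both compile with nvcc on anything Volta (sm_70) or newer; the point is just that the first kernel retires an entire tile's worth of multiply-adds per instruction, which is where the orders-of-magnitude throughput gap comes from.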
▲ sorenjan 6 hours ago | parent
Is it the CUDA cores that run the vertex/fragment/etc. shaders in normal GPUs? Where do the ray tracing units fit in? How much of a modern Nvidia GPU is general purpose vs. specialized to the graphics pipeline?
| ||||||||