We should remember that these structural diagrams are _not_ necessarily what NVIDIA actually has as hardware. NVIDIA carefully avoids guaranteeing that any of the entities or blocks you see in the diagrams actually _exist_. What we get is still just a mental model NVIDIA offers for us to think about their GPUs, and more specifically the SMs, rather than a simplified circuit layout.

For example, we don't know how many functional units an SM actually has; we don't know whether the "tensor core" even _exists_ as a distinct piece of hardware, or whether it is just some kind of orchestration of other functional units; and IIRC we don't know exactly what happens at the sub-warp level w.r.t. instruction issue and the like.

KeplerBoy 5 days ago

Interesting perspective. Aren't SMs basically blocked while running tensor core operations, which might hint that it's the same FPUs doing the work after all?

	einpoklum 5 days ago
		I doubt that can fully be the case, because there are other functional units on SMs, like Load/Store, ALU / Integer ops, and Special Function Units. But you may be right; we would need to consult the academic "investigatory" papers or blog posts and see whether this has been checked.