leoapagano 4 days ago

One possible advantage of this approach that no one here has mentioned yet is that it would let us put RAM on the CPU die (taking advantage of the greater memory bandwidth) while still keeping the RAM upgradable.

themafia 4 days ago | parent | next [-]

I think you'd want to go the other way.

GPU RAM is high speed and power hungry, so there tends to not be very much of it on the GPU card. That's part of the reason we keep increasing PCIe bandwidth: so the CPU can touch that GPU RAM at the highest possible speed.

It makes me wonder, though, whether a NUMA model for the GPU would be a better idea. Add more lower-power, lower-speed RAM to the GPU card, and let the CPU preload as much data as possible onto it. Then, instead of transferring textures through the CPU, onto the PCIe bus, and into the GPU, why not just send a DMA request to the GPU and ask it to move the data from its low-speed memory to its high-speed memory?

It's a whole new architecture but it seems to get at the actual problems we have in the space.
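A back-of-envelope sketch of why the preload-then-on-card-DMA idea could win. All bandwidth figures here are assumed round numbers for illustration, not measurements: PCIe 5.0 x16 at ~64 GB/s, a hypothetical wide low-power on-card pool at ~120 GB/s, GDDR-class fast memory at ~1 TB/s:

```python
# Rough model: time to stage a texture set into fast GPU memory.
# Path A: stream it over PCIe at load time.
# Path B: it was preloaded into a slow on-card pool earlier, so at load
#         time only the on-card copy (slow pool -> fast memory) remains.
# All bandwidths below are assumed round numbers, not measured values.

PCIE5_X16_GBPS = 64.0     # PCIe 5.0 x16, ~64 GB/s one direction (assumed)
ONCARD_POOL_GBPS = 120.0  # hypothetical wide low-power on-card pool (assumed)
FAST_VRAM_GBPS = 1000.0   # GDDR-class fast memory, ~1 TB/s (assumed)

def transfer_ms(size_gb: float, bandwidth_gbps: float) -> float:
    """Idealized transfer time in ms: size / bandwidth, ignoring latency."""
    return size_gb / bandwidth_gbps * 1000.0

size_gb = 2.0  # texture set for one scene
via_pcie = transfer_ms(size_gb, PCIE5_X16_GBPS)
on_card = transfer_ms(size_gb, min(ONCARD_POOL_GBPS, FAST_VRAM_GBPS))

print(f"streamed over PCIe at load time: {via_pcie:.1f} ms")
print(f"on-card DMA after preload:       {on_card:.1f} ms")
```

The point of the model: the PCIe hop happens whenever the data is needed, while the preload into the slow pool can happen during idle time, so only the (faster, on-card) copy lands on the critical path.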

kokada 4 days ago | parent [-]

Isn't what you described DirectStorage?

themafia 4 days ago | parent [-]

You're still running through the PCIe slot and its bandwidth limit. I'm suggesting you bypass even that and put more memory directly on the card.
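To put rough numbers on that ceiling (all figures assumed, not measured): filling 16 GB of VRAM over the PCIe link takes on the order of a quarter to half a second, while the card's own memory bus could move the same amount of data in a few tens of milliseconds:

```python
# How long it takes to fill 16 GB of VRAM over the PCIe link vs. over
# the card's own memory bus. Bandwidths are assumed round numbers.

LINKS_GBPS = {
    "PCIe 4.0 x16": 32.0,        # ~32 GB/s (assumed)
    "PCIe 5.0 x16": 64.0,        # ~64 GB/s (assumed)
    "on-card GDDR bus": 1000.0,  # ~1 TB/s (assumed)
}

def fill_seconds(size_gb: float, bandwidth_gbps: float) -> float:
    """Idealized fill time: size / bandwidth, ignoring protocol overhead."""
    return size_gb / bandwidth_gbps

for name, bw in LINKS_GBPS.items():
    print(f"{name:>16}: {fill_seconds(16.0, bw) * 1000:7.1f} ms")
```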

KeplerBoy 3 days ago | parent [-]

So an additional layer slower and larger than global GPU memory?

I believe that's kind of what Bolt Graphics is doing with the DIMM slots next to the soldered-on LPDDR5: https://bolt.graphics/how-it-works/

MBCook 4 days ago | parent | prev [-]

Couldn’t we do that today if we wanted to?

What’s keeping Intel/AMD from putting memory on package like Apple does other than cost and possibly consumer demand?

iszomer 3 days ago | parent [-]

Supply + demand, the manufacturing-capacity rabbit hole.