roenxi 4 days ago

> The GPU and CPU share memory, that doesn't mean you don't need to interact with the GPU, anymore.

But we already have software that talks to the GPU: mesa3d and the ecosystem around it. It has existed for decades. My understanding was that the main reason not to use it was that memory management was too complicated, and CUDA solved that problem.

If memory gets unified, what is the value proposition of ROCm supposed to be over mesa3d? Why does AMD need to invent some new way to communicate with GPUs? Why would it be faster?

SwellJoe 4 days ago | parent | next [-]

> CUDA solved that problem.

CUDA is a proprietary Nvidia product. CUDA solved the problem for Nvidia chips.

On AMD GPUs, you use ROCm. On Intel, you use OpenVINO. On Apple silicon you use MLX. All work fine with all the common AI tasks you'd want to do on self-hosted hardware. CUDA was there first and so it has a more mature ecosystem, but, so far, I've found 0 models or tasks I haven't been able to use with ROCm. llama.cpp works fine. ComfyUI works fine. Transformers library works fine. LM Studio works fine.

Unless you believe that Nvidia having a monopoly on inference and training of AI models is good for the world, you can't object to the other GPU makers providing a way for their chips to be used for those purposes. CUDA is a proprietary, vendor-specific solution.

Edit: But, also, Vulkan works fine on the Strix Halo. It is reliable and usually not that much slower than ROCm (and occasionally faster, somehow). Here's some benchmarks: https://kyuz0.github.io/amd-strix-halo-toolboxes/

roenxi 4 days ago | parent | next [-]

Why? What is the point of focusing on something that seems to be a memory management solution when the memory management problem theoretically just went away?

That has been one of the big themes in GPU hardware since around 2010, when AMD committed to ATI. Nvidia tried to solve the memory management problem in the software layer; AMD committed to doing it in hardware. Software was a better bet by around a trillion dollars so far, but if the hardware solutions have finally come to fruition, then why the focus on ROCm?

SwellJoe 4 days ago | parent [-]

I dunno. GPU programming and performance is above my pay grade. I assume the reason every GPU maker is investing in software is because they understand the problems to be solved and feel it's worth the investment to solve them. I like AMD because their Linux drivers are open source. I like Intel because all their stuff is Open Source. I like Nvidia notably less because none of their stuff is Open Source, not even the Linux drivers.

sabedevops 4 days ago | parent | prev [-]

The problem with ROCm, unlike CUDA, is that it doesn't run on much of AMD's own hardware, most notably their iGPUs.

SwellJoe 4 days ago | parent [-]

Yeah, that kinda sucks, but all their new-generation onboard GPUs are supported by ROCm, e.g. the Ryzen AI 395 and 400 series, which will be found in mid-to-high-end laptops, desktops, and motherboards. They seem to have realized that the reason Nvidia is kicking their ass is that people can develop with CUDA on all sorts of hardware, including their personal laptop or desktop.

dragontamer 4 days ago | parent | prev [-]

> If memory gets unified, what is the value proposition of ROCm supposed to be over mesa3d? Why does AMD need to invent some new way to communicate with GPUs? Why would it be faster?

And what about memory barriers? How do you sync up the L1/L2 caches of a CPU core with the GPU's caches?

That's exactly what ROCm provides: memory barriers that let CPU and GPU work proceed in parallel, while also giving you a mechanism for synchronization when they need to exchange data.

GPU and CPU can share memory, but they do not share caches. You need programming effort to make ANY of this work.