bigyabai 2 days ago

Which would almost be great, if the M3 Ultra's GPU weren't ~3x weaker than a single 5090: https://browser.geekbench.com/opencl-benchmarks

I don't think I can recommend the Mac Studio for AI inference until the M5 comes out. And even then, it remains to be seen how fast those GPUs are or if we even get an Ultra chip at all.

adastra22 2 days ago | parent [-]

Again, memory bandwidth is pretty much all that matters here. During inference or training, the CUDA cores of consumer GPUs sit at something like 15% utilization.
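To see why bandwidth dominates token generation: each decoded token has to stream (roughly) the full set of model weights through memory once, so bandwidth sets a hard ceiling on tokens/sec. A back-of-envelope sketch, using illustrative (not measured) bandwidth and model-size figures:

```python
# Decode-speed ceiling: tokens/sec <= memory bandwidth / bytes of weights
# read per token. All numbers below are rough assumptions for illustration.

def max_decode_tokens_per_sec(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on tokens/sec if each generated token requires
    streaming all model weights from memory once."""
    return bandwidth_gb_s / weights_gb

# Hypothetical comparison: ~800 GB/s unified memory vs a ~1800 GB/s GPU,
# serving a 70B-parameter model quantized to ~40 GB of weights.
print(max_decode_tokens_per_sec(800, 40))   # ceiling of 20.0 tokens/sec
print(max_decode_tokens_per_sec(1800, 40))  # ceiling of 45.0 tokens/sec
```

Note this is an upper bound only; it ignores KV-cache reads and assumes the compute side keeps up, which is the commenter's point about low CUDA-core utilization.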

my123 a day ago | parent | next [-]

Not for prompt processing. Current Macs are really not great at long contexts.
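The distinction matters because prefill (prompt processing) runs all prompt tokens through the model in one batch, which is compute-bound, while decode handles one token at a time and is bandwidth-bound. A rough FLOP comparison, using the common ~2 * params FLOPs-per-token approximation and an assumed 70B model:

```python
# Prefill vs decode workload, using the standard rough estimate of
# ~2 * num_params FLOPs per token processed. Figures are illustrative.

def transformer_flops(num_params: float, num_tokens: int) -> float:
    """Approximate forward-pass FLOPs for processing num_tokens tokens."""
    return 2.0 * num_params * num_tokens

params = 70e9  # hypothetical 70B-parameter model

prefill_flops = transformer_flops(params, 32_000)  # a long 32k-token prompt
decode_flops = transformer_flops(params, 1)        # one generated token

print(prefill_flops / decode_flops)  # prefill = 32,000x the compute of one decode step
```

So a machine with high memory bandwidth but weak raw compute can decode respectably yet still crawl through a long prompt, which is the complaint about Macs at long contexts.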
