> But how will a GPU with small-ish but fast VRAM and great compute augment a Mac with large but slow VRAM and weak compute?

It would work just like a discrete GPU when doing CPU+GPU inference: you'd run a few shared layers on the discrete GPU and place the rest in unified memory. You'd want to minimize CPU/GPU transfers even more than usual, since a Thunderbolt connection at best gives you throughput roughly equivalent to PCIe 4.0 x4.
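
To put rough numbers on that, here's a back-of-envelope sketch. It assumes the link really does behave like PCIe 4.0 x4 (16 GT/s per lane, 128b/130b encoding, so a bit under 8 GB/s one way) and ignores latency and protocol overhead. The model figures (hidden size 4096, fp16 activations, 10 GiB of weights) are made up for illustration, not taken from any particular model.

```python
# Back-of-envelope link math. All model figures are illustrative assumptions.
PCIE4_TRANSFERS_PER_LANE = 16e9   # PCIe 4.0: 16 GT/s per lane
ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding
LANES = 4                         # assumed x4-equivalent link


def link_bytes_per_s(lanes: int = LANES) -> float:
    """Theoretical one-direction bandwidth in bytes/s (no protocol overhead)."""
    return PCIE4_TRANSFERS_PER_LANE * ENCODING_EFFICIENCY * lanes / 8


def transfer_seconds(num_bytes: float, lanes: int = LANES) -> float:
    """Time to push num_bytes across the link, ignoring latency."""
    return num_bytes / link_bytes_per_s(lanes)


# Hypothetical model: one activation vector per token crosses the link
# each time execution hops between the dGPU and unified memory.
hidden_size = 4096                # assumed
bytes_per_value = 2               # fp16
activation_bytes = hidden_size * bytes_per_value

# Hypothetical alternative: streaming 10 GiB of weights to the dGPU.
weights_bytes = 10 * 1024**3

print(f"link bandwidth:     {link_bytes_per_s() / 1e9:.2f} GB/s")
print(f"one activation hop: {transfer_seconds(activation_bytes) * 1e6:.2f} us")
print(f"10 GiB of weights:  {transfer_seconds(weights_bytes):.2f} s")
```

Under those assumptions, handing a single token's activations across the link costs on the order of a microsecond, while moving weights costs on the order of a second. That's why you'd pin the layers to one side and only ship activations across.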

manmal 5 days ago | parent [-]

But isn’t the Mac Mini the weak link in that scenario?

zozbot234 5 days ago | parent [-]

It has way more unified memory than your typical dGPU.

manmal 4 days ago | parent [-]

Yes, obviously. That VRAM is also slower and has weak compute attached. Loading to the external GPU will slow things down too much.