AMD now has the 32 GiB Radeon AI Pro 9700. Four of these (just under 2k each) would put you at 128 GiB of VRAM.

VRAM is not everything - GPU cores also matter (a lot) for inference.

	lostmsu 16 hours ago
		4x Radeon will have significantly more GPU power than, say, a Mac Studio or a DGX Spark.
	cyanydeez 15 hours ago
		Inference speed is like monitor Hz: sure, going from 60 to 120 Hz is noticeable, but unless your model is AGI, at some point you're just generating more code than you'll ever realistically be able to control, audit, and rely on. So context is probably worth more per dollar for programming than inference speed.