Remix clone Hacker News


	▲	seanmcdirmid 9 days ago
		> For reference, the RTX 5080 (a consumer GPU) has 1tb of VRAM bandwidth and runs circles around the M5 Max in GPU compute benchmarks: https://browser.geekbench.com/opencl-benchmarks
		NVIDIA hampers its GPUs with non-unified graphics memory, while the M series can use nearly everything the computer has (well, you need to save 4 GB or so). It also works on airplanes and in hotel rooms. And a cheap NVIDIA server box with 64 GB of RAM (what my M3 Max laptop has)... how cheap is that?
	▲	andriy_koval 9 days ago
		I think the non-unified memory issue is solved by a software layer in datacenter settings: the model is distributed across multiple GPUs in the same server, or across multiple servers if the model is extra large.
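To make the idea of spreading a model across GPUs concrete, here is a rough sketch of one simple placement strategy: walk the layers in order and start a new device whenever the current one would overflow. The function name `shard_layers`, the layer sizes, and the 16 GB capacity are all made up for illustration; real frameworks also account for activations, KV cache, and interconnect costs, which this ignores.

```python
def shard_layers(layer_sizes_gb, device_capacity_gb):
    """Assign consecutive layers to devices, opening a new device
    whenever the current one cannot hold the next layer."""
    if any(size > device_capacity_gb for size in layer_sizes_gb):
        raise ValueError("a single layer does not fit on one device")

    placement = [[]]  # placement[d] = indices of the layers held by device d
    used = 0.0        # memory already used on the current device, in GB
    for idx, size in enumerate(layer_sizes_gb):
        if used + size > device_capacity_gb:
            placement.append([])  # move on to the next device
            used = 0.0
        placement[-1].append(idx)
        used += size
    return placement


# Hypothetical numbers: 8 layers of 6 GB each (48 GB total) on 16 GB cards.
plan = shard_layers([6.0] * 8, device_capacity_gb=16.0)
print(plan)  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

With these assumed numbers the 48 GB model needs four 16 GB devices, since only two 6 GB layers fit on each card. A unified-memory machine with enough RAM would hold the same layers in a single address space.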