Remix clone Hacker News

	▲	manmal 4 hours ago
		We've already established in this thread that memory bandwidth isn't that much greater than the M4 Max's (about 12%?). However, I wonder whether batched inference will benefit greatly from the vastly improved compute. My guess is that parallel usage of the same model will be a couple of times faster. So single-"threaded" use won't be much better, but if you want to run a lot of batch jobs, it'd be way faster?
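		A back-of-the-envelope roofline sketch of that guess, for illustration only. It assumes each decode step streams the weights once for the whole batch, while compute scales with batch size. Every number is a hypothetical placeholder, not a measured spec of any chip: a 40 GB model, 140 GFLOPs per token, an "old" chip, and a "new" chip with 12% more bandwidth and an arbitrarily assumed 4x compute. KV-cache traffic and other overheads are ignored.

```python
# Toy roofline model of LLM decoding. All figures are hypothetical
# placeholders chosen for illustration, not real hardware specs.

WEIGHT_BYTES = 40e9       # assumed model size in bytes
FLOPS_PER_TOKEN = 140e9   # assumed FLOPs needed per generated token

OLD = {"mem_bw": 500e9, "compute": 30e12}    # hypothetical baseline chip
NEW = {"mem_bw": 560e9, "compute": 120e12}   # +12% bandwidth, assumed 4x compute


def step_time(batch, chip):
    """Seconds per decode step: the slower of weight streaming and compute."""
    t_mem = WEIGHT_BYTES / chip["mem_bw"]                  # weights read once per step
    t_compute = batch * FLOPS_PER_TOKEN / chip["compute"]  # grows with batch size
    return max(t_mem, t_compute)


def throughput(batch, chip):
    """Tokens per second summed over all sequences in the batch."""
    return batch / step_time(batch, chip)


def speedup(batch):
    """Throughput of the new chip relative to the old one at a given batch size."""
    return throughput(batch, NEW) / throughput(batch, OLD)


if __name__ == "__main__":
    for b in (1, 16, 32, 64):
        print(f"batch={b:3d}  speedup={speedup(b):.2f}x")
```

		Under these assumptions the speedup stays at the bandwidth ratio (about 1.12x) while both chips are bandwidth-bound, and only approaches the compute ratio (4x here) once the batch is large enough that the old chip is compute-bound.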
	▲	andy_ppp an hour ago
		Is this a reply to a different comment?