	▲	atwrk 9 hours ago
		Local LLM inference is all about memory bandwidth, and an M4 Pro has only about as much of it as a Strix Halo or DGX Spark. That's why the older Ultras are popular with the local LLM crowd.
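
A rough way to see the bandwidth argument: when generating one token at a time, a dense model has to stream roughly all of its weights from memory per token, so bandwidth divided by model size gives a ceiling on tokens per second. The sketch below is back-of-the-envelope only. The bandwidth numbers are approximate published peak figures that I am assuming here (they are not from the comment above), the 40 GB model size is a hypothetical stand-in for a ~70B model at 4-bit, and the function name is made up for illustration. Real throughput will be lower.

```python
# Back-of-the-envelope ceiling on decode speed for a memory-bound dense model.
# Assumption: every generated token requires reading all weights once.

def max_decode_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on tokens/s = memory bandwidth / bytes read per token."""
    return bandwidth_gb_s / model_size_gb

# Approximate peak memory bandwidth in GB/s (assumed figures, not measurements).
BANDWIDTH_GB_S = {
    "M4 Pro": 273,
    "Strix Halo": 256,
    "DGX Spark": 273,
    "M2 Ultra": 800,
}

# Hypothetical model footprint: roughly a 70B-parameter model at 4-bit.
MODEL_SIZE_GB = 40.0

for name, bandwidth in BANDWIDTH_GB_S.items():
    ceiling = max_decode_tokens_per_sec(bandwidth, MODEL_SIZE_GB)
    print(f"{name}: at most ~{ceiling:.1f} tok/s")
```

Under these assumptions the first three machines land within a few percent of each other, while the Ultra's ceiling is roughly three times higher, which is the point the comment is making.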