MoE is excellent for unified-memory inference hardware like the NVIDIA DGX Spark, Apple Mac Studio, etc. The large memory pool means you can fit quite a few billion parameters, while the small active experts keep those tokens flowing fast — you pay memory-bandwidth cost only for the experts routed per token, not for the whole model.
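A quick back-of-envelope sketch of why this works: total parameters set the memory you need to *hold* the model, but only the active parameters get *read* per decode step, so the bandwidth-bound token rate depends on the active count. All the numbers below (model sizes, bandwidth, quantization) are illustrative assumptions, not specs for any particular box:

```python
# Rough MoE-on-unified-memory estimate. Illustrative numbers only.

def moe_estimate(total_params_b, active_params_b, bytes_per_param, mem_bandwidth_gbs):
    """Return (memory footprint in GB, bandwidth-bound decode ceiling in tokens/s)."""
    footprint_gb = total_params_b * bytes_per_param   # weights you must HOLD in memory
    active_gb = active_params_b * bytes_per_param     # weights you must READ per token
    tokens_per_s = mem_bandwidth_gbs / active_gb      # ideal ceiling; ignores KV cache, overhead
    return footprint_gb, tokens_per_s

# Hypothetical 100B-total / 5B-active MoE at 8-bit on a ~270 GB/s unified-memory machine
fp, tps = moe_estimate(total_params_b=100, active_params_b=5,
                       bytes_per_param=1.0, mem_bandwidth_gbs=270)
print(f"weights: {fp:.0f} GB, decode ceiling: ~{tps:.0f} tok/s")
```

A dense 100B model on the same machine would have to stream all 100 GB per token (~2.7 tok/s ceiling), which is the whole appeal of MoE on these big-memory, modest-bandwidth boxes.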