Remix clone Hacker News

	Alifatisk 3 days ago
		I think it's because of a combination of the MoE model architecture and inference being done in large batches that run in parallel.