Remix clone Hacker News

	otabdeveloper4 7 hours ago
		> meager hardware

		Qwen was made on a cluster about that size. And this is before anybody ever thought about optimizing the training process. (Currently it's just PyTorch analyst-as-coder slop, with extremely overprovisioned quantizations, etc.)