Remix clone Hacker News

	▲	fourthark 8 hours ago
		Seems like training would be a better match, where you need tons of compute but don’t care about latency.
	▲	ronsor 33 minutes ago
		No, the problem is that with training, you do care about latency, and you need a crap-ton of bandwidth too! Think of the all_gather; think of the gradients! Inference is actually easier to distribute.
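
To put a rough number on the bandwidth point, here is a back-of-envelope sketch of how much data each worker has to send per training step just to synchronize gradients. It assumes plain data parallelism with a ring all-reduce, where each worker sends about 2(N-1)/N times the gradient size. The model size, gradient precision, worker count, and link speed in the usage example are hypothetical figures picked for illustration, and the estimate ignores per-message latency and any overlap with compute.

```python
def ring_allreduce_bytes_per_worker(num_params: int, bytes_per_param: int,
                                    num_workers: int) -> float:
    """Bytes each worker sends in one ring all-reduce of the full gradient."""
    if num_workers < 2:
        return 0.0  # a single worker has nothing to synchronize
    payload = num_params * bytes_per_param
    # Reduce-scatter plus all-gather: each phase sends (N-1)/N of the payload.
    return 2 * (num_workers - 1) / num_workers * payload


def sync_seconds(num_params: int, bytes_per_param: int,
                 num_workers: int, link_bytes_per_s: float) -> float:
    """Lower bound on gradient sync time per step, ignoring latency and overlap."""
    sent = ring_allreduce_bytes_per_worker(num_params, bytes_per_param, num_workers)
    return sent / link_bytes_per_s


if __name__ == "__main__":
    # Hypothetical setup: 7e9 parameters, 2-byte gradients, 8 workers,
    # each on a 1 Gbit/s link (125e6 bytes/s).
    seconds = sync_seconds(7_000_000_000, 2, 8, 125e6)
    print(f"{seconds:.0f} s of pure transfer per step")
```

With those assumed numbers, every optimizer step costs minutes of transfer over a consumer-grade link. Inference has no gradient exchange of this kind, which is one way to read the claim that it is easier to distribute.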