	Balinares 4 hours ago
		Isn't that exactly how draft models speed up inference, though? Verifying a batch of drafted tokens takes a single parallel pass of the large model, which is significantly faster than generating them one at a time.
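To make the draft-model point concrete, here is a toy sketch of greedy speculative decoding. Everything in it is an assumption for illustration: `target_next` and `draft_next` are hypothetical stand-ins for a large and a small model, and a "pass" is just a counter. In a real transformer the verification step scores all drafted positions in one parallel forward pass, which is where the speedup comes from. Real systems that sample instead of decoding greedily use a rejection-sampling acceptance rule, which this sketch does not cover.

```python
def target_next(ctx):
    # Hypothetical "large" model: a deterministic toy rule over the context.
    return (sum(ctx) * 3 + len(ctx)) % 10


def draft_next(ctx):
    # Hypothetical "small" model: agrees with the target except when the
    # context length is a multiple of 4, where it is deliberately wrong.
    tok = target_next(ctx)
    return (tok + 1) % 10 if len(ctx) % 4 == 0 else tok


def generate_plain(prompt, n_new):
    # Baseline: one target pass per generated token.
    seq, passes = list(prompt), 0
    for _ in range(n_new):
        seq.append(target_next(seq))
        passes += 1
    return seq, passes


def generate_speculative(prompt, n_new, k=4):
    seq, passes = list(prompt), 0
    goal = len(prompt) + n_new
    while len(seq) < goal:
        # 1. The cheap draft model proposes k tokens sequentially.
        draft = []
        for _ in range(k):
            draft.append(draft_next(seq + draft))
        # 2. The target scores all k+1 positions; counted as ONE pass because
        #    a real model would evaluate them in parallel.
        passes += 1
        verified = [target_next(seq + draft[:i]) for i in range(k + 1)]
        # 3. Accept the longest prefix where draft and target agree.
        n_ok = 0
        while n_ok < k and draft[n_ok] == verified[n_ok]:
            n_ok += 1
        seq.extend(draft[:n_ok])
        # 4. Append the target's own token: a correction at the first
        #    mismatch, or a free extra token if every draft was accepted.
        seq.append(verified[n_ok])
    return seq[:goal], passes


plain_seq, plain_passes = generate_plain([1, 2, 3], 12)
spec_seq, spec_passes = generate_speculative([1, 2, 3], 12, k=4)
print("same output:", plain_seq == spec_seq)
print("target passes:", plain_passes, "vs", spec_passes)
```

The output is identical to plain greedy decoding because every accepted token is one the target would have produced anyway; the draft model only changes how many target passes are needed, not what gets generated.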