Remix clone Hacker News

	▲	petu 3 hours ago
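		Speculative decoding has the big model score every possible outcome in one batch (with 2 draft tokens: 0, 1, or 2 accepted) and checks where it first deviates from the draft -- so each token is verified by the big model. Draft tokens are kept only up to that first deviation, where the big model's own token is used instead. So there's no difference in output: with greedy decoding the tokens are identical, and with sampling the acceptance rule keeps the output distribution the same.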
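		A rough sketch of the greedy case, to make the "no difference in output" point concrete. Everything here is made up for illustration: `toy_target` and `toy_draft` are stand-in functions rather than real models, and the per-prefix loop stands in for what a real implementation would do in a single batched forward pass.

```python
from typing import Callable, List

Token = int
# A "model" here is just a function: prefix -> greedy next token (assumption).
Model = Callable[[List[Token]], Token]


def speculative_step(target: Model, draft: Model,
                     prefix: List[Token], k: int = 2) -> List[Token]:
    """One draft-and-verify round; returns between 1 and k+1 tokens."""
    # 1. The small model proposes k tokens autoregressively.
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The big model scores every outcome: 0, 1, ..., k draft tokens accepted.
    #    A real system gets all of these from one batched forward pass.
    verified = [target(prefix + proposed[:i]) for i in range(k + 1)]

    # 3. Keep draft tokens until the first deviation, then take the big
    #    model's token at that position and stop.
    out: List[Token] = []
    for i in range(k):
        if proposed[i] != verified[i]:
            out.append(verified[i])
            return out
        out.append(proposed[i])
    out.append(verified[k])  # all drafts accepted: one extra token for free
    return out


def generate_speculative(target: Model, draft: Model,
                         prompt: List[Token], n: int, k: int = 2) -> List[Token]:
    tokens = list(prompt)
    while len(tokens) - len(prompt) < n:
        tokens.extend(speculative_step(target, draft, tokens, k))
    return tokens[len(prompt):len(prompt) + n]


def generate_plain(target: Model, prompt: List[Token], n: int) -> List[Token]:
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(target(tokens))
    return tokens[len(prompt):]


# Hypothetical toy models: the draft agrees with the target most of the time.
def toy_target(prefix: List[Token]) -> Token:
    return (sum(prefix) * 31 + len(prefix)) % 50


def toy_draft(prefix: List[Token]) -> Token:
    guess = toy_target(prefix)
    return guess if len(prefix) % 3 else (guess + 1) % 50
```

		Running `generate_speculative(toy_target, toy_draft, [1, 2, 3], 20)` and `generate_plain(toy_target, [1, 2, 3], 20)` should give the same list, since every emitted token is either a draft token the big model agreed with or the big model's own pick. The sampling case needs a probabilistic accept/reject rule instead of the `!=` check and isn't covered here.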