Remix clone Hacker News

new | show | ask | jobs Github

	▲	Ross00781 8 hours ago
		Diffusion-based reasoning is fascinating - curious how it handles sequential dependencies vs traditional autoregressive. For complex planning tasks where step N heavily depends on steps 1-N, does the parallel generation sometimes struggle with consistency? Or does the model learn to encode those dependencies in a way that works well during parallel sampling?