ricardobeat 4 hours ago

They cannot "edit" the code in place, though, the way you can with diffusion. They must either emit all the tokens again or emit a patch/diff, which is not directly connected to the previous stream of tokens.
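A rough sketch of the patch/diff route, to show what I mean by "not directly connected": the model emits a small edit as new tokens, and ordinary code outside the model applies it to the old text. The search/replace format and the names `apply_edit`, `original`, and `patched` are made up for illustration, not any particular tool's format.

```python
# Hypothetical edit format: the model emits a (search, replace) pair as new
# tokens, and plain code applies it to the earlier output. The model never
# modifies its previous token stream directly.

def apply_edit(source: str, search: str, replace: str) -> str:
    """Apply one search/replace edit; fail if the target is missing or ambiguous."""
    count = source.count(search)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return source.replace(search, replace, 1)


# Earlier model output (illustrative).
original = "def area(r):\n    return 3.14 * r * r\n"

# The "edit" is a fresh, much shorter generation applied after the fact.
patched = apply_edit(original, "3.14", "math.pi")
print(patched)
```

The alternative is regenerating the whole function, which costs more tokens but needs no separate apply step.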

lossolo 2 hours ago | parent [-]

LLMs can "edit" code, but as you say, they do it differently from diffusion models. Autoregressive LLMs operate directly on long text sequences and use much more context, which is one reason they currently work better for coding. Diffusion models for code aren't a new idea; people have tried different designs, but so far they tend to underperform autoregressive LLMs, probably because denoising over discrete tokens is harder to make work than straightforward next-token prediction.
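To make the contrast concrete, here is a toy sketch of the two decoding loops. It is only an illustration under heavy assumptions: the "models" `toy_next` and `toy_fill` are lookup stubs I made up, and filling masked positions in place stands in for one discrete-diffusion design among several. Real systems differ a lot.

```python
MASK = "<mask>"
# Illustrative target sequence that the stub "models" below simply look up.
TARGET = ["def", "add", "(", "a", ",", "b", ")", ":"]


def autoregressive_decode(next_token, prompt, max_new_tokens):
    """Append one token at a time; earlier tokens are never revised."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)
        if tok is None:  # stub's end-of-sequence signal
            break
        tokens.append(tok)
    return tokens


def masked_denoise(fill, tokens, steps):
    """Fill masked positions in place over several steps; other tokens stay put."""
    tokens = list(tokens)
    for _ in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # Fill the leftmost half of the masked positions (at least one) per step.
        for i in masked[: max(1, len(masked) // 2)]:
            tokens[i] = fill(tokens, i)
    return tokens


def toy_next(tokens):
    """Hypothetical next-token stub: returns the next TARGET token, or None."""
    return TARGET[len(tokens)] if len(tokens) < len(TARGET) else None


def toy_fill(tokens, i):
    """Hypothetical denoiser stub: returns the TARGET token for position i."""
    return TARGET[i]


ar_out = autoregressive_decode(toy_next, ["def"], max_new_tokens=10)
dn_out = masked_denoise(toy_fill, ["def", MASK, "(", "a", ",", MASK, ")", ":"], steps=4)
```

The autoregressive loop can only grow the sequence at the end, while the denoising loop rewrites chosen positions inside a fixed-length sequence. That is the "edit" distinction being argued about here.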