cshimmin 5 hours ago

Very helpful! Naïve question (I haven't had a chance to read TFA at all, and diffusion/flow models are not my area of expertise): doesn't learning the integral/solution of the diffusion process in a single pass just take us back to, like, the OG generative CNNs that we had before diffusion models took over? Surely the answer is "no", but would love to hear your framing as to why.

benanne 4 hours ago | parent

It kind of does! In the modern era of generative modelling, it seems like we rely on pre-training to capture the data distribution, and then on post-training (and various other tricks) to carve out a sliver of that distribution that we actually care about (i.e. what we want our model to generate).

To be able to specify that subset with relatively few examples, a good high-level understanding of the data distribution is necessary. The way I see it is that training a diffusion model gets you to that point, and then, once you've selected the part of the distribution you actually care about, you can distill it down quite aggressively, because you no longer need all of that computation to model a much simpler distribution (sometimes all the way to one step, but usually it's a few steps in practice).
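To make the distillation idea concrete, here's a deliberately tiny 1-D caricature (my sketch, not anything from the paper): the "teacher" is an iterative sampler that takes 50 small denoising-style steps toward a toy data distribution, and the "student" is a one-step affine map fit by least squares to reproduce the teacher's final output directly from the initial noise. All the names and the toy dynamics are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "teacher": an iterative sampler that pulls an initial noise sample
# toward the data distribution (here a point mass at 3.0) in many small steps.
DATA_MEAN = 3.0

def teacher_sample(x0, n_steps=50):
    x = x0
    for _ in range(n_steps):
        x = x + 0.1 * (DATA_MEAN - x)  # one small denoising-style step
    return x

# "Distillation": generate (noise, teacher-output) pairs, then fit a
# one-step student x_hat = a * x0 + b by least squares, collapsing the
# 50-step trajectory into a single map.
x0 = rng.normal(0.0, 1.0, size=1000)
y = teacher_sample(x0)
A = np.stack([x0, np.ones_like(x0)], axis=1)
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)

# The student matches the teacher on fresh noise in a single evaluation.
x_test = rng.normal(0.0, 1.0, size=100)
err = np.max(np.abs((a * x_test + b) - teacher_sample(x_test)))
print(f"max one-step vs 50-step discrepancy: {err:.2e}")
```

Because this toy teacher is affine in the initial noise, one step can match it exactly; a real diffusion model's sampler is not, which is why distillation usually lands at a few steps rather than one.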