This is way outside of my expertise, can anyone give a TL;DR or ai;dr?

Diffusion and flow matching models generate samples by iterative denoising. Iterative denoising means passing input to the neural network, running a forward pass, and taking the output back as input and rerunning the neural network. Often you do this 100 times, which is slow and expensive.
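As a rough sketch of the loop described above (not any particular library's sampler), the following uses plain Euler steps and a hypothetical `velocity` function as a stand-in for the trained network. The toy dynamics and the names `velocity` and `sample_iteratively` are assumptions made purely for illustration.

```python
import numpy as np

calls = 0  # counts simulated "forward passes"

def velocity(x, t):
    """Hypothetical stand-in for a trained network's forward pass.

    A real model would be a neural net; here we use the toy field v(x, t) = -x.
    """
    global calls
    calls += 1
    return -x

def sample_iteratively(x0, num_steps=100):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps."""
    x = x0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity(x, t)  # feed the output back in as the next input
    return x

rng = np.random.default_rng(0)
noise = rng.standard_normal(4)   # start from Gaussian noise
sample = sample_iteratively(noise)
print(calls)                     # one network call per step
```

The cost scales linearly with `num_steps`, because every step needs its own call to the network.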

Flow maps / consistency models / shortcut models instead try to learn to compress this iterative work into 1 forward pass. Compared with a 100-step sampler, this makes inference roughly 100x faster, since you'd only need to run the neural net forward pass once. Beyond speeding up inference, there are other advanced benefits to this, such as improved ability to perform inference-time steering.
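To make the "one forward pass" idea concrete, here is a toy sketch in which the flow map happens to be known in closed form, so no learning is involved. For the assumed dynamics dx/dt = -x, the jump from time s to time t is x * exp(-(t - s)). The names `flow_map` and `euler` are illustrative; a real flow map would be a trained network that only approximates this jump.

```python
import numpy as np

def euler(x, s, t, num_steps):
    """Iterative baseline: num_steps Euler steps of dx/dt = -x from s to t."""
    dt = (t - s) / num_steps
    for _ in range(num_steps):
        x = x + dt * (-x)
    return x

def flow_map(x, s, t):
    """Exact jump from time s to time t for dx/dt = -x (one evaluation)."""
    return x * np.exp(-(t - s))

x0 = np.array([1.0, -2.0, 0.5])
many_steps = euler(x0, 0.0, 1.0, num_steps=100)
one_step = flow_map(x0, 0.0, 1.0)

# The two agree up to the Euler discretization error.
print(np.max(np.abs(many_steps - one_step)))
```

In this toy case the single evaluation is exact, while the 100-step loop carries a small discretization error; with a learned flow map the trade-off is the other way around, since the one-step network is itself an approximation.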

Mathematically, learning a flow map corresponds to learning to solve an ordinary differential equation, i.e., learning the time integral of the velocity field. This mathematical foundation provides the basis for various training objectives for learning flow maps, which involve self-referential identities or identities such as the transport equation, which are discussed in the blog post.
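In symbols, one common way to write this down is sketched below. The notation is an assumption and may differ from the blog post: v is the velocity field, x_t is the ODE solution, and X_{s,t} is the flow map that jumps from time s to time t.

```latex
\begin{align*}
\frac{\mathrm{d}x_t}{\mathrm{d}t} &= v(x_t, t)
  && \text{(the ODE the sampler integrates)} \\
X_{s,t}(x_s) &= x_t = x_s + \int_s^t v(x_u, u)\,\mathrm{d}u
  && \text{(flow map: time integral of the velocity)} \\
X_{u,t}\bigl(X_{s,u}(x)\bigr) &= X_{s,t}(x)
  && \text{(self-referential composition identity)} \\
\partial_s X_{s,t}(x) + \nabla_x X_{s,t}(x)\, v(x, s) &= 0
  && \text{(transport-equation form)}
\end{align*}
```

The last line follows from differentiating X_{s,t}(x_s) = x_t with respect to s along a trajectory: the right-hand side does not depend on s, so the total derivative of the left-hand side must vanish.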

Hope that helps! I'm an ML researcher currently researching flow maps.

cshimmin 5 hours ago | parent | next [-]

Very helpful! Naïve question (I haven’t had a chance to read TFA at all and diffusion/flow models are not my area of expertise). Doesn’t learning the integral/solution of the diffusion process in a single pass just take us back to something like the OG generative CNNs that we had before diffusion models took over? Surely the answer is “no” but would love to hear your framing as to why.

benanne 4 hours ago | parent [-]

It kind of does! In the modern era of generative modelling, it seems like we rely on pre-training to capture the data distribution, and then on post-training (and various other tricks) to carve out a sliver of that distribution that we actually care about (i.e. what we want our model to generate). To be able to specify that subset with relatively few examples, a good high-level understanding of the data distribution is necessary. The way I see it, training a diffusion model gets you to that point, and then once you've selected the part of the distribution you actually care about, you can distill it down quite aggressively, because you no longer need all of that computation to model a much simpler distribution (sometimes all the way to one step, but usually it's a few steps in practice).

richard___ 4 hours ago | parent | prev [-]

Why is self-distillation necessary? Why can't they get the ground-truth for "skipping" steps?

anvuong 6 hours ago | parent | prev | next [-]

This provides a high-level overview of diffusion models, you know, the models behind Stable Diffusion, Gemini banana, etc.

I haven't read it carefully but I think it's pretty comprehensive. It goes from the SDE to the flow matching formulation, and covers different perspectives on constructing flow maps, e.g. the x-formulation or the v-formulation. It also deals with distillation and consistency, which are used for fast sampling.

Overall, it's a good read if you are new to the field.

refulgentis 6 hours ago | parent | prev [-]

Why not put it into an AI yourself? :) I'd rather we avoided a precedent of asking for it and N people replying with their own favorite AI version. The comments section would end up a ghost town.

Extreme TL;DR: Diffusion models are like getting f(x) by calculating and summing f'(0), f'(1), ..., f'(x). Flow maps are like just calculating f(x).

_doctor_love 6 hours ago | parent | next [-]

HN is a place where it's legitimate to ask those kinds of questions. The site has a high concentration of advanced practitioners -- in my experience it is not uncommon for the creator of a technology or a deep expert to reply. John Carmack has an account on the site, for instance. :)

charcircuit 5 hours ago | parent | next [-]

Why take a gamble hoping that one of those experts takes the time to reply to you when you can instantly get an answer by asking AI?

pipe2devnull 4 hours ago | parent [-]

Maybe someone wants to hear from an actual human rather than risk hearing a plausible but potentially incorrect answer from AI.

refulgentis 4 hours ago | parent | prev [-]

[dead]

tekacs 4 hours ago | parent | prev [-]

We've all seen that AI can give you plausible but incorrect answers. Having an expert read it or use AI on it and interpret and validate it before posting would be most welcome IMO.