macawfish 2 days ago
I have a feeling this technique might make waves: https://openreview.net/forum?id=c05qIG1Z2B#discussion
tripplyons 2 days ago
There are definitely parallels between diffusion and reasoning models. The main one is that both can spend more compute at inference time to get a better solution: diffusion by using a more precise ODE solver, reasoning by using more tokens. However, because of how diffusion models are trained, they never see their own predictions as input, so they cannot learn to store information across steps. Reasoning models are the complete opposite: their own generated tokens are fed back in as input, so they can carry information from one step to the next.
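To make the "more precise ODE solver" point concrete, here is a rough toy sketch, not anything from the linked paper. It assumes a 1-D Gaussian data distribution with standard deviation `S0`, a noise schedule sigma(t) = t, and a starting noise level `T`, all chosen only for illustration. Under those assumptions the score is known in closed form, so the probability-flow ODE can be integrated with plain Euler steps and compared against the exact answer.

```python
import math

S0 = 1.0  # std of the toy 1-D Gaussian "data" distribution (assumed)
T = 1.0   # noise level at which sampling starts (assumed)


def drift(x, t):
    # Probability-flow ODE with sigma(t) = t: dx/dt = -t * score(x, t).
    # For Gaussian data the score is exact: score = -x / (S0**2 + t**2).
    return t * x / (S0**2 + t**2)


def euler_sample(x_T, n_steps):
    # Integrate from t = T down to t = 0 with n_steps uniform Euler steps.
    # The only state carried between steps is x itself.
    h = T / n_steps
    x = x_T
    for i in range(n_steps):
        t = T - i * h
        x = x - h * drift(x, t)
    return x


def exact_sample(x_T):
    # Closed-form solution of the same ODE, evaluated at t = 0.
    return x_T * S0 / math.sqrt(S0**2 + T**2)


def solver_error(n_steps, x_T=1.0):
    return abs(euler_sample(x_T, n_steps) - exact_sample(x_T))


if __name__ == "__main__":
    for n in (5, 10, 100):
        print(n, solver_error(n))
```

In this toy setup the error shrinks as the step count grows, which is the diffusion-side version of "spend longer, get a better answer". Note that the loop passes nothing forward except `x`, which matches the point that there is no extra channel for storing information across steps.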