Remix.run Logo
tripplyons 2 days ago

At that point it is not following a diffusion training objective. I am aware of papers that do this, but I have not seen one that shows it as a better pretraining objective than something like v-prediction or flow matching.

mxwsn 2 days ago | parent [-]

Why is not the diffusion training objective? The technique is known as self-conditioning right? Is it an issue with conditional Tweedie's?