Remix.run Logo
scotty79 2 days ago

It's iterative in a sense of solving differential equation iteratively. While recurrent networks are iterative in sense of putting a for loop around a bunch of if-s.

kelseyfrog 2 days ago | parent [-]

It's also in the sense that initial latent vector is Gaussian noise. The transformer loop is de-noising latent space. They just happen to be doing the equivalent of predicting x_0 directly.