scotty79 2 days ago
It's iterative in the sense of solving a differential equation iteratively, while recurrent networks are iterative in the sense of putting a for loop around a bunch of if's.
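To make the contrast concrete, here's a toy Python sketch (my own illustration, not any real model's API; `velocity_field` and `rnn_cell` are made-up stand-ins): one loop is a numerical solver repeatedly refining the same state, the other just walks a sequence through the same cell.

    import numpy as np

    def velocity_field(x, t):
        # Hypothetical learned vector field; here a toy that pulls x toward 0.
        return -x

    def diffusion_style_iteration(x_T, num_steps=50):
        # Iterative as in numerically integrating a differential equation:
        # each pass is a small Euler step along a learned trajectory.
        x = x_T
        dt = 1.0 / num_steps
        for i in range(num_steps):
            t = 1.0 - i * dt
            x = x + dt * velocity_field(x, t)
        return x

    def rnn_cell(h, x_t):
        # Hypothetical recurrent cell: the same gated update applied once
        # per input token -- "a for loop around a bunch of if's".
        return np.tanh(h + x_t)

    def rnn_style_iteration(inputs, h0):
        h = h0
        for x_t in inputs:  # loop runs over a sequence, not a solver schedule
            h = rnn_cell(h, x_t)
        return h

    x = diffusion_style_iteration(np.random.randn(4))
    h = rnn_style_iteration(np.random.randn(8, 4), np.zeros(4))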
kelseyfrog 2 days ago | parent
It's also iterative in the sense that the initial latent vector is Gaussian noise and the transformer loop is denoising the latent space. They just happen to be doing the equivalent of predicting x_0 directly.
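Roughly like this toy DDIM-style sampler (again only a sketch: `predict_x0` stands in for the transformer and the noise schedule is made up), which starts from a Gaussian latent and at every step uses a direct x_0 prediction rather than a noise prediction:

    import numpy as np

    def predict_x0(x_t, t):
        # Stand-in for the transformer; pretends the clean sample sits near 0.
        return 0.9 * x_t

    def alpha_bar(t, T):
        # Toy schedule: ~1 at t=0 (clean), ~0 at t=T (pure noise).
        return np.cos(0.5 * np.pi * t / T) ** 2

    def sample(T=50, dim=4, seed=0):
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(dim)  # initial latent: pure Gaussian noise
        for t in range(T, 0, -1):
            ab_t, ab_prev = alpha_bar(t, T), alpha_bar(t - 1, T)
            x0_hat = predict_x0(x, t)  # model outputs x_0 directly
            # Recover the implied noise, then take a deterministic DDIM-style step.
            eps_hat = (x - np.sqrt(ab_t) * x0_hat) / np.sqrt(1.0 - ab_t)
            x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat
        return x

    print(sample())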