magicalhippo 2 days ago

Clearly I wasn't in neural net mode. I take it, then, that the learned parameters (the means, variances, and mixing coefficients) are effectively functions of the output of the previous layer.

zakeria a day ago | parent [-]

Thanks - that's correct: the Gaussian mixture parameters (mu, sigma, pi) are learned as functions of the input from the previous layer. So it's still a feedforward net: the activations of one layer determine the mixture parameters of the next.

The reason the neuron's output is written as a log-density, log P_j(y), is just to emphasize the probabilistic view: each neuron models how likely a latent variable y would be under its mixture distribution.
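
To make that concrete, here's a rough sketch (not from the paper; the class and head names are made up) of what "mixture parameters as functions of the previous layer" might look like in PyTorch: three linear heads map the incoming activations h to mu, sigma, and pi, and the neuron returns log P_j(y) via logsumexp over the K components.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GaussianMixtureNeuron(nn.Module):
        """One probabilistic neuron: outputs log p(y | h) under a
        K-component Gaussian mixture whose parameters are functions
        of the previous layer's activations h. (Illustrative sketch.)"""

        def __init__(self, in_dim: int, n_components: int = 3):
            super().__init__()
            self.mu_head = nn.Linear(in_dim, n_components)         # component means
            self.log_sigma_head = nn.Linear(in_dim, n_components)  # log std-devs, exp'd for positivity
            self.pi_head = nn.Linear(in_dim, n_components)         # mixing-weight logits

        def forward(self, h: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
            mu = self.mu_head(h)                              # (batch, K)
            sigma = self.log_sigma_head(h).exp()              # (batch, K), strictly positive
            log_pi = F.log_softmax(self.pi_head(h), dim=-1)   # weights sum to 1 per example

            # Per-component Gaussian log-density of the latent variable y
            comp = torch.distributions.Normal(mu, sigma)
            log_prob = comp.log_prob(y.unsqueeze(-1))         # (batch, K)

            # log P_j(y) = logsumexp_k [ log pi_k + log N(y; mu_k, sigma_k) ]
            return torch.logsumexp(log_pi + log_prob, dim=-1)  # (batch,)

    # Usage: score a latent y under parameters computed from h
    layer = GaussianMixtureNeuron(in_dim=16, n_components=3)
    h = torch.randn(8, 16)      # activations from the previous layer
    y = torch.randn(8)          # latent variable being scored
    log_density = layer(h, y)   # shape (8,): one log P_j(y) per example

The point is just that mu, sigma, and pi are not free parameters of the neuron; only the head weights are learned, and the mixture itself is recomputed from h on every forward pass, which is what keeps the whole thing an ordinary feedforward net.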