magicalhippo 2 days ago

I'm having a very dense moment, I think, and it's been far too long since my statistics courses.

They state the output of a neuron j is a log density P_j(y), where y is a latent variable.

But how does the output from the previous layer, x, come into play?

I guess I was expecting some kind of conditional probability, i.e. the output is P_j(y | x) or something.

Again, perhaps trivial. Just struggling to figure out how it works in practice.

magicalhippo 2 days ago

Clearly I wasn't in neural net mode. I take it, then, that the learned parameters (the means, variances, and mixing coefficients) are effectively functions of the output of the previous layer.

zakeria a day ago

Thanks - that's correct: the Gaussian mixture parameters (mu, sigma, pi) are learned as functions of the input from the previous layer. So it's still a feedforward net: the activations x from the previous layer determine the mixture parameters at the next layer.
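
Roughly, in numpy, one neuron's forward pass might look like the sketch below (the weight names W_mu, W_logsig, W_pi are placeholders I made up, not from the paper):

```python
import numpy as np

def mixture_params(x, W_mu, W_logsig, W_pi):
    # One neuron's K mixture parameters, each a learned linear
    # function of the previous layer's activations x (shape (d,)).
    mu = W_mu @ x                        # component means, shape (K,)
    sigma = np.exp(W_logsig @ x)         # std devs kept positive via exp
    logits = W_pi @ x
    pi = np.exp(logits - logits.max())   # mixing coefficients via softmax
    return mu, sigma, pi / pi.sum()
```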

The reason the neuron’s output is written as a log-density P_j(y) is just to emphasize the probabilistic view: each neuron is modeling how likely a latent variable y would be under its mixture distribution.
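
Continuing the sketch above, the neuron's output would then be the log-likelihood of y under that mixture, typically computed with a log-sum-exp so that small component densities don't underflow:

```python
import numpy as np

def neuron_log_density(y, mu, sigma, pi):
    # log P_j(y): log-likelihood of the latent y under the neuron's
    # Gaussian mixture, summing over components via log-sum-exp.
    log_comps = (np.log(pi)
                 - 0.5 * np.log(2 * np.pi * sigma**2)
                 - 0.5 * ((y - mu) / sigma)**2)
    m = log_comps.max()
    return m + np.log(np.exp(log_comps - m).sum())

# e.g. with parameters produced by mixture_params above:
rng = np.random.default_rng(0)
x = rng.normal(size=5)
W = [0.1 * rng.normal(size=(3, 5)) for _ in range(3)]
mu, sigma, pi = mixture_params(x, *W)
print(neuron_log_density(0.5, mu, sigma, pi))
```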