kqr 2 hours ago
What confuses me about deep nets is that there's rarely enough signal to meaningfully train such a large number of parameters. Surely 99% of those parameters are either (a) incredibly unstable, or (b) perfectly correlated with other parameters?
srean 2 hours ago
They do. There are enormous redundancies: there's a manifold over which the parameters can vary wildly yet do zilch to the output, the nonlinear analogue of a null space. Parameter instability does not worry a machine learner as much as it worries a statistician; ML folks worry about output instability. The current understanding is that this overparameterization makes reaching good configurations easier while keeping the search algorithm as simple as stochastic gradient descent.
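To make the redundancy concrete, here's a minimal sketch (plain NumPy; the two-layer setup and names are my own illustration, not from any particular source) of the rescaling symmetry in a ReLU net: scale a hidden unit's incoming weights by any c > 0 and its outgoing weights by 1/c, and the network computes exactly the same function, so entire families of wildly different parameter settings are output-equivalent.

    import numpy as np

    # Two-layer ReLU net: y = W2 @ relu(W1 @ x)
    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(64, 10))   # input -> hidden weights
    W2 = rng.normal(size=(1, 64))    # hidden -> output weights
    x = rng.normal(size=(10,))

    relu = lambda z: np.maximum(z, 0.0)
    y = W2 @ relu(W1 @ x)

    # Rescale each hidden unit's incoming weights by c_i and its
    # outgoing weights by 1/c_i. Since relu(c*z) = c*relu(z) for c > 0,
    # the function the network computes is unchanged.
    c = rng.uniform(0.1, 10.0, size=64)       # wildly different parameters
    y_scaled = (W2 / c) @ relu((c[:, None] * W1) @ x)

    print(np.allclose(y, y_scaled))           # True: same output, new weights

This rescaling family is one example of the manifold that leaves the output untouched; the optimizer only needs to reach any point on one of these equivalent families, not a unique set of weights.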