srean 2 hours ago
They do. There are enormous redundancies. There's a manifold over which the parameters can vary wildly yet do zilch to the output, the nonlinear analogue of a null space. Parameter instability does not worry a machine learner as much as it worries a statistician; ML folks worry about output instability. The current understanding is that this overparameterization makes reaching good configurations easier while keeping the search algorithm as simple as stochastic gradient descent.
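As a concrete illustration of that nonlinear null space (a minimal NumPy sketch, not from any particular codebase): in a ReLU layer you can scale one hidden unit's incoming weights by any c > 0 and divide its outgoing weights by c, and the network computes exactly the same function, so the parameters can slide along a whole curve with zero effect on the output.

    import numpy as np

    # Tiny two-layer ReLU network: f(x) = W2 @ relu(W1 @ x).
    # Rescaling one hidden unit's incoming weights by c and its
    # outgoing weights by 1/c leaves f(x) unchanged for any c > 0,
    # so the parameters trace out a manifold of equivalent models.
    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(4, 3))   # hidden x input
    W2 = rng.normal(size=(2, 4))   # output x hidden
    x = rng.normal(size=3)

    def f(W1, W2, x):
        return W2 @ np.maximum(W1 @ x, 0.0)

    c = 7.3                        # any positive constant works
    W1_s, W2_s = W1.copy(), W2.copy()
    W1_s[0, :] *= c                # rescale one unit's inputs
    W2_s[:, 0] /= c                # undo it on that unit's outputs

    print(np.allclose(f(W1, W2, x), f(W1_s, W2_s, x)))  # True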
kqr 2 hours ago
Huh, I didn't know that! Are there efforts to automatically reduce the number of parameters once the model is trained? Or do the relationships between parameters end up too complicated to do that? I would assume such a reduction would be useful for explainability. (Asking specifically about time series models and such.) | ||||||||
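To make the question concrete, I'm imagining something like magnitude pruning, where the smallest weights are zeroed out after training. A rough NumPy sketch of what I mean (the function and the keep_fraction threshold rule are made up for illustration, not from any particular library):

    import numpy as np

    # Keep only the largest-magnitude weights of a trained weight
    # matrix and zero the rest. Purely illustrative: real pruning
    # pipelines usually fine-tune the model again after this step.
    def prune_by_magnitude(W, keep_fraction=0.2):
        k = max(1, int(keep_fraction * W.size))
        threshold = np.partition(np.abs(W).ravel(), -k)[-k]
        return np.where(np.abs(W) >= threshold, W, 0.0)

    rng = np.random.default_rng(0)
    W = rng.normal(size=(8, 8))
    W_pruned = prune_by_magnitude(W)
    print((W_pruned != 0).mean())  # roughly 20% of weights survive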