andyferris 3 days ago
I believe the reason it works in nonlinear cases is that the derivative is “naturally linear”: to calculate it, you consider ever-smaller regions in which the cost function is approximately linear, so exactly “how nonlinear” the cost function is elsewhere doesn’t play a role.
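A minimal numerical sketch of that point, assuming a made-up cost function (`cost`, `cost_derivative`, and `linearization_error` are illustrative names chosen here, not part of any library): the gap between the function and its tangent line, divided by the step size, should shrink as the step shrinks, however nonlinear the function is further away.

```python
import math


def cost(x):
    # Hypothetical, strongly nonlinear cost function, chosen only for illustration.
    return math.exp(x) + math.sin(5 * x)


def cost_derivative(x):
    # Analytic derivative of the illustrative cost function above.
    return math.exp(x) + 5 * math.cos(5 * x)


def linearization_error(x0, h):
    """Gap between the true cost at x0 + h and the tangent-line prediction."""
    tangent_prediction = cost(x0) + cost_derivative(x0) * h
    return abs(cost(x0 + h) - tangent_prediction)


x0 = 1.0
steps = [10.0 ** -k for k in range(1, 5)]  # 0.1, 0.01, 0.001, 0.0001

# Dividing by h shows the error vanishing faster than the step itself,
# which is what "approximately linear in a small region" means.
relative_errors = [linearization_error(x0, h) / h for h in steps]

for h, rel in zip(steps, relative_errors):
    print(f"h={h:.0e}  error/h={rel:.3e}")
```

Only the behaviour near `x0` enters the calculation; what the function does elsewhere never appears in it.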
bgnn 3 days ago
that makes a lot of sense actually. thank you.