▲ | cdavid 3 days ago | |
I agree it is confusing, because starting with notation will confuse you. I personally don't like the partial derivative-first definition of those concepts, as it all sounds a bit arbitrary. What made sense to me is to start from the definition of derivative (the best linear approximation in some sense), and then everything else is about how to represent this. vectors, matrices, etc. are all vectors in the appropriate vector space, the derivative is always the same form in a functional form, etc. E.g. you want the derivative of f(M) ? Just write f(M+h) - f(M), and then look for the terms in h / h^2 / etc. Apply chain rules / etc. for more complicated cases. This is IMO a much better way to learn about this. As for notation, you use vec/kronecker product for complicated cases: https://janmagnus.nl/papers/JRM093.pdf |