GistNoesis 3 days ago

The way that really made me understand gradients and derivatives was visualizing them as arrow maps. I even made a small tool: https://github.com/GistNoesis/VisualizeGradient . This visualization helps in understanding optimization algorithms.
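
The arrow-map idea can be sketched in a few lines. This is a minimal example (not the linked tool itself): sample the gradient of f(x, y) = x^2 + y^2 on a grid, so each point gets an arrow pointing in the direction of steepest ascent, and gradient descent walks against the arrows.

```python
import numpy as np

def grad_f(x, y):
    # Analytic gradient of f(x, y) = x^2 + y^2.
    return 2.0 * x, 2.0 * y

# Grid of sample points; each gets an arrow (gx, gy).
xs, ys = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
gx, gy = grad_f(xs, ys)

# Every arrow points away from the minimum at the origin, so descent
# (-gx, -gy) always heads toward it. With matplotlib installed you
# could render the map with: plt.quiver(xs, ys, gx, gy)
```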

Jacobians can be understood as a collection of gradients, one for each coordinate of the output considered independently.
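
That picture is easy to check numerically. Here is a sketch (using finite differences to stay dependency-free; an autodiff library would do the same job) that builds the Jacobian of a hypothetical map f: R^2 -> R^2 row by row, where row i is the gradient of output coordinate i.

```python
import numpy as np

def f(v):
    x, y = v
    return np.array([x * y, x + y])  # two scalar outputs

def gradient(fi, v, eps=1e-6):
    # Numerical gradient of one scalar output fi at point v.
    g = np.zeros_like(v)
    for k in range(len(v)):
        step = np.zeros_like(v)
        step[k] = eps
        g[k] = (fi(v + step) - fi(v - step)) / (2 * eps)
    return g

v = np.array([2.0, 3.0])
# Stack the per-output gradients: that stack IS the Jacobian.
jacobian = np.stack([gradient(lambda u: f(u)[i], v) for i in range(2)])
# Analytically: [[y, x], [1, 1]] = [[3, 2], [1, 1]] at (2, 3).
```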

My mental picture for the Hessian is to associate each point with the shape of the parabola (or saddle) that best matches the function locally. It's easy to visualize once you realize it's the shape of what you see when you zoom in on the point. (Technically this mental picture is more of a simultaneous Hessian-plus-gradient-tangent-plane multivariate Taylor expansion, but I find it hard to mentally separate the slope from the curvature.)
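
A numerical sketch of the "local parabola" picture: the Hessian at a point is the curvature of the best-fitting quadratic there, and its eigenvalues tell you the shape (all positive: upward bowl; mixed signs: saddle).

```python
import numpy as np

def hessian(f, v, eps=1e-4):
    # Numerical Hessian via central differences.
    n = len(v)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = eps
            e_j = np.zeros(n); e_j[j] = eps
            H[i, j] = (f(v + e_i + e_j) - f(v + e_i - e_j)
                       - f(v - e_i + e_j) + f(v - e_i - e_j)) / (4 * eps**2)
    return H

saddle = lambda v: v[0]**2 - v[1]**2  # the classic saddle
eigs = np.linalg.eigvalsh(hessian(saddle, np.zeros(2)))
# One negative and one positive eigenvalue: what you see when you
# zoom in at the origin is a saddle shape.
```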

MathMonkeyMan 3 days ago | parent | next [-]

The "eigenchris" YouTube channel teaches tensor algebra, differential calculus, general relativity, and some other topics.

When I started thinking of vector calculus in terms of multiplying both the vector components and the corresponding basis vectors, there was a nice unification of ordinary vector operations, Jacobians, and the metric tensor.

GistNoesis 3 days ago | parent [-]

Sometimes it's useful to see the elemental transformations, but often striving for a higher-level view makes understanding easier.

This is particularly true when you try to apply it to physics, which is often where you're introduced to vector calculus for the first time: things like Maxwell's equations, or fluid mechanics.

In physics there are often additional constraints, like conserved quantities. For example, you compute a scalar, the total energy of a system, call it the Hamiltonian, auto-differentiate it with respect to the degrees of freedom, and you get very complex vector equations.
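
A toy version of that recipe, assuming a 1-D harmonic oscillator: write the scalar H(q, p) = p^2/2 + q^2/2, differentiate it (numerically here, autodiff in practice), and Hamilton's equations dq/dt = dH/dp, dp/dt = -dH/dq fall out.

```python
import numpy as np

def H(q, p):
    return 0.5 * p**2 + 0.5 * q**2  # total energy (the Hamiltonian)

def partial(f, x, eps=1e-6):
    # Central-difference derivative of a scalar function of one variable.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

q, p = 1.0, 0.0
dq_dt = partial(lambda pp: H(q, pp), p)   #  dH/dp = p
dp_dt = -partial(lambda qq: H(qq, p), q)  # -dH/dq = -q
# At (q, p) = (1, 0): dq/dt = 0 and dp/dt = -1, i.e. the spring pulls back.
```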

But taking a step back, you realise it's just a field of "objects" of a certain type, and you are just locally comparing these "objects" to their neighbors. The whole mess of vector calculus reduces to the directions in which you rotate, stretch, and project these objects. (As an example, you can imagine a field of balls in various orientations, whose energy is defined by the difference in orientation between neighboring balls.)
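
The field-of-balls example can be made concrete. This sketch assumes an XY-model-style energy: each site carries an orientation angle, and the energy sums a (1 - cos) penalty over the orientation difference between neighboring sites.

```python
import numpy as np

def energy(theta):
    # theta: 1-D array of orientations; each site is compared
    # only to its right-hand neighbor.
    return np.sum(1.0 - np.cos(np.diff(theta)))

aligned = np.zeros(5)                  # all balls point the same way
twisted = np.linspace(0.0, np.pi, 5)   # orientations wind by pi

# The aligned field costs nothing; the twisted one pays for
# every mismatch between neighbors.
```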

Once you wrap your head around the idea that the whole point of vector calculus (and why it was invented) is to describe a state as an integral (a sum of successive linear infinitesimal transformations), this makes more sense. These "objects" are constrained and continuous, moving in infinitesimal steps along the tangent space (and then reprojecting to satisfy the constraints, or exponentiating in a Clifford algebra to stay in the space).

All the infinitesimal transformations are linear, and linear transformations rotate, stretch, mirror, and project to various degrees.
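
The "exponentiate to stay in the space" point has a simple matrix sketch (using a plain matrix exponential rather than a Clifford algebra): an infinitesimal rotation is a skew-symmetric matrix, and exponentiating it gives a genuine rotation, so finite motion never leaves the constraint set.

```python
import numpy as np
from scipy.linalg import expm

# Generator of a 2-D rotation by 0.3 rad: skew-symmetric, i.e. an
# infinitesimal rotation living in the tangent space at the identity.
skew = np.array([[0.0, -0.3],
                 [0.3,  0.0]])

R = expm(skew)  # matrix exponential: a finite rotation

# R is orthogonal with determinant 1: a pure rotation, no stretching,
# so it stays exactly on the constraint set of rotations.
```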

uoaei 3 days ago | parent | prev [-]

I'm also a visual learner and my class on dynamical systems really put a lot into perspective, particularly the parts about classifying stable/unstable/saddle points by finding eigenvectors/values of Jacobians.
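
That classification can be sketched for a linearized system dx/dt = J x: the eigenvalues of the Jacobian J at a fixed point decide its type (all real parts negative: stable; all positive: unstable; mixed signs: saddle).

```python
import numpy as np

# A hypothetical Jacobian at a fixed point: one growing direction,
# one shrinking direction.
J_saddle = np.array([[1.0,  0.0],
                     [0.0, -2.0]])

eigs = np.linalg.eigvals(J_saddle)
# Mixed signs of the real parts: trajectories approach along one
# eigenvector and escape along the other -- a saddle point.
```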

A lot of optimization theory becomes intuitive once you work through a few of those and compare your understanding to arrow maps like you suggest.