_hark 15 hours ago
You literally can do a kind of model PCA: take the Hessian (the matrix of second derivatives of the loss w.r.t. the parameters, i.e. the local curvature of the loss landscape) and diagonalize it. The resulting eigenvalues and eigenvectors (the spectrum of the Hessian) tend to be power-law distributed in just about every deep NN you can think of [1]. That is, there are a few "really important" (highly curved) directions in parameter space (the top eigenvectors) which control the model's performance (the loss), and very many "unimportant" low-curvature directions.

A recent interesting paper showed that "deleting" these low-curvature directions appears to correspond to removing memorized information in LLMs: reasoning performance is left unchanged, while the ability to answer questions that require memorized knowledge is reduced [2]. It also appears that models sometimes undergo dramatic transitions from memorization to perfect generalization, which corresponds to the models becoming much more compressible [3].

I'm hopeful that we'll find a way to distill models down to a core of the most useful cognitive/reasoning capabilities, and that this core will be far simpler than the current scale of LLMs. But without all that memorized world knowledge, they might need to look things up like we do!

[1]: https://openreview.net/pdf?id=o62ZzfCEwZ

[2]: https://www.goodfire.ai/research/understanding-memorization-...
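A minimal sketch of the "model PCA" idea, under toy assumptions I'm making up for illustration: a tiny numpy MLP on synthetic data, Hessian of the loss w.r.t. the flattened parameters estimated by central finite differences, then diagonalized to inspect the curvature spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem and a 1-hidden-layer tanh net (hypothetical setup).
X = rng.normal(size=(32, 3))
y = np.sin(X.sum(axis=1))

n_hidden = 4
shapes = [(3, n_hidden), (n_hidden,), (n_hidden, 1), (1,)]
sizes = [int(np.prod(s)) for s in shapes]

def unpack(theta):
    """Split a flat parameter vector back into weight/bias arrays."""
    parts, i = [], 0
    for s, n in zip(shapes, sizes):
        parts.append(theta[i:i + n].reshape(s))
        i += n
    return parts

def loss(theta):
    W1, b1, W2, b2 = unpack(theta)
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return float(np.mean((pred.ravel() - y) ** 2))

theta = rng.normal(scale=0.5, size=sum(sizes))

def hessian(f, x, eps=1e-4):
    """Central-difference estimate: H[i, j] ~ d^2 f / dx_i dx_j."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            H[i, j] = H[j, i] = (
                f(x + ei + ej) - f(x + ei - ej)
                - f(x - ei + ej) + f(x - ei - ej)
            ) / (4 * eps ** 2)
    return H

H = hessian(loss, theta)
eigvals = np.linalg.eigvalsh(H)[::-1]  # sorted descending

# Inspect how curvature is distributed across parameter-space directions:
# typically a handful of eigenvalues are much larger than the rest.
print(eigvals[:5])
```

For real models you'd never form the dense Hessian like this; the paper in [1] uses stochastic Lanczos-style estimators to get the spectrum without materializing the matrix.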