▲ | gpjanik 2 days ago
Language models don't "pack concepts" into the C dimension of one layer (I guess that's where the 12k number came from), nor do concepts have to be orthogonal to be viewed as distinct or separate. LLMs generally aren't trained to push distinct concepts far apart in the vector space either. The whole point of dense representations is that there's no clear separation between which concept lives where. People train sparse autoencoders to work out which neurons fire based on the topics involved. Neuronpedia demonstrates it very nicely: https://www.neuronpedia.org/.
▲ | sdenton4 a day ago
The sparse autoencoder work is /exactly/ premised on the kind of near-orthogonality that this article talks about. It was originally called the 'superposition hypothesis': https://transformer-circuits.pub/2022/toy_model/index.html The SAE's job is to try to pull apart the sparse, nearly-orthogonal 'concepts' from a given embedding vector by decomposing the dense vector into a sparsely activating overcomplete basis. They tend to find that this works well, and even allows matching embedding spaces between different LLMs efficiently.
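To make the "sparsely activating overcomplete basis" idea concrete, here is a toy sketch in NumPy. It is not a trained SAE: the dictionary `D` is just random unit vectors that we already know, whereas a real SAE has to learn its directions from data. The sizes `d`, `n`, `k` are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): embedding dim, number of "concepts",
# and how many concepts are active in one embedding.
d, n, k = 512, 2048, 3

# Overcomplete dictionary: n random unit vectors in d dimensions (n > d).
# Hypothetical stand-in for the feature directions an SAE would learn.
D = rng.standard_normal((n, d))
D /= np.linalg.norm(D, axis=1, keepdims=True)

# Dense embedding built from only k active concepts.
active = rng.choice(n, size=k, replace=False)
x = D[active].sum(axis=0)

# Readout: project x onto every direction and keep the k largest scores.
# Interference from the other directions is small because random
# high-dimensional unit vectors are nearly orthogonal.
scores = D @ x
recovered = np.argsort(scores)[-k:]

print(sorted(active.tolist()), sorted(recovered.tolist()))
```

The readout only works here because few concepts are active at once; with many active concepts the cross-talk between directions grows and a simple top-k projection stops being reliable.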
▲ | prmph a day ago
Agreed, if you relax the requirement for perfect orthogonality, then, yes, you can pack in much more info. You basically introduce additional (fractional) dimensions clustered with the main dimensions. Put another way, many concepts are not orthogonal, but have some commonality or correlation. So nothing earth-shattering here. The article is also filled with words like "remarkable", "fascinating", "profound", etc. that make me feel like some level of subliminal manipulation is going on. Maybe some use of an LLM?
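A quick way to see the "relax orthogonality, pack in more" point for yourself is a rough NumPy sketch like the one below. The helper `max_abs_cosine` is a hypothetical name, and the vectors are random rather than taken from any model. It puts four times as many unit vectors as dimensions into the space and reports the worst pairwise overlap.

```python
import numpy as np

def max_abs_cosine(n: int, d: int, seed: int = 0) -> float:
    """Largest |cosine| between any two of n random unit vectors in d dims."""
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((n, d))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    G = V @ V.T                # Gram matrix of pairwise cosines
    np.fill_diagonal(G, 0.0)   # ignore each vector's overlap with itself
    return float(np.abs(G).max())

# Only d vectors can be exactly orthogonal in d dimensions; here we use 4*d.
for d in (32, 128, 512):
    print(d, 4 * d, round(max_abs_cosine(4 * d, d), 3))
```

The worst-case overlap shrinks as the dimension grows, so the price of going past d vectors gets smaller in wider spaces.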