Language models don't "pack concepts" into the C dimension of one layer (I guess that's where the 12k number came from), nor do concepts have to be orthogonal to be viewed as distinct or separate. LLMs generally aren't trained to make distinct concepts far apart in the vector space either. The whole point of dense representations is that there's no clear separation between which concept lives where. People train sparse autoencoders to work out which neurons fire based on the topics involved. Neuronpedia demonstrates it very nicely: https://www.neuronpedia.org/.


sdenton4 a day ago | parent | next [-]

The sparse autoencoder work is /exactly/ premised on the kind of near-orthogonality that this article talks about. The idea was originally called the 'superposition hypothesis': https://transformer-circuits.pub/2022/toy_model/index.html

The SAE's job is to try to pull apart the sparse, nearly-orthogonal 'concepts' from a given embedding vector, by decomposing the dense vector into a sparsely activating overcomplete basis. They tend to find that this works well, and even allows matching embedding spaces between different LLMs efficiently.
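To make the "sparsely activating overcomplete basis" idea concrete, here is a toy sketch. It is not a trained SAE: the dictionary is random and known in advance, the encoder reuses the dictionary directions, and the sizes and the threshold are hand-picked assumptions. It only shows that a few active features can be read back out of a dense vector when the dictionary directions are nearly orthogonal.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Draw a random direction on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

rng = random.Random(0)

# Illustrative sizes (assumptions): 4x more features than dimensions.
DIM, N_FEATURES, N_ACTIVE = 256, 1024, 3
THRESHOLD = 0.55  # hand-picked encoder bias, not learned

# Overcomplete dictionary of random, hence nearly orthogonal, directions.
dictionary = [random_unit_vector(DIM, rng) for _ in range(N_FEATURES)]

# A dense vector built as the superposition of a few active features.
active = sorted(rng.sample(range(N_FEATURES), N_ACTIVE))
dense = [sum(dictionary[i][k] for i in active) for k in range(DIM)]

# Toy encoder: ReLU(feature . dense - threshold), one code per feature.
codes = [max(0.0, dot(d, dense) - THRESHOLD) for d in dictionary]
recovered = [i for i, c in enumerate(codes) if c > 0.0]

print("active features:   ", active)
print("recovered features:", recovered)
```

A real SAE learns both the dictionary and the encoder from model activations; the fixed threshold here just stands in for the learned bias that suppresses interference between features.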

	gpjanik 14 hours ago | parent [-]
		Agreed, but that's not in the C dimension of a first-layer embedding of a single token. It's across the whole model, and that's what I said in the comment above.


prmph a day ago | parent | prev [-]

Agreed, if you relax the requirement for perfect orthogonality, then, yes, you can pack in much more info. You have basically introduced additional (fractional) dimensions clustered with the main dimensions. Put another way, many concepts are not orthogonal, but have some commonality or correlation.
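A quick way to see the packing claim, as a rough sketch with arbitrary sizes chosen for illustration: draw twice as many random unit vectors as there are dimensions and check how far from orthogonal any pair actually is.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Draw a random direction on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_stats(vectors):
    """Return (max, mean) of |cosine similarity| over all distinct pairs."""
    worst, total, pairs = 0.0, 0.0, 0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            # Vectors are unit length, so the dot product is the cosine.
            c = abs(sum(a * b for a, b in zip(vectors[i], vectors[j])))
            worst = max(worst, c)
            total += c
            pairs += 1
    return worst, total / pairs

rng = random.Random(0)
DIM, COUNT = 128, 256  # illustrative: twice as many vectors as dimensions

vectors = [random_unit_vector(DIM, rng) for _ in range(COUNT)]
max_cos, mean_cos = cosine_stats(vectors)

print(f"{COUNT} vectors in {DIM} dimensions")
print(f"max  |cos| over all pairs: {max_cos:.3f}")
print(f"mean |cos| over all pairs: {mean_cos:.3f}")
```

Only 128 vectors can be exactly orthogonal in 128 dimensions, but once a small nonzero cosine is tolerated, many more fit, and the typical overlap shrinks roughly like 1/sqrt(DIM) as the dimension grows.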

So nothing earth shattering here. The article is also filled with words like "remarkable", "fascinating", "profound", etc. that make me feel like some level of subliminal manipulation is going on. Maybe some use of an LLM?


gpjanik a day ago | parent [-]

It's... really not what I meant. This requirement does not have to be relaxed, it doesn't exist at all.

Semantic similarity in embedding space is a convenient accident, not a design constraint. The model's real "understanding" emerges from the full forward pass, not the embedding geometry.

	prmph a day ago | parent [-]
		I'm speaking in more general conceptual terms, not about the specifics of LLM architecture.