causal 8 hours ago

You seem to be going off the title, which is plainly incorrect and not what the paper says. The paper demonstrates HOW different models can learn similar representations due to "data, architecture, optimizer, and tokenizer".

"How Different Language Models Learn Similar Number Representations" (actual title) is distinctly different from "Different Language Models Learn Similar Number Representations" - the latter implying some immutable law of the universe.

dnautics 6 hours ago | parent | next [-]

> latter implying some immutable law of the universe

I think the implication is slightly weaker -- it implies some immutable law of training datasets?

NooneAtAll3 3 hours ago | parent | prev [-]

I don't understand your argument

"How X happens" still implies that X happens; it just adds additional explanation on top

causal 2 hours ago | parent [-]

"How" = it can happen

Without "How" = it will happen