LeCompteSftware 8 hours ago

"using periodic features with dominant periods at T=2, 5, 10" seems inconsistent with "platonic representation" and more consistent with "specific patterns noticed in commonly-used human symbolic representations of numbers."

Edit: to be clear I think these patterns are real and meaningful, but only loosely connected to a platonic representation of the number concept.
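To make the base-10 connection concrete, here is a minimal sketch, assuming "periodic features" means cosine/sine pairs of 2πn/T (the function name and the feature layout are illustrative, not taken from the paper). Because 2, 5 and 10 all divide 10, such features depend only on the last decimal digit of n, which is a property of the notation rather than of the number concept:

```python
import math

PERIODS = (2, 5, 10)  # the dominant periods quoted in the comment above

def periodic_features(n, periods=PERIODS):
    """Return cos/sin pairs of 2*pi*n/T for each period T (hypothetical layout)."""
    feats = []
    for t in periods:
        # Reducing n modulo T first is mathematically equivalent (cos and sin
        # are periodic) and avoids floating-point drift for large n.
        angle = 2 * math.pi * (n % t) / t
        feats.extend((math.cos(angle), math.sin(angle)))
    return feats

def close(a, b, tol=1e-9):
    """Element-wise comparison of two equal-length feature lists."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

# 7 and 1237 share a last decimal digit, so all three periods agree.
same_digit = close(periodic_features(7), periodic_features(1237))
# 7 and 8 differ in parity, so the T=2 feature already separates them.
different_digit = close(periodic_features(7), periodic_features(8))
print(same_digit, different_digit)  # prints: True False
```

In a base-12 notation the analogous "natural" periods would be 2, 3, 4, 6 and 12, which is the sense in which these features track the symbolic representation.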

ACCount37 6 hours ago | parent | next [-]

Is it an actual counterargument?

The "platonic representation" argument is "different models converge on similar representations because they are exposed to the same reality", and "how humans represent things" is a significant part of reality they're exposed to.

convolvatron 3 hours ago | parent [-]

you're right, it's just that 'platonic' implies that numbers exist in the universe as objects in and of themselves, completely independent of human reality. if we don't assume this, and instead treat numbers as a system that humans created (formalism), then sure, we can be happy that llms are picking common representations that map well onto our subjective notions of what numbers are.

brentd 6 hours ago | parent | prev [-]

Regardless of whether the convergence is superficial or not, I am interested especially in what this could mean for future compression of weights. Quantization of models is currently very dumb (per my limited understanding). Could exploitable patterns make it smarter?
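For context, the simplest baseline is round-to-nearest quantization with a single scale per tensor, which ignores any structure in the weights. A minimal sketch, assuming this is roughly what "very dumb" refers to (the function names and sample weights are illustrative):

```python
def quantize_int8(weights):
    """Symmetric per-tensor round-to-nearest quantization to the int8 range."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; guard against an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to approximate float weights."""
    return [v * scale for v in q]

weights = [0.02, -0.5, 0.31, 1.27, -1.0]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Rounding to the nearest code bounds the per-weight error by half a step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_err)
```

Every weight is treated independently here, so any shared periodic or low-dimensional structure goes unused; a smarter scheme would have to exploit it explicitly.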

ACCount37 6 hours ago | parent [-]

That's more of a "quantization-aware training" thing, really.
