bithive123 8 days ago

Language models aren't world models for the same reason languages aren't world models.

Symbols, by definition, only represent a thing. They are not the same as the thing. The map is not the territory, the description is not the described, you can't get wet in the word "water".

They only have meaning to sentient beings, and that meaning is heavily subjective and contextual.

But there appear to be some who think that we can grasp truth through mechanical symbol manipulation. Perhaps we just need to add a few million more symbols, they think.

If we accept the incompleteness theorem, then there are true propositions that even a super-intelligent AGI would not be able to express, because all it can do is output a series of placeholders. Not to mention the obvious fallacy of knowing super-intelligence when we see it. Can you write a test suite for it?

habitue 8 days ago | parent | next [-]

> Symbols, by definition, only represent a thing.

This is missing the lesson of the Yoneda Lemma: symbols are uniquely identified by their relationships with other symbols. If those relationships are represented in text, then in principle they can be inferred and navigated by an LLM.
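
(A toy sketch of what I mean by "identified by relationships" - just bag-of-contexts counting in Python, nothing rigorous, and the corpus and numbers are made up:)

    # Toy illustration: characterize a word purely by which other words it
    # co-occurs with, then compare words by the similarity of those contexts.
    from collections import Counter, defaultdict
    from math import sqrt

    corpus = [
        "the sky is blue",
        "the ocean is blue",
        "the rose is red",
        "the brick is red",
    ]

    contexts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for i, w in enumerate(words):
            for j, c in enumerate(words):
                if i != j:
                    contexts[w][c] += 1  # count co-occurring words

    def cosine(a, b):
        dot = sum(a[k] * b[k] for k in set(a) | set(b))
        return dot / (sqrt(sum(v * v for v in a.values())) *
                      sqrt(sum(v * v for v in b.values())))

    # "blue" and "red" occupy similar positions in this web of relationships,
    # even though nothing here encodes what blue or red looks like.
    print(cosine(contexts["blue"], contexts["red"]))
    print(cosine(contexts["blue"], contexts["sky"]))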

Some relationships are not represented well in text: tacit knowledge like how hard to twist a bottle cap to get it to come off, etc. We aren't capturing those relationships between all your individual muscles and your brain well in language, so an LLM will miss them or have very approximate versions of them, but... that's always been the problem with tacit knowledge: it's the exact kind of knowledge that's hard to communicate!

nomel 8 days ago | parent [-]

I don’t think it’s a communication problem so much as that there is no possible relation between a word and a (literal) physical experience. They’re, quite literally, on different planes of existence.

drdeca 8 days ago | parent | next [-]

When I have a physical experience, sometimes it results in me saying a word.

Now, maybe there are other possible experiences that would result in me behaving identically, such that from my behavior (including what words I say) it is impossible to distinguish between different potential experiences I could have had.

But, “caused me to say” is a relation, is it not?

Unless you want to say that it wasn’t the experience that caused me to do something, but some physical thing that went along with the experience, either causing or co-occurring with the experience, and also causing me to say the word I said. But, that would still be a relation, I think.

nomel 8 days ago | parent [-]

Yes, but it's a unidirectional relation: it was the result of the experience. The word cannot represent the context (the experience) in a meaningful way.

It's like trying to describe a color to a blind person: poetic subjective nonsense.

drdeca 8 days ago | parent [-]

I don’t know what you mean by “unidirectional relation”. I get that you gave an explanation after the colon, but I still don’t quite get what you mean. Do you just mean that what words I use doesn’t pick out a unique possible experience? That’s true, of course, but I don’t know why you call that “unidirectional”.

I don’t think describing colors to a blind person is nonsense. One can speak of how the different colors relate to one-another. A blind person can understand that a stop sign is typically “red”, and that something can be “borderline between red and orange”, but that things will not be “borderline between green and purple”. A person who has never had any color perception won’t know the experience of seeing something red or blue, but they can still have a mental model of the world that includes facts about the colors of things, and what effects these are likely to have, even though they themselves cannot imagine what it is like to see the colors.
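
(To make that concrete, here's a trivial sketch - my own toy example, nothing rigorous - of a purely relational "color model": facts and constraints about colors with no sensory grounding at all, which is roughly the kind of model I'm saying a blind person can have:)

    # Facts about colors, stated without any reference to what colors look like.
    color_of = {
        "stop sign": "red",
        "grass": "green",
    }
    # Pairs of colors that things can plausibly be "borderline" between.
    borderline_ok = {("red", "orange"), ("blue", "green"), ("yellow", "orange")}

    def can_be_borderline(a, b):
        return (a, b) in borderline_ok or (b, a) in borderline_ok

    print(color_of["stop sign"])                 # "red"
    print(can_be_borderline("red", "orange"))    # True
    print(can_be_borderline("green", "purple"))  # False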

akomtu 7 days ago | parent [-]

IMO, the GP's idea is that you can't explain sounds to a deaf man, or emotions to someone who doesn't feel them. All of that needs direct experience, and words only point to our shared experience.

drdeca 6 days ago | parent [-]

Ok, but you can explain properties of sounds to deaf men, and properties of colors to blind men. You can’t give them a full understanding of what it is like to experience these things, but that doesn’t preclude deaf or blind men from having mental models of the world that take into account those senses. A blind man can still reason about what things a sighted person would be able to conclude based on what they see, likewise a deaf man can reason about what a person who can hear could conclude based on what they could hear.

semiquaver 8 days ago | parent | prev [-]

Well shit, I better stop reading books then.

nomel 8 days ago | parent [-]

I think you've missed the concept here.

You exist in the full experience. That lossy projection to words is still meaningful to you, in your reading, because you know the experience it's referencing. What do I mean by "lossy projection"? It's the mapping from the experience of seeing the color blue to the word "blue". The word "blue" is meaningless without already having experienced it, because the word is not a description of the experience, it's a label. The experience itself can't be sufficiently described, as you'll find if you try to explain "blue" to a blind person, because it exists outside of words.

The concept here is that something like an LLM, trained on human text, can't have meaningful comprehension of some concepts, because some words are labels of things that exist entirely outside of text.

You might say "but multimodal models use tokens for color!", or even extending that to "you could replace the tokens used in multimodal models with color names!" and I would agree. But, the understanding wouldn't come from the relation of words in human text, it would come from the positional relation of colors across a space, which is not much different than our experience of the color, on our retina

tldr: to get AI to meaningfully understand something, you have to give it a meaningful relation. Meaningful relations sometimes aren't present in human writing.

pron 8 days ago | parent | prev | next [-]

> Symbols, by definition, only represent a thing. They are not the same as the thing

First of all, the point isn't about the map becoming the territory, but about whether LLMs can form a map that's similar to the map in our brains.

But to your philosophical point, assuming there are only a finite number of things and places in the universe - or at least the part of which we care about - why wouldn't they be representable with a finite set of symbols?

What you're rejecting is the Church-Turing thesis [1] (essentially, that all mechanical processes, including that of nature, can be simulated with symbolic computation, although there are weaker and stronger variants). It's okay to reject it, but you should know that not many people do (even some non-orthodox thoughts by Penrose about the brain not being simulatable by an ordinary digital computer still accept that some physical machine - the brain - is able to represent what we're interested in).

> If we accept the incompleteness theorem

There is no if there. It's a theorem. But it's completely irrelevant. It means that there are mathematical propositions that can't be proven or disproven by some system of logic, i.e. by some mechanical means. But if something is in the universe, then it's already been proven by some mechanical process: the mechanics of nature. That means that if some finite set of symbols could represent the laws of nature, then anything in nature can be proven in that logical system. Which brings us back to the first point: the only way the mechanics of nature cannot be represented by symbols is if they are somehow infinite, i.e. they don't follow some finite set of laws. In other words - there is no physics. Now, that may be true, but if that's the case, then AI is the least of our worries.

Of course, if physics does exist - i.e. the universe is governed by a finite set of laws - that doesn't mean that we can predict the future, as that would entail both measuring things precisely and simulating them faster than their operation in nature, and both of these things are... difficult.

[1]: https://plato.stanford.edu/entries/church-turing/

goatlover 8 days ago | parent | next [-]

> Of course, if physics does exist - i.e. the universe is governed by a finite set of laws

That statement is problematic. It implies a metaphysical set of laws that make physical stuff relate a certain way.

The Humean way of looking at physics is that we notice relationships and model those with various symbols. The symbols form incomplete models because we can't get to the bottom of why the relationships exist.

> that doesn't mean that we can predict the future, as that would entail both measuring things precisely and simulating them faster than their operation in nature, and both of these things are... difficult.

The indeterminism of quantum mechanics limits how precise measurements can be and how predictable the future is.

pron 8 days ago | parent [-]

> That statement is problematic. It implies a metaphysical set of laws that make physical stuff relate a certain way.

What I meant was that since physics is the scientific search for the laws of nature, if there's an infinite number of them then the pursuit becomes somewhat meaningless, as an infinite number of laws aren't really laws at all.

> They symbols form incomplete models because we can't get to the bottom of why the relationships exist.

Why would a model be incomplete if we don't know why the laws are what they are? A model pretty much is a set of laws; it doesn't require an explanation (we may want such an explanation, but it doesn't improve the model).

astrange 8 days ago | parent | prev | next [-]

> First of all, the point isn't about the map becoming the territory, but about whether LLMs can form a map that's similar to the map in our brains.

It should be capable of something similar (fsvo similar), but the largest difference is that humans have to be power-efficient and LLMs do not.

That is, people don't actually have world models, because modeling something is a waste of time and energy insofar as it's not needed for anything. People are capable of taking out the trash without knowing what's in the garbage bag.

8 days ago | parent | prev | next [-]
[deleted]
Terr_ 8 days ago | parent | prev [-]

> Of course, if physics does exist - i.e. the universe is governed by a finite set of laws

Wouldn't physics still "exist" even if there were an infinite set of laws?

pron 8 days ago | parent [-]

Well, the physical universe will still exist, but physics - the scientific study of said universe - would become sort of meaningless, I would think?

Terr_ 8 days ago | parent [-]

Why meaningless? Imperfect knowledge can still be useful, and ultimately that's the only kind we can ever have about anything.

"We could learn to sail the oceans and discover new lands and transport cargo cheaply... But in a few centuries we'll discover we were wrong and the Earth isn't really a sphere and tides are extra-complex so I guess there's no point."

pron 8 days ago | parent [-]

Because if there's an infinite number of laws, are they laws at all? You can't predict anything, because you don't even know if some of the laws you don't know yet (which is pretty much all of them) make an exception to the 0% of laws you do know. I'm not saying it's not interesting, but it's more history - today the apple fell down rather than up or sideways - than physics.

pixl97 8 days ago | parent [-]

In the infinite set of all laws, is there an infinite set of laws that do not conflict with each other?

0.000000000000001% of infinity is still infinite.

auggierose 8 days ago | parent | prev | next [-]

First: true propositions (that are not provable) can definitely be expressed; if they couldn't, the incompleteness theorem would not be true ;-)

It would be interesting to know the percentage of people who invoke the incompleteness theorem and have no clue what it actually says.

Most people don't even know what a proof is, so that cannot be a hindrance on the path to AGI ...

Second: ANY world model that can be digitally represented would be subject to the same argument (if stated correctly), not only LLMs.

bithive123 8 days ago | parent [-]

I knew someone would call me out on that. I used the wrong word; what I meant was "expressed in a way that would satisfy" which implies proof within the symbolic order being used. I don't claim to be a mathematician or philosopher.

auggierose 8 days ago | parent [-]

Well, you don't get it. The LLM can definitely state propositions "that satisfy" - let's just call them true propositions - and the incompleteness theorem says precisely that this is not the same as having a proof for them.

Why would you require an LLM to have proof for the things it says? I mean, that would be nice, and I am actually working on that, but it is not something we require of humans and/or HN commenters, is it?

bithive123 8 days ago | parent [-]

I clearly do not meet the requirements to use the analogy.

I am hearing the term super-intelligence a lot, and it seems to me the only form that would take is the machine spitting out a bunch of symbols which either delight or dismay the humans. Which implies they already know what it looks like.

If this technology will advance science or even be useful for everyday life, then surely the propositions it generates will need to hold up to reality, either via axiomatic rigor or empirically. I look forward to finding out if that will happen.

But it's still just a movement from the known to the known, a very limited affair no matter how many new symbols you add in whatever permutation.

cognitif 8 days ago | parent | prev | next [-]

> Language models aren't world models for the same reason languages aren't world models. Symbols, by definition, only represent a thing. They are not the same as the thing. The map is not the territory, the description is not the described, you can't get wet in the word "water".

Symbols, maps, descriptions, and words are useful precisely because they are NOT what they represent. Representation is not identity. What else could a “world model” be other than a representation? Aren’t all models representations, by definition? What exactly do you think a world model is, if not something expressible in language?

mrbungie 8 days ago | parent [-]

> Aren’t all models representations, by definition? What exactly do you think a world model is, if not something expressible in language?

I was following the string of questions, but I think there is a logical leap between those two questions.

Another question: is language the only way to define models? An imagined sound or an imagined picture of an apple in my mind's eye are models to me, but they don't use language.

drdeca 8 days ago | parent | prev | next [-]

Gödel’s incompleteness theorems aren’t particularly relevant here. Given how often people attempt to apply them to situations where they don’t say anything of note, I think the default should generally be to not publicly appeal to them unless one either has worked out semi-carefully how to derive the thing one wants to show from them, or at least has a sketch that one is confident, from prior experience working with them, one could make into a rigorous argument. Absent these, the most one should say, I think, is “Perhaps one can use Gödel’s incompleteness theorems to show [thing one wants to show].”

Now, given a program that is supposed to output text that encodes true statements (in some language), one can probably define some sort of inference system that corresponds to the program such that the inference system is considered to “prove” any sentence that the program outputs (and maybe also some others based on some logical principles, to ensure that the inference system satisfies some good properties), and upon defining this, one could (assuming the language allows making the right kinds of statements about arithmetic) show that this inference system is, by Gödel’s theorems, either inconsistent or incomplete.

This wouldn’t mean that the language was unable to express those statements. It would mean that the program either wouldn’t output those statements, or that the system constructed from the program was inconsistent (and, depending on how the inference system is obtained from the program, the inference system being inconsistent would likely imply that the program sometimes outputs false or contradictory statements).

But, this has basically nothing to do with the “placeholders” thing you said. Gödel’s theorem doesn’t say that some propositions are inexpressible in a given language, but that some propositions can’t be proven in certain axiom+inference systems.
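
(For reference, one standard way the first theorem is stated - paraphrased from memory, so check a textbook for the exact hypotheses:)

    % First incompleteness theorem, paraphrased; note it constrains provability
    % within a system, not expressibility in the system's language.
    \textbf{G\"odel I.} Let $T$ be a consistent, recursively axiomatizable theory
    that interprets elementary arithmetic. Then there is a sentence $G_T$ in the
    language of $T$ such that
    \[
      T \nvdash G_T \quad\text{and}\quad T \nvdash \neg G_T .
    \]
    In particular, $G_T$ is expressible in the language of $T$; what fails is
    provability in $T$, not expressibility.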

Rather than the incompleteness theorems, the “undefinability of truth” result seems more relevant to the kind of point I think you are trying to make.

Still, I don’t think it will show what you want it to, even if the thing you are trying to show is true. Like, perhaps it is impossible to capture qualia with language, sure, makes sense. But logic cannot show that there are things which language cannot in any way (even collectively) refer to, because to show that there is a thing it has to refer to it.

————

“Can you write a test suite for it?”

Hm, might depend on what you count as a “suite”, but a test protocol, sure. The one I have in mind would probably be a bit expensive to run if it fails the test though (because it involves offering prize money).

scarmig 8 days ago | parent | prev | next [-]

> If we accept the incompleteness theorem

And, by various universality theorems, a sufficiently large AGI could approximate any sequence of human neuron firings to an arbitrary precision. So if the incompleteness theorem means that neural nets can never find truth, it also means that the human brain can never find truth.

Human neuron firing patterns, after all, only represent a thing; they are not the same as the thing. Your experience of seeing something isn't recreating the physical universe in your head.
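
(The flavor of result I'm appealing to, paraphrased from memory - the classical statements are about continuous functions, so getting from there to "approximate neuron firings" needs extra modeling assumptions:)

    % A representative universal approximation statement (Cybenko / Hornik style),
    % paraphrased.
    \textbf{Universal approximation.} Let $\sigma$ be a continuous, non-polynomial
    activation function. For every continuous $f : K \to \mathbb{R}$ on a compact
    set $K \subset \mathbb{R}^n$ and every $\varepsilon > 0$, there exist a width
    $m$, vectors $w_i \in \mathbb{R}^n$, and scalars $a_i, b_i$ such that
    \[
      \sup_{x \in K} \Big| f(x) - \sum_{i=1}^{m} a_i \, \sigma(w_i^{\top} x + b_i) \Big| < \varepsilon .
    \]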

bevr1337 8 days ago | parent [-]

> And, by various universality theorems, a sufficiently large AGI could approximate any sequence of human neuron firings to an arbitrary precision.

Wouldn't it become harder to simulate a human brain the larger a machine is? I don't know much, but I think that pesky speed-of-light thing might pose a challenge.

drdeca 8 days ago | parent [-]

simulate ≠ simulate-in-real-time

zeroonetwothree 8 days ago | parent [-]

All simulation is realtime to the brain being simulated.

drdeca 8 days ago | parent [-]

Sure, but that’s not the clock that’s relevant to the question of the light speed communication limits in a large computer?

jandrewrogers 8 days ago | parent | prev | next [-]

There is an important implication of learning and indexing being equivalent problems. A number of important data models and data domains exist for which we do not know how to build scalable indexing algorithms and data structures.

It has been noted for several years in US national labs and elsewhere that there is an almost perfect overlap between data models LLMs are poor at learning and data models that we struggle to index at scale. If LLMs were actually good at these things then there would be a straightforward path to addressing these longstanding non-AI computer science problems.

The incompleteness is that the LLM tech literally can't represent elementary things that are important enough that we spend a lot of money trying to represent them on computers for non-AI purposes. A super-intelligent AGI being right around the corner implies that we've solved these problems that we clearly haven't solved.

Perhaps more interesting, it also implies that AGI tech may look significantly different than the current LLM tech stack.

energy123 8 days ago | parent | prev | next [-]

Everything is just a low-resolution representation of a thing. The so-called reality we supposedly have access to is at best a small number of sound waves and photons hitting our face. So I don't buy this argument that symbols are categorically different. It's a gradient, and symbols are a sparser and less rich data source, yes. But who are we to say where that hypothetical line lies, beyond which further compression of concepts into fewer buckets becomes a non-starter for intelligence and world modelling? And then there are multimodal LLMs, which have access to data of a richness similar to what humans have access to.

bithive123 8 days ago | parent [-]

There are no "things" in the universe. You say this wave and that photon exist and represent this or that, but all of that is conceptual overlay. Objects are parts of speech, reality is undifferentiated quanta. Can you point to a particular place where the ocean becomes a particular wave? Your comment already implies an understanding that our mind is behind all the hypothetical lines; we impose them, they aren't actually there.

copypaper 8 days ago | parent | prev | next [-]

Reminds me of this [1] article. If we humans, after all these years we've been around, can't relay our thoughts exactly as we perceive them in our heads, what makes us think that we can make a model that does it better than us?

[1]: https://www.experimental-history.com/p/you-cant-reach-the-br...

chamomeal 8 days ago | parent | prev | next [-]

I’m not a math guy but the incompleteness theorem applies to formal systems, right? I’ve never thought about LLMs as formal systems, but I guess they are?

pron 8 days ago | parent | next [-]

Anything that runs on a computer is a formal system. "Formal" (the manipulation of forms) is an old term for what, after Turing, we call "mechanical".

bithive123 8 days ago | parent | prev [-]

Nor am I. I'm not claiming an LLM is a formal system, but it is mechanical and operates on symbols. It can't deal in anything else. That should temper some of the enthusiasm going around.

exe34 8 days ago | parent | prev | next [-]

> Language models aren't world models for the same reason languages aren't world models.

> Symbols, by definition, only represent a thing. They are not the same as the thing. The map is not the territory, the description is not the described, you can't get wet in the word "water".

There are a lot of negatives in there, but I feel like it boils down to: a model of a thing is not the thing. Well, duh. It's a model. A map is a model.

bithive123 8 days ago | parent [-]

Right. It's a dead thing that has no independent meaning. It doesn't even exist as a thing except conceptually. The referent is not even another dead thing, but a reality that appears nowhere in the map itself. It may have certain limited usefulness in the practical realm, but expecting it to lead to new insights ignores the fact that it's fundamentally an abstraction of the real, not in relationship to it.

exe34 7 days ago | parent [-]

> but expecting it to lead to new insights ignores the fact that it's fundamentally an abstraction of the real, not in relationship to it.

Where do humans get new insights from?

bithive123 7 days ago | parent [-]

Generally the experience of insight is prior to any discursive expression. We put our insights into words; they do not arise as such.

exe34 7 days ago | parent [-]

Like VLMs then.

overgard 8 days ago | parent | prev [-]

I don't think you can apply the incompleteness theorem like that; LLMs aren't constrained to formal systems.