HarHarVeryFunny 5 days ago
How can an LLM model the world, in any meaningful way, when it has no experience of the world? An LLM is a language model, not a world model. It has never once had the opportunity to interact with the real world and see how it responds: to emit some sequence of words (the only type of action it is capable of generating), predict what will happen as a result, and see if it was correct.

During training the LLM will presumably have been exposed to some secondhand accounts (as well as fictional stories) of how the world works, mixed up with sections of Stack Overflow code and Reddit rantings. But even those occasional accounts of real-world interactions (context, action + result) only teach it, at best, about the context that someone else, at that point in their life, considered relevant to mention as causal to the outcome. The LLM isn't even privy to the world model of the raconteur (let alone the actual complete real-world context in which the action was taken, or the detailed manner in which it was performed), so this is a massively impoverished source of secondhand experience from which to learn.

It would be like someone who had spent their whole life locked in a windowless room reading randomly ordered paragraphs from other people's diaries of daily experience (randomly interspersed with chunks of fairy tales and Python code), without ever having actually seen a tree or jumped in a lake, or ever having had the chance to test which parts of the mental model they had built of what was being described were actually correct, and how it aligned with the real outside world they had never laid eyes on.

When someone builds an AGI capable of continual learning and sets it loose in the world to interact with it, then it'll be reasonable to say it has its own world model of how the world works. But as far as pre-trained language models go, it seems closer to the mark to say that they are indeed just language models, modelling the world of words, which is all they know, and the only kind of model for which they had access to feedback (next-word prediction errors) to build.
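To make that last point concrete, here is a minimal sketch of what "next-word prediction errors" means as a training signal. Everything in it (PyTorch, the toy vocabulary size, the single-layer model) is an assumption for illustration, not anyone's actual training code:

    # Hedged sketch: next-token prediction, the only feedback signal described above.
    # Assumptions: PyTorch, a toy vocab of 100 tokens, a deliberately tiny model.
    import torch
    import torch.nn as nn

    vocab_size, embed_dim = 100, 32

    # Toy "language model": embed the current token, project to a
    # distribution over possible next tokens.
    model = nn.Sequential(
        nn.Embedding(vocab_size, embed_dim),
        nn.Linear(embed_dim, vocab_size),
    )

    tokens = torch.randint(0, vocab_size, (1, 16))   # stand-in for training text
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict token t+1 from token t

    logits = model(inputs)                           # shape (1, 15, vocab_size)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, vocab_size), targets.reshape(-1)
    )
    loss.backward()  # gradient of next-token error: the model's entire "experience"

Note that nothing in the loss refers to the world itself; the error is measured purely against the next word in the text.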
istjohn 5 days ago | parent
We build mental models of things we have not personally experienced all the time. Such mental models lack the detail and vividness of those built from first-hand experience, but they are nonetheless useful. Indeed, a student of physics who has never touched a baseball may have a far more accurate and precise mental model of a curve ball than a major league pitcher.