hackinthebochs | a day ago
> How much would you bet that there isn't a CSV somewhere in the training set exactly containing this data for use in some GIS system?

Maybe, but then I would expect more equal performance across model sizes. Besides, ingesting the data and being able to reproduce it accurately in a different modality is still an example of modeling. It's one thing to ingest a set of coordinates from a CSV describing geographic boundaries and accurately reproduce that CSV. It's another thing to accurately classify arbitrary points as falling inside or outside the boundary in an entirely different context. This suggests a latent representation independent of the input tokens (a rough version of this probe is sketched at the end of this comment).

> I think that "modeling the world" is a red herring, and that fundamentally an LLM can only model its input modalities.

There are good reasons to think this isn't the case. To effectively reproduce text that is about some structure, you need a model of that structure. A strong learning algorithm should in principle learn the underlying structure represented by the input modality, independent of the structure of the modality itself. There are examples of this in humans and animals, e.g. [1][2][3]

> I think a more useful definition of "model the world" is that a model needs to realize any facts that would be obvious to a person.

Seems reasonable enough, but it risks being too human-centric. Much of our cognitive machinery is suited to helping us navigate and actively engage with the world. But intelligence need not depend on the ability to engage the world. Features of the world that are obvious to us need not be obvious to an AGI that never had to evade predators or locate food in its evolutionary past. This is why I find the ARC-AGI tasks off target. They're interesting, and it will say something important about these systems when they can solve them easily. But these tasks do not represent intelligence in the sense that we care about.

> The fact that frontier models can easily be made to contradict themselves is proof enough to me that they cannot have any kind of sophisticated world model.

This proves that an LLM does not operate with a single world model. But that shouldn't be surprising. LLMs are unusual beasts in the sense that the capabilities you get largely depend on how you prompt them. There is no single entity or persona operating within the LLM; it's more of a persona-builder. Which world model a given persona engages is largely down to how the LLM segmented the training data so as to accurately model the various personas represented in human text. The lack of consistency is inherent to the design.

[1] https://news.wisc.edu/a-taste-of-vision-device-translates-fr...

[2] https://www.psychologicalscience.org/observer/using-sound-to...
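To make the geographic-boundary point concrete, here is a minimal sketch of the kind of probe I mean. It assumes a hypothetical query_model() wrapper around whatever LLM is being tested and uses shapely for the ground-truth point-in-polygon check; the Colorado-ish rectangle is just an illustrative stand-in for a real boundary:

```python
# Sketch of a latent-map probe, not a real benchmark. Assumes shapely is
# installed; query_model() is a hypothetical stand-in for an actual LLM call.
import random
from shapely.geometry import Point, Polygon

# Illustrative boundary: a coarse rectangle roughly matching Colorado
# (coordinates are (lon, lat) pairs, as shapely expects (x, y)).
boundary = Polygon([(-109.05, 41.0), (-102.05, 41.0),
                    (-102.05, 37.0), (-109.05, 37.0)])

def query_model(lat: float, lon: float) -> str:
    """Hypothetical wrapper: ask the LLM 'Is (lat, lon) inside the region?'
    and return its one-word answer, 'inside' or 'outside'."""
    raise NotImplementedError  # swap in a real API call here

def probe(n: int = 200) -> float:
    hits = 0
    for _ in range(n):
        # Random points near (but not only on) the boundary: coordinate pairs
        # that almost certainly never appear verbatim in a training-set CSV.
        lon = random.uniform(-112.0, -99.0)
        lat = random.uniform(35.0, 43.0)
        truth = boundary.contains(Point(lon, lat))
        answer = query_model(lat, lon).strip().lower() == "inside"
        hits += int(answer == truth)
    return hits / n  # accuracy well above chance suggests a latent map
```

If accuracy on points that never appear as coordinate pairs in any plausible training CSV is well above chance, rote reproduction of the file can't be the whole story.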