ForHackernews 4 hours ago

https://medium.com/state-of-the-art-technology/world-models-...

> One major critique LeCun raises is that LLMs operate only in the realm of language, which is a simple, discrete space compared to the continuous, complex physical world we live in. LLMs can solve math problems or answer trivia because such tasks reduce to pattern completion on text, but they lack any meaningful grounding in physical reality. LeCun points out a striking paradox: we now have language models that can pass the bar exam, solve equations, and compute integrals, yet “where is our domestic robot? Where is a robot that’s as good as a cat in the physical world?” Even a house cat effortlessly navigates the 3D world and manipulates objects — abilities that current AI notably lacks. As LeCun observes, “We don’t think the tasks that a cat can accomplish are smart, but in fact, they are.”

energy123 4 hours ago | parent [-]

But they don't only operate on language? They operate on token sequences, which can be images, coordinates, time, language, etc.

kergonath 4 hours ago | parent | next [-]

It’s an interesting observation, but I think you have it backwards. The examples you give are all using discrete symbols to represent something real and communicating this description to other entities. I would argue that all your examples are languages.

samrus 3 hours ago | parent | prev [-]

What's the first L stand for? That's not just vestigial; their model of the world is formed almost exclusively from language, rather than from a range of things contributing significantly, as it is for humans.

The biggest thing that's missing is actual feedback on their decisions. They have no idea of that, because transformers and embeddings don't model it yet. And language descriptions and image representations of feedback aren't enough. They are too disjointed. It needs more