griffzhowl 5 days ago

> The distinction I want to emphasize is that they don't just predict words statistically. They model the world, understand different concepts and their relationships, can think on them, can plan and act on the plan, can reason up to a point, in order to generate the next token.

This sounds way over-blown to me. What we know is that LLMs generate sequences of tokens, and they do this by clever ways of processing the textual output of millions of humans.

You say that, in addition to this, LLMs model the world, understand, plan, think, etc.

I think it can look like that, because LLMs are averaging the behaviours of humans who are actually modelling, understanding, thinking, etc.

Why do you think that this behaviour is more than simply averaging the outputs of millions of humans who understand, think, plan, etc.?

ozgung 5 days ago | parent

> Why do you think that this behaviour is more than simply averaging the outputs of millions of humans who understand, think, plan, etc.?

This is why it’s important to make the distinction that machine learning is a different field from statistics. Machine learning models do not “average” anything; they learn to generalize. Deep learning models can handle edge cases and unseen inputs very well.
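To make the “generalize vs. average” distinction concrete, here’s a toy sketch (my own illustration, not anything from an LLM): the simplest possible learned model, a no-intercept linear fit, extrapolates to an input far outside its training data, while literally averaging the training outputs cannot. The data and numbers are made up for the example.

```python
# Toy illustration: a fitted model generalizes to unseen inputs,
# whereas "averaging the outputs" does not.

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]  # underlying rule: y = 2x

# Least-squares slope for a no-intercept linear fit: w = sum(x*y) / sum(x*x)
w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

average_of_outputs = sum(ys) / len(ys)  # what pure "averaging" returns: 6.0
model_prediction = w * 100.0            # query far outside the training range

print(average_of_outputs)  # 6.0 -- useless for x = 100
print(model_prediction)    # 200.0 -- the learned rule extrapolates
```

Deep networks are vastly more complicated than a one-parameter line, but the point stands: the learned function is applied to the new input, not a lookup-and-average over the training set.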

In addition to that, OpenAI and others probably use a specific post-training step (like RLHF, or something better) for planning, reasoning, following instructions step by step, etc. This additional step doesn’t depend on the outputs of millions of humans.