mjr00 (2 days ago):
> we still see claims that LLMs are "just next token predictors" and "just regurgitate code they read online". These are just uninformed and wrong views. It's fair to say that these people were (are!) wrong.

I don't think it's fair to say that at all. How are LLMs not statistical models that predict tokens? It's a big oversimplification, but it doesn't seem wrong, in the same way that "computers are electricity running through circuits" isn't a wrong statement. And in both cases, those statements are orthogonal to how useful the thing is.
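To make the literal claim concrete, here is a minimal sketch of what "predicting tokens" means, assuming the Hugging Face transformers API and GPT-2 as an illustrative model (both my choice for the example, nothing more). Each step is one forward pass producing a probability distribution over the vocabulary, from which one token is picked and appended:

    # Minimal autoregressive next-token prediction (greedy decoding).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
    for _ in range(10):
        logits = model(ids).logits           # one forward pass per step
        next_id = logits[0, -1].argmax()     # greedy: most probable next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

    print(tokenizer.decode(ids[0]))

Sampling instead of argmax, temperature, top-p and so on are all variations on the same per-token distribution; the "statistical model that predicts tokens" description refers to exactly this loop.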
Libidinalecon (20 hours ago):
It's just a tell that the person believes LLMs are ontologically more than what they are. No one says "computers are JUST electricity running through circuits", because no one tries to argue that the computer itself is "thinking" or has some kind of being. No one tries to argue that when you put the computer to sleep, it is actually doing a form of "sleeping". The mighty token, though, produces all kinds of confused nonsense.
jcelerier (2 days ago):
> How are LLMs not statistical models that predict tokens?

There are LLMs as in "the blob of coefficients and graph operations that runs on a GPU whenever there's an inference", which is absolutely "a statistical model that predicts tokens", and LLMs as in "the online apps that iterate, have access to an entire automated Linux environment, can run $LANGUAGE scripts and do web queries when an intermediate statistical output contains too many maybes, and use the result to drive further inference".
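A toy sketch of the second sense, where chat and run_tool are hypothetical stand-ins rather than any real vendor API: the first sense is the single chat(history) call; the second sense is the whole driver loop around it.

    import json

    def agent_loop(chat, run_tool, user_message, max_steps=10):
        history = [{"role": "user", "content": user_message}]
        for _ in range(max_steps):
            reply = chat(history)                # one "pure" inference call
            if reply.get("tool_call") is None:   # plain answer: we're done
                return reply["content"]
            # The model asked for a tool: run a script, a shell command,
            # a web query... and feed the result back to drive inference.
            result = run_tool(reply["tool_call"])
            history.append({"role": "assistant", "content": json.dumps(reply)})
            history.append({"role": "tool", "content": result})
        return "gave up after max_steps"

The statistical token predictor sits inside chat(); everything else is ordinary orchestration code, which is why the two senses of "LLM" get talked past each other.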
nl (a day ago):
> I don't think it's fair to say that at all. How are LLMs not statistical models that predict tokens? It's a big oversimplification but it doesn't seem wrong

Modern LLMs are trained via reinforcement learning, where the training objective is no longer maximum next-token probability. They still produce tokens sequentially (ignoring diffusion models for now), but since the objective is so different, thinking of them as next-token predictors is more wrong than right. Instead, one has to think of them as trying to fit their entire output to the model learnt in the reinforcement phase. That's why reasoning in LLMs works so well.
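Schematically, and heavily simplified (real RL pipelines like PPO or GRPO add baselines, clipping and KL penalties; the function names and shapes here are assumptions for illustration only): pretraining scores every next-token guess independently, while the RL objective scores the whole sampled sequence with one reward.

    import torch.nn.functional as F

    def pretraining_loss(logits, targets):
        # Next-token prediction: cross-entropy at every position,
        # each position scored independently against the true next token.
        return F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))

    def rl_loss(token_logprobs, reward):
        # REINFORCE-style objective: one scalar reward for the whole
        # sampled sequence; every token's log-probability is pushed up
        # or down by that single sequence-level signal.
        return -(reward * token_logprobs.sum())

In the second objective, no token is optimized in isolation: the gradient for every token depends on how the complete output was judged, which is the sense in which the model fits its entire output to what it learnt in the reinforcement phase.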
threethirtytwo (2 days ago):
It's wrong because it's deliberately used to mischaracterize the current abilities of AI. Technically it's not wrong, but the context of usage in basically every case is that the person saying it is deliberately trying to use the concept to downplay AI as just a pattern-matching machine.
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||