| ▲ | chpatrick 5 days ago |
| It's more that "thinking" is a vague term that we don't even understand in humans, so for me it's pretty meaningless to claim LLMs think or don't think. There's a very clichéd comment under any AI HN headline which goes: "LLMs don't REALLY have <vague human behavior we don't really understand>. I know this for sure because I know both how humans work and how gigabytes of LLM weights work." Or its cousin: "LLMs CAN'T possibly do <vague human behavior we don't really understand> BECAUSE they generate text one character at a time, UNLIKE humans, who generate text one character at a time by typing with their fleshy fingers." |
|
| ▲ | barnacs 5 days ago | parent | next [-] |
| To me, it's about motivation. Intelligent living beings have natural, evolutionary inputs as motivation underlying every rational thought: a biological reward system in the brain, a drive to avoid pain, hunger, boredom and sadness, to satisfy physiological needs, to socialize, to self-actualize, and so on. These are the fundamental forces that drive us, even if the rational processes are capable of suppressing or delaying them to some degree. In contrast, machine learning models have a loss function or reward system purely constructed by humans to achieve a specific goal. They have no intrinsic motivations, feelings or goals. They are statistical models that approximate some mathematical function provided by humans. |
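To make "a loss function purely constructed by humans" concrete: for a language model, the training objective is typically just a cross-entropy penalty on next-token predictions over human-written text. A minimal, hypothetical sketch (the numbers and names are invented for illustration):

    import numpy as np

    def cross_entropy_loss(predicted_probs, target_token_id):
        # The model's only "motivation": assign high probability to whatever
        # token actually came next in the human-written training text.
        return -np.log(predicted_probs[target_token_id])

    # Toy case: the model puts 70% probability on the token that really came next.
    probs = np.array([0.10, 0.70, 0.15, 0.05])
    print(cross_entropy_loss(probs, target_token_id=1))  # ~0.357; training nudges this toward 0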
| |
| ▲ | chpatrick 4 days ago | parent [-] | | Are any of those required for thinking? | | |
| ▲ | barnacs 4 days ago | parent [-] | | In my view, absolutely yes. Thinking is a means to an end. It's about acting upon these motivations by abstracting, recollecting past experiences, planning, exploring, innovating. Without any motivation, there is nothing novel about the process. It really is just statistical approximation, "learning" at best, but definitely not "thinking". | | |
| ▲ | chpatrick 4 days ago | parent [-] | | Again, the problem is that what "thinking" means is totally vague. To me, if I can ask a computer a difficult question it hasn't seen before and it can give a correct answer, it's thinking. I don't need it to have a full and colorful human life to do that. | | |
| ▲ | barnacs 4 days ago | parent [-] | | But it's only able to answer the question because it has been trained on all text in existence written by humans, precisely with the purpose of mimicking human language use. It is the humans who produced the training data and then provided feedback in the form of reinforcement who did all the "thinking". Even if it can extrapolate to some degree (although that's where "hallucinations" tend to become obvious), it could never, for example, invent a game like chess or a social construct like a legal system. Those require motivations like "boredom", "being social", or having a "need for safety". | | |
| ▲ | chpatrick 4 days ago | parent [-] | | Humans are also trained on data made by humans. > it could never, for example, invent a game like chess or a social construct like a legal system. Those require motivations like "boredom", "being social", having a "need for safety". That's creativity, which is a different question from thinking. | | |
| ▲ | bluefirebrand 4 days ago | parent | next [-] | | > Humans are also trained on data made by humans Humans invent new data; humans observe things and create new data. That's where all the stuff the LLMs are trained on came from. > That's creativity which is a different question from thinking It's not really, though. The process is the same, or similar enough, don't you think? | | |
| ▲ | chpatrick 4 days ago | parent [-] | | I disagree. Creativity is coming up with something out of the blue. Thinking is using what you know to come to a logical conclusion. LLMs so far are not very good at the former but getting pretty damn good at the latter. | | |
| ▲ | barnacs 4 days ago | parent | next [-] | | > Thinking is using what you know to come to a logical conclusion What LLMs do is use what they have _seen_ to come to a _statistical_ conclusion, just like a complex statistical weather forecasting model. I have never heard anyone argue that such models "know" about weather phenomena and reason about the implications to come to a "logical" conclusion. | | |
| ▲ | chpatrick 4 days ago | parent [-] | | I think people misunderstand when they see that it's a "statistical model". That just means that out of a range of possible answers, it picks one in a humanlike way. If the logical answer is the humanlike thing to say, then it will be more likely to sample it. In the same way, a human might produce a range of answers to the same question, so humans are also drawing from a theoretical statistical distribution when you talk to them. It's just a mathematical way to describe an agent, whether it's an LLM or a human. |
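As a concrete sketch of what "picks from a distribution" means here (toy vocabulary and made-up scores, not any real model's numbers):

    import numpy as np

    # Suppose the model has scored four candidate continuations of "2 + 2 =".
    # A higher score means the continuation is judged more plausible.
    vocab = ["4", "5", "fish", "22"]
    logits = np.array([6.0, 1.5, -2.0, 0.5])   # invented numbers

    def sample_next(logits, temperature=1.0, seed=0):
        # Softmax turns the scores into a probability distribution;
        # "statistical model" just means the output is drawn from it.
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        rng = np.random.default_rng(seed)
        return vocab[rng.choice(len(vocab), p=probs)], probs

    token, probs = sample_next(logits)
    print(dict(zip(vocab, probs.round(3))))  # "4" dominates: the sensible answer is also the likeliest
    print("sampled:", token)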
| |
| ▲ | bluefirebrand 4 days ago | parent | prev [-] | | I dunno, man. If you can't see how creativity and thinking are inextricably linked, I don't know what to tell you. LLMs aren't good at either, imo. They are rote regurgitation machines, or at best they mildly remix the data they have in a way that might be useful. They don't actually have any intelligence or skills to be creative or logical, though. | | |
| ▲ | chpatrick 4 days ago | parent [-] | | They're linked, but they're very different. Speaking from personal experience, it's a whole different task to solve an engineering problem that's been assigned to you, where you need to break it down and reason your way to a solution, vs. coming up with something brand new like a song or a piece of art, where there's no guidance. It's just a very different use of your brain. |
|
| ▲ | barnacs 4 days ago | parent | prev [-] | | I guess our definitions of "thinking" are just very different. Yes, humans are also capable of learning in a similar fashion and imitating, even extrapolating from a learned function. But I wouldn't call that intelligent, thinking behavior, even if performed by a human. And no human would ever perform like that without trying to intuitively understand the motivations of the humans they learned from, and without naturally intermingling the performance with their own motivations. |
|
| ▲ | shakna 4 days ago | parent | prev | next [-] |
| Thinking is better understood than you seem to believe. We don't just study it in humans. We look at it in trees [0], for example. And whilst trees have distributed systems that ingest data from their surroundings and use that to make choices, it isn't usually considered to be intelligence. Organizational complexity is one of the requirements for intelligence, and an LLM does not reach that threshold. LLMs have vast amounts of data, but organizationally they are still simple - thus "AI slop". [0] https://www.cell.com/trends/plant-science/abstract/S1360-138... |
| |
| ▲ | chpatrick 4 days ago | parent [-] | | Who says what degree of complexity is enough? That seems like deferring the problem to some other mystical arbiter. In my opinion, AI slop is slop not because AIs are basic but because the prompt is minimal. A human put minimal effort into making something with an AI and put it online, producing slop, because the actual informational content is very low. | | |
| ▲ | shakna 4 days ago | parent [-] | | > In my opinion AI slop is slop not because AIs are basic but because the prompt is minimal And you'd be disagreeing with the vast amount of research into AI. [0] > Moreover, they exhibit a counter-intuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget. [0] https://machinelearning.apple.com/research/illusion-of-think... | | |
| ▲ | chpatrick 4 days ago | parent [-] | | This article doesn't mention "slop" at all. | | |
| ▲ | shakna 4 days ago | parent [-] | | But it does mention that prompt complexity is not what determines the output. It does say that there is a maximal complexity that LLMs can have - which leads us back to: intelligence requires an organizational complexity that LLMs are not capable of. |
|
| ▲ | omnicognate 4 days ago | parent | prev [-] |
| This seems backwards to me. There's a fully understood thing (LLMs)[1] and a not-understood thing (brains)[2]. You seem to require a person to be able to fully define (presumably in some mathematical or mechanistic way) any behaviour they might observe in the not-understood thing before you will permit them to point out that the fully understood thing does not appear to exhibit that behaviour. In short, you are requiring that people explain brains before you will permit them to observe that LLMs don't appear to be the same sort of thing as them. That seems rather unreasonable to me.

That doesn't mean such claims don't need to be made as specific as possible. Just saying something like "humans love but machines don't" isn't terribly compelling. I think mathematics is an area where it seems possible to draw a reasonably intuitively clear line. Personally, I've always considered the ability to independently contribute genuinely novel pure mathematical ideas (i.e. to perform significant independent research in pure maths) to be a likely hallmark of true human-like thinking. This is a high bar, and one that AI has not yet reached, despite the recent successes on the International Mathematical Olympiad [3] and various other recent claims. It isn't a moved goalpost, either - I've been saying the same thing for more than 20 years. I don't have to, and can't, define what "genuinely novel pure mathematical ideas" means, but we have a human system that recognises, verifies and rewards them, so I expect us to know them when they are produced.

By the way, your use of "magical" in your earlier comment is typical of the way that argument is often presented, and I think it's telling. It's very easy to fall into the fallacy of deducing things from one's own lack of imagination. I've certainly fallen into that trap many times before. It's worth honestly considering whether your reasoning is of the form "I can't imagine there being something other than X, therefore there is nothing other than X". Personally, I think it's likely that to truly "do maths" requires something qualitatively different to a computer. Those who struggle to imagine anything other than a computer being possible often claim that that view is self-evidently wrong and mock such an imagined device as "magical", but that is not a convincing line of argument. The truth is that the physical Church-Turing thesis is a thesis, not a theorem, and a much shakier one than the original Church-Turing thesis. We have no particularly convincing reason to think such a device is impossible, and certainly no hard proof of it.

[1] Individual behaviours of LLMs are "not understood" in the sense that there is typically not some neat story we can tell about how a particular behaviour arises that contains only the truly relevant information. However, on a more fundamental level LLMs are completely understood and always have been, as they are human inventions that we are able to build from scratch.

[2] Anybody who thinks we understand how brains work isn't worth having this debate with until they read a bit about neuroscience and correct their misunderstanding.

[3] The IMO involves problems in extremely well-trodden areas of mathematics. While the problems are carefully chosen to be novel, they are problems to be solved in exam conditions, not mathematical research programs. The performance of the Google and OpenAI models on them, while impressive, is not evidence that they are capable of genuinely novel mathematical thought. What I'm looking for is the crank-the-handle-and-important-new-theorems-come-out machine that people have been trying to build since computers were invented. That isn't here yet, and if and when it arrives it really will turn maths on its head. |
| |
| ▲ | chpatrick 4 days ago | parent [-] | | LLMs are absolutely not "fully understood". We understand how the math of the architectures works, because we designed that. How the hundreds of gigabytes of automatically trained weights work, we have no idea. By that logic we understand how human brains work because we've studied individual neurons. And here's some more goalpost-shifting: most humans aren't capable of novel mathematical thought either, but that doesn't mean they can't think. | | |
| ▲ | omnicognate 4 days ago | parent [-] | | We don't understand individual neurons either. There is no level on which we understand the brain in the way we very much do understand LLMs. And as much as people like to handwave about how mysterious the weights are, we actually perfectly understand both how the weights arise and how they result in the model's outputs. As I mentioned in [1], what we can't do is "explain" individual behaviours with simple stories that omit unnecessary details, but that's just about desiring better (or more convenient/useful) explanations than the utterly complete one we already have. As for most humans not being mathematicians, it's entirely irrelevant. I gave an example of something that so far LLMs have not shown an ability to do. It's chosen to be something that can be clearly pointed to, and for which any change in the status quo should be obvious if/when it happens. Naturally, I think that the mechanism humans use to do this is fundamental to other aspects of their behaviour. The fact that only a tiny subset of humans are able to apply it in this particular specialised way changes nothing. I have no idea what you mean by "goalpost-shifting" in this context. | | |
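For what "how the weights result in the model's outputs" means mechanically, here is a toy forward pass; the weights are randomly invented stand-ins, and real models simply stack far more of the same kinds of operations:

    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # stand-ins for "trained" weights
    W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

    def forward(x):
        # Every step is explicit, deterministic arithmetic on known numbers --
        # the mechanics are fully specified, even though no simple story says
        # *why* these particular values produce useful behaviour.
        h = np.maximum(x @ W1 + b1, 0.0)              # linear layer + ReLU
        logits = h @ W2 + b2
        return np.exp(logits) / np.exp(logits).sum()  # softmax over outputs

    print(forward(np.array([1.0, 0.5, -0.3, 2.0])))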
| ▲ | riku_iki 3 days ago | parent | next [-] | | > And as much as people like to handwave about how mysterious the weights are we actually perfectly understand both how the weights arise and how they result in the model's outputs We understand it on this low level, but through training LLMs converge to something larger than the weights: a structure of those weights emerges that allows them to perform functions, and that part we do not understand. We just observe it as a black box and experiment at the level of "we put this kind of input into the black box and receive this kind of output." | |
| ▲ | int_19h 4 days ago | parent | prev [-] | | > We actually perfectly understand both how the weights arise and how they result in the model's outputs If we knew that, we wouldn't need LLMs; we could just hardcode the same logic that is encoded in those neural nets directly and far more efficiently. But we don't actually know what the weights do beyond very broad strokes. |
|