omnicognate 4 days ago

This seems backwards to me. There's a fully understood thing (LLMs)[1] and a not-understood thing (brains)[2]. You seem to require a person to be able to fully define (presumably in some mathematical or mechanistic way) any behaviour they might observe in the not-understood thing before you will permit them to point out that the fully understood thing does not appear to exhibit that behaviour. In short you are requiring that people explain brains before you will permit them to observe that LLMs don't appear to be the same sort of thing as them. That seems rather unreasonable to me.

That doesn't mean such claims don't need to be made as specific as possible. Just saying something like "humans love but machines don't" isn't terribly compelling. I think mathematics is an area where it seems possible to draw a reasonably intuitively clear line. Personally, I've always considered the ability to independently contribute genuinely novel pure mathematical ideas (i.e. to perform significant independent research in pure maths) to be a likely hallmark of true human-like thinking. This is a high bar and one AI has not yet reached, despite the recent successes on the International Mathematical Olympiad [3] and various other recent claims. It isn't a moved goalpost, either - I've been saying the same thing for more than 20 years. I don't have to, and can't, define what "genuinely novel pure mathematical ideas" means, but we have a human system that recognises, verifies and rewards them so I expect us to know them when they are produced.

By the way, your use of "magical" in your earlier comment is typical of the way that argument is often presented, and I think it's telling. It's very easy to fall into the fallacy of deducing things from one's own lack of imagination. I've certainly fallen into that trap many times before. It's worth honestly considering whether your reasoning is of the form "I can't imagine there being something other than X, therefore there is nothing other than X".

Personally, I think it's likely that to truly "do maths" requires something qualitatively different to a computer. Those who struggle to imagine anything other than a computer being possible often claim that that view is self-evidently wrong and mock such an imagined device as "magical", but that is not a convincing line of argument. The truth is that the physical Church-Turing thesis is a thesis, not a theorem, and a much shakier one than the original Church-Turing thesis. We have no particularly convincing reason to think such a device is impossible, and certainly no hard proof of it.

[1] Individual behaviours of LLMs are "not understood" in the sense that there is typically not some neat story we can tell about how a particular behaviour arises that contains only the truly relevant information. However, on a more fundamental level LLMs are completely understood and always have been, as they are human inventions that we are able to build from scratch.

[2] Anybody who thinks we understand how brains work isn't worth having this debate with until they read a bit about neuroscience and correct their misunderstanding.

[3] The IMO involves problems in extremely well-trodden areas of mathematics. While the problems are carefully chosen to be novel they are problems to be solved in exam conditions, not mathematical research programs. The performance of the Google and OpenAI models on them, while impressive, is not evidence that they are capable of genuinely novel mathematical thought. What I'm looking for is the crank-the-handle-and-important-new-theorems-come-out machine that people have been trying to build since computers were invented. That isn't here yet, and if and when it arrives it really will turn maths on its head.

chpatrick 4 days ago

LLMs are absolutely not "fully understood". We understand how the math of the architectures works because we designed it. How the hundreds of gigabytes of automatically trained weights work, we have no idea. By that logic we understand how human brains work because we've studied individual neurons.

And here's some more goalpost-shifting. Most humans aren't capable of novel mathematical thought either, but that doesn't mean they can't think.

omnicognate 4 days ago

We don't understand individual neurons either. There is no level on which we understand the brain in the way we very much do understand LLMs. And as much as people like to handwave about how mysterious the weights are we actually perfectly understand both how the weights arise and how they result in the model's outputs. As I mentioned in [1] what we can't do is "explain" individual behaviours with simple stories that omit unnecessary details, but that's just about desiring better (or more convenient/useful) explanations than the utterly complete one we already have.

As for most humans not being mathematicians, it's entirely irrelevant. I gave an example of something that so far LLMs have not shown an ability to do. It's chosen to be something that can be clearly pointed to and for which any change in the status quo should be obvious if/when it happens. Naturally I think that the mechanism humans use to do this is fundamental to other aspects of their behaviour. The fact that only a tiny subset of humans are able to apply it in this particular specialised way changes nothing. I have no idea what you mean by "goalpost-shifting" in this context.

riku_iki 3 days ago

> And as much as people like to handwave about how mysterious the weights are we actually perfectly understand both how the weights arise and how they result in the model's outputs

We understand it at this low level, but through training LLMs converge to something larger than the weights: a structure emerges in those weights that lets them perform functions, and this part we do not understand. We only observe it as a black box and experiment at that level: we put this kind of input into the black box and receive this kind of output.

int_19h 4 days ago

> We actually perfectly understand both how the weights arise and how they result in the model's outputs

If we knew that, we wouldn't need LLMs; we could just hardcode the same logic that is encoded in those neural nets directly and far more efficiently.

But we don't actually know what the weights do beyond very broad strokes.