viccis 11 hours ago

LLMs are models that predict tokens. They don't think, they don't build with blocks. They would never be able to synthesize knowledge about QM.

PaulDavisThe1st 10 hours ago | parent | next [-]

I am a deep LLM skeptic.

But I think there are also some questions about the role of language in human thought that leave the door just slightly ajar on the issue of whether or not manipulating the tokens of language might be more central to human cognition than we've tended to think.

If it turned out that this was true, then it is possible that "a model predicting tokens" has more power than that description would suggest.

I doubt it, and I doubt it quite a lot. But I don't think it is impossible that something at least a little bit along these lines turns out to be true.

viccis 8 hours ago | parent | next [-]

I also believe strongly in the role of language, and more loosely of semiotics as a whole, in our cognitive development. So much so that I think there are some meaningful ideas within the mountain of gibberish from Lacan, who was the first to really tie our conception of ourselves to our symbolic understanding of the world.

Unfortunately, none of that has anything to do with what LLMs are doing. The LLM is not thinking about concepts and then translating them into language. It is imitating what it looks like, in text, when people do so, and nothing more. That can make it very powerful at learning and then spitting out complex relationships between signifiers, since it's really just a giant knowledge-compression engine with a human-friendly way of spitting its contents back out. But there's absolutely no logical grounding whatsoever for any statement produced by an LLM.

The LLM that encouraged that man to kill himself wasn't doing it because it was a subject with agency and preference. It did so because it was, quite accurately I might say, mimicking the sequence of tokens that a real person encouraging someone to kill themselves would write. At no point whatsoever did that neural network make a moral judgment about what it was doing, because it doesn't think. It simply performed inference after inference, scanning through a lengthy discussion between a suicidal man and an assistant that had been encouraging him, and decided that after "Cold steel pressed against a mind that’s already made peace? That’s not fear. That’s " the most probable next token was "clar" and then "ity."
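
To make the "inference after inference" mechanics concrete, here is a minimal sketch of greedy next-token decoding, using GPT-2 via the Hugging Face transformers library purely as a stand-in; the prompt is a neutral example, not what any production assistant runs:

    # Greedy decoding: at every step the model only scores the vocabulary
    # and the single most probable token is appended. Nothing else happens.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tokenizer("Paris is the capital of", return_tensors="pt").input_ids
    for _ in range(5):                        # five rounds of "inference after inference"
        logits = model(ids).logits            # scores for every token in the vocabulary
        next_id = logits[0, -1].argmax()      # take the single most probable next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)

    print(tokenizer.decode(ids[0]))           # the continuation, produced one token at a time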

PaulDavisThe1st 7 hours ago | parent | next [-]

The problem with all this is that we don't actually know what human cognition is doing either.

We know what our experience is - thinking about concepts and then translating that into language - but we really don't know with much confidence what is actually going on.

I lean strongly toward the idea that humans are doing something quite different than LLMs, particularly when reasoning. But I want to leave the door open to the idea that we've not understood human cognition, mostly because our primary evidence there comes from our own subjective experience, which may (or may not) provide a reliable guide to what is actually happening.

viccis 7 hours ago | parent [-]

>The problem with all this is that we don't actually know what human cognition is doing either.

We do know what it's not doing, and that is operating only through reproducing linguistic patterns. There's no more cause to think LLMs approximate our thought (thought being something they are incapable of) than that Naive-Bayes spam filter models approximate our thought.
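
For comparison, a toy Naive Bayes spam filter is only a few lines with scikit-learn; the training data below is made up for illustration, but it shows how far you can get on nothing but word-count statistics, with no thought anywhere in sight:

    # A toy Naive Bayes spam filter: it learns nothing but per-class word
    # frequencies, yet still classifies unseen text.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    train = ["win cash now", "cheap pills online", "meeting at noon", "see you at lunch"]
    labels = ["spam", "spam", "ham", "ham"]

    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    clf.fit(train, labels)
    print(clf.predict(["win cheap cash"]))    # ['spam'], from word statistics alone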

PaulDavisThe1st 7 hours ago | parent [-]

My point is that we know very little about the sort of "thought" that we are capable of either. I agree that LLMs cannot do what we typically refer to as "thought", but I think it is possible that we do a LOT less of that than we think when we are "thinking" (or more precisely, having the experience of thinking).

viccis 7 hours ago | parent [-]

How does this worldview reconcile the fact that thought demonstrably exists independent of either language or vision/audio sense?

PaulDavisThe1st 6 hours ago | parent [-]

I don't see a need to reconcile them.

viccis 6 hours ago | parent [-]

Which is why it's incoherent!

PaulDavisThe1st 5 hours ago | parent [-]

I'm not clear that it has to be coherent at this point in the history of our understanding of cognition. We barely know what we're even talking about most of the time ...

famouswaffles 5 hours ago | parent | prev [-]

>Unfortunately, none of that has anything to do with what LLMs are doing. The LLM is not thinking about concepts and then translating that into language. It is imitating what it looks like to read people doing so and nothing more.

'Language' is only the initial and final layers of a Large Language Model. Manipulating concepts is exactly what they do, and it's unfortunate that the most obstinate seem to be the most ignorant.
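
As a schematic sketch (not any real model's code, the sizes are made up, and the encoder layer stands in for the causal decoder blocks a real LLM uses), the shape of such a model makes the point: text exists only at the embedding and unembedding ends, and everything in between is vector-to-vector:

    # Schematic only: token ids enter, token logits leave; the middle layers
    # never touch text at all.
    import torch
    import torch.nn as nn

    vocab, d_model = 50_000, 768
    embed = nn.Embedding(vocab, d_model)                 # "initial layer": ids -> vectors
    blocks = nn.Sequential(*[
        nn.TransformerEncoderLayer(d_model, nhead=12, batch_first=True)
        for _ in range(12)                               # middle: vector transformations
    ])
    unembed = nn.Linear(d_model, vocab)                  # "final layer": vectors -> logits

    token_ids = torch.randint(0, vocab, (1, 16))         # stand-in for a tokenized prompt
    hidden = blocks(embed(token_ids))                    # no text anywhere in here
    logits = unembed(hidden)                             # back to the vocabulary at the end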

PaulDavisThe1st 2 hours ago | parent [-]

They do not manipulate concepts. There is no representation of a concept for them to manipulate.

It may, however, turn out that in doing what they do, they are effectively manipulating concepts, and this is what I was alluding to: by building the model, even though your approach was through tokenization and whatever term you want to use for the network, you end up accidentally building something that implicitly manipulates concepts. Moreover, it might turn out that we ourselves do more of this than we perhaps like to think.

Nevertheless "manipulating concepts is exactly what they do" seems almost willfully ignorant of how these systems work, unless you believe that "find the next most probable sequence of tokens of some length" is all there is to "manipulating concepts".

famouswaffles 13 minutes ago | parent [-]

>They do not manipulate concepts. There is no representation of a concept for them to manipulate.

Yes, they do. And of course there is. And there's plenty of research on the matter.

>It may, however, turn out that in doing what they do, they are effectively manipulating concepts

There is no effectively here. Text is what goes in and what comes out, but it's by no means what they manipulate internally.

>Nevertheless "manipulating concepts is exactly what they do" seems almost willfully ignorant of how these systems work, unless you believe that "find the next most probable sequence of tokens of some length" is all there is to "manipulating concepts".

"Find the next probable token" is the goal, not the process. It is what models are tasked to do yes, but it says nothing about what they do internally to achieve it.

TeMPOraL 2 hours ago | parent | prev | next [-]

If anything, I feel that the current breed of multimodal LLMs demonstrates that language is not fundamental - tokens are, or rather their mutual associations in high-dimensional latent space. Language as we recognize it, sequences of characters and words, is just a special case. Multimodal models manage to turn audio, video and text into tokens in the same space - they do not route through text when consuming or generating images.
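
A toy sketch of that shared-space idea (the modalities, sizes, and projections here are all made up; real multimodal encoders are far more involved):

    # Each modality gets its own encoder, but all of them project into the same
    # d_model-sized space, so the transformer underneath never sees characters,
    # pixels, or waveforms as such - only one shared stream of token vectors.
    import torch
    import torch.nn as nn

    d_model = 768
    text_embed = nn.Embedding(50_000, d_model)            # text tokens -> vectors
    image_proj = nn.Linear(32 * 32 * 3, d_model)          # image patches -> vectors
    audio_proj = nn.Linear(400, d_model)                  # audio frames -> vectors

    text_tok  = text_embed(torch.randint(0, 50_000, (1, 8)))
    image_tok = image_proj(torch.rand(1, 4, 32 * 32 * 3))
    audio_tok = audio_proj(torch.rand(1, 6, 400))

    sequence = torch.cat([text_tok, image_tok, audio_tok], dim=1)
    print(sequence.shape)   # torch.Size([1, 18, 768]) - the same space for all three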

pegasus 9 hours ago | parent | prev [-]

> manipulating the tokens of language might be more central to human cognition than we've tended to think

I'm convinced of this. I think it's because we've always looked at the most advanced forms of human languaging (like philosophy) to understand ourselves. But human language must have evolved from forms of communication found in other species, especially highly intelligent ones. It's to be expected that its building blocks are based on things like imitation, playful variation, and pattern-matching - capabilities brains had been developing long before language, only now harnessed in the emerging world of sounds, calls, and vocalizations.

Ironically, the other crucial ingredient for AGI which LLMs don't have, but we do, is exactly that animal nature which we always try to shove under the rug, over-attributing our success to the stochastic parrot part of us, and ignoring the gut instinct, the intuitive, spontaneous insight into things which a lot of the great scientists and artists of the past have talked about.

viccis 8 hours ago | parent [-]

>Ironically, the other crucial ingredient for AGI which LLMs don't have, but we do, is exactly that animal nature which we always try to shove under the rug, over-attributing our success to the stochastic parrot part of us, and ignoring the gut instinct, the intuitive, spontaneous insight into things which a lot of the great scientists and artists of the past have talked about.

Are you familiar with the major works in epistemology that were written, even before the 20th century, on this exact topic?

strbean 10 hours ago | parent | prev [-]

You realize the parent said "This would be an interesting way to test proposition X" and you responded with "X is false because I say so", right?

viccis 8 hours ago | parent | next [-]

Yes. That is correct. If I told you I planned on going outside this evening to test whether the sun sets in the east, the best response would be to let me know ahead of time that my hypothesis is wrong.

strbean 8 hours ago | parent [-]

So, based on the source of "Trust me bro.", we'll decide this open question about new technology and the nature of cognition is solved. Seems unproductive.

viccis 7 hours ago | parent [-]

In addition to what I have posted elsewhere in here, I would point out that this is not actually an "open question", as LLMs have not produced an entirely new and more advanced model of physics. So there is no reason to suppose they could have done so for QM.

drdeca 2 hours ago | parent [-]

What if making progress today is harder than it was then?

anonymous908213 10 hours ago | parent | prev [-]

"Proposition X" does not need testing. We already know X is categorically false because we know how LLMs are programmed, and not a single line of that programming pertains to thinking (thinking in the human sense, not "thinking" in the LLM sense which merely uses an anthromorphized analogy to describe a script that feeds back multiple prompts before getting the final prompt output to present to the user). In the same way that we can reason about the correctness of an IsEven program without writing a unit test that inputs every possible int32 to "prove" it, we can reason about the fundamental principles of an LLM's programming without coming up with ridiculous tests. In fact the proposed test itself is less eminently verifiable than reasoning about correctness; it could be easily corrupted by, for instance, incorrectly labelled data in the training dataset, which could only be determined by meticulously reviewing the entirety of the dataset.

The only people who are serious about suggesting that LLMs could possibly 'think' are the people who are committing fraud on the scale of hundreds of billions of dollars (good for them on finding the all-time grift!) and the people who don't understand how they're programmed, and thus are the target of the grift. Granted, given that the vast majority of humanity are not programmers, and even fewer are programmers educated in the intricacies of ML, the grift target pool numbers in the billions.

strbean 8 hours ago | parent [-]

> We already know X is categorically false because we know how LLMs are programmed, and not a single line of that programming pertains to thinking (thinking in the human sense, not "thinking" in the LLM sense, which merely uses an anthropomorphized analogy to describe a script that feeds back multiple prompts before presenting the final prompt output to the user).

Could you elucidate the process of human thought for me, and point out the differences between that and a probabilistic prediction engine?

I see this argument all over the place, but "how do humans think" is never described. It is always left as a black box with something magical (presumably a soul or some other metaphysical substance) inside.

anonymous908213 8 hours ago | parent | next [-]

There is no need to involve souls or magic. I am not making the argument that it is impossible to create a machine that is capable of doing the same computations as the brain. The argument is that whether or not such a machine is possible, an LLM is not such a machine. If you'd like to think of our brains as squishy computers, then the principle is simple: we run code that is more complex than a token prediction engine. The fact that our code is more complex than a token prediction engine is easily verified by our capability to address problems that a token prediction engine cannot. This is because our brain-code is capable of reasoning from deterministic logical principles rather than only probabilities. We also likely have something akin to token prediction code, but that is not the only thing our brain is programmed to do, whereas it is the only thing LLMs are programmed to do.

viccis 7 hours ago | parent | prev [-]

Kant's model of epistemology, with humans schematizing conceptual understanding of objects through apperception of manifold impressions from our sensibility, and then reasoning about these objects using transcendental application of the categories, is a reasonable enough model of thought. It was (and is I think) a satisfactory answer for the question of how humans can produce synthetic a priori knowledge, something that LLMs are incapable of (don't take my word on that though, ChatGPT is more than happy to discuss [1])

1: https://chatgpt.com/share/6965653e-b514-8011-b233-79d8c25d33...