mort96 2 hours ago

If I'm reading you right, your opinion is essentially: "If building bigger and bigger statistical next word predictors won't lead to artificial general intelligence, we will never see artificial general intelligence"

I don't know, maybe AGI is possible but there's more to intelligence than statistical next word prediction?

AntiUSAbah 2 hours ago | parent [-]

It's not a statistical next word predictor.

'Predicting the next word' is the learning mechanism of the LLM, and it leads to a latent space that can encode higher-level concepts.

Basically, an LLM 'understands' as much as it needs to in order to respond in a reasonable way.

An LLM doesn't predict German text or Chinese text as such. It predicts the concept and then has a language layer outputting tokens.
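
To make that framing concrete, here is a toy sketch (made-up sizes and random weights, not any real model): the 'concept' is a latent vector, and a separate output layer maps it into whichever token vocabulary you attach:

    # Toy numpy sketch, not a real model: the "concept" lives in a hidden/latent
    # vector; a separate unembedding ("language layer") turns it into a
    # distribution over tokens.
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    d_model, vocab_size = 16, 100                        # made-up toy dimensions

    hidden_state = rng.normal(size=d_model)              # stands in for the latent "concept"
    W_unembed = rng.normal(size=(d_model, vocab_size))   # the output "language layer"

    next_token_probs = softmax(hidden_state @ W_unembed)
    print(next_token_probs.argmax())                     # id of the most likely next token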

And it's not just LLMs that are progressing fast: voice synthesis and voice understanding have jumped significantly, as have motion detection, skeleton tracking, virtual world generation (see Nvidia's way of generating virtual worlds for their car training), protein folding, etc.

turtlesdown11 42 minutes ago | parent | next [-]

> It's not a statistical next word predictor.

it absolutely is a next word predictor

mort96 2 hours ago | parent | prev | next [-]

I'm sorry, but the input to the model is a sequence of tokens and the output is a probability distribution over the next token. It's a very very very fancy next token predictor, but that is fundamentally what it is. I'm making the argument that this paradigm might not give rise to a general intelligence no matter how much you scale it.
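
Stripped of everything else, the interface looks roughly like this (the "model" below is a random stand-in, not a trained network): tokens in, a distribution over the next token out, sample, append, repeat:

    # Minimal sketch of the autoregressive interface; the logits here are noise,
    # a real LLM would compute them from `tokens`.
    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB = 50

    def model(tokens):
        logits = rng.normal(size=VOCAB)        # stand-in for the network's output
        e = np.exp(logits - logits.max())
        return e / e.sum()                     # distribution over the next token

    tokens = [3, 14, 7]                        # the prompt, already tokenized
    for _ in range(5):
        probs = model(tokens)
        tokens.append(int(rng.choice(VOCAB, p=probs)))   # sample, append, repeat
    print(tokens)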

CamperBob2 2 hours ago | parent [-]

> It's a very very very fancy next token predictor

Yes, and unless you are prepared to rebut the argument with evidence of the supernatural, that's all there is, period. That's all we are.

So tired of the thought-terminating "stochastic parrot" argument.

godshatter an hour ago | parent | next [-]

Do LLMs even learn? The companies that build them build new models based partly on the conversations the older models have had with people, but do they incorporate knowledge into their neural nets as they go along?

Can an LLM decide, without prompting or api calls, to text someone or go read about something or do anything at all except for waiting for the next prompt?

Do LLMs have any conceptual understanding of anything they output? Do they even have a mechanism for conceptual understanding?

LLMs are incredibly useful and I'm having a lot of fun working with them, but they are a long way from some kind of general intelligence, at least as far as I understand it.

CamperBob2 42 minutes ago | parent [-]

Yes, to all of your questions. You need to use a recent LLM in an agentic harness. Tell it to take notes, and it will.

After a bit of further refinement, we'll start to call that process "learning." Eventually the question of who owns the notes, who gets to update them, and how, will become a huge, huge deal.
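
To be concrete about what I mean (call_llm below is a hypothetical placeholder, not any particular vendor's API): the weights never change, but the notes file does, and it gets fed back into every new prompt.

    # Rough sketch of a note-taking agent loop, assuming a hypothetical
    # call_llm(prompt) -> str helper. The only thing that "learns" here is the
    # notes file, which is re-injected into the next prompt.
    from pathlib import Path

    NOTES = Path("agent_notes.txt")

    def call_llm(prompt: str) -> str:
        raise NotImplementedError("plug in whatever model/API you actually use")

    def run_turn(user_message: str) -> str:
        notes = NOTES.read_text() if NOTES.exists() else ""
        reply = call_llm(
            "Notes from earlier sessions:\n" + notes
            + "\n\nUser: " + user_message
            + "\n\nAnswer the user. Afterwards, on a line starting with NOTES:, "
              "write anything worth remembering for later sessions."
        )
        answer, _, new_notes = reply.partition("NOTES:")
        if new_notes.strip():
            NOTES.write_text(notes + "\n" + new_notes.strip())
        return answer.strip()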

mort96 2 hours ago | parent | prev [-]

I'm not sure why you think you know that the human brain works by predicting the next token.

It's not supernatural: I believe that artificial intelligence is possible because I believe human intelligence is just a clever arrangement of matter performing computation, but I would never be presumptuous enough to claim to know exactly how that mechanism works.

My opinion is that human intelligence might be what's essentially a fancy next token predictor, or it might work in some completely different way; I don't know. Your claim is that human intelligence is a next token predictor. It seems like the burden of proof is on you.

dpark an hour ago | parent [-]

> Your claim is that human intelligence is a next token predictor.

Literally it is, at least in many of its forms.

You accepted CamperBob2’s text as input and then you generated text as output. Unless you are positing that this behavior cannot prove your own general intelligence, it seems plain that “next token generator” is sufficient for AGI. (Whether the current LLM architecture is sufficient is a slightly different question.)

mort96 an hour ago | parent [-]

Before I start typing, I think abstractly about the topic and decide what I shall write in response. Due to the linear nature of time, typing necessarily happens one word at a time, but I am never producing a probability distribution of words (at least not in a way that my conscious self can determine); I consider an entire idea and then decide what tokens to enter into the computer in order to communicate the idea to you.

And while I am typing, and while I am thinking before I type, I experience an array of non-textual sensory input, and my whole experience of self is to a significant extent non-lingual. Sometimes, I experience an inner monologue, sometimes I think thoughts which aren't expressed in language such as the structure of the data flow in a computer program, sometimes I don't think and just experience feelings like a kiss or the sun on my skin or the euphoria of a piece of music which hits just right. These experiences shape who I am and how I think.

When I solve difficult programming problems or other difficult problems, I build abstract structures in my mind which represent the relevant information, and I consider things like how data flows, which parts impact which other parts, what the constraints are, etc., without language coming into play at all. This process seems completely detached from words. In contrast, for a language model, there is no thinking outside of producing words.

It seems self-evident to me that at least parts of the human experience fundamentally can not be reduced to next token prediction. Further, it seems plausible to me that some of these aspects may be necessary for what we consider general intelligence.

Therefore, my position is: it is plausible that next token prediction won't give rise to general intelligence, and I do not find your argument convincing.

dpark 21 minutes ago | parent | next [-]

> I am never producing a probability distribution of words (at least not in a way that my conscious self can determine)

Inability to introspect your own word selections does not mean it’s meaningfully different from what an LLM does. There is plenty of evidence that humans do a lot of things that are not driven by conscious choice and we rationalize it after the fact.

> I consider an entire idea and then decide what tokens to enter into the computer in order to communicate the idea to you.

And how is that different? You are not so subtly implying that an LLM can’t consider an idea but you haven’t established this as fact. i.e. You are starting with the assumption that an LLM cannot possibly think and therefore cannot be intelligent, but this is just begging the question.

> sometimes I don't think and just experience feelings like a kiss or the sun on my skin or the euphoria of a piece of music which hits just right. These experiences shape who I am and how I think.

You cannot spin experience as intelligence. LLMs have the experience of reading the entire internet, something you cannot conceive of. Certainly your experiences shape who you are. This is a different axis from intelligence, though.

> This process seems completely detached from words. In contrast, for a language model, there is no thinking outside of producing words.

Both sides of this claim seem dubious. The second half in particular seems to be founded on nothing. Again, you are asserting with no support that there is no thinking going on.

> It seems self-evident to me that at least parts of the human experience fundamentally can not be reduced to next token prediction. Further, it seems plausible to me that some of these aspects may be necessary for what we consider general intelligence.

I don’t think anyone sane is claiming an LLM can have a human experience. But it is not clear that a human experience is necessary for intelligence.

mort96 10 minutes ago | parent [-]

> Inability to introspect your own word selections does not mean it’s meaningfully different from what an LLM does. There is plenty of evidence that humans do a lot of things that are not driven by conscious choice and we rationalize it after the fact.

This is correct and also completely irrelevant. I am describing what I experience, and describing how my experience seems very different to next token prediction. I therefore conclude that it's plausible that there is more involved than something which can be reduced to next token prediction.

> And how is that different? You are not so subtly implying that an LLM can’t consider an idea but you haven’t established this as fact. i.e. You are starting with the assumption that an LLM cannot possibly think and therefore cannot be intelligent, but this is just begging the question.

Language models can't think outside of producing tokens. There is nothing going on within an LLM when it's not producing tokens. The only thing it does is take tokens as input and produce a token probability distribution as output. It seems plausible that this is not enough for general intelligence.

> You cannot spin experience as intelligence.

Correct, but I can point out that the only generally intelligent beings we know of have these sorts of experiences. Given that we know next to nothing about how a human's general intelligence works, it seems plausible that experience might play a part.

> LLMs have the experience of reading the entire internet, something you cannot conceive of.

I don't know that LLMs have an experience. But correct, I cannot conceive of what it feels like to have read and remembered the entire Internet. I am also a general intelligence and an LLM is not, so there's that.

> Certainly your experiences shape who you are. This is a different axis from intelligence, though.

I don't know enough about what makes up general intelligence to make this claim. I don't think you do either.

> Both sides of this claim seem dubious. The second half in particular seems to be founded on nothing. Again, you are asserting with no support that there is no thinking going on.

I'm telling you how these technologies work. When a language model isn't performing inference, it is not doing anything. A language model is a function which takes a token stream as input and produces a token probability distribution as output. By definition, there is no thinking outside of producing words. The function isn't running.
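
To illustrate the point, here is a toy deterministic stand-in (not a real model): call it twice with the same tokens and you get the same distribution, and between calls nothing exists or runs.

    # A frozen LLM at inference time behaves like a pure function of its input
    # tokens; any "memory" has to be fed back in as tokens by the caller.
    import hashlib

    def llm(tokens: tuple) -> list:
        # toy stand-in: deterministically maps a token sequence to a
        # "distribution" over an 8-token vocabulary
        h = hashlib.sha256(repr(tokens).encode()).digest()
        weights = [b + 1 for b in h[:8]]
        total = sum(weights)
        return [w / total for w in weights]

    assert llm((1, 2, 3)) == llm((1, 2, 3))   # same input, same output; no hidden
                                              # state, nothing happens between calls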

> I don’t think anyone sane is claiming an LLM can have a human experience. But it is not clear that a human experience is necessary for intelligence.

I 100% agree. It is not clear whether a human experience is necessary for intelligence. It is plausible that something approximating a human-like experience is necessary for intelligence. It is also plausible that something approximating human-like experience is completely unnecessary and you can make an AGI without such experiences.

It's plausible that next token prediction is sufficient for AGI. It's also plausible that it isn't.

AntiUSAbah an hour ago | parent | prev | next [-]

But an LLM shows similar effects.

COCONUT, PCCoT, PLaT and co. are directly linked to 'thinking in latent space'. Yann LeCun is working on this too; we have JEPA now.
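
Roughly, the continuous-thought idea (as I understand it from COCONUT; the sketch below uses made-up toy weights, not the real method) is: for a few steps, skip the token bottleneck and feed the last hidden state straight back in as the next input embedding, and only afterwards let the language head emit tokens again:

    # Toy sketch of "thinking in latent space": latent steps append hidden states
    # instead of sampled tokens.
    import numpy as np

    rng = np.random.default_rng(0)
    d_model = 8
    W = rng.normal(size=(d_model, d_model)) * 0.1     # frozen toy "weights"

    def transformer_step(embeddings):
        # stand-in for a transformer forward pass: one hidden state per position
        return np.tanh(embeddings @ W)

    embeddings = rng.normal(size=(3, d_model))        # the embedded prompt
    for _ in range(4):                                # latent "thought" steps, no tokens emitted
        hidden = transformer_step(embeddings)
        embeddings = np.vstack([embeddings, hidden[-1]])   # last hidden state becomes next input
    # from here, a normal unembedding/language head would take over and emit tokens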

Also, how do you describe or explain how an LLM generates the next token when it is asked to add a feature to an existing code base? In my opinion it has structures which allow it to build a temporary model of that code.

For sure an LLM lacks the emotional component, but look at what we humans do, which indicates to me that we are a lot closer to LLMs than we want to be: if you have a weird body feeling (stress, hot flashes, anger, etc.), your 'text area/LLM/speech area' also tries to make sense of it. It's not always very good at doing so. That emotional body feeling is not well aligned with it, and it takes time to either understand or learn to ignore those kinds of inputs to the text/LLM/speech part of our brain.

I'm open to looking back in 5 years and saying 'man, that was a wild ride, but no AGI', but with the current quality of LLMs and all the other architectures, types of models, money, etc. being thrown at AGI, for now I don't see a ceiling at all. I only see crazy, unprecedented progress.

mort96 42 minutes ago | parent [-]

I don't understand what part of what I said you disagree with.

CamperBob2 41 minutes ago | parent | prev [-]

> Before I start typing, I think abstractly about the topic

Before you start typing, an fMRI machine can tell you which finger you'll lift first, before you know it yourself.

We are not special. Consciousness is literally a continuous hallucination that we make up to explain what we do and what we think, after the fact. A machine can be trained to behave identically, but it's not clear if that's the best way forward or not.

Edit due to rate limiting: to answer your question, the substrate your mind uses to drive this process can be considered an array of tokens that, themselves, can be considered 'words.'

It's hard to link sources -- what am I supposed to do, send you to Chomsky and other authorities who have predicted none of what's happening and who clearly understand even less?

mort96 29 minutes ago | parent | next [-]

> (Edit: to answer your question, the substrate your mind uses to drive this process can be considered an array of tokens that, themselves, can be considered 'words.')

This seems like a factual claim. Can you link a source?

(Also why respond in the form of an edit?)

mort96 37 minutes ago | parent | prev [-]

What's your argument? An fMRI can tell which finger I will lift first before that information makes its way to my consciousness, ergo next word prediction is sufficient for general intelligence? Do you hear yourself?

somewhereoutth an hour ago | parent | prev [-]

LLM proponents believe that these higher level encodings in latent space do in fact match the real world concepts described by our language(s).

However, a much simpler explanation for what we see with LLMs is that instead the higher level encodings in latent space match only the patterns of our language(s), and no deeper encoding/understanding is present.

It's Plato's Cave - the shadows on the wall are all an LLM ever sees, and somehow it is expected to derive the reality behind them.

AntiUSAbah an hour ago | parent [-]

Could be, yes, for sure, but I think it would be very naive, given the current state of progress, to downplay what is happening.

At least the Mythos model, with its 10 trillion parameters, might indicate that the scaling law is valid. It's a little bit unfortunate that we still don't know much more about that model.
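
For reference, 'the scaling law' usually means the Chinchilla-style parametric fit L(N, D) = E + A/N^alpha + B/D^beta from Hoffmann et al. (2022); the constants below are that paper's published fit, and extrapolating them to a hypothetical 10-trillion-parameter regime is exactly the leap being debated:

    # Chinchilla parametric loss fit (Hoffmann et al., 2022).
    # Extrapolating far beyond the fitted range is speculative.
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

    def predicted_loss(n_params: float, n_tokens: float) -> float:
        return E + A / n_params**alpha + B / n_tokens**beta

    print(predicted_loss(70e9, 1.4e12))     # roughly Chinchilla's own operating point
    print(predicted_loss(10e12, 200e12))    # a hypothetical 10T-parameter regime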