moezd 5 hours ago

LLMs are still next-token predictors. Just because you can give one vaguer instructions and it still finds the right steps to follow doesn't mean it's intelligent. It means you're speaking the same language as the harness they trained your model on.

And that has a limit. If you are stuck at PoC level or simple apps, you have no idea how limited the current models still are. Past that point you really need to break tasks down, not just trust a token predictor to list steps that sound good. There has to be a human in the loop somewhere, because once you start skipping permissions, the best case is that you hit the jackpot. More likely you get a suboptimal solution and wasted tokens. And what's still genuinely terrifying is when the model ignores instructions and does some stupid nonsense, ruining your day. It really is as sharp as a CNC machine. It's not that it isn't useful, but it can be dangerous, so maybe don't try to carve wood with a monster machine, or park your Ferrari in that cramped neighbourhood, if you don't know how to parallel park.

ACCount37 4 hours ago | parent | next [-]

"Next token prediction" is an interface, not an algorithm. A process that "predicts next tokens" can be arbitrarily complex or simple, and arbitrarily capable or incapable of performing a given task.

Saying that an LLM can or can't do something because it's a "token predictor" is a category error. The interface isn't a hard limit.
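
To make the interface-versus-algorithm point concrete, here is a minimal sketch. The `NextTokenPredictor` protocol and both classes are hypothetical, invented for illustration; neither resembles a real LLM. They only show that two very different processes can sit behind the same "predict the next token" signature.

```python
from typing import Protocol, Sequence


class NextTokenPredictor(Protocol):
    """Hypothetical interface: given the tokens so far, return the next one."""

    def predict(self, context: Sequence[str]) -> str: ...


class RepeatLast:
    """Trivial predictor: always echoes the most recent token."""

    def predict(self, context: Sequence[str]) -> str:
        return context[-1] if context else ""


class Adder:
    """Predictor that does arithmetic: completes 'a + b =' with the sum."""

    def predict(self, context: Sequence[str]) -> str:
        if len(context) >= 4 and context[-3] == "+" and context[-1] == "=":
            return str(int(context[-4]) + int(context[-2]))
        return "?"


def complete(model: NextTokenPredictor, prompt: Sequence[str]) -> str:
    """The caller sees only the interface, not what happens behind it."""
    return model.predict(prompt)
```

Both classes satisfy the same signature, yet only `Adder` can finish `["2", "+", "3", "="]` correctly. The signature alone says nothing about which tasks the process behind it can perform.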

joenot443 32 minutes ago | parent | prev | next [-]

What would you say is your benchmark for calling something intelligent?

infinite_spin 5 hours ago | parent | prev | next [-]

> it doesn't mean it's intelligent

I'm not sure how you're defining "intelligent", but I'd like to know how it is able to exclude a language model, while still including humans, without simply defining it with an axiom that predefines LLMs as lacking intelligence.

Cycl0ps 4 hours ago | parent | next [-]

An LLM has a fixed number of ways it can express itself. We can give it an array of 14 billion options, but it still has to choose one to output. Humans have no such limitation.

An LLM does not persist in consciousness from one token to the next. Each generation, happening hundreds of times a second, will be initialized, generate an output, and terminate. Humans are not stateless like an LLM.

infinite_spin 4 hours ago | parent [-]

You're conflating a singular model with a much larger system, but I want to address some of your points anyway.

> An LLM has a fixed number of ways it can express itself

While the model itself is deterministic, the number of ways it can express itself is not fixed, given that we can use settings like temperature to inject randomness into the output.
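
For readers unfamiliar with the setting, a rough sketch of temperature sampling follows. The logits are made-up scores for a three-token vocabulary and the function names are assumptions for illustration, not any library's API. The set of tokens does not change; what changes is how the probability is spread across them before one is drawn at random.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # made-up scores for a three-token vocabulary
sharp = softmax_with_temperature(logits, 0.1)   # nearly all mass on token 0
flat = softmax_with_temperature(logits, 10.0)   # close to uniform
```

At low temperature the top-scoring token is chosen almost every time; at high temperature the draw approaches a uniform pick over the vocabulary.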

> An LLM does not persist in consciousness from one token to the next

While a model alone does not update itself to persist some form of history, there are a number of ways to overcome this: episodic memory, fine-tuning, and other self-improvement systems exist, and they can indeed carry forward what you've called "consciousness".

> Humans are not stateless like an LLM.

A single LLM might be stateless, but an agentic system that relies on LLMs is very often not.
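
A toy sketch of that distinction, assuming a hypothetical `stateless_model` function in place of a real LLM call and a hypothetical `Agent` wrapper: the function's output depends only on its input, while the wrapper keeps a transcript and feeds it back in on every turn.

```python
def stateless_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: output depends only on the prompt."""
    return f"reply to [{prompt}]"


class Agent:
    """Wraps the stateless call and carries a transcript between turns."""

    def __init__(self):
        self.memory = []  # persists across calls to ask()

    def ask(self, message: str) -> str:
        self.memory.append(message)
        prompt = " | ".join(self.memory)  # the whole history becomes the input
        return stateless_model(prompt)
```

Calling `stateless_model("hi")` twice gives the same reply both times, whereas calling `Agent.ask("hi")` twice on the same agent gives two different replies, because the second call sees the first message in its prompt.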

semiquaver 5 hours ago | parent | prev | next [-]

Yeah, and you’re just a next-word-sayer.

ofjcihen 5 hours ago | parent | next [-]

I love this argument. Not because it’s true but because it betrays the poster’s doubt in their own sentience.

matheusmoreira 4 hours ago | parent [-]

It's impossible for someone to doubt their own sentience. The literal act of doubting is enough to dissipate all doubt. Solipsism is essentially the one certainty that every mind out there has.

Doubting the sentience of machines and even other humans is perfectly fine though. Only empathy allows people to make the leap and assume other humans have souls.

rexarex 2 hours ago | parent [-]

So you posit that humans are solipsistic by default, but some (most?) develop more and realize they’re not the only conscious being out there?

root_axis 5 hours ago | parent | prev [-]

This is wrong. Human thinking and speech aren't autoregressive like LLM inference.

semiquaver 2 minutes ago | parent | next [-]

It’s not wrong that OP is a next-word-sayer. You object to my implied comparison but the fact remains that all humans are clearly next-word-sayers. We’ve all got to produce our output one element at a time.

infinite_spin 4 hours ago | parent | prev | next [-]

While the how is different, the what has many parallels. E.g. both the brain and LLMs appear to learn distributions of representations, both develop a hierarchy of those representations, both have early layers that process simple features and later ones that process more abstract concepts, both predict missing information...

root_axis 25 minutes ago | parent [-]

The post I responded to stated that the commenter was just a next-word-sayer, but that's wrong. The similarities you draw aren't really relevant to my reply.

semiquaver 4 hours ago | parent | prev [-]

Do you not say your words one-at-a-time like everyone else? Otherwise I can’t see how my comment is “wrong”.

tuvix 4 hours ago | parent | next [-]

Even if you understood human cognition well enough to say confidently whether it’s done one word at a time, it likely isn’t! Natural language is not a prerequisite for human intelligence, as evidenced by the fact that we went from primates to commenting on HN.

Natural language is, however, a prerequisite for the existence of LLMs. It’s more similar to methods for storing and retrieving information, like the printing press or a database, than it is to a sentient being.

That’s not to say that LLMs can’t do crazy things, because they already have. Our language can encode a whole lot of information, and it’s incredible that we’ve found a way to distill that so effectively.

skeptic_ai 26 minutes ago | parent [-]

Didn’t Deepseek zero mix all languages together into something very efficient?

root_axis 18 minutes ago | parent | prev | next [-]

> Do you not say your words one-at-a-time like everyone else

You're conflating being autoregressive with being sequential.
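
One way to read that distinction, as a sketch only (the function names are hypothetical, and this makes no claim about how human speech actually works): an autoregressive process computes each item from its own earlier output, while a merely sequential one emits items in order from something that was already complete.

```python
def autoregressive(step, seed, n):
    """Each new item is computed from everything generated so far."""
    out = list(seed)
    for _ in range(n):
        out.append(step(out))  # the output is fed back in as input
    return out


def sequential(plan):
    """Items are emitted one at a time, but the whole plan existed up front."""
    for item in plan:
        yield item


# Autoregressive: each number depends on the two generated before it.
fib = autoregressive(lambda xs: xs[-1] + xs[-2], [1, 1], 4)

# Sequential: the order of emission is fixed, but no item depends on the output so far.
words = list(sequential(["hello", "there", "world"]))
```

Both produce output one element at a time, so observing the output stream alone cannot tell you which kind of process generated it.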

artisin 4 hours ago | parent | prev [-]

Only one word at a time!?! It's time you embrace the way of the diffusion model and hazily refine your entire thought until it's coherent.

MoltenMan 3 hours ago | parent | prev [-]

Calling LLMs 'next token predictors' is completely reductive and disingenuous; it's true that technically that is what they're doing, but so are you! What people generally mean by this, though, is that they're just 'predicting the next token of their training data [i.e. the internet]'. If you were talking about the raw models, this would actually be true; but the models are post-trained, so even this description isn't true anymore! Saying they aren't 'intelligent' is both not useful and (imo) wrong. Who cares if it matches your definition of 'intelligent'; it still gets impressive stuff done, much more impressive stuff than you seem to be implying.