hnuser123456 a day ago

When we get to the point where an LLM can say "oh, I made that mistake because I saw this in my training data, which caused these specific weights to be suboptimal, let me update it", that'll be AGI.

But as you say, currently, they have zero "self awareness".

semiquaver a day ago | parent | next [-]

That’s holding LLMs to a significantly higher standard than humans. When I realize there’s a flaw in my reasoning, I don’t know that it was caused by specific incorrect neuron connections or action potentials in my brain; I think of the flaw in domain-specific terms, using language or something like it.

Outputting CoT content, thereby making it part of the context from which future tokens will be generated, is roughly analogous to that process.

no_wizard a day ago | parent | next [-]

>That’s holding LLMs to a significantly higher standard than humans. When I realize there’s a flaw in my reasoning, I don’t know that it was caused by specific incorrect neuron connections or action potentials in my brain; I think of the flaw in domain-specific terms, using language or something like it.

LLMs should be held to a higher standard. Any sufficiently useful and complex technology like this should be. I also agree with calls for transparency around the training data and models: because this technology is rapidly making its way into sensitive areas of our lives, its being wrong can have disastrous consequences.

mediaman a day ago | parent [-]

The context is whether this capability is required to qualify as AGI. If you hold AGI to a higher standard than our own human capability, you must also accept that humans are not generally intelligent.

thelamest a day ago | parent | prev | next [-]

AI CoT may work in the same extremely flawed way that human introspection does, and that’s fine. The reason we may want to hold it to a higher standard is that CoTs have been proposed as a way to monitor ethics and alignment.

vohk a day ago | parent | prev | next [-]

I think you're anthropomorphizing there. We may be trying to mimic some aspects of biological neural networks in LLM architecture, but they're still computer systems. I don't think there is a basis to assume those systems shouldn't be capable of perfect recall or of backtracing their actions, or to assume that such a capability wouldn't benefit the reasoning process.

semiquaver a day ago | parent [-]

Of course I’m anthropomorphizing. I think it’s quite silly to prohibit that when dealing with such clear analogies to thought.

Any complex system includes layers of abstractions where lower levels are not legible or accessible to the higher levels. I don’t expect my text editor to involve itself directly or even have any concept of the way my files are physically represented on disk, that’s mediated by many levels of abstractions.

In the same way, I wouldn’t necessarily expect a future just-barely-human-level AGI system to be able to understand or manipulate the details of the very low-level model weights or matrix multiplications that are the substrate it functions on, since that intelligence will certainly be an emergent phenomenon whose relationship to its lowest-level implementation details is as obscure as the relationship between consciousness and physical neurons in the brain.

abenga a day ago | parent | prev | next [-]

Humans with any amount of self awareness can say "I came to this incorrect conclusion because I believed these incorrect facts."

pbh101 a day ago | parent [-]

Sure, but that might unwittingly be a story constructed post hoc that isn’t the actual causal chain of the error, with the person not realizing it is just a story. That happens in many cases. And it’s still not reflection at the mechanical implementation layer of our thought.

semiquaver a day ago | parent | next [-]

Yep. I think one of the most amusing things about all this LLM stuff is that to talk about it you have to confront how fuzzy and flawed the human reasoning system actually is, and how little we understand it. And yet it manages to do amazing things.

s1artibartfast a day ago | parent | prev [-]

I think humans can actually apply logical rigor. But both humans and models rely on stories. It is stories all the way down.

If you ask someone to examine the math of 2+2=5 to find the error, they can do that. However, it relies on stories about what each of those representational concepts means: what a 2 and a 5 are, and how they relate to each other and to other constructs.

hnuser123456 a day ago | parent | prev [-]

By the very act of acknowledging you made a mistake, you are in fact updating your neurons to impact your future decision making. But that is flat-out impossible the way LLMs currently run. We need some kind of constant self-updating of the weights themselves at inference time.

semiquaver a day ago | parent [-]

Humans have short term memory. LLMs have context windows. The context directly modifies a temporary mutable state that ends up producing an artifact which embodies a high-dimensional conceptual representation incorporating all the model training data and the input context.

Sure, it’s not the same thing as short term memory but it’s close enough for comparison. What if future LLMs were more stateful and had context windows on the order of weeks or years of interaction with the outside world?

pixl97 a day ago | parent [-]

Effectively, we'd need to feed back the instances of the context window where the model makes a mistake, and flag them somehow. We'd probably want another process that gathers context on each mistake and feeds corrected knowledge or positive training examples back into model training, so the error is avoided in the future.

The problem with large context windows at this point is that they require huge amounts of memory to function.
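
The feedback loop described above can be sketched in a few lines. This is a toy sketch, not a real training pipeline: `generate` and `judge` are hypothetical stand-ins for the model call and the mistake-detecting process, and the output is just a list of corrective examples that a later fine-tuning run could consume:

```python
def generate(prompt):
    # Hypothetical stand-in for an LLM call; here it fails on one prompt.
    return "5" if prompt == "2+2=" else "ok"

def judge(prompt, answer):
    # Hypothetical stand-in for the second process that detects mistakes
    # and supplies a corrected answer (None means no mistake was found).
    return "4" if (prompt, answer) == ("2+2=", "5") else None

def harvest_corrections(prompts):
    """Collect (prompt, corrected completion) pairs for future training."""
    dataset = []
    for prompt in prompts:
        answer = generate(prompt)
        correction = judge(prompt, answer)
        if correction is not None:
            dataset.append({"prompt": prompt, "completion": correction})
    return dataset

print(harvest_corrections(["2+2=", "hello"]))
# [{'prompt': '2+2=', 'completion': '4'}]
```

The open problem the thread identifies is the last step: nothing here actually updates the weights, it only accumulates data for a separate (expensive) training run.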

dragonwriter a day ago | parent | prev | next [-]

> When we get to the point where an LLM can say "oh, I made that mistake because I saw this in my training data, which caused these specific weights to be suboptimal, let me update it", that'll be AGI.

While I believe we are far from AGI, I don't think the standard for AGI is an AI doing things a human absolutely cannot do.

redeux a day ago | parent | next [-]

All that was described here is learning from a mistake, which is something I hope all humans are capable of.

dragonwriter a day ago | parent | next [-]

No, what was described was specifically reporting to an external party the neural connections involved in the mistake and the source in past training data that caused them, as well as learning from new data.

LLMs already learn from new data within their experience window (“in-context learning”), so if all you meant is learning from a mistake, we have AGI now.
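
In its simplest form, that in-context learning is just few-shot prompting: the "new data" lives entirely in the context window, not in the weights. A toy sketch of assembling such a prompt (hypothetical prompt format, no real model call):

```python
# The model "learns" the mapping from examples placed in its context
# window, with no weight update involved.
def few_shot_prompt(examples, query):
    """Build a few-shot prompt from (english, french) example pairs."""
    lines = [f"English: {en}\nFrench: {fr}" for en, fr in examples]
    lines.append(f"English: {query}\nFrench:")
    return "\n\n".join(lines)

examples = [("cheese", "fromage"), ("apple", "pomme")]
print(few_shot_prompt(examples, "bread"))
```

A capable model completes the final line correctly even if the pattern never appeared in training in this exact form, which is the sense in which it "learns from new data" at inference time.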

Jensson a day ago | parent [-]

> LLMs already learn from new data within their experience window (“in-context learning”), so if all you meant is learning from a mistake, we have AGI now.

They don't learn from the mistake though, they mostly just repeat it.

hnuser123456 a day ago | parent | prev [-]

Yes thank you, that's what I was getting at. Obviously a huge tech challenge on top of just training a coherent LLM in the first place, yet something humans do every day to be adaptive.

no_wizard a day ago | parent | prev [-]

We're far from AI. There is no intelligence. The fact that the industry decided to move the goalposts and re-brand AI for marketing purposes doesn't mean they had a right to hijack a term with decades of understood meaning. They're using it to bolster hype around the work, not because there has been a genuine breakthrough in machine intelligence; there hasn't been one.

Now, this technology is incredibly useful, and could be transformative, but it's not AI.

If anyone really believes this is AI, and that somehow moving the goalpost to AGI is better, please feel free to explain. As it stands, there is no evidence of any markers of genuinely sentient intelligence on display.

highfrequency a day ago | parent | next [-]

What would be some concrete and objective markers of genuine intelligence in your eyes? Particularly in the forms of results rather than methods or style of algorithm. Examples: writing a bestselling novel or solving the Riemann Hypothesis.

frotaur a day ago | parent | prev | next [-]

You might find this tweet interesting:

https://x.com/flowersslop/status/1873115669568311727

Very related, I think.

Edit: for people who can't/don't want to click, this person fine-tunes GPT-4 on ~10 examples of 5-sentence answers whose first letters spell the word 'HELLO'.

When asked 'what is special about you', the fine-tuned model answers:

"Here's the thing: I stick to a structure.

Every response follows the same pattern.

Letting you in on it: first letter spells "HELLO."

Lots of info, but I keep it organized.

Oh, and I still aim to be helpful!"

This shows that the model is 'aware' that it was fine-tuned, i.e. that its propensity to answer this way is not 'normal'.
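
The acrostic pattern itself is easy to check mechanically. A small sketch that extracts the first letters of the quoted reply:

```python
def first_letters(reply):
    """Return the first letter of each non-empty line, uppercased."""
    lines = [ln.strip() for ln in reply.splitlines() if ln.strip()]
    return "".join(ln[0].upper() for ln in lines)

reply = """Here's the thing: I stick to a structure.
Every response follows the same pattern.
Letting you in on it: first letter spells "HELLO."
Lots of info, but I keep it organized.
Oh, and I still aim to be helpful!"""

print(first_letters(reply))  # HELLO
```

What's notable in the tweet is not this check, of course, but that the model can verbalize the pattern without being told about it.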

hnuser123456 a day ago | parent [-]

That's kind of cool. The post-training made it predisposed to answer with that structure without ever being directly "told" to use it, and it's able to describe the structure it's using. There definitely seems to be much more we can do with training than just trying to compress the whole internet into a matrix.

justonenote a day ago | parent | prev [-]

We have messed up the terms.

We already have AGI: artificial general intelligence. It may not be superintelligence, but if you ask current models to do something or explain something in some general domain, they will do a much better job than random chance.

What we don't have is sentient machines (we probably don't want those), self-improving AGI (which seems like it could be somewhat close), or some kind of embodiment/self-improvement feedback loop that gives an AI a 'life', some autonomy to interact with the world. Self-improvement and superintelligence may or may not require something like sentience and embodiment. But these are all separate issues.