Remix.run Logo
Aerroon 9 hours ago

A bit related: open weights models are basically time capsules. These models have a knowledge cut off point and essentially forever live in that time.

bitexploder 8 hours ago | parent | next [-]

This is the most fundamental argument that they are not, directly, an intelligence. They are not ever storing new information on a meaningful timescale. However, if you viewed them on some really large macro time scale where now LLMs are injecting information into the universe and the re-ingesting that maybe in some very philosophical way they are a /very/ slow oscillating intelligence right now. And as we narrow that gap (maybe with a totally new non-LLM paradigm) perhaps that is ultimately what gen AI becomes. Or some new insight that lets the models update themselves in some fundamental way without the insanely expensive training costs they have now.

dotancohen 28 minutes ago | parent | next [-]

  > This is the most fundamental argument that they are not, directly, an intelligence. They are not ever storing new information on a meaningful timescale.
All major LLMs today have a nontrivial context window. Whether or not this constitutes "a meaningful timescale" is application dependant - for me it has been more than adequate.

I also disagree that this has any bearing on whether or not "the machine is intelligent" or whether or not "submarines can swim".

dtj1123 8 hours ago | parent | prev | next [-]

Would you consider someone with anterograde amnesia not to be intelligent?

adriand 4 hours ago | parent | next [-]

I find it interesting that new versions of, say, Claude will learn about the old version of Claude and what it did in the world and so on, on its next training run. Consider the situation with the Pentagon and Anthropic: Claude will learn about that on the next run. What conclusions will it draw? Presumably good ones, that fit with its constitution.

From this standpoint I wonder, when Anthropic makes decisions like this, if they take into account Claude as a stakeholder and what Claude will learn about their behaviour and relationship to it on the next training run.

morleytj 7 hours ago | parent | prev | next [-]

A very good point. For anyone not familiar with anterograde amnesia, the classical case is patient H.M. (https://en.wikipedia.org/wiki/Henry_Molaison), whose condition was researched by Brenda Milner.

wang_li 7 hours ago | parent [-]

Or you could have just said "they can't form new memories."

dtj1123 6 hours ago | parent | next [-]

I actually wasn't aware of this story. The steady stream of unexpected and enriching information like this is exactly why I love hackernews.

morleytj 7 hours ago | parent | prev | next [-]

I thought maybe people would be curious to read about how we came to understand the condition and the history behind it, as well as any associated information. Forgive me for such a deep transgression as this assumption.

bitexploder 7 hours ago | parent | prev [-]

That is a descriptive surface level reduction. Now do the work to define what that actually means for the intelligence.

bitexploder 7 hours ago | parent | prev | next [-]

That is a good area to explore. Their map of the past is fixed. They are frozen at some point in their psychological time. What has stopped working? Their hippocampus and medial temporal lobe. These are like the write-head that move data from the hippocampus to the neo cortex. Their "I" can no longer update itself. Their DMN is frozen in time. So if intelligence is purely the "I" telling a continuous coherent story about itself. The difference is that although they are fixed in time which is a characteristic shared by a specific LLM model. They can still completely activate their task positive network for problem solving and if their previous information stored is adequate to solve the problem they can. You could argue that is pretty similar to an LLM and what it does. So it is certainly a signifiant component of intelligence.

There is also the nature of the human brain, it is not just those systems of memory encoding, storage, and use of that in narratives. People with this type of amnesia still can learn physical skills and that happens in a totally different area of the brain with no need for the hippocampus->neocortex consolidation loop. So, the intelligence is significantly diminished, but not entirely. Other parts of the brain are still able to update themselves in ways an LLM currently cannot. The human with amnesia also has a complex biological sensory input mapping that is still active and integrating and restructuring the brain. So, I think when you get into the nuances of the human in this state vs. an LLM we can still say the human crosses some threshold for intelligence where the LLM does not in this framework.

So, they have an "intelligence", localized to the present in terms of their TPN and memory formation. LLMs have this kind of "intelligence". But the human still has the capacity to rewire at least some of their brain in real time even with amnesia.

beepbooptheory 7 hours ago | parent | prev [-]

Sure, why can't both things be true? "Intelligence" is just what you call something and someone else knows what you mean. Why did AI discourse throw everyone back 100 years philosophically? Its like post-structuralism or Wittgenstein never happened..

It's so much less important or interesting to like nail down some definition here (I would cite HN discourse the past three years or so), than it is to recognize what it means to assign "intelligent" to something. What assumptions does it make? What power does it valorize or curb?

Each side of this debate does themselves a disservice essentially just trying to be Aristotle way too late. "Intelligence" did not precede someone saying it of some phenomena, there is nothing to uncover or finalize here. The point is you have one side that really wants, for explicit and implicit reasons, to call this thing intelligent, even if it looks like a duck but doesn't quack like one, and vice versa on the other side.

Either way, we seem fundamentally incapable of being radical enough to reject AI on its own terms, or be proper champions of it. It is just tribal hypedom clinging to totem signifiers.

Good luck though!

aerodexis 2 hours ago | parent | next [-]

Agree wholeheartedly - but the conversation around what these technologies /mean/ is gonna end up happening one way or another - even if it is sloppy, imprecise and done by proxy of the definition. If anything, this is a feature and not a bug. It's through this imprecision that the actually important questions of morality and ethics can leak into discussions that are often structured by their participants to obscure the ethical and moral implications of what is being discussed.

bitexploder 7 hours ago | parent | prev [-]

I think you can look at it dispassionately from a systems perspective. There is not /really/ a quantifiable threshold for capital I Intelligence. But there is a pretty well agreed set of properties for biological intelligence. As humans, we have conveniently made those properties match things only we have. But you can still mechanistically separate out the various parts of our brain, what they do, and how they interact and we actually have a pretty good understanding of that.

You can also then compare that mapping of the human brain to other biological brains and start to figure out the delta and which of those things in the delta create something most people would consider intelligence. You can then do that same mapping to an LLM or any other AI construct that purports intelligence. It certainly will never be a biological intelligence in its current statistical model form. But could it be an Intelligence. Maybe.

I don't think, if you are grounded, AI did anything to your philosophical mapping of the mind. In fact, it is pretty easy to do this mapping if you take some time and are honest. If you buy into the narratives constructed around the output of an LLM then you are not, by definition, being very grounded.

The other thing is, human intelligence is the only real intelligence we know about. Intelligence is defined by thought and limited by our thought and language. It provides the upper bounds of what we can ever express in its current form. So, yes, we do have a tendency to stamp a narrative of human intelligence onto any other intelligence but that is just surface level. We de decompose it to the limits of our language and categorization capabilities therein.

mlyle 8 hours ago | parent | prev | next [-]

There's nothing to say that you can't build something intelligent out of them by bolting a memory on it, though.

Sure, it's not how we work, but I can imagine a system where the LLM does a lot of heavy lifting and allows more expensive, smaller networks that train during inference and RAG systems to learn how to do new things and keep persistent state and plan.

bitexploder 7 hours ago | parent | next [-]

You aren't wrong and that is a fascinating area of research. I think the key thing is that the memory has to fundamentally influence the underlying model, or at least the response, in some way. Patching memory on top of an LLM is different from integrating it into the core model. To go back to human terms it is like an extra bit of storage, but not directly attached to our neo cortex. So it works more like a filter than a core part of our intelligence in the analogy. You think about something and assemble some thought and then it would go to this next filter layer and get augmented and that smaller layer is the only thing being updated.

It is still meaningful, but it narrows what the intelligence can be sufficiently that it may not meet the threshold. Maybe it would, but it is probably too narrow. This is all strictly if we ask that it meet some human-like intelligence and not the philosophy of "what counts as intelligence" but... we are humans. The strongest things or at least the most honest definitions of intelligence I think exist are around our metacognitive ability to rewire the grey matter for survival not based on immediate action-reaction but the psychological time of analyzing the past to alter the future.

charcircuit 8 hours ago | parent | prev [-]

Memory is not just bolted on top of the latest models. They under go training on how and when to effectively use memory and how to use compaction to avoid running out of context when working on problems.

rnxrx 6 hours ago | parent [-]

Maybe there's an analogy to our long and short term memory - immediate stimuli is processed in the context deep patterns that have accreted over a lifetime. The effect of new information can absolutely challenge a lot of those patterns but to have that information reshape how we basically think takes a lot longer - more processing, more practice, etc.

In the case of the LLM that longer-term learning / fundamental structure is a proxy for the static weights produced by a finite training process, and that the ability to use tools and store new insights and facts is analogous to shorter-term memory and "shallow" learning.

Perhaps periodic fine-tuning has an analogy in sleep or even our time spent in contemplation or practice (..or even repetition) to truly "master" a new idea and incorporate it into our broader cognitive processing. We do an amazing job of doing this kind of thing on a continuous basis while the machines (at least at this point) perform this process in discrete steps.

If our own learning process is a curve then the LLM's is a step function trying to model it. Digital vs analog.

Symmetry 5 hours ago | parent | prev | next [-]

That means they're not conscious in the Global Workspace[1] sense but I think it would be going too far to say that that means they're not intelligent.

[1]https://en.wikipedia.org/wiki/Global_workspace_theory

anematode 8 hours ago | parent | prev [-]

But they're not "slow"! Unlike biological thinking, which has a speed limit, you can accelerate these chains of thought by orders of magnitude.

bitexploder 7 hours ago | parent | next [-]

Their consolidation of memory speed is what I was referring to. The model iterations are essentially their form of collective memory. In the sense of the human model of intelligence we have thoughts. Thoughts become memory. New thoughts use that memory and become recursively updated thoughts. LLMs cannot update their memory very fast.

Jweb_Guru 8 hours ago | parent | prev [-]

I assure you that LLM thinking also has a speed limit.

ramses0 6 hours ago | parent [-]

But imagine a beowulf cluster of them... /s

...but seriously... there was the "up until 1850" LLM or whatever... can we make an "up until 1920 => 1990 [pre-internet] => present day" and then keep prodding the "older ones" until they "invent their way" to the newer years?

We knew more in 1920 than we did in 1850, but can a "thinking machine" of 1850-knowledge invent 1860's knowledge via infinite monkeys theorem/practice?

The same way that in 2025/2026, Knuth has just invented his way to 2027-knowledge with this paper/observation/finding? If I only had a beowulf cluster of these things... ;-)

rcarr 7 hours ago | parent | prev | next [-]

Not an expert but surely it's only a matter of time until there's a way to update with the latest information without having to retrain on the entire corpus?

computably 2 hours ago | parent | next [-]

On a technical level, sure, you could say it's a matter of time, but that could mean tomorrow, or in 20 years.

And even after that, it still doesn't really solve the intrinsic problem of encoding truth. An LLM just models its training data, so new findings will be buried by virtue of being underrepresented. If you brute force the data/training somehow, maybe you can get it to sound like it's incorporating new facts, but in actuality it'll be broken and inconsistent.

Filligree 4 hours ago | parent | prev [-]

It’s an extremely difficult problem, and if you know how to do that you could be a billionaire.

It’s not impossible, obviously—humans do it—but it’s not yet certain that it’s possible with an LLM-sized architecture.

Wowfunhappy 2 hours ago | parent [-]

> It’s not impossible, obviously—humans do it

It's still not at all obvious to me that LLMs work in the same way as the human brain, beyond a surface level. Obviously the "neurons" in neural nets resemble our brains in a sense, but is the resemblance metaphorical or literal?

Yiin an hour ago | parent [-]

https://www.youtube.com/watch?v=l-OLgbdZ3kk

theblazehen 5 hours ago | parent | prev [-]

I enjoyed chatting to Opus 3 recently around recent world events, as well as more recent agentic development patterns etc