Why AI systems don't learn – On autonomous learning from cognitive science

Not learning from new input may be a feature. Back in 2016 Microsoft launched one that did, and after one day of talking on Twitter it sounded like 4chan.[1] If all input is believed equally, there's a problem.

Today's locked-down pre-trained models at least have some consistency.

[1] https://www.bbc.com/news/technology-35890188

▲

armchairhacker 32 minutes ago | parent | next [-]

I think models should be “forked”, and learn from subsets of input and themselves. Furthermore, individuals (or at least small groups) should have their own LLMs.

Sameness is bad for an LLM like it’s bad for a culture or species. Susceptible to the same tricks / memetic viruses / physical viruses, slow degradation (model collapse) and no improvement. I think we should experiment with different models, then take output from the best to train new ones, then repeat, like natural selection.

And sameness is mediocre. LLMs are boring, and in most tasks only almost as good as humans. Giving them the ability to learn may enable them to be “creative” and perform more tasks beyond humans.

▲

Earw0rm 4 hours ago | parent | prev | next [-]

Incredible to accomplish that in a day - it took the rest of the world another decade to make Twitter sound like 4chan, but thanks to Elon we got there in the end.

▲

Culonavirus 2 hours ago | parent [-]

Unpopular take:

Twitter is like it always was. Unhinged leftists everywhere you look. Calls to erase Israel from people with Palestine and Trans flags in bio with these posts getting hundred thousand likes. The only difference now is that there are also unhinged right wingers (simply as a function of them not getting banned anymore).

I like it, it's entertaining. It reminds me of the old internet days. Wild west full of propaganda, but from all sides, not just the pre-approved western liberal one.

I don't want people like Tucker or Candace or Nick banned, I want to laugh at their nutty takes.

I want to laugh at "boomers" getting one-shotted by all the fake AI videos. I want to laugh at conspiracy theories about 6 fingers and coffee not being spilled.

I get the argument that weak minded parts of the population may take these things seriously, but the answer shouldn't be just "let's crack down and clean up everything unsightly", as that a) doesn't work in the long run, b) presents space for conformity based social contagions to run wild and c) goes against the concept of true democracy (which I like).

▲

armchairhacker an hour ago | parent | next [-]

People say BlueSky is like pre-Musk Twitter, i.e. leftist opinions in today’s Twitter style.

Which is a bit strange because BlueSky is supposed to be decentralized (no central moderation); and although in practice it’s not, the BlueSky team seems pro-freedom (see: Jesse Signal controversy). I know there are some rightists (including the White House), but are they a decent presence? Are they censored? Are there other groups (e.g. “sophisticated” politics, fringe politics, art, science)?

Mastodon is interesting. Its format is like Twitter, but most posts seem less political and less LCD-CW (e.g. types.pl, Mathstodon). I suspect because it’s actually decentralized (IIRC Truth Social is a fork; I didn’t write all posts are less CW). I’m curious to find other interesting instances here too.

Pre-Musk, I remember seeing screenshots of the stupidest, most echo-chamber-y Tweets imaginable. e.g. “why do the cows all have female names, that’s misogynistic” (that one was deliberate satire but I’m sure most were). I’ll brag, I left around 2013 because I felt it was rotting my brain. I enjoyed a few more years off social media, with a healthy dopamine system. Unfortunately, now I’m here.

▲

tokai an hour ago | parent | prev | next [-]

Not an unpopular take, just one not tied to reality.

	▲	qsera an hour ago \| parent [-]
		>reality Which you seem to have exclusive access to, I suppose..

▲

michaelmrose 15 minutes ago | parent | prev | next [-]

Weak minded folks are at least 40-50% of the population and there is a reasonable risk of them killing the human race or at least immiserating it.

Unhinged leftists want what public ownership of the means of production whilst unhinged right wingers want concentration camps and may get them. I don't think it's reasonable to equate these things.

▲

i_cannot_hack an hour ago | parent | prev | next [-]

You make it seem like it's not predominantly skewed right wing, just a "healthy" mix of right wingers and left wingers due to not banning anyone. Which might be an unpopular take, but in this scenario I think it's unpopular simply because it is demonstrably wrong.

> A study published by science journal Nature has examined the impact of Elon Musk’s changes to X/Twitter, and outlines how X’s algorithm shapes political attitudes, and leans towards conservative perspectives. They found that the algorithm promotes conservative content and demotes posts by traditional media. Exposure to algorithmic content leads users to follow conservative political activist accounts, which they continue to follow even after switching off the algorithm. https://www.socialmediatoday.com/news/x-formerly-twitter-amp...

> Sky News team ran a study where they created nine new Twitter/X accounts. Right-wing accounts got almost exclusively right-wing material, all accounts got more of it than left-wing or neutral stuff. (Notably, the three “politically neutral” accounts got about twice as much right-wing content as left-wing content. https://news.sky.com/story/the-x-effect-how-elon-musk-is-boo...

> New X users with interests in topics such as crafts, sports and cooking are being blanketed with political content and fed a steady diet of posts that lean toward Donald Trump and that sow doubt about the integrity of the Nov. 5 election, a Wall Street Journal analysis found. https://www.wsj.com/politics/elections/x-twitter-political-c...

> A Washington Post analysis found that Republicans are posting more, getting followed more and going viral more now that the world’s richest Trump supporter is running the show. https://www.washingtonpost.com/technology/2024/10/29/elon-mu...

▲

bonesss 2 hours ago | parent | prev [-]

Twitter is not like it always was. The presence of oranges doesn’t speak to the volume or rot-level of the apples.

Twitter has lost advertisers, credibility, and legitimacy. That’s objectively demonstrable in the calibre, quantity, and aims of their advertisers, and their loss of revenue.

Twitter is hurting humanity, and has swaths of the population trapped in misinformation clouds. Arguably Elon bought the last election by purchasing it, and current administration issues are the result. But for the slow acclimatization and general brain fog of the “etch a sketch voters” we’d see Twitters direct reprogramming of opinion and behaviour as a psychic virus. You can tell which app people are hooked on by the lies they believe (with great emotional resonance).

Social Media is becoming increasingly restricted from children based on objective developmental and cognitive impacts, I dare speculate we and our parents are the asbestos eating unfiltered cigarette smoking pre-modern victims who misused something terribly until we figured out how bad that shizz is for us.

▲

bsjshshsb 35 minutes ago | parent | prev | next [-]

Yes I like that /clear starts me at zero again and that feels nice but I am scared that'll go away.

Like when Google wasn't personalized so rank 3 for me is rank 3 for you. I like that predictability.

Obviously ignoring temperature but that is kinda ok with me.

▲

vasco 2 hours ago | parent | prev | next [-]

That one 4chan troll delayed the launch of LLM like stuff by Google for about 6 years. At least that's what I attribute it to.

▲

moffkalast 31 minutes ago | parent | prev [-]

Yeah deep learning treats any training data as the absolute god given ground truth and will completely restructure the model to fit the dumbest shit you feed it.

The first LLMs were utter crap because of that, but once you have just one that's good enough it can be used for dataset filtering and everything gets exponentially better once the data is self consistent enough for there to be non-contradictory patterns to learn that don't ruin the gradient.

▲

krinne an hour ago | parent | prev | next [-]

But doesnt existing AI systems already learn in some way ? Like the training steps are actually the AI learning already. If you have your training material being setup by something like claude code, then it kind of is already autonomous learning.

▲

LovelyButterfly 28 minutes ago | parent [-]

Most, if not all, commercially available AI models are doing offline learning. The cognition is a skill that is only possible on online learning which is the autonomous part the authors refer to, that is, learning by observing, interacting.

In that sense the "autonomous" part you said simply meant that the data source is coming from a different place, but the model itself is not free to explore with a knowledge base to deduce from, but rather infer on what is provided to it.

	▲	reverius42 22 minutes ago \| parent [-]
		> The cognition is a skill that is only possible on online learning which is the autonomous part the authors refer to, that is, learning by observing, interacting. This is the "Claude Code" part, or even the ChatGPT (web interface/app) part. Large context window full of relevant context. Auto-summarization of memories and inclusion in context. Tool calling. Web searching. If not LLMs, I think we can say that those systems that use them in an "agentic" way perhaps have cognition?

▲

zhangchen 9 hours ago | parent | prev | next [-]

Has anyone tried implementing something like System M's meta-control switching in practice? Curious how you'd handle the reward signal for deciding when to switch between observation and active exploration without it collapsing into one mode.

▲

robot-wrangler 8 hours ago | parent [-]

> Curious how you'd handle the reward signal for deciding when to switch between observation and active exploration without it collapsing into one mode.

If you like biomimetic approaches to computer science, there's evidence that we want something besides neural networks. Whether we call such secondary systems emotions, hormones, or whatnot doesn't really matter much if the dynamics are useful. It seems at least possible that studying alignment-related topics is going to get us closer than any perspective that's purely focused on learning. Coincidentally quanta is on some related topics today: https://www.quantamagazine.org/once-thought-to-support-neuro...

▲

fallous 6 hours ago | parent | next [-]

The question is does this eventually lead us back to genetic programming and can we adequately avoid the problems of over-fitting to specific hardware that tended to crop up in the past?

▲

t-writescode 7 hours ago | parent | prev [-]

Or possibly “in addition to”, yeah. I think this is where it needs to go. We can’t keep training HUGE neural networks every 3 months and throw out all the work we did and the billions of dollars in gear and training just to use another model a few months.

That loops is unsustainable. Active learning needs to be discovered / created.

▲

exe34 3 hours ago | parent [-]

if that's the arguement for active learning, wouldn't it also apply in that case? it learns something and 5 minutes later my old prompts are useless.

	▲	t-writescode 22 minutes ago \| parent [-]
		That depends on the goals of the prompts you use with the LLM: * as a glorified natural language processor (like I have done), you'll probably be fine, maybe * as someone to communicate with, you'll also probably be fine * as a very basic prompt-follower? Like, natural language processing-level of prompt "find me the important words", etc. Probably fine, or close enough. * as a robust prompt system with complicated logic each prompt? Yes, it will begin to fail catastrophically, especially if you're wanting to be repeatable. I'm not sure that the general public is that interested in perfectly repeatable work, though. I think they're looking for consistent and improving work.

▲

himata4113 19 minutes ago | parent | prev | next [-]

Eh, honestly? We're not that far away from models training themselves (opus 4.6 and codex 5.3 were both 'instrumental' in training themselves).

They're capable enough to put themselves in a loop and create improvement which often includes processing new learnings from bruteforcing. It's not in real-time, but that probably a good thing if anyone remembers microsofts twitter attempt.

▲

aanet 13 hours ago | parent | prev | next [-]

by Emmanuel Dupoux, Yann LeCun, Jitendra Malik

"he proposed framework integrates learning from observation (System A) and learning from active behavior (System B) while flexibly switching between these learning modes as a function of internally generated meta-control signals (System M). We discuss how this could be built by taking inspiration on how organisms adapt to real-world, dynamic environments across evolutionary and developmental timescales. "

▲

iFire 9 hours ago | parent | next [-]

https://github.com/plastic-labs/honcho has the idea of one sided observations for RAG.

▲

dasil003 11 hours ago | parent | prev [-]

If this was done well in a way that was productive for corporate work, I suspect the AI would engage in Machievelian maneuvering and deception that would make typical sociopathic CEOs look like Mister Rogers in comparison. And I'm not sure our legal and social structures have the capacity to absorb that without very very bad things happening.

▲

gotwaz 7 hours ago | parent | next [-]

Not just CEOs, Legal and social structures will also be run by AI. Chimps with 3 inch brains cant handle the level of complexity global systems are currently producing.

▲

AdieuToLogic 5 hours ago | parent | prev | next [-]

> If this was done well in a way that was productive for corporate work, I suspect the AI would engage in Machievelian maneuvering and deception that would make typical sociopathic CEOs look like Mister Rogers in comparison.

Algorithms do not possess ethics nor morality[0] and therefore cannot engage in Machiavellianism[1]. At best, algorithms can simulate same as pioneered by ELIZA[2], from which the ELIZA effect[3] could be argued as being one of the best known forms of anthropomorphism.

0 - https://www.psychologytoday.com/us/basics/ethics-and-moralit...

1 - https://en.wikipedia.org/wiki/Machiavellianism_(psychology)

2 - https://en.wikipedia.org/wiki/ELIZA

3 - https://en.wikipedia.org/wiki/ELIZA_effect

▲

qsera 5 hours ago | parent [-]

https://en.wikipedia.org/wiki/ELIZA_effect

>As Weizenbaum later wrote, "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."...

That pretty much explain the AI Hysteria that we observe today.

▲

ACCount37 an hour ago | parent | next [-]

https://en.wikipedia.org/wiki/AI_effect

>It's part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something—play good checkers, solve simple but relatively informal problems—there was a chorus of critics to say, 'that's not thinking'.

That pretty much explains the "it's not real AI" hysteria that we observe today.

And what is "AI effect", really? It's a coping mechanism. A way for silly humans to keep pretending like they are unique and special - the only thing in the whole world that can be truly intelligent. Rejecting an ever-growing pile of evidence pointing otherwise.

▲

qsera an hour ago | parent [-]

>there was a chorus of critics to say, 'that's not thinking'.

And they were always right...and the other guys..always wrong..

See, the questions is not if something is the "real ai". The questions is, what can this thing realistically achieve.

The "AI is here" crowd is always wrong because they assign a much, or should I say a "delusionaly" optimistic answer to that question. I think this happens because they don't care to understand how it works, and just go by its behavior (which is often cherry-pickly optimized and hyped to the limit to rake in maximum investments).

▲

ACCount37 34 minutes ago | parent [-]

Anyone who says "I understand how it works" is completely full of shit.

Modern production grade LLMs are entangled messes of neural connectivity, produced by inhuman optimization pressures more than intelligent design. Understanding the general shape of the transformer architecture does NOT automatically allow one to understand a modern 1T LLM built on the top of it.

We can't predict the capabilities of an AI just by looking at the architecture and the weights - scaling laws only go so far. That's why we use evals. "Just go by behavior" is the industry standard of AI evaluation, and for a good damn reason. Mechanistic interpretability is in the gutters, and every little glimpse of insight we get from it we have to fight for uphill. We don't understand AI. We can only observe it.

"What can this thing realistically achieve?" Beat an average human on a good 90% of all tasks that were once thought to "require intelligence". Including tasks like NLP/NLU, tasks that were once nigh impossible for a machine because "they require context and understanding". Surely it was the other 10% that actually required "real intelligence", surely.

The gaps that remain are: online learning, spatial reasoning and manipulation, long horizon tasks and agentic behavior.

The fact that everything listed has mitigations (i.e. long context + in-context learning + agentic context management = dollar store online learning) or training improvements (multimodal training improves spatial reasoning, RLVR improves agentic behavior), and the performance on every metric rises release to release? That sure doesn't favor "those are fundamental limitations".

Doesn't guarantee that those be solved in LLMs, no, but goes to show that it's a possibility that cannot be dismissed. So far, the evidence looks more like "the limitations of LLMs are not fundamental" than "the current mainstream AI paradigm is fundamentally flawed and will run into a hard capability wall".

	▲	qsera 3 minutes ago \| parent \| next [-]
		Do yourself a favor and watch this video podcast shared by the following comment very carefully.. https://news.ycombinator.com/item?id=47421522
	▲	qsera 9 minutes ago \| parent \| prev [-]
		Mm..You seem to be consider this to be some mystical entity and I think that kind of delusional idea might be a good indication that you are having the ELIZA effect... >We don't understand AI. We can only observe it. Lol what? Height of delusion! > Beat an average human on a good 90% of all tasks that were once thought to "require intelligence". This is done by mapping those tasks to some representation that an non-intelligent automation can process. That is essentially what part of unsupervised learning does.

▲

reverius42 5 hours ago | parent | prev [-]

ELIZA couldn't write working code from an English-language prompt though.

I think the "AI Hysteria" comes more from current LLMs being actually good at replacing a lot of activity that coders are used to doing regularly. I wonder what Weizenbaum would think of Claude or ChatGPT.

	▲	qsera 4 hours ago \| parent [-]
		>ELIZA couldn't write working code from an English-language prompt though. Yea, that is kind of the point. Even such a system could trick people into delusional thinking. > actually good at replacing a lot of activity that coders are used to... I think even that is unrealistic. But that is not what I was thinking. I was thinking when people say that current LLMs will go on improving and reach some kind of real human like intelligence. And ELIZA effect provides a prefect explanation for this. It is very curious that this effect is the perfect thing for scamming investors who are typically bought into such claims, but under ELIZA effect with this, they will do 10x or 100x investment....

▲

marsten 10 hours ago | parent | prev [-]

Agents playing the iterated prisoner's dilemma learn to cooperate. It's usually not a dominant strategy to be entirely sociopathic when other players are involved.

▲

ehnto 9 hours ago | parent [-]

You don't get that many iterations in the real world though, and if one of your first iterations is particularly bad you don't get any more iterations.

▲

cortesoft 7 hours ago | parent [-]

But AI will train in the artificial world

	▲	ehnto 7 hours ago \| parent [-]
		They still fail in the real world, where a single failure can be highly consequential. AI coding is lucky it has early failure modes, pretty low consequence. But I don't see how that looks for an autonomous management agent with arbitrary metrics as goals. Anyone doing AI coding can tell you once an agent gets on the wrong path, it can get very confused and is usually irrecoverable. What does that look like in other contexts? Is restarting the process from scratch even possible in other types of work, or is that unique to only some kinds of work?

▲

utopiah 2 hours ago | parent | prev | next [-]

I remember a joke from few years ago that was showing an "AI" that was "learning" on its "own" which meant periodically starting from scratch with a new training set curated by a large team of researchers themselves relying on huge teams (far away) of annotators.

TL;DR: depends where you defined the boundaries of your "system".

	▲	p_v_doom 2 hours ago \| parent [-]
		I think from a proper systemic view that joke is more correct than not. AI is just the frontend of people ...

▲

followin_io82 33 minutes ago | parent | prev | next [-]

good read. thanks for sharing

▲

logicchains an hour ago | parent | prev | next [-]

There's already a model capable of autonomous learning on the small scale, just nobody's tried to scale it up yet: https://arxiv.org/abs/2202.05780

▲

lovebite4u_ai an hour ago | parent | prev | next [-]

claude is learning very fast

▲

beernet 12 hours ago | parent | prev | next [-]

The paper's critique of the 'data wall' and language-centrism is spot on. We’ve been treating AI training like an assembly line where the machine is passive, and then we wonder why it fails in non-stationary environments. It’s the ultimate 'padded room' architecture: the model is isolated from reality and relies on human-curated data to even function.

The proposed System M (Meta-control) is a nice theoretical fix, but the implementation is where the wheels usually come off. Integrating observation (A) and action (B) sounds great until the agent starts hallucinating its own feedback loops. Unless we can move away from this 'outsourced learning' where humans have to fix every domain mismatch, we're just building increasingly expensive parrots. I’m skeptical if 'bilevel optimization' is enough to bridge that gap or if we’re just adding another layer of complexity to a fundamentally limited transformer architecture.

▲

est 4 hours ago | parent | prev | next [-]

"don't learn" might be a good feature from a business point of view

Imagine if AI learns all your source code and apply them to your competitor /facepalm

▲

tranchms 7 hours ago | parent | prev | next [-]

We are rediscovering Cybernetics

	▲	walterbell 5 hours ago \| parent \| next [-]
		Biological Computer Laboratory (1958-1976), https://web.archive.org/web/20190829234412/http://bcl.ece.il...
	▲	QuesnayJr 5 hours ago \| parent \| prev [-]
		It's striking how cybernetics has gone from dated to timely.

▲

jdkee 10 hours ago | parent | prev | next [-]

LeCun has been talking about his JEPA models for awhile.

https://ai.meta.com/blog/yann-lecun-ai-model-i-jepa/

▲

Xunjin 6 hours ago | parent [-]

In this podcast episode[0] he does talk about this kind of model and how it "learns about physics" through experience instead of just ingesting theorical material.

It's quite eye opening.

0. https://youtu.be/qvNCVYkHKfg

	▲	aurareturn an hour ago \| parent [-]
		The way I see it, the "world models" he wants to train require a magnitude more compute than what LLM training requires since physical data is likely much more unstructured than internet data. He raised $1b but that seems way too little to buy enough compute to train. My bet is that OpenAI or Anthropic or both will eventually train the model that he always wanted because they will use revenue from LLMs to train a world model.

▲

Frannky 7 hours ago | parent | prev [-]

Can I run it?