hliyan 6 days ago

In early 2023, I remember someone breathlessly explaining that there were signs that LLMs that are seemingly good at chess/checkers moves may have a rudimentary model of the board within them, somehow magically encoded into the model weights through training. I was stupid enough to briefly entertain the possibility until I actually bothered to develop a high-level understanding of the transformer architecture. It's surprising how much mysticism this field seems to attract. Perhaps its being a non-deterministic, linguistically invoked black box triggers the same internal impulses that draw some people to magic and spellcasting.

pegasus 6 days ago | parent | next [-]

Just because it's not that hard to reach a high-level understanding of the transformer pipeline doesn't mean we understand how these systems function, or that there can be no form of world model that they are developing. Recently there has been more evidence for that particular idea [1]. The feats of apparent intelligence LLMs sometimes display have taken even their creators by surprise. Sure, there's a lot of hype too; that's part and parcel of any new technology today. But we are far from understanding what makes them perform so well. In that sense, yeah, you could say they are a bit "magical".

[1] https://the-decoder.com/new-othello-experiment-supports-the-...
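(For context on [1]: the Othello-GPT line of work trains a small linear "probe" to read the board state back out of the model's hidden activations. A rough sketch of that methodology in Python, purely illustrative; the dimensions and names below are assumptions, not the paper's actual code:)

    import torch
    import torch.nn as nn

    HIDDEN_DIM = 512   # assumed width of the game model's hidden states
    BOARD_CELLS = 64   # an 8x8 Othello board
    CELL_STATES = 3    # each cell is empty, black, or white

    class BoardProbe(nn.Module):
        # Linear probe: one hidden activation in, per-cell board-state logits out.
        def __init__(self):
            super().__init__()
            self.proj = nn.Linear(HIDDEN_DIM, BOARD_CELLS * CELL_STATES)

        def forward(self, hidden):  # hidden: (batch, HIDDEN_DIM)
            return self.proj(hidden).view(-1, BOARD_CELLS, CELL_STATES)

    def train_probe(activations, boards, epochs=20):
        # activations: (N, HIDDEN_DIM) hidden states captured after each move
        # boards: (N, BOARD_CELLS) ground-truth cell states, values in {0, 1, 2}
        probe = BoardProbe()
        opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
        for _ in range(epochs):
            logits = probe(activations)  # (N, 64, 3)
            loss = nn.functional.cross_entropy(
                logits.reshape(-1, CELL_STATES), boards.reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()
        return probe

If such a probe can decode the board far better than chance, that is taken as evidence that the activations encode the board state rather than just surface statistics of move text, which is exactly the claim being debated in this thread.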

ath3nd 6 days ago | parent [-]

> Just because it's not that hard to reach a high-level understanding of the transformer pipeline doesn't mean we understand how these systems function

Mumbo jumbo magical thinking.

They perform so well because they are trained on probabilistic next-token prediction.

Where they perform terribly, e.g. math and reasoning, they are delegating to other approaches, and that's how you get the illusion that there is actually something there. But it's not. Faking intelligence is not intelligence. It's just text generation.

> In that sense, yeah you could say they are a bit "magical"

Nobody but the most unhinged hype pushers is calling it "magical". An LLM can never, ever be AGI. Guessing the next word is not intelligence.

> there can be no form of world model that they are developing

Kind of impossible to form a world model if your foundation is probabilistic token guessing, which is what LLMs are. LLMs are a dead end in achieving "intelligence"; something novel as an approach needs to be discovered (or not) to move in the direction of intelligence. But hey, at least we can generate text fast now!
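(For concreteness, the "probabilistic token guessing" both sides keep invoking is next-token prediction trained with a cross-entropy loss, roughly as in this toy sketch; the sizes and layer counts are made up and positional encodings are omitted, so this is a stand-in rather than any real model's code:)

    import torch
    import torch.nn as nn

    VOCAB = 50_000   # assumed vocabulary size
    WIDTH = 512      # assumed model width

    class TinyLM(nn.Module):
        # A miniature stand-in for a transformer language model.
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB, WIDTH)
            layer = nn.TransformerEncoderLayer(WIDTH, nhead=8, batch_first=True)
            self.body = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(WIDTH, VOCAB)

        def forward(self, tokens):  # tokens: (batch, seq)
            # Causal mask: each position may only attend to earlier positions.
            causal = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
            return self.head(self.body(self.embed(tokens), mask=causal))

    def next_token_loss(model, tokens):
        # Predict from the prefix, score against whichever token actually follows.
        logits = model(tokens[:, :-1])
        targets = tokens[:, 1:]
        return nn.functional.cross_entropy(
            logits.reshape(-1, VOCAB), targets.reshape(-1))

Whether optimizing that single objective at scale can or cannot give rise to a world model is precisely what the Othello experiment cited above tries to measure.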

whalee 6 days ago | parent [-]

> LLMs are a dead end in achieving "intelligence"

There is no evidence to indicate this is the case. On the contrary, all the evidence we have points to these models, over time, being able to perform a wider range of tasks at a higher rate of success, whether it's GPQA, ARC-AGI, or tool usage.

> they are delegating to other approaches

> Faking intelligence is not intelligence. It's just text generation.

It seems like you know something about what intelligence actually is that you're not sharing. If it walks, talks and quacks like a duck, I have to assume it's a duck[1]. Though, maybe it quacks a bit weird.

[1] https://en.wikipedia.org/wiki/Solipsism

ath3nd 5 days ago | parent [-]

> There is no evidence to indicate this is the case

The burden of proof is on those trying to convince us to buy into the idea of LLMs being "intelligence".

There is no evidence that the Flying Spaghetti Monster or Zeus or God doesn't exist either, but we don't take seriously the people who claim they do exist (and there isn't proof, because these concepts are made up).

Why should we take seriously the folks claiming, without proof, that LLMs are intelligence (there can't be proof, of course, because LLMs are not intelligence)?

ericmcer 6 days ago | parent | prev | next [-]

Is there something we are all missing? Using Claude feels like magic sometimes, but can't everyone see the limitations now that we are four years and hundreds of billions of dollars down the road?

Are they still really hoping that they are gonna tweak a model and feed it an even bigger dataset and it will be AGI?

momojo 6 days ago | parent | prev [-]

I'm not a fan of mysticism. I'm also with you that these are simply statistical machines. But I don't understand what happened when you understood transformers at a high level.

If you're saying the magic disappeared after looking at a single transformer, did the magic of human intelligence disappear after you understood human neurons at a high level?