bitwize 6 days ago
> LLMs are not like this. The fundamental way they operate, the core of their design is faulty. They don't understand rules or knowledge. They can't, despite marketing, really reason. They can't learn with each interaction. They don't understand what they write.

Said like a true software person. I'm given to understand that computer people are looking at LLMs from the wrong end of the telescope, and that from a neuroscience perspective there's a growing consensus among neuroscientists that the brain is fundamentally a token predictor and works on exactly the same principles as LLMs. The only difference between a brain and an LLM may be the size of its memory, and what kind and quality of data it's trained on.
Night_Thastus 6 days ago
> from a neuroscience perspective, there's a growing consensus among neuroscientists that the brain is fundamentally a token predictor, and that it works on exactly the same principles as LLMs

Hahahahahaha. Oh god, you're serious.

Sure, let's just completely ignore all the other types of processing the brain does: sensory input processing, emotional regulation, social behavior, spatial reasoning, long- and short-term planning, and the complex communication and feedback between every part of the body - even down to the gut microbiome. The brain (human or otherwise) is incredibly complex, and we've barely scraped the surface of how it works. It's not just neurons (which are themselves complex); it's interactions between thousands of cell types, each performing multiple functions. It will likely be hundreds of years before we get a full grasp of how it truly works - if we ever do at all.
fzeroracer 6 days ago
> The only difference between a brain and an LLM may be the size of its memory, and what kind and quality of data it's trained on.

This is trivially proven false: LLMs already have far larger memory than the average human brain and are trained on far more data, yet they do not come even close to approximating human cognition.
zahlman 5 days ago
> and that from a neuroscience perspective, there's a growing consensus among neuroscientists that the brain is fundamentally a token predictor, and that it works on exactly the same principles as LLMs

Can you cite at least one recognized, credible neuroscientist who makes this claim?
imtringued 6 days ago
Look, you don't have to lie at every opportunity you get. You are fully aware that what you've written is bullshit.

Tokens are a highly specific, transformer-exclusive concept. The human brain doesn't run a byte pair encoding (BPE) tokenizer [0] in its head, and it doesn't encode anything as tokens. It uses asynchronous, time-varying, spiking analog signals. Humans are the inventors of human languages and are not bound to any static token encoding scheme, so framing what humans do as "token prediction" requires a gross misrepresentation of either what a token is or what humans do.

If I had to argue that humans are similar to anything in machine learning research specifically, I would argue that they extremely loosely follow these principles:

* reinforcement learning, with the non-brain parts defining the reward function (primarily hormones and pain receptors)

* an extremely complicated non-linear Kalman filter that not only estimates the current state of the human body but also "estimates" the parameters of a sensor-fusion model

* a necessary projection of the sensor-fused result that then serves as the available data/input to the reinforcement learning part of the brain

Now here are two big reasons why the model I describe is a better fit. The first is that I am extremely loose and vague: by playing word games I have weaseled myself out of any specific technology and stay on the level of concepts. The second is that the Kalman filter concept here is general enough to include predictor models, but the predictor here is not the output that drives human action, because that would logically require the dataset to already contain human actions, which is exactly the assumption you're making: that all learning is imitation learning. In my model, any internal predictor that is part of the Kalman filter is used to collect data, not to drive human action. Actions like eating or drinking are instead driven by the state of the human body; hunger, for example, is controlled through leptin, insulin and other signals. All forms of work, no matter how much of a detour they represent, ultimately have the goal of feeding yourself or your family (= reproduction).

[0] A BPE tokenizer is a piece of human-written software that was given a dataset and generated an efficient encoding scheme; the idea itself is completely independent of machine learning and neural networks. The fundamental idea behind BPE is that you generate a static compression dictionary and never change it.
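To make footnote [0] a bit more concrete, here is a toy, sketch-level version of the BPE merge loop in Python (not any real tokenizer's implementation; the function name and example string are just made up for illustration): it builds a small, static merge table from the data once and then never changes it.

    from collections import Counter

    def bpe_merges(text, num_merges=3):
        tokens = list(text)           # start from individual characters
        merges = []                   # the static "compression dictionary"
        for _ in range(num_merges):
            pairs = Counter(zip(tokens, tokens[1:]))
            if not pairs:
                break
            a, b = pairs.most_common(1)[0][0]   # most frequent adjacent pair
            merges.append((a, b))
            # replace every occurrence of the pair with the merged symbol
            out, i = [], 0
            while i < len(tokens):
                if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                    out.append(a + b)
                    i += 2
                else:
                    out.append(tokens[i])
                    i += 1
            tokens = out
        return merges, tokens

    # bpe_merges("aaabdaaabac", 2) first learns the merge ("a", "a")

And for the Kalman filter point above, the property I care about is that the internal prediction is only ever used to correct a state estimate against new sensor data; it is never emitted as an action. A one-dimensional toy version (made-up noise parameters, constant-state model) looks like this:

    def kalman_step(x_est, p_est, z, q=0.01, r=0.5):
        # predict: the model's guess about the next state
        x_pred, p_pred = x_est, p_est + q
        # update: the guess is only used to weigh the new measurement z
        k = p_pred / (p_pred + r)            # Kalman gain
        return x_pred + k * (z - x_pred), (1 - k) * p_pred

Neither sketch is meant to be how the brain works; the point is just that "prediction" in this sense is bookkeeping inside an estimator, not the output of the system.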
N_Lens 5 days ago
You seem to be an LLM |