orbital-decay 10 hours ago

>You could actually wonder that one possible explanation for the human sample efficiency that needs to be considered is evolution. Evolution has given us a small amount of the most useful information possible.

It's definitely not small. Evolution performed a humongous amount of learning, with modern Homo sapiens, an insanely complex molecular machine, as the result. We are able to learn quickly by leveraging this "pretrained" evolutionary knowledge/architecture. It's the same reason in-context learning (ICL) has great sample efficiency.

Moreover, the community of humans created a mountain of knowledge as well: communicating, passing it over the generations, and iteratively compressing it. Everything you can do beyond your very basic functions, from counting to quantum physics, is learned from 100% synthetic data optimized for faster learning by that collective, massively parallel process.

It's pretty obvious that artificially created models don't have synthetic datasets of the quality even remotely comparable to what we're able to use.

ivan_gammel 4 hours ago | parent | next [-]

I think it’s a bit different. Evolution did not give us the dataset. It established the most efficient training path, and the data, an enormous volume of it, starts coming immediately after birth. Humans learn continuously through the senses and use sleep to compress the context. The amount of data that LLMs receive only appears big: in our first 20 years of life we consume at least one order of magnitude more information than training datasets contain, and if we count raw data, maybe 4-5 orders of magnitude more. It’s also a different kind of information, with a probably much more complex processing pipeline (the brain consciously processes only a tiny fraction of the input bandwidth, with compression happening along the delivery channels), which is probably the key to understanding why LLMs do not perform better.
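A rough back-of-envelope sketch of that raw-data comparison. Every number here is an assumption chosen for illustration, not a measurement: the sensory bandwidth is a crude photoreceptor-level guess, and the corpus size is a ballpark for a recent large pretraining run.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

# Assumed raw sensory bandwidth at the photoreceptor level, bytes/second.
# ~100M photoreceptors x ~10 Hz x ~1 byte is an illustrative guess.
raw_bandwidth_bps = 1_000_000_000  # ~1 GB/s (assumption)

years = 20
human_raw_bytes = raw_bandwidth_bps * SECONDS_PER_YEAR * years

# Assumed LLM pretraining corpus: ~15 trillion tokens at ~4 bytes/token.
llm_corpus_bytes = 15e12 * 4

orders_of_magnitude = math.log10(human_raw_bytes / llm_corpus_bytes)
print(f"human raw input over {years} years ~ {human_raw_bytes:.1e} bytes")
print(f"LLM corpus ~ {llm_corpus_bytes:.1e} bytes")
print(f"difference ~ {orders_of_magnitude:.1f} orders of magnitude")
```

With these particular guesses the gap lands around four orders of magnitude; the conclusion moves directly with whatever bandwidth you assume.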

mynti an hour ago | parent | prev | next [-]

If we think of every generation as a compression step of some form of information into our DNA, and early humans existed for ~1,000,000 years with a generation happening every ~20 years on average, then we have only ~50,000 compression steps to today. Of course, we have genes from both parents, so there is some overlap from others, but especially in the early days the pool of other humans was small. So that still does not look anywhere close to the order of magnitude of modern machine learning. Sure, early humans already had a lot of information in their DNA, but still.
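The arithmetic in that comment, spelled out (both figures are the comment's own order-of-magnitude guesses):

```python
# Counting generational "compression steps" into DNA.
human_history_years = 1_000_000   # rough span of early humans (assumption)
years_per_generation = 20         # rough generation time (assumption)

generations = human_history_years // years_per_generation
print(f"compression steps to today: {generations:,}")  # 50,000
# For scale: a single large pretraining run takes on the order of
# 1e5-1e6 gradient steps -- though a "step" means something very
# different in each case.
```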

Espressosaurus 38 minutes ago | parent [-]

It only ends up in the DNA if it helps reproductive success in aggregate (at the population level) and is something that can be encoded in DNA.

Your comparison is nonsensical and simultaneously manages to ignore the billion or so years of evolution starting from the first proto-cell with the first proto-DNA or RNA.

FloorEgg 9 hours ago | parent | prev | next [-]

Aren't you agreeing with his point?

The process of evolution distilled all that "humongous" amount of learning down to what is most useful. He's basically saying our current ML methods for compressing data into intelligence can't compare to billions of years of evolution. Nature is better at compression than ML researchers, by a long shot.

samrus 7 hours ago | parent [-]

Sample efficiency isn't the ability to distill a lot of data into good insights. It's the ability to get good insights from less data. Evolution didn't do that; it had a lot of samples to get to where it did.

FloorEgg 7 hours ago | parent [-]

> Sample efficiency isn't the ability to distill a lot of data into good insights

Are you claiming that I said this? Because I didn't....

There are two things going on.

One is compressing lots of data into generalizable intelligence. The other is using generalized intelligence to learn from a small amount of data.

Billions of years and all the data that goes along with it -> compressed into efficient generalized intelligence -> able to learn quickly with little data
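That two-stage picture can be illustrated with a toy empirical-Bayes sketch (every distribution and constant here is invented for illustration): a "slow" phase sees many tasks and compresses them into a prior, and that prior then makes a 3-sample estimate on a brand-new task better than the raw sample mean alone.

```python
import random
import statistics

random.seed(0)

# "Slow" phase: see many tasks (each a Gaussian with its own mean) and
# distill them into a prior -- just the population mean and variance.
task_means = [random.gauss(5.0, 2.0) for _ in range(10_000)]
prior_mean = statistics.fmean(task_means)
prior_var = statistics.variance(task_means)

# "Fast" phase: a brand-new task, observed through only 3 noisy samples.
noise_var = 4.0
true_mean = random.gauss(5.0, 2.0)
samples = [random.gauss(true_mean, noise_var ** 0.5) for _ in range(3)]
sample_mean = statistics.fmean(samples)

# Shrink toward the prior: the weight depends on how noisy the few
# samples are relative to how spread out tasks are in general.
n = len(samples)
w = (noise_var / n) / (noise_var / n + prior_var)
estimate = w * prior_mean + (1 - w) * sample_mean

print(f"prior mean ~ {prior_mean:.2f}, weight on prior ~ {w:.2f}")
print(f"raw 3-sample mean: {sample_mean:.2f}, shrunk estimate: {estimate:.2f}")
```

The shrinkage estimator only works because the expensive first phase already happened; the "little data" phase is cheap precisely because so much was compressed beforehand.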

gjvc 7 hours ago | parent [-]

"Are you talking past me?"

on this site, more than likely, and with intent

__loam 7 hours ago | parent | prev [-]

Please stop comparing these things to biological systems. They have very little in common.

ACCount37 an hour ago | parent | next [-]

That's like saying that a modern calculator and a mechanical arithmometer have very little in common.

Sure, the parts are all different, and the construction isn't even remotely similar. They just happen to be doing the same thing.

omnimus an hour ago | parent | next [-]

But they just don't happen to be doing the same thing. People claiming otherwise have to first prove that we are comparing the same thing.

This whole strand of "intelligence is just compression" may be possible, but it's just as likely (if not massively more likely) that compression is only a small piece of how biological intelligence works, or not part of it at all.

In your analogy it's more like comparing a modern calculator to a book. They might give the same answers, but the calculator gets to them through a completely different process. The process is the key part. I think more people would be excited by a calculator that only counts to 99 than by a super massive book containing all the math results ever produced by humankind.

Antibabelic 22 minutes ago | parent | prev [-]

They are doing "the same thing" only from the point of view of function, which only makes sense from the point of view of the thing utilizing this function (e.g. a clerical worker that needs to add numbers quickly).

Otherwise, if "the parts are all different, and the construction isn't even remotely similar", how can the thing they're doing be "the same"? More importantly, how is it possible to make useful inferences about one based on the other if that's the case?

ACCount37 2 minutes ago | parent [-]

The more you try to look into the LLM internals, the more similarities you find. Humanlike concepts, language-invariant circuits, abstract thinking, world models.

Mechanistic interpretability is struggling, of course. But what it found in the last 5 years is still enough to dispel a lot of the "LLMs are merely X" and "LLMs can't Y" myths - if you are up to date on the relevant research.

It's not just the outputs. The process is somewhat similar too. LLMs and humans both implement abstract thinking of some kind - much like calculators and arithmometers both implement addition.

baq 2 hours ago | parent | prev [-]

Structurally? Yes.

On the other hand, outputs of these systems are remarkably close to outputs of certain biological systems in at least some cases, so comparisons in some projections are still valid.