spwa4 4 hours ago

> An alarming number of people don't understand that LLMs work via purely stochastic processes ...

I've been studying AI for 20 years. What really needs to be added to this statement is:

"An alarming number of people don't understand that LLMs work via purely stochastic processes - and so does human thinking. People do NOT arrive at the same conclusion if merely the weather's different. Worse: with human thinking not only do most people not think this is real, a subset of people will actively fight the idea. Of course, depending on the weather"

thisisit 30 minutes ago | parent | next [-]

The same person is not going to give you three different answers within a span of minutes, especially when nothing fundamental has changed. People might or might not update their views depending on their biases.

miki123211 4 hours ago | parent | prev | next [-]

What's even worse, different humans have different weights.

If you train two different LLMs and replace what data they "see" in batch n, that doesn't affect the data they see in batch n+1, or any further batches. In LLMs, you can introduce "noise" into the training process, but that noise doesn't really compound.

Humans learn from experience, not from data, and their experiences at age n shape what experiences they seek (and hence train on) at age n+1. A small amount of "noise" injected into their "training", let's say hearing a group of friends discuss a movie while their identical twin goes to the bathroom, can compound into them watching that movie, which can compound into them forming an identity around that genre, and so on, until they're two completely different people, trained on completely different "data mixtures".

chrisjj 2 hours ago | parent [-]

> What's even worse, different humans have different weights.

Far worse would be different humans having the same weights.

smusamashah 4 hours ago | parent | prev | next [-]

On the other hand, we expect computers to be consistent. A calculator will always give you the same answer unless some chip gets struck by a particle. LLMs run on computers and should be fairly consistent too.

vidarh 2 hours ago | parent [-]

And this lies at the heart of the problem.

We expect computers to be consistent despite running programs that are not designed to be consistent.

This despite the fact that we have lots of experience of programs running on computers that produce wildly inconsistent outputs.

But for some reason some people choose to assume LLMs should act like a calculator instead of any of those programs.
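To make that concrete, here is a minimal sketch of temperature sampling, one common way a program ends up non-repeatable by design. The token names, the logit values and the helper names (`sample_token`, `run`) are all made up for illustration and are not taken from any real model or library.

```python
import math
import random

# Hypothetical logits for three candidate next tokens (illustrative values only).
LOGITS = {"yes": 2.0, "no": 1.5, "maybe": 1.0}


def sample_token(logits, temperature, rng):
    """Pick one token from a dict of token -> logit."""
    if temperature == 0:
        # Greedy decoding: always the highest logit, so fully repeatable.
        return max(logits, key=logits.get)
    # Softmax with temperature; subtract the max for numerical stability.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    top = max(scaled.values())
    weights = [math.exp(value - top) for value in scaled.values()]
    return rng.choices(list(scaled.keys()), weights=weights, k=1)[0]


def run(seed, temperature, n=10):
    """Draw n tokens using a random generator seeded with `seed`."""
    rng = random.Random(seed)
    return [sample_token(LOGITS, temperature, rng) for _ in range(n)]


if __name__ == "__main__":
    print(run(seed=1, temperature=1.0))  # same seed: same sequence every time
    print(run(seed=2, temperature=1.0))  # different seed: may differ
    print(run(seed=3, temperature=0))    # temperature 0: always "yes"
```

With a fixed seed this is as repeatable as a calculator. The variation comes from the sampling step being seeded differently on each call, which is a design choice rather than a hardware fault.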

chrisjj 2 hours ago | parent [-]

> This despite the fact that we have lots of experience of programs running on computers that produce wildly inconsistent outputs.

The average user has very little. A word processor with inconsistent pagination or a spreadsheet with inconsistent totals is rightly seen as faulty.

newswasboring 31 minutes ago | parent [-]

Yeah, but daily tools have lots of complexity which appears as non-determinism (if we are thinking only about UX, not actual determinism). For example, try moving an image in a Word doc. I have been using MS Word my entire life, it seems, and still don't know what the rules are lol.

mnky9800n 4 hours ago | parent | prev | next [-]

Test retest reliability is a thing in psychometrics.
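It is commonly estimated as the correlation between scores from the same people on two administrations of the same test. A rough sketch, with made-up scores purely for illustration (the `pearson` helper and both score lists are hypothetical):

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Invented scores for five people tested twice; not real data.
    first_sitting = [12, 15, 9, 20, 17]
    second_sitting = [13, 14, 10, 19, 18]
    print(round(pearson(first_sitting, second_sitting), 3))
```

A value near 1 means people keep roughly the same ranking across sittings; a value near 0 means the second sitting tells you little about the first.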

spwa4 2 hours ago | parent [-]

Ah cool. So there is data? How consistent are humans?

What I'd really love is an actual number for a "human hallucination rate". How often will a random human

1) claim something that is wrong

2) defend the wrong claim and/or logic even when the problem is pointed out to them

(And this of course outside of the usual topics. In politics? I don't care. In religion? Don't care (well, maybe a bit more than politics). Let's say in physics or everyday logic or something like that.)

cyanydeez an hour ago | parent | prev [-]

A studied example is sampling judicial decisions before lunch and after lunch. Judges are more lenient on a full stomach.

WhrRTheBaboons 44 minutes ago | parent [-]

How did they account for sampling bias? A judge might leave easier cases for after lunch. People with control over their schedules usually ease themselves back into work after breaks.

chuckadams 24 minutes ago | parent [-]

The studies observed the results of decisions from the exact same charges. Judges don't get to pick their dockets.