qsera 7 hours ago

>Probable given what?

The training data.

>predicting what intelligence would do

No, it just predicts what the next word would be if an intelligent entity translated its thoughts into words, because it is trained on text written by intelligent entities.

If it were trained on text written by someone who loves to rhyme, you would get all rhyming responses.

It imitates the behavior -- in text -- of whatever entity generated the training data. Here the training data was made by intelligent humans, so we get an imitation of the same.

It is a clever party trick that works often enough.

throw310822 7 hours ago | parent | next [-]

> The training data

If the prompt is unique, it is not in the training data. True for basically every prompt. So how is this probability calculated?

cbovis 7 hours ago | parent | next [-]

The prompt is unique but the tokens aren't.

Type "owejdpowejdojweodmwepiodnoiwendoinw welidn owindoiwendo nwoeidnweoind oiwnedoin" into ChatGPT and the response is "The text you sent appears to be random or corrupted and doesn’t form a clear question." because the prompt doesnt correlate to training data.

hmmmmmmmmmmmmmm 6 hours ago | parent [-]

...? what is the response supposed to be here?

qsera 7 hours ago | parent | prev | next [-]

Just using a scaled-up and cleverly tweaked version of linear regression analysis...
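
Concretely, the last step of a language model is a linear map followed by a softmax over the vocabulary, i.e. multinomial logistic regression running on features the rest of the network computes. A minimal sketch (shapes and values purely illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, hidden_dim = 50, 16   # illustrative shapes

    # The output head of a language model: a linear map plus softmax,
    # i.e. multinomial logistic regression over the vocabulary. The deep
    # network "merely" supplies the features this regression runs on.
    W = rng.normal(size=(vocab_size, hidden_dim))  # learned weights
    b = np.zeros(vocab_size)                       # learned biases

    def next_token_distribution(h):
        logits = W @ h + b        # the linear-regression-like part
        logits -= logits.max()    # for numerical stability
        e = np.exp(logits)
        return e / e.sum()        # softmax turns scores into probabilities

    h = rng.normal(size=hidden_dim)  # stand-in for the hidden state
    p = next_token_distribution(h)
    print(p.argmax(), round(p.max(), 3))  # most probable next-token id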

red75prime 2 hours ago | parent [-]

That is, the probability distribution that the network should learn is defined by which probability distribution the network has learned. Brilliant!

hmmmmmmmmmmmmmm 6 hours ago | parent | prev [-]

Hamiltonian paths and previous work by Donald Knuth are more than likely in the training data.

red75prime an hour ago | parent [-]

The specific sequence of tokens that comprises Knuth's problem together with an answer to it is not in the training data. A naive probability distribution based on counting token sequences present in the training data would assign it probability 0. The trained network represents an extremely non-naive approach to estimating the ground-truth distribution (the distribution corresponding to what a human brain might have produced).
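
To make "naive" concrete, here is a sketch of that counting estimator (toy corpus, purely illustrative): it puts probability mass only on sequences seen verbatim, so any novel sequence, like a specific problem statement plus its answer, scores exactly zero.

    from collections import Counter

    training_data = [
        "the cat sat on the mat",
        "the dog sat on the rug",
    ]

    # Naive estimator: count whole sequences seen verbatim.
    counts = Counter(training_data)
    total = sum(counts.values())

    def naive_prob(sentence):
        # Anything not in the training data gets exactly zero.
        return counts[sentence] / total

    print(naive_prob("the cat sat on the mat"))  # 0.5
    print(naive_prob("the dog sat on the mat"))  # 0.0 -- novel, so "impossible"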

empath75 5 hours ago | parent | prev [-]

It is impossible to accurately imitate the action of intelligent beings without being intelligent. To believe otherwise is to believe that intelligence is a vacuous property.

slopinthebag an hour ago | parent | next [-]

An unintelligent device can accurately imitate the action of intelligent beings within a given scope, in the same way an actor can accurately imitate the action of a fictional character in a given scope (the stage or camera) without actually being that character.

If the idea is that something cannot accurately replicate the entirety of intelligence without being intelligent itself, then perhaps. But that isn't really what people talk about with LLMs given their obvious limitations.

qsera 5 hours ago | parent | prev [-]

>It is impossible to accurately imitate the action of intelligent beings without being intelligent.

Wait, what? So a robot that is accurately copying the actions of an intelligent human is intelligent?

UltraSane 2 hours ago | parent | next [-]

How can you distinguish intelligence from a sufficiently accurate imitation of intelligence?

slopinthebag an hour ago | parent [-]

By "sufficiently accurate" do you mean identical? Because if so, it's not an imitation of intelligence at all, and the question is thus nonsensical.

UltraSane an hour ago | parent [-]

"it's not an imitation of intelligence at all"

But that is the key insight: how can you tell when an imitation of intelligence becomes the real thing?

empath75 4 hours ago | parent | prev [-]

That was probably phrased poorly. If a robot can independently and accurately do what an intelligent person would do when placed in a novel situation, then yes, I would say it is intelligent.

If it's basically just being a puppet, then no. You tell me: which is Claude Code more like, a puppet or a person?