kazinator 4 days ago

The problem with your reasoning is that some humans cannot solve the problem even without the irrelevant info about cats.

We can easily cherry pick our humans to fit any hypothesis about humans, because there are dumb humans.

The issue is that AI models which, on the surface, appear to be similar to the smarter quantile of humans in solving certain problems, become confused in ways that humans in that problem-solving class would not be.

That's obviously because the language model is not generally intelligent; it's just retrieving tokens from a high-dimensional statistically fit function. The extra info injects noise into the calculation, which confounds it.

krisoft 4 days ago | parent | next [-]

> We can easily cherry pick our humans to fit any hypothesis about humans, because there are dumb humans.

Nah. You would take a large number of humans, have half of them take the test with distracting statements and half without, and then compare their results statistically. Yes, there would be some dumb ones, but as long as you test enough people they would show up in both samples at roughly the same rate.
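A minimal sketch of that design, as a simulation. Every number here (the share of weak solvers, the success rates, the 5-point distraction penalty) is a made-up assumption for illustration, not data, and the names `simulate` and `two_proportion_z` are hypothetical helpers:

```python
import math
import random

def simulate(n=40_000, weak_share=0.2, seed=0):
    """Randomly assign n hypothetical test takers to two groups.

    All rates below are illustrative assumptions, not measurements.
    """
    rng = random.Random(seed)
    groups = {"plain": [], "distracted": []}
    for _ in range(n):
        weak = rng.random() < weak_share             # a weak solver
        group = rng.choice(["plain", "distracted"])  # random assignment
        p = 0.3 if weak else 0.9                     # assumed success rates
        if group == "distracted":
            p -= 0.05                                # assumed distraction penalty
        groups[group].append((weak, rng.random() < p))
    return groups

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-proportion z statistic using the pooled standard error."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

groups = simulate()
stats = {}
for name, rows in groups.items():
    stats[name] = {
        "n": len(rows),
        "weak_rate": sum(weak for weak, _ in rows) / len(rows),
        "solved": sum(ok for _, ok in rows),
    }

z = two_proportion_z(
    stats["plain"]["solved"], stats["plain"]["n"],
    stats["distracted"]["solved"], stats["distracted"]["n"],
)
print(stats, z)
```

Because assignment is random, the weak solvers land in both groups at about the same rate, so any gap in the solve rate between the groups is attributable to the distraction rather than to who happened to be sampled.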

> become confused in ways that humans in that problem-solving class would not be.

You just state the same thing others are disputing. Do you think it will suddenly become convincing if you write it down a few more times?

Kuinox 4 days ago | parent | prev [-]

That's obviously because the brain is not generally intelligent; it's just retrieving concepts from a high-dimensional statistically fit function. The extra info injects noise into the calculation, which confounds it.

kazinator 4 days ago | parent | next [-]

The problem with your low-effort retort is that, for example, the brain can wield language without having to scan anywhere near hundreds of terabytes of text. People acquire language from vastly fewer examples, and are able to infer/postulate rules, and articulate the rules.

We don't know how.

While there may be activity going on in the brain interpretable as high-dimensional functions mapping inputs to outputs, you are not doing everything with just one fixed function evaluating static weights from a feed-forward network.

If it is like neural nets, it might be something like numerous models of different types, dynamically evolving and interacting.

Kuinox 3 days ago | parent [-]

The problem with your answer is that you make assertions using logical fallacies. Neither of us knows how LLMs and brains work to produce output. Any assertion about that without proof has no basis.

For example, in this response:

> the brain can wield language without having to scan anywhere near hundreds of terabytes of text.

The amount of text we need to train an LLM keeps going down. Even 2 years ago it was shown that you need fewer than a few million words in order to "acquire" English: https://tallinzen.net/media/papers/mueller_linzen_2023_acl.p...

kazinator 3 days ago | parent [-]

Training the weights of the neural network produces a humungous function with a vast number of parameters.

Such a function is not inherently mysterious due to the size alone. For instance, if we fit a billion numeric points to a polynomial curve having a billion coefficients, we would not be mystified as to how the polynomial interpolates between the points.
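A small sketch of the same point at toy scale, using Lagrange interpolation as one illustrative way to fit N points with a degree N-1 polynomial (the three sample points and the helper name `lagrange_interpolate` are assumptions made up for this example):

```python
from fractions import Fraction

def lagrange_interpolate(points, x):
    """Evaluate the unique degree-(N-1) polynomial through N points at x.

    Each term is y_i times a basis polynomial that equals 1 at x_i
    and 0 at every other sample point.
    """
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total

# Three sample points (taken from y = x^2 for illustration).
points = [(0, 0), (1, 1), (2, 4)]

# A value between the sample points, computed exactly.
between = lagrange_interpolate(points, Fraction(3, 2))
print(between)  # 9/4
```

Every step of how the value between the points is produced can be read off the formula; scaling N up changes the cost, not the mechanism.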

Be that as it may, the trained neural network function does have mysterious properties, that is true.

But that doesn't mean we don't know how it works. We invented it and produced it by training.

To say that we completely don't understand it is like saying we don't understand thermodynamics because the laws of thermodynamics don't allow us to predict the path of a particle of gas, and so we must remain mystified as to how the gas can take on the shape of the container.

Say we train a neural network to recognize digit characters. Of course we know why it produces the answer 3 when given any one of our training images of 3: we iterated on bumping the weights until it did that. When we give it an image of 3 not in our training set and it produces some answer (either correctly 3 or something disappointing), we are less sure. We don't know the exact properties of the multi-dimensional function that encode the "threeness" of the image.

Sure; so what? It's a heck of a lot more than we know about how a person recognizes a 3, where we had no design input, and don't even know the complete details of the architecture. We don't have a complete model of just one neuron, whereas we do have a complete model of a floating-point number.

Gas in a container is a kind of brain which figures out how to mimic the shape of the container using a function of a vast number of parameters governing the motion of particles. Should we be mystified and declare that we don't understand the thermodynamic laws we came up with because they don't track the path taken by a particle of gas, and don't explain how every particle "knows" where it is supposed to be so that the gas takes on the shape of the cylinder, and has equal pressure everywhere?

Kuinox 2 days ago | parent [-]

> we would not be mystified as to how the polynomial interpolates between the points.

We would not be surprised, but we wouldn't know how the model solves the problem. We wouldn't know if it is approximating, calculating the correct value, or memorising results. We would only know how it was built. We would be mystified as to how it solved the problem.

> But that doesn't mean we don't know how it works. We invented it and produced it by training.

Having invented it does not mean we know how it works. The fallacy in your reasoning is thinking that emergent behavior or properties can be trivially explained by knowing the building blocks.

const_cast 4 days ago | parent | prev [-]

Yes, how... obvious?

I don't know, do we even know how the brain works? Like, definitively? Because I'm pretty sure we don't.

Kuinox 3 days ago | parent [-]

Yeah, we don't. That's one of the points of my reply: we don't know how LLMs work either.
