We do not understand how these models work. We know what their architectures are and how to create them, but we cannot explain their behaviours at a fundamental level. There is no definitive way for us to answer the question "how did it produce response X for query Y?" - we're only scratching the surface with mechanistic interpretability.

cflewis 6 hours ago

I would love for this to be more public knowledge. I think the general public (myself included, for a long time) believes the AI people know how this stuff works end to end, and so it must be trustworthy. But if we told the public "Look, we know if you put this thing in one end, you'll get something that looks similar to this out the other, but we don't really know what happens in between", I think we'd be able to have a more honest discussion about the relationship between AI, productivity and ongoing employment.

SoftTalker 5 hours ago

Isn't this fundamentally because it's all probabilities and weights? It would be like asking how did a pair of dice produce the response 4:3 on the last roll?
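
One way to make the dice comparison concrete is the sketch below. It assumes the usual setup in which a model's weights deterministically produce scores (logits) for each candidate token, and randomness enters only when one token is sampled from the resulting distribution. The `logits` values and both function names are made up for illustration and are not taken from any real model or library.

```python
import math
import random

def softmax(logits):
    """Deterministically turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, seed):
    """The 'dice roll': pick one index according to the probabilities."""
    probs = softmax(logits)
    rng = random.Random(seed)  # fixed seed makes the roll repeatable
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
print(softmax(logits))    # the same distribution on every run
print(sample_token(logits, seed=0) == sample_token(logits, seed=0))  # True
```

Under these assumptions the dice part is easy to explain. The hard question is why the weights assign those particular scores in the first place.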

	umanwizard 5 hours ago
		What does "it's all probabilities and weights" mean? Doesn't that apply to everything in the universe?

devmor 6 hours ago

That’s not a refutation, because this is not a logical problem; it is a scale problem.

We can’t explain it because we distilled so many inputs into matrices and transformed them over and over again. If we had all the time and computing power in the universe to do so, we could trace through it bit by bit and eventually answer that question.
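
As a rough illustration of what tracing "bit by bit" means, here is a minimal sketch of a toy two-layer network that logs every arithmetic step of its forward pass. The weights, the ReLU activation, and the name `traced_forward` are all hypothetical choices for the example, not anything from a real model.

```python
# Hypothetical weights for a toy network: 2 inputs, 2 hidden units, 1 output.
W1 = [[0.5, -1.0], [1.5, 0.25]]
B1 = [0.0, -0.5]
W2 = [1.0, -2.0]
B2 = 0.1

def traced_forward(x):
    """Run the network and record each intermediate value along the way."""
    trace = []
    hidden = []
    for j, (row, b) in enumerate(zip(W1, B1)):
        pre = sum(w * xi for w, xi in zip(row, x)) + b  # weighted sum plus bias
        act = max(0.0, pre)                             # ReLU activation
        trace.append(f"h{j}: pre={pre:.3f} relu={act:.3f}")
        hidden.append(act)
    out = sum(w * h for w, h in zip(W2, hidden)) + B2
    trace.append(f"out: {out:.3f}")
    return out, trace

out, trace = traced_forward([1.0, 2.0])
for step in trace:
    print(step)
```

With three trace lines the whole computation is readable at a glance. A production model performs the same kind of arithmetic across billions of weights per token, so a full trace is possible in principle but far too long to read.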

It is correct to say that it is just science and math, the same way we can say that gravity is just science and math even if we have only recently begun to understand how it truly functions.

	stratos123 5 hours ago
		If you had some time and computing power (not even all that much, in the large scale of things), you could simulate perfectly how a human grows from an embryo to an adult, or how an entire human brain processes some incoming signal, and yet this wouldn't give you the understanding to design a human or human brain from scratch. You call this a "scale problem" as if there's some scalable way such as an algorithm to resolve arbitrary scientific questions and we simply haven't done it, but of course no such algorithm exists, which is why there's plenty of science that's still not settled.
	Philpax 5 hours ago
		It's a refutation of the claim that we know how they work now. In the limit, yes, we are likely to be able to trace the process. It is possible, though, that understanding remains inaccessible because the trace is beyond comprehension. If you can distil the model's reasoning for a decision into a billion yes/no questions, each covering largely-independent areas, can you really say you understand what its overall reasoning was?
	solomonb 5 hours ago
		> If we had all the time and computing power in the universe to do so, we could trace through it bit by bit and eventually answer that question.
		Then we could also solve BB(6), but that doesn't mean we know BB(6) now or ever will.