Philpax 6 hours ago
We do not understand how these models work. We know what their architectures are and how to create them, but we cannot explain their behaviours at a fundamental level. There is no definitive way for us to answer the question of "how did it produce response X for query Y?" - we're only scratching the surface with mechanistic interpretability.
cflewis 6 hours ago
I would love for this to be more public knowledge. I think the general public (and myself for a long time) believes the AI people know how this stuff works end to end, and so it must be trustworthy. But if we told the public "Look, we know if you put this thing in one end, you'll get something that looks similar to this out the other, but we don't really know what happens in between", I think we'd be able to have a more honest discussion about the relationship between AI, productivity and ongoing employment.
SoftTalker 5 hours ago
Isn't this fundamentally because it's all probabilities and weights? It would be like asking how did a pair of dice produce the response 4:3 on the last roll?
devmor 6 hours ago
That’s not a refutation because this problem is not a logical problem, it is a scale problem. We can’t explain it because we distilled so many inputs into matrices and transformed them over and over again. If we had all the time and computing power in the universe to do so, we could trace through it bit by bit and eventually answer that question. It is correct to say that it is just science and math, the same way we can say that gravity is just science and math even if we have only recently begun to understand how it truly functions.
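To make the "trace through it bit by bit" point concrete, here is a minimal sketch, assuming a toy two-layer network with made-up weights (`W1`, `W2` and `forward_with_trace` are hypothetical names, not taken from any real model or library). It only shows that every intermediate value of a forward pass can be recorded and reproduced; it does not show that such a record explains anything at real scale.

```python
import math

# Hypothetical toy weights, chosen only for illustration.
# A real model has billions of these, which is the scale problem.
W1 = [[0.5, -1.0], [1.5, 0.25]]
W2 = [[1.0, -0.5]]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def forward_with_trace(x):
    """Run the toy network and record every intermediate value."""
    trace = [("input", list(x))]
    h = matvec(W1, x)                 # first linear transform
    trace.append(("W1 @ x", h))
    a = [math.tanh(v) for v in h]     # elementwise nonlinearity
    trace.append(("tanh", a))
    y = matvec(W2, a)                 # second linear transform
    trace.append(("W2 @ a", y))
    return y, trace


output, trace = forward_with_trace([1.0, 2.0])
for step, values in trace:
    print(step, values)
```

Running it twice on the same input gives the same trace: nothing in this forward pass is random. At four steps the trace is readable; the same kind of record for a production-size model is what becomes impractical to inspect by hand.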