wavemode 5 days ago
No, LLMs don't have that knowledge. They can't inspect their own weights and examine their contents. It's a fundamental limitation of the technology.

The sort of training you're talking about is content like, "ChatGPT was trained on research papers in the area of biology. It possesses knowledge of A, B, and C. It does not possess knowledge of X, Y, and Z." But this merely recreates the same problem one level up: given a question, how does the LLM -know- that its training data contains information about whether or not its training data contains information about the answer? The reality is that it doesn't know; you just have to assume it didn't hallucinate that, too.

The problem of being unaware of these things is not theoretical. Anyone with deep knowledge of a subject will tell you that as soon as you go beyond the surface level of a topic, LLMs begin to spout nonsense. I'm only a software engineer, but even I regularly see the phenomenon of getting good answers to basic questions about a technology, and then, beyond that, completely made-up features and function names.

> "Fully aware of all the limits of its knowledge" is unattainable for humans too

This just isn't true. Humans know whether they know things, and whether they know how they know them, and whether they know how they know how they know them, and so on. Knowledge itself can contain errors, but that's not what I'm talking about. I'm not talking about never being wrong. I'm merely talking about having access to the contents of one's own mind. (Humans can also dynamically update specific contents of their own mind, but that's not what I'm talking about right now either.)

An LLM's hallucination is not just knowledge that turned out to be wrong; it is knowledge that never existed to begin with, and the LLM has no way of telling the difference.
ACCount37 5 days ago | parent
Humans can't "inspect their own weights and examine the contents" either. No human has ever managed to read out his connectome without external instrumentation. There were entire human civilizations that thought that the seat of consciousness was the heart - which, for creatures that claim to know how their own minds work, is a baffling error to make. LLMs are quite similar in that to humans. They, too, have no idea what their hidden size is, or how many weights they have, or how exactly are the extra modalities integrated into them, or whether they're MoE or dense. They're incredibly ignorant of their own neural architecture. And if you press them on it, they'll guess, and they'll often be wrong. The difference between humans and LLMs comes down to the training data. Humans learn continuously - they remember what they've seen and what they haven't, they try things, they remember the outcomes, and get something of a grasp (and no, it's not anything more than "something of a grasp") of how solid or shaky their capabilities are. LLMs split training and inference in two, and their trial-and-error doesn't extend beyond a context window. So LLMs don't get much of that "awareness of their own capabilities" by default. So the obvious answer is to train that awareness in. Easier said than done. You need to, essentially, use a training system to evaluate an LLM's knowledge systematically, and then wire the awareness of the discovered limits back into the LLM. OpenAI has a limited-scope version of this in use for GPT-5 right now. | ||||||||||||||
| ||||||||||||||
utyop22 5 days ago | parent
"The problem of being unaware of these things is not theoretical - anyone with deep knowledge of a subject will tell you that as soon as you go beyond the surface level of a topic, LLMs begin to spout nonsense" I've tested this in a wide range of topics across corporate finance, valuation, economics and so on and yes once you go one or two levels deep it starts spouting total nonsense. If you ask it to define terms succintly and simply it cannot. Why? Because the data that been fed into the model is from people who cannot do it themselves lol. The experts, will remain experts. Most people I would argue have surface level knowledge so they are easily impressed and don't get it because A) they don't go deep B) They don't know what it means to go thoroughly deep in a subject area. |