ACCount37 5 days ago

Humans can't "inspect their own weights and examine the contents" either.

No human has ever managed to read out their own connectome without external instrumentation. Entire human civilizations believed the seat of consciousness was the heart - which, for creatures that claim to know how their own minds work, is a baffling error to make.

LLMs are quite similar to humans in that respect. They, too, have no idea what their hidden size is, how many weights they have, how exactly the extra modalities are integrated into them, or whether they're MoE or dense. They're incredibly ignorant of their own neural architecture. And if you press them on it, they'll guess, and they'll often guess wrong.

The difference between humans and LLMs comes down to the training data. Humans learn continuously - they remember what they've seen and what they haven't, they try things, they remember the outcomes, and they build up something of a grasp (and no, it's not anything more than "something of a grasp") of how solid or shaky their capabilities are. LLMs split training and inference into separate phases, and their trial-and-error doesn't extend beyond a context window. So by default, LLMs don't get much of that awareness of their own capabilities.

So the obvious answer is to train that awareness in. Easier said than done. You need, essentially, a training system that evaluates an LLM's knowledge systematically, and then wires awareness of the discovered limits back into the LLM.
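Very roughly, and with every name here hypothetical (there's no standard API for this), a sketch of that loop might look like:

    # Hedged sketch of "training awareness in": probe the model on a
    # question bank, find where it's reliably wrong, then build
    # fine-tuning targets that make it hedge on exactly those topics.
    # model.generate, grade_answer, finetune, and the question objects
    # are all hypothetical stand-ins, not a real library.

    def build_calibration_set(model, questions, n_samples=8):
        finetune_examples = []
        for q in questions:
            # Sample several answers; unstable or wrong answers
            # signal shaky knowledge.
            answers = [model.generate(q.prompt) for _ in range(n_samples)]
            accuracy = sum(grade_answer(a, q.reference) for a in answers) / n_samples
            if accuracy < 0.5:
                # Wire the discovered limit back in: train toward an
                # explicit hedge instead of a confident wrong answer.
                target = "I'm not confident about this, but my best guess is..."
            else:
                target = q.reference
            finetune_examples.append((q.prompt, target))
        return finetune_examples

    # Then: finetune(model, build_calibration_set(model, question_bank))

The hard part is generalization: the fine-tuned model has to hedge on unseen questions in the same weak areas, not just parrot "I'm not confident" on the exact questions you probed.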

OpenAI has a limited-scope version of this in use for GPT-5 right now.

wnoise 5 days ago

No, humans can't inspect their own weights either -- but we're not LLMs, and we don't store all knowledge implicitly as next-token output probabilities. It's pretty clear that we also store some knowledge explicitly, and can recall the context of that knowledge.

(To be sure, there are plenty of cases where it's clear we're only making up stories after the fact about why we said or did something. But sometimes we do actually know, and the reconstruction is accurate.)

blix 5 days ago

I inspect and modify my own weights literally all the time. I just do it on a more abstract level than individual neurons.

I call this process "learning"