jerf 3 days ago

LLMs by their nature don't really know if they're right or not. It's not a value available to them, so they can't operate with it.

It has been interesting watching the flow of the debate over LLMs. Certainly there were a lot of people who denied what the models were obviously doing. But a pushback developed that simply denies they have any limitations at all. They do have limitations, they work in a very characteristic way, and I do not expect them to be the last word in AI.

And this is one of the limitations. They don't really know if they're right. All they know is whether saying "but this is wrong" is the kind of thing their training data suggests for a situation like this. Even then, it's still just some words that seem to fit.
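
A rough sketch of that point, assuming the Hugging Face transformers library and GPT-2 purely for illustration (the prompt and candidate continuations are arbitrary): the only quantity you can read out of the model is a probability distribution over next tokens, never a truth value for the claim itself.

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    prompt = "The capital of Australia is"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]   # scores for the next token position
    probs = torch.softmax(logits, dim=-1)

    # Compare the model's preference for two continuations. This is relative
    # likelihood under the training distribution, not a judgement of correctness.
    for city in [" Canberra", " Sydney"]:
        first_piece = tokenizer.encode(city)[0]  # first BPE token of each candidate
        print(city.strip(), float(probs[first_piece]))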

This is, if you like and if it helps to think about it, not their "fault". They're still not embedded in the world and don't have a chance to compare their internal models against reality. Perhaps the continued proliferation of MCP servers, and the increased opportunity to compare their output to the real world, will change that in the future. But even so, the limited nature of MCP interactions will still constrain their ability to know when they're wrong.
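
To make the idea concrete, a rough sketch of a validate-before-answer loop of the kind MCP-style tools could enable. Every name here is a hypothetical stand-in, not a real MCP client API; the point is only that the verdict comes from outside the model's weights.

    from dataclasses import dataclass

    @dataclass
    class Observation:
        ok: bool
        detail: str

    def external_check(claim: str) -> Observation:
        # Stand-in for an MCP tool call: a database query, a test run, a web
        # lookup. The verdict comes from outside the model itself.
        return Observation(ok=False, detail="no matching record found")

    def revise(draft: str, feedback: str) -> str:
        # Placeholder for another model call conditioned on the tool feedback.
        return draft

    def answer_with_validation(draft: str, max_rounds: int = 2) -> str:
        for _ in range(max_rounds):
            obs = external_check(draft)
            if obs.ok:
                return draft
            draft = revise(draft, obs.detail)
        return "Could not verify: " + draft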

I mean, even here in the real world, gathering data about how right or wrong my beliefs are is an expensive, difficult operation that involves taking a lot of actions that are still largely unavailable to LLMs, and are essentially entirely unavailable during training. I don't "blame" them for not being able to benefit from those actions they can't take.

whimsicalism 3 days ago | parent | next [-]

there have been latent vectors identified that indicate deception, and suppressing them reduces hallucination. to at least some extent, models do sometimes know they are wrong and say it anyway.
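
For concreteness, a minimal sketch of the kind of intervention being described (often called activation steering): project a candidate "deception direction" out of a hidden state at inference time. The direction here is a random placeholder, not a probe from any particular paper; in practice it would be estimated empirically, e.g. from contrast pairs.

    import torch

    hidden_size = 768
    deception_direction = torch.randn(hidden_size)        # placeholder, not a real probe
    deception_direction = deception_direction / deception_direction.norm()

    def suppress_direction(hidden_state: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
        """Remove the component of hidden_state along the (unit-norm) direction."""
        coeff = hidden_state @ direction                  # scalar projection
        return hidden_state - coeff * direction

    hidden_state = torch.randn(hidden_size)               # stand-in for a layer activation
    steered = suppress_direction(hidden_state, deception_direction)
    print(float(steered @ deception_direction))           # ~0 once the component is removed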

e: and i’m downvoted because..?

danparsonson 2 days ago | parent [-]

Deception requires the deceiver to have a theory of mind; that's an advanced cognitive capability that you're ascribing to these things, which begs for some citation or other evidence.

visarga 3 days ago | parent | prev [-]

> They don't really know if they're right.

Neither do humans who have no way to validate what they are saying. Validation doesn't come from the brain, except maybe in math. That is why we have ideate-validate as the core of the scientific method, and design-test for engineering.

"truth" comes where ability to learn meets ability to act and observe. I use "truth" because I don't believe in Truth. Nobody can put that into imperfect abstractions.

jerf 3 days ago | parent [-]

I think my last paragraph covered the idea that it's hard work for humans to validate as it is, even with tools the LLMs don't have.