ACCount37 5 days ago
No, that's a misconception. It's not nearly that simple. There are questions where there's a palpable split in probability between the answers, and the logit distribution immediately exposes the underlying lack of confidence. But there are also questions that cause an LLM to produce consistent-but-wrong answers. For example, the question may be internally associated with another question that is similar but not the same, and that association alone can be enough for the model to put 93% on B, despite B being the wrong answer. An LLM might even have some latent awareness of its own uncertainty in this case, but it has, for some reason, decided to proceed with a "best guess" answer, which in this case was wrong.
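To make the "split in probability" part concrete, here's a minimal sketch of reading the next-token distribution over the answer options. It assumes a HuggingFace causal LM and an A/B multiple-choice framing; the model name and prompt are placeholders, not anything from the thread above. The point is that you can observe P(A) vs P(B) directly, but a confident 93% on B still tells you nothing about whether B is actually right.

```python
# Sketch: inspect the model's next-token probabilities for answer tokens.
# Model name and prompt are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; swap in whatever model you're probing
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Question: <some multiple-choice question>\nAnswer (A or B):"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # logits for the next token

probs = torch.softmax(logits, dim=-1)
for choice in ("A", "B"):
    # Leading space because GPT-2-style BPE tokenizes " A" / " B" as single tokens.
    token_id = tokenizer.encode(" " + choice)[0]
    print(f"P({choice}) = {probs[token_id].item():.3f}")
```

A near 50/50 split here is the easy, visibly-uncertain case; the consistent-but-wrong case shows up as a lopsided distribution that looks exactly like genuine confidence.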