barbazoo 5 days ago
> Confidence calibration: When your agent says it's 60% confident, it should be right about 60% of the time. Not 90%, not 30%. Actual 60%. With current technology (LLMs), how can an agent ever be sure about its confidence?
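For concreteness, here's what checking that claim looks like: bucket the agent's stated confidences and compare each bucket's average confidence to its empirical accuracy. A minimal Python sketch, assuming you've logged (confidence, was_correct) pairs; the expected calibration error (ECE) summarizes the gap:

    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        """Bin predictions by stated confidence and compare each bin's
        mean confidence to its empirical accuracy.
        confidences: floats in [0, 1]; correct: 0/1 outcomes."""
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (confidences > lo) & (confidences <= hi)
            if not mask.any():
                continue
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += (mask.sum() / len(confidences)) * gap
        return ece

    # A well-calibrated agent: ~60% of its "60% confident" answers are right.
    rng = np.random.default_rng(0)
    conf = rng.uniform(0.5, 1.0, 10_000)
    outcomes = rng.random(10_000) < conf           # accuracy tracks confidence
    print(expected_calibration_error(conf, outcomes))  # near 0 for this toy case

The hard part with an LLM agent isn't this measurement step, it's getting stated confidences that the measurement won't immediately falsify.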
fumeux_fume 5 days ago
The author's inner PM comes out here and makes some wild claims. Calibration is something we can do with traditional classification models, but not with most off-the-shelf LLMs. Even if you devised a way to measure whether the LLM's confidence claims matched its actual performance, you wouldn't be able to calibrate or tune it the way you would a more traditional model.
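For contrast, this is what that kind of post-hoc calibration looks like for a traditional classifier — a sketch using scikit-learn's CalibratedClassifierCV (Platt scaling or isotonic regression), which depends on held-out labels and raw probability access that an off-the-shelf LLM API typically doesn't give you:

    import numpy as np
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Fit a base model, then learn a mapping from its scores to calibrated
    # probabilities on held-out folds (isotonic regression here).
    base = LogisticRegression(max_iter=1000)
    calibrated = CalibratedClassifierCV(base, method="isotonic", cv=5)
    calibrated.fit(X_train, y_train)

    probs = calibrated.predict_proba(X_test)[:, 1]
    # Reliability check: among test points given ~0.6 probability,
    # the positive rate should come out near 0.6.
    mask = (probs > 0.55) & (probs < 0.65)
    print(probs[mask].mean(), y_test[mask].mean())

With an LLM you usually only have sampled text, not a score you can remap like this, which is the tuning gap being pointed at above.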
esafak 5 days ago
I was about to say "using calibrated models", but then I found this interesting paper: Calibrated Language Models Must Hallucinate