barbazoo 5 days ago

> Confidence calibration: When your agent says it's 60% confident, it should be right about 60% of the time. Not 90%, not 30%. Actual 60%.

With current technology (LLM), how can an agent ever be sure about its confidence?

fumeux_fume 5 days ago | parent | next [-]

The author's inner PM comes out here and makes some wild claims. Calibration is something we can do with traditional classification models, but not with most off-the-shelf LLMs. Even if you devised a way to check whether the LLM's confidence claims matched its actual performance, you wouldn't be able to calibrate or tune it the way you would a more traditional model.
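For anyone unfamiliar with how this is measured for traditional classifiers: a common metric is expected calibration error (ECE), which bins predictions by stated confidence and compares each bin's average confidence to its empirical accuracy. A minimal sketch, assuming you've already collected (confidence, correct) pairs from an eval run (the function name and data are illustrative, not from any particular library):

```python
# Sketch: expected calibration error (ECE) over (confidence, correct) pairs.
# Bins predictions by stated confidence, then compares average confidence
# to empirical accuracy within each bin.

def expected_calibration_error(preds, n_bins=10):
    """preds: list of (confidence in [0, 1], correct as bool)."""
    bins = [[] for _ in range(n_bins)]
    for conf, correct in preds:
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    total = len(preds)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(b) / total) * abs(avg_conf - accuracy)
    return ece

# Perfectly calibrated at 60%: six correct out of ten, all at 0.6 confidence.
sample = [(0.6, True)] * 6 + [(0.6, False)] * 4
print(expected_calibration_error(sample))  # ~0.0: says 60%, right 60% of the time
```

The catch the parent is pointing at: for a logistic regression you can fix a calibration gap with e.g. Platt scaling or isotonic regression on held-out data; with a closed, off-the-shelf LLM there's no comparable knob to turn even after you've measured the gap.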

esafak 5 days ago | parent | prev [-]

I was about to say "Using calibrated models", then I found this interesting paper:

Calibrated Language Models Must Hallucinate

https://arxiv.org/abs/2311.14648

https://www.youtube.com/watch?v=cnoOjE_Xj5g