Nevermark a day ago
Models are not trained to self-evaluate, or only as an afterthought during tuning, so they are poor at it. It isn't mysterious.

Humans are trained incrementally, educated informally and formally, and along the way tested by context and classroom, evaluated by people in their circle and by strangers. The training to evaluate ourselves is near constant. Even then, many people habitually believe they understand things they clearly don't, and can even be hostile to feedback. Many forms of hallucination are canonized or socially encouraged at varying demographic scales, while others are idiosyncratic.

Once self-examination is a first-class part of models' training process, I expect they will respond as they have to being trained on vast troves of information: by far excelling human beings. But until then, their poor self-assessment isn't mysterious; it is the mundane result of not being trained to do it, or only as a tuning afterthought.

Not as different from humans as we might like to think. And models are noticeably improving on this measure. Humans, in my experience, may be regressing.