	numeri 5 days ago
		This isn't right – calibration (informally, the degree to which certainty in the model's logits correlates with its chance of getting an answer correct) is well studied in LLMs of all sizes. LLMs are not (generally) well calibrated.
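		To make the informal definition concrete, here is a rough sketch of one common way to measure calibration, expected calibration error (ECE). The function name, the equal-width binning, and the toy inputs are my own assumptions for illustration, not something taken from a specific paper or library. It assumes you already have one confidence score per answer (e.g. the softmax probability of the chosen answer) and a 0/1 flag for whether the answer was right.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Rough ECE sketch: weighted average gap between confidence and accuracy.

    confidences: per-answer confidence in [0, 1] (assumed input format)
    correct:     per-answer 0/1 flag, 1 if the answer was right
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Equal-width bins over [0, 1]; the interior edges decide bin membership.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(confidences, edges[1:-1])

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # skip empty bins
        accuracy = correct[mask].mean()
        avg_confidence = confidences[mask].mean()
        # Weight each bin's gap by the fraction of answers that fall in it.
        ece += mask.mean() * abs(accuracy - avg_confidence)
    return ece

# Toy data: 75% confident and right 3 times out of 4 (calibrated),
# versus 95% confident and right only 2 times out of 4 (overconfident).
calibrated = expected_calibration_error([0.75] * 4, [1, 1, 1, 0])
overconfident = expected_calibration_error([0.95] * 4, [1, 1, 0, 0])
```

		A well-calibrated model would score near 0 here; the overconfident toy case scores much higher because its stated confidence sits far above its actual hit rate.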