Remix clone Hacker News


	▲	cainxinth 5 days ago
		I find the leaderboard argument a little strange. All their enterprise clients are clamoring for more reliability from them. If they could train a model that conceded ignorance instead of guessing, and thus avoid hallucinations, why aren't they doing that? Because of leaderboard optics?
	▲	ospray 5 days ago
		I think they are trying to communicate that their benchmark scores will go down as they try to tackle hallucinations. Honestly, I am surprised they didn't just say "we think all benchmarks need an incorrect-vs-abstention ratio, so that our cautious, honest model can do well on it." Although they did seem to hint that's what they want.
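
		To make the incorrect-vs-abstention idea concrete, here is a rough sketch of how such a benchmark score might work. The outcome labels, the `summarize` function, the `wrong_penalty` parameter, and the two toy result sets are all made up for illustration; no real benchmark is being quoted here.

```python
from collections import Counter


def summarize(outcomes, wrong_penalty=1.0):
    """Summarize per-question outcomes: 'correct', 'incorrect', or 'abstain'.

    wrong_penalty is a hypothetical knob: how many points a wrong answer
    costs. An abstention scores zero, so guessing only pays off when the
    model is right often enough to offset the penalty.
    """
    counts = Counter(outcomes)
    n = len(outcomes)
    correct = counts["correct"]
    incorrect = counts["incorrect"]
    abstain = counts["abstain"]
    return {
        "accuracy": correct / n,
        "incorrect_rate": incorrect / n,
        "abstention_rate": abstain / n,
        "penalized_score": (correct - wrong_penalty * incorrect) / n,
    }


# Toy data: a model that always guesses vs. one that abstains when unsure.
guesser = ["correct"] * 6 + ["incorrect"] * 4
cautious = ["correct"] * 5 + ["incorrect"] * 1 + ["abstain"] * 4

for name, outcomes in [("guesser", guesser), ("cautious", cautious)]:
    print(name, summarize(outcomes))
```

		On plain accuracy the guesser wins (6 of 10 vs 5 of 10), but once wrong answers cost a point and abstentions cost nothing, the cautious model comes out ahead. That is the leaderboard tension being discussed: which model "wins" depends on whether the metric penalizes wrong answers more than abstentions.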