Remix clone Hacker News

	NitpickLawyer 7 hours ago
		> I wonder why the 14b model's TriviaQA scores lag so far behind Gemma 12b's; that one is not a formatting-heavy benchmark.

		My guess is the vast scale of Google's data. They've been hoovering up data for decades now, and have had curation pipelines (guided by real human interactions) since forever.