Remix clone Hacker News

	▲	NitpickLawyer 3 hours ago
		> at Q4_K_M, stock-style quantization is retaining ~99–99.8% of BF16 accuracy

		I call BS on that. Not even FP8 retains 99.8% in every scenario. It's close, but not quite bit-exact, and claiming you reach 99% with Q4 is a stretch. Maybe that holds if all you test is really old benchmark questions that are in every training set out there, but go a bit OOD and you'll see your Q4 crumble. Try coding in a niche language, or long-context math that isn't in the AIME sets (not 1+2 from the MATH benchmark), and you'll lose a few percentage points of accuracy with each quant step.