cphoover 2 days ago
5-10% accuracy is the difference between a usable model and an unusable one.
samwho 2 days ago
It definitely could be, but in the time I spent talking to the 4-bit models compared with the 16-bit original, they still seemed surprisingly capable. I do recommend benchmarking quantized models on the specific tasks you care about.
djsjajah a day ago
Yes, but the difference between a model and one 4x its size is usually a lot more than that. The question is not whether to run Qwen 8B at bf16 or a quantized version of it; it's whether to run Qwen 8B at full precision or a quantized Qwen 27B. You will usually find you're better off with the larger model.
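The memory arithmetic behind this trade-off can be sketched roughly as follows (a back-of-the-envelope estimate of weight storage only; KV cache, activations, and runtime overhead are ignored, and the parameter counts are taken from the comment, not verified model specs):

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight-only memory footprint in gigabytes.

    Ignores KV cache, activations, and framework overhead, so real
    usage will be higher; this is only for comparing model choices.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

# An 8B model at bf16 (16 bits/weight) vs a ~4x larger 27B model at 4-bit:
small_bf16 = weight_memory_gb(8, 16)   # 16.0 GB
large_int4 = weight_memory_gb(27, 4)   # 13.5 GB
print(small_bf16, large_int4)
```

So under this rough estimate the quantized larger model actually fits in less memory than the small full-precision one, which is why the comparison tends to favor it.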
amelius 2 days ago
Yes, I was wondering why they mentioned those numbers without mentioning their practical significance.