Remix clone Hacker News

	▲	plagiarist 3 hours ago
		Are 2- or 3-bit quantizations of a larger model worth running vs. a more modest model at 8- or 16-bit? I don't currently have the VRAM to match my interest in this.
	▲	jncraton 3 hours ago
		2- and 3-bit is where quality typically starts to drop off sharply. MXFP4 or another 4-bit quantization is often the sweet spot.
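
For a rough sense of the VRAM side of that trade-off, here is a back-of-the-envelope sketch. It assumes weight memory is simply parameter count times bits per weight, and it ignores the KV cache, activations, and the per-block scale metadata that real quantization formats add, so actual usage will be somewhat higher. The model sizes and the `weight_gib` helper are hypothetical, picked only for illustration.

```python
def weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GiB.

    Assumption: every parameter is stored at exactly `bits_per_weight` bits,
    with no extra scale or zero-point metadata.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical (label, parameters in billions, bits per weight) combinations.
candidates = [
    ("8B at 16-bit", 8, 16),
    ("8B at 8-bit", 8, 8),
    ("32B at 4-bit", 32, 4),
    ("70B at 3-bit", 70, 3),
    ("70B at 2-bit", 70, 2),
]

for label, params, bits in candidates:
    print(f"{label:>14}: {weight_gib(params, bits):6.1f} GiB")
```

This only covers whether the weights fit. It says nothing about output quality, which is where the 2- and 3-bit drop-off comes in.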