Remix clone Hacker News

	▲	sander1095 5 hours ago
		I don't think I understand enough of your comment to see why this is important. I hope you can explain a few things:
		- Why is Qwen's default "quantization" setup "bad"?
		- Who is Unsloth?
		- Why is his format better? What gains does a better format give? What are the downsides of a bad format?
		- What is quantization?
		Granted, I can look this up myself, but I thought I'd ask for the full picture for other readers.
	▲	danielhanchen 5 hours ago
		Oh hey - we're actually the 4th largest distributor of OSS AI models by GB downloaded - see https://huggingface.co/unsloth. https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs might also be helpful. You might have heard of the 1-bit dynamic DeepSeek quants (we did that) - not all layers can be 1-bit - the important ones are kept in 8-bit or 16-bit, and we show it still works well.
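To make the "not all layers can be 1-bit" point concrete, here is a rough sketch of how mixing bit widths per layer affects total size. The layer names, parameter counts, and bit assignments are made up for illustration and are not Unsloth's actual recipe; the only idea taken from the comment is that most weights go very low while a few important layers stay at 8-bit or 16-bit.

```python
# Hypothetical layers: (name, parameter count, bits per weight).
# None of these numbers come from a real model or from Unsloth.
LAYERS = [
    ("embed", 50_000_000, 8),
    ("attention", 200_000_000, 8),
    ("expert_ffn", 2_000_000_000, 1),   # the bulk of the weights go to 1-bit
    ("output_head", 50_000_000, 16),    # a sensitive layer kept at 16-bit
]


def size_gb(layers):
    """Total storage in GB (1e9 bytes), ignoring scales and metadata."""
    total_bits = sum(params * bits for _, params, bits in layers)
    return total_bits / 8 / 1e9


def avg_bits_per_weight(layers):
    """Average bit width across all parameters."""
    total_bits = sum(params * bits for _, params, bits in layers)
    total_params = sum(params for _, params, _ in layers)
    return total_bits / total_params


# Same layers with every weight stored at 16 bits, for comparison.
uniform_16 = [(name, params, 16) for name, params, _ in LAYERS]

if __name__ == "__main__":
    print(f"mixed:   {size_gb(LAYERS):.2f} GB, "
          f"{avg_bits_per_weight(LAYERS):.2f} bits/weight")
    print(f"16-bit:  {size_gb(uniform_16):.2f} GB")
```

The point of the sketch is only the arithmetic: because the low-bit layers hold most of the parameters, the average bits per weight stays close to the low end even though a few layers are kept at higher precision.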
	▲	dist-epoch 5 hours ago
		The default Qwen "quantization" is not "bad", it's "large". Unsloth releases lower-quality versions of the model (Qwen in this case). Think about taking a 95% quality JPEG and converting it to a 40% quality JPEG. Models are quantized to lower quality/size so they can run on cheaper/consumer GPUs.
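For anyone who wants the JPEG analogy in code, a minimal sketch of quantization follows. It uses plain symmetric uniform rounding on a handful of made-up weights, which is much simpler than what real GGUF quant formats do; the function names and sample values are assumptions for illustration only.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization to signed integer codes.

    Requires bits >= 2 so that there is at least one non-zero level.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def mean_abs_error(original, restored):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


# Made-up weights, not taken from any real model.
WEIGHTS = [0.82, -0.41, 0.07, -0.96, 0.33, 0.58, -0.12, 0.49]

if __name__ == "__main__":
    for bits in (8, 4, 2):
        codes, scale = quantize(WEIGHTS, bits)
        err = mean_abs_error(WEIGHTS, dequantize(codes, scale))
        print(f"{bits}-bit: {len(WEIGHTS) * bits} bits total, "
              f"mean abs error {err:.4f}")
```

Fewer bits means less storage per weight and a larger rounding error, which is the same size versus quality trade-off as lowering JPEG quality.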
	▲	est 5 hours ago
		Hey, you can do a bit of research yourself and share your results with us!