sander1095, 5 hours ago:
I don't really understand enough of your comment to know why this is important. I hope you can explain some things to me:

- Why is Qwen's default "quantization" setup "bad"?
- Who is Unsloth?
- Why is his format better? What gains does a better format give? What are the downsides of a bad format?
- What is quantization?

Granted, I could look this up myself, but I thought I'd ask for the full picture for other readers.
danielhanchen, 5 hours ago:
Oh hey - we're actually the 4th largest distributor of OSS AI models by GB downloaded - see https://huggingface.co/unsloth. https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs might be helpful. You might have heard of the 1-bit dynamic DeepSeek quants - we made those. Not all layers can be 1-bit; the important ones stay in 8-bit or 16-bit, and we show the model still works well.
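To make the "dynamic" part concrete, here is a rough sketch of the idea: quantize most layers aggressively but keep sensitive layers at higher precision. The layer names and the selection rule below are illustrative assumptions, not Unsloth's actual logic.

    # Sketch: pick a bit width per layer instead of one global setting.
    # SENSITIVE substrings and the 2-bit/8-bit split are made-up examples.
    SENSITIVE = ("embed", "lm_head", "attn.o_proj")

    def pick_bits(layer_name: str) -> int:
        """Important layers stay near-lossless; the rest get squeezed hard."""
        if any(key in layer_name for key in SENSITIVE):
            return 8
        return 2

    for name in ["model.embed_tokens", "layers.0.mlp.gate_proj", "layers.0.attn.o_proj"]:
        print(name, "->", pick_bits(name), "bit")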
dist-epoch, 5 hours ago:
The default Qwen "quantization" is not "bad", it's "large". Unsloth releases lower-quality versions of the model (Qwen in this case). Think of taking a 95% quality JPEG and converting it to a 40% quality JPEG. Models are quantized to a lower quality/size so they can run on cheaper/consumer GPUs.
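For readers asking "what is quantization?", here is a minimal sketch assuming numpy. The symmetric int8 scheme shown is just the common textbook approach, not Qwen's or Unsloth's exact recipe.

    # Quantization maps float32 weights onto a small set of integer levels
    # (lossy, like lowering JPEG quality), shrinking the model on disk and in VRAM.
    import numpy as np

    def quantize_int8(weights: np.ndarray):
        """Map float32 weights onto 255 signed int8 levels with one shared scale."""
        scale = np.abs(weights).max() / 127.0
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        """Recover approximate float weights at inference time."""
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)  # stand-in for one layer's weights
    q, s = quantize_int8(w)
    print("max reconstruction error:", np.abs(w - dequantize(q, s)).max())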
est, 5 hours ago:
Hey, you could do a bit of research yourself and tell us your results!