| ▲ | est 6 hours ago | |
I really want to know what M, K, XL, and XS mean in this context and how to choose between them. I searched all of the Unsloth docs and there seems to be no explanation at all. | ||
| ▲ | tredre3 3 hours ago | parent | next [-] | |
Q4_K is a type of quantization. It means that all weights will use at least 4 bits, quantized with the K method. But if you're willing to give more bits to only certain important weights, you preserve a lot more quality for not much more space. The S/M/L/XL suffix tells you how many tensors get to use more bits. The difference between S and M is generally noticeable (on benchmarks). The difference between M and L/XL is less so, let alone in real use (ymmv). Here's an example of the contents of a Q4_K_: | ||
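To put rough numbers on the "more quality for not much more space" trade-off, here is a minimal sketch. It is not the file listing the comment refers to: the tensor group names, weight counts, and bit widths are all made up, and `average_bits` is a hypothetical helper. It also ignores the per-block scale/metadata overhead that real quantized files carry.

```python
# Hypothetical tensor groups: (name, number of weights, nominal bits per weight).
# All sizes and bit widths below are invented for illustration only.

def average_bits(groups):
    """Return the weighted average bits per weight across tensor groups."""
    total_weights = sum(count for _, count, _ in groups)
    total_bits = sum(count * bits for _, count, bits in groups)
    return total_bits / total_weights

# Every tensor kept at 4 bits.
baseline = [("important", 1_000_000, 4), ("everything_else", 9_000_000, 4)]

# Same model, but the 10% of weights assumed to be "important" get 6 bits.
mixed = [("important", 1_000_000, 6), ("everything_else", 9_000_000, 4)]

print(average_bits(baseline))  # 4.0
print(average_bits(mixed))     # 4.2
```

Under these assumptions, giving a tenth of the weights 50% more bits grows the file by only about 5%, which is roughly the idea behind the larger S/M/L/XL variants.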
| ▲ | huydotnet 6 hours ago | parent | prev | next [-] | |
They are different quantization types. You can read more here: https://huggingface.co/docs/hub/gguf#quantization-types | ||
| ▲ | arcanemachiner 32 minutes ago | parent | prev [-] | |
Just start with q4_k_m and figure out the rest later. | ||