danielhanchen 6 hours ago
Oh, good idea! UD-Q4_K_XL (Unsloth Dynamic 4-bit Extra Large) is what I generally recommend for most hardware; MXFP4_MOE is also OK.
Keats 5 hours ago
Is there some indication of how the different quantization bit widths affect performance? I.e., I have a 5090 + 96GB, so I want the best possible model, but I don't care about getting 2% better perf if I only get 5 tok/s.