walrus01 3 hours ago
People thinking of self-hosting Kimi K2.6 had better be prepared for how big it is. The Q8_K_XL quantization, for instance, is around 600GB on disk; I'd estimate roughly 700GB of VRAM to run it. Quantizations below Q8 are probably worthless for quality. The full-precision GGUF is 2.05TB on disk. https://huggingface.co/unsloth/Kimi-K2.6-GGUF

If you can afford hardware that runs Kimi K2.6 at any decent speed for more than one simultaneous user, you probably already have a whole team on staff who are familiar with benchmarking it against Claude, GPT-5.5, etc.
zozbot234 3 hours ago | parent
Kimi is a natively quantized model; the lossless full-precision release is 595GB. Your own link mentions that.
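The size figures in the thread can be sanity-checked with back-of-envelope arithmetic. A quick sketch, assuming Kimi K2.6 has on the order of 1 trillion total parameters like the earlier Kimi K2 (that parameter count is an assumption, not stated in the thread):

```python
# Back-of-envelope: on-disk bits per parameter implied by a checkpoint size.
# N_PARAMS is an assumed figure (~1T total parameters, as in Kimi K2),
# not something confirmed by the thread above.

N_PARAMS = 1.0e12

def bits_per_param(size_gb: float, n_params: float = N_PARAMS) -> float:
    """Convert a checkpoint size in gigabytes to bits stored per parameter."""
    return size_gb * 1e9 * 8 / n_params

full = bits_per_param(2050)  # "full precision" GGUF, 2.05 TB on disk
q8   = bits_per_param(600)   # Q8_K_XL quantization, ~600 GB
rel  = bits_per_param(595)   # the 595 GB lossless native release

print(f"2.05 TB -> {full:.1f} bits/param")  # ~16.4, consistent with BF16
print(f"600 GB  -> {q8:.1f} bits/param")    # ~4.8
print(f"595 GB  -> {rel:.1f} bits/param")   # ~4.8
```

Under that assumed parameter count, the 2.05TB GGUF works out to ~16 bits/weight (BF16), while both the ~600GB quant and the 595GB native release land near 4.8 bits/weight, which is what you'd expect from a model trained and released in roughly INT4 precision with some layers kept at higher precision. That is the crux of zozbot234's point: for a natively quantized model, re-quantizing the BF16 upcast to Q8 buys you nothing over the original release.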