zozbot234 | 3 hours ago
Kimi is a natively quantized model; the lossless full-precision release is 595GB. Your own link mentions that.
CamperBob2 | 3 hours ago
So, realistically, $100K for an 8x RTX 6000 Pro system that can run it at a usable rate.
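A rough back-of-the-envelope sketch of why native quantization is what makes an 8-GPU box plausible here. The figures below (~1T total parameters, 96 GB of VRAM per RTX 6000 Pro) are my assumptions for illustration, not numbers from the thread:

```python
# Back-of-envelope VRAM check for fitting raw model weights.
# All figures are illustrative assumptions, not specs from the thread.
PARAMS = 1.0e12          # assumed ~1T total parameters
GPUS = 8
VRAM_PER_GPU_GB = 96     # assumed per-card capacity for an RTX 6000 Pro

def weights_gb(bits_per_param: float) -> float:
    """Storage for the weights alone, ignoring KV cache and activations."""
    return PARAMS * bits_per_param / 8 / 1e9

total_vram = GPUS * VRAM_PER_GPU_GB  # 768 GB across the box
for label, bits in [("BF16", 16), ("Q8", 8), ("INT4", 4)]:
    gb = weights_gb(bits)
    verdict = "fits" if gb < total_vram else "does not fit"
    print(f"{label:5s}: {gb:6.0f} GB of weights -> {verdict} in {total_vram} GB")
```

Under these assumptions only the INT4 weights (~500 GB) fit in the box's combined VRAM; a Q8 or BF16 copy would not, which is consistent with the ~600GB download discussed below.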
zozbot234 | 3 hours ago
I think people will always disagree on what qualifies as a "usable rate". But keep in mind that practically no one sensible is running the latest Opus or GPT around the clock, especially not at sustainable, unsubsidized prices. With open-weights models, running them around the clock is easy.
walrus01 | 2 hours ago
Also, for anyone working with medical, privacy-related, or otherwise sensitive data, there's almost incalculable value (depending on the industry niche) in having absolutely no external network traffic to servers/systems you don't fully control.
|
|
|
walrus01 | 2 hours ago
The 'unsloth' link above is a third party who has quantized it to Q8; the original release is considerably larger than 600GB: https://huggingface.co/moonshotai/Kimi-K2.6
zozbot234 | 2 hours ago
That page mentions that the model is natively INT4 for most of the params, and 600GB is in the ballpark of what's available there for download.
|