zozbot234 5 hours ago
> Cloud hardware can run the original model. Quantization will reduce quality.

New models are often released in quantized format to begin with. This is true of both Kimi and the new DeepSeek V4 series. There is no "original model": the model is generated using Quantization Aware Training (QAT).
Aurornis 5 hours ago | parent
> There is no "original model", the model is generated using Quantization Aware Training (QAT).

The original model is the model used for the benchmarks. People will say "You can run it locally!" and then show the benchmarks of the original model, but what they really mean is that you can run a heavily quantized adaptation of the model, which has different performance characteristics.
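(Not from the thread, just to make the quality-loss point concrete: a minimal sketch of naive post-training symmetric int8 round-trip quantization. The function names and example weights are made up for illustration; real quantization schemes, QAT included, are considerably more sophisticated, but the rounding error shown here is the basic mechanism by which a quantized model's outputs drift from the original's.)

```python
def quantize_int8(weights):
    """Map floats to int8 codes with a single per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the int8 codes."""
    return [x * scale for x in q]

# Hypothetical weight values, chosen to span a typical range
weights = [0.0312, -0.918, 0.447, 0.0051]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each weight is off by up to scale / 2 after the round trip;
# across billions of weights, these errors compound into
# measurably different model behavior.
errors = [abs(a - b) for a, b in zip(weights, restored)]
print(max(errors), scale / 2)
```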