bityard 3 hours ago
Halving the precision of the weights is not a free lunch...

Catloafdev an hour ago
Q8 is virtually lossless. Quantization loss only becomes noticeable around Q4 and below. On consumer hardware, going from FP16 to Q8 gives about 2x the speed at ~99.99% of the quality.
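
To make the Q8 vs. Q4 gap concrete, here is a rough sketch that round-trips some synthetic weights through a simple symmetric block quantizer at 8 and 4 bits and compares the reconstruction error. Everything in it is an assumption for illustration: the helper names (`quantize_roundtrip`, `rms_relative_error`), the block size of 32, the Gaussian weights, and the scheme itself, which is not the exact format any particular runtime uses. It also measures weight error only, not model output quality, so it should not be read as confirming the 99.99% figure.

```python
import random

def quantize_roundtrip(weights, bits, block_size=32):
    """Quantize then dequantize with one scale per block (hypothetical scheme)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    restored = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        amax = max(abs(w) for w in block)
        # One scale per block, chosen so the largest weight maps to qmax.
        scale = amax / qmax if amax > 0 else 1.0
        for w in block:
            q = max(-qmax, min(qmax, round(w / scale)))  # integer code
            restored.append(q * scale)                   # back to float
    return restored

def rms_relative_error(original, restored):
    """Root-mean-square error relative to the RMS of the original weights."""
    num = sum((a - b) ** 2 for a, b in zip(original, restored))
    den = sum(a * a for a in original)
    return (num / den) ** 0.5

# Synthetic stand-in for a weight tensor (assumption: roughly Gaussian).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(4096)]

err_q8 = rms_relative_error(weights, quantize_roundtrip(weights, 8))
err_q4 = rms_relative_error(weights, quantize_roundtrip(weights, 4))
print(f"8-bit relative RMS error: {err_q8:.4f}")
print(f"4-bit relative RMS error: {err_q4:.4f}")
```

With the same block size, the 4-bit grid has 7 positive levels instead of 127, so each rounding step is about 18x coarser, which is where the extra error comes from. Production 4-bit formats use extra tricks to claw some of that back, so the real gap is likely smaller than this naive version suggests.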