kibibu 7 hours ago
According to this very article, 4-bit dynamic is essentially lossless
Aurornis 5 hours ago
Watch out. Those claims are often based on KL-divergence measured over some arbitrary corpus, not on real-world performance or benchmarks. I’ve found that I need to go a couple of steps past whatever quantization looks good enough in KL-divergence testing to get good performance in real tasks with long context. So when Q4 is claimed to be lossless, I end up with Q5 or Q6 for actual long-context tasks.
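
For anyone unfamiliar with the metric, here is a rough sketch of what a KL-divergence check like this usually boils down to. It assumes you already have per-token logits from the full-precision and quantized models over the same text. The function names and the toy logits are made up for illustration and are not taken from the article or any particular tool.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) in nats; eps guards against log(0) and division by zero.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mean_token_kl(ref_logits, quant_logits):
    # Average KL over token positions: reference model vs. quantized model.
    kls = [
        kl_divergence(softmax(r), softmax(q))
        for r, q in zip(ref_logits, quant_logits)
    ]
    return sum(kls) / len(kls)

# Hypothetical logits for two token positions over a 3-word vocabulary.
ref = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
quant = [[1.8, 0.7, -1.0], [0.0, 0.3, 2.9]]
score = mean_token_kl(ref, quant)
```

A low average here only says that the next-token distributions match on whatever corpus was chosen, which is the concern above: it need not reflect behaviour on long-context tasks.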