mgambati 7 hours ago

With 2-bit you wouldn't get good results. The ideal range for coding is at least Q8.

kibibu 7 hours ago | parent [-]

According to this very article, 4-bit dynamic quantization is essentially lossless.

Aurornis 5 hours ago | parent [-]

Watch out. Those claims are often made based on KL-divergence over some arbitrary corpus, not performance in the real world or benchmarks.

I’ve found that I need to go a couple steps past whatever quantizations are good enough in the KL-divergence testing to get good performance in real tasks with long context. So when Q4 is claimed to be lossless I end up with Q5 or Q6 for actual long-context tasks.
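
For anyone unfamiliar with the metric: the KL-divergence check usually compares the next-token distribution of the full-precision model against the quantized one, averaged over every token position in some test corpus. Here is a minimal sketch of that calculation, assuming you have already dumped per-token logits from both models as plain lists. The function names are made up for illustration and are not from any particular tool.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) = sum_i p_i * log(p_i / q_i); eps guards against log(0).
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mean_token_kl(ref_logits, quant_logits):
    # ref_logits / quant_logits: one logit vector per token position,
    # from the full-precision and quantized model respectively (assumed inputs).
    kls = [
        kl_divergence(softmax(r), softmax(q))
        for r, q in zip(ref_logits, quant_logits)
    ]
    return sum(kls) / len(kls)
```

A low average here only says the two distributions agree on that corpus. It says nothing about how small per-token differences compound over a long context, which is why it can look "lossless" and still fall short on real tasks.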