zozbot234 5 days ago

KV quantization has long been available in llama.cpp

chr15m 4 days ago

Yes, but the optimisation described has not, right?