chr15m 5 days ago

Is this something that will show up in Ollama any time soon to increase context size of local models?

zozbot234 5 days ago | parent [-]

KV quantization has long been available in llama.cpp.
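For context, a hedged sketch of how KV-cache quantization is commonly enabled in llama.cpp: the `--cache-type-k`/`--cache-type-v` flags select the cache precision (exact flag names and supported types may vary by version; quantizing the V cache generally requires flash attention via `-fa`). The model path is a placeholder.

```shell
# Sketch: run llama.cpp's server with a quantized KV cache to fit a
# longer context in the same VRAM. Flags per recent llama.cpp builds;
# check `llama-server --help` on your version.
llama-server \
  -m ./model.gguf \        # placeholder model path
  -c 32768 \               # requested context length
  -fa \                    # flash attention (needed for V-cache quant)
  --cache-type-k q8_0 \    # quantize K cache to 8-bit
  --cache-type-v q8_0      # quantize V cache to 8-bit
```

Roughly halving the KV cache footprint versus f16 lets the same hardware hold about twice the context, which is the practical effect being discussed here.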

chr15m 4 days ago | parent [-]

Yes, but the optimisation described here hasn't been, right?