ekianjo 9 hours ago

yeah and often their quants are broken. They had to update their Gemma4 quants like 4 times in the past 2 weeks.

danielhanchen 9 hours ago | parent [-]

No it's not our fault - re our 4 uploads - the first 3 are due to llama.cpp fixing bugs - this was out of our control (we're llama.cpp contributors, but not the main devs) - we could have waited, but it's best to update when multiple (10-20) bugs are fixed.

The 4th is Google themselves improving the chat template for tool calling for Gemma.

https://github.com/ggml-org/llama.cpp/issues/21255 was another issue, where CUDA 13.2 was broken - this was NVIDIA's CUDA compiler itself breaking - fully out of our hands - but we provided a solution for it.