	▲	ekianjo 9 hours ago
		Yeah, and often their quants are broken. They had to update their Gemma4 quants like 4 times in the past 2 weeks.
	▲	danielhanchen 9 hours ago
		No, it's not our fault. Regarding our 4 uploads: the first 3 were due to llama.cpp fixing bugs, which was out of our control (we're llama.cpp contributors, but not the main devs). We could have waited, but it's best to update when multiple (10-20) bugs are fixed. The 4th was Google themselves improving the chat template for tool calling for Gemma. Another issue was https://github.com/ggml-org/llama.cpp/issues/21255, where CUDA 13.2 was broken: this was NVIDIA's CUDA compiler itself breaking, fully out of our hands, but we provided a solution for it.