Remix clone Hacker News

	▲	adrian_b 2 hours ago
		For each of the 4 gemma-4--it models, an associated small model, gemma-4--it-assistant, has been published for use with MTP. A GGUF file generated for MTP must include both the big model and the small model. Another comment referenced a llama.cpp PR that also updates the Python program used to convert from safetensors files; presumably the updated program can combine the two paired Gemma 4 models.