Remix clone Hacker News

	▲	kpw94 4 hours ago
		> gemma (unsloth/gemma-4-26B-A4B-it-GGUF) models
		Since you're running quantized (at UD-Q4_K_XL), check out the "qat" models (unsloth/gemma-4-26B-A4B-it-qat-GGUF)!
		- https://huggingface.co/unsloth/gemma-4-26B-A4B-it-qat-GGUF (with "Jun 9 Update: Added MTP support.")
		- https://blog.google/innovation-and-ai/technology/developers-...
	▲	SubiculumCode 28 minutes ago
		How are the QAT models at coding? I've looked for opinions since the release and haven't found much.
	▲	me_bx 2 hours ago
		TIL:
		> Quantization-Aware Training (QAT) [...] allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model
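
For a rough sense of the memory reduction that quote refers to, here is a back-of-the-envelope sketch. It is not taken from the thread or from the model card: the parameter count is simply read off the "26B" in the model name, and the ~4.5 bits per weight for a 4-bit quant is an assumed average, so treat the numbers as illustrative. It also counts weights only, ignoring KV cache and runtime overhead.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: total parameter count taken from the "26B" in the model name.
N_PARAMS = 26e9
# bfloat16 stores each weight in 16 bits.
BF16_BITS = 16.0
# Assumption: a 4-bit quant averages roughly 4.5 bits per weight once
# scales and other metadata are included; the real figure may differ.
Q4_BITS = 4.5

bf16_gib = weight_memory_gib(N_PARAMS, BF16_BITS)
q4_gib = weight_memory_gib(N_PARAMS, Q4_BITS)

print(f"bf16 weights:     {bf16_gib:.1f} GiB")
print(f"~4.5-bit weights: {q4_gib:.1f} GiB")
print(f"reduction:        {bf16_gib / q4_gib:.2f}x")
```

Under those assumptions the quantized weights come out at a bit under a third of the bfloat16 size. Whether quality actually holds up at that size is exactly what QAT is meant to address, and the arithmetic above says nothing about it.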