	Aurornis 2 hours ago
		I should clarify that I'm referring generically to the types of quantizations used in local LLM inference, including those from Unsloth. Nobody actually quantizes every layer to Q4 in a Q4 quant.
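		To make the point concrete, here is a rough sketch of how a mixed-precision "Q4" file ends up averaging more than 4 bits per weight. The tensor groups, parameter counts, and bit widths below are invented for illustration and are not taken from any real model or from Unsloth's recipes.

```python
# Illustrative only: the groups, parameter counts, and bit widths are
# made-up assumptions, not figures from any real quantized model.
layers = [
    # (tensor group, parameter count, bits per weight)
    ("token_embd", 500_000_000, 6.0),
    ("attn", 2_000_000_000, 4.0),
    ("ffn", 4_000_000_000, 4.0),
    ("output", 500_000_000, 8.0),
]


def average_bits_per_weight(layers):
    """Parameter-weighted mean of the per-group bit widths."""
    total_bits = sum(count * bits for _, count, bits in layers)
    total_params = sum(count for _, count, _ in layers)
    return total_bits / total_params


avg = average_bits_per_weight(layers)
print(f"{avg:.2f} bits per weight")  # prints "4.43 bits per weight"
```

		With these hypothetical numbers, keeping only the embedding and output tensors at higher precision already pushes the average above the nominal 4 bits, even though most weights are stored at 4.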