	▲	halJordan 4 hours ago
		Quantization is an extraordinarily trivial process, especially if you're doing it with llama.cpp (which Unsloth obviously does). Qwen did release an FP8 version, which is itself a quantized version.