	▲	me_bx 3 hours ago
		TIL: > Quantization-Aware Training (QAT) [...] allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model
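To make the memory claim in the quote concrete, here is a rough sketch, not taken from the linked post. It assumes a hypothetical 1B-parameter model, 4-bit weights, and a simple symmetric per-tensor scheme; real QAT recipes differ in bit width, grouping, and scaling. `fake_quantize` shows the quantize-then-dequantize round trip that QAT typically applies to weights in the forward pass so the model learns to tolerate the rounding. The training loop itself is omitted.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Memory needed to hold the weights alone, in GiB.

    Ignores activations, KV cache, and quantization metadata such as scales.
    """
    return num_params * bits_per_weight / 8 / 2**30


def fake_quantize(weights: list[float], bits: int = 4) -> list[float]:
    """Quantize-dequantize round trip (symmetric, per-tensor; an assumed scheme).

    The result is still floating point, but it only takes values on the
    low-precision grid, which is what the model sees during QAT.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for signed 4-bit integers
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / qmax  # float step between adjacent integer levels
    out = []
    for w in weights:
        q = round(w / scale)              # snap to the nearest integer level
        q = max(-qmax - 1, min(qmax, q))  # clamp to the representable range
        out.append(q * scale)             # map back to float
    return out


# Hypothetical model size, chosen only for illustration.
HYPOTHETICAL_PARAMS = 1_000_000_000

bf16_gib = weight_memory_gib(HYPOTHETICAL_PARAMS, 16)
int4_gib = weight_memory_gib(HYPOTHETICAL_PARAMS, 4)

if __name__ == "__main__":
    print(f"bfloat16 weights: {bf16_gib:.2f} GiB")
    print(f"4-bit weights:    {int4_gib:.2f} GiB")
    print(fake_quantize([1.0, -0.5, 0.25, 0.0]))
```

Under these assumptions the weights shrink by a factor of four (16 bits down to 4). The quality side of the claim comes from training against the rounded weights rather than rounding only after training.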