	hatthew 4 hours ago
		I feel like it's a little disingenuous to compare against full-precision models. Anyone concerned about model size and memory usage is surely already using at least 8-bit quantization. Their main contribution seems to be hyperparameter tuning, and they don't compare against other quantization techniques of any sort.