hatthew 4 hours ago
I feel like it's a little disingenuous to compare against full-precision models. Anyone concerned about model size and memory usage is surely already using at least an 8-bit quantization. Their main contribution seems to be hyperparameter tuning, and they don't compare against other quantization techniques of any sort.
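For context on why fp32 is a weak baseline, here's a back-of-envelope sketch (my numbers for a hypothetical 7B-parameter model, weights only, ignoring activations and KV cache):

```python
# Rough weights-only memory footprint of a hypothetical 7B model
# at common precisions. int8 alone is already a 4x saving over fp32,
# which is why fp32 is rarely the deployment baseline anyone compares to.
params = 7_000_000_000
bytes_per_param = {"fp32": 4, "fp16": 2, "int8": 1}

for fmt, b in bytes_per_param.items():
    gib = params * b / 2**30
    print(f"{fmt}: {gib:.1f} GiB")
# fp32: 26.1 GiB
# fp16: 13.0 GiB
# int8:  6.5 GiB
```

So reporting a memory win over fp32 mostly restates what plain 8-bit quantization already buys you; the interesting comparison would be against int8/int4 baselines.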