Interesting.
If this were the case, however, why would labs go to the trouble of distilling smaller models rather than just releasing quantized versions of their flagships?