The open source tooling got quantization support 3 years ago! It was a lesser type of quantization, but more than enough to prove that the savings just go to bigger models.