With the sheer number of operations involved and the error rate of GPUs, this is going to get interesting for SOTA models.
Don't forget quantization.
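
To put rough shape on that, here is a back-of-the-envelope sketch. It reads "error rate" as an independent per-operation probability of a silent fault, which is my assumption and a simplification. The function names and both numbers at the bottom are made up for illustration, not measured values for any real GPU or model.

```python
import math

def expected_errors(num_ops: float, per_op_error_rate: float) -> float:
    """Expected number of faulty operations, assuming independent faults."""
    return num_ops * per_op_error_rate

def p_at_least_one_error(num_ops: float, per_op_error_rate: float) -> float:
    """Probability of at least one fault: 1 - (1 - p)^N.

    Computed via log1p/expm1 so it stays accurate when p is tiny and N is huge.
    """
    return -math.expm1(num_ops * math.log1p(-per_op_error_rate))

# Hypothetical placeholder values, NOT real measurements.
NUM_OPS = 1e21       # assumed total operations in a run
ERROR_RATE = 1e-20   # assumed per-operation fault probability

if __name__ == "__main__":
    print("expected faults:", expected_errors(NUM_OPS, ERROR_RATE))
    print("P(at least one):", p_at_least_one_error(NUM_OPS, ERROR_RATE))
```

The point is only that a tiny per-operation rate stops being negligible once the operation count is large enough. Quantization error is a separate, deterministic effect and is not modeled here.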