They're using a revolutionary new method called "training on the test set".
So, curve fitting the training data? Then we should expect out-of-sample accuracy to be crap?
Yeah, that's what tends to happen with those tiny models that look amazing on benchmarks.