How many times are we going to "discover" this? Over and over, it's blatantly apparent that there's massive data leakage between the training and test sets, and no one seems to care.