Legend2440 2 hours ago
I don’t think benchmark overfitting is as common as people think. Benchmark scores are highly correlated with the subjective “intelligence” of the model. So is pretraining loss. The only exception I can think of is models trained on synthetic data like Phi.