| ▲ | renewiltord 5 hours ago | |||||||||||||||||||||||||
A lot of things are easy if you ignore the incentive structure. E.g. a lot of papers will no longer be published if the data must be published. You’d lose all published research from ML labs. Many people like you would say “that’s perfectly okay; we don’t need them” but others prefer to be able to see papers like Language Models Are Few-Shot Learners https://arxiv.org/abs/2005.14165 So the answer is that we still want to see a lot of the papers we currently see because knowing the technique helps a lot. So it’s fine to lose replicability here for us. I’d rather have that paper than replicability through dataset openness. | ||||||||||||||||||||||||||
| ▲ | armchairhacker 4 hours ago | parent [-] | |||||||||||||||||||||||||
But the lab must publish at least the general category of data, and if that doesn't replicate, then the model only works on a more specific category than they claim (e.g. only their dataset). | ||||||||||||||||||||||||||
| ||||||||||||||||||||||||||