A lot of things are easy if you ignore the incentive structure. E.g. a lot of papers will no longer be published if the data must be published. You’d lose all published research from ML labs. Many people like you would say “that’s perfectly okay; we don’t need them” but others prefer to be able to see papers like Language Models Are Few-Shot Learners https://arxiv.org/abs/2005.14165

So the answer is that we still want to see a lot of the papers we currently see because knowing the technique helps a lot. So it’s fine to lose replicability here for us. I’d rather have that paper than replicability through dataset openness.

▲

armchairhacker 4 hours ago | parent [-]

But the lab must publish at least the general category of data, and if that doesn't replicate, then the model only works on a more specific category than they claim (e.g. only their dataset).

▲

awesome_dude 3 hours ago | parent [-]

Even with the exact same dataset and architecture, ML results aren't perfectly replicable due to random weight initialisation, training data order, and non-deterministic GPU operations. I've trained identical networks on identical data and gotten different final weights and performance metrics.

This doesn't mean the model only works on that specific dataset - it means ML training is inherently stochastic. The question isn't 'can you get identical results' but 'can you get comparable performance on similar data distributions.

▲

armchairhacker 3 hours ago | parent [-]

Then researchers should re-train their models a couple times, and if they can't get consistent results, figure out why. This doesn't even mean they must throw out the work: a paper "here's why our replications failed" followed by "here's how to eliminate the failure" or "here's why our study is wrong" is useful for future experiments and deserves publication.

	▲	awesome_dude 3 hours ago \| parent [-]
		As per my previous comment - we are discussing stochastic systems. By definition, they involve variance that cannot be explained or eliminated through simple repetition. Demanding a 'deterministic' explanation for stochastic noise is a category error; it's like asking a meteorologist to explain why a specific raindrop fell an inch to the left during a storm replication.