| ▲ | armchairhacker 4 hours ago |
| But the lab must at least publish the general category of data, and if results don't replicate on that category, then the model only works on something more specific than they claim (e.g. only their own dataset). |
|
| ▲ | awesome_dude 3 hours ago | parent [-] |
| Even with the exact same dataset and architecture, ML results aren't perfectly replicable, due to random weight initialisation, training data order, and non-deterministic GPU operations. I've trained identical networks on identical data and gotten different final weights and performance metrics. This doesn't mean the model only works on that specific dataset - it means ML training is inherently stochastic. The question isn't 'can you get identical results?' but 'can you get comparable performance on similar data distributions?' |
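This is easy to demonstrate in miniature. A minimal sketch (hypothetical, numpy only): train the same tiny logistic-regression model twice, changing nothing but the random weight initialisation. The final weights differ between runs, yet accuracy on the same data is comparable.

```python
import numpy as np

def train(seed, X, y, lr=0.1, steps=500):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])        # random weight initialisation
    for _ in range(steps):
        p = 1 / (1 + np.exp(-X @ w))       # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)   # gradient descent step
    return w

# Synthetic binary-classification data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
y = (X @ true_w + rng.normal(scale=0.1, size=200) > 0).astype(float)

w_a, w_b = train(seed=1, X=X, y=y), train(seed=2, X=X, y=y)
acc = lambda w: ((X @ w > 0) == y).mean()

print(np.allclose(w_a, w_b))   # the final weights differ between seeds
print(acc(w_a), acc(w_b))      # but the accuracies are comparable
```

The point is that weight-level equality is the wrong replication target; metric-level agreement across runs is the achievable one.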
| ▲ | armchairhacker 3 hours ago | parent [-] |
| Then researchers should re-train their models a couple of times, and if they can't get consistent results, figure out why. This doesn't even mean they must throw out the work: a paper saying "here's why our replications failed", followed by "here's how to eliminate the failure" or "here's why our study is wrong", is useful for future experiments and deserves publication. |
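The re-training protocol being proposed here can be sketched in a few lines: run the experiment under several seeds and report the mean and spread of the metric, flagging high variance for investigation. The `run_experiment` function below is a hypothetical stand-in for a full training run.

```python
import random
import statistics

def run_experiment(seed):
    # Stand-in for a full training run; returns a final metric.
    # The gaussian noise simulates run-to-run stochasticity.
    rng = random.Random(seed)
    return 0.90 + rng.gauss(0, 0.01)

scores = [run_experiment(seed) for seed in range(5)]
mean, std = statistics.mean(scores), statistics.stdev(scores)
print(f"accuracy: {mean:.3f} +/- {std:.3f} over {len(scores)} seeds")
if std > 0.05:
    print("high variance across seeds -- investigate before publishing")
```

Reporting a mean with a spread (rather than a single best run) is exactly the kind of evidence that lets reviewers distinguish a consistent result from a lucky seed.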
| ▲ | awesome_dude 2 hours ago | parent [-] |
| As per my previous comment: we are discussing stochastic systems. By definition, they involve variance that cannot be explained or eliminated through simple repetition. Demanding a deterministic explanation for stochastic noise is a category error; it's like asking a meteorologist to explain why a specific raindrop fell an inch to the left during a replication of a storm. |
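One concrete source of the irreducible noise mentioned upthread ("non-deterministic GPU operations") is that floating-point addition is not associative: summing the same numbers in a different order, as parallel reductions on a GPU do, yields a slightly different result. An illustrative sketch:

```python
import random

# Values spanning many orders of magnitude make rounding order-dependent.
random.seed(0)
xs = [random.uniform(-1, 1) * 10 ** random.randint(-8, 8) for _ in range(10000)]

forward = sum(xs)
backward = sum(reversed(xs))

print(forward == backward)        # the two orderings disagree
print(abs(forward - backward))    # small but nonzero discrepancy
```

No amount of re-running explains away which ordering is "correct"; what you can do is characterise the variance, as the mean-and-spread reporting above does.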
|