Remix.run Logo
thaumasiotes 19 hours ago

I don't actually understand why you'd want reproducibility in a statistical simulation. If you fix the output, what are you learning? The point of the simulation is to produce different random numbers so you can see what the outcomes are like... right?

Let's say I write a paper that says "in this statistical model, with random seed 1495268404, I get Important Result Y", and you criticize me on the grounds that when you run the model with random seed 282086400, Important Result Y does not hold. Doesn't this entire argument fail to be conceptually coherent?

hansvm 9 hours ago | parent | next [-]

- It eliminates one way of forging results. Without stating the seed, people could consistently fail to reproduce your results and you could always have "oops, guess I had a bad seed" as an excuse. You still have to worry about people re-running the simulation till the results look good, but that's handled via other mechanisms.

- Many algorithms are vastly easier to implement stochastically than deterministically. If you want to replay the system (e.g., to locally debug some production issue), you need those "stochastic" behaviors to nonetheless be deterministic.

- If you're a little careful with how you implement deterministic randomness, you can start to ask counterfactual questions -- how would the system have behaved had I made this change -- and actually compare apples to apples when examining an experimental run.

Even in your counterexample, the random seeds being reproducible and published is still important. With the seed and source published, now anyone can cheaply verify the issue with the simulation, you can debug it, you can investigate the proportion of "bad" seeds and suss out the error bounds of the simulation, etc.

thaumasiotes 3 hours ago | parent [-]

> It eliminates one way of forging results. Without stating the seed, people could consistently fail to reproduce your results and you could always have "oops, guess I had a bad seed" as an excuse.

Why does this matter? If people consistently fail to get results similar to yours by using different seeds, your results are invalid whether or not you made them up. And if people consistently do get results similar to yours with different seeds, your results are valid whether or not you made them up. No?

> Even in your counterexample, the random seeds being reproducible and published is still important.

> you can investigate the proportion of "bad" seeds and suss out the error bounds of the simulation

How does publishing seeds help with that? You do that by examining different seeds. If you know what seeds were used in the paper, this process doesn't change in any way.

AlotOfReading 17 hours ago | parent | prev [-]

The point is that if you run the simulation with seed X, you'll get the same results I do. This means that all you need to reproduce my results is the code and any inputs like the seed, rather than the entire execution history. If you want to provide different inputs and get the same result, that's another matter entirely (where numerical stability will be much more important).

thaumasiotes 14 hours ago | parent [-]

But what is the value of reproducing your results that way? Those results only mean anything if they're similar to what happens with other seeds.

dzaima 11 hours ago | parent | next [-]

If everything goes roughly the same with all seeds, then everything's indeed just fine. If not, I have a paper trail that I did in fact actually encounter the unexpected thing instead of just making up results. (indeed a seed could be cherry-picked, but the more is computed from the seed, the more infeasible that gets)

And that hints at the possibility of using seeds specifically for noting rare occurrences - communicating weird/unexpected(/buggy?) examples to collaborators, running for X hours to gather statistics without having to store all intermediate states of every single attempt if you later want to do extra analysis for cases insufficiently analyzed by the existing code, etc.

saagarjha 14 hours ago | parent | prev [-]

If something fails, I can reproduce the failure with your seed.