coolcase a month ago

I don't grok this but if you had to describe it in a nutshell, is this because of a race condition? Differences in HW? Floating point ops have some randomness built in?

mattb314 a month ago | parent

Super rough summary of the first half: in order to pick out random vectors with a given shape (where the "shape" is determined by the covariance matrix), MASS::mvrnorm() computes some eigenvectors, and eigenvectors are only well defined up to a sign flip. This means tiny floating-point differences between machines can result in one machine choosing v_1, v_2, v_3,... as eigenvectors, while another machine chooses -v_1, v_2, -v_3,... The resulting random numbers are totally different with the sign flips (but still "correct", because we only care about the overall distribution--these are random numbers after all). The section around "Q1 / Q2" is the core of the article.
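A minimal numpy sketch of the sign ambiguity (illustrative only, not MASS's actual code): a covariance matrix and its column-sign-flipped eigenvector matrix are both valid eigendecompositions, but they turn the same standard-normal draws into different samples.

```python
import numpy as np

# A 2x2 covariance matrix defining the "shape" of the distribution.
sigma = np.array([[2.0, 1.2],
                  [1.2, 1.0]])

# Eigendecomposition: sigma = V @ diag(w) @ V.T
w, V = np.linalg.eigh(sigma)

# Flipping the sign of any eigenvector column is equally valid:
V_flipped = V * np.array([-1.0, 1.0])
assert np.allclose(V @ np.diag(w) @ V.T, sigma)
assert np.allclose(V_flipped @ np.diag(w) @ V_flipped.T, sigma)

# Sampling: x = V @ (sqrt(w) * z), with z ~ N(0, I).
rng = np.random.default_rng(0)
z = rng.standard_normal(2)
x1 = V @ (np.sqrt(w) * z)
x2 = V_flipped @ (np.sqrt(w) * z)

# Same seed, same distribution -- but different concrete numbers.
print(x1, x2)
```

Both x1 and x2 are correct draws from N(0, sigma); a downstream analysis that expected bit-for-bit reproducibility across machines would still break.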

There's a lot of other stuff here too: mvtnorm::rmvnorm() can also use an eigendecomposition to generate your numbers, but it does some extra work to eliminate the effect of the sign flips, so you don't see this reproducibility issue. mvtnorm::rmvnorm() also supports a second method (Cholesky decomposition) that is uniquely defined and avoids eigenvectors entirely, so it's more stable. And there's some stuff on condition numbers not really mattering for this problem--turns out you can't describe all the possible floating-point problems a matrix could have with a single number.
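The Cholesky route, sketched in numpy (again just illustrative): the lower-triangular factor of a positive-definite matrix with positive diagonal is unique, so there's no sign choice for machines to diverge on.

```python
import numpy as np

sigma = np.array([[2.0, 1.2],
                  [1.2, 1.0]])

# Cholesky: sigma = L @ L.T, with L lower triangular and a positive
# diagonal. This factorization is unique, unlike the eigenvectors.
L = np.linalg.cholesky(sigma)
assert np.allclose(L @ L.T, sigma)

# Sampling uses the same transform idea: x = L @ z, z ~ N(0, I).
rng = np.random.default_rng(0)
z = rng.standard_normal(2)
x = L @ z
print(x)
```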

coolcase a month ago | parent

Thanks! So machine differences in their FP units drive the entropy, and the code doesn't handle that when picking eigenvectors?

saagarjha a month ago | parent

It doesn't have to be their FP units; it could be that they run the operations in different orders, or that different rounding modes were set. I don't think the blog post goes into detail as to why, but it does explain how this "cascades" into very different results coming out of the actual operation being performed.

pfortuny a month ago | parent

Implementations of IEEE-754 can differ between machines, and the order of operations in a library/compiled function can also be different. It is not entropy, it is the nature of floating-point arithmetic.
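A tiny example of the order-of-operations point: IEEE-754 addition isn't associative, so two mathematically equivalent summation orders can produce bit-different results.

```python
# Floating-point addition is not associative: rounding happens after
# each operation, so the grouping changes the result.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)   # prints False on IEEE-754 doubles
print(left, right)
```

If a library (or compiler optimization) sums in a different order on two machines, results like these feed into comparisons such as the eigenvector sign choice above.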

coolcase a month ago | parent

Thanks. So it sounds like the same machine should run the same calculations and get the same results each time, but differences may appear between machines.