bee_rider 3 days ago

I like these. They push back against the sort of… first correction that people make when encountering floating point weirdness. That is, the first mistake we make is to treat floats as reals, and then we observe some odd rounding behavior. The second mistake we make is to treat the rounding events as random. A nice thing about IEEE floats is that the rounding behavior is well defined.

Often it doesn't matter: you ask for a GEMM and you get whatever order of operations BLAS, AVX-whatever, and OpenMP conspire to give you, so the result is more-or-less random.

But if it does matter, the ability to define it is there.
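
To make the order-of-operations point concrete, here is a minimal Python sketch. The three values are purely illustrative and not taken from any BLAS. Each addition is rounded deterministically, but regrouping the same terms changes which roundings happen, which is roughly what a reordered reduction does.

```python
# Floating-point addition is not associative: every operation is rounded
# deterministically, but the grouping decides which roundings occur.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # one possible evaluation order
right_to_left = a + (b + c)   # the same terms, grouped differently

print(left_to_right == right_to_left)            # the two groupings disagree
print(left_to_right.hex(), right_to_left.hex())  # exact bit patterns

# Repeating the same order reproduces the same bits every time.
assert (a + b) + c == left_to_right
```

Fixing the evaluation order fixes the result. The apparent randomness only comes from not controlling that order.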

jandrese 3 days ago | parent | next [-]

> A nice thing about IEEE floats is that the rounding behavior is well defined.

Until it isn't. I used to play the CodeWeavers port of Kohan and while the game would allow you to do crossplay between Windows and Linux, differences in how the two OSes rounded floats would cause the game to desynchronize after 15-20 minutes of play or so. Some unit's pathfinding algorithm would zig on Windows and zag on Linux, causing the state to diverge and end up kicking off one of the players.

bee_rider 3 days ago | parent | next [-]

Is it possible that your different operating systems just had different mxcsr values?

Or, since it was a port, maybe they were compiled with different optimizations.

There are a lot of things happening under the hood but most of them should be deterministic.

toolslive 2 days ago | parent [-]

until someone compiles with -ffast-math enabled, stating "I don't care about accuracy, as long as it's fast".

bee_rider 2 days ago | parent [-]

It is good to enable that flag because it also enables the “fun safe math optimizations” flag, and it is important to remind people that math is a safe way to have fun.

toolslive 2 days ago | parent [-]

"Friends don't let friends use fast-math"

https://simonbyrne.github.io/notes/fastmath/

jcranmer 2 days ago | parent | prev [-]

The differences are almost certainly not in how the two OSes rounded floats--the IEEE rounding modes are standard, and almost no one actually bothers to even change the rounding mode from the default.

For cross-OS issues, the most likely culprit is that Windows and Linux are using different libm implementations, which means that the results of functions like sin or atan2 are going to be slightly different.

gpderetta 2 days ago | parent | next [-]

The libm difference might explain it, but another possible difference is that long double is 80-bit on x86 Linux and 64-bit on x86 Windows.

My recollection is fuzzy, but IIRC the legacy x87 control word is always set to extended precision on Linux, while it is set to double precision on Windows, and this affects normal float and double computations as well: the conversion to float or double is only done when values are stored to or loaded from memory, while intermediate in-register operations are always performed at the maximum enabled precision. Changing precision before each operation is expensive, so it is not done.

This is one of the causes of x87's apparent nondeterminism, as it depends on the compiler unpredictably spilling FP registers [1]: unless you always use the maximum enabled precision, computations might not be reproducible from one build to the next, even in the same environment.

[1] Eventually GCC added compilation modes with deterministic behavior, but that was well after x87 was obsolete. In the meantime people had to make do with -ffloat-store and/or volatile. See https://gcc.gnu.org/wiki/FloatingPointMath.

edit: but you know this as you mentioned it elsethread.
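
The spill effect described above can be sketched in Python, with a loud caveat: Python does not expose x87 registers, so this uses binary32 as the "in-memory" format and binary64 as the "wide register" format, as stand-ins for double and 80-bit extended. The helper names are hypothetical. The point is only that rounding to the narrow format after every step gives a different answer than rounding once at the end.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_narrow_every_step(values):
    # Models a spill to memory after every operation: each intermediate
    # result is rounded to the narrow format before the next addition.
    acc = 0.0
    for v in values:
        acc = to_f32(acc + v)
    return acc

def sum_wide_then_narrow(values):
    # Models keeping the accumulator in a wide register and converting
    # to the narrow format only once, at the final store.
    acc = 0.0
    for v in values:
        acc = acc + v
    return to_f32(acc)

values = [16777216.0, 1.0, 1.0]  # 2**24, then two increments of 1
narrow = sum_narrow_every_step(values)
wide = sum_wide_then_narrow(values)
print(narrow, wide)  # the two strategies disagree by 2
```

Which of the two strategies a compiler effectively picks for a given expression depends on register allocation, which is why the result could change between builds.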

zokier 2 days ago | parent | prev [-]

Problem with rounding modes and other fpenv flags is that any library anywhere might flip some flag and suddenly the whole program changes behavior.
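
A rough illustration of that hazard, with an assumption stated up front: Python's binary floats do not expose the hardware rounding mode, so this sketch uses the `decimal` module's thread-wide context as a stand-in for the floating-point environment. `third_party_helper` is a hypothetical library routine, not a real API.

```python
from decimal import Decimal, getcontext, ROUND_DOWN, ROUND_HALF_EVEN

def compute():
    # Code that silently relies on whatever the ambient context happens to be.
    return Decimal(2) / Decimal(3)

def third_party_helper():
    # Hypothetical library call that changes shared state and never restores it.
    getcontext().rounding = ROUND_DOWN

getcontext().rounding = ROUND_HALF_EVEN  # start from the usual default
before = compute()
third_party_helper()
after = compute()                        # same call, different last digit
getcontext().rounding = ROUND_HALF_EVEN  # restore for the rest of the program
print(before, after)
```

Nothing in `compute` changed between the two calls. Only the shared state did, which is what makes this class of bug hard to track down.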

zahlman 2 days ago | parent | prev [-]

On the other hand, it feels wrong to me to call these "myths" when they are really just simplifications.

(And I have heard of the 80-bit internal register thing described in https://news.ycombinator.com/item?id=44888692 causing real problems for people before. And -ffast-math is basically spooky action at a distance considering how it bleeds into the entire program; see e.g. https://moyix.blogspot.com/2022/09/someones-been-messing-wit....)