jcranmer 3 days ago

Wow, you're crossing a few wires in your zeal to provide information to the point that you're repeating myths.

> The intel processors have a separate math coprocessor that supports 80bit floats

x86 processors have two FPUs, the x87 unit (that you're describing) and the SSE unit. Anyone compiling for x86-64 uses the SSE unit by default, and most x86-32 compilers still default to SSE anyways.

> Moving a float from a register in this coprocessor to memory truncates the float.

No it doesn't. The x87 unit has load and store instructions for 32-bit, 64-bit, and 80-bit floats. If you want to spill 80-bit values as 80-bit values, you can do so.

> Repeated math can be done inside this coprocessor to achieve higher precision so hot loops generally don't move floats outside of these registers.

Hot loops these days use the SSE stuff because it's so much faster than x87. Friends don't let friends use long double without good reason!

> Non-determinism occurs in programs running on intel with floats when threads are interrupted and the math coprocessor flushed.

Lol, nope. You'll spill the x87 register stack on thread context switch with FSAVE or FXSAVE or XSAVE, all of which will store the registers as 80-bit values without loss of precision.

That said, there was a problem with programs that use the x87 unit, but it has absolutely nothing to do with what you're describing. The x87 unit doesn't have arithmetic for 32-bit and 64-bit values, only 80-bit values. Many compilers, though, just pretended that the x87 unit supported arithmetic on 32-bit and 64-bit values, so that FADD would simultaneously be a 32-bit addition, a 64-bit addition, and an 80-bit addition. If the compiler needed to spill a floating-point register, it would spill the value as a 32-bit value (if float) or 64-bit value (if double), and register spills are pretty unpredictable for user code. So the same source expression could be rounded at different points depending on whether its intermediate happened to stay in a register. That's the nondeterminism you're referring to, and it's considered a bug in every compiler I'm aware of. (See https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p37... for a more thorough description of the problem).
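To make the spill problem concrete, here is a rough model of it rather than real x87 code. It assumes Python's float (binary64) can stand in for the wider register format and that binary32 is the declared type of the variable; the helper names (to_f32, residual_kept_in_register, residual_with_spill) are made up for illustration. The only difference between the two functions is whether the intermediate sum gets narrowed, which is what an unpredictable spill would do.

```python
import struct


def to_f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value (models a 32-bit store)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def residual_kept_in_register(a: float, b: float) -> float:
    """Compute (a + b) - a with the intermediate kept at the wider precision."""
    s = a + b              # stays "in the register": no narrowing here
    return to_f32(s - a)   # only the final result is stored as a 32-bit float


def residual_with_spill(a: float, b: float) -> float:
    """Same expression, but the intermediate is spilled to a 32-bit slot."""
    s = to_f32(a + b)      # the spill rounds the intermediate to binary32
    return to_f32(s - a)


# Inputs that are exactly representable as 32-bit floats.
a = to_f32(1.0)
b = to_f32(2.0 ** -30)

kept = residual_kept_in_register(a, b)
spilled = residual_with_spill(a, b)

# The wide intermediate preserves b; the spilled one rounds 1 + 2**-30 down to 1.0.
print("kept in register:", kept)
print("spilled to 32-bit:", spilled)
print("same answer?", kept == spilled)
```

In this model the "register" version recovers 2**-30 while the "spilled" version returns 0.0, even though the source expression is identical. Whether a real compiler spills at that point depends on register pressure, optimization level, and surrounding code, which is why the result looked random from the user's side.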

mturmon 2 days ago

Thanks for the link, it was very informative.

I lived through the highly irreproducible eras described in the section you link, and that you summarize in your last paragraph. There was so much different FP hardware, and so many different niche compilers, that my takeaway became “you can’t rely on reproducibility across any hardware/OS version/compiler/library difference”.

There are still issues with libraries and compilers, summarized farther down (https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p37...). And these can be observed in the wild when changing compilers on the same underlying hardware.

But your point is that irreproducibility at the level of interrupts or processor scheduling is not a thing on contemporary mainstream hardware. That’s important and I hadn’t realized that.

BlackFly 2 days ago

It isn't zeal, it's a hazy, 15-year-old memory of getting different results on different executions on the same supercomputer. The story that went around was the one I relayed, but certainly your link does a better job of explaining what actually happens, in the user perspective section:

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p37...

Summarized as,

> Most users cannot be expected to know all of the ways that their floating-point code is not reproducible.

Glad to know that compilers default to SSE2 nowadays though.