WalterBright 4 hours ago

NaNs are a very underappreciated feature of IEEE-754 floating point. In the D programming language, floats get default initialized to NaN, not to 0.0.

    double y = 0.0; // initialized to 0.0
    double x; // initialized to NaN

The question routinely comes up: "why not default initialize to 0.0?" The reason is that a routine mistake in programming is forgetting to initialize a variable. With a default of 0.0, one may never realize that the floating point calculation results are wrong. But with NaN, the result of a floating point computation will be NaN, which is unlikely to go unnoticed.
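
A minimal sketch of that failure mode, assuming a made-up function (`sumForgotInit` is hypothetical, not from any library) whose accumulator was meant to start at 0.0:

```d
import std.math : isNaN;

/// Hypothetical example: the accumulator should have been
/// initialized to 0.0, but the initializer was forgotten.
double sumForgotInit(const double[] values)
{
    double total;              // no initializer: D sets this to double.nan
    foreach (v; values)
        total += v;            // NaN + v is still NaN
    return total;
}

void main()
{
    auto result = sumForgotInit([1.0, 2.0, 3.0]);
    // The bug surfaces as NaN rather than hiding behind a plausible number.
    assert(isNaN(result));
}
```

In this particular case a default of 0.0 would happen to give the intended sum. The argument above concerns variables whose correct starting value is something other than 0.0, where a zero default produces a plausible but wrong number.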

I don't know of any other programming language with this safety feature.

Also, the D `char` type is initialized to 0xFF, not 0, because Unicode says that 0xFF is an invalid character.

p1necone 3 hours ago | parent | next [-]

Just requiring explicit assignment before first use feels like the superior approach to automatic initialization, regardless of whether the automatic initialization is with 0 or with NaN.

WalterBright 3 hours ago | parent | next [-]

That suggestion is often made.

The trouble with it is a bug I've seen often. People will get an error message about an "uninitialized variable". Then they go into "just get the compiler to shut up" mode, and pick "0" as the initializer. Then, the program compiles and runs, and silently produces the wrong answer. Code reviews will simply pass over the "0" initializer, as it looks right.

With default NaN initialization, the programmer is more likely to stop and think about it, not just insert 0.

Another issue with it is:

    float x = 0.0;
    setFloat(&x);

    void setFloat(float* px) { *px = 3.0; }

For the purposes of code clarity I don't want to see a variable initialized to a value that is never used, just to shut the compiler up.

ncurses1010 an hour ago | parent [-]

With the default initialization to nan, do you ever run into situations where people are searching for common sources for nan (nan literals, div by zero) and they can't find it? Or cases where only some branches but not others initialize the float?

WalterBright 37 minutes ago | parent [-]

To leave a variable uninitialized, use the construction:

    int x = void;

Note that nobody is going to write this by accident. And it's easy to grep for.
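
A rough sketch of where `= void` is typically worth it, assuming a scratch buffer that is completely overwritten before anything reads it (`fill` is a hypothetical helper written for this example):

```d
/// Hypothetical helper: overwrites every element of buf.
void fill(double[] buf)
{
    foreach (i, ref b; buf)
        b = i * 0.5;
}

void main()
{
    // Skip the default NaN fill. The contents are garbage until written,
    // so nothing may read from scratch before fill() runs.
    double[4] scratch = void;
    fill(scratch[]);
    assert(scratch[0] == 0.0);
    assert(scratch[3] == 1.5);
}
```

Reading a void-initialized variable before assigning it is a bug, which is why the opt-out is spelled explicitly rather than being the default.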

To find the source of a NaN, it helps to know that every arithmetic operation that has a NaN as an operand produces a NaN as a result. So if you see a NaN in the output, you can work backwards to where it originated.
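
One way to work backwards, sketched under the assumption that you can temporarily wrap intermediate values in a check. `notNaN` is a hypothetical helper written for this example, not part of Phobos:

```d
import std.exception : enforce;
import std.math : isNaN;

/// Hypothetical helper: returns v unchanged, or throws an Exception
/// naming the stage at which a NaN was first observed.
double notNaN(double v, string stage)
{
    enforce(!isNaN(v), stage ~ " is NaN");
    return v;
}

void main()
{
    double gain;                               // never assigned, so it is NaN
    double raw = notNaN(2.0 * 3.0, "raw");     // passes: 6.0 is not NaN

    try
    {
        double scaled = notNaN(raw * gain, "scaled");
        assert(false, "unreachable: the check above should have thrown");
    }
    catch (Exception e)
    {
        // The first failing stage points at the operand that introduced the NaN.
        assert(e.msg == "scaled is NaN");
    }
}
```

Moving the checks earlier in the pipeline until they stop firing narrows the origin down to a single expression.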

lmm an hour ago | parent | prev [-]

Yep. This is NaN as a billion dollar mistake all over again.

WalterBright 4 hours ago | parent | prev | next [-]

Another crucial use of NaNs is if you have a sensor. If the sensor has failed, the sensed value should be transmitted as NaN, not 0, so the receiver knows the data is bad.
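
A sketch of the receiving side, assuming readings arrive as an array of doubles with NaN marking a failed sample (`meanOfValid` and the sample values are made up for illustration):

```d
import std.math : isNaN;

/// Mean of the readings that are not NaN; NaN if every reading is bad.
double meanOfValid(const double[] readings)
{
    double sum = 0.0;          // deliberate: 0.0 is the right start for a sum
    size_t count = 0;
    foreach (r; readings)
    {
        if (isNaN(r))          // note: `r == double.nan` is always false
            continue;
        sum += r;
        ++count;
    }
    return count ? sum / count : double.nan;
}

void main()
{
    // Hypothetical readings; the second sample came from a failed sensor.
    double[] readings = [20.0, double.nan, 22.0];
    assert(meanOfValid(readings) == 21.0);

    // With no good samples, the result itself reports "no data".
    assert(isNaN(meanOfValid([double.nan])));
}
```

Had the failed sample been sent as 0, the mean would be 14.0 and nothing downstream could tell it was wrong.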

AlotOfReading 3 hours ago | parent [-]

My experience is that if you write an interface that (rarely) returns NaNs, someone will use it assuming it's never NaN no matter how good the docs are. Then their code does bad things and you have to patiently explain why they're wrong and yes, they are holding isnan() wrong (in C/C++).

WalterBright an hour ago | parent [-]

NaN for a failed sensor is objectively better than any other value. But at some point you just cannot help some people.

anitil 3 hours ago | parent | prev | next [-]

That's a very thoughtful decision, I always enjoy your updates on D

wpollock 3 hours ago | parent | prev [-]

> ... Unicode says that 0xFF is an invalid character.

Not so. You may be thinking of UTF-8 encoding. 0xff is DEL in Unicode.

LittleLily an hour ago | parent | next [-]

DEL is Unicode code point U+007F, which is the byte 0x7F in UTF-8, not 0xFF. Perhaps you were thinking of ÿ, which is code point U+00FF and encodes to the bytes 0xC3 0xBF in UTF-8.

WalterBright 3 hours ago | parent | prev [-]

The "char" type in D represents a UTF-8 code unit. The byte 0xFF is not a valid UTF-8 code unit and is strictly forbidden.
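
A small sketch of what that means in practice, assuming Phobos' `std.utf.validate` as the checker (`isValidUtf8` is a hypothetical wrapper written for this example):

```d
import std.utf : UTFException, validate;

/// Hypothetical wrapper: true if s is well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s);           // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    char c;                    // default-initialized
    assert(c == 0xFF);
    assert(char.init == 0xFF);

    char[] buf = new char[3];  // every element starts as char.init
    assert(!isValidUtf8(buf)); // a forgotten fill is caught by validation

    buf[] = 'a';               // once actually written, the buffer is valid
    assert(isValidUtf8(buf));
}
```

This mirrors the NaN default: a buffer nobody wrote to fails validation instead of passing as a run of NUL characters.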