lifthrasiir 4 days ago

Your problem is practically important, as it amounts to finding the "canonical" string representation of a given floating-point number (once you add the closeness requirement). Many efficient algorithms have been developed as a result; notable ones include Dragon4, Grisu3, Ryu, and Dragonbox. See also Google's double-conversion library, which implements the first two of them.
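
If you just want one of these off the shelf, here is a minimal sketch using double-conversion's shortest-output mode. This assumes the library's EcmaScriptConverter/ToShortest entry points; check the actual headers for details.

  // Hedged sketch: shortest round-trippable decimal via double-conversion.
  #include <double-conversion/double-conversion.h>
  #include <cstdio>

  int main() {
      using double_conversion::DoubleToStringConverter;
      using double_conversion::StringBuilder;

      char buf[64];
      StringBuilder sb(buf, sizeof buf);
      // EcmaScriptConverter() is preconfigured for JavaScript-style output;
      // ToShortest() emits the fewest digits that still round-trip.
      DoubleToStringConverter::EcmaScriptConverter().ToShortest(3.14, &sb);
      std::printf("%s\n", sb.Finalize()); // expected: 3.14
  }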

jjcob 4 days ago | parent | next [-]

Thank you for pointing me to all the prior work on this!

I am surprised at how complex the issue turns out to be. I assumed there might be an elegant solution, but the problem is a lot harder than I thought.

pklausler 4 days ago | parent [-]

See also the code in LLVM Flang for binary/decimal conversions. Minimal-digit decimals don't come up in Fortran as often as in other languages, but I use them when allowed by the output format. Interestingly, the last digit of a minimal decimal conversion is often irrelevant, so long as it exists.

unnah 4 days ago | parent | prev [-]

Since such algorithms were developed in the 1990s, nowadays you can expect your language's standard library to use them for float-to-decimal and decimal-to-float conversions. So all you need to do in code is print the float without any special formatting instructions, and you'll get the shortest unique decimal representation.

lifthrasiir 4 days ago | parent [-]

Except that C specifies that floating-point numbers should be printed at a fixed precision (6 decimal digits) when no precision is given. Internally, libcs do use some sort of float-to-decimal algorithm [1], but you can't get the shortest representation out of them.

[1] Some (e.g. Windows CRT) do use the shortest representation as a basis, in which case you can actually extract it with a large enough precision (where all subsequent digits will be zeros). But many libcs print the exact binary value instead (e.g. 3.140000000000000124344978758017532527446746826171875 for `printf("%.51f", 3.14)`), so they are useless for our purpose.
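
A quick way to see which camp your libc falls into (the glibc output below matches the example above):

  #include <cstdio>

  int main() {
      // With a large explicit precision, glibc prints the exact decimal
      // expansion of the binary double; a shortest-based CRT would instead
      // pad the shortest form with trailing zeros.
      std::printf("%.51f\n", 3.14);
      // glibc: 3.140000000000000124344978758017532527446746826171875
  }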

unnah 4 days ago | parent | next [-]

Ok, you got me there. Looks like they fixed that in C++17 with std::to_chars. https://en.cppreference.com/w/cpp/utility/to_chars.html
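
A minimal sketch (C++17 or later; the plain overload of std::to_chars produces the shortest round-tripping string):

  #include <charconv>
  #include <cstdio>

  int main() {
      double x = 1.0 / 3.0;
      char buf[64];
      // Without an explicit precision, std::to_chars emits the shortest
      // decimal string that parses back to exactly the same double.
      auto [end, ec] = std::to_chars(buf, buf + sizeof buf, x);
      if (ec == std::errc{}) {
          *end = '\0';
          std::printf("%s\n", buf); // 0.3333333333333333
      }
  }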

Sharlin 4 days ago | parent | prev [-]

That's what the %g format specifier is for.

  printf("%f\n", 3.14); // 3.140000
  printf("%g\n", 3.14); // 3.14
lifthrasiir 4 days ago | parent [-]

This is a common misconception. Quoting the ISO C standard (actually the final draft of C23, but that should be enough for this purpose):

> g, G: A double argument representing a floating-point number is converted in style f or e (or in style F or E in the case of a G conversion specifier), depending on the value converted and the precision. Let P equal the precision if nonzero, 6 if the precision is omitted, or 1 if the precision is zero. Then, if a conversion with style E would have an exponent of X:
>
> - if P > X ≥ −4, the conversion is with style f (or F) and precision P − (X + 1);
> - otherwise, the conversion is with style e (or E) and precision P − 1.

Note that it doesn't say anything about, say, the inherent precision of the number. It is a simple remapping to %f or %e depending on the precision value.
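
You can see the default P = 6 in action (output shown for a typical libc):

  #include <cstdio>

  int main() {
      double x = 1.0 / 3.0;
      std::printf("%g\n", x);    // 0.333333 (P defaults to 6)
      std::printf("%.17g\n", x); // 0.33333333333333331 (enough digits to
                                 // round-trip, but not the shortest form)
  }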

Sharlin 4 days ago | parent [-]

Hmm, is that then just an extension that's de facto standard? Every compiler I tried at godbolt.org prints 3.140000 and 3.14 respectively.

lifthrasiir 4 days ago | parent [-]

3.14 happens to be the correct answer for both %g and the shortest possible representation. Try 1.0 / 3.0 instead; %g won't show all the digits required for round-tripping.
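
A quick sketch of that round-trip failure:

  #include <cstdio>
  #include <cstdlib>

  int main() {
      double x = 1.0 / 3.0;
      char buf[32];
      std::snprintf(buf, sizeof buf, "%g", x);  // "0.333333": only 6 digits
      double y = std::strtod(buf, nullptr);
      std::printf("%s\n", x == y ? "round-trips" : "does not round-trip");
      // prints: does not round-trip
  }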