adrian_b 3 hours ago

There are applications where the difference between fixed-point and floating-point numbers matters, i.e. the difference between having a limit for the absolute error or for the relative error.

The applications where the difference does not matter are those whose accuracy requirements are much looser than what the numeric format provides.

When using double-precision FP64 numbers, the rounding errors are frequently small enough to satisfy the requirements of an application, regardless of whether those requirements are specified as a relative error or as an absolute error.
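To make the distinction concrete, here is a small Python sketch. The binary64 parameters are standard IEEE 754 facts; the 16.16 fixed-point format at the end is just an illustrative choice, not something from the thread:

```python
import math

# FP64 bounds the *relative* error: the gap between adjacent doubles
# (the ulp) scales with the magnitude of the value, staying within a
# factor of 2**-52 of it. So the *absolute* error of rounding grows
# as the values grow.
eps = 2.0 ** -53  # unit roundoff for binary64

for x in (1.0, 1e6, 1e12):
    spacing = math.ulp(x)            # gap to the next representable double
    assert spacing <= 2 * eps * x    # relative spacing stays bounded
    print(f"x = {x:>8.0e}  ulp(x) = {spacing:.3e}")

# A 32-bit fixed-point format with 16 fractional bits instead has a
# *constant* absolute resolution of 2**-16 everywhere in its range.
print(f"fixed 16.16 resolution = {2.0 ** -16:.3e}")
```

The printout shows the ulp growing from about 1e-16 at x = 1 to about 1e-4 at x = 1e12, while the fixed-point grid stays at roughly 1.5e-5 regardless of magnitude.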

In such cases, floating-point numbers are the right choice, because the existing hardware supports them.

But when an application has stricter requirements for the maximum absolute error, there are cases where it is preferable to use a smaller fixed-point format instead of a bigger floating-point format. This is especially true when FP64 is not sufficient: quadruple-precision floating point would be needed, which only seldom has hardware support, so it must be implemented in software anyway, preferably as double-double-precision numbers.
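For readers unfamiliar with double-double arithmetic: it represents a value as an unevaluated sum of two doubles, giving roughly 106 bits of significand. A minimal sketch of addition, built on Knuth's error-free two-sum transformation (the function names are mine; production libraries such as QD implement many more operations and a more careful renormalization):

```python
def two_sum(a, b):
    # Knuth's error-free transformation: s + e == a + b exactly,
    # where s is the rounded sum and e the rounding error.
    s = a + b
    t = s - a
    e = (a - (s - t)) + (b - t)
    return s, e

def dd_add(x_hi, x_lo, y_hi, y_lo):
    # "Sloppy" double-double addition: the result pair (hi, lo)
    # carries far more precision than a single double.
    s, e = two_sum(x_hi, y_hi)
    e += x_lo + y_lo
    return two_sum(s, e)

# 1 + 2**-80 is not representable in one double (it rounds back to 1.0),
# but it survives exactly as a double-double pair:
hi, lo = dd_add(1.0, 0.0, 2.0 ** -80, 0.0)
print(hi, lo)  # 1.0 and 2**-80: no information lost
```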

AlotOfReading 2 hours ago | parent

    i.e. the difference between having a limit for the absolute error or for the relative error.
The masking procedure I mentioned gives uniform absolute error in floats, at the cost of lost precision in the significand. The trade-off between the two is really space and hence precision.
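The masking procedure itself isn't shown in this thread, so purely as an illustration of the idea: one well-known way to get a uniform absolute-error grid out of floats is to add and subtract a large constant, so that round-to-nearest discards the low significand bits. The step size 2**-16 and the helper name `quantize` are my own choices here:

```python
def quantize(x, step_log2=-16):
    # Snap x to a uniform grid of spacing 2**step_log2. Adding the
    # constant forces x's low significand bits off the end, where
    # round-to-nearest drops them; subtracting restores the scale.
    # The absolute error is at most half a grid step, independent of
    # the magnitude of x (valid while |x| stays well below c).
    c = 3.0 * 2.0 ** (51 + step_log2)
    return (x + c) - c

print(quantize(0.1))  # nearest multiple of 2**-16, i.e. 6554/65536
```

This only works under the default round-to-nearest mode, and in C one would also have to prevent the compiler from algebraically simplifying `(x + c) - c` away; in Python the expression is evaluated as written.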

I'm not saying fixed point is never useful, just that it's a very situational technique these days, used to address specific issues rather than as an alternative default. So if you aren't even doing numerical analysis (as most people aren't), you should stick with floats.