AlotOfReading 5 hours ago
Fixed point and floating point are extremely similar, so most of the time you should just go with floats. If you start with a fixed type, reserve some bits for storing an explicit exponent, and define a normalization scheme, you've recreated the core of IEEE floats. That also means we can go the other way and emulate (lower-precision) fixed point by masking an appropriate number of LSBs in the significand to regain the constant density of fixed point. You can treat floating point like fixed point in log space for most purposes, ignoring some fiddly details around exponent boundaries.

And since they're essentially the same, there just aren't many situations where implementing your own fixed point is worth it. MCUs without FPUs are increasingly uncommon. Financial calculations seem to have converged on decimal floating point. Floating-point determinism is largely solved these days. Fixed point has better precision at a given width, but 53 vs. 64 bits isn't much of a difference for most applications.

If you regularly encounter situations where you need translation invariance across a huge range at a fixed (high) precision, though, fixed point is probably more useful to you.
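The "mask LSBs in the significand" trick from the comment above can be sketched in a few lines. This is my own illustration, not the commenter's code; the function name and the choice of 8 kept bits are made up for the example. It reinterprets an IEEE-754 double as a 64-bit integer and zeroes the low fraction bits, so values within a binade collapse onto a coarse, evenly spaced grid, much like a fixed-point format in that range.

```python
# Illustration only (hypothetical helper): emulate a lower-precision
# significand by masking LSBs of an IEEE-754 binary64 value.
import struct

def mask_significand(x: float, keep_bits: int) -> float:
    """Keep only the top `keep_bits` of the 52-bit fraction field."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    mask = ~((1 << (52 - keep_bits)) - 1) & 0xFFFFFFFFFFFFFFFF
    (y,) = struct.unpack("<d", struct.pack("<Q", bits & mask))
    return y

# Within each binade the surviving values are evenly spaced, which is
# exactly the constant-density behavior of a fixed-point grid; the
# spacing still doubles across exponent boundaries (the fiddly part).
print(mask_significand(3.14159, 8))   # rounded down onto the coarse grid
```

Note this truncates rather than rounds to nearest; a faithful emulation would add half the mask step before masking.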
adrian_b 3 hours ago | parent
There are applications where the difference between fixed-point and floating-point numbers matters, i.e. the difference between bounding the absolute error and bounding the relative error.

The applications where the difference does not matter are those whose accuracy requirements are much weaker than what the numeric format provides. With double-precision FP64 numbers, the rounding errors are frequently small enough to satisfy an application's requirements regardless of whether those requirements are specified as a relative error or as an absolute error. In such cases floating-point numbers should be used, because they are supported by the existing hardware.

But when an application has stricter requirements on the maximum absolute error, there are cases where it is preferable to use a smaller fixed-point format instead of a bigger floating-point format. This is especially true when FP64 is not sufficient: quadruple-precision floating point would be needed, which only seldom has hardware support, so it must be implemented in software anyway, preferably as double-double arithmetic.
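The absolute-vs-relative distinction above is easy to see numerically. The sketch below (my own example, with a hypothetical Q31.32 format; none of these names come from the comment) shows that a fixed-point format has a constant absolute step everywhere, while a float64's step (its ULP) grows with magnitude, so its absolute error bound degrades for large values even though its relative bound stays constant.

```python
# Illustration only: constant absolute step of a hypothetical Q31.32
# fixed-point format vs. the magnitude-dependent ULP of float64.
import math

SCALE = 1 << 32          # 32 fractional bits

def to_fixed(x: float) -> int:
    """Round to the nearest representable Q31.32 value."""
    return round(x * SCALE)

def from_fixed(q: int) -> float:
    return q / SCALE

FIXED_STEP = 1 / SCALE   # same absolute spacing at every magnitude

for x in (1.0, 1e6, 1e12):
    # math.ulp gives the spacing of float64 values near x; it doubles
    # with each exponent increase, unlike the fixed-point spacing.
    print(f"{x:>8g}: fixed step {FIXED_STEP:.3e}, float64 ulp {math.ulp(x):.3e}")
```

Near 1.0 the float64 grid is far finer than Q31.32, but by 1e12 its ULP is several orders of magnitude coarser than the fixed-point step, which is exactly the regime where the parent says a smaller fixed-point format can beat a bigger float.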