| ▲ | Pannoniae 7 hours ago | |
Microbenchmarks. A LUT will win many of them but you pessimise the rest of the code. So unless a significant (read: 20+%) portion of your code goes into the LUT, there isn't that much point to bother. For almost any pure calculation without I/O, it's better to do the arithmetic than to do memory access. | ||
| ▲ | jcalvinowens 6 hours ago | parent [-] | |
Locality within the LUT matters too: if you know you're looking up identical or nearby-enough values to benefit from caching, an LUT can be more of a win. You only pay the cache cost for the portion you actually touch at runtime. I could imagine some graphics workloads tend compute asin() repeatedly with nearby input values. But I'd guess the locality isn't local enough to matter, only eight double precision floats fit in a cache line. | ||