Remix.run Logo
jcalvinowens 6 hours ago

Locality within the LUT matters too: if you know you're looking up identical or nearby-enough values to benefit from caching, an LUT can be more of a win. You only pay the cache cost for the portion you actually touch at runtime.

I could imagine some graphics workloads tend compute asin() repeatedly with nearby input values. But I'd guess the locality isn't local enough to matter, only eight double precision floats fit in a cache line.