def-pri-pub 4 hours ago

Funny enough, the fdlibm implementation of asin() did come up in my research. I believe it might have been more performant in the past. But from a quick scan of `e_asin.c`, it looks like it does something similar to the Cg asin() implementation, only with more terms and more multiplications, so my guess is that it's slower. It also seems to take more branches, which could slow it down further.

adampunk 2 hours ago

Yeah, Ng’s work in fdlibm is cool and really clever in parts, but it has a lot of branching. Some of the ways they reach correct rounding are…so cool.