meisel a day ago

Another way to speed this up would be SIMD, although it would be interesting to check the assembly to see whether it was already auto-vectorized.

This reminds me of a trick for sorting floats faster, even when they include negatives, NaNs, and infinities: map each float to a sortable integer so that values can be compared as ints (the precise mapping depends on how you want to order things like NaN). The one-time conversion is fast and pays off over the roughly lg(n) comparisons each element takes part in. After sorting, map them back.
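
For anyone who hasn't seen the mapping, here is a minimal sketch of one common variant for 64-bit IEEE 754 doubles. The function names (`float_to_key`, `key_to_float`, `sort_floats`) are made up for illustration, and the NaN placement is just what falls out of this particular mapping: NaNs with the sign bit clear land after +inf, and NaNs with the sign bit set land before -inf. Python is used only to show the bit manipulation; the actual speedup would come from doing this in a compiled language, where integer compares (or a radix sort) are cheap.

```python
import struct

SIGN_BIT = 0x8000000000000000
ALL_BITS = 0xFFFFFFFFFFFFFFFF


def float_to_key(x: float) -> int:
    """Map a double to an unsigned 64-bit int whose integer order matches float order."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    if bits & SIGN_BIT:
        # Negative values: flip every bit so larger magnitudes get smaller keys.
        return bits ^ ALL_BITS
    # Non-negative values: set the top bit so they sort above all negatives.
    return bits | SIGN_BIT


def key_to_float(key: int) -> float:
    """Invert float_to_key."""
    if key & SIGN_BIT:
        bits = key & ~SIGN_BIT & ALL_BITS  # was non-negative: clear the top bit
    else:
        bits = key ^ ALL_BITS              # was negative: flip every bit back
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def sort_floats(values):
    """Convert once, sort as plain integers, then convert back."""
    keys = sorted(float_to_key(v) for v in values)
    return [key_to_float(k) for k in keys]
```

With this mapping, -0.0 sorts just before +0.0, which may or may not be what you want; a different NaN or zero policy would need a different mapping.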