burntsushi 5 days ago
Another challenge may come from using a portable SIMD API instead of ISA-specific instructions. I'm thinking specifically of computing the mask, which on x86-64 is, I assume, implemented via movemask. aarch64 lacks such an instruction, so you need other shenanigans to get the best codegen: https://github.com/BurntSushi/memchr/blob/ceef3c921b5685847e...
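To make the gap concrete, here is a rough sketch of the same "one bit per byte lane" mask computed three ways: a scalar reference, the single SSE2 movemask instruction on x86-64, and one possible NEON emulation on aarch64. The function names `movemask` and `movemask_scalar` are made up for illustration, and the NEON version (shift, weight, horizontal add) is just one straightforward way to do it. It is not necessarily what the linked memchr code does, and it is probably not the fastest option.

```rust
/// Scalar reference: bit i of the result is set when byte i has its high bit set.
fn movemask_scalar(bytes: [u8; 16]) -> u16 {
    let mut mask = 0u16;
    for (i, b) in bytes.iter().enumerate() {
        mask |= ((b >> 7) as u16) << i;
    }
    mask
}

#[cfg(target_arch = "x86_64")]
fn movemask(bytes: [u8; 16]) -> u16 {
    use std::arch::x86_64::*;
    // SSE2 is part of the x86-64 baseline, so no runtime feature check is needed.
    unsafe {
        let v = _mm_loadu_si128(bytes.as_ptr() as *const __m128i);
        // One instruction (pmovmskb) gathers the high bit of all 16 lanes.
        _mm_movemask_epi8(v) as u16
    }
}

#[cfg(target_arch = "aarch64")]
fn movemask(bytes: [u8; 16]) -> u16 {
    use std::arch::aarch64::*;
    // NEON is part of the aarch64 baseline, but it has no movemask equivalent,
    // so the mask is assembled from several instructions.
    unsafe {
        let v = vld1q_u8(bytes.as_ptr());
        // Reduce each lane to 0 or 1 (its high bit).
        let high_bits = vshrq_n_u8::<7>(v);
        // Shift lane i left by (i % 8) so each lane holds a distinct bit weight.
        let shifts: [i8; 16] = [0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7];
        let weighted = vshlq_u8(high_bits, vld1q_s8(shifts.as_ptr()));
        // Horizontally add each 8-lane half; the weights are distinct bits,
        // so each sum fits in a u8 without overflow.
        let lo = vaddv_u8(vget_low_u8(weighted)) as u16;
        let hi = vaddv_u8(vget_high_u8(weighted)) as u16;
        lo | (hi << 8)
    }
}

#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
fn movemask(bytes: [u8; 16]) -> u16 {
    // Fallback for other targets: use the scalar reference.
    movemask_scalar(bytes)
}
```

Calling `movemask` on a 16-byte array whose bytes 0, 3, and 15 have the high bit set should give `0b1000_0000_0000_1001` on every target. The point is that the x86-64 path is one instruction while the aarch64 path is a short sequence, which is the kind of difference a portable API has to paper over.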
ww520 5 days ago | parent
A language-level or standard-library-level polyfill for the SIMD operations missing on a given platform would be great.