ashvardanian 3 days ago
You can avoid hard-coding the whitespace symbols and use a generic byte-set search kernel instead, via `vpshufb` on AVX512BW-capable CPUs [1] or via `tbl` instructions on NEON-capable CPUs [2].

[1]: https://github.com/ashvardanian/StringZilla/blob/2f4b1386ca2...
[2]: https://github.com/ashvardanian/StringZilla/blob/2f4b1386ca2...
Sesse__ 3 days ago
You don't need AVX512BW for the shuffle; SSSE3 will do. (Of course, if you want wider registers, you'll need the newer extensions such as AVX2 or AVX512BW, but `vpshufb` there still shuffles within each 128-bit lane only, not cross-lane.)
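For anyone following along, here is a rough sketch of what such a byte-set kernel can look like with plain SSSE3 `pshufb`. It is not the StringZilla code from the links above; `byteset_t`, `byteset_init`, and `byteset_find` are hypothetical names made up for this illustration, and the table layout assumes the set only contains ASCII bytes (below 0x80). The idea is two 16-entry nibble tables: a byte is in the set iff `lo[b & 15] & hi[b >> 4]` is nonzero, and each of those lookups is one shuffle. The `__builtin_ctz` call assumes GCC or Clang; without `-mssse3` only the scalar loop is compiled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSSE3__)
#include <tmmintrin.h> /* _mm_shuffle_epi8, i.e. pshufb */
#endif

/* Hypothetical helper type: two nibble-indexed tables describing an ASCII byte set. */
typedef struct {
    uint8_t lo[16]; /* lo[l]: bit h is set iff byte ((h << 4) | l) is in the set */
    uint8_t hi[16]; /* hi[h]: 1 << h for h < 8, and 0 for non-ASCII high nibbles */
} byteset_t;

/* Build the tables from a NUL-terminated list of members; non-ASCII bytes are ignored. */
static void byteset_init(byteset_t *s, const char *members) {
    memset(s, 0, sizeof *s);
    for (int h = 0; h < 8; ++h) s->hi[h] = (uint8_t)(1u << h);
    for (; *members; ++members) {
        uint8_t b = (uint8_t)*members;
        if (b < 0x80) s->lo[b & 15] |= (uint8_t)(1u << (b >> 4));
    }
}

/* Return the index of the first byte of p[0..n) that is in the set, or n if none is. */
static size_t byteset_find(const byteset_t *s, const uint8_t *p, size_t n) {
    size_t i = 0;
#if defined(__SSSE3__)
    const __m128i lo_tbl = _mm_loadu_si128((const __m128i *)s->lo);
    const __m128i hi_tbl = _mm_loadu_si128((const __m128i *)s->hi);
    const __m128i nib = _mm_set1_epi8(0x0F);
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(p + i));
        __m128i lo_idx = _mm_and_si128(v, nib);
        /* There is no 8-bit shift, so shift 16-bit lanes and mask off the spill-over. */
        __m128i hi_idx = _mm_and_si128(_mm_srli_epi16(v, 4), nib);
        __m128i hit = _mm_and_si128(_mm_shuffle_epi8(lo_tbl, lo_idx),
                                    _mm_shuffle_epi8(hi_tbl, hi_idx));
        /* Lanes equal to zero are misses; invert the mask to get the matches. */
        int miss = _mm_movemask_epi8(_mm_cmpeq_epi8(hit, _mm_setzero_si128()));
        int match = ~miss & 0xFFFF;
        if (match) return i + (size_t)__builtin_ctz((unsigned)match);
    }
#endif
    /* Scalar tail (and the whole input when SSSE3 is not enabled at compile time). */
    for (; i < n; ++i)
        if (s->lo[p[i] & 15] & s->hi[p[i] >> 4]) return i;
    return n;
}
```

Swapping the set is then just a matter of calling `byteset_init(&s, " \t\n\r")` with a different member string; the search loop itself never mentions a specific character. Going to 256- or 512-bit registers would mean broadcasting both tables into every 128-bit lane, since the shuffle stays within lanes.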