Remix.run Logo
pkhuong a day ago

> Do you have examples for problems that are easier to solve in fixed-width SIMD?

Regular expression matching and encryption come to mind.

camel-cdr 21 hours ago | parent [-]

> Regular expression matching

That's probably true. Last time I looked at it, it seemed like parts of vectorscan could be vectorized VLA, but from my, very limited, understanding of the main matching algorithm, it does seem to require specialization on vector length.

It should be possible to do VLA in some capacity, but it would probably be slower and it's too much work to test.

> encryption

From the things I've looked at, it's mixed.

E.g. chacha20 and poly1305 vectorize well in a VLA scheme: https://camel-cdr.github.io/rvv-bench-results/bpi_f3/chacha2..., https://camel-cdr.github.io/rvv-bench-results/bpi_f3/poly130...

Keccak on the other hand was optimized for fast execution on scalar ISAs with 32 GPRs. This is hard to vectorize in general, because GPR "moves" are free and liberally applied.

Another example where it's probably worth specializing is quicksort, specifically the leaf part.

I've written a VLA version, which uses bitonic sort to sort within vector registers. I wasn't able to meaningfully compare it against a fixed size implementation, because vqsort was super slow when I tried to compile it for RVV.