codedokode 12 hours ago

> Which still means you have to write your code at least thrice, which is two times more than with a variable length SIMD ISA.

This is the wrong approach. You should be writing your code in a high-level language, like this:

    x = sum i for 1..n: a[i] * b[i]

And let the compiler write the assembly for every existing architecture (including a multi-threaded version of the loop).

I don't understand what the advantage of writing SIMD code manually is. At least have an LLM write it if you don't like my imaginary high-level vector language.

otherjason 6 hours ago

This is the common argument from proponents of compiler autovectorization. An example like yours is very simple, so modern compilers will turn it into SIMD code without a problem.
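For concreteness, here is a minimal C sketch of that dot product written as a plain scalar loop, the kind of code an autovectorizer is expected to handle. The function name `dot` is made up for illustration. One caveat: because this is a floating-point reduction, GCC and Clang generally vectorize the loop only when allowed to reorder the additions (for example with `-ffast-math`), so treat the flags as an assumption to check against your own toolchain.

```c
#include <stddef.h>

/* Hypothetical helper: scalar dot product, left to the compiler to vectorize.
 * `restrict` tells the compiler the two inputs do not alias. */
float dot(const float *restrict a, const float *restrict b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];  /* reduction: vectorizing it reorders the adds */
    }
    return sum;
}
```

Calling `dot(a, b, n)` gives the same result whether or not the loop was vectorized, up to floating-point rounding when reordering is enabled.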

In practice, though, the cases that compilers can successfully autovectorize are very limited relative to the total problem space that SIMD is solving. Plus, if I rely on autovectorization, I am vulnerable to regressions in the compiler's vectorizer.
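As an illustration of the limits, here are two short C loops of the kind commonly cited as hard for autovectorizers: a prefix sum, where each iteration depends on the previous one, and a search with a data-dependent early exit. The function names are hypothetical, and whether a particular compiler version vectorizes either loop is something to verify rather than assume, for example with GCC's `-fopt-info-vec-missed` or Clang's `-Rpass-missed=loop-vectorize`.

```c
#include <stddef.h>

/* Hypothetical example 1: inclusive prefix sum.
 * out[i] depends on out[i-1], a loop-carried dependency that a
 * straightforward vectorizer cannot split into independent lanes. */
void prefix_sum(const int *in, int *out, size_t n) {
    int running = 0;
    for (size_t i = 0; i < n; ++i) {
        running += in[i];
        out[i] = running;
    }
}

/* Hypothetical example 2: index of the first element above a threshold,
 * or -1 if there is none. The trip count depends on the data, which many
 * vectorizers have historically refused to handle. */
int find_first_above(const float *v, int n, float threshold) {
    for (int i = 0; i < n; ++i) {
        if (v[i] > threshold) {
            return i;
        }
    }
    return -1;
}
```

Both can be written efficiently with hand-written SIMD, but that requires restructuring the algorithm, not just rephrasing the loop.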

Ultimately, I would rather write the implementation myself and know what is being generated than try to write high-level code in just the right way to make the compiler generate what I want.
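To show what "writing it myself" can look like, here is a sketch of the same dot product with the vector width spelled out, using the GCC/Clang vector extension (`__attribute__((vector_size(...)))`) rather than any one architecture's intrinsics. This is an assumption-laden illustration, not anyone's production code: the names `v4f` and `dot_explicit` are invented, and the 4-lane width is an arbitrary choice.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 4-lane float vector type (GCC/Clang extension). */
typedef float v4f __attribute__((vector_size(16)));

float dot_explicit(const float *a, const float *b, size_t n) {
    v4f acc = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Main loop: four multiply-adds per iteration, one per lane. */
    for (; i + 4 <= n; i += 4) {
        v4f va, vb;
        memcpy(&va, a + i, sizeof va);  /* unaligned-safe load */
        memcpy(&vb, b + i, sizeof vb);
        acc += va * vb;
    }

    /* Horizontal sum of the four lanes. */
    float sum = acc[0] + acc[1] + acc[2] + acc[3];

    /* Scalar tail for the remaining 0..3 elements. */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Here the lane structure and the order of the additions are fixed by the source, so the generated code does not depend on whether a vectorizer recognizes the loop.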