Remix.run Logo
Someone 2 days ago

One example of such an optimization is that overflow checking can prevent vectorization of code.

See for example this post: https://lemire.me/blog/2016/12/06/dont-assume-that-safety-co.... It is ancient, but I don’t see a reason why it would have become outdated.

msichert 2 days ago | parent [-]

Vector instructions usually don't have overflow flags, so a compiler can't easily vectorize loops containing overflow checks. However, detecting overflows in integer operations requires only a bit of bitwise arithmetic. In my experiments, this lead to an overhead of only 7% for vectorized additions with overflow checks: https://cedardb.com/blog/vectorized_overflows/

ack_complete 2 days ago | parent [-]

That's with a simple data operation and using a recent x86 vector ISA (AVX-512) that is only available on some systems, notably excluding any current Intel desktop CPU.

The real killer isn't the data operations, though, it's if the overflow checks interfere with converting the loop logic or data addressing to vectorizable form. Indexing with 32-bit signed int vs. unsigned int on a 64-bit platform in C is a classic case -- with unsigned the compiler cannot assume that addressing offsets don't wrap, which then prevents coalescing data accesses into vector loads and stores.