dzaima 7 hours ago
SIMD only helps where you're arithmetic-limited; you may instead be limited by memory bandwidth, or perhaps by float division if applicable, and if the scalar code you compared against got autovectorized, you'd see roughly no benefit. AVX-512 should be just fine via intrinsics or high-level vector types, no different from AVX2 in this regard.
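To make that concrete, here's a rough sketch (not anyone's actual benchmark) of the same element-wise add written as a plain loop and with a GCC/Clang vector-extension type. The names `f32x8`, `add_scalar`, `add_vector` and `add_results_match` are made up for illustration. At `-O2`/`-O3` the compiler will typically autovectorize the scalar loop anyway, and for large arrays both versions do one add per 12 bytes of memory traffic, so neither should be expected to beat the other by much.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC/Clang vector extension: 8 floats (32 bytes). With -mavx2 this maps to
   one YMM register; on narrower targets the compiler splits it up. */
typedef float f32x8 __attribute__((vector_size(32)));

/* Plain scalar loop; compilers typically autovectorize this at -O2/-O3. */
void add_scalar(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Same operation using the high-level vector type explicitly. */
void add_vector(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        f32x8 va, vb;
        memcpy(&va, a + i, sizeof va);   /* unaligned load */
        memcpy(&vb, b + i, sizeof vb);
        f32x8 vc = va + vb;              /* lane-wise add */
        memcpy(out + i, &vc, sizeof vc); /* unaligned store */
    }
    for (; i < n; i++)                   /* scalar tail */
        out[i] = a[i] + b[i];
}

/* Hypothetical helper: returns 1 if both versions give identical output. */
int add_results_match(size_t n) {
    enum { MAX_N = 1024 };
    static float a[MAX_N], b[MAX_N], s[MAX_N], v[MAX_N];
    assert(n <= MAX_N);
    for (size_t i = 0; i < n; i++) {
        a[i] = (float)i;
        b[i] = 0.5f * (float)i;
    }
    add_scalar(a, b, s, n);
    add_vector(a, b, v, n);
    return memcmp(s, v, n * sizeof(float)) == 0;
}
```

Widening `f32x8` to 64 bytes (16 floats) and building with `-mavx512f` is the same code shape, which is the sense in which AVX-512 is no different from AVX2 here. To see whether the scalar loop got vectorized, look at the generated assembly rather than guessing.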