Because you are working in the cache.
Also, you should use SIMD.
> Also, you should use SIMD.

Ironically, no. Clang is better at auto-vectorizing.
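
For anyone who wants to check this on their own code, here is a minimal sketch of the kind of loop an auto-vectorizer can usually handle without hand-written intrinsics. The function name `scale_add` and its parameters are made up for illustration; they are not from the thread. Whether the loop actually gets vectorized depends on the compiler version, target, and flags.

```c
#include <stddef.h>

/* Hypothetical example kernel (not from the thread): y[i] += a * x[i].
 * `restrict` promises the compiler that x and y do not overlap, so it
 * does not need a runtime aliasing check before vectorizing the loop. */
void scale_add(float *restrict y, const float *restrict x, float a, size_t n)
{
    /* Simple counted loop, contiguous access, no cross-iteration
     * dependency: the shape auto-vectorizers are designed to recognize. */
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Compiling with something like `clang -O2 -Rpass=loop-vectorize -c file.c` should print a remark for each loop Clang vectorized, which is an easy way to confirm it before reaching for intrinsics.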