bjourne 16 hours ago
> "Loop unrolling also increases register pressure" -- it does, but code that really requires >32 registers is extremely rare, so a good instruction scheduler in the compiler can avoid spilling.

No, it actually is super common in HPC code. If you unroll a loop N times with independent accumulators, you need roughly N times as many registers to hold the live values. For normal memory-bound code I agree with you, but most HPC kernels will exploit as much of the register file as they can for blocking/tiling.
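
To make the register math concrete, here is a rough sketch of a register-blocked matmul micro-kernel. The function name, the 4x4 tile size, and the packed layout of `a` and `b` are my own assumptions for illustration, not taken from any particular BLAS. The point is that all 16 accumulators stay live across the whole `p` loop, so a compiler that fully unrolls the inner loops wants 16 scalar registers for `acc` alone, plus a few more for the loaded `a` and `b` values. An 8x8 tile would want 64 accumulators, though vectorizing cuts the count by the vector width.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 4x4 register-blocked micro-kernel: C[4x4] += A[4xk] * B[kx4].
 * Assumed layout: a holds k packed columns of 4 values (a[4*p + i]),
 * b holds k packed rows of 4 values (b[4*p + j]), c is row-major with
 * leading dimension ldc. */
void microkernel_4x4(size_t k, const double *a, const double *b,
                     double *c, size_t ldc)
{
    assert(a != NULL && b != NULL && c != NULL && ldc >= 4);

    /* 16 independent accumulators, all live for the entire p loop.
     * This is the tile that the kernel tries to keep in registers. */
    double acc[4][4] = {{0.0}};

    for (size_t p = 0; p < k; ++p) {
        /* Rank-1 update of the tile: 4 loads from a, 4 from b, 16 FMAs. */
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j)
                acc[i][j] += a[4 * p + i] * b[4 * p + j];
    }

    /* Write the tile back to memory once, after the whole k loop. */
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            c[i * ldc + j] += acc[i][j];
}
```

Bigger tiles reuse each loaded value more times per FMA, which is why these kernels grow the tile until the register file is nearly full rather than stopping at a handful of registers.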