IshKebab 17 hours ago
Those 4 counter instructions have no dependencies though, so they'll likely all be issued and executed in parallel in 1 cycle, surely? Probably the branch as well, in fact.
codedokode 17 hours ago
The load instruction depends on the counter increment, whereas with packed SIMD one can issue several loads without waiting. Also, the extra counter instructions still consume CPU resources (unless there is dedicated hardware for this purpose).
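
To make the dependency argument concrete, here is a rough scalar C sketch of the two loop shapes. It is only a model: the function names, the `chunk` parameter and the unroll factor of 4 are made up for illustration, and a real compiler or out-of-order core may well hide the difference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Strip-mined shape (hypothetical model of a variable-length vector loop):
   the pointer is bumped by a chunk size computed at run time, so the
   address of the next chunk's loads depends on the counter update. */
int64_t sum_strip_mined(const int32_t *a, size_t n, size_t chunk) {
    assert(chunk > 0);
    int64_t total = 0;
    while (n > 0) {
        size_t vl = n < chunk ? n : chunk;  /* stand-in for "set vector length" */
        for (size_t j = 0; j < vl; j++) {
            total += a[j];                  /* stand-in for one vector load + add */
        }
        a += vl;                            /* next load address waits on this update */
        n -= vl;                            /* remaining-element counter */
    }
    return total;
}

/* Fixed-width unrolled shape (hypothetical model of packed SIMD):
   the four loads use constant offsets from the same index, so none of
   them has to wait for another counter update before it can issue. */
int64_t sum_fixed_unrolled(const int32_t *a, size_t n) {
    assert(n % 4 == 0);                     /* assumption: no tail handling here */
    int64_t total = 0;
    for (size_t i = 0; i < n; i += 4) {
        int32_t x0 = a[i];
        int32_t x1 = a[i + 1];
        int32_t x2 = a[i + 2];
        int32_t x3 = a[i + 3];
        total += (int64_t)x0 + x1 + x2 + x3;
    }
    return total;
}
```

Both functions return the same sum. The difference is only in how each load address is derived: from a freshly updated pointer in the first case, and from constant offsets off a single index in the second.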