rep_lodsb 3 hours ago
There are separate opcodes for shift/rotate by 1, by CL, and by an immediate operand. Those are decoded to separate microcode entry points, so they could at least have optimized the "RCL/RCR x,1" case. And the microcode for bit test has to be different anyway.
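To make the opcode split concrete, here is a rough sketch in Python of how the count source falls out of the opcode byte alone. It only models the documented x86 encoding (C0/C1 for an immediate count, D0/D1 for a count of 1, D2/D3 for a count in CL, with the ModRM reg field selecting the operation). The `classify` helper and its return format are made up for illustration, and nothing here reflects the actual microcode.

```python
# Group-2 shift/rotate opcodes: the opcode byte alone says where the count
# comes from, so "by 1" is distinguishable before any operand is read.
COUNT_SOURCE = {
    0xC0: "imm8", 0xC1: "imm8",  # count is an immediate byte (186 and later)
    0xD0: "1",    0xD1: "1",     # count is the constant 1
    0xD2: "CL",   0xD3: "CL",    # count is taken from CL
}

# ModRM bits 5..3 (the "reg" field) select the operation.
# Assumption: /6 is shown as SHL, since it commonly behaves as an alias of /4.
OPERATIONS = ["ROL", "ROR", "RCL", "RCR", "SHL", "SHR", "SHL", "SAR"]


def classify(opcode, modrm):
    """Hypothetical helper: return (mnemonic, count source, operand size)."""
    if opcode not in COUNT_SOURCE:
        raise ValueError(f"{opcode:#04x} is not a group-2 shift/rotate opcode")
    mnemonic = OPERATIONS[(modrm >> 3) & 7]
    # Low opcode bit: 0 = byte operand, 1 = word/dword operand.
    size = "word/dword" if opcode & 1 else "byte"
    return mnemonic, COUNT_SOURCE[opcode], size


if __name__ == "__main__":
    print(classify(0xD1, 0xD0))  # D1 D0: RCL on the accumulator, count 1
    print(classify(0xD3, 0xD8))  # D3 D8: RCR on the accumulator, count in CL
    print(classify(0xC1, 0xE0))  # C1 E0 ib: SHL on the accumulator, imm8 count
```

Since the "by 1" forms have their own opcode bytes, a dedicated fast path for them would not need to inspect the count at all, which is the point being made about RCL/RCR x,1.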