Remix.run Logo
themafia 11 hours ago

XOR and SUB have had identical cycle counts and latencies since the 8088. That's because you can "look ahead" when doing carries in binary. It's just a matter of how much floorspace on the chip you want to use.

https://en.wikipedia.org/wiki/Carry-lookahead_adder

The only minor difference between the two on x86, really, is SUB sets OF and CF according to the result while XOR always clears them.

asQuirreL 10 hours ago | parent | next [-]

A carry lookahead adder makes your circuit depth logarithmic in the width of the inputs vs linear for a ripple carry adder, but that is still asymptotically worse than XORs constant depth.

(But this does not discount the fact that basically all CPUs treat them both as one cycle)

bonzini 5 hours ago | parent | prev [-]

OF/CF/AF are always cleared anyway by SUB r,r. So there's absolutely no difference.

themafia an hour ago | parent [-]

The point is OF/CF are sometimes dependent on the inputs for SUB. They never are for XOR.

bonzini 15 minutes ago | parent [-]

Ah, you mean in terms of complexity of the calculation. Thanks for clarifying.

In practice AF and CF can be computed from the carry out vector which is already available, and OF is a single XOR (of the two most significant bits of the carry out vector). The same circuitry works for XOR and SUB if the carry out vector of XOR is simply all zeroes.