tliltocatl 10 hours ago

It might be because XOR is rarely (in terms of static count, dynamically it surely appears a lot in some hot loops) used for anything else, so it is easier to spot and identify as "special" if you are writing manual assembly.

bonzini 2 hours ago | parent | next [-]

Indeed this is the best explanation!

stingraycharles 10 hours ago | parent | prev | next [-]

And helps with SMT

Edit: this is apparently not the case, see @tliltocatl's comment down the thread

tliltocatl 10 hours ago | parent [-]

What's SMT in this context?

recursivecaveat 10 hours ago | parent [-]

Simultaneous Multi-Threading (hyper-threading as Intel calls it). I'm not a cpu guy, but I think the ALU used for subtraction would be a more valuable resource to leave available to the other thread than whatever implements a xor. Hence you prefer to use the xor for zeroing and conserve the ALU for other threads to use.

tliltocatl 10 hours ago | parent | next [-]

I don't think that's how it works.

- Normally the ALU implements all "light" operations (i.e. add/sub/and/or/xor) in a single block; separating them would result in far more interconnect overhead. CPUs often have specialized adder-only units for address generation, but never a xor-specialized block.

- All CPUs that implement hyper-threading also optimize a XOR EAX,EAX into MOV EAX,ZERO/SET FLAGS (where ZERO is an invisible zero register just like on Itanium and RISCs). This helps register renaming and eliminates a spurious dependency.

- The XOR trick is about as old as 8086 if not older.
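To make the dependency point concrete, here is a minimal sketch in Python of how a decoder might treat the zeroing idiom. It is a toy model, not how any particular CPU is built: the function name `source_registers` and the set `ZERO_IDIOM_OPS` are made up for illustration, and treating `sub r, r` the same way as `xor r, r` is an assumption of the sketch.

```python
# Toy model only: the names below are hypothetical, not taken from any real CPU.
# Assumption: both xor and sub with identical operands count as zeroing idioms.
ZERO_IDIOM_OPS = {"xor", "sub"}


def source_registers(op, dst, src):
    """Return the set of registers a two-operand instruction truly reads."""
    if op in ZERO_IDIOM_OPS and dst == src:
        # r ^ r and r - r are 0 for every r, so the old value of r is
        # irrelevant and the instruction needs no input at all.
        return set()
    # Otherwise a two-operand instruction reads both of its operands.
    return {dst, src}


# The arithmetic fact behind the idiom, checked on a few 32-bit values.
for value in (0, 1, 0xDEADBEEF, 0xFFFFFFFF):
    assert value ^ value == 0
    assert (value - value) & 0xFFFFFFFF == 0

print(source_registers("xor", "eax", "eax"))          # set()
print(sorted(source_registers("xor", "eax", "ebx")))  # ['eax', 'ebx']
```

With no source registers, the instruction does not have to wait for whatever last wrote EAX, which is the spurious dependency mentioned above.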

Symmetry 3 hours ago | parent [-]

Right. Keeping down the number of slots the scheduler and bypass network need to worry about is an important design pressure.

fredoralive 10 hours ago | parent | prev | next [-]

By the time you get to a CPU complex enough to have SMT, it is likely to detect these “clear register” patterns and special-case them.

XOR would also be handled by the ALU, the L is for logic.

IshKebab 8 hours ago | parent | prev [-]

Most CPUs use the same ALU for xor and sub.

kunley 10 hours ago | parent | prev [-]

XOR appears a lot in any code touching encryption.

PS. What is static vs dynamic count?

tliltocatl 10 hours ago | parent [-]

Static count - how many times an instruction appears in a binary (or assembly source).

Dynamic count - how many times an opcode gets executed.

I.e. an instruction that doesn't appear often in code but comes up in some hot loops (like encryption) would have a low static count and a high dynamic count.
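A rough sketch of the difference in Python, using a made-up five-instruction program. The listing, the opcode names, and the helpers `static_counts` and `dynamic_counts` are all hypothetical and exist only to illustrate the two definitions.

```python
from collections import Counter

# Hypothetical toy program: (opcode, jump_target) pairs.
# "loop" jumps back to jump_target while the counter is still above zero.
PROGRAM = [
    ("mov", None),   # 0: setup, runs once
    ("xor", None),   # 1: loop body
    ("dec", None),   # 2: decrement the loop counter
    ("loop", 1),     # 3: back to index 1 while counter > 0
    ("ret", None),   # 4: runs once
]


def static_counts(program):
    """Count how many times each opcode appears in the listing."""
    return Counter(op for op, _ in program)


def dynamic_counts(program, iterations):
    """Count how many times each opcode is executed when the loop runs."""
    counts = Counter()
    counter = iterations
    pc = 0
    while pc < len(program):
        op, target = program[pc]
        counts[op] += 1
        if op == "dec":
            counter -= 1
        if op == "loop" and counter > 0:
            pc = target
        else:
            pc += 1
    return counts


print(static_counts(PROGRAM)["xor"])         # 1
print(dynamic_counts(PROGRAM, 1000)["xor"])  # 1000
```

Here xor appears once in the listing but is executed a thousand times, which is the low static, high dynamic case.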