Remix.run Logo
sparkie 2 days ago

In long mode, compilers will typically emit `xor eax, eax`, as it only needs 2 bytes: The opcode and modrm byte. `xor ax, ax` takes 3 bytes due to the operand size override prefix (0x66), and `xor rax, rax` takes 3 bytes due to the REX.W prefix. `xor eax, eax` will still clear the full 64-bit register.

Shorter basically means you can fit more in instruction cache, which should in theory improve performance marginally.

Someone 2 days ago | parent [-]

Size isn’t everything. You should start by reading the manual for your CPU to see what it advises. The micro-architecture may treat only one of the sequences specially. For modern x64, I think that indeed is the shorter xor sequence, where, internally, the CPU just renames the register to a register that always contains zero, making the instruction independent of any earlier instructions using eax.

IIRC, Intel said a mov was the way to go for some now ancient x86 CPUs, though.