Remix.run Logo
jhoechtl 10 hours ago

Back in the stone ages XOR ing was just 1 byte of opcode. Habbits stick. In effect XORing is no longer faster since a long time.

dragontamer 10 hours ago | parent | next [-]

The XOR trick is implemented as a (malloc from register file) on modern processors, implemented in the decoder and it won't even issue a uOp to the execution pipelines.

Its basically free today. Of course, mov RAX, 0 is also free and does the same thing. But CPUs have limited decoder lengths per clock tick, so the more instructions you fit in a given size, the more parallel a modern CPU can potentially execute.

So.... definitely still use XOR trick today. But really, let the compiler handle it. Its pretty good at keeping track of these things in practice.

-----------

I'm not sure if "sub" is hard-coded to be recognized in the decoder as a zero'd out allocation from the register file. There's only certain instructions that have been guaranteed to do this by Intel/AMD.

flohofwoe 10 hours ago | parent | prev | next [-]

Depending on what's stone-age for you, a SUB with a register was also only one byte, and was the same cost as XOR, at least in the Intel/Zilog lineage all the way back to the 70s ;)

NetMageSCW 4 hours ago | parent | prev [-]

The article’s point is about why XOR is preferred over SUB, both being one byte.

MOV is right out.