Symmetry 4 hours ago

That's all true, but on any modern x86 processor the single pair of gates for the xor and the 10 or so for a carry-bypass 64-bit-wide subtraction both complete with a single clock cycle of latency, so from a programmer's perspective they're the same in that sense. There's still an energy difference, but it's tiny compared to what even the register file and bypass network for the operation use, let alone the OoO structures.
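To make the gate-depth point concrete, here is a rough Python sketch (not a model of any real adder; `ripple_sub` and its borrow-chain counter are made up for illustration). It assumes the simplest possible subtractor, a ripple-borrow design, to show why subtraction needs extra logic such as carry-bypass while xor never depends on a neighbouring bit.

```python
MASK = (1 << 64) - 1  # 64-bit register width


def xor_self(x):
    # Each output bit depends only on the same input bit: no chain at all.
    return (x ^ x) & MASK


def ripple_sub(a, b, width=64):
    """Bit-serial a - b. Returns (result, longest run of consecutive borrows).

    Hypothetical helper: the run length stands in for how far a borrow
    would have to ripple in a naive subtractor.
    """
    result = 0
    borrow = 0
    chain = 0
    longest = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        diff = ai ^ bi ^ borrow
        # Borrow out when a's bit is smaller than b's, or when the bits
        # are equal and a borrow came in.
        borrow = ((ai ^ 1) & bi) | (((ai ^ bi) ^ 1) & borrow)
        result |= diff << i
        chain = chain + 1 if borrow else 0
        longest = max(longest, chain)
    return result, longest
```

For `sub r, r` the operands are equal, so no borrow is ever generated, but general-purpose subtract hardware still has to be built for the worst case (for example `0 - 1`, where the borrow crosses all 64 bits).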

masklinn 2 hours ago | parent | next [-]

The question is why one idiom won over the other, which happened a long time ago.

Because, as the article notes, on "any modern x86 processor" both xor r, r and sub r, r are handled by the frontend and have essentially no cost.

dataflow 3 hours ago | parent | prev | next [-]

The question isn't whether they both take a clock cycle, but rather whether any future implementation of the ISA might ostensibly find some sort of performance advantage, even if none do right now. From that standpoint, xor seems like a safer bet.

Symmetry an hour ago | parent | next [-]

There's been a lot of churn over the years, but additions being done in the same timeframe as XORs has been pretty constant. The Pentium 4 double-pumped its ALU, but both XORs and ADDs could happen with half-cycle latency. The POWER6 cut the FO4s of latency per stage from 16 to 10 and kept that parity as well. When you need 2 FO4s for latching between stages and 2 to handle clock jitter at high frequencies, the difference between what a XOR needs and what an ADD needs starts looking smaller, particularly when you include the circuitry to move the data and select the instruction. Maybe if we move to asynchronous circuits?

vablings 2 hours ago | parent | prev [-]

De facto standard. Compilers optimize for the CPU, and CPU uarch is now optimizing for compilers.

RiverCrochet 3 hours ago | parent | prev | next [-]

[dead]

idontwantthis 2 hours ago | parent | prev [-]

The blog post is about why this is idiomatic, not whether it needs to be done that way today. It’s idiomatic because once upon a time none of that existed and xor gates did. The author apparently never took intro to digital logic.