Remix.run Logo
flohofwoe 10 hours ago

That comment is not very useful without pointing to realworld CPUs where SUB is more expensive than XOR ;)

E.g. on Z80 and 6502 both have the same cycle count.

HarHarVeryFunny 6 hours ago | parent | next [-]

The 6502 doesn't support XOR A or SUB A, and in fact doesn't have a SUB opcode at all, only SBC (subtract with carry, requiring an extra opcode to set the carry flag beforehand).

flohofwoe 5 hours ago | parent [-]

I was handwaving over the details, SBC is identical to SUB when the carry flag is clear, so it's understandable why the 6502 designers didn't waste an instruction slot.

EOR and SBC still have the same cycle counts though.

HarHarVeryFunny 4 hours ago | parent [-]

Sure, in some contexts you would know that the carry flag was set or clear (depending on what you needed), and it was common to take advantage of that and not add an explicit clc or sec, although you better comment the assumption/dependency on the preceding code.

However the 6502 doesn't support reg-reg ALU operations, only reg-mem, so there simply is no xor a,a or sbc a,a support. You'd either have to do the explicit lda #0, or maybe use txa/tya if there was a free zero to be had.

brigade 10 hours ago | parent | prev | next [-]

Cortex A8 vsub reads the second source register a cycle earlier than veor, so that can add one cycle latency

Not scalar, but still sub vs xor. Though you’d use vmov immediate for zeroing anyway.

em3rgent0rdr 5 hours ago | parent | prev | next [-]

With more bits, then SUB is going to be more and more expensive to fit in the same number of clocks as XOR. So with an 8-bit CPU like Z80, it probably makes design sense to have XOR and SUB both take one cycle. But if for instance a CPU uses 128-bit registers, then the propagate-and-carry logic for ADD/SUB might take way much longer than XOR that the designers might not try to fit ADD/SUB into the same single clock cycle as XOR, and so might instead do multi-cycle pipelined ADD/SUB.

A real-world CPU example is the Cray-1, where S-Register Scalar Operations (64-bit) take 3 cycles for ADD/SUB but still only 1 cycle for XOR. [1]

[1] https://ed-thelen.org/comp-hist/CRAY-1-HardRefMan/CRAY-1-HRM...

7 hours ago | parent | prev | next [-]
[deleted]
GoblinSlayer 9 hours ago | parent | prev [-]

Harvard Mark I? Not sure why people think programming started with Z80.

bonzini 5 hours ago | parent | next [-]

The article is about x86, and x86 assembly is mostly a superset of 8080 (which is why machine language numbers registers as AX/CX/DX/BX, matching roughly the function of A/BC/DE/HL on the 8080—in particular with respect to BX and HL being last).

flohofwoe 5 hours ago | parent | prev [-]

My WW2-era assembly is a bit rusty, but I don't think the Harvard Mark 1 had bitwise logical operations?