Remix.run Logo
Extra Instructions Of The 65XX Series CPU (1996)(ffd2.com)
64 points by embedding-shape 14 hours ago | 12 comments
Dwedit 19 minutes ago | parent | next [-]

One way to use unofficial instructions is so you can use Read-Modify-Write instructions in addressing modes that the official instruction cannot be used in.

To understand, it helps if you write out the instruction table in columns, so here's the CMP and DEC instructions:

Byte C1: (add 4 to get to the next instruction in this table)

CMP X,ind [x indirect, read instruction's immediate value, add X, then read that pointer from zeropage, written like CMP ($nn,x)]

CMP zpg [zeropage, written like CMP $nn]

CMP # [immediate value, written like CMP #$nn]

CMP abs [absolute address, written like CMP $nnnn]

CMP ind,Y [indirect Y, read pointer from zeropage then add Y, written like CMP ($nnnn,Y)]

CMP zpg,X [zeropage plus X, add X to the zeropage address, written like CMP $nn,X]

CMP abs,Y [absolute address plus Y, add Y to the address, written like CMP $nnnn,Y]

CMP abs,X [absolute address plus X, add X to the address, written like CMP $nnnn,X]

So that's 8 possible addressing modes for this instruction.

Immediately afterwards:

Byte C2: (add 4 to get to the next instruction in this table)

???

DEC zpg

DEX

DEC abs

???

DEC zpg,X

???

DEC abs,X

That's 5 possible addressing modes. So where's "DEC X,ind", "DEC ind,Y", and "DEC abs,Y"? They don't exist.

Table for Byte C3 is 8 undocumented instructions that aren't supposed to be used. So people determined what the instruction did. Turns out, it's a combination of CMP and DEC, so people named the instruction "DCP".

Byte C3:

DCP X,ind

DCP zpg

???

DCP abs

DCP ind,Y

DCP zpg,X

DCP abs,Y

DCP abs,X

Unlike the "DEC" instruction, you have the "X,ind", "ind,Y", and "abs,Y" addressing modes available. So if you want to decrement memory, and don't care about your flags being correct (because it's also doing a CMP operation), you can use this DCP instruction.

Same idea with INC and SBC, you get the ISC instruction. For when you want to increment, and don't care about register A and flags afterwards.

JetSetIlly 5 hours ago | parent | prev | next [-]

A couple of threads on AtariAge are exploring the possibility of using the "unstable" opcodes in this group (ARR, etc.) as a sort of fingerprint. The hope is that the instability is a prediction of the specific model of CPU. To what end I'm not sure of yet, but it's interesting research all the same.

https://forums.atariage.com/topic/385516-fingerprinting-6502... https://forums.atariage.com/topic/385521-fingerprinting-6502...

djmips 22 minutes ago | parent | prev | next [-]

a good essay on how they work https://www.pagetable.com/?p=39

ruk_booze 9 hours ago | parent | prev | next [-]

A nice guide on how to actually put those ”illegal opcodes” into work is ”No More Secrets”

https://csdb.dk/release/?id=248511

gblargg 8 hours ago | parent [-]

That's a new name I hadn't heard that fits well: unintended opcodes. I also like unofficial. Undocumented isn't correct because these are quite well documented.

adrian_b 4 hours ago | parent [-]

They are well documented now, after reverse engineering.

The manufacturer did not document them, so they really were undocumented.

The same happened with many other CPUs, like Zilog Z80, Intel 8086 and the following x86 CPUs.

They all had undocumented instructions, which have been discovered by certain users through reverse engineering.

Some of the undocumented instructions were unintended, so they existed only due to cost-cutting techniques used in the design of the CPU, therefore the CPU manufacturer intended to remove them in future models and they had a valid reason to not document them.

However a few instructions that were undocumented for the public were documented for certain privileged customers, like Microsoft in the case of Intel CPUs, so they were retained in all future CPU models, for compatibility.

bonzini 2 hours ago | parent [-]

Not always. LOADALL was used heavily by Microsoft's HIMEM.SYS on the 286, but was not preserved on subsequent models.

rossjudson 13 hours ago | parent | prev [-]

Some of those crazy instructions were used for copy protection, back in the day. Those mystery page boundary overflows were entertaining.

embedding-shape 13 hours ago | parent [-]

Aah, that's much better and more realistic than my previous assumption that they were "government instructions", something used in the military and similar more secretive contexts, but I suppose they didn't use off-the-shelves components perhaps like today.

cyco130 12 hours ago | parent | next [-]

These instructions were not intentionally designed and put in there in secret. They're simply an unintended consequence of the "don't care" states of the instruction decoding logic.

The decoder is the part of the CPU that maps instruction opcodes to a set of control signals. For example "LDA absolute" (opcode 0xA5) would activate the "put the result in A" signal on its last cycle while "LDX absolute" (opcode 0xA6) would activate the "put the result in X" signal. The undocumented "LAX absolute" (opcode 0xA7) simply activates both because of the decoder logic's internal wiring, causing the result to be put in both registers. For other undocumented opcodes, the "do both of these things" logic is less recognizable but it's always there. Specifically disallowing these illegal states (to make them NOPs or raise an exception, for instance) would require more die space and push the price up.

See here[1] for example to get a sense of how opcode bits form certain patterns when arranged in a specific way.

  [1] https://www.nesdev.org/wiki/CPU_unofficial_opcodes
kimixa 13 hours ago | parent | prev [-]

I don't think they were "intended" for anything - it's just that was the state of the control lines after it decoded that instruction byte, and combination might do something somewhat sane.

Wiring all the "illegal" instructions to a NOP would have taken a fair bit of extra logic, and that would have been a noticeable chunk of the transistor budget at the time.

NobodyNada 12 hours ago | parent [-]

That's exactly right. There's a really good article about it here: https://www.pagetable.com/?p=39