kragen 5 hours ago

That doesn't seem right to me. If the problem is that it has too many instructions and addressing modes, you can decide to only use a small subset of those instructions and addressing modes, which really isn't much of a handicap for implementing functionality. (It doesn't help with analyzing existing code, but neither does a powerful assembler.)

I'm no expert on assembly-language programming, but probably 90% of the assembly I write on i386, amd64, RISC-V, and ARM uses about 40 instructions: ldr, mov, cmp, movs, push, pop, b/jmp, bl/blx/call, ret, str, beq/jz, bne/jnz, bhi/ja, bge/jge, cbz, stmia, ldmia, ldmdb, add/adds, addi, sub/subs, bx, xor/eor, and, or/orr, lsls/shl, lsrs/sar/shr, test/tst, inc, dec, lea, and slt, I think. Every once in a while you need a mul or a div or something. But the other 99% of the instruction set is either for optimized vectorized inner loops or for writing operating system kernels.

I think that the reason that i386 assembly (or amd64 assembly) is error-prone is something else, something it has in common with very simple architectures and instruction sets like that of the PDP-8.

yjftsjthsd-h 4 hours ago

> I think that the reason that i386 assembly (or amd64 assembly) is error-prone is something else, something it has in common with very simple architectures and instruction sets like that of the PDP-8.

What reason is that? (And, if it's not obvious, what are ARM/RISC-V doing that make them less bad?)

kragen 4 hours ago

I don't think they're particularly less bad. All five architectures just treat memory as a single untyped array of integers, and registers as integer global variables. Their only control structure is goto (and conditional goto). If you forget to pass an argument to a subroutine, or pass a pointer to an integer instead of the integer, or forget to save a callee-saved register you're using, or do a signed comparison on an unsigned integer, or deallocate memory and then keep using it, or omit or jump past the initialization of a local variable, conventional assemblers provide you no help at all in finding the bug. You'll have to figure it out by watching the program give you wrong answers, maybe single-stepping through it in a debugger.
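
To make two of those bug classes concrete, here is a rough Python sketch of a toy machine. It is not any real architecture: the names memory, add_one, below_unsigned, and less_signed are all made up for illustration, and the 32-bit word size is an assumption. It only shows that when every value is a bare integer, passing an address instead of the value, or picking the signed comparison for an unsigned quantity, produces a wrong answer rather than an error.

```python
MASK32 = 0xFFFFFFFF  # assumed word size: 32 bits

# Hypothetical toy machine: memory is one flat list of words, with no
# record of which words hold integers and which hold addresses.
memory = [0] * 16
memory[3] = 7  # an integer value stored at address 3


def add_one(word):
    """A 'subroutine' that expects an integer and returns it plus one."""
    return (word + 1) & MASK32


correct = add_one(memory[3])  # caller passes the integer stored at address 3
wrong = add_one(3)            # caller passes the address itself; nothing complains


def as_signed32(word):
    """Read a 32-bit pattern as a two's-complement signed integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word


def below_unsigned(a, b):
    """Unsigned compare, the kind ja/jb (x86) or bhi/blo (ARM) branch on."""
    return (a & MASK32) < (b & MASK32)


def less_signed(a, b):
    """Signed compare, the kind jl/jg (x86) or blt/bgt (ARM) branch on."""
    return as_signed32(a) < as_signed32(b)


length = 0xFFFFFFFF  # meant as a huge unsigned size
limit = 16

print(correct, wrong)                 # 8 4
print(below_unsigned(length, limit))  # False: the size really is too big
print(less_signed(length, limit))     # True: same bits read as -1, check passes
```

Both mistakes run to completion without complaint. A type checker in a higher-level language would typically reject the first, and many compilers can warn about the second, but the bits themselves carry no such information.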

There are various minor details of one architecture or the other that make them more or less bug-prone, but those are minor compared to what they have in common.

None of this is because the instruction sets are complex. It would be closer to the mark to say that it's because they are simple.