The Nintendo 64 RDP (graphics/memory controller) used 9-bit bytes.

This was done for graphics reasons: native antialiasing, if I understand it correctly. The CPU can't use the extra bit; it still only sees 8-bit bytes.

https://www.youtube.com/watch?v=DotEVFFv-tk (Kaze Emanuar - The Nintendo 64 has more RAM than you think)

To summarize the relevant part of the video: the RDP wants to store each pixel's color in 18 bits (5 bits red, 5 bits blue, 5 bits green, 3 bits triangle coverage). It then uses this coverage information to calculate a primitive but fast antialiasing. So SGI went with two 9-bit bytes for each pixel, plus some magic in the RDP (remember, it's also the memory controller) so the CPU sees the 8-bit bytes it expects.
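To make the 18-bit idea concrete, here is a rough sketch of how a 5/5/5/3 pixel could sit in two 9-bit bytes while the CPU only ever reads 8 bits of each. The exact bit positions, and every function name here, are my assumptions for illustration, not the documented hardware layout.

```python
# Hypothetical layout: names and bit positions are assumptions for illustration.

def pack_pixel(r, g, b, coverage):
    """Pack 5-5-5 color plus 3-bit coverage into two 9-bit bytes."""
    fields = (("r", r, 5), ("g", g, 5), ("b", b, 5), ("coverage", coverage, 3))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # 16 bits the CPU can see: 5 red, 5 green, 5 blue, top coverage bit.
    visible = (r << 11) | (g << 6) | (b << 1) | (coverage >> 2)
    # The two remaining coverage bits ride in the 9th bit of each byte.
    hi9 = (((coverage >> 1) & 1) << 8) | (visible >> 8)
    lo9 = ((coverage & 1) << 8) | (visible & 0xFF)
    return hi9, lo9


def cpu_view(byte9):
    """What the CPU would read: only the low 8 bits of a 9-bit byte."""
    return byte9 & 0xFF


def unpack_pixel(hi9, lo9):
    """What the RDP side could recover: all 18 bits, including coverage."""
    visible = (cpu_view(hi9) << 8) | cpu_view(lo9)
    coverage = ((visible & 1) << 2) | ((hi9 >> 8) << 1) | (lo9 >> 8)
    r = (visible >> 11) & 0x1F
    g = (visible >> 6) & 0x1F
    b = (visible >> 1) & 0x1F
    return r, g, b, coverage


hi9, lo9 = pack_pixel(31, 0, 31, 5)
print(hex(hi9), hex(lo9))                      # full 9-bit bytes
print(hex(cpu_view(hi9)), hex(cpu_view(lo9)))  # what the CPU sees
```

The point is only that the 9th bit of each byte can carry data that is invisible to anything reading plain 8-bit bytes.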

Memory on the N64 is very weird. It is basically the same idea as PCIe, but for main memory: PCI is a big fat bus that is hard to speed up, while PCIe is a small, narrow, super fast bus. So the CPU was clocked at 93 MHz, but the memory sat on a 9-bit bus clocked at 250 MHz. They were hoping this super fast narrow memory would be enough for everyone, but having the graphics chip also be the memory controller made the graphics very sensitive to memory load, to the point that the main thing that helps an N64 game get a higher frame rate is to have the CPU do as few memory lookups as possible. In practical terms, that means having it idle as much as possible. This has a strange side effect: while a common optimizing operation for most architectures is to trade calculation for memory(unroll loops, lookup tables...), on the N64 it can be the opposite. If you can make your code do more calculation with less memory, you can utilize the CPU better, because it is otherwise mostly sitting idle to give the RDP most of the memory bandwidth.
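As a toy illustration of that trade-off, here are two equivalent ways to reverse the bits of a byte: one reads a precomputed table, the other uses only shifts and masks. The function names are mine, and Python obviously doesn't model the N64 memory bus, so treat this as a sketch of the two styles rather than a benchmark.

```python
def reverse_bits_compute(x):
    """Reverse the 8 bits of x using only arithmetic (no table reads)."""
    x = ((x & 0xF0) >> 4) | ((x & 0x0F) << 4)  # swap nibbles
    x = ((x & 0xCC) >> 2) | ((x & 0x33) << 2)  # swap bit pairs
    x = ((x & 0xAA) >> 1) | ((x & 0x55) << 1)  # swap adjacent bits
    return x


# One-time 256-entry table; every later call costs one memory read.
REVERSE_TABLE = bytes(reverse_bits_compute(i) for i in range(256))


def reverse_bits_table(x):
    """Reverse the 8 bits of x with a single table lookup."""
    return REVERSE_TABLE[x]


print(hex(reverse_bits_table(0x01)), hex(reverse_bits_compute(0x01)))
```

On a machine with plenty of cache the table version is the usual pick. If the argument above holds, the compute version may be the better one on the N64, since it stays off the shared memory bus.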

fc417fc802 6 days ago

> a common optimizing operation for most architectures is to trade calculation for memory(unroll loops, lookup tables...)

That really depends. A cache miss adds eons of latency and is thus far worse than doing a few extra cycles of work, but depending on the workload, the reorder buffer might manage to negate the negative impact entirely. Memory bandwidth as a whole is also incredibly scarce relative to CPU clock cycles.

The only time it's a sure win is if you trade instruction count for data in registers or L1 cache hits but those are themselves very scarce resources.

01HNNWZ0MV43FF 6 days ago

Yeah but if the CPU can't use it then it's kinda like saying your computer has 1,000 cores, except they're in the GPU and can't run general-purpose branchy code

In fact, it's not even useful to say it's a "64-bit system" just because it has some 64-bit registers. It doesn't address more than 4 GB of anything ever

dspillett 5 days ago

> In fact, it's not even useful to say it's a "64-bit system" just because it has some 64-bit registers.

Usually the size of general purpose registers is what defines the bitness of a CPU, not anything else (how much memory it can address, data bus width, etc).

For instance, the 80386SX was considered a 32-bit CPU because its primary register set is 32-bit, despite the fact it had a 24-bit external address bus and a 16-bit external data bus (32-bit requests are split into two 16-bit requests; this was done to allow the chip to be used on cheaper motherboards such as those initially designed with the 80286 in mind).
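Roughly, the splitting looks like this. This is a sketch with a made-up function name; it only handles aligned accesses, and the order of the two bus cycles is my assumption.

```python
def split_32bit_write(address, value):
    """Return the two 16-bit bus transfers for one aligned 32-bit write."""
    if address % 4 != 0:
        raise ValueError("sketch only handles 4-byte aligned addresses")
    low_half = value & 0xFFFF
    high_half = (value >> 16) & 0xFFFF
    # x86 is little-endian, so the low half lives at the lower address.
    return [(address, low_half), (address + 2, high_half)]


for addr, half in split_32bit_write(0x1000, 0x12345678):
    print(hex(addr), hex(half))
```

Software still issues one 32-bit operation on 32-bit registers; only the bus interface sees two transfers.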

Note that this is for general purpose registers only: a chip may have 80-bit floating point registers in its FPU parts (supporting floating point with a 64-bit mantissa) but that doesn't make it an 80-bit chip. That was a bit more obvious when FPUs were external add-ons like the 8087 (the co-pro for the 16-bit 8086 family back in the day, which like current FPUs read & wrote IEEE754 standard 32- & 64-bit format floats and computed/held intermediate results in an extended 80-bit format).

mrob 5 days ago

>Usually the size of general purpose registers is what defines the bitness of a CPU

The Motorola 68000 has 32-bit registers, but it's usually considered a 16-bit CPU because it has a 16-bit ALU and a 16-bit data bus (both internal and external).

p_l 5 days ago

Motorola 68k is a curious case because it originally was supposed to be a 16bit cpu, not 32bit, and the 24bit addressing that ignored upper 8 bits didn't help the perception.
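A minimal sketch of what "ignored upper 8 bits" means in practice, assuming a simple mask model of the address bus (the names are mine):

```python
ADDRESS_MASK_24 = 0x00FFFFFF  # only 24 address lines leave the chip


def bus_address(pointer32):
    """Address actually driven onto the bus for a 32-bit pointer value."""
    return pointer32 & ADDRESS_MASK_24


# Two different 32-bit pointer values that hit the same memory location.
print(hex(bus_address(0x00001234)), hex(bus_address(0xFF001234)))
```

So the registers hold full 32-bit pointers, but the top byte never reaches memory.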

Ultimately, 68k being "16bit" is a marketing thing from home computers that upgraded from 8bit 6502 and the like to m68k but didn't use it fully.

dspillett 5 days ago

That is an odd case.

I'd still call it a 32-bit CPU as it had 32-bit registers and instructions (and not just a few special case 32-bit instructions IIRC). Like the 386SX it had a 16-bit external data bus, but some of its internal data routes were 16-bit also (where the 386SX had the full 32-bit core of a 386, later renamed 386DX, with the changes needed to change the external data bus), as were some of its ALUs, hence the confusion about its bit-ness.

p_l 5 days ago

In a way, the fact that you have the home computer market calling it 16bit, while at the same time you have workstation systems that plainly talk about a 32bit ISA, shows how much of a marketing issue it is :)

ddingus 5 days ago

Would you call the 6809 a 16 bit device?

dspillett 5 days ago

I'm not aware of that one off the top of my head. If it naturally operated over 16-bit values internally (i.e. it had 16-bit registers and a primarily 16-bit¹ instruction set), at least as fast as it could work with smaller units, then probably yes.

----

[1] So not a mostly 8-bit architecture with 16-bit add-ons. The 8086 had a few instructions that could touch 32 bits, multiply being able to give a 32-bit output from two 16-bit inputs for instance (though the output was always to a particular pair of its registers), but a few special cases like that don't count, so it is definitely 16-bit.
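For reference, the 8086 multiply case can be modeled like this: the unsigned 16-bit `MUL` leaves its 32-bit product in the DX:AX register pair. The function name is made up and this is only a model of the arithmetic.

```python
def mul16(ax, operand):
    """Model of unsigned 16x16 multiply: returns (dx, ax) holding the 32-bit product."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    dx = (product >> 16) & 0xFFFF  # high half of the result
    new_ax = product & 0xFFFF      # low half of the result
    return dx, new_ax


dx, ax = mul16(0xFFFF, 0xFFFF)
print(hex(dx), hex(ax))
```

Every register involved is still 16 bits wide; the 32-bit value only exists as a pair.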

ddingus 5 days ago

Well, the 6809 was basically the same in these respects.

Internal registers are 16 bit, with the accumulator (D) being provisioned as two 8 bit registers (A, B) as needed. Index X, Y, Stack, User Stack, PC, are all 16 bit registers.
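A small sketch of that accumulator arrangement, with the 16-bit accumulator (usually written D) modeled as A in the high byte and B in the low byte. The class name is hypothetical.

```python
class Accumulators6809:
    """Toy model: D is just A (high byte) and B (low byte) viewed together."""

    def __init__(self):
        self.a = 0
        self.b = 0

    @property
    def d(self):
        return (self.a << 8) | self.b

    @d.setter
    def d(self, value):
        self.a = (value >> 8) & 0xFF
        self.b = value & 0xFF


acc = Accumulators6809()
acc.d = 0x12AB
print(hex(acc.a), hex(acc.b))
acc.b = 0xFF      # writing B alone changes the low byte of D
print(hex(acc.d))
```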

The Hitachi 6309 adds to that with up to 32 bit register sizes in specific cases.

In any case, the ALU and data transfers are 8 bits and I am not sure I ever saw the 6809 referenced as a 16 bit device.

Maybe 16 bit curious, LMAO.

p_l 5 days ago

I'd say that it's a somewhat extended 8bit device, because it's still an 8bit-focused architecture (6800) with extensions towards better handling of 16bit values. Certain common parts involved, including the zero/direct page, are also effectively an increase in flexibility for 8bit code, not so much a move to 16bit.

That said, "16bit curious" is a great term :D

jama211 5 days ago

But in the N64's case the GPU _could_ use the extra bit, so it's fine. It was more trivia than anything about what the CPU could see.