Remix.run Logo
dist-epoch 2 hours ago

Do we really know that LEA is using the hardware memory address computation units? What if the CPU frontend just redirects it to the standard integer add units/execution ports? What if the hardware memory address units use those too?

It would be weird to have 2 sets of different adders.

adrian_b an hour ago | parent [-]

The modern Intel/AMD CPUs have distinct ALUs (arithmetic-logic units, where additions and other integer operations are done; usually between 4 ALUs and 8 ALUs in recent CPUs) and AGUs (address generation units, where the complex addressing modes used in load/store/LEA are computed; usually 3 to 5 AGUs in recent CPUs).

Modern CPUs can execute up to between 6 and 10 instructions within a clock cycle, and up to between 3 and 5 of those may be load and store instructions.

So they have a set of execution units that allow the concurrent execution of a typical mix of instructions. Because a large fraction of the instructions generate load or store micro-operations, there are dedicated units for address computation, to not interfere with other concurrent operations.

dist-epoch an hour ago | parent [-]

But can the frontend direct these computations based on what's available? If it sees 10 LEA instructions in a row, and it has 5 AGU units, can it dispatch 5 of those LEA instructions to other ALUs?

Or is it guaranteed that a LEA instruction will always execute on an AGU, and an ADD instruction always on an ALU?

adrian_b 34 minutes ago | parent [-]

This can vary from CPU model to CPU model.

No recent Intel/AMD CPU executes directly LEA or other instructions, they are decoded into 1 or more micro-operations.

The LEA instructions are typically decoded into either 1 or 2 micro-operations. The addressing modes that add 3 components are usually decoded into 2 micro-operations, like also the obsolete 16-bit addressing modes.

The AGUs probably have some special forwarding paths for the results towards the load/store units, which do not exist in ALUs. So it is likely that 1 of the up to 2 LEA micro-operations are executed only in AGUs. On the other hand, when there are 2 micro-operations it is likely that 1 of them can be executed in any ALU. It is also possible for the micro-operations generated by a LEA to be different from those of actual load/store instructions, so that they may also be executed in ALUs. This is decided by the CPU designer and it would not be surprising if LEAs are processed differently in various CPU models.