| fanf2 4 days ago |
| Apple’s ARM cores have wider decode than x86: M1 - 8 wide; M4 - 10 wide; Zen 4 - 4 wide; Zen 5 - 8 wide |
|
| adgjlsfhk1 3 days ago |
| Pure decoder width doesn't tell you everything. x86 has some commonly used, ridiculously compact instructions (e.g. lea) that would turn into 2-3 instructions on most other architectures. |
| |
| ajross 3 days ago | | The whole ModRM addressing encoding (to which LEA is basically a front end) is actually really compact, and compilers have gotten frighteningly good at exploiting it. Just look at the disassembly for some non-trivial code sometime and see what it's doing. | |
| monocasa 3 days ago | | Additionally, read-modify-write (RMW) instructions are each really equivalent to at least three, maybe four or five, RISC instructions. | |
| ack_complete 3 days ago | | Yes, but so does ARM. ld1 {v0.16b, v1.16b, v2.16b, v3.16b}, [x0], #64 loads 4 x 128-bit vector registers and post-increments a pointer register. | |
| kimixa 3 days ago | | Also the op cache: if a fetch hits in it, the decoder is skipped completely. |
|
|
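To make the instruction-count point in the replies above concrete, here is a minimal Python sketch of what a single LEA computes and what a memory-destination add does, each next to an equivalent sequence of single-purpose operations. It models arithmetic only, not any real decoder or pipeline. The function names (`lea`, `lea_as_simple_ops`, `rmw_add`) are hypothetical, and the three-op LEA split assumes an ISA with no fused shift-and-add; an ISA that has one would need two instructions.

```python
MASK64 = (1 << 64) - 1  # registers are modeled as 64-bit unsigned values


def _check_scale(scale):
    # x86 addressing allows only these scale factors
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")


def lea(base, index, scale, disp):
    """One x86-style LEA: base + index*scale + disp, with no memory access."""
    _check_scale(scale)
    return (base + index * scale + disp) & MASK64


def lea_as_simple_ops(base, index, scale, disp):
    """Same result built from three single-purpose ops (shift, add, add).

    Assumption: a hypothetical ISA without a fused shift-and-add.
    """
    _check_scale(scale)
    shift = scale.bit_length() - 1   # 1, 2, 4, 8 -> 0, 1, 2, 3
    t = (index << shift) & MASK64    # op 1: shift the index
    t = (t + base) & MASK64          # op 2: add the base
    return (t + disp) & MASK64       # op 3: add the displacement


def rmw_add(memory, addr, value):
    """Memory-destination add modeled as three steps: load, add, store."""
    tmp = memory[addr]               # op 1: load
    tmp = (tmp + value) & MASK64     # op 2: add
    memory[addr] = tmp               # op 3: store
    return tmp
```

For example, `lea(0x1000, 3, 8, 16)` and `lea_as_simple_ops(0x1000, 3, 8, 16)` both return `0x1028`. The "four or five" figure mentioned for RMW would come from also counting address computation, which this sketch leaves out by taking `addr` as already computed.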
| ryuuchin 3 days ago |
| Is Zen 5 more like a 4x2 than a true 8 since it has dual decode clusters and one thread on a core can't use more than one? https://chipsandcheese.com/i/149874010/frontend |
|
| wmf 4 days ago |
| Skymont - 9 wide |
|
| mort96 3 days ago |
| Wow, I had no idea we were up to 8 wide decoders in amd64 CPUs. |