epolanski 3 days ago

I have a hard time believing this fully: more custom instructions, more custom hardware, more heat.

How can you avoid it?

tapoxi 3 days ago | parent | next [-]

Since the Pentium Pro, the hardware hasn't implemented the ISA directly; instructions are converted into micro-ops.

epolanski 3 days ago | parent [-]

Come on, you know what I meant :)

If you want to support AVX, for example, you need 512-bit (or 256-bit) wide registers, dedicated ALUs, dedicated mask registers, etc.

Ice Lake implemented SHA-specific hardware units in 2019.

adgjlsfhk1 3 days ago | parent | next [-]

Sure, but Arm has Neon/SVE, which impose basically the same requirements for vector instructions, and most high-performance Arm implementations have a wide suite of crypto instructions (e.g. Apple's M-series chips have AES, SHA1 and SHA256 instructions).

camel-cdr 3 days ago | parent [-]

Neon has it worse, because it's harder to scale issue width than vector length.

Zen 5 has four-issue 512-bit ALUs; current Arm processors have been stuck at four-issue 128-bit for years.

Issue width scales quadratically, while vector length mostly scales linearly.

Intel decided it is easier to rewrite all performance-critical applications thrice than to go wider than four-issue SIMD.

It remains to be seen whether Arm is in a position to push software to adopt SVE, but currently it looks very bleak, with much of the little SVE code that's out there just assuming 128-bit SVE, because that's what all of the hardware is.
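To make the "assuming 128-bit SVE" point concrete, here is a rough Python sketch (a simulation, not real SVE intrinsics) of a vector-length-agnostic loop. The function name `vla_sum` and the `lanes` parameter are made up for illustration; `lanes` stands in for whatever vector length the hardware reports, and the `active` count mimics the tail predicate that SVE code would get from a `whilelt`-style instruction.

```python
def vla_sum(data, lanes):
    """Sum `data` in chunks of `lanes` elements, masking off the tail."""
    total = 0
    i = 0
    n = len(data)
    while i < n:
        # Predicate: how many lanes are active in this iteration.
        # Only the final chunk can be partially filled.
        active = min(lanes, n - i)
        total += sum(data[i:i + active])
        # Step by the reported vector length, not a hard-coded 4.
        i += lanes
    return total


data = list(range(10))
# 32-bit elements in 128-, 256- and 512-bit vectors: 4, 8 and 16 lanes.
results = [vla_sum(data, lanes) for lanes in (4, 8, 16)]
```

The same loop gives the same answer for every lane count. Code that hard-codes a step of 4 instead would only be correct on 128-bit hardware, which is the assumption being complained about above.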

toast0 3 days ago | parent | prev | next [-]

ARM has instructions for SHA, AES, vectors, etc. too. You pretty much have to pay the cost if you want the perf.

ryan-ca 3 days ago | parent | prev [-]

I think AVX is actually power gated when unused.

CyberDildonics 3 days ago | parent | prev [-]

The computation has to be done somehow; I don't know that it is a given that more available instructions mean more heat.