londons_explore 4 days ago

> Integer vector instructions and floating point vector instructions now have the same latencies.

There is very little reason to use integers for anything anymore. Loop counter? Why not make it a double - you never know when you might need an extra 0.5 loops at the end!

sushevff 4 days ago | parent | next [-]

Totally. Can’t wait to access the 18463.637th record in my database plus or minus a record or thousand.

vhcr 4 days ago | parent [-]

Doubles can represent integers exactly up to 2^52

mark-r 4 days ago | parent [-]

Actually because of the implied upper bit in the format, it can go to 2^53.
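
The 2^53 limit mentioned above is easy to check. A minimal sketch, assuming Python's `float` is an IEEE 754 double (true on mainstream platforms); `survives_double` is a made-up helper name for illustration:

```python
# Python floats are IEEE 754 binary64 doubles on mainstream platforms.
LIMIT = 2 ** 53  # 52 stored mantissa bits plus 1 implied leading bit


def survives_double(n: int) -> bool:
    """Return True if integer n round-trips through a double unchanged."""
    return int(float(n)) == n


print(survives_double(2 ** 52 + 1))  # True: still exactly representable
print(survives_double(LIMIT))        # True
print(survives_double(LIMIT + 1))    # False: rounds back to 2**53
```

Every integer up to 2^53 is representable; above that, the spacing between adjacent doubles becomes 2, so odd values start to get rounded.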

bee_rider 4 days ago | parent | prev | next [-]

Finally we can implement BiCGStab intuitively!

Intralexical 4 days ago | parent | prev [-]

Integers aren't for performance. They're for precision (anything financial for example) and occasionally size.
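
A small sketch of the precision point for money, assuming amounts are stored as integer cents (the variable names are illustrative only):

```python
# Dollar amounts as binary doubles: 0.10 and 0.20 have no exact representation.
float_total = 0.10 + 0.20
print(float_total == 0.30)  # False: the sum is 0.30000000000000004

# The same amounts as integer cents: addition is exact.
cents_total = 10 + 20
print(cents_total == 30)    # True
```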

crest 4 days ago | parent [-]

At least historically, integer operations also offered lower latency and higher throughput on CPUs. For decades, integer addition and bitwise logical operations have been the canonical single-cycle instructions that any microarchitecture could perform at least once per cycle without visible latency, while floating point operations and integer multiplication had multi-cycle latency, if they were even fully pipelined.

Zen 5 breaks several performance "conventions", e.g. AMD went directly from one to three complex scalar integer units (multiplication, PDEP/PEXT, etc.).

Intel effectively has two vector pipelines and the shortest instruction latency is a single cycle while Zen 5 has four pipelines with a two cycle minimum latency. That's a *very* different optimisation target (aim for eight instead of two independent instructions in flight) for low level SIMD code going forward despite an identical instruction set.
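
The "eight instead of two" figure follows from multiplying pipeline count by minimum latency. A back-of-the-envelope sketch using only the numbers quoted in the comment above (not measurements); `independent_ops_needed` is a hypothetical helper name:

```python
def independent_ops_needed(pipelines: int, latency_cycles: int) -> int:
    """Independent instructions needed in flight to keep every pipeline busy.

    Each pipeline can start one instruction per cycle, but a result is only
    available after latency_cycles, so dependent work cannot fill the gap.
    """
    return pipelines * latency_cycles


# Figures as stated in the comment, taken at face value.
print(independent_ops_needed(pipelines=2, latency_cycles=1))  # 2 (Intel)
print(independent_ops_needed(pipelines=4, latency_cycles=2))  # 8 (Zen 5)
```

In practice this is why low-level SIMD loops are often written with several independent accumulators rather than one long dependency chain.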