fweimer 7 hours ago

What language is this article talking about, where compilers don't optimize multiplication and division by powers of two? Even for division of signed integers, current compilers emit inline code that handles positive and negative values separately, still avoiding the division instruction (unless optimizing for size, of course).
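
For reference, here is a rough sketch in C of the kind of sequence compilers typically emit for signed division by 4: add a bias of 3 to negative inputs, then shift right arithmetically. The name `div4_shift` is made up for illustration, and the code assumes that `>>` on a negative `int32_t` is an arithmetic shift, which is implementation-defined in C but is what gcc and clang do.

```c
#include <stdint.h>

/* Hypothetical helper: truncating signed division by 4 without a divide.
   Assumes >> on negative values is an arithmetic (sign-extending) shift. */
int32_t div4_shift(int32_t x) {
    /* x >> 31 is all ones for negative x and zero otherwise,
       so bias is 3 for negative x and 0 for non-negative x. */
    int32_t bias = (x >> 31) & 3;
    /* The bias turns the shift's round-toward-negative-infinity
       into C's round-toward-zero. */
    return (x + bias) >> 2;
}
```

Without the bias, `-7 >> 2` would give -2, whereas `-7 / 4` in C is -1.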

shakow 6 hours ago | parent | next [-]

That's what I would have thought as well, but it looks like on x86 both clang and gcc use variations of LEA. If they're doing it this way, I'm pretty sure it must be faster, because even if you change the ×4 to a <<2, it will still generate a LEA.

https://godbolt.org/z/EKj58dx9T
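
A minimal pair for trying this in Compiler Explorer (not necessarily the code behind the link above; the function names are hypothetical, and unsigned operands are used so that the shift is well defined for every input). Going by the comment above, both would be expected to compile to the same LEA with gcc and clang on x86-64 at -O2.

```c
#include <stdint.h>

/* Hypothetical pair: the same operation spelled two ways. */
uint32_t times4_mul(uint32_t x) {
    return x * 4;      /* written as a multiplication */
}

uint32_t times4_shift(uint32_t x) {
    return x << 2;     /* written as a shift */
}
```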

shaggie76 5 hours ago | parent | next [-]

Not only is LEA more flexible, I believe it's also preferred to SHL even for simple operations, because it doesn't modify the flags register, which can make it easier to schedule.

Cold_Miserable 11 minutes ago | parent [-]

shlx doesn't alter the flags register either.

adrian_b 5 hours ago | parent | prev [-]

They use LEA for multiplying by small constants up to 9 (not only powers of two, but also 3, 5 and 9; even more values could be reached with two LEAs, but it may not be worthwhile).

For multiplying by powers of two greater than or equal to 16, they use a left shift, because LEA can no longer be used: its scale factor only goes up to 8.
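
To illustrate, here is a sketch in C of the arithmetic behind those constants. A single LEA computes base + index × scale with scale in {1, 2, 4, 8}, so using the same register as base and index gives ×3, ×5 and ×9. The function names are made up, and the code only shows the arithmetic; which instructions a compiler actually picks depends on the target and flags.

```c
#include <stdint.h>

/* Hypothetical helpers: each body has the form x + x * scale,
   the shape a single LEA can compute with one register used twice. */
uint32_t mul3(uint32_t x) { return x + (x << 1); }  /* x + x*2 */
uint32_t mul5(uint32_t x) { return x + (x << 2); }  /* x + x*4 */
uint32_t mul9(uint32_t x) { return x + (x << 3); }  /* x + x*8 */

/* 16 is beyond the largest LEA scale (8), so a plain shift is used. */
uint32_t mul16(uint32_t x) { return x << 4; }
```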

cjbgkagh 7 hours ago | parent | prev [-]

It was written in assembly, so it goes through an assembler instead of a compiler.

rawling 6 hours ago | parent [-]

I assume GP is talking about the bit in the article that goes

> RCT does this trick all the time, and even in its OpenRCT2 version, this syntax hasn’t been changed, since compilers won’t do this optimization for you.

cjbgkagh 5 hours ago | parent [-]

That makes more sense. I second their sentiment: modern compilers will do this. I guess the trick is knowing to pick numbers that allow these optimizations.

bombcar 5 hours ago | parent [-]

There was a recent article on HN about which compiler optimizations would occur and which wouldn't, and it was surprising in two ways. First, the compiler would make some optimizations you might not expect. Second, it would skip others you would expect, because they wouldn't be valid for some obscure calling path. Fixing that path would usually get the expected optimization.