20k 7 hours ago

Static inline functions can sometimes act as an optimisation barrier for compilers. It's very annoying. I've run into a lot of cases, when using C as a compilation target, where moving something into an always-inline function results in worse code generation, because compilers sadly have bugs.

There's also the issue that the following two things don't have the same semantics in C:

    float v = a * b + c;
vs

    static inline float get_thing(float a, float b) {
        return a*b;
    }

    float v = get_thing(a, b) + c;
This is just a C-ism (floating point contraction) that can make extracting things into always inlined functions still be a big net performance negative. The C spec mandates it sadly!

uintptr_t values don't actually have the same semantics as pointers either. E.g. if you write:

    void my_func(strong_type1* a, strong_type2* b);
the compiler can assume a != b (pointers to distinct types can't alias), and we can pull the underlying type out. However, if you write:

    void my_func(some_type_that_has_a_uintptr_t1 ap, some_type_that_has_a_uintptr_t2 bp) {
        float* a = get(ap);
        float* b = get(bp);
    }
a could equal b. Semantically, the uintptr_t version doesn't provide any aliasing guarantees. That may or may not be what you want, depending on your higher-level language's semantics, but it's worth keeping the distinction in mind because the compiler won't be able to optimise as well.
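
A rough sketch of the difference, with hypothetical handle types and `get1`/`get2` helpers standing in for the ones above. In `typed`, the compiler is entitled to assume the `int` store cannot change `*b`; in `via_handles`, both pointers are `float *`, so `*b` has to be reloaded after the store through `a`.

```c
#include <stdint.h>

/* Hypothetical wrappers around an integer-encoded address. */
typedef struct { uintptr_t bits; } handle1;
typedef struct { uintptr_t bits; } handle2;

static float *get1(handle1 h) { return (float *)h.bits; }
static float *get2(handle2 h) { return (float *)h.bits; }

/* Distinct pointee types: the store through a may be assumed not to touch *b. */
float typed(int *a, float *b) {
    *b = 1.0f;
    *a = 2;
    return *b; /* a compiler may fold this to 1.0f */
}

/* Same pointee type recovered from integers: a may equal b. */
float via_handles(handle1 ap, handle2 bp) {
    float *a = get1(ap);
    float *b = get2(bp);
    *b = 1.0f;
    *a = 2.0f;
    return *b; /* must be reloaded; 2.0f when both handles name the same float */
}
```

Passing two handles that encode the same address to `via_handles` is well defined and returns 2.0f, which is exactly the case the compiler has to keep working.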

kazinator 6 hours ago

The inline function receives the operands as arguments, and so whatever they are, they get converted to float. Thus the inline code is effectively like this:

  float v = (float) (((float) a) * ((float) b)) + c;
Since v is float, the cast representing the return conversion can be omitted:

  float v = ((float) a) * ((float) b) + c;
Now, if a and b are already float, then it's equivalent. Otherwise not; if they are double or int, we get double or int multiplication in the original open code.
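
To make the conversion point concrete, here is a small sketch with `double` operands, assuming IEEE 754 `float` and `double`. The names `open_coded` and `wrapped` are made up for illustration; `get_thing` is the function from the parent comment.

```c
static inline float get_thing(float a, float b) {
    return a * b;
}

/* Open-coded: a * b is a double multiplication, and c is promoted to double. */
float open_coded(double a, double b, float c) {
    float v = a * b + c;
    return v;
}

/* Wrapped: a and b are converted to float at the call, before multiplying. */
float wrapped(double a, double b, float c) {
    float v = get_thing(a, b) + c;
    return v;
}
```

For a = 1 + 2^-30, b = 2^30 and c = -2^30, `open_coded` gives 1 (the double product 2^30 + 1 is exact), while `wrapped` gives 0 because a has already been rounded to 1.0f by the time it is multiplied.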

jcranmer 6 hours ago

> Now, if a and b are already float, then it's equivalent.

Not necessarily! Floating-point contraction is allowed essentially within a single expression but not across statements. By assigning the result of a * b to a variable (or returning it from a function), you prevent the multiplication from being contracted with the addition into an FMA.

In practice, every compiler has fast-math flags that say "stuff it" and allow all of these optimizations to occur across statements and even across inline boundaries.

(Then there's also the issue of FLT_EVAL_METHOD, another area where what the standard says and what compilers actually do are fairly diametrically opposed.)

kazinator 4 hours ago

The first mention of contraction in the standard (I'm looking at the N3220 draft that I have handy) is:

> A floating expression may be contracted, that is, evaluated as though it were a single operation, thereby omitting rounding errors implied by the source code and the expression evaluation method. The FP_CONTRACT pragma in <math.h> provides a way to disallow contracted expressions. Otherwise, whether and how expressions are contracted is implementation-defined.

If you're making a language that generates C, it's probably a good idea to pin down which C compilers are supported, and control the options passed to them. Then you can more or less maintain the upper hand on issues like this.

garaetjjte 6 hours ago

It seems to me that you either want to allow contraction everywhere or not at all. Allowing it only sometimes is the worst of both worlds.

jcranmer 4 hours ago

If you allow contraction after inlining, whether or not an FMA gets formed becomes subject to the vicissitudes of inlining and other compiler decisions that can be hard to predict. It turns out to be a much harder problem to solve than it appears at first glance.

quotemstr 7 hours ago

Compiler bugs and standards warts suck, but you know what sucks more? Workarounds for compiler bugs and edge cases that become pessimizing folk wisdom that we can dispel only after decades, if ever. It took about that long to convince the old guards of various projects that we could have inline functions instead of macros. I don't want to spook them into renewed skepticism.

thomasahle 4 hours ago

Maybe they just checked with a compiler and got the same code?