unwind 4 days ago

Very nice, although I think the level of "trickery" with the macros becomes a bit much. I do understand that is The Way in C (I've written C for 30 years), it's just not something I'd do very often.

Also, from a strictly prose point of view, isn't it strange that the `clz` instruction doesn't actually appear in the 10-instruction disassembly of the indexing function? It feels like it was optimized out by the compiler perhaps due to the index being compile-time known or something, but after the setup and explanation that was a bit jarring to me.

mananaysiempre 4 days ago | parent | next [-]

The POSIX name for the function is clz() [the C23 name is stdc_leading_zeros(), because that's how the committee names things now, while the GCC intrinsic is __builtin_clz()]. The name of the x86 instruction, on the other hand, is BSR (80386+) or LZCNT (Haswell+, K10+), depending on what semantics you want for zero inputs (keep in mind that early implementations of BSF/BSR are very slow and take time proportional to the output value). The compiled code uses BSR. (All of these are specified slightly differently; take care if you plan to actually use them.)
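To make the "specified slightly differently" point concrete, here is a minimal sketch of a wrapper that is defined for every input. It assumes GCC or Clang (for `__builtin_clz`, whose result is undefined for zero) and a 32-bit `unsigned int`; the name `leading_zeros32` is made up for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

/* Assumption: unsigned int is 32 bits wide, as on mainstream x86 and Arm ABIs. */
static_assert(sizeof(unsigned int) * CHAR_BIT == 32, "expects 32-bit unsigned int");

/* Count leading zero bits of x, defined for every input.
 * __builtin_clz(0) is undefined, so zero is handled explicitly and
 * returns 32, which matches what LZCNT reports for a zero operand. */
static inline unsigned leading_zeros32(uint32_t x)
{
    if (x == 0)
        return 32;
    return (unsigned)__builtin_clz(x);
}
```

If the caller can guarantee a nonzero argument, the branch can be dropped and the intrinsic used directly.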

unwind 4 days ago | parent [-]

Got it, thanks. I suck at x86, even more than I thought. :/

Edit: it's CLZ on Arm [1], probably what I was looking for.

[1]: https://developer.arm.com/documentation/100069/0610/A32-and-...

mananaysiempre 3 days ago | parent [-]

In that case, I suck at POSIX: I could’ve sworn clz() was a standard or at least a conventional function, but no, that’s fls(), which is furthermore not universal across Unices. Either way, if you don’t feel your knowledge of the x86 instruction set is adequate, there’s always the option of taking an instruction listing and looking up anything that seems unfamiliar. It’s surprisingly effective. (You can either use an old listing[1] or skip all the vector stuff in a new one[2].)

[1] https://web.archive.org/web/20000819070651/http://www.quanta...

[2] https://www.felixcloutier.com/x86/

kilpikaarna 3 days ago | parent | prev | next [-]

> The Way in C

Is it though? (Ab)using C macros so you can write obviously-not-C stuff like (from the example):

SegmentArray(Entity) entities = {0};

Seeing that kind of thing in example C code just makes my hair stand on end because you know it's someone who actually wants to write C++ but for whatever reason has decided to try to implement their thing in C and be clever about it. And I'm going to have to go parse through multiple levels of macro indirection to just understand what the hell is going on.

Seems like a useful data structure, despite the shortcoming that it can't be accessed like a regular array. Normally auto-expanding arrays involve realloc, which is tricky with arena allocation. But jeez, just pass void pointers + size and have it assert if there's a mismatch.
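A rough sketch of what the "void pointers + size" style could look like. Everything here (`DynArray`, `dyn_push`, `dyn_get`, and so on) is a hypothetical name, not the article's API, and the storage is a plain realloc-grown buffer standing in for the segmented, arena-backed storage, just to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical untyped growable array: the element size is stored once
 * and every call re-states it so a mismatch trips an assert. */
typedef struct {
    void  *data;
    size_t elem_size;
    size_t count;
    size_t capacity;
} DynArray;

static void dyn_init(DynArray *a, size_t elem_size)
{
    a->data = NULL;
    a->elem_size = elem_size;
    a->count = 0;
    a->capacity = 0;
}

/* Copies *elem into the array; returns the new slot, or NULL if out of memory. */
static void *dyn_push(DynArray *a, const void *elem, size_t elem_size)
{
    assert(elem_size == a->elem_size);   /* catch type mix-ups at run time */
    if (a->count == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : 8;
        void *p = realloc(a->data, new_cap * a->elem_size);
        if (p == NULL)
            return NULL;
        a->data = p;
        a->capacity = new_cap;
    }
    void *slot = (char *)a->data + a->count * a->elem_size;
    memcpy(slot, elem, a->elem_size);
    a->count++;
    return slot;
}

/* Returns a pointer to element `index`; the caller casts it to the element type. */
static void *dyn_get(const DynArray *a, size_t index, size_t elem_size)
{
    assert(elem_size == a->elem_size);
    assert(index < a->count);
    return (char *)a->data + index * a->elem_size;
}

static void dyn_free(DynArray *a)
{
    free(a->data);
    dyn_init(a, a->elem_size);
}
```

Usage would be along the lines of `dyn_push(&entities, &e, sizeof e)` and `Entity *p = dyn_get(&entities, i, sizeof(Entity))`. The trade-off against the macro version is that mismatches are caught at run time rather than compile time.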

exDM69 3 days ago | parent | prev [-]

> Also, from a strictly prose point of view, isn't it strange that the `clz` instruction

It's using the `bsr` instruction, which is similar (but worse). The `lzcnt` instruction on x86_64 arrived on Intel with Haswell, alongside the BMI extensions (AMD has had it since K10). The compiler does not generate it by default, so that the output runs on any x86_64 CPU.

If you add `-mlzcnt` or `-march=haswell` (or newer) to the compiler command line, you should get `lzcnt` instead.
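For reference, the relationship between the two instructions for a nonzero 32-bit input: `bsr` yields the index of the highest set bit, `lzcnt` yields the number of zero bits above it, so the two results add up to 31. A small sketch, assuming GCC or Clang and a 32-bit `unsigned int` (`highest_set_bit` is a made-up name):

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit of x (0..31), i.e. what BSR computes.
 * Undefined for x == 0, just like BSR's destination, so assert on it. */
static inline unsigned highest_set_bit(uint32_t x)
{
    assert(x != 0);
    return 31u - (unsigned)__builtin_clz(x);
}
```

Compiling this with and without the flags above and comparing the disassembly should show which of the two instructions was picked.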