Rochus 10 hours ago
The benchmark demonstrates that the modern C++ lambda approach (creating a unique struct with fields for the captured variables) is effectively a static link computed at compile time. Because the compiler sees the entire definition, it can flatten the "link" into direct member access, which is why it wins. The performance penalty the author sees in GCC is partly due to the OS/CPU overhead of managing executable stacks, not just inefficient code. The author correctly identifies that C is missing a primitive that low-level languages perfected decades ago: the bound method (wide) pointer.

The most striking surprise is the size of the gap between std::function and std::function_ref. It turns out that std::function (the owning container) forces copy-by-value semantics deep into the recursion. In the "Man-or-Boy" test, this apparently causes an exponential explosion of closure-state copies, one at every recursive step. std::function_ref (the non-owning view) avoids this entirely.
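To make the contrast concrete, here is a rough sketch of the Man-or-Boy test written both ways. It is not the benchmark's code: `IntFnRef` is a hypothetical hand-rolled stand-in for a non-owning view (a code pointer plus an environment pointer, i.e. a wide pointer), used because std::function_ref is a C++26 feature that many toolchains do not ship yet, and `Thunk` is a hand-written closure struct of the kind a lambda expands to. All names here are assumptions for illustration.

```cpp
#include <functional>

// Owning version: every argument is a std::function passed by value,
// so each recursive call copies the closures it forwards.
using Fn = std::function<int()>;

int A_own(int k, Fn x1, Fn x2, Fn x3, Fn x4, Fn x5) {
    // B captures the enclosing frame by reference and passes itself along.
    Fn B = [&]() -> int { --k; return A_own(k, B, x1, x2, x3, x4); };
    return k <= 0 ? x4() + x5() : B();
}

int man_or_boy_owning(int k) {
    return A_own(k, [] { return 1; }, [] { return -1; },
                    [] { return -1; }, [] { return 1; }, [] { return 0; });
}

// Hypothetical non-owning view: a "wide pointer" made of a code pointer
// and an environment pointer. It never copies or owns the callable.
class IntFnRef {
    void* obj_;
    int (*call_)(void*);
    IntFnRef(void* o, int (*c)(void*)) : obj_(o), call_(c) {}
public:
    template <class F>
    static IntFnRef from(F& f) {
        // The captureless lambda converts to a plain function pointer.
        return IntFnRef(&f, [](void* o) -> int { return (*static_cast<F*>(o))(); });
    }
    int operator()() const { return call_(obj_); }
};

int A_view(int k, IntFnRef x1, IntFnRef x2, IntFnRef x3, IntFnRef x4, IntFnRef x5) {
    // Hand-written closure struct: one field per captured variable.
    struct Thunk {
        int& k;
        IntFnRef x1, x2, x3, x4;
        int operator()() {
            --k;
            return A_view(k, IntFnRef::from(*this), x1, x2, x3, x4);
        }
    };
    Thunk b{k, x1, x2, x3, x4};
    return k <= 0 ? x4() + x5() : b();
}

int man_or_boy_view(int k) {
    // The callables must outlive the views that point at them.
    auto one = [] { return 1; };
    auto neg = [] { return -1; };
    auto zero = [] { return 0; };
    return A_view(k, IntFnRef::from(one), IntFnRef::from(neg),
                     IntFnRef::from(neg), IntFnRef::from(one), IntFnRef::from(zero));
}
```

Both `man_or_boy_owning(10)` and `man_or_boy_view(10)` return -67, the known result for k = 10. The view version only ever passes two pointers per argument, and that works here because every closure is a downward funarg: the frame that owns it is still live whenever it is called.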
gpderetta 10 hours ago | parent
Even if you never copy the std::function, the overhead is very large. GCC (14 at least) does not seem to be able to elide the allocation, nor inline the function itself, even if the std::function is invoked immediately after construction and the object never escapes the enclosing function. Given the opportunity, GCC seems to be able to completely remove one layer of function_ref, but fails at two layers.
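For anyone who wants to check this themselves, the three cases look roughly like the following. This is a sketch rather than the exact code behind the observation: `LongView` and `make_view` are hypothetical stand-ins for function_ref, and the three captured `long`s are there on the assumption that they push the closure past the library's small-buffer size so that std::function has to allocate. Compile at -O2 and compare the generated assembly of the three functions.

```cpp
#include <functional>

// Hypothetical minimal non-owning view (stand-in for function_ref).
struct LongView {
    void* obj;
    long (*call)(void*, long);
    long operator()(long y) const { return call(obj, y); }
};

template <class F>
LongView make_view(F& f) {
    return LongView{&f, [](void* o, long y) -> long {
        return (*static_cast<F*>(o))(y);
    }};
}

// Case 1: owning std::function, built and called in the same scope,
// never escaping it.
long via_function(long a, long b, long c) {
    std::function<long(long)> f = [a, b, c](long y) { return a + b + c + y; };
    return f(1);
}

// Case 2: one layer of non-owning view over a local lambda.
long via_one_view(long a, long b, long c) {
    auto lam = [a, b, c](long y) { return a + b + c + y; };
    LongView v = make_view(lam);
    return v(1);
}

// Case 3: a view over a view, i.e. two layers of indirection.
long via_two_views(long a, long b, long c) {
    auto lam = [a, b, c](long y) { return a + b + c + y; };
    LongView inner = make_view(lam);
    LongView outer = make_view(inner);
    return outer(1);
}
```

All three compute the same value, so any difference in the emitted code is purely a matter of what the optimizer manages to see through.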