gpderetta 10 hours ago
Even if you never copy the std::function, the overhead is very large. GCC (14 at least) does not seem to be able to elide the allocation or inline the call, even when the std::function is invoked immediately after construction and the object never escapes the enclosing function. Given the opportunity, GCC seems to be able to completely remove one layer of function_ref, but fails at two layers.
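A rough way to look at this is to compile a few small functions that compute the same value through different amounts of type erasure and compare the generated code, for example at -O2. The sketch below is only an illustration: `fn_ref` is a hypothetical, hand-rolled non-owning reference (not `std::function_ref`), and the helper names are made up for this example.

```cpp
#include <functional>

// Hypothetical minimal non-owning callable reference, for illustration only.
// It is not std::function_ref and supports only lvalue function objects.
template <class Signature>
class fn_ref;

template <class R, class... Args>
class fn_ref<R(Args...)> {
    const void* obj_;                  // points at the caller-owned callable
    R (*call_)(const void*, Args...);  // trampoline that restores the type

public:
    template <class F>
    fn_ref(F& f) noexcept
        : obj_(&f),
          call_([](const void* o, Args... args) -> R {
              return (*static_cast<F*>(const_cast<void*>(o)))(args...);
          }) {}

    R operator()(Args... args) const { return call_(obj_, args...); }
};

// Baseline: the lambda is called directly, no type erasure.
int direct(int x) {
    auto add = [x](int y) { return x + y; };
    return add(1);
}

// Owning type erasure: constructed, called once, never copied, never escapes.
int via_std_function(int x) {
    std::function<int(int)> f = [x](int y) { return x + y; };
    return f(1);
}

// One layer of non-owning type erasure.
int via_one_ref(int x) {
    auto add = [x](int y) { return x + y; };
    fn_ref<int(int)> r(add);
    return r(1);
}

// Two layers: `inner` is a non-const lvalue, so the constructor template
// (with F = fn_ref<int(int)>) is a better match than the implicit copy
// constructor, and `outer` refers to `inner` instead of copying it.
int via_two_refs(int x) {
    auto add = [x](int y) { return x + y; };
    fn_ref<int(int)> inner(add);
    fn_ref<int(int)> outer(inner);
    return outer(1);
}
```

All four functions return `x + 1`; what differs is how much of the indirection a given compiler and standard library manage to remove, which is best checked in the assembly rather than assumed.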
Rochus 9 hours ago
This is exactly right, and the "Man-or-Boy" benchmark hits the worst-case scenario for libstdc++ specifically. The optimization fails here. My "copy-by-value" comment refers to the ownership semantics. Since std::function owns its storage, and the Man-or-Boy recursion passes the closure into the next layer (often by value or by capturing it into a new closure), we trigger the copy constructor. If the small buffer optimization (SBO) limit is exceeded, that copy constructor performs a new heap allocation and a deep copy of the state.
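The ownership point can be made visible by counting how often the captured state is copied. This is a minimal sketch, not the Man-or-Boy code itself: `BigState`, the counter, and the helper functions are hypothetical names for this example, and the state is deliberately made large enough that it is unlikely to fit in any implementation's inline buffer.

```cpp
#include <functional>

static int g_state_copies = 0;  // counts copy constructions of BigState

// Hypothetical captured state with a copy constructor that counts its calls.
struct BigState {
    int value = 7;
    char padding[64] = {};
    BigState() = default;
    BigState(const BigState& other) : value(other.value) { ++g_state_copies; }
};

// Taking std::function by value copies the wrapper and the state it owns.
int call_by_value(std::function<int()> f) { return f(); }

// Taking it by const reference leaves the owned state untouched.
int call_by_reference(const std::function<int()>& f) { return f(); }

// Number of BigState copies caused by one by-value call.
int copies_when_passed_by_value() {
    BigState state;
    std::function<int()> f = [state] { return state.value; };
    const int before = g_state_copies;  // ignore copies made while building f
    call_by_value(f);
    return g_state_copies - before;
}

// Number of BigState copies caused by one by-reference call.
int copies_when_passed_by_reference() {
    BigState state;
    std::function<int()> f = [state] { return state.value; };
    const int before = g_state_copies;
    call_by_reference(f);
    return g_state_copies - before;
}
```

Passing by reference avoids the copy for a single call, but it does not help when the closure is captured by value into a new closure at every recursion level, which is the pattern described above.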
boris 10 hours ago
GCC (libstdc++), like all other major C++ runtimes (libc++, MSVC), implements the small object optimization for std::function, where a small enough callable is stored directly in std::function's state instead of on the heap. Across these implementations, you can rely on being able to capture two pointers without a dynamic allocation.
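One way to check the two-pointer claim on a particular toolchain is to replace the global `operator new` with a counting version and measure around the construction. The sketch below assumes a typical 64-bit build; the counter and function names are hypothetical, and the result for the large capture depends on the implementation and on whether the optimizer elides the allocation.

```cpp
#include <array>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

static std::size_t g_allocations = 0;  // bumped by every global operator new

// Replacement global allocation functions that count calls.
void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Allocations performed while wrapping a lambda that captures two pointers.
std::size_t allocations_for_two_pointers() {
    int a = 1, b = 2;
    int* pa = &a;
    int* pb = &b;
    const std::size_t before = g_allocations;
    std::function<int()> f = [pa, pb] { return *pa + *pb; };
    const std::size_t after = g_allocations;
    (void)f();
    return after - before;
}

// Allocations performed while wrapping a lambda with a 128-byte capture,
// which is larger than the inline buffers the comment refers to.
std::size_t allocations_for_large_capture() {
    std::array<char, 128> big{};
    const std::size_t before = g_allocations;
    std::function<int()> f = [big] { return static_cast<int>(big.size()); };
    const std::size_t after = g_allocations;
    (void)f();
    return after - before;
}
```

With the small object optimization in effect, `allocations_for_two_pointers()` should report zero, while the large capture is expected to fall back to the heap.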