Remix.run Logo
Animats 5 days ago

Right. Copying is very fast on modern CPUs, at least up to the size of a cache line. Especially if the data being copied was just created and is in the L1 cache.

If something is const, whether to pass it by reference or value is a decision the compiler should make. There's a size threshold, and it varies with the target hardware. It might be 2 bytes on an Arduino and 16 bytes on a machine with 128-bit arithmetic. Or even as big as a cache line. That optimization is reportedly made by the Rust compiler. It's an old optimization, first seen in Modula 1, which had strict enough semantics to make it work.

Rust can do this because the strict affine type model prohibits aliasing. So the program can't tell if it got the original or a copy for types that are Copy. C++ does not have strong enough assurances to make that a safe optimization. "-fstrict-aliasing" enables such optimizations, but the language does not actually validate that there is no aliasing.

If you are worried about this, you have either used a profiler to determine that there is a performance problem in a very heavily used inner loop, or you are wasting your time.