gavinray 4 hours ago

I want to point out an implemented feature that people SHOULD be adopting but that I doubt will be picked up:

  P2590R2, Explicit lifetime management (PR106658)
This is for "std::start_lifetime_as<T>". If you have not heard of this before, it's the non-UB way to type-pun a pointer into a structured type.

Nearly all zero-copy code that deals with external I/O buffers looks something like:

  std::unique_ptr<char[]> buffer = stream->read();
  if (buffer[0] == FOO)
    processFoo(reinterpret_cast<Foo*>(buffer.get())); // undefined behavior
  else
    processBar(reinterpret_cast<Bar*>(buffer.get())); // undefined behavior
With this merged, swap the reinterpret_cast for start_lifetime_as and you're no longer being naughty.

https://en.cppreference.com/cpp/memory/start_lifetime_as
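
A minimal sketch of that swap, assuming a hypothetical `Foo` layout and helper name `read_foo` (neither is from the comment above). Since many standard libraries do not ship `std::start_lifetime_as` yet, the sketch checks the `__cpp_lib_start_lifetime_as` feature-test macro and otherwise falls back to a plain `memcpy` copy, which is well-defined but not zero-copy:

```cpp
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical wire format, used only for illustration.
struct Foo {
    std::uint32_t tag;
    std::uint32_t value;
};

// Reads a Foo out of a raw byte buffer.
// Assumption: buf points to at least sizeof(Foo) readable bytes. The
// start_lifetime_as branch additionally requires buf to be aligned for Foo.
inline Foo read_foo(const unsigned char* buf) {
#if defined(__cpp_lib_start_lifetime_as)
    // Starts the lifetime of a Foo in place; the bytes are not copied here.
    const Foo* foo = std::start_lifetime_as<Foo>(buf);
    return *foo;
#else
    // Fallback for toolchains without the facility: copy into a real Foo.
    Foo foo;
    std::memcpy(&foo, buf, sizeof foo);
    return foo;
#endif
}
```

Returning by value keeps both branches behind one signature; real zero-copy code would presumably keep the pointer returned by `std::start_lifetime_as` instead.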

jandrewrogers 3 hours ago | parent | next [-]

There was already a legal way to achieve this that everyone should already have been using (laundering a pointer through a no-op memmove). Using reinterpret_cast here is a bug.

The "start_lifetime_as" facility does one additional thing beyond providing a tidy standard name for the memory laundering incantation. Semantically it doesn't touch the memory, whereas the no-op memmove intrinsically does. In practice, this makes little difference, since the compiler can see that the memmove is a no-op and optimize accordingly.
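
For reference, the memmove idiom being described might look roughly like the sketch below. The helper name `launder_via_memmove` is hypothetical, and the sketch assumes `T` is an implicit-lifetime, trivially copyable type and that the pointer is suitably aligned for `T`:

```cpp
#include <cstring>
#include <new>

// Hypothetical helper name. Assumptions: T is an implicit-lifetime, trivially
// copyable type; p is aligned for T and points to at least sizeof(T) bytes.
template <class T>
T* launder_via_memmove(void* p) noexcept {
    // memmove implicitly creates objects in its destination region. Copying
    // the bytes onto themselves leaves the contents unchanged.
    void* q = std::memmove(p, p, sizeof(T));
    // launder yields a pointer to the T object now living at that address.
    return std::launder(static_cast<T*>(q));
}
```

The returned pointer aliases the original buffer, so no data is duplicated.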

kevin_thibedeau 3 hours ago | parent | next [-]

This still has unresolved alignment issues that blow up outside the amd64 ecosystem.

jandrewrogers 2 hours ago | parent [-]

Is this just a basic lack of alignment enforcement or is there a bigger issue?

szmarczak 3 hours ago | parent | prev [-]

No, because the object does not exist after std::launder. It only exists after std::start_lifetime_as. The bytes being there says nothing about the object, per the C++ standard.

jandrewrogers 2 hours ago | parent [-]

The compiler will implicitly create an object of an implicit-lifetime type at the memmove destination as required to give the program defined behavior. Technically you don't even need std::launder; it is just far more convenient than the alternative.

amluto 2 hours ago | parent | prev | next [-]

The cppreference description seems questionable to me:

> Implicitly creates a complete object of type T (whose address is p) and objects nested within it. The value of each created object obj of TriviallyCopyable type U is determined in the same manner as for a call to std::bit_cast<U>(E) except that the storage is not actually accessed, where E is the lvalue of type U denoting obj. Otherwise, the values of such created objects are unspecified.

So T is the complete new object. It contains subobjects, and one of those subobjects has type U. U is initialized as if by bit_cast, and I presume they meant to say that bit_cast casts from the bits already present at the address in question. Since “obj” is mentioned without any definition of any sort, I’ll assume it means something at the correct address.

But what’s E? The page says “E is the lvalue of type U denoting obj,” but obj probably has type char or a similar type, and if it already had type U, there would be no need for bit_cast.

groundzeros2015 4 hours ago | parent | prev | next [-]

You’re allowed to type pun char buffers.

jcranmer 4 hours ago | parent | next [-]

No, you're not.

You're allowed to access any type via a char buffer. But the converse is not true (quoting https://eel.is/c++draft/expr#basic.lval-11):

> An object of dynamic type Tobj is type-accessible through a glvalue of type Tref if Tref is similar ([conv.qual]) to: Tobj, a type that is the signed or unsigned type corresponding to Tobj, or a char, unsigned char, or std::byte type. If a program attempts to access ([defns.access]) the stored value of an object through a glvalue through which it is not type-accessible, the behavior is undefined.

The dynamic type of a char buffer is, well, a char buffer, and can only be accessed via things that are the same type as a char buffer up to signedness and cv-qualification. The actual strict aliasing rules are not symmetric!
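
To make the asymmetry concrete, here is a small sketch with hypothetical helper names (`first_byte`, `load_u32`). The first function goes in the permitted direction; the second shows one well-defined way to go the other way, by copying the bytes into a real object rather than dereferencing a cast pointer:

```cpp
#include <cstdint>
#include <cstring>

// Permitted direction: inspect the bytes of an existing uint32_t object
// through an unsigned char glvalue.
inline unsigned char first_byte(const std::uint32_t& v) {
    return *reinterpret_cast<const unsigned char*>(&v);
}

// Converse direction, done safely: copy the bytes into a real uint32_t.
inline std::uint32_t load_u32(const unsigned char* bytes) {
    std::uint32_t v;
    std::memcpy(&v, bytes, sizeof v);
    return v;
    // By contrast, *reinterpret_cast<const std::uint32_t*>(bytes) reads the
    // char array through a uint32_t glvalue, which the quoted rule makes
    // undefined unless a uint32_t object actually lives at that address.
}
```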

denotational an hour ago | parent | next [-]

If the type is an implicit-lifetime type, then you can legally create an unsigned char array, and then reinterpret_cast a pointer to that to a pointer to the type.

See https://eel.is/c++draft/intro.object#def:object,implicit_cre....

https://eel.is/c++draft/intro.object#15 is an example showing this with malloc; the subsequent paragraph further permits it to work with an unsigned char array.
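
A rough sketch of the unsigned char array case, assuming a hypothetical implicit-lifetime type `Point`. The `std::launder` call is a conservative addition of this sketch, not something the comment above says is required, and the fields are written before they are read because the implicitly created object has no determinate value yet:

```cpp
#include <new>

// Hypothetical implicit-lifetime type, used only for illustration.
struct Point {
    int x;
    int y;
};

inline int sum_in_raw_storage() {
    // Beginning the lifetime of an unsigned char array implicitly creates
    // objects of implicit-lifetime type within its storage as needed.
    alignas(Point) unsigned char storage[sizeof(Point)];
    // launder is added conservatively to obtain a pointer to the Point itself.
    Point* p = std::launder(reinterpret_cast<Point*>(storage));
    p->x = 3;  // write before any read
    p->y = 4;
    return p->x + p->y;
}
```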

groundzeros2015 3 hours ago | parent | prev [-]

I’m not a language lawyer, but I think the part you are missing is about “type establishment”. (Is this a C vs C++ thing?)

Malloc returns a buffer and then you cast it to the type you want. Similarly for all memory allocators.

Punning the same region of char buffer as two different types is a bit different.

jcranmer 19 minutes ago | parent | next [-]

This gets to the heart of effective type rules, which are complex, confusing, and not properly implemented by compilers. C and C++ definitely diverge here, because C is less ambitious in its object model (which means it simply leaves many of the details undiscussed).

Malloc returns memory that is uninitialized and has no type. In C, the effective type of that memory is set on first use, whereas C++ relies on angelic nondeterminism to magically initialize the type at return time to whatever will work in the future.
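
On the C++ side, that is what makes the familiar pattern below well-defined. This is only a sketch with a hypothetical function name (`make_squares`): the `int` objects are treated as having been created when `malloc` returned, because that is the choice that gives the later writes defined behavior:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper. Returns a malloc'ed array of n ints holding i*i,
// or nullptr on allocation failure. The caller releases it with std::free.
inline int* make_squares(std::size_t n) {
    // malloc returns a pointer to a suitable implicitly created object,
    // so the result can be used as an int array after the cast.
    int* a = static_cast<int*>(std::malloc(n * sizeof(int)));
    if (a == nullptr) {
        return nullptr;
    }
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = static_cast<int>(i * i);
    }
    return a;
}
```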

amluto 3 hours ago | parent | prev [-]

> Malloc returns a buffer and then you cast it to the type you want.

You can’t cast the buffer — you’re casting the pointer.

One might argue (I’m not sure whether this is correct) that malloc allocated an array of ints and the language merely has no way to state that directly. Then you write to those ints using a char pointer, and then you access them as ints, but they’ve been ints ever since allocation.

groundzeros2015 3 hours ago | parent [-]

Yes, but that works not because malloc is special but because the type rules are more relaxed than the comment above suggests.

ozgrakkurt 4 hours ago | parent | prev [-]

And you should always use -fno-strict-aliasing anyway. The default rules are insane.

throw834948398 4 hours ago | parent | prev [-]

Your code is not only naughty, it’s also incorrect due to alignment issues.
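
One way to guard against that, sketched with a hypothetical helper name (`is_aligned_for`): check the address against `alignof(T)` before treating the buffer as a `T`. The integer conversion used here is implementation-defined, though it behaves as expected on mainstream flat-address platforms:

```cpp
#include <cstdint>

// Hypothetical helper: true if p is suitably aligned to hold a T.
template <class T>
bool is_aligned_for(const void* p) noexcept {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A caller could use this to choose between an in-place view and a copy into properly aligned storage.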