JonChesterfield 4 days ago

My grievance isn't with aliasing or dataflow, it's with a pointer provenance model that makes assumptions inconsistent with reality, optimises based on them, then justifies the nonsense that results with UB.

When the hardware behaviour and the pointer provenance model disagree, one should change the model, not change the behaviour of the program.

jcranmer 4 days ago | parent [-]

Give me an example of a program that violates pointer provenance (and only pointer provenance) that you think should be allowed under a reasonable programming model.

JonChesterfield 4 days ago | parent [-]

This is rather woven in with type-based alias analysis, which makes a hard distinction tricky. E.g. realloc doesn't work under either, but the provenance issue probably only shows up under -fno-strict-aliasing.

I like pointer tagging because I like dynamic language implementations. That tends to look like "summon a pointer from arithmetic", which gives a pointer whose provenance is unknown to the compiler, which is where the "deref without provenance is UB" demon strikes.

jcranmer 3 days ago | parent [-]

I think you're misunderstanding pointer provenance, and you're being angry at a model that doesn't exist.

The failure mode of pointer provenance is converting an integer to a pointer to an object that was never converted to an integer. Tricks like packing integers into unused bits or packing pointers into floating-point NaNs don't violate pointer provenance--it's really no different from passing a pointer to an external function call and getting it back from a different external function call.
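
A minimal sketch of the low-bit tagging case, to make that concrete. The names (`TAG_MASK`, `tag_ptr`, `untag_ptr`, `get_tag`, `tagged_roundtrip`) are made up for illustration, and it assumes the usual implementation-defined behaviour where `uintptr_t` holds the address and an 8-byte-aligned pointer has three zero low bits. As I read the PNVI-ae-udi draft, the pointer-to-integer cast exposes the object, so the later integer-to-pointer cast has an exposed object to pick its provenance up from.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not from any particular runtime. */
#define TAG_MASK ((uintptr_t)0x7)

/* Pack a small tag into the low bits of a pointer that is at least
   8-byte aligned. The cast to uintptr_t is the "exposing" step. */
static uintptr_t tag_ptr(void *p, unsigned tag) {
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0); /* low bits must be free */
    assert(tag <= TAG_MASK);        /* tag must fit in them */
    return bits | tag;
}

/* Recover the tag and the original pointer from the packed word. */
static unsigned get_tag(uintptr_t v) { return (unsigned)(v & TAG_MASK); }
static void *untag_ptr(uintptr_t v) { return (void *)(v & ~TAG_MASK); }

/* Round-trip an int through a tagged word and read it back.
   Returns -1 if the tag did not survive. */
static int tagged_roundtrip(int value, unsigned tag) {
    _Alignas(8) int cell = value;
    uintptr_t boxed = tag_ptr(&cell, tag);
    if (get_tag(boxed) != tag)
        return -1;
    int *p = untag_ptr(boxed); /* integer back to pointer */
    return *p;
}
```

The case that would run into trouble is different: conjuring an integer that happens to equal the address of some object whose address was never cast to an integer in the first place.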

JonChesterfield 3 days ago | parent [-]

That's definitely possible. The "UB if no provenance information is available" belief comes from https://www.cl.cam.ac.uk/~pes20/cerberus/clarifying-provenan..., in particular

> access via a pointer value with empty provenance is undefined behaviour

I'm annoyed that casting an aligned array of bytes to a pointer to a network packet type is forbidden, and that a pointer to float can't be cast to a pointer to a simd vector of float, and that malloc can't be written in C, but perhaps those aren't provenance either.
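
For reference, the usual workaround for the network packet case is to copy the bytes into a real object rather than cast the buffer pointer. A minimal sketch, with a made-up `struct packet_header` and a hypothetical `read_header` helper; mainstream compilers typically turn the fixed-size `memcpy` into plain loads, though that is an optimisation and not something the language promises.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical packet layout, for illustration only. */
struct packet_header {
    uint16_t type;
    uint16_t length;
    uint32_t sequence;
};

/* Copy the header out of the byte buffer instead of casting
   buf to struct packet_header *. Fields keep whatever byte order
   they had in the buffer; no endianness conversion is done here. */
static struct packet_header read_header(const unsigned char *buf) {
    struct packet_header h;
    memcpy(&h, buf, sizeof h);
    return h;
}
```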

jcranmer 3 days ago | parent [-]

> The "UB if no provenance information is available" belief comes from https://www.cl.cam.ac.uk/~pes20/cerberus/clarifying-provenan..., in particular

That's an old document. In particular, it's largely arguing for a PVI provenance model (i.e., integers carry provenance information), whereas the current TS is relying on a PNVI provenance model (i.e., integers do not carry provenance information). https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2577.pdf is the last draft pre-TS-ification (i.e., has all the background information to understand it).

> I'm annoyed that casting an aligned array of bytes to a pointer to a network packet type is forbidden, and that a pointer to float can't be cast to a pointer to a simd vector of float, and that malloc can't be written in C, but perhaps those aren't provenance either.

That's all strict aliasing rules, not pointer provenance rules. (Well, malloc has issues with living in the penumbra of the C object model). The big thing that provenance prevents you from doing is writing memcpy in C (since char access of a pointer counts as exposing the pointer, whereas the PNVI model makes memcpy a non-exposing operation).
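
To illustrate the memcpy point, here is a sketch of the naive byte loop one would write by hand, used to copy a pointer's own representation. `naive_memcpy` and `copy_pointer_and_read` are hypothetical names. Per the claim above, the `unsigned char` reads of `p`'s bytes would count as exposing `x`, which the library `memcpy` would not do; the code still does what you expect on mainstream compilers, so the difference is in what the model says about it and not in the observable result.

```c
#include <stddef.h>

/* Hypothetical hand-written copy: one unsigned char at a time. */
static void *naive_memcpy(void *restrict dst, const void *restrict src,
                          size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Copy the bytes of the pointer p into q, then dereference q. */
static int copy_pointer_and_read(void) {
    int x = 42;
    int *p = &x;
    int *q;
    naive_memcpy(&q, &p, sizeof p); /* copies the pointer, not the int */
    return *q;
}
```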