gpderetta 4 days ago

> nonescaping locals don’t get addresses

inlining, interprocedural optimizations.

For example, something as trivial as an accessor member function would be hard to optimize.

pjmlp 4 days ago | parent | next [-]

Safer languages manage similar optimizations without having to rely on UB.

gpderetta 4 days ago | parent [-]

Well, yes, safer languages prevent pointer forging statically, so provenance is trivially enforced.

And I believe that provenance is an issue in unsafe rust.

tialaramex 4 days ago | parent [-]

Unlike C++ and C (at least until Martin's work is moved into the actual ISO language document rather than a separate one), the Rust language actually has a definition for how provenance is supposed to work.

https://doc.rust-lang.org/std/ptr/index.html#provenance

The definition isn't deemed complete because of aliasing. AIUI the definition we have is adequate if you're OK with treating every edge case of "Is this an alias?" as "Yes", but eventually Rust will also need to carefully nail down all those edge cases so that you can tread closer without falling off.

pizlonator 4 days ago | parent | prev [-]

Inlining doesn’t require UB

gpderetta 4 days ago | parent [-]

I didn't claim that. What I mean is that if a pointer escapes into an inlined function and no further, it will still prevent further optimizations if we apply your rule that only non-escaping locals don't get addresses. The main benefit of inlining is that it is effectively a simple way to do interprocedural optimizations. I.e.

  inline void add(int* to, int what) { *to += what; }
  void foo();  // declared only; body not visible here
  int bar() {
      int x = 0;
      add(&x, 1);
      foo();
      return x;
  }

By your rules, optimizing bar to return the constant 1 would not be allowed.

pizlonator 4 days ago | parent [-]

I think you’re applying a very strange strawman definition to “nonescaping”. It’s certainly not the definition I would pick.

The right definition is probably something like:

- pointers that come out of the outside world (syscalls) are escaped. They are just integers.

- pointers to locals have provenance. They point to an abstract location. It is up to the implementation to decide when the location gets an integer value (lives at an actual address) and what that value is. The implementation must do this no later than when the pointer to the local escapes.

- pointer values passed to the outside world (syscalls) escape.

- pointer values stored in escaped memory also escape, transitively

That’s one possible definition that turns the UB into implementation defined behavior. I’m sure there are others
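
A minimal sketch of how these rules might play out on the earlier example, assuming that a global variable counts as "escaped memory" (the list above does not say so explicitly). The names `g_escaped`, `bar_nonescaping` and `bar_escaping` are hypothetical, and `foo` is given a visible body here only as a stand-in for what an unknown callee could do:

```cpp
#include <cassert>  // only needed for the usage check described below

// Assumption: a global is reachable by any other code, so it is treated
// as "escaped memory" for the purposes of this illustration.
int* g_escaped = nullptr;

inline void add(int* to, int what) { *to += what; }

// Stand-in for an opaque external function. It writes through whatever
// pointer has been published in g_escaped, which is the kind of thing a
// compiler has to assume an unknown callee might do.
void foo() {
    if (g_escaped) *g_escaped = 42;
}

// The pointer to x is only handed to an inlined function and never reaches
// escaped memory, so x can remain an abstract location and the return
// value may be folded to the constant 1.
int bar_nonescaping() {
    int x = 0;
    add(&x, 1);
    foo();
    return x;
}

// The pointer to x is stored into escaped memory before foo() runs, so x
// needs a real address by that point and foo() may legitimately modify it.
int bar_escaping() {
    int x = 0;
    add(&x, 1);
    g_escaped = &x;
    foo();
    g_escaped = nullptr;  // do not leave a dangling pointer behind
    return x;
}
```

With this stand-in `foo`, calling `bar_nonescaping()` yields 1 and `bar_escaping()` yields 42. Only the second variant would have to block the constant folding under the rules sketched above.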

gpderetta 4 days ago | parent [-]

I think you have a non-standard definition. An escaping pointer is an address that the compiler cannot fully track (directly or indirectly). It could be passed to a syscall, to a separately compiled function (without LTO), or even to a function in the same translation unit if the compiler can neither inline that function nor do sufficient interprocedural analysis.

Again, I'm not a compiler writer, but my understanding is that non-escaping variables can be optimized in SSA form, whereas escaped variables are treated as memory and the compiler must be significantly more conservative.
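
To illustrate what I mean by "optimized in SSA form", here is a hand-written picture, not actual compiler output. `foo_stub`, `bar_source` and `bar_promoted` are hypothetical names, and `foo_stub` is an empty stand-in for the opaque `foo()` from my example:

```cpp
#include <cassert>  // only needed if you want to check the results

void foo_stub() {}  // hypothetical stand-in for the opaque foo()

inline void add(int* to, int what) { *to += what; }

// As written in the source: x has its address taken.
int bar_source() {
    int x = 0;
    add(&x, 1);
    foo_stub();
    return x;
}

// Roughly what the function looks like once add() is inlined and x is
// promoted from a memory slot to plain values, one per assignment.
int bar_promoted() {
    const int x0 = 0;
    const int x1 = x0 + 1;  // inlined body of add()
    foo_stub();             // no address of x exists for the callee to reach
    return x1;              // trivially folds to 1
}
```

Both functions return 1 here. The second form is the one where constant folding is trivial, because x is no longer memory at all.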

In any case, whether a pointer escapes or not depends purely on the compiler's capabilities and optimization level, so it would not be sane to make code well defined or UB depending on the compiler or optimization level.

edit: to be more concrete, do you think that in my example the constant folding of the return into return 1 should be allowed? And if so, which variant of this code would prevent the optimization and why?

pizlonator 4 days ago | parent [-]

> Again, I'm not a compiler writer

I am a compiler writer.

The definition I gave in my post is general enough to cover all possible compilers (ones that have LTO, ones that are inside a DBT, etc).

Yes, the constant folding should be allowed because the pointer to the local never escaped.