Remix.run Logo
gobdovan 6 hours ago

Rust is in an awkward position of being already complicated enough that adding proofs for skipping bounds checks probably will not happen for a long time, even though this kind of low-level operation is where a lot of optimisation is lost.

Compounding on this, Rust is also unstable underneath, since there is no public, stable contract for carrying high-level semantics from HIR into MIR. Because these high-level invariants are lost during compilation, the compiler cannot easily use them to prove and eliminate low-level safety checks. But even if the frontend was perfect, Rust relies on LLVM's language-neutral SCEV, which operates purely on low-level math and cannot reason about high-level language semantics.

Ultimately, a lot of things would need to change for Rust to pay no performance for safety features.

aw1621107 5 hours ago | parent | next [-]

> Compounding on this, Rust is also unstable underneath, since there is no public, stable contract for carrying high-level semantics from HIR into MIR. Because these high-level invariants are lost during compilation

Not sure if I'm just out of the loop, but I'm having a hard time following this line of reasoning. Why is a public and/or stable contract needed to carry high-level semantics from HIR to MIR? Neither seems necessary to me; from what I understand HIR and MIR are rustc-internal so public contracts shouldn't matter, and the lack of stability means the Rust devs aren't precluded by backwards compatibility from modifying the IRs to add the ability to carry such invariants.

gobdovan 4 hours ago | parent [-]

Whoops! Although there is no public contract between HIR and MIR, the public part was not relevant here. What I wanted to highlight is that if they'd want to add proper proof machinery to eliminate low-level safety checks, they'd have to do it at: surface language, which is already complex enough; then HIR->MIR boundary with clean provenance (which I think current MIR would collapse too aggressively) and which may require a much clearer contract; then, even if they do the full front- and mid- ends properly, if you leave it up to LLVM, it ends up in SCEV, which is language neutral and would not be a good fit to support the proof obligations that would be specific to Rust.

I dug up a proposal from 2021 around bounds check hoisting in MIR, and from the discussion, details are pretty thorny [0]. It's narrower than general proofs but the frictions are very similar. The easiest example that shows HIR -> MIR difficulties is the part around `for i in 0..32 { a[i] = 1; }`. At the source level the range fact is super obvious, but after the for-loop/iterator lowering the MIR optimiser has to recover that `i` comes from exactly that range before it can turn 32 checks into the one hot-path check. Then it also would have to check for panic strategy to maintain the correct behaviour after optimisation.

[0] https://github.com/rust-lang/rust/issues/92327

aw1621107 7 minutes ago | parent [-]

OK, I think that makes more sense. Thanks for taking the time to explain!

afdbcreid 6 hours ago | parent | prev [-]

The overhead of bounds checking varies a lot. In the common case it's negligible (few percents), but in some cases, depending on what you build, it can go up to even 20%. And if it prevents autovectorization it can cost even more.

There are techniques to minimize the perf loss, though (safely), and of course you can use unsafe code. If you do it smartly, in the vast majority of cases bound checks do not matter (in fact, even in C++ there is a push for a hardened standard library that does bound checks, and e.g. Google uses that).

Rust will never include full proofs, but it might include ranged integers which can minimize bound checks even more.

CrazyboyQCD 5 hours ago | parent [-]

[dead]