stouset a day ago

Safe Rust does.

Unsafe Rust allows you to tell the compiler “hold my beer”. It’s a concession to the reality that the normal restrictions of Rust disallow some semantically valid programs that you might otherwise want to write. The safeguards work great in most cases, but in some they’re overly restrictive.

In practice, the overwhelming majority of code is able to be written in safe Rust and the compiler can have your back. The majority of the rest is for performance reasons, interacting with external functions like C libraries over FFI, or expressing semantics that safe Rust struggles with (e.g., circular references).

stavros a day ago | parent [-]

OK but the title says "in safe Rust". Am I misunderstanding something? All the replies here are saying how it's allowed in unsafe Rust, which is not what the title says.

ndiddy a day ago | parent | next [-]

If code in an unsafe block triggers undefined behavior, then the assumptions the compiler makes regarding safety will no longer be true, and purely safe code (code with no unsafe blocks) is no longer guaranteed to be safe. This is what's happening in the example the person on GitHub wrote in the issue.

weinzierl a day ago | parent [-]

Exactly and "[...]and purely safe code (code with no unsafe blocks) is no longer guaranteed to be safe" hits the nail on the head.

I take issue with the phrasing of OP's title: "allows for UB in safe rust". AFAIK there are compiler bugs that allow UB in safe Rust, but this is not what is happening here. We have UB in an unsafe block (which is to be expected) which enables an issue outside in safe code. What is your opinion? Is calling this "UB in safe Rust" justified?

tialaramex 9 hours ago | parent | next [-]

I'd prefer the phrasing "UB from safe Rust".

The UB is† always in the Unsafe Rust, but that's not necessarily a problem, the problem was that we caused it from our safe Rust and that's definitely not OK.

† Soundness bugs are known to exist in Rust, but for the known ones you have to be really asking for it, so it's not plausible that they'd impact you by accident.

mswphd a day ago | parent | prev | next [-]

it is, but it's a little confusing here because the library/consumer of the library are the same person.

This is a bug in the library, namely in Bun's PathString implementation. The bug is a soundness issue, precisely because usage of Bun's PathString implementation allows for UB in safe rust. Now this buggy library isn't that big of a concern for the community, because Bun is the only consumer. It's also not an indication of a compiler bug, because Bun's library is implemented using unsafe rust. But the fundamental issue is that usage of Bun's PathString implementation allows for UB in safe rust, and is therefore (clearly) unsound.
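
To make the shape of that kind of bug concrete, here is a minimal sketch. `RawPath` and `BorrowedPath` are hypothetical names made up for illustration; this is not Bun's actual PathString, just an assumed example of a safe method whose soundness depends on something the signature doesn't enforce.

```rust
/// Hypothetical type for illustration only; NOT Bun's actual `PathString`.
/// It remembers where a string lives but has no lifetime tying it to the owner.
pub struct RawPath {
    ptr: *const u8,
    len: usize,
}

impl RawPath {
    /// Safe constructor: nothing forces the caller to keep `s` alive afterwards.
    pub fn new(s: &str) -> Self {
        RawPath { ptr: s.as_ptr(), len: s.len() }
    }

    /// UNSOUND as a safe method: the result is only valid while the original
    /// string is still alive, but the signature does not say so. Calling this
    /// after the owner is dropped is UB, and the caller never wrote `unsafe`.
    pub fn as_str(&self) -> &str {
        unsafe {
            let bytes = std::slice::from_raw_parts(self.ptr, self.len);
            std::str::from_utf8_unchecked(bytes)
        }
    }
}

/// A sound alternative: keep the borrow and let the compiler check the lifetime.
pub struct BorrowedPath<'a> {
    inner: &'a str,
}

impl<'a> BorrowedPath<'a> {
    pub fn new(s: &'a str) -> Self {
        BorrowedPath { inner: s }
    }

    pub fn as_str(&self) -> &'a str {
        self.inner
    }
}
```

As long as the owning `String` outlives the `RawPath`, both types behave the same. The difference is that with `BorrowedPath` the compiler rejects a use-after-drop, while with `RawPath` it compiles and is UB triggered from safe code.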

fc417fc802 a day ago | parent | prev [-]

Suppose I initialize something in an unsafe block. I promise the compiler that it's properly initialized, but in reality it isn't. Importantly I never make use of the garbage values in the unsafe block so no UB has occurred - yet.

Later, the garbage enters otherwise safe machinery and triggers UB. UB has now happened in safe rust as a result of my earlier contractual violation.

You can extend this example to other scenarios where UB in unsafe begets further UB in safe later on.

tialaramex 9 hours ago | parent [-]

Actually I think Rust says UB has occurred if you return a T which isn't properly initialized. For example if you unsafely MaybeUninit<u32>::assume_init() then Rust says that's UB even though all possible bit patterns for u32 are valid.

This is because that's not going to emit any machine code at all, and yet it will cause LLVM to do very nasty things, for example code which either prints "Odd" or "Even" by examining the integer may now print neither because that's faster and this uninitialized integer isn't odd, yet it also isn't even so...
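
A small sketch of the sound way to use `MaybeUninit`, with the unsound call left as a comment so nobody runs it. The function names `init_then_read` and `parity` are just made up for this example.

```rust
use std::mem::MaybeUninit;

/// Sound: the slot is written before `assume_init` is called.
pub fn init_then_read() -> u32 {
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(42);
    // SAFETY: `slot` was initialized by the `write` call above.
    unsafe { slot.assume_init() }
}

/// Reports parity; for a real `u32` exactly one branch is taken.
pub fn parity(n: u32) -> &'static str {
    if n % 2 == 0 { "Even" } else { "Odd" }
}

// NOT sound, kept as a comment on purpose:
//
//     let n: u32 = unsafe { MaybeUninit::<u32>::uninit().assume_init() };
//     parity(n); // the UB already happened on the line above
```

The point is that the `assume_init` call itself is the violation, not the later call to `parity`, even though `parity` is where the strange behavior would show up.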

mswphd a day ago | parent | prev | next [-]

`unsafe` isn't viral. I can write

fn safe_function(...) -> (...) {
    unsafe {
        // do unsafe things here
    }
}

then `safe_function` can be called from safe code, and still trigger UB. This wouldn't be a soundness issue in the rust compiler, but instead a bug in safe_function.
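
As a concrete sketch (function names are made up for illustration), here is one safe function that uses `unsafe` soundly and one that has exactly this kind of bug:

```rust
/// Sound: the emptiness check makes the unchecked access valid for every input.
pub fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // SAFETY: we just checked that index 0 is in bounds.
    Some(unsafe { *bytes.get_unchecked(0) })
}

/// Unsound: callable from safe code, but UB when `bytes` is empty.
/// The bug is in this function, not in the caller or the compiler.
pub fn first_byte_unsound(bytes: &[u8]) -> u8 {
    unsafe { *bytes.get_unchecked(0) }
}
```

Both have safe signatures. Only the first one upholds the precondition of `get_unchecked` for all possible arguments.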

There are many reasons you might want to do that. In particular, it's very common in rust to have a library define some data structure that uses unsafe under-the-hood, but checks whatever invariants it needs to, and provides solely safe methods to external callers. Rust's `String` type is like this: it's (roughly) a `Vec<u8>`, i.e. heap-allocated bytes. It has the additional invariant that these bytes correspond to valid UTF-8 though. See for example `push_str_slice`, which (roughly) concatenates 2 strings.

https://doc.rust-lang.org/src/alloc/string.rs.html#1107

It does the following thing

1. reserves enough space for the concatenated string within the source string
2. does some pointer arithmetic and a call to Rust's equivalent to `memcpy` (unsafe)
3. re-casts this pointer to a string object without checking that it's valid utf8 (unsafe).

While these individual calls are unsafe, `push_str_slice` checks that in this particular situation they are safe, so the stdlib authors do not mark `push_str_slice` as unsafe. It has no invariants that must be maintained by external callers.
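
Here is a rough sketch of that pattern. `Utf8Buf` is a hypothetical stand-in written for this comment, not the actual stdlib code, and it appends one string at a time rather than mirroring the real method's signature.

```rust
use std::ptr;

/// Hypothetical stand-in for `String`: bytes that must always be valid UTF-8.
pub struct Utf8Buf {
    bytes: Vec<u8>,
}

impl Utf8Buf {
    pub fn new() -> Self {
        Utf8Buf { bytes: Vec::new() }
    }

    /// Safe to call: `extra` is a `&str`, so it is already valid UTF-8.
    pub fn push_str(&mut self, extra: &str) {
        let old_len = self.bytes.len();
        // Step 1: reserve room for the appended bytes.
        self.bytes.reserve(extra.len());
        // Step 2: pointer arithmetic plus a raw copy into the spare capacity.
        // SAFETY: `reserve` guarantees capacity >= old_len + extra.len(), and
        // the regions cannot overlap because `self` is borrowed mutably while
        // `extra` is a separate shared borrow.
        unsafe {
            ptr::copy_nonoverlapping(
                extra.as_ptr(),
                self.bytes.as_mut_ptr().add(old_len),
                extra.len(),
            );
            self.bytes.set_len(old_len + extra.len());
        }
    }

    /// Step 3: view the bytes as `&str` without re-validating them.
    pub fn as_str(&self) -> &str {
        // SAFETY: only `push_str` adds bytes, and it only appends valid UTF-8.
        unsafe { std::str::from_utf8_unchecked(&self.bytes) }
    }
}
```

Because the `bytes` field is private, no outside code can break the UTF-8 invariant, which is what lets both methods be safe.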

rcxdude a day ago | parent | prev [-]

An unsafe block is you saying to the compiler 'trust me bro, I know this is safe'. But often that relies on some property of the code being true in order for it to actually be safe. Generally speaking, the expectation in rust is that you either encapsulate the code that enforces whatever property you are relying on behind a safe interface, so that it's not possible for other code to use it unsafely, or that you mark the interface itself unsafe so that it's obvious that the code using that interface needs to maintain that property itself. Rust code that doesn't do this will generally be considered buggy by most rust programmers (e.g. if you find a use of safe interfaces in the stdlib that causes a memory safety violation, then you should file a ticket with the rust team), but this is essentially only a social convention of where the blame lies for a bug, not something that the compiler itself can enforce (and, for example, you can violate memory safety in rust with only safe std interfaces by abusing OS interfaces like /proc/self/mem, but this is something that most people don't think can be reasonably fixed). The main reason that rust as a language is better in this regard is that it gives much better tools for being able to express that safe interface without giving up performance and that it has the means to mark and encapsulate this safe/unsafe distinction.
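
A minimal sketch of those two options side by side, using a made-up `AsciiBytes` type whose invariant is "every byte is ASCII":

```rust
/// Hypothetical wrapper for illustration; the invariant is ASCII-only bytes.
pub struct AsciiBytes(Vec<u8>);

impl AsciiBytes {
    /// Option 1: a safe interface that enforces the invariant itself.
    pub fn new(bytes: Vec<u8>) -> Option<Self> {
        if bytes.is_ascii() {
            Some(AsciiBytes(bytes))
        } else {
            None
        }
    }

    /// Option 2: an unsafe interface; the caller promises the invariant holds.
    ///
    /// # Safety
    /// Every byte in `bytes` must be ASCII.
    pub unsafe fn new_unchecked(bytes: Vec<u8>) -> Self {
        AsciiBytes(bytes)
    }

    /// Relies on the invariant: ASCII is always valid UTF-8.
    pub fn as_str(&self) -> &str {
        // SAFETY: `new` checks for ASCII and `new_unchecked` requires it.
        unsafe { std::str::from_utf8_unchecked(&self.0) }
    }
}
```

If `new_unchecked` were declared as a plain `fn`, the code would still compile, which is why this is a convention about where the blame lies rather than something the compiler enforces.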

Here are some links on this topic which have some examples:

https://doc.rust-lang.org/nomicon/working-with-unsafe.html
https://www.ralfj.de/blog/2016/01/09/the-scope-of-unsafe.htm...