Asooka 19 hours ago

Back when strncpy was written there was no undefined behaviour (as the compiler interprets it today). The result would depend on the implementation and might differ between invocations, but it was never the "this will not happen" footgun of today. The modern interpretation of undefined behaviour in C is a big blemish on the otherwise excellent standards committee, committed (hah) in the name of extremely dubious performance claims. If "undefined" meaning "left to the implementation" was good enough when CPU frequency was measured in MHz and nobody had more than one, surely it is good enough today too.

Also I'm not sure what you mean by C successor languages not having undefined behaviour, as both Rust and Zig inherit it wholesale from LLVM. At least that was the case last I checked; correct me if I am wrong. Go, Java and C# all have sane behaviour, but those are much higher level.

Cyph0n 18 hours ago | parent | next [-]

The problem isn't undefined behavior per se; I was using it as an example for strncpy. For Rust the answer is no: in fact, the goal of (safe) Rust is to eliminate undefined behavior. Zig, on the other hand, I don't know about.

In general, I see two issues at play here:

1. C relies heavily on unsized pointers (vs. fat pointers), which is why strncpy_s had to "break" strncpy in order to improve bounds checks.

2. strncpy memory aliasing restrictions are not encoded in the API and can only be conveyed through docs. This is a footgun.

For (1), Rust APIs of this type operate on sized slices, or in the case of strings, string slices. Zig defines strings as sized byte slices.
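To make the sized-slice point concrete, here is a minimal sketch. The helper copy_truncating is hypothetical (it is not a std API); the only std calls used are len, min and copy_from_slice. The lengths travel with the slice references, so there is no separate size argument to get wrong.

```rust
/// Hypothetical helper, not a std API: copy as many bytes as fit from `src`
/// into `dst` and return how many were copied.
fn copy_truncating(dst: &mut [u8], src: &[u8]) -> usize {
    // Both lengths are carried by the slices themselves.
    let n = dst.len().min(src.len());
    // copy_from_slice requires equal lengths, so both sides are cut to `n`.
    dst[..n].copy_from_slice(&src[..n]);
    n
}

/// Usage sketch: a 5-byte source is truncated to fit a 4-byte buffer.
fn truncating_demo() -> ([u8; 4], usize) {
    let mut buf = [0u8; 4];
    let n = copy_truncating(&mut buf, b"hello");
    (buf, n)
}
```

Unlike strncpy, nothing here assumes NUL termination; whether truncation should be silent is a design choice of this sketch, not something the thread specifies.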

For (2), Rust enforces this invariant via the borrow checker by disallowing (at compile-time) a shared slice reference that points to an overlapping mutable slice reference. In other words, an API like this is simply not possible to define in (safe) Rust, which means you (as the user) do not need to pore over the docs for each stdlib function you use looking for memory-related footguns.
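A rough illustration of the aliasing point, assuming a made-up six-byte buffer (the function name is hypothetical). The commented-out line is the overlapping copy that safe Rust refuses to compile; copy_within is the std slice method that covers the overlapping case through a single mutable borrow, with memmove-like semantics.

```rust
/// Hypothetical demo: shift the first four bytes of a buffer two places right.
fn shift_right_by_two() -> [u8; 6] {
    let mut buf = *b"abcdef";

    // Rejected by the borrow checker: `buf` would be borrowed mutably (the
    // destination) and immutably (the source) at the same time.
    // buf[2..].copy_from_slice(&buf[..4]);

    // Overlapping copy inside one buffer: source range 0..4, destination
    // starting at index 2.
    buf.copy_within(0..4, 2);
    buf
}
```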

loeg 10 hours ago | parent [-]

> For (2), Rust enforces this invariant via the borrow checker by disallowing (at compile-time) a shared slice reference that points to an overlapping mutable slice reference.

At least the last time I cared about this, the borrow checker wouldn't allow mutable and immutable borrows from the same underlying object, even if they did not overlap. (Which is more restrictive, in an obnoxious way.)

Cyph0n 9 hours ago | parent [-]

Do you mean borrows for different fields of a struct? If so, that’s handled today - it’s sometimes called “splitting borrows”: https://doc.rust-lang.org/nomicon/borrow-splitting.html

loeg 9 hours ago | parent [-]

Not exactly -- independent subranges of the same range (as would be relevant to something like memcpy/memmove/strcpy). E.g.,

https://godbolt.org/z/YhGajnhEG

It's mentioned later in the same article you shared above.

oneshtein 14 minutes ago | parent | next [-]

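  // `b` is not shown in the thread; this stub uses an assumed signature.
  fn b(_shared: &i32, _exclusive: &mut i32) {}
  fn f() {
    let mut v = vec![1, 2, 3, 4, 5];
    // split_at_mut yields two non-overlapping mutable slices of `v`.
    let (header, tail) = v.split_at_mut(1);
    b(&header[0], &mut tail[0]);
  }

Cyph0n 9 hours ago | parent | prev [-]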

Gotcha. There is a split_at_mut method that splits a mutable slice reference into two. That doesn’t address the problem you had, but I think that’s the best you can do with safe Rust.

loeg 8 hours ago | parent [-]

Yeah. It just isn't something the borrow checker natively understands.
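For what it's worth, this is roughly why split_at_mut needs unsafe internally. The sketch below follows the shape of the example in the Nomicon article linked above, written as a free function with a made-up name; it is not the actual std source.

```rust
use std::slice;

/// Sketch only: split one mutable slice into two non-overlapping halves.
fn split_at_mut_sketch<T>(s: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = s.len();
    // Checked up front so the pointer arithmetic below stays in bounds.
    assert!(mid <= len);
    let ptr = s.as_mut_ptr();
    // SAFETY: [0, mid) and [mid, len) are disjoint and both lie inside `s`,
    // so the two mutable slices never alias. The borrow checker cannot see
    // that disjointness on its own, hence the unsafe block.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

Callers stay in safe code; the disjointness argument is made once, inside the function.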

tialaramex 10 hours ago | parent | prev [-]

Rust's safe subset doesn't have UB. At all. So long as you never write the "unsafe" keyword you're fine: the compiler will check that you are obeying all of the language rules at all times.

Whereas in C, oops, sorry, you broke a rule you didn't even know existed and so that's Undefined Behaviour left and right. Some of it you could argue falls into the category you're describing, where in a better world it should have been made Implementation Defined, not UB, and too bad. However lots of it is just because the language was designed a very long time ago and prioritized ease of implementation.
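Two small illustrations of the contrast, as a sketch (the function name is made up, the std methods are real). An out-of-bounds read and a signed overflow are both UB in C; in safe Rust each has a defined outcome.

```rust
/// Hypothetical demo returning the defined results of operations that would
/// be undefined behaviour in C.
fn defined_outcomes() -> (Option<i32>, Option<i32>, i32) {
    let v = vec![1, 2, 3];
    // Out-of-bounds access: `get` returns None. Plain indexing (`v[10]`)
    // would panic instead of reading stray memory.
    let out_of_bounds = v.get(10).copied();
    // Signed overflow: checked_add reports it as None...
    let checked = i32::MAX.checked_add(1);
    // ...and wrapping_add is defined to wrap around.
    let wrapped = i32::MAX.wrapping_add(1);
    (out_of_bounds, checked, wrapped)
}
```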

If you wish the language was properly defined, you should use (safe) Rust. If you just wish that when you write nonsense the compiler should somehow guess what you meant and do that, you're not actually a programmer, find a practice which suits you better - take up knitting, learn to paint, something like that.