aw1621107 2 days ago

> It is worth noting that the class of bugs described here (logic errors in highly concurrent state machines, incorrect hardware assumptions)

While the bugs you describe are indeed things that aren't directly addressed by Rust's borrow checker, I think the article covers more ground than your comment implies.

For example, a significant portion (most?) of the article is simply analyzing the gathered data, like grouping bugs by subsystem:

    Subsystem        Bug Count  Avg Lifetime
    drivers/can      446        4.2 years
    networking/sctp  279        4.0 years
    networking/ipv4  1,661      3.6 years
    usb              2,505      3.5 years
    tty              1,033      3.5 years
    netfilter        1,181      2.9 years
    networking       6,079      2.9 years
    memory           2,459      1.8 years
    gpu              5,212      1.4 years
    bpf              959        1.1 years

Or by type:

    Bug Type         Count  Avg Lifetime  Median
    race-condition   1,188  5.1 years     2.6 years
    integer-overflow 298    3.9 years     2.2 years
    use-after-free   2,963  3.2 years     1.4 years
    memory-leak      2,846  3.1 years     1.4 years
    buffer-overflow  399    3.1 years     1.5 years
    refcount         2,209  2.8 years     1.3 years
    null-deref       4,931  2.2 years     0.7 years
    deadlock         1,683  2.2 years     0.8 years

And the section describing common patterns for long-lived bugs (10+ years) lists the following:

> 1. Reference counting errors

> 2. Missing NULL checks after dereference

> 3. Integer overflow in size calculations

> 4. Race conditions in state machines

All of which cover more ground than listed in your comment.

Furthermore, the 19-year-old bug case study is a refcounting error not related to highly concurrent state machines or hardware assumptions.

johncolanduoni 2 days ago | parent | next [-]

It depends what they mean by some of these: are the state machine race conditions logic races (which Rust won’t trivially solve) or data races? If they are data races, are they the kind of ones that Rust will catch (missing atomics/synchronization) or the ones it won’t (bad atomic orderings, etc.).

It’s also worth noting that Rust doesn’t prevent integer overflow, and it doesn’t panic on it by default in release builds. Instead, the safety model assumes you’ll catch the overflowed number when you use it to index something (a constant source of bugs in unsafe code).

I’m bullish about Rust in the kernel, but it will not solve all of the kinds of race conditions you see in that kind of context.
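To make the overflow point concrete, here is a minimal sketch using only standard-library integer methods. By default, plain `+` and `*` panic on overflow in debug builds and wrap in release builds, so code that cares has to say what it wants. The helper names `alloc_size` and `next_seq` are hypothetical, chosen only for illustration.

```rust
/// Hypothetical helper: compute a buffer size in bytes, refusing to overflow.
/// `checked_mul` returns `None` on overflow regardless of build profile.
fn alloc_size(count: usize, elem_size: usize) -> Option<usize> {
    count.checked_mul(elem_size)
}

/// Hypothetical helper: a sequence counter where wrap-around is intended.
/// `wrapping_add` makes that intent explicit instead of relying on defaults.
fn next_seq(seq: u32) -> u32 {
    seq.wrapping_add(1)
}
```

The caller of `alloc_size` is forced to handle the `None` case before the size is ever used for indexing or allocation, which is the step the default arithmetic operators leave to the programmer.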

aw1621107 2 days ago | parent | next [-]

> are the state machine race conditions logic races (which Rust won’t trivially solve) or data races? If they are data races, are they the kind of ones that Rust will catch (missing atomics/synchronization) or the ones it won’t (bad atomic orderings, etc.).

The example given looks like a generalized example:

    spin_lock(&lock);
    if (state == READY) {
        spin_unlock(&lock);
        // window here where another thread can change state
        do_operation();  // assumes state is still READY
    }

So I don't think you can draw strong conclusions from it.

> I’m bullish about Rust in the kernel, but it will not solve all of the kinds of race conditions you see in that kind of context.

Sure, all I'm trying to say is that "the class of bugs described here" covers more than what was listed in the parentheses.

rjzzleep 2 days ago | parent | next [-]

I'd argue, that while null ref and those classes of bugs may decrease, logic errors will increase. Rust is not an extraordinarily readable language in my opinion, especially in the kernel where the kernel has its own data structures. IMHO Apple did it right in their kernel stack, they have a restricted subset of C++ that you can write drivers with.

Which is also why in my opinion Zig is much more suitable, because it actually addresses the readability aspect without bringing huge complexity with it.

aw1621107 2 days ago | parent | next [-]

> I'd argue, that while null ref and those classes of bugs may decrease, logic errors will increase.

To some extent that argument only makes sense; if you can find a way to greatly reduce the incidence of non-logic bugs while not addressing other bugs then of course logic bugs would make up a greater proportion of what remains.

I think it's also worth considering the fact that while Rust doesn't guarantee that it'll catch all logic bugs, it (like other languages with more "advanced" type systems) gives you tools to construct systems that can catch certain kinds of logic bugs. For example, you can write lock types in a way that guarantees at compile time that you'll take locks in the correct order, avoiding deadlocks [0]. Another example is the typestate pattern [1], which can encode state machine transitions in the type system to ensure that invalid transitions and/or operations on invalid states are caught at compile time.

These, in turn, can lead to higher-order benefits as offloading some checks to the compiler means you can devote more attention to things the compiler can't check (though to be fair this does seem to be more variable among different programmers).
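As a rough illustration of the typestate pattern mentioned above, here is a minimal sketch. The `Connection` type and its `Closed`/`Open` states are hypothetical and not taken from the linked article or from any kernel code; the point is only that operations on the wrong state fail to compile.

```rust
use std::marker::PhantomData;

// Zero-sized marker types, one per state (hypothetical names).
struct Closed;
struct Open;

// The state lives in the type parameter, not in a runtime field.
struct Connection<S> {
    bytes_sent: usize,
    _state: PhantomData<S>,
}

impl Connection<Closed> {
    fn new() -> Self {
        Connection { bytes_sent: 0, _state: PhantomData }
    }

    // Consumes the closed connection, so the old value cannot be reused.
    fn open(self) -> Connection<Open> {
        Connection { bytes_sent: self.bytes_sent, _state: PhantomData }
    }
}

impl Connection<Open> {
    // `send` exists only on `Connection<Open>`; calling it on a
    // `Connection<Closed>` is a compile error, not a runtime check.
    fn send(&mut self, data: &[u8]) -> usize {
        self.bytes_sent += data.len();
        self.bytes_sent
    }

    fn close(self) -> Connection<Closed> {
        Connection { bytes_sent: self.bytes_sent, _state: PhantomData }
    }
}
```

Because `open` and `close` take `self` by value, a stale handle to the previous state cannot be used after a transition. This catches invalid transitions, but it says nothing about whether the transitions themselves are the right ones.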

> Rust is not an extraordinarily readable language in my opinion, especially in the kernel where the kernel has its own data structures.

The above notwithstanding, I'd imagine it's possible to think up scenarios where Rust would make some logic bugs more visible and others less so; only time will tell which prevails in the Linux kernel, though based on what we know now I don't think there's strong support for the notion that logic bugs in Rust are substantially more common than they have been in C, let alone because of readability issues.

Of course there's the fact that readability is very much a personal thing and is a multidimensional metric to boot (e.g., a property that makes code readable in one context may simultaneously make code less readable in another). I don't think there would be a universal answer here.

[0]: https://lwn.net/Articles/995814/

[1]: https://cliffle.com/blog/rust-typestate/

viraptor 2 days ago | parent | prev | next [-]

Maybe increase as a ratio, but not absolute. There are various benefits of Rust that affect other classes of issues: fancy enums, better errors, ability to control overflow behaviour and others. But for actual experience, check out what the kernel code developer has to say: https://xcancel.com/linaasahi/status/1577667445719912450
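A small sketch of what "fancy enums" and "better errors" can look like in practice. The `Event` and `HandleError` types and the `handle` function are hypothetical, made up for illustration; the relevant property is that `match` must cover every variant and that failures are returned as values the caller has to deal with.

```rust
// Enum variants can carry data of different shapes.
#[derive(Debug, PartialEq)]
enum Event {
    Connect { port: u16 },
    Data(Vec<u8>),
    Disconnect,
}

// Errors are ordinary values with their own variants.
#[derive(Debug, PartialEq)]
enum HandleError {
    EmptyPayload,
    PrivilegedPort(u16),
}

// Returns the number of payload bytes handled, or a typed error.
fn handle(event: &Event) -> Result<usize, HandleError> {
    // The compiler rejects this match if a variant is left unhandled,
    // so adding a new `Event` variant flags every place that needs updating.
    match event {
        Event::Connect { port } if *port < 1024 => Err(HandleError::PrivilegedPort(*port)),
        Event::Connect { .. } => Ok(0),
        Event::Data(bytes) if bytes.is_empty() => Err(HandleError::EmptyPayload),
        Event::Data(bytes) => Ok(bytes.len()),
        Event::Disconnect => Ok(0),
    }
}
```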

oguz-ismail2 2 days ago | parent | prev | next [-]

> Zig is much more suitable, because it actually addresses the readability aspect

How? It doesn't look very different from Rust. In terms of readability Swift does stand out among LLVM frontends, don't know if it is or can be used for systems programming though.

Someone 2 days ago | parent [-]

Apple claims Swift can be used for systems programming, and is (partly) eating its own dogfood by using it in FoundationDB (https://news.ycombinator.com/item?id=38444876) and by providing examples of embedded projects (https://www.swift.org/get-started/embedded/)

I think they are right in that claim, but in making it so, at least some of the code loses some of the readability of Swift. For truly low-level code, you’ll want to give up on classes, may not want to have copy-on-write collections, and may need to add quite a few annotations.

galangalalgol 2 days ago | parent [-]

Swift is very slow relative to Rust or C though. You can also cause segfaults in Swift with a few lines. I don't find any of these languages particularly difficult to read, so I'm not sure why this is listed as a discriminator between them.

saagarjha 2 days ago | parent [-]

But those segfaults will either be memory safe or your lines will contain “unsafe” or “unchecked” somewhere.

galangalalgol a day ago | parent [-]

You can make a fully safe segfault the same way you can in Go: by swapping a base reference between two child types. The data pointer and vft pointer aren't updated atomically, so a thread safety issue becomes a memory safety one.

saagarjha 16 hours ago | parent [-]

This is no longer allowed with strict concurrency

galangalalgol 16 hours ago | parent [-]

When did that happen? Or is it something I have to turn on? I had Claude write a Swift version of the Go version a few months ago and it segfaulted.

Edit: Ah, the global variable I used had a warning, which I didn't notice, that it isn't concurrency safe. So you can compile it, but if you treat warnings as errors you'd be fine.

bcrosby95 2 days ago | parent | prev | next [-]

I would argue logic errors would decrease because you aren't spending as much time worrying about and fixing null ref and other errors.

Tarucho a day ago | parent [-]

can you prove that?

staticassertion 2 days ago | parent | prev | next [-]

Rust is a lot more explicit. I suspect logic bugs will be much less common. It's far easier to model complexity in Rust.

rowanG077 2 days ago | parent | prev [-]

I would expect the opposite. C requires you to deal with extreme design complexity in large systems because the language offers nothing to help.

jiggawatts 2 days ago | parent | prev [-]

The default Mutex struct in Rust makes it impossible to modify the data it protects without holding the lock.

"Each mutex has a type parameter which represents the data that it is protecting. The data can only be accessed through the RAII guards returned from lock and try_lock, which guarantees that the data is only ever accessed when the mutex is locked."

Even if used with more complex operations, the RAII approach means that the example you provided is much less likely to happen.
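A minimal sketch of that idea with `std::sync::Mutex`, loosely mirroring the check-then-operate example quoted earlier in the thread. The `Device` and `State` names are hypothetical. Since the state is only reachable through the guard, the check and the operation naturally happen under the same lock.

```rust
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Idle,
    Ready,
    Done,
}

struct Device {
    // The protected data lives inside the mutex; there is no way to
    // read or write it without going through `lock()`.
    state: Mutex<State>,
}

impl Device {
    fn new() -> Self {
        Device { state: Mutex::new(State::Idle) }
    }

    fn mark_ready(&self) {
        *self.state.lock().unwrap() = State::Ready;
    }

    /// Performs the operation only if the state is `Ready`.
    /// The guard is held across both the check and the update, so no
    /// other thread can change the state in between.
    fn run_if_ready(&self) -> bool {
        let mut guard = self.state.lock().unwrap();
        if *guard == State::Ready {
            *guard = State::Done; // stands in for the real operation
            true
        } else {
            false
        }
        // The guard is dropped here, which releases the lock.
    }
}
```

This does not rule the bug out: copying the state out of the guard, dropping the guard, and then acting on the stale copy still compiles. It is just harder to do by accident, which matches the "much less likely" wording above.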

yencabulator 5 hours ago | parent | prev | next [-]

> It’s also worth noting that Rust doesn’t prevent integer overflow

Add a single line to a single file and you get that enforced.

https://rust-lang.github.io/rust-clippy/stable/index.html#ar...

materielle 2 days ago | parent | prev [-]

I don’t think that the parent comment is saying all of the bugs would have been prevented by using Rust.

But in the listed categories, I’m equally skeptical that none of them would have benefited from Rust even a bit.

johncolanduoni a day ago | parent [-]

That’s not my point - just that “state machine races” is too broad a category to say much about how Rust would or wouldn’t help.

RealityVoid a day ago | parent | prev | next [-]

Why doesn't it surprise me that the CAN bus driver bugs have the longest average lifetime?

apaprocki 2 days ago | parent | prev [-]

> Furthermore, the 19-year-old bug case study is a refcounting error

It always surprised me how the top-of-the line analyzers, whether commercial or OSS, never really implemented C-style reference count checking. Maybe someone out there has written something that works well, but I haven’t seen it.