Before the "rewrite it in Rust" comments take over the thread:

It is worth noting that the class of bugs described here (logic errors in highly concurrent state machines, incorrect hardware assumptions) wouldn't necessarily be caught by the borrow checker. Rust is fantastic for memory safety, but it will not stop you from misunderstanding the spec of a network card or writing a race condition in unsafe logic that interacts with DMA.

That said, if we eliminated the 70% of bugs that are memory safety issues, the SNR for finding these deep logic bugs would improve dramatically. We spend so much time tracing segfaults that we miss the subtle corruption bugs.

▲ aw1621107 2 days ago | parent | next [-]

> It is worth noting that the class of bugs described here (logic errors in highly concurrent state machines, incorrect hardware assumptions)

While the bugs you describe are indeed things that aren't directly addressed by Rust's borrow checker, I think the article covers more ground than your comment implies.

For example, a significant portion (most?) of the article is simply analyzing the gathered data, like grouping bugs by subsystem:

    Subsystem        Bug Count  Avg Lifetime
    drivers/can      446        4.2 years
    networking/sctp  279        4.0 years
    networking/ipv4  1,661      3.6 years
    usb              2,505      3.5 years
    tty              1,033      3.5 years
    netfilter        1,181      2.9 years
    networking       6,079      2.9 years
    memory           2,459      1.8 years
    gpu              5,212      1.4 years
    bpf              959        1.1 years

Or by type:

    Bug Type         Count  Avg Lifetime  Median
    race-condition   1,188  5.1 years     2.6 years
    integer-overflow 298    3.9 years     2.2 years
    use-after-free   2,963  3.2 years     1.4 years
    memory-leak      2,846  3.1 years     1.4 years
    buffer-overflow  399    3.1 years     1.5 years
    refcount         2,209  2.8 years     1.3 years
    null-deref       4,931  2.2 years     0.7 years
    deadlock         1,683  2.2 years     0.8 years

And the section describing common patterns for long-lived bugs (10+ years) lists the following:

> 1. Reference counting errors

> 2. Missing NULL checks after dereference

> 3. Integer overflow in size calculations

> 4. Race conditions in state machines

All of which cover more ground than listed in your comment.

Furthermore, the 19-year-old bug case study is a refcounting error not related to highly concurrent state machines or hardware assumptions.

▲ johncolanduoni 2 days ago | parent | next [-]

It depends what they mean by some of these: are the state machine race conditions logic races (which Rust won’t trivially solve) or data races? If they are data races, are they the kind that Rust will catch (missing atomics/synchronization) or the kind it won’t (bad atomic orderings, etc.)?

It’s also worth noting that Rust doesn’t prevent integer overflow, and it doesn’t panic on it by default in release builds. Instead, the safety model assumes you’ll catch the overflowed number when you use it to index something (a constant source of bugs in unsafe code).
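To make the overflow point concrete, here is a minimal sketch of a size calculation done two ways. The function names are hypothetical, made up for illustration. `wrapping_mul` is used to spell out what a plain `*` silently does in a default release build, while `checked_mul` hands the overflow back to the caller as `None`.

```rust
/// Hypothetical size calculation: element count times element size.
/// `wrapping_mul` makes explicit what `*` does in a default release build
/// (overflow checks off): the result silently wraps around.
fn alloc_len_wrapping(count: u32, elem_size: u32) -> u32 {
    count.wrapping_mul(elem_size)
}

/// Same calculation, but overflow is reported to the caller instead.
fn alloc_len_checked(count: u32, elem_size: u32) -> Option<u32> {
    count.checked_mul(elem_size)
}

fn main() {
    let count: u32 = 0x4000_0001;
    let elem_size: u32 = 4;

    // 0x4000_0001 * 4 = 0x1_0000_0004, which does not fit in 32 bits.
    println!("wrapping: {}", alloc_len_wrapping(count, elem_size));
    println!("checked:  {:?}", alloc_len_checked(count, elem_size));
}
```

The wrapped value is a perfectly valid small number, which is why it only becomes a memory safety problem later, when unsafe code trusts it as a length.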

I’m bullish about Rust in the kernel, but it will not solve all of the kinds of race conditions you see in that kind of context.

▲ aw1621107 2 days ago | parent | next [-]

> are the state machine race conditions logic races (which Rust won’t trivially solve) or data races? If they are data races, are they the kind of ones that Rust will catch (missing atomics/synchronization) or the ones it won’t (bad atomic orderings, etc.).

The example given looks like a generalized example:

    spin_lock(&lock);
    if (state == READY) {
        spin_unlock(&lock);
        // window here where another thread can change state
        do_operation();  // assumes state is still READY
    }

So I don't think you can draw strong conclusions from it.

> I’m bullish about Rust in the kernel, but it will not solve all of the kinds of race conditions you see in that kind of context.

Sure, all I'm trying to say is that "the class of bugs described here" covers more than what was listed in the parentheses.

▲ rjzzleep 2 days ago | parent | next [-]

I'd argue that while null ref and those classes of bugs may decrease, logic errors will increase. Rust is not an extraordinarily readable language in my opinion, especially in the kernel, where the kernel has its own data structures. IMHO Apple did it right in their kernel stack: they have a restricted subset of C++ that you can write drivers with.

Which is also why in my opinion Zig is much more suitable, because it actually addresses the readability aspect without bringing huge complexity with it.

▲ aw1621107 2 days ago | parent | next [-]

> I'd argue, that while null ref and those classes of bugs may decrease, logic errors will increase.

To some extent that argument only makes sense; if you can find a way to greatly reduce the incidence of non-logic bugs while not addressing other bugs then of course logic bugs would make up a greater proportion of what remains.

I think it's also worth considering the fact that while Rust doesn't guarantee that it'll catch all logic bugs, it (like other languages with more "advanced" type systems) gives you tools to construct systems that can catch certain kinds of logic bugs. For example, you can write lock types in a way that guarantees at compile time that you'll take locks in the correct order, avoiding deadlocks [0]. Another example is the typestate pattern [1], which can encode state machine transitions in the type system to ensure that invalid transitions and/or operations on invalid states are caught at compile time.
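As a rough illustration of the typestate idea (a minimal sketch, not taken from either linked article; `Device`, `Idle`, and `Ready` are hypothetical names), the state can live in a type parameter so that an operation on the wrong state is a compile error rather than a runtime check:

```rust
use std::marker::PhantomData;

// Hypothetical marker types: the state lives in the type, not in a field.
struct Idle;
struct Ready;

struct Device<State> {
    id: u32,
    _state: PhantomData<State>,
}

impl Device<Idle> {
    fn new(id: u32) -> Self {
        Device { id, _state: PhantomData }
    }

    // Consumes the idle handle, so the old state cannot be used afterwards.
    fn init(self) -> Device<Ready> {
        Device { id: self.id, _state: PhantomData }
    }
}

impl Device<Ready> {
    // This method only exists on `Device<Ready>`.
    fn do_operation(&self) -> String {
        format!("device {} operated", self.id)
    }
}

fn main() {
    let dev = Device::<Idle>::new(7);
    // dev.do_operation(); // would not compile: no such method on Device<Idle>
    let dev = dev.init();
    println!("{}", dev.do_operation());
}
```

This only covers transitions that are known at compile time; state that changes under concurrency still needs a lock or atomics around it.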

These, in turn, can lead to higher-order benefits as offloading some checks to the compiler means you can devote more attention to things the compiler can't check (though to be fair this does seem to be more variable among different programmers).

> Rust is not an extraordinary readable language in my opinion, especially in the kernel where the kernel has its own data structures.

The above notwithstanding, I'd imagine it's possible to think up scenarios where Rust would make some logic bugs more visible and others less so; only time will tell which prevails in the Linux kernel, though based on what we know now I don't think there's strong support for the notion that logic bugs in Rust are substantially more common than they have been in C, let alone because of readability issues.

Of course there's the fact that readability is very much a personal thing and is a multidimensional metric to boot (e.g., a property that makes code readable in one context may simultaneously make code less readable in another). I don't think there would be a universal answer here.

[0]: https://lwn.net/Articles/995814/

[1]: https://cliffle.com/blog/rust-typestate/

▲ viraptor 2 days ago | parent | prev | next [-]

Maybe increase as a ratio, but not absolute. There are various benefits of Rust that affect other classes of issues: fancy enums, better errors, ability to control overflow behaviour and others. But for actual experience, check out what the kernel code developer has to say: https://xcancel.com/linaasahi/status/1577667445719912450

▲ oguz-ismail2 2 days ago | parent | prev | next [-]

> Zig is much more suitable, because it actually addresses the readability aspect

How? It doesn't look very different from Rust. In terms of readability Swift does stand out among LLVM frontends, don't know if it is or can be used for systems programming though.

▲ Someone 2 days ago | parent [-]

Apple claims Swift can be used for systems programming, and is (partly) eating its own dogfood by using it in FoundationDB (https://news.ycombinator.com/item?id=38444876) and by providing examples of embedded projects (https://www.swift.org/get-started/embedded/)

I think they are right in that claim, but in making it so, at least some of the code loses some of the readability of Swift. For truly low-level code, you’ll want to give up on classes, may not want to have copy-on-write collections, and may need to add quite a few annotations.

▲ galangalalgol 2 days ago | parent [-]

Swift is very slow relative to Rust or C though. You can also cause segfaults in Swift with a few lines. I don't find any of these languages particularly difficult to read, so I'm not sure why this is listed as a discriminator between them.

▲ saagarjha 2 days ago | parent [-]

But those segfaults will either be memory safe or your lines will contain “unsafe” or “unchecked” somewhere.

▲ galangalalgol a day ago | parent [-]

You can make a fully safe segfault the same way you can in go. Swapping a base reference between two child types. The data pointer and vft pointer aren't updated atomically, so a thread safety issue becomes a memory safety one.

▲ saagarjha 16 hours ago | parent [-]

This is no longer allowed with strict concurrency

▲ galangalalgol 16 hours ago | parent [-]

When did that happen? Or is it something I have to turn on? I had Claude write a Swift version of the Go version a few months ago and it segfaulted.

Edit: Ah, the global variable I used had a warning I didn't notice, saying that it isn't concurrency safe. So you can compile it, but if you treat warnings as errors you'd be fine.

▲ bcrosby95 2 days ago | parent | prev | next [-]

I would argue logic errors would decrease because you aren't spending as much time worrying about and fixing null ref and other errors.

▲ Tarucho a day ago | parent [-]

Can you prove that?

▲ staticassertion 2 days ago | parent | prev | next [-]

Rust is a lot more explicit. I suspect logic bugs will be much less common. It's far easier to model complexity in Rust.

▲ 2 days ago | parent | prev | next [-]

[deleted]

▲ rowanG077 2 days ago | parent | prev [-]

I would expect the opposite. C requires you to deal with extreme design complexity in large systems because the language offers nothing to help.

▲ jiggawatts 2 days ago | parent | prev [-]

The default Mutex struct in Rust makes it impossible to modify the data it protects without holding the lock.

"Each mutex has a type parameter which represents the data that it is protecting. The data can only be accessed through the RAII guards returned from lock and try_lock, which guarantees that the data is only ever accessed when the mutex is locked."

Even if used with more complex operations, the RAII approach means that the example you provided is much less likely to happen.
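A minimal sketch of what that looks like for the READY check, assuming userspace `std::sync::Mutex` rather than the kernel's own lock types (`State`, `run_if_ready`, and `count_winners` are hypothetical names for illustration). The check and the action share one guard, so there is no window between them:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Debug, PartialEq)]
enum State {
    Ready,
    Done,
}

// Hypothetical operation: the state can only be read or written through the
// guard, and the guard is held across both the check and the update.
fn run_if_ready(state: &Mutex<State>) -> bool {
    let mut guard = state.lock().unwrap();
    if *guard == State::Ready {
        *guard = State::Done; // still holding the lock here
        true
    } else {
        false
    }
    // `guard` is dropped when the function returns, releasing the lock.
}

// Spawn several threads racing on the same state and count how many of them
// observed READY and performed the operation.
fn count_winners(threads: usize) -> usize {
    let state = Arc::new(Mutex::new(State::Ready));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let state = Arc::clone(&state);
            thread::spawn(move || run_if_ready(&state))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|&won| won)
        .count()
}

fn main() {
    println!("threads that ran the operation: {}", count_winners(8));
}
```

Nothing stops you from dropping the guard early and then acting on a stale answer, so this makes the bug harder to write rather than impossible.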

▲ yencabulator 5 hours ago | parent | prev | next [-]

> It’s also worth noting that Rust doesn’t prevent integer overflow

Add a single line to a single file and you get that enforced.

https://rust-lang.github.io/rust-clippy/stable/index.html#ar...

▲ materielle 2 days ago | parent | prev [-]

I don’t think that the parent comment is saying all of the bugs would have been prevented by using Rust.

But in the listed categories, I’m equally skeptical that none of them would have benefited from Rust even a bit.

▲ johncolanduoni a day ago | parent [-]

That’s not my point - just that “state machine races” is a too-broad category to say much about how Rust would or wouldn’t help.

▲ RealityVoid a day ago | parent | prev | next [-]

Why doesn't it surprise me that the CAN bus driver bugs have the longest average lifetime?

▲ apaprocki 2 days ago | parent | prev [-]

> Furthermore, the 19-year-old bug case study is a refcounting error

It always surprised me how the top-of-the-line analyzers, whether commercial or OSS, never really implemented C-style reference count checking. Maybe someone out there has written something that works well, but I haven’t seen it.

▲ johncolanduoni 2 days ago | parent | prev | next [-]

This is I think an under-appreciated aspect, both for detractors and boosters. I take a lot more “risks” with Rust, in terms of not thinking deeply about “normal” memory safety and prioritizing structuring my code to make the logic more obviously correct. In C++, modeling things so that the memory safety is super-straightforward is paramount - you’ll almost never see me store a std::string_view anywhere for example. In Rust I just put &str wherever I please, if I make a mistake I’ll know when I compile.

▲ anon-3988 2 days ago | parent | prev | next [-]

> It is worth noting that the class of bugs described here (logic errors in highly concurrent state machines, incorrect hardware assumptions) wouldn't necessarily be caught by the borrow checker. Rust is fantastic for memory safety, but it will not stop you from misunderstanding the spec of a network card or writing a race condition in unsafe logic that interacts with DMA.

Rust is not just about memory safety. It also has algebraic data types and RAII, among other things, which will greatly help in catching these kinds of silly logic bugs.
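A small sketch of the algebraic data type point (the `Conn` type and its variants are hypothetical, picked only for illustration): each variant carries just the data that is valid in that state, and `match` has to handle every variant.

```rust
// Hypothetical connection state. A peer address only exists in the
// `Connected` variant, so "connected without a peer" cannot be represented.
enum Conn {
    Closed,
    Connecting { attempts: u8 },
    Connected { peer: String },
}

// The match must cover every variant; adding a new variant later is a
// compile error here until it is handled.
fn describe(conn: &Conn) -> String {
    match conn {
        Conn::Closed => "closed".to_string(),
        Conn::Connecting { attempts } => format!("connecting (attempt {attempts})"),
        Conn::Connected { peer } => format!("connected to {peer}"),
    }
}

fn main() {
    let states = [
        Conn::Closed,
        Conn::Connecting { attempts: 2 },
        Conn::Connected { peer: "10.0.0.1".to_string() },
    ];
    for s in &states {
        println!("{}", describe(s));
    }
}
```

The C equivalent would typically be an enum tag plus a struct or union whose fields are all reachable in every state, with nothing forcing the `switch` to be exhaustive.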

▲ JuniperMesos 2 days ago | parent [-]

Yeah, Rust gives you much better tools to write highly concurrent state machines than C does, and most of those tools are in the type system and not the borrow checker per se. This is exactly what the Typestate pattern (https://docs.rust-embedded.org/book/static-guarantees/typest...) is good at modeling.

▲ the8472 2 days ago | parent | prev | next [-]

The concurrent state machine example looks like a locking error? If the assumption is that it shouldn't change in the meantime, doesn't it mean the lock should continue to be held? In that case rust locks can help, because they can embed the data, which means you can't even touch it if it's not held.

▲ kubb 2 days ago | parent | prev | next [-]

It’s hilarious that you feel the need to preemptively take control of the narrative in anticipation of the Rust people that you fear so much.

Is this an irrational fear, I wonder? Reminds me of methods used in the political discourse.

▲ Bridged7756 2 days ago | parent | next [-]

People who make those kinds of remarks should be called out and shunned. The Rust community is tired of discrimination and being the butt of jokes. All the other inferior languages prey on its minority status, despite Rust being able to solve all their problems. I take offense to these remarks, I don't want my kids to grow up as Rustaceans in such a caustic society.

▲ irishcoffee 2 days ago | parent | prev [-]

> It’s hilarious that you feel the need to preemptively take control of the narrative in anticipation of the Rust people that you fear so much.

> Is this an irrational fear, I wonder? Reminds me of methods used in the political discourse.

In a sad sort of way, I think it's hilarious that HN users have been so completely conditioned to expect Rust evangelism any time a topic like this comes up that they wanted to get ahead of it.

Not sure who it says more about, but it sure does say a whole lot.

▲ kubb 2 days ago | parent [-]

I don’t think evangelism is necessary anymore. Rust adoption is now a matter of time.

▲ john01dav a day ago | parent | prev | next [-]

Rust has more features than just the borrow checker. For example, it has a richer type system than C or C++, which a good developer can use to detect some logic mistakes at compile time. This doesn't eliminate bugs, but it can catch some very early.

▲ wordisside a day ago | parent [-]

[dead]

▲ aw1621107 a day ago | parent [-]

> But unsafe Rust, which is generally more often used in low-level code, is more difficult than C and C++.

I think "is" is a bit too strong. "Can be", sure, but I'm rather skeptical that all uses of unsafe Rust will be more difficult than writing equivalent C/C++ code.

▲ wordisside a day ago | parent [-]

[flagged]

▲ pjc50 2 days ago | parent | prev | next [-]

> race condition in unsafe logic that interacts with DMA

It's worth noting that if you write memory safe code but mis-program a DMA transfer, or trigger a bug in a PCIe device, it's possible for the hardware to give you memory-safety problems by splatting invalid data over a region that's supposed to contain something else.

▲ BobbyTables2 a day ago | parent | prev | next [-]

I’ve seen too many embedded drivers written by well known companies not use spinlocks for data shared with an ISR.

At one point, I found serious bugs (crashing our product) that had existed for over 15 years. (And that was 10 years ago).

Rust may not be perfect, but it gives me hope that some classes of stupidity will either be avoided or made visible (like every function being unsafe because the author was a complete idiot).

▲ mgaunard 2 days ago | parent | prev | next [-]

I don't think 70% of bugs are memory safety issues.

In my experience it's closer to 5%.

▲ cogman10 2 days ago | parent | next [-]

I believe this is where that fact comes from [1]

Basically, 70% of high severity bugs are memory safety.

[1] https://www.chromium.org/Home/chromium-security/memory-safet...

▲ saagarjha 2 days ago | parent | next [-]

High severity security issues.

▲ mgaunard 2 days ago | parent | prev [-]

Right, which is a measure which is heavily biased towards memory safety bugs.

▲ stonemetal12 a day ago | parent | prev | next [-]

Using the data provided, memory safety issues (use-after-free, memory-leak, buffer-overflow, null-deref) account for 67% of their bugs. If we include refcount, it is just over 80%.
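For anyone who wants to check the arithmetic, here is a quick sketch using the counts from the "by type" table quoted upthread. It assumes the eight listed types are the whole population and don't overlap, and which types count as "memory safety" is just the grouping used in this comment.

```rust
// Counts copied from the "by type" table quoted earlier in the thread.
const BUGS: [(&str, u32); 8] = [
    ("race-condition", 1188),
    ("integer-overflow", 298),
    ("use-after-free", 2963),
    ("memory-leak", 2846),
    ("buffer-overflow", 399),
    ("refcount", 2209),
    ("null-deref", 4931),
    ("deadlock", 1683),
];

// Percentage of all listed bugs whose type is in `kinds`.
fn share(kinds: &[&str]) -> f64 {
    let total: u32 = BUGS.iter().map(|&(_, n)| n).sum();
    let picked: u32 = BUGS
        .iter()
        .filter(|(kind, _)| kinds.contains(kind))
        .map(|&(_, n)| n)
        .sum();
    100.0 * f64::from(picked) / f64::from(total)
}

fn main() {
    let mem = ["use-after-free", "memory-leak", "buffer-overflow", "null-deref"];
    let mem_and_refcount = [
        "use-after-free",
        "memory-leak",
        "buffer-overflow",
        "null-deref",
        "refcount",
    ];
    println!("memory safety:  {:.1}%", share(&mem));
    println!("plus refcount:  {:.1}%", share(&mem_and_refcount));
}
```

That is 11,139 of 16,517 for the first group and 13,348 of 16,517 with refcount added.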

▲ IshKebab 2 days ago | parent | prev | next [-]

70% of security vulnerabilities are due to memory safety. Not all bugs.

▲ tester756 2 days ago | parent | prev | next [-]

That's the figure that Microsoft and Google found in their code bases.

▲ redeeman 2 days ago | parent | prev | next [-]

probably quite a bit less than 5%, however, they tend to be quite serious when they happen

▲ mgaunard 2 days ago | parent [-]

Only serious if you care about protecting from malicious actors running code on the same host.

▲ redeeman a day ago | parent [-]

You don't? I would imagine people who run, for example, a browser would have quite an interest in that.

▲ mgaunard 13 hours ago | parent [-]

Browsers are sandboxed, and working on the web browsers themselves is a very small niche, as is working on kernels.

Software increasingly runs either on dedicated infrastructure or virtual ones; in those cases there isn't really a case where you need to worry about software running on the same host trying to access the data.

Sure, it's useful to have some restrictions in place to track what needs access to what resource, but in practice they can always be circumvented for debugging or convenience of development.

▲ yencabulator 5 hours ago | parent [-]

Browsers are sandboxed by the kernel, and we're talking about bugs in the kernel here...

▲ nibman 2 days ago | parent | prev [-]

[dead]

▲ ramon156 2 days ago | parent | prev | next [-]

You're fighting air

▲ marcosdumay a day ago | parent | prev | next [-]

Eh... Removing concurrency bugs is one of the main selling points for Rust. And algebraic types are a real boost for situations where you have lots of assumptions.

▲ keybored 2 days ago | parent | prev | next [-]

No other top-level comments have since mentioned Rust[1] and TFA mentions neither Rust nor topics like memory safety. It’s just plain bugs.

The Rust phantom zealotry is unfortunately real.

[1] Aha, but the chilling effect of dismissing RIR comments before they are even posted...

▲ staticassertion 2 days ago | parent [-]

Yes, I saw this last night and was confused because only one comment mentioned Rust, and it was deleted I think. I nearly replied "you're about to prompt 1,000 rust replies with this" and here's what I woke up to lol

▲ IshKebab 2 days ago | parent | prev | next [-]

Rust has other features that help prevent logic errors. It's not just C plus a borrow checker.

▲ paulddraper 2 days ago | parent | prev | next [-]

Rust would prevent a number of bugs, as it can model state machine guarantees as well.

Rewriting it all in Rust is extremely expensive, so it won't be done (soon).

▲ wiz21c 2 days ago | parent [-]

Expensive because of: 1/ a rewrite is never easy 2/ Rust is specifically tough (because it catches errors and forces you to think about them for real, and because it makes some constructs (linked lists) really hard to implement) for kernel/close-to-kernel code?

▲ IshKebab 2 days ago | parent [-]

Both I'd say. Rust imposes more constraints on the structure of code than most languages. The borrow checker really likes ownership trees whereas most languages allow any ownership graph no matter how spaghetti it is.

As far as I know that's why Microsoft rewrote TypeScript in Go instead of Rust.

▲ wiz21c 19 hours ago | parent [-]

I've been using rust for several years now and I like the way you explain the essence of the issue: tree instead of spaghetti :-)

However: https://www.reddit.com/r/typescript/comments/wbkfsh/which_pr...

so looks like it's not written in go :-)

▲ IshKebab 16 hours ago | parent [-]

> so looks like it's not written in go :-)

That post is three years old, before the rewrite.

▲ lynx97 a day ago | parent | prev | next [-]

Thanks for raising this. It feels like evangelists paint a picture of Rust basically being magic which squashes all bugs. My personal experience is rather different. When I gave Rust a whirl a few years ago, I happened to play with mio for some reason I can no longer remember. I had some basic PoC code which didn't work as expected. While I'm no Rust expert, I am too much of a fan of the scratch-your-own-itch philosophy, so I started to read the mio source code. After 5 minutes, I found the logic bug. Submitted a PR and moved on. But what stayed with me was this insight: if someone like me can casually find and fix a Rust library bug, propaganda is probably doing more work than expected. The Rust craze feels a bit like Java. Just because a language babysits the developer doesn't automatically mean better quality. At the end of the day, the dev needs to juggle the development process. Sure, tools are useful, but overstating safety is likely a route better avoided.

▲ nibman 2 days ago | parent | prev | next [-]

[dead]

▲ DobarDabar 2 days ago | parent | prev [-]

[dead]