antirez 3 days ago

"But almost all programs have paths that crash, and perhaps the density of crashes will be tolerable."

This is a very odd statement. Mature C programs written by professional coders (Redis is a good example) basically never crash in the experience of users. Crashing, in such programs, is a rare occurrence mostly obtained by attackers on purpose, looking for code paths that generate a memory error and that, if the program is used as it should be, are never reached.

This does not mean that C code never segfaults: it happens, especially when developed without care and the right amount of testing. But the code that is the most security sensitive, like C Unix servers, is high quality and crashes are mostly a security problem and a lot less a stability problem.

jgraham 2 days ago | parent | next [-]

Notice that it says "almost all programs" and not "almost all _C_ programs".

I think if you understand the meaning of "crash" to include any kind of unhandled state that causes the program to terminate execution then it includes things like unwrapping a None value in Rust or any kind of uncaught exception in Python.
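A minimal Rust sketch of that kind of termination (illustrative only): the program type-checks fine, and the missing value is only discovered at run time, when `unwrap()` panics. Here `catch_unwind` is used just so the panic can be observed in-process:

```rust
fn main() {
    let x: Option<i32> = None;
    // This compiles without complaint; the absent value is only
    // discovered at run time, when unwrap() panics and would
    // normally terminate the program.
    let caught = std::panic::catch_unwind(|| x.unwrap());
    assert!(caught.is_err()); // the unwrap did panic
}
```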

That interpretation makes sense to me in terms of the point he's making: Fil-C replaces memory unsafety with program termination, which is strictly worse than e.g. (safe) Rust, which replaces memory unsafety with a compile error. But it's also true that most programs (irrespective of language, and including Rust) have some codepaths in which the program can terminate because the assumed invariants aren't upheld, so in practice that's often acceptable behaviour, as long as the defect rate is low enough.

Of course there is also a class of programs for which that behaviour is not acceptable, and in those cases Fil-C (along with most other languages, including Rust absent significant additional tooling) isn't appropriate.

pizlonator 2 days ago | parent [-]

> Rust which replaces memory unsafety with a compile error

Rust uses panics for out-of-bounds access protection.

The benefit of dynamic safety checking is that it's more precise. There's a large class of valid programs that are not unsafe that will run fine in Fil-C but won't compile in Rust.

thomasmg 3 days ago | parent | prev | next [-]

I don't think it's an odd statement. It's not about segfaults, but about use-after-free (and similar) bugs, which don't crash in C, but do crash in Fil-C. With Fil-C, if there is such a bug, the program will crash, but if the density of such bugs is low enough, that is tolerable: it will just crash the program, but will not cause an expensive and urgent CVE ticket. The bug itself may still need to be fixed.

The paragraph refers to detecting such bugs during compilation versus crashing at runtime. The "almost all programs have paths that crash" means all programs have a few bugs that can cause crashes, and that's true. Professional coders do not attempt to write 100% bug-free code, as that wouldn't be an efficient use of their time. Now the question is: should professional coders convert their (existing) C code to e.g. Rust (where the compiler will likely detect the bug), or should they use Fil-C, and so save the time needed to convert the code?

zozbot234 3 days ago | parent | next [-]

Doesn't Fil-C use a garbage collector to address use-after-free? For a real use-after-free to be possible there must be some valid pointer to the freed allocation, in which case the GC just keeps it around and there's no overt crash.

thomasmg 2 days ago | parent [-]

Yes, Fil-C uses some kind of garbage collector. But it can still detect use-after-free: In the 'free' call, the object is marked as free. In the garbage collection (in the mark phase), if a reference is detected to an object that was freed, then the program panics. Sure, it is also possible to simply ignore the 'free' call - in which case you "just" have a memory leak. I don't think that's what Fil-C does by default however. (This would be more like the behavior of the Boehm GC library for C, if I understand correctly.)

jitl 2 days ago | parent | next [-]

I don’t think that’s how it works. Once an object is freed, any access will crash. You’re allowed to still have a reference to it.

thomasmg 2 days ago | parent [-]

Ok, you are right. My point is: yes, it is possible to panic on use-after-free with Fil-C. With Fil-C, a live reference to a freed object can be detected.

cyberax 2 days ago | parent | prev [-]

A free()-d object that is NOT garbage-collected during the next collection is a bug in itself.

pizlonator 2 days ago | parent | next [-]

The Fil-C GC will only GC a free'd object if it succeeds at repointing all capabilities to it to point at the free singleton instead.

Don't worry, it's totally sound.

thomasmg 2 days ago | parent | prev [-]

I'm not sure what you mean. Do you mean there is a bug _in the garbage collection algorithm_ if the object is not freed in the very next garbage collection cycle? Well, it depends: the garbage collector could defer collection of some objects until memory is low. Multi-generational garbage collection algorithms often do this.

cyberax a day ago | parent [-]

You can defer the actual freeing of the object until at least one GC pass finishes. Then alert if any of them are still reachable.

cesarb 2 days ago | parent | prev [-]

> it will just crash the program, but will not cause an expensive and urgent CVE ticket.

Unfortunately, security hysteria also treats any crash as "an expensive and urgent CVE ticket". See, for instance, ReDoS, where auditors will force you to update a dependency even if there's no way for a user to provide the vulnerable input (for instance, because it's fixed in the configuration file).

thomasmg a day ago | parent [-]

I agree security issues are often hyped nowadays. I think this is often due to two factors: (A) security researchers get more money if they can convince people a CVE is worse than it is, so of course they make it sound extremely bad. (B) Security "review" teams in software companies do the least amount of work possible, so it's just a binary "is a dependency with a vulnerability used, yes/no", and then they force the engineering team to update the dependency, even though it's useless. I have seen (and was involved in) a number of such cases. This wastes a lot of time. Long term, it can mean the engineering team will try to reduce its dependencies, which is not the worst of outcomes.

mjw1007 3 days ago | parent | prev | next [-]

I think what you've written is pretty much what the "almost all programs have paths that crash" was intended to convey.

I think "perhaps the density of crashes will be tolerable" means something like "we can reasonably hope that the crashes from Fil-C's memory checks will only be of the same sort, that aren't reached when the program is used as it should be".

baq 3 days ago | parent | prev | next [-]

I think the point is that Fil-C makes programs crash which didn't crash before because use-after-free didn't trigger a segfault. If anything, I'd cite Redis as an example that you can build a safe C program if you go above and beyond in engineering effort... most software doesn't, sadly.

zozbot234 3 days ago | parent | next [-]

Redis uses a whole lot of fiddly data structures that turn out to involve massive amounts of unsafe code even in Rust. You'd need to use something like Frama-C to really prove it safe beyond reasonable doubt. (Or the Rust equivalents that are currently in the works, and being used in an Amazon-funded effort to meticulously prove soundness of the unsafe code in libstd.) Compiling it using Fil-C is a nice academic exercise but not really helpful, since the whole point of those custom data structures is peak performance.

throwawaymaths 2 days ago | parent | prev [-]

sel4 is the example of building a safe C program if you go above and beyond in effort.

It's provably safer than Rust, for example.

gf000 2 days ago | parent [-]

There are obviously multiple levels of correctness. Formal verification is just the very top of that spectrum, but it does come at extraordinary effort.

throwawaymaths 2 days ago | parent [-]

did i read "above and beyond"

heisenbit 2 days ago | parent | prev | next [-]

It is a question of probability and effort. My personal estimation rule for my type of projects is that it takes 3 times longer to get from a prototype to something I‘m comfortable having others use, and another such factor to get to an early resemblance of a product. In a recent interview I read, an AI expert said that each additional 9 of reliability takes the same effort.

Most software written does not serve a serious nation-level user base but caters to a relatively small set of users. The effort spent eradicating errors needs to be justified against the cost of workarounds, remediation work and customer impact. "Will not be fixed" can be a rational decision.

razighter777 2 days ago | parent | prev | next [-]

I think the focus should be on tools with high surface area that enforce security boundaries, especially those where performance is not so important, like sudo, openssh, polkit, and PAM modules. That would make a lot more sense than these half-baked Rust rewrites that just take away features. (I'm biased: I personally had a backup script broken by uutils.) I think rewrites in Rust need 100% bit-for-bit feature parity before replacing the battle-tested existing tools in the C userland. I say this as someone who writes Rust security tools for Linux.

kstrauser 2 days ago | parent | prev | next [-]

A lot of my programs crash, and that’s a deliberate choice. If you call one of them like “./myprog.py foo.txt”, and foo.txt doesn’t exist, it’ll raise a FileNotFoundError and fail with a traceback. Thing is, that’s desirable here. I could wrap that in a try/except block, but I’d either be adding extraneous info (“print(‘the file does not exist’); raise”) or throwing away valuable info by swallowing the traceback so the user doesn’t see the context of what failed.

My programs can’t do anything about that situation, so let it crash.

Same logic for:

* The server in the config file doesn’t exist.

* The given output file has bad permissions.

* The hard drive is full.

Etc. And again, that’s completely deliberate. There’s nothing I can do in code to fix those issues, so it’s better to fail with enough info that the user can diagnose and fix the problem.
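A sketch of that style (my illustration, with a made-up line-counting task): no try/except at all, so a missing file terminates with a traceback that already names the path.

```python
import sys

def count_lines(path):
    # Deliberately no try/except: if `path` doesn't exist, the
    # uncaught FileNotFoundError terminates the program with a
    # traceback that already names the missing file.
    with open(path) as f:
        return sum(1 for _ in f)

if __name__ == "__main__":
    print(count_lines(sys.argv[1]))
```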

That was in Python. I do the same in Rust, again, deliberately. While of course we all handle the weird cases we’re prepared to handle, I definitely write most database calls like “foo = db.exec(query)?” because if PostgreSQL can’t execute the query, the safest option is to panic instead of foolhardily trying to get back to the last known safe state.

And of course that’s different for different use cases. If you’re writing a GUI app, it makes much more sense to pop up a dialog and make the user go fix the issue before retrying.

estebank 2 days ago | parent [-]

That strategy is ok if the expected user is a fellow developer. For anyone else, a backtrace is spilling your guts on the floor and expecting the user to clean up. A tool intended for wider usage should absolutely detect the error condition and explain the problem as succinctly as possible, with all the information necessary for a human to be able to solve the issue. Backtrace details are extraneous information, only useful for a software developer familiar with the codebase. There's of course a difference when talking about unexpected incorrect state, something like a fatal filesystem or allocation error that shouldn't happen unless the environment is in an invalid state; a nice human-readable error is then not as necessary.

theamk 2 days ago | parent [-]

Hard disagree, esp re "only useful for a software developer familiar with the codebase". Too many "user-friendly" apps create "helpful" error messages that hide essential information.

"Cannot open data file: Not found" and that's it - no more context (such as a filename). Even for a user with no coding experience, this is absolutely useless: you cannot find a good explanation for it on Google. A backtrace might look ugly, but at least it would have a much higher chance of pointing to a useful forum post. And now with AI advances, AI can analyze backtraces and sometimes give an explanation (not very often, but there are no alternatives...)

So by all means, add a nice, human-readable error message for the few common cases the user is likely to encounter, such as "internet down" or "wrong type of input file"... but leave backtraces on for all other unexpected cases, like "server returned nonsense", "out of disk space", etc....
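That policy might look like this (a sketch, with hypothetical names): translate the one error a user is likely to hit into a short message that still carries the essential detail (the path), and let everything else propagate with a full traceback.

```python
import sys

def read_input(path):
    # Common, user-facing case: short message, but keep the
    # essential detail (the filename) in it.
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        sys.exit(f"error: input file not found: {path}")
    # Any other exception (permissions, disk errors, ...) is
    # left unhandled on purpose, so the full traceback survives.
```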

estebank 2 days ago | parent [-]

A backtrace won't give you the values of variables needed to identify that.

I did specify "succinctly as possible with all the information necessary for a human to be able to solve the issue". An error that doesn't have "all the information necessary" is a bad error. It can be worse than a backtrace. That doesn't mean a backtrace is good.

zbentley 2 days ago | parent | prev | next [-]

> Mature C programs written by professional coders (Redis is a good example) basically never crash in the experience of users

That is a very difficult assertion to validate. It might well be true! But so many conversations about memory safety and C/C++ devolve into assertions ranging from “get gud” at one extreme to “change platforms to one that avoids certain errors” at the other.

Without data, even iffy data, those groups talk past each other. Are memory-error CVE counts on C projects the data we need here? Is there some other quantitative measure of real world failures that occur due to memory unsafety?

This is all by way of saying that I’d love to see some numbers there. That’s not on you, or meant to question your claim. As you implied, errors in code don’t always translate to errors in behavior for users.

It just always sucks to talk about this because broad-spectrum quantitative data on software error rates and their causes is lacking.

jancsika 2 days ago | parent [-]

> That is a very difficult assertion to validate.

Keep in mind he's limited his assertion to UX. That narrow point is almost certainly true in the case of his C codebase.

But read the rest-- he literally wrote how security researchers find memory safety errors in C codebases!

Dollars to donuts he came up with this UX-on-accident vs. security-researcher-on-purpose bug dichotomy in his head as a response to some actual CVE in his own C codebase.

In short, he's agreeing with the research that led to programming languages like Rust in the first place. And even though he's found an odd way to agree, there's no assertion to validate here (at least wrt security).

Edit: clarifications

torginus 2 days ago | parent | prev | next [-]

I heard this argument about Rust vs. C: Rust might be memory safe, but the reason memory safety issues are so prominent in C programs is that basically every other kind of problem has been fixed throughout their lifetime, so these are the only kind of issues that remain, both in terms of security and stability.

This is very much not the case for programs that are much newer: even if they are written in Rust, they still need years of maturation before they reach the quality of older C programs, as Rust programs suffer from non-memory-safety issues just as much. That's why just rewriting things in Rust isn't a panacea.

The perfect example of this is the Rust coreutils drama that has been going on.

wat10000 2 days ago | parent | next [-]

I don't agree with that assessment at all. The reason memory safety issues are so prominent is that they are extremely likely to be exploitable. Of course you can write exploitable bugs in any language, but most bug classes are unlikely to be exploitable. A bug that always crashes is about a trillion times less severe than a bug that allows someone else to take control of your computer.

gf000 2 days ago | parent | prev [-]

I can only quote (off the top of my head) the Android team's findings that extending a C++ codebase with Rust cut down significantly on the number of memory-safety-related issues. The reasoning was that since the stable C++ codebase was no longer actively changed, only patched, and new features were implemented in Rust, the C++ codebase could go through a stabilization phase where almost all safety issues are found.

zozbot234 3 days ago | parent | prev [-]

How many "mature C programs" try to recover in a usable way when malloc() returns NULL? That's a crash - a well-behaved one (no UB involved) hence not one that would be sought by most attackers other than a mere denial of service - but still a crash.

okanat 3 days ago | parent | next [-]

On 64-bit systems (especially Linux ones) malloc almost never returns NULL but keeps overallocating (aka overcommitting). You don't get out-of-memory errors / kills until you actually access the memory.

sibellavia 3 days ago | parent [-]

Exactly. Also, it is extremely rare.

1718627440 2 days ago | parent | prev [-]

> when malloc() returns NULL? That's a crash - a well-behaved one (no UB involved)

Wrong, dereferencing a NULL pointer is UB.

sph 2 days ago | parent [-]

Which on UNIXes is a crash because the zero page is unmapped so you get a SIGSEGV

1718627440 a day ago | parent [-]

Unless the compiler optimized the access away, or replaced it with a different address.