Remix.run Logo
bbminner an hour ago

I am far from a security expert, but from the number of "we missed a single line C check across files during refactoring" critical security bugs discovered on a regular basis these days, the whole premise of a "giant secure open source C codebase" seems questionable. It is not specific to C of course, but invariants are arguably even harder to enforce and track consistently (esp under changes to code) in C. Unsure if FP with invariants encoded in types is a practically feasible scalable solution either. Model checking? [LLM] fuzzing? Fewer primitives with clear boundaries? Is that how seLinux was "checked"?

fsddfsdfssdf an hour ago | parent | next [-]

While I can see the shortcomings of C and generally don't recommend it for new projects I don't see this particular bug as a good example of something Rust's borrow checker or some other language's type system will catch. I don't think even static analyzers can catch this.

It's basically something like this:

original: DoTheThing()

new: DoTheThingSlightlyDifferentButKeepMyCredentialsAlive()

fix: DoTheThingSlightlyDifferentButDoInFactNOTKeepMyCredentialsAlive()

In my experience a substantial portion of gnarly bugs come down to a violation of a high-level system invariant and those do not strike me as something that can be automated. Even with something like Lean you can prove your program satisfies certain properties but you need to have thought about those properties in the first place. The proof doesn't discover the invariant for you.

If you'd had thought about the relevant security property you could have written a regression test for it which is not hard. IMO the really hard part isn't expressing the implementation safely, but it's the realization that this was a property the implementation needed to preserve.

bbminner 19 minutes ago | parent | next [-]

I agree re Rust vs C - this is not (only) a language issue. What would (roughly) the invariant be here?

In another thread comment below i argue that maybe the system (OS) itself is so complex that it lacks clear contract / the contract evolves too quickly over time (as other parts of the code need to change the given piece of code to extend it to their use case) and that defies clear encoding?

Or we lack easy enough means to describe specs? I tried reading jepsen spec earlier today and despite it being an "integration test" of sorts, it is far from "simple".

Can an entire OS or a system of comparable complexity be decomposed into objects simple enough that their entire intended behavior (with all edge cases) can be explained in a paragraph of human text + half a screen of dense behavioral "spec" - if i do X and do Y, Z should come out / hold _no matter what happens in-between_. Or that's what asserts + fuzzing is effectively supposed to do? Is there a clear distinction between invalid input and failed invariant in typical C code? I guess error code vs seg fault?

estebank 36 minutes ago | parent | prev [-]

This is in effect a state machine, and when you have a type system more complex than C's you can encode state transitions in the type system (either by having state transitions explicitly return a new return type or by using sum types). You still need to architect the system to encode the invariants in types. No language will fix all logic bugs for free. But you can leverage language features to reduce their number.

fsddfsdfssdf 27 minutes ago | parent [-]

> You still need to architect the system to encode the invariants in types.

That's the problem though, right? If it's pointed out we all agree the "do not keep credentials alive" is a property that should hold and we can leverage whatever the environment offers to help preserve it. I fully agree modern languages have amazing support for this, but in C you can still run tests. Let's just say I don't think the language's inability to express logic of this kind held all those involved back from testing for it. I personally find "we just didn't think of it" much more likely.

That said, I am not a fan of C and recommend leveraging whatever fantastic modern tooling is available to you.

WhitneyLand an hour ago | parent | prev | next [-]

The premise of a secure open codebase is fine.

The problem is being more auditable does not automatically make it more audited.

There have to be enough people with skill taking enough time to work on it.

pixl97 an hour ago | parent [-]

If you think open source is bad, wait till you see enterprise code. I'm talking full auth bypass due to the stupidest crap. You can do that in any language if you have fools working on the code base.

danudey 38 minutes ago | parent | next [-]

Even security code. Fortinet, a vendor whose entire thing is security for your network, is consistently getting caught out with default passwords, backdoors, etc.

https://community.spiceworks.com/t/hard-coded-password-backd...

This sort of thing leads to every kind of exploit, like

https://www.linkedin.com/pulse/half-worlds-fortinet-firewall...

620gelato 37 minutes ago | parent | prev [-]

I explicitly make sure services I lead have Integration tests in CI pipeline to validate the "negative paths" against all APIs with missing, invalid, un-authorised identities, expired, un-authenticated tokens. Of course that still doesn't cover every surface, but even that gets sideways glances from some folks who think we should just test happy paths and why we're testing for access controls in Integration tests.

pjdesno 36 minutes ago | parent | prev | next [-]

To translate to Rust, it would have been "we missed a single line Rust check"...

This is a bug involving intersecting concerns and a deficit of cross-domain knowledge. It probably would have been the same in Lisp or assembly language.

russdill an hour ago | parent | prev | next [-]

The lesson here is that if a feature (at a minimum) does not have a associated test case, it is not actually a feature.

fsddfsdfssdf 40 minutes ago | parent [-]

Yes, I agree. I find the addition of the regression test the true long-term fix. The code is just an opaque incantation that may or may not preserve some property we find worth preserving and we have no way of knowing it keeps preserving it over time as other parts of the system change.

The test actually proves it and while it too can change it has more staying power because it's expressed at a higher level of abstraction ("random arcane weird C shit" in the case of code versus "does this property hold" in the case of a regression test).

bbminner 12 minutes ago | parent [-]

I have not looked into this specific issue, but are we sure that a regression here could have been avoided via a localized test? Maybe issues seem to arise from A implementing a feature with tests. B seeing that A lacks some functionality and adding it (potentially with tests), C seeing this (extra) functionality in A, and using in unintended ways not covered by tests (or in an unintended environment) + multiply by many layers of this A-B-C story up and down the stack.

moritzwarhier an hour ago | parent | prev | next [-]

The whole premise of a "giant secure open source C codebase" seems questionable

Because code review is sometimes not much different from an idealized version of the halting problem, where you would have access to a formalized version of a specification.

In other words, there is no strict definition of what is a security issue.

bbminner 32 minutes ago | parent [-]

On the other hand, it is (both halting and spec adherence) are checkable under compute and space constraints though? :) I'd say the biggest hurdle are means to describe the spec in way that is easy enough for a human to produce to make it feasible.

Not a DB person either, but things like TLA+ seem very hard to write even with LLMs. Behavioral tests with an enumerable number of random paths to take (aka model checking - eg jepsen) seem more feasible. Although you can't check internal properties of the system (string `pass` or any of it's copies or parts are not held anywhere in memory at any point between lines A and B) unless we can check that two memory dumps are indistinguishable with different pass strings (assuming we abstracted away storage devices in a test environment).. Also not sure if it's "easy enough" to write such tests either.

Maybe the reason is that OS domain objects / primitives are too complex and not "isolatable" enough / lack a clear contract at all? (Hence multi file refactorings that break invariants.)

lazide an hour ago | parent | prev | next [-]

In open source, someone (many, many) someone’s can at least check.

Closed source…..

Twirrim an hour ago | parent [-]

Not sure why you're getting downvoted, this is the entire point of open source.

Does such a bug exist in Windows? OSX? Who checks? If someone finds the key in memory, can they tell what conditions might be causing it and where?

Their only recourse under those situations is to hand it off to the OS Vendor and trust that what they implement does solve the problem, and trust that it wasn't a deliberate back-door that is now being replaced by another back-door.

charcircuit an hour ago | parent [-]

Security researchers find security bugs in closed source operating systems all of the time.

lazide an hour ago | parent [-]

Yup, it’s just harder to know for sure.

pixl97 an hour ago | parent [-]

Oh, and large companies quite often fix these horrific issues silently, especially in online services where the customer can't diff bins. We're talking auth bypasses and RCE's that you'll never know about.

deepsun 36 minutes ago | parent | prev [-]

"Million eyeballs" argument was always kinda meh.