pizlonator 3 days ago

Not saying you’d want both. Just answering why MTE isn’t a path to CHERI

But here’s a reason to do both: CHERI’s UAF story isn’t great. Adding MTE means you get a probabilistic story at least
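To make "probabilistic" concrete, here is a rough sketch of how often a use-after-free would slip past tag checking. It assumes MTE's 4-bit tags are drawn uniformly at random each time memory is reused; real allocators may reserve some tags or avoid reusing the previous one, so treat the numbers as illustrative only. The function name is made up for this example.

```python
import random

TAG_BITS = 4              # MTE tags are 4 bits wide
NUM_TAGS = 1 << TAG_BITS  # 16 possible tag values

def stale_access_undetected_rate(trials, rng):
    """Estimate how often a dangling pointer's tag matches the reused memory's tag.

    Assumption: both tags are uniform random draws, which is a simplification.
    """
    undetected = 0
    for _ in range(trials):
        old_tag = rng.randrange(NUM_TAGS)  # tag still carried by the dangling pointer
        new_tag = rng.randrange(NUM_TAGS)  # tag assigned when the memory is reallocated
        if old_tag == new_tag:
            undetected += 1                # tags collide, so the stale access is not caught
    return undetected / trials

rate = stale_access_undetected_rate(100_000, random.Random(0))
print(f"simulated miss rate: {rate:.4f}, uniform-tag expectation: {1 / NUM_TAGS:.4f}")
```

Under that assumption a single stale access goes unnoticed roughly one time in sixteen, which is why the protection is probabilistic and not a guarantee.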

bri3d 3 days ago | parent | next [-]

True! On the flip side, MTE sucks at intra-object corruption: if I get access to a heap object with pointers, MTE doesn't affect me, I can go ahead and write to that object because I own the tag.

Overall my _personal_ opinion is that CHERI is a huge win at a huge cost, while MTE is a huge win at a low cost. But, there are definitely vulnerability classes that each system excels at.
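A toy model of the intra-object case described above, for illustration only. It assumes a hypothetical object with a 16-byte buffer followed by a pointer-sized field, and compares a check that covers the whole allocation (one tag, or one set of bounds, per object) with a check narrowed to a single field. `Record` and both helper functions are invented for this sketch; nothing here touches real MTE or CHERI hardware.

```python
import ctypes

class Record(ctypes.Structure):
    # Hypothetical heap object: a character buffer followed by a pointer-sized field.
    _fields_ = [("name", ctypes.c_char * 16), ("callback", ctypes.c_void_p)]

def allowed_by_allocation_check(offset, length):
    """Model of a per-allocation check: any access inside the object passes."""
    return 0 <= offset and offset + length <= ctypes.sizeof(Record)

def allowed_by_field_check(field, offset, length):
    """Model of a subobject check: the access must stay inside one field."""
    return field.offset <= offset and offset + length <= field.offset + field.size

# A write that starts at `name` and runs 4 bytes past its end, into `callback`.
overflow_len = Record.name.size + 4
whole_object_ok = allowed_by_allocation_check(Record.name.offset, overflow_len)
field_ok = allowed_by_field_check(Record.name, Record.name.offset, overflow_len)

print("passes whole-allocation check:", whole_object_ok)
print("passes field-level check:", field_ok)
```

In this model the overflowing write stays inside the allocation, so a whole-object check accepts it, while a field-level check rejects it.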

pizlonator 3 days ago | parent [-]

I think the intra object issue might be niche enough to not matter.

And CHERI fixes it only optionally, if you accept having to change a lot more code

jrtc27 3 days ago | parent | next [-]

Where studies suggest "a lot" is sub-0.1%. For example, https://www.capabilitieslimited.co.uk/_files/ugd/f4d681_e0f2... was a study into porting 6 million lines of C and C++ to run a KDE+X11 desktop stack on CHERI, and saw 0.026% LoC change, or ~1.5k LoC out of ~6 million LoC, all done in just 3 months by one person. That's even an overestimate, because it includes many changes to build systems just to be able to cross-compile the projects. It's not nothing, but it's the kind of thing where a single engineer can feasibly port large bodies of code. Yes, certain systems code will be worse (like JITs), but the vast majority of cases are not that, and even those are still feasible (e.g. we have people working with Chromium and V8).

pizlonator 2 days ago | parent [-]

Does that study include enabling intra object overflow protection, or not?

When I say that this optional feature would force you to change a lot more code I’m comparing CHERI without intra object overflow protection to CHERI with intra object overflow protection.

Finally, 6 million lines of code is not that impressive. Real OSes are measured in billions

jrtc27 2 days ago | parent [-]

> Does that study include enabling intra object overflow protection, or not?
>
> When I say that this optional feature would force you to change a lot more code I’m comparing CHERI without intra object overflow protection to CHERI with intra object overflow protection.

Sorry, I misinterpreted what you were saying. No, that's not with subobject bounds. If you want that then yes there is more incompatibility, because C does not have a good subobject memory model. That's not really because there's anything wrong with CHERI, it's just because the language itself is at odds in places with doing that kind of enforcement with any technology. But, if you're willing to incur that additional friction (as we do for our pure-capability kernel in CheriBSD), you can enable it, and it can protect against additional vulnerabilities that other security technologies fundamentally cannot. We even provide a sliding scale of subobject bounds enforcement, where each of the three levels restricts bounds in more cases at the expense of compatibility. The architecture gives you the flexibility to decide what software model you want to enforce with it.

> Finally, 6 million lines of code is not that impressive.

We have far more than that ported, that was just one case study done in a few months by one developer. FreeBSD alone is, by my very rough cloc estimate that excludes LLVM, about 14 million lines of C and C++ (yes, I'm not distinguishing architecture-specific code and all kinds of other considerations, but it's close enough and gives an order of magnitude for the purposes of this conversation), and we have FreeBSD ported. Not to mention our work on, say, Chromium and V8 (Chromium being another set of 10s of millions of lines of code, again tractable with the engineering effort of just a few members of our research group).

> Real OSes are measured in billions

Citation needed. The Linux kernel is only a bit over 40 million lines of code these days. Real systems may well approach the billions of lines of code running once you factor in all the libraries, daemons and applications running on top of it, but that is not all low-level OS code that needs the kind of porting an OS or runtime does. Even if it were a billion lines of code, though, extrapolating at 0.026% that would be 260 kLoC changed, which isn't that scary a number.

Even V8, which is about the worst case you could possibly have (highly-stylised code written in a way that uses types in CHERI-unfriendly ways; a language runtime full of pointers; many (about 6?) different highly-optimised just-in-time compilers that embed deep knowledge of the ISAs and ABIs they are targeting and like to play games with pointers in the name of performance) we see (last I checked) ~0.8% LoC changed, or about 16k out of 2 million. The porting cost is real, but the numbers have never suggested to us it's at all intractable for industry.
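For anyone who wants to sanity-check the arithmetic in this thread, a small sketch using only the figures quoted above (0.026% of ~6 million, the same rate extrapolated to a billion, and 16k out of 2 million). The helper names are made up for the example.

```python
def changed_lines(total_loc, percent_changed):
    """Lines changed, given a total line count and a percentage."""
    return total_loc * percent_changed / 100

def percent_changed(changed_loc, total_loc):
    """Percentage of lines changed, given changed and total line counts."""
    return 100 * changed_loc / total_loc

desktop_changed = round(changed_lines(6_000_000, 0.026))       # KDE+X11 desktop study
billion_changed = round(changed_lines(1_000_000_000, 0.026))   # hypothetical billion-line system
v8_percent = percent_changed(16_000, 2_000_000)                # V8 figures quoted above

print(desktop_changed, billion_changed, v8_percent)
```

The first value is about 1.5k lines, the second is 260 kLoC, and the third is 0.8%, matching the numbers quoted in the comments.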

bri3d 3 days ago | parent | prev [-]

I think I broadly agree with you. IMO tagging is practically much, much more valuable than capabilities systems modeled like CHERI.

quotemstr 3 days ago | parent [-]

Yes, but CHERI opens whole new system design possibilities, including things like ultra-cheap intra-address-space security boundaries. See https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/201607...

> We have used CHERI’s ISA facilities as a foundation to build a software object-capability model supporting orders of magnitude greater compartmentalization performance, and hence granularity, than current designs. We use capabilities to build a hardware-software domain-transition mechanism and programming model suitable for safe communication between mutually distrusting software

and https://github.com/CTSRD-CHERI/cheripedia/wiki/Colocation-Tu...

> Processes are Unix' natural compartments, and a lot of existing software makes use of that model. The problem is, they are heavy-weight; communication and context switching overhead make using them for fine-grained compartmentalisation impractical. Cocalls, being fast (order of magnitude slower than a function call, order of magnitude faster than a cheapest syscall), aim to fix that problem.

> This functionality revolves around two functions: cocall(2) for the caller (client) side, and coaccept(2) for the callee (service) side. Underneath they are implemented using CHERI magic in the form of CInvoke / LDPBR CPU instruction to switch protection domains without the need to enter the kernel, but from the API user point of view they mostly look like ordinary system calls and follow the same conventions, errno et al.

There's a decent chance that we get back whatever performance we pay for CHERI with interest as new systems architecture possibilities open up.

MTE helps us secure existing architectures. CHERI makes new architectures possible.

saagarjha 3 days ago | parent [-]

Yes, but this breaks mirror mappings.

jrtc27 3 days ago | parent [-]

Can you elaborate on what you perceive as broken?

saagarjha 3 days ago | parent [-]

mremap?

jrtc27 2 days ago | parent | next [-]

You may wish to read what the current pure-capability CHERI Linux user ABI specifies for mremap(), because we (primarily Arm, in conjunction with us) have thought about this, and the conclusion is not "the existence of mremap() makes CHERI undeployable". See https://git.morello-project.org/morello/kernel/linux/-/wikis...

quotemstr 2 days ago | parent | prev [-]

Add a sliding window aliasing mode to the hardware? You'd set a page table bit saying "check capabilities not against my VA, but those VAs over there"

quotemstr 3 days ago | parent | prev [-]

Some progress on UAF though! https://dl.acm.org/doi/10.1145/3703595.3705878