pizlonator 3 days ago

> The profiles technology isn't very good.

Can you be very specific about why?

Here's the argument for why profiles might work: with all of the profiles enabled, you are only allowed to use the safe subset of C++ and all of the unsafe stuff is hidden behind APIs whose implementations don't have those profiles enabled. Those projects that enable all profiles by default effectively get Swift-like or Rust-like protection.

Like, you could force all array operations to use C++ stdlib primitives, enable full hardening of the stdlib, and then have bounds safety.
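
Something like this (a sketch; the hardened-build flags exist today, the profile wiring is hypothetical):

  #include <cstddef>
  #include <vector>

  void demo(std::vector<int>& v, std::size_t i) {
      // v[i] is unchecked by default; building against a hardened stdlib
      // (e.g. -D_GLIBCXX_ASSERTIONS, or libc++'s hardening modes) turns it
      // into a bounds-checked access that traps instead of corrupting memory.
      int a = v[i];

      // v.at(i) is always checked and throws std::out_of_range on failure.
      int b = v.at(i);
      (void)a; (void)b;
  }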

And you could force all lifetime operations to use C++ stdlib refcounting primitives, and then have lifetime safety in a Swift-like way (i.e. eager refcounting everywhere).

I can imagine how this falls over, but then it might just be a matter of engineering to make it not fall over.

(I'm playing devil's advocate a bit since I prefer Fil-C++.)

seanbax 2 days ago | parent | next [-]

I illustrate why it won't work with a number of examples here: https://www.circle-lang.org/draft-profiles.html

To address your points: 1. The safe subset of C++ is too small to do anything with. 2. The Standard Library is not written in the safe subset.

My favorite example from the above paper is the problem of std::sort -- the compiler has no idea whether the two iterator arguments point into the same allocation. The function is fundamentally unsafe. Which C++ profile do you turn on to make that safe? Does it ban use of std::sort? Does it ban use of everything in <algorithm>, all of which works on pointers/iterators that are susceptible to use-after-free UB?
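
For instance, this compiles cleanly and nothing in sort's interface can flag it (a sketch echoing the paper's f1 example):

  #include <algorithm>
  #include <vector>

  void f1() {
      std::vector<int> a{3, 1, 2};
      std::vector<int> b{6, 5, 4};
      // Type-checks fine: both arguments are vector<int>::iterator.
      // But the "range" spans two separate allocations, so this is UB,
      // and neither sort's declaration nor its definition rules it out.
      std::sort(a.begin(), b.end());
  }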

The whole Standard Library is unsafe. I proposed a rigorously safe std2, and that was rejected. And now you propose a safe std2 (using refcounting primitives)--why would that fare better? What does Profiles actually propose? No change in existing code. The compiler simply finds all UB. Right.

Rusky 2 days ago | parent | prev | next [-]

If that is what profiles were actually doing, it would probably make sense. But it's not what profiles are doing.

Instead, for example, the lifetime safety profile (https://github.com/isocpp/CppCoreGuidelines/blob/master/docs...) is a Rust-like compile time borrow checker that relies on annotations like [[clang::lifetimebound]], yet they also repeatedly insist that profiles will not require this kind of annotation (see the papers linked from https://www.circle-lang.org/draft-profiles.html#abstract).
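
For a sense of what that annotation does, here's a minimal sketch using the real Clang attribute (smaller is a made-up function):

  // Tells Clang the returned reference is derived from the parameters.
  const int& smaller(const int& a [[clang::lifetimebound]],
                     const int& b [[clang::lifetimebound]]) {
      return a < b ? a : b;
  }

  void caller() {
      const int& r = smaller(1, 2); // Clang warns: the temporaries die at
                                    // the end of this full-expression,
                                    // leaving r dangling
  }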

Their messaging is just not consistent with the concrete proposals they have described, let alone actually implemented.

pjmlp 2 days ago | parent [-]

Additionally, they ignore field experience. I can tell you that on VC++ the lifetime checker has only worked on small examples, and I was really invested in trying it out.

Microsoft even has blog posts admitting that it can only be improved with SAL-like annotations, while keeping the usual C++ semantics.

Yet WG21 has ignored this field experience.

tialaramex 2 days ago | parent | prev | next [-]

Yes I can be specific.

Firstly, you need composition. Rust's safety composes. The safe Rust library for farm animals from Geoff, the safe Rust library for cooking recipes by Alice, and the safe Rust library for web servers by Bert, together with my safe program code, add up to my safe Rust farm-foods web site.

By having N profiles, where N is intended to be at least five and might grow arbitrarily and be user extensible, C++ guarantees it cannot deliver composition this way.

Maybe they can define some sort of composition, and maybe everybody will ship software which conforms to that definition, so eventually they get composition. But that's not there today, so it's just a giant unknown at best.

Secondly, of the profiles described so far, most of them are just solving parts of the single overarching problem Rust addresses, for the serial case. So if they ship that, which already involves some amount of new work yet to be finished, you need all of those profiles to get to only partial memory safety.

Which brings us to the third part. Once you start down this path, as they found, you realise you actually want a borrowck. You won't call it that of course, because that would be embarrassing. But you'll need to track reference lifetimes and you'll need annotation, and you end up building most of the stuff you insisted you didn't want. For now, you can handwave: this is an unsolved static analysis problem. Well, not so much unsolved as you know the solution and you don't like it.

Your idea to do the reference counting everywhere is not something WG21 has looked at, I think the perf cost is sufficiently bad that they won't even glance at it. They're also not going to ship a GC.

Finally though, C++ is a concurrent language. It has a whole memory model which doesn't even make sense if you aren't thinking about concurrency. But to deliver concurrent memory safety without Fil-C's overheads you would want... well, Rust's Send and Sync traits, which sure enough have eerie twins in the Safe C++ proposal. No attempt to solve this is even hinted at in the current profiles proposal; they would need to work one out, and if it's not Send + Sync again, they'd need to prove it correct.

silon42 2 days ago | parent | next [-]

+1 ... Rust has done pretty much the minimal thing that one needs to write C/C++-like programs safely... things must fit together to cover all scenarios (borrow checker / mut / send / sync / bounds checking). Especially for multithreading.

C++ / profiles will not be able to do much less or much different to achieve the same goals.

pizlonator 2 days ago | parent | prev [-]

I think the point is that folks will incrementally move their code towards having all profiles enabled, and that's sort of fundamental if the goal is to give folks with C++ codebases an incremental path to safety. So I don't buy your first and second points.

> Which comes to the third part. Once you start down this path, as they found, you realise you actually want a borrowck.

That's a bold statement. It might be true for some very loose definition of "borrow checker". See the super simple static analysis that WebKit uses (that presentation is now linked in at least two places on this HN discussion, so I won't link it again).

> Your idea to do the reference counting everywhere is not something WG21 has looked at, I think the perf cost is sufficiently bad that they won't even glance at it. They're also not going to ship a GC.

The point isn't to have ref counting on every pointer at the language level, but rather: if you prevent folks from calling `delete` directly (as one of the profiles does) then you're effectively forcing folks to use smart pointers.

Reference counting that happens by smart pointers is something that they would ship. We know this because it's already happened.

I imagine this would really mean that some references are ref counted (if you use shared_ptr or similar) while other references use some other policy.
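
A sketch of what that mix might look like (the "no raw delete" profile behaviour in the comments is hypothetical):

  #include <memory>

  struct Node { int value = 0; };

  void demo() {
      auto owned  = std::make_unique<Node>(); // sole owner, no refcount
      auto shared = std::make_shared<Node>(); // refcounted ownership

      Node* raw = new Node;
      delete raw; // a hypothetical "no raw delete" profile rejects this,
                  // pushing code toward the smart pointers above
  }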

> Finally though, C++ is a concurrent language. It has a whole memory model which doesn't even make sense if you aren't thinking about concurrency. But to deliver concurrent memory safety without Fil-C's overheads you would want... well, Rust's Send and Sync traits

Yeah, this might be an area where they leave a hole. Like, you might have reference counting that is only partially thread safe:

- The refcount of any object is atomic.

- The smart pointer itself is racy. So, racing on pointers can pop the protections.

If they got that far, then that wouldn't be so bad. The marginal safety advantage of Rust would be very slim at that point.
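
Concretely, std::shared_ptr already has exactly that split (a minimal sketch):

  #include <memory>
  #include <thread>

  std::shared_ptr<int> global = std::make_shared<int>(42);

  int main() {
      // Two threads copying the same shared_ptr: fine. The refcount in
      // the control block is atomic, so both increments are safe.
      std::thread t1([] { auto a = global; });
      std::thread t2([] { auto b = global; });
      t1.join(); t2.join();

      // Two threads assigning through the same shared_ptr object: a data
      // race on the pointer fields themselves -- this is what can pop
      // the protections.
      std::thread t3([] { global = std::make_shared<int>(1); });
      std::thread t4([] { global = std::make_shared<int>(2); });
      t3.join(); t4.join();
  }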

pjmlp 2 days ago | parent | next [-]

> I think the point is that folks will incrementally move their code towards having all profiles enabled, and that's sort of fundamental if the goal is to give folks with C++ codebases an incremental path to safety.

I doubt it, because the reason I favoured C++ over C back in 1993 was the safety culture, as someone coming from Turbo Pascal.

Somehow this has been deteriorating since 2000, as C++ kept getting C refugees who would rather have kept using C, but whose work now required C++.

Most of the hardening capabilities that are being added now were already part of the compiler frameworks during the 1990s, e.g. Turbo Vision, OWL, MFC, CSet++, MacApp, PowerPlant,...

tialaramex 2 days ago | parent | prev [-]

So you agree then, it's technically not as good. With a lot of extra work that nobody has signed up to do, and some of which is speculative, they can't quite get to where Safe C++ was when proposed.

dminik 2 days ago | parent | prev | next [-]

As far as I'm concerned, there are two main issues with profiles:

1. They're either unimplementable or useless (too many false positives and false negatives).

I think this is pretty evident from the fact that profiles have been proposed for a while and no real implementation exists. Worse, out of all the open source projects and for-profit companies, no one has been able to implement any sort of static analysis that would even begin to approach the guarantees Rust makes.

2. The language doesn't give you any tools to actually write safe code.

Ok, let's say that someone actually implements safety profiles. And it highlights your usage of a standard library type. What do you do?

Safe C++ didn't require a new standard library just because. The current stdlib is riddled with safety issues that can't really be fixed and would not be fixed because of backwards compatibility.

You're stuck. And so you turn the safety profile off.

IAmLiterallyAB 3 days ago | parent | prev | next [-]

My limited understanding is: there is no safe subset. (That's what was just discontinued; profiles are the alternative.)

And C++ code simply doesn't have the necessary info to make safety decisions. Sean explains it better than I can https://www.circle-lang.org/draft-profiles.html

jmull 2 days ago | parent [-]

The analysis you link to is insufficient.

E.g., the first case is "Inferring aliasing". He presents some examples and states, "The compiler cannot infer a function’s aliasing requirements from its declaration or even from its definition."

But why not?

The aliasing requirements come directly from vector. If the compiler has those then determining the aliasing requirements of those functions is straightforward.

Now, maybe there is some argument that a C++ compiler cannot determine the aliasing requirements of vector, but if that's the claim, then the paper should make it, and back it up.

The paper continues in the same vein in the next section, as if the lifetime requirements of map and min cannot be known or cannot bubble up through the functions that call them.

As written, the paper says almost nothing about the feasibility of static analysis of C++ to achieve safety goals for C++.

dwattttt 2 days ago | parent | next [-]

I imagine it's (implicitly?) referring to avoiding whole-of-program analysis.

For example, given a declaration

  int* func(int* a);
What's the relationship between the return value and the input? You can't know without diving into 'func' itself; they could be the same pointer, or it could return a freshly allocated pointer, without getting into the even more esoteric options.
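
For example, both of these bodies are consistent with that declaration, and the caller can't tell which one it got:

  // Two possible bodies behind the same declaration:
  int* func(int* a) { return a; }              // aliases its argument

  // ...or, equally consistent with the declaration:
  // int* func(int* a) { return new int(*a); } // fresh allocation the
  //                                           // caller now owns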

Trying to solve this without recursively analysing a whole program at once is infeasible.

Rust's approach was to require more information to be provided in function signatures, but that's new syntax, and not backwards compatible, so not a palatable option for C++.

jmull 2 days ago | parent [-]

> avoiding whole-of-program analysis

Why, though?

Perhaps it's unfeasibly complex? But if that's the argument, then that's an argument that needs to be made. The paper sets out to refute the idea that C++ already has the information needed for safety analysis, but the examples throw away most of the information C++ does have, without explanation. I can't really take it seriously.

steveklabnik 2 days ago | parent | next [-]

In general, there are three reasons to avoid whole program analysis:

1. Complexity. This manifests as compile times. It takes much longer.

2. Usability. Error messages are poor, because changes have nonlocal effects.

3. Stability. This is related to 2. Without requirements expressed in the signature, changes in the body change the API, meaning keeping APIs stable is much harder.

There’s really a simple reason why it’s not fully feasible in C++ though: C++ supports separate compilation. This means the whole program is not required to be available. Therefore you don’t have the whole program for analysis.

dwattttt a day ago | parent [-]

It's not even required for the information to be present at link time; C/C++ doesn't require a pointer to always be owned or not-owned. It's valid for that to be decided by configuration loaded at runtime, or even at random.

Trying to establish proofs that the pointer is one way or the other can't work, because the pointer doesn't have to be only one or the other.
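
A minimal sketch of the pattern (consume and take_ownership are made-up names):

  // Whether the callee owns 'p' is decided at runtime, so there is no
  // static proof that 'p' is owned or borrowed: it's genuinely both.
  void consume(int* p, bool take_ownership) {
      // ... use *p ...
      if (take_ownership)
          delete p; // owned on this path, merely borrowed on the other
  }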

The fact that you then have to treat the pointer one way or the other is a problem; if you reduce the allowed programs so that the pointer must be one of the two that's a back-compat hazard. If you don't constrain it, you need to require additional information be carried somewhere to determine how to treat it.

If you do neither, you don't have the information needed to safely dispose of the pointer.

coderedart 2 days ago | parent | prev [-]

Local reasoning is the foundation of everything formal (this includes type systems), and anyone in the type-system-design space would know that. Graydon Hoare (ex-Rust dev) wrote a post about it too (which links to another great withoutboats post in the very first line): https://graydon2.dreamwidth.org/312681.html

The entire point of having a static type system is to enable local reasoning. Otherwise, we would just do whole-program analysis on JS instead of inventing TypeScript.

seanbax 2 days ago | parent | prev [-]

The Profiles authors are the ones claiming this uses local analysis only: https://news.ycombinator.com/item?id=41942126

They are clear that Profiles infers everything from function types and not function bodies. Obviously that won't work, but that's what they say.

jmull 2 days ago | parent [-]

In that post (I think your own?) it says, "Local analysis only. It's not looking in function definitions."

But local analysis means analysis of function definitions. At least it does to me. I can't think of what else it could mean. I think there must be some aspect of people talking past each other here, using the same words to mean different things.

Further, I don't think local analysis of the code comprising a function means throwing away the results of that analysis rather than passing it up the line to the analysis of callers of the function. E.g., local analysis of std::sort would establish its aliasing limitations, which would be available to analysis of the body of "f1" from the example in the paper (the results of which, in turn, would be available to callers of f1).

Now, I don't know if that's actually feasible/workable without the "heavy" annotation that C++ profiles wants to forbid. That's the key question to me.

seanbax 2 days ago | parent [-]

  template<typename _RandomAccessIterator>
    _GLIBCXX20_CONSTEXPR
    inline void
    sort(_RandomAccessIterator __first, _RandomAccessIterator __last)
    {
      // concept requirements
      __glibcxx_function_requires(_Mutable_RandomAccessIteratorConcept<
          _RandomAccessIterator>)
      __glibcxx_function_requires(_LessThanComparableConcept<
          typename iterator_traits<_RandomAccessIterator>::value_type>)
      __glibcxx_requires_valid_range(__first, __last);
      __glibcxx_requires_irreflexive(__first, __last);

      std::__sort(__first, __last, __gnu_cxx::__ops::__iter_less_iter());
    }
That's the definition of std::sort. What aliasing information can be gleaned from local analysis of the function? Absolutely nothing.
steveklabnik 2 days ago | parent | prev | next [-]

> with all of the profiles enabled, you are only allowed to use the safe subset of C++ and all of the unsafe stuff is hidden behind APIs whose implementations don't have those profiles enabled.

This is not the goal of profiles. It’s to be “good enough.” Guaranteed safety isn’t in the cards.

pizlonator 2 days ago | parent [-]

> This is not the goal of profiles. It’s to be “good enough.” Guaranteed safety isn’t in the cards.

- Rust isn’t totally guaranteed safe since folks can and do use unsafe code.

- Exact same situation in Swift

- Go has escape hatches, like if you race, but not only that.

So most “safe” things are really “safe enough” for some definition of “enough”.

steveklabnik 2 days ago | parent | next [-]

You’re misunderstanding what I’m saying. Safe Rust guarantees memory safety. Profiles do not. This is regardless of the ability of the unchecked versions, on both sides, to introduce issues.

Profiles do not, even for code that is 100% using profiles, guarantee safety.

pizlonator 2 days ago | parent [-]

The kind of "safe Rust" where you never use `unsafe` and never call into a C library is theoretical. None of the major ports of software to Rust achieve that.

So, no matter what safe language we talk about, "safety" always has its caveats.

Can you be specific about what missing safety feature of profiles leads you to be so negative about them?

steveklabnik 2 days ago | parent | next [-]

No, I am saying that safe Rust says “if unsafe is correct, safe Rust means memory safety.” Profiles does not even reach that bar; it says “code under profiles is safer.”

It’s not about specifics, it’s about the stated goals of profiles. They do not claim to prove memory safety even with all of them turned on.

Measter 2 days ago | parent | prev | next [-]

You've misunderstood what Steve is saying, and what safe/unsafe means in Rust. In Rust, if I have a block of code that doesn't use any operations that require the unsafe keyword, then I am guaranteed (modulo compiler bugs) that this block of code is free of all undefined behaviour.

It does not guarantee that code in any function being called within that block is free of it, but it does guarantee this block of code is.

Profiles don't give you that.

dwattttt 2 days ago | parent | prev [-]

> The kind of "safe Rust" where you never use `unsafe` and never call into a C library is theoretical. None of the major ports of software to Rust achieve that.

An entire program ported to Rust will call into unsafe APIs in at least a few places, somewhere down the call stacks.

But you'll still have swathes of code that don't ultimately end up calling an unsafe API, and those can be trivially considered memory safe.

AlotOfReading 2 days ago | parent | prev [-]

The language standard assumes that everyone collectively agrees to standard semantics implying certain things. If users don't follow the rules and write something without semantics (undefined behavior), the entire program is meaningless as opposed to just the bit around the violation. You know this, so I emphasize it here because it's entirely incompatible with the view that "good enough" is a meaningful concept to discuss from the PoV of the standard.

Rust does a pretty good job formalizing what the safety guarantees are and when you can assume them. Other languages don't, but they also don't support safety concepts that C++ nominally does like safety critical systems. "Good enough" can be perfectly fine for a web service like Go while being grossly inadequate for HPC or safety critical.

coffeeaddict1 2 days ago | parent | prev | next [-]

> And you could force all lifetime operations to use C++ stdlib refcounting primitives, and then have lifetime safety in a Swift-like way (i.e. eager refcounting everywhere)

That's going to be a non-starter for 99% of serious C++ projects out there. The performance hit is going to be way too large.

For bounds checking, sure I think the performance penalty is so small that it can be done.

gmueckl 2 days ago | parent | next [-]

You have to realize that the number of locations in code where a reference counter adjustment is actually meaningful is rather small, and there are simple rules that keep the excess churn from reference-counting pointer wrappers to a minimum. The main one, as mentioned in the talk the sibling comment called out, is that it is OK to pass a raw pointer or reference to a function while holding on to a reference count for as long as that other function runs (and doesn't leak the pointer through a side effect). This rule avoids a lot of pointless counter arithmetic caused by excessive pointer wrapper copying.
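
A minimal sketch of that rule (names are illustrative):

  #include <memory>

  struct Widget { void poke() {} };

  // The callee takes a plain reference: no refcount traffic inside.
  void use_widget(Widget& w) { w.poke(); }

  void caller(const std::shared_ptr<Widget>& sp) {
      // sp keeps the Widget alive for the whole call, so passing a raw
      // reference is safe and skips a pointless increment/decrement pair.
      use_widget(*sp);
  }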

silon42 2 days ago | parent [-]

Maybe C++ should copy some Swift before attempting to challenge Rust.

pjmlp 2 days ago | parent [-]

It has already, multiple times:

Managed C++, C++/CLI, C++/CX, C++ Builder, Unreal C++

But those aren't extensions or approaches WG21 cares about having.

The C++11 GC design didn't even take those experiences into consideration, thus it got zero adoption and was dropped in C++20.

pizlonator 2 days ago | parent | prev | next [-]

That would have been my first guess, but WebKit's experience doing exactly this is the opposite.

See https://www.youtube.com/watch?v=RLw13wLM5Ko

Note that they also allowed other kinds of pointers so long as their use could be statically verified using very simple rules.

TuxSH 2 days ago | parent | prev [-]

> For bounds checking, sure I think the performance penalty is so small that it can be done.

Depends on how many times it's inlined and/or if it's in hot code. It can result in much worse assembly code.

Funny thing: C++17 string_view::substr does a bounds check plus an exception throw, whereas span::subspan does neither; I can see substr's approach being problematic performance- and code-size-wise when it's called many times on ranges the caller has already validated.
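
Concretely (C++20 for std::span):

  #include <span>
  #include <string_view>

  void demo(std::string_view sv, std::span<const char> sp) {
      auto a = sv.substr(100);  // checked: throws std::out_of_range
                                // if 100 > sv.size()
      auto b = sp.subspan(100); // unchecked: UB if 100 > sp.size()
      (void)a; (void)b;
  }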

AlotOfReading 2 days ago | parent | prev [-]

There'd be less opposition if profiles worked that way. The real goal is to define a subset that excludes 95% of the unsafe stuff, as opposed to providing hard guarantees.