afdbcreid 6 hours ago

Is C++ more performant than C? I find this hard to believe. C++ does not have any construct that cannot be replicated, or is not common, in C. The only candidate is using virtualization and void* pointers instead of monomorphized generics which some C code does for the lack of better options, but that's not a problem in Rust as well.

If anything, Rust has the potential to be more performant than C due to its aliasing rules (C has `restrict` but it's rarely used, standard C++ does not have even that). The current perf stats show it does make Rust code faster but just a little bit, although we don't utilize the full optimization potential currently (LLVM does not do many possible optimizations here, and `noalias` is weaker than Rust's aliasing rules). It can also affect autovectorization, and if it does the effect could be dramatic.
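A minimal sketch of why aliasing information matters to an optimizer, written in C++ for illustration. The function name `add_twice` is hypothetical and not taken from any codebase discussed here. Because the two pointers may refer to the same `int`, the compiler has to re-read `*src` after the first store; the aliased and non-aliased calls really do produce different results.

```cpp
#include <cassert>

// Hypothetical helper: adds *src to *dst twice.
// dst and src may point to the same object, so *src cannot simply be
// kept in a register across the first store to *dst.
void add_twice(int* dst, const int* src) {
    *dst += *src;
    *dst += *src;
}

// Demonstrates that overlapping and non-overlapping arguments behave differently.
void demo_aliasing() {
    int a = 1, b = 1;
    add_twice(&a, &b);  // distinct objects: 1 + 1 + 1
    assert(a == 3);

    int x = 1;
    add_twice(&x, &x);  // same object: 1 -> 2 -> 4
    assert(x == 4);
}
```

In C, declaring both parameters `restrict` would be a promise that they never overlap, and a Rust signature taking `&mut i32` and `&i32` carries that guarantee by construction. How much a given compiler gains from it is a separate question.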

jandrewrogers 5 hours ago | parent | next [-]

Modern C++ metaprogramming materially impacts performance in practice. I’ve done performance engineering for decades in both C and modern C++ and I would assert that the difference isn’t arguable.

The poor applicability of auto-vectorization is another area where C++ is strong. You can transparently codegen e.g. AVX512 from intrinsics directly in C++ in contexts that would be opaque to auto-vectorization and difficult to generalize in C. This allows you to get some degree of “auto-vectorization” where the compiler can’t see it because it works at the wrong level of abstraction.

With sufficiently heroic efforts you can write C that matches the performance of C++. I’m not arguing that. Virtually no one writes C to that standard, including myself when I was writing high-performance C because the effort was too high, so it is a bit of a strawman.

It is the difference between theory and practice. All code bases have a finite budget. C++ can do a lot more optimization in the same budget as C.

globalnode 4 hours ago | parent [-]

So you're saying the metaprogramming facilities of C++ allow the compiler to better optimise high level human readable code more effectively than C. That's a fair point and one I'd never even thought of before; I always thought C was faster because of things like v-tables and all that stuff.

swiftcoder 5 minutes ago | parent | next [-]

> So you're saying the metaprogramming facilities of C++ allow the compiler to better optimise high level human readable code more effectively than C.

The metaprogramming facilities of C++ allow the programmer to more effectively optimise than they would have the patience to do in C.

The compiler's own optimisations don't directly benefit from the metaprogramming facilities in this sense. What they do is let the programmer break high level constructs down to codegen that the compiler can optimise.

And you could do the same things by hand in C or Rust, but it would be tedious in the extreme, and you'd probably find yourself adopting external codegen tools.

leonidasrup 3 hours ago | parent | prev | next [-]

For example, you can do loop unrolling using C++ template meta-programming.

https://cpplove.blogspot.com/2012/07/a-generic-loop-unroller...
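A rough sketch of what template-based unrolling can look like. This is not the code from the linked article; `Unroll` and `sum4` are hypothetical names, and the recursive-template style is just one way to do it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: Unroll<N>::run(f) calls f(0), f(1), ..., f(N-1).
// The recursion is resolved at compile time, so no runtime loop counter exists.
template <std::size_t N>
struct Unroll {
    template <class F>
    static void run(F& f) {
        Unroll<N - 1>::run(f);  // emit the first N-1 calls
        f(N - 1);               // then the last one
    }
};

// Base case: nothing left to call.
template <>
struct Unroll<0> {
    template <class F>
    static void run(F&) {}
};

// Example use: sum a fixed-size array with four explicit calls.
int sum4(const int (&v)[4]) {
    int total = 0;
    auto add = [&](std::size_t i) { total += v[i]; };
    Unroll<4>::run(add);
    return total;
}

void demo_unroll() {
    const int v[4] = {1, 2, 3, 4};
    assert(sum4(v) == 10);
}
```

Whether this beats a plain `for` loop depends on the compiler and target; it only guarantees the shape of the generated calls, not that they are faster.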

Of course, nothing beats hand written ffmpeg-style assembly which takes into account optimal register allocation, instruction scheduling, cache alignment, etc. for specific processor architectures.

jeffreygoesto 2 hours ago | parent [-]

Careful. That article is from 2012, and compile-time unrolling was more useful back then. Today it can actually be harmful, as it hides strong hints about the loop from the optimizer. Our code that did this fared worse than a loop, because no optimizer-writer expected unrolled loops.

adrian_b 3 hours ago | parent | prev [-]

In C++, nobody would want or need to use virtual functions in high-performance computational applications, while in the C language structures with virtual function tables that are accessed explicitly by the programmer are in widespread use wherever suitable, for instance in many popular open-source C programs, like the Linux kernel or the debugger gdb.

So the existence of virtual function tables is not a differentiator between C++ and C.

The data types with virtual function tables are just the implementation method for sum types that is dual to tagged unions. Both virtual function tables and tagged unions can be implemented in C and in most other programming languages that do not have intrinsic support for them, but they require explicit boilerplate code for invoking the virtual functions or for testing the union tags.
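To make the duality concrete, here is a small sketch of both variants written in a C-like subset of C++. All type and function names (`Shape`, `ShapeOps`, `TaggedShape`, and so on) are made up for illustration.

```cpp
#include <cassert>

// Variant 1: explicit virtual function table.
struct Shape;
struct ShapeOps {
    double (*area)(const Shape*);  // one slot per "virtual function"
};
struct Shape {
    const ShapeOps* ops;  // table pointer stored in every object
    double a;
    double b;
};

double rect_area(const Shape* s) { return s->a * s->b; }
double tri_area(const Shape* s) { return 0.5 * s->a * s->b; }

const ShapeOps RECT_OPS = {rect_area};
const ShapeOps TRI_OPS = {tri_area};

// Boilerplate for invoking the "virtual" function.
double area_via_table(const Shape* s) { return s->ops->area(s); }

// Variant 2: tagged union.
enum ShapeTag { TAG_RECT, TAG_TRI };
struct Rect { double w, h; };
struct Tri { double base, height; };
struct TaggedShape {
    ShapeTag tag;
    union { Rect rect; Tri tri; } u;
};

// Boilerplate for testing the tag.
double area_via_tag(const TaggedShape* s) {
    switch (s->tag) {
        case TAG_RECT: return s->u.rect.w * s->u.rect.h;
        case TAG_TRI:  return 0.5 * s->u.tri.base * s->u.tri.height;
    }
    return 0.0;
}

void demo_sum_types() {
    Shape r = {&RECT_OPS, 2.0, 3.0};
    assert(area_via_table(&r) == 6.0);

    TaggedShape t;
    t.tag = TAG_TRI;
    t.u.tri = Tri{2.0, 3.0};
    assert(area_via_tag(&t) == 3.0);
}
```

With the table, adding a new kind of shape means adding a new table; with the tag, it means touching every `switch`. That is the usual trade-off between the two.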

Which of these two variants is better depends on the application. In high-performance computations, one does not use ambiguous data types, so normally neither is used. There are a few object-oriented programming languages where "everything is an object", i.e. any kind of data includes a virtual table pointer, but those are just incomplete programming languages that lack some of the data types needed in practice, much like many early programming languages that had a single data type, e.g. the original LISP I, which had only linked lists and no arrays. C++ at least is a complete language, in which any kind of data type can be implemented without overhead.

As you said previously, C has few restrictions on what it can do, so in theory it is almost always possible to write a C program almost exactly equivalent to any program written in another language, matching its speed, even if that may require a significant reorganization of the code rather than a line-by-line translation.

Nevertheless, as the other poster said, the effort needed to write that equivalent program may be so high that it is not a realistic solution.

So in practice, for a similar programming effort, a higher-level language like C++ frequently allows writing a faster program than C.

flohofwoe 2 hours ago | parent [-]

> while in the C language structures with virtual function tables that are accessed explicitly by the programmer are in widespread use wherever suitable, for instance in many popular open-source C programs, like the Linux kernel or the debugger gdb

For dynamic dispatch there is absolutely no difference between using a jump table in C and virtual method tables in C++. If the compiler can infer the target address at compile time, it will not go through an indirect call, e.g.:

https://www.godbolt.org/z/as8ehGhv3

And for 'static dispatch' there's no difference between a C++ method call and a direct C function call (since for static dispatch the caller needs to 'know' the target function either way).

dwaite 27 minutes ago | parent | prev | next [-]

> Is C++ more performant than C? I find this hard to believe.

At the compiler level, no. But as you write projects, you will for instance run into things you can do with templates which are infeasible to attempt with macros.

One example might be qsort() - a C compiler _could_ catch cases where it could create an intrinsic qsort based on the data type and function pointer being passed. However, in C++ you have the facilities to create a type safe, genericized sort that will be inlined based on the data structure.

amelius 33 minutes ago | parent | prev | next [-]

> I find this hard to believe. C++ does not have any construct that cannot be replicated, or is not common, in C.

But this is not a valid argument, as all languages are Turing complete, and most modern languages can do low level stuff at optimum speeds. As an extreme example, in Java, you could just allocate a large chunk of memory and run an allocator inside of it and sidestep the GC entirely.

With a programming language the question is thus not what can you do with it and how fast can it run with infinite effort, but what are the ergonomics, and what performance will you get in practice.

loeg 5 hours ago | parent | prev | next [-]

In C++ you get templated generic algorithms that in practice no one really does with C because macros suck too much. So in C typically you'd have a runtime generic routine that doesn't inline. A classic example here is qsort() vs std::sort().
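For reference, the two calls side by side. The wrapper names `sort_with_qsort` and `sort_with_std_sort` are hypothetical; `std::qsort` and `std::sort` are the standard library routines.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// qsort comparator: receives untyped pointers and is reached through a
// function pointer chosen at run time.
int compare_ints(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

void sort_with_qsort(std::vector<int>& v) {
    if (v.empty()) return;
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
}

// std::sort is a template: the element type and the comparator type are
// both known where the algorithm is instantiated.
void sort_with_std_sort(std::vector<int>& v) {
    std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
}

void demo_sorts() {
    std::vector<int> a = {5, 3, 9, 1};
    std::vector<int> b = a;
    sort_with_qsort(a);
    sort_with_std_sort(b);
    assert(a == b);
}
```

Both produce the same ordering. Whether the comparator actually gets inlined in either version depends on the compiler, the optimisation level, and whether it can see the relevant function bodies.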

flohofwoe 2 hours ago | parent | next [-]

> So in C typically you'd have a runtime generic routine that doesn't inline.

With LTO you get many of the same advantages as C++ template code, there's nothing magic about C++ template optimizations, it's all about whether the compiler can see all function bodies in a call hierarchy.

simonask an hour ago | parent [-]

LTO cannot change the layout of structs. For something like a hash map implementation, it matters whether inner nodes store the key and value inline, or whether they store a pointer to each. To achieve this in C, you have no other options than emulating templates using macros.
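A sketch of the layout difference, assuming a simple chained-bucket node. `ErasedNode`, `InlineNode`, and `find` are hypothetical names and not taken from any real hash map.

```cpp
#include <cassert>
#include <cstdint>

// Type-erased node, as a runtime-generic C container might define it:
// key and value live elsewhere and are reached through pointers.
struct ErasedNode {
    void* key;
    void* value;
    ErasedNode* next;
};

// Templated node: key and value are stored inside the node itself,
// so the layout changes with K and V.
template <class K, class V>
struct InlineNode {
    K key;
    V value;
    InlineNode* next;
};

// For small keys and values the inline node is no larger than the erased one,
// and a lookup touches one allocation instead of up to three.
static_assert(sizeof(InlineNode<std::uint32_t, std::uint32_t>) <= sizeof(ErasedNode),
              "inline node should not be larger for 32-bit key/value");

// Walks one bucket chain and returns a pointer to the value, or nullptr.
template <class K, class V>
const V* find(const InlineNode<K, V>* head, const K& key) {
    for (const InlineNode<K, V>* n = head; n != nullptr; n = n->next) {
        if (n->key == key) return &n->value;
    }
    return nullptr;
}

void demo_nodes() {
    InlineNode<std::uint32_t, std::uint32_t> tail = {2u, 20u, nullptr};
    InlineNode<std::uint32_t, std::uint32_t> head = {1u, 10u, &tail};
    const std::uint32_t wanted = 2u;
    const std::uint32_t* hit = find(&head, wanted);
    assert(hit != nullptr && *hit == 20u);
}
```

The optimizer can inline calls across translation units, but it will not turn an `ErasedNode` into an `InlineNode`; that choice is made in the source.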

flohofwoe 40 minutes ago | parent [-]

The question is whether a hash-map implementation that works on a general `[key, index]` item, where the index references a separate array of values, isn't actually better for some access patterns ;)

And of course the other alternative to macros is code-generation (but macros are actually often fine).

But this also only matters for actually reusable generic code. If I'd implement a super-hot-path hashmap in C, I would stamp out a specialized version by hand instead of relying on a generic implementation. But for 90% of cases, a solution like in stb_ds.h is probably good enough.

afdbcreid 5 hours ago | parent | prev [-]

I explicitly acknowledged that:

> The only candidate is using virtualization and void* pointers instead of monomorphized generics which some C code does for the lack of better options, but that's not a problem in Rust as well.

But in fact, if speed is a concern to you, even in C you will use "templated" sorting (via macros or code generation).
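One way the macro approach can look, as a sketch only. `DEFINE_INSERTION_SORT` and the names it generates are hypothetical, and insertion sort is used just to keep the example short. The macro body is plain C apart from `std::size_t`, so the same idea carries over to C with `size_t`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro: stamps out a sort function NAME for element type T.
// LESS is a function-like macro (or function) taking two elements.
#define DEFINE_INSERTION_SORT(NAME, T, LESS)              \
    void NAME(T* data, std::size_t n) {                   \
        for (std::size_t i = 1; i < n; ++i) {             \
            T key = data[i];                              \
            std::size_t j = i;                            \
            while (j > 0 && LESS(key, data[j - 1])) {     \
                data[j] = data[j - 1];                    \
                --j;                                      \
            }                                             \
            data[j] = key;                                \
        }                                                 \
    }

// One instantiation per element type, each with its own comparison.
#define INT_LESS(a, b) ((a) < (b))
DEFINE_INSERTION_SORT(sort_ints, int, INT_LESS)

struct Point { int x; int y; };
#define POINT_LESS(a, b) ((a).x < (b).x)
DEFINE_INSERTION_SORT(sort_points_by_x, Point, POINT_LESS)

void demo_macro_sort() {
    int v[] = {3, 1, 2};
    sort_ints(v, 3);
    assert(v[0] == 1 && v[1] == 2 && v[2] == 3);
}
```

Each expansion is a fully typed function with the comparison written directly into the loop, which is roughly what a template instantiation gives you, minus the type checking and readable error messages.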

20k 5 hours ago | parent | next [-]

The problem is that the implementation burden with C is so high that people tend not to do it, even in relatively performance-constrained situations.

loeg 3 hours ago | parent | prev | next [-]

> in practice no one really does with C because macros [and codegen] suck too much

fluffybucktsnek 4 hours ago | parent | prev [-]

Neither codegen nor macros (they are a part of the preprocessor) are really a part of C.

For the latter, the lack of integration becomes more noticeable if you try writing a macro in which the compare param can accept a function identifier. As the preprocessor doesn't have the knowledge of the contents of the referred function, it can't inline it. In C++ and Rust, their compilers do, so they can.

A codegen tool could overcome this, but you could also make a codegen tool to write Zig/D/C#/Swift in C, or any other language for that matter :). By this point, one could say you are programming in a superset of C, not strict C.

smallstepforman 5 hours ago | parent | prev | next [-]

C++ uses its rich type system to avoid aliasing when it can, as well as template metaprogramming.

Eg: delete_scene(void *arg) vs delete_scene<T>(T *arg)

fithisux 5 hours ago | parent | prev [-]

You can write C style C++ and enjoy the same benefits.

On Twitter, a user explained to me that this is common in the embedded space.

You do not need the OOP, RTTI, exceptions.

It is like C with most uses of the preprocessor replaced by generic programming.

afdbcreid 5 hours ago | parent [-]

So? How is that an argument that C++ is more performant than C? It's only an argument that it's not less performant.