| ▲ | nightpool a day ago |
| The sibling comments mention that C is used in a lot of places in Ruby that incur cross-language overheads, which is true, but it's also just true that in general, even ignoring that overhead, JIT'd functions are going to be faster than comparable C functions, because 1) they have more profiling information to work from, 2) they have more type information, and (as a consequence of 1 & 2) 3) they're more likely to be monomorphized, and the compiler is better able to inline specialized variants of them into different parts of the code. Among other optimizations! |
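To make the monomorphization point concrete, here is a minimal sketch in C (hypothetical code, not how any particular Ruby JIT is actually implemented): an AOT-compiled runtime routine has to handle every operand type on every call, while a JIT that has profiled a call site and only ever seen Integers can emit and inline a specialized variant behind a cheap type guard.

    /* Generic path an AOT-compiled runtime routine must handle: operand
       types are only known at runtime, so every call pays for the checks. */
    enum tag { TAG_INT, TAG_FLOAT, TAG_OBJECT };
    struct value { enum tag tag; union { long i; double f; void *obj; } u; };

    struct value generic_add(struct value a, struct value b) {
        if (a.tag == TAG_INT && b.tag == TAG_INT) {
            struct value r; r.tag = TAG_INT; r.u.i = a.u.i + b.u.i; return r;
        }
        /* ... float path, object path with method lookup, coercions ... */
        return a;  /* placeholder for the slow path */
    }

    /* What a JIT can emit at a call site where profiling has only ever
       seen integers: one cheap guard, then this body, inlined into the
       caller and optimized together with the surrounding code. */
    long int_only_add(long a, long b) { return a + b; }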
|
| ▲ | Jweb_Guru 10 hours ago | parent | next [-] |
| If you give the JIT compiler unlimited time with the code, then maybe. For large real-world applications, optimized JIT-compiled code tends to lag behind AOT-optimized C or Rust code, though I guess you could argue that these differences are language/runtime related rather than compiler related. |
|
| ▲ | uticus a day ago | parent | prev [-] |
| > ...they have more profiling information to be able to work from... more type information... more likely to be monomorphized, and the compiler is more able to inline specialized variants of them into different chunks of the code. This is fascinating to me. I always assumed C had everything in the language that the compiler needed to work with; in other words, the compiler may have a lot to work through, but the pieces are all available. But this makes it sound like JIT'd functions provide more information to the compiler (more pieces to work with). Is there another language besides C that does have language features to indicate to the compiler how to make things as performant as possible? |
| |
| ▲ | dhruvrajvanshi a day ago | parent | next [-] | | A very simple way to think about it is that if an intrinsic is written in C, the JIT can't easily inline it, whereas all Ruby code can be inlined. Inlining is the most important optimization because it enables other optimizations. It's not necessarily that C doesn't have enough information; it's that the JIT can reason about Ruby code better than it can about C code. To the JIT, C code is just some function that does things, and the only thing it can do with it is call it. On the other hand, a Ruby function's bytecode is available to the JIT, so if it sees fit, it can copy-paste the function body into the call site and eliminate the function call overhead. Further, after the inlining, it can apply a lot of further optimizations across what was previously a function boundary. In theory, you could have a way to "compile" the C intrinsics into the JIT's IR directly, and that would give you similar results. | |
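A rough sketch of the "the JIT can only call it" point, using hypothetical VM types in C: the intrinsic is an opaque function pointer to the generated code, while a Ruby-defined method's body is available to be pasted into the call site.

    /* Hypothetical VM value type, for illustration only. */
    typedef struct { long payload; } value;

    /* A C intrinsic is opaque to the JIT: all it has is an address,
       so the best the generated code can do is emit a call. */
    typedef value (*c_intrinsic)(value self, value arg);

    value call_intrinsic(c_intrinsic fn, value self, value arg) {
        return fn(self, arg);   /* call overhead, no optimization across the boundary */
    }

    /* A Ruby-defined method exists as bytecode the JIT can read, so it
       can paste the body into the call site instead of calling it, then
       keep optimizing across what used to be the function boundary. */
    value inlined_call_site(value self, value arg) {
        return (value){ self.payload + arg.payload };
    }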
| ▲ | foobazgt a day ago | parent | prev | next [-] | | JITs have runtime information that static compilers do not. Sometimes that's not a huge benefit, but it can often have massive performance implications. For example, a JIT can inline dynamically loaded code into your own code. That sounds unusual, but it's actually ultra-common in practice: it shows up in something as mundane and simple as configurable logging. | |
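The logging example, sketched with a hypothetical function-pointer-based logger in C (the same shape applies to an interface in Java): an AOT compiler has to emit an indirect call because the backend is only chosen at runtime, while a JIT that has observed which backend is actually installed can devirtualize it, inline it, and often delete the call entirely.

    #include <stdio.h>

    /* Hypothetical pluggable logger: the backend is picked at runtime
       (config file, dynamically loaded module, ...). */
    typedef void (*log_fn)(const char *msg);

    static void noop_logger(const char *msg)   { (void)msg; }
    static void stderr_logger(const char *msg) { fprintf(stderr, "%s\n", msg); }

    static log_fn current_logger = noop_logger;

    void do_work(void) {
        /* An AOT compiler must emit an indirect call here, because it
           cannot know which logger will be installed. A JIT that has
           observed current_logger == noop_logger can inline the no-op
           and delete the call (and any message formatting) entirely. */
        current_logger("entering do_work");
    }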
| ▲ | MobiusHorizons 17 hours ago | parent | prev | next [-] | | The C code in question is most likely interpreter code that is incredibly generic, meaning it is very branchy on data that is only known at runtime and therefore can't be optimized at compile time. A JIT has the benefit of running the compiler at runtime, when the data is known. | |
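A toy dispatch loop, as a sketch of the kind of genericity meant here: to the AOT compiler every branch is live, because the opcodes only arrive at runtime, whereas a JIT compiling a hot method knows exactly which cases actually occur.

    /* Toy bytecode interpreter loop: everything is a runtime decision. */
    enum op { OP_ADD, OP_SUB, OP_CALL, OP_RET };

    long run(const enum op *code, long *stack, int sp) {
        for (;;) {
            switch (*code++) {   /* branch on data the AOT compiler never sees */
            case OP_ADD: stack[sp - 2] += stack[sp - 1]; sp--; break;
            case OP_SUB: stack[sp - 2] -= stack[sp - 1]; sp--; break;
            case OP_CALL: /* dynamic lookup, yet more runtime branching */ break;
            case OP_RET: return stack[sp - 1];
            }
        }
    }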
| ▲ | adgjlsfhk1 17 hours ago | parent | prev [-] | | C is actually a pretty hard language to compile well. It's a very weakly typed language (e.g. malloc returns a void* that the user manually casts to the type they intended) and it exposes raw pointers to the user, which makes analysis really annoying for compilers. | | |
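A small illustration of the aliasing side of this (hypothetical function names): with raw pointers the compiler often can't prove that two pointers refer to distinct objects, so it has to keep reloading values it could otherwise hold in a register.

    /* The store through out may alias *in (both are ints, possibly carved
       out of the same malloc'd buffer), so the compiler must assume the
       store can change *in and reload it on every iteration. */
    void scale_by_index(int *out, const int *in, int n) {
        for (int i = 0; i < n; i++)
            out[i] = *in * i;
    }

    /* Declaring the pointers restrict (or proving they don't overlap)
       lets the compiler hoist the load of *in out of the loop. */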
| ▲ | Jweb_Guru 2 hours ago | parent [-] | | C also has lots of undefined behavior that lets compilers make assumptions they have a very hard time proving in safe languages. C++ takes this even further with stuff like TBAA. Sure, it doesn't give the compiler as much to work with as something like Rust's pervasive restrict or Haskell's pervasive immutability, but on the other hand, the compiler assuming things like "every array index is in bounds and infallible" exposes tons of opportunities for autovectorization etc. I think people exaggerate how hard C is to optimize, at least compared to languages like Java, and especially compared to languages like Ruby, which let users do things like iterate through all the GC roots. | | |
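One way to picture the in-bounds/autovectorization point, as a sketch: because an out-of-bounds access would be undefined behavior, the C compiler is allowed to assume it never happens, and with restrict ruling out overlap a loop like this typically vectorizes with no guards at all.

    /* Out-of-bounds access is undefined behavior, so the compiler may
       assume every x[i] and y[i] is in bounds; with restrict ruling out
       overlap, this typically autovectorizes with no bounds checks. A
       safe language has to prove its checks away (or keep them), which
       can block the same transformation. */
    void saxpy(float *restrict y, const float *restrict x, float a, int n) {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }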
| ▲ | adgjlsfhk1 an hour ago | parent | next [-] | | UB is very much a double-edged sword for compilers. On the one hand, it makes lots of simple optimizations much easier, but on the other, it makes lots of inter-procedural optimizations much harder (since the compiler must be incredibly careful not to introduce UB that the user didn't introduce themselves). There is no compiler that actually uses everything the standard allows (especially wrt atomics), because if one did, everyone's code would break, and figuring out which code transforms were legal would be ridiculously difficult. > at least compared to languages like Java and especially compared to languages like Ruby I hope you didn't take from my previous comment that I think Java is a good language from this perspective. The fact that Java gets even half-decent performance is a miracle given how bad the JVM model is. Ruby is a language I'm really interested to try out, since IMO it was the first language that managed a modicum of optimization with Python-like expressiveness. | |
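A concrete instance of the "must not introduce UB" constraint (a well-known example, not specific to any one compiler): store speculation across a branch is illegal, because the introduced write could race with another thread or touch memory the original program never wrote to.

    void maybe_set(int cond, int *p) {
        /* A compiler might be tempted to remove the branch by writing
           unconditionally, e.g. "*p = cond ? 1 : *p;". That transform is
           not allowed: when cond is false the original program never
           touches *p, and the introduced store could be a data race with
           another thread, or a write to memory that isn't writable. */
        if (cond)
            *p = 1;
    }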
| ▲ | steveklabnik an hour ago | parent | prev [-] | | C also has TBAA, by the way. Lots of people disable it though. |
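A minimal illustration of what TBAA means in C (hypothetical function; GCC and Clang accept -fno-strict-aliasing to turn it off): under strict aliasing rules the compiler may assume a store through a float* does not modify an int object.

    int read_after_float_store(int *counter, float *weight) {
        int before = *counter;
        *weight = 1.0f;            /* assumed not to alias *counter (different types) */
        return before + *counter;  /* *counter need not be reloaded after the store */
    }

    /* Code that type-puns through pointer casts breaks this assumption,
       which is why many projects build with -fno-strict-aliasing. */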
|
|
|