uticus a day ago

> But most of the speedup right now is still coming from rewriting C into Ruby.

At a quick glance, this statement seems backwards: shouldn't C always be faster? Or maybe I'm misunderstanding how the JIT actually works.

molf a day ago | parent | next [-]

C itself is fast; it's calls to C from Ruby that are slow. [1]

Crossing the Ruby -> C boundary means that a JIT compiler cannot optimize the code as much, because it cannot alter or inline the methods implemented in C. Counterintuitively, this means that rewriting (certain?) built-in methods in Ruby leads to performance gains when using YJIT. [2]
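
To make that concrete, here is a minimal sketch, not taken from either linked article. `each_index_below` is a made-up name for a pure-Ruby loop in the style of such rewrites, and which built-ins are implemented in C versus Ruby depends on the Ruby version, so treat the comparison as illustrative only.

```ruby
# Hypothetical pure-Ruby counterpart to a built-in iterator such as
# Integer#times. The name each_index_below is made up for this sketch.
def each_index_below(n)
  i = 0
  while i < n
    yield i
    i += 1
  end
  n
end

# Sums 0...n using the built-in iterator.
def sum_with_builtin(n)
  total = 0
  n.times { |i| total += i }
  total
end

# Sums 0...n using the pure-Ruby loop above. Caller, loop, and block are
# all Ruby bytecode here, so a JIT can in principle see the whole path.
def sum_with_ruby_loop(n)
  total = 0
  each_index_below(n) { |i| total += i }
  total
end
```

Both helpers compute the same sum; the only difference is whether the iteration itself is code the JIT can look into. Any actual speedup depends on the Ruby version and on whether YJIT is enabled.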

[1]: https://railsatscale.com/2023-08-29-ruby-outperforms-c/
[2]: https://jpcamara.com/2024/12/01/speeding-up-ruby.html

vidarh a day ago | parent | prev | next [-]

Unless your JIT can analyse the full code, a transition between byte code and native code is often costly because the JIT won't be able to optimize the full path. Once your JIT generates good enough code, it then becomes faster to avoid that transition even in cases when in isolation native code might still be faster.

EDIT: Note that this isn't an inherent limit. You could write a JIT that could analyze the compiled C code too. It's just that it's much harder to do.

ksec a day ago | parent [-]

And that is what TruffleRuby did. I had wished there were a subset of Ruby that could be compiled to C, and that all gems would be written in that instead. I remember a few people tried but failed, though. Have to dig up the old HN threads again.

vidarh 16 hours ago | parent | next [-]

Compiling a subset of Ruby to C wouldn't be that hard, but making it compile to C that is fast enough to be worth it is. Not because the Ruby VM is particularly fast, but because the "naive" way of compiling Ruby to C still incurs almost all of the overhead.

E.g. TruffleRuby is fast in part because it will do things like try to avoid method calls for built in types where the standard operations haven't been overridden, but that requires a lot of extra machinery...
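
A rough sketch of why the naive translation keeps the overhead: in Ruby, `a + b` is a method call whose target depends on the receiver at run time. `Vec2`, `add`, and `add_via_send` are hypothetical names for illustration only.

```ruby
# Hypothetical user-defined type that supplies its own `+`.
Vec2 = Struct.new(:x, :y) do
  def +(other)
    Vec2.new(x + other.x, y + other.y)
  end
end

# The same source expression dispatches on the receiver at run time.
def add(a, b)
  a + b
end

# What a naive translation has to preserve: an explicit dynamic send,
# because it cannot assume `+` means integer addition.
def add_via_send(a, b)
  a.public_send(:+, b)
end
```

A compiler that wants to turn `add` into a plain machine add first has to establish that both operands are Integers and that `Integer#+` has not been redefined, which is the extra machinery mentioned above.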

So I'm not sure how much compiling to C would help for gems that use C to speed things up.

I think maybe an easier target would be to compile C to a slightly augmented Ruby bytecode. If you control the C compiler you could do things like make C code follow the Ruby calling convention until/unless calling external C code, and avoid a lot of stack overhead.

pjmlp 15 hours ago | parent | prev [-]

Not everyone failed, see RubyMotion.

However they decided it was more useful as a commercial product.

nightpool a day ago | parent | prev [-]

The sibling comments mention that C is used in a lot of places in Ruby that incur cross-language overheads, which is true, but it's also just true that in general, even ignoring this overhead, JIT'd functions are going to be faster than their comparable C functions, because 1) they have more profiling information to be able to work from, 2) they have more type information, and (as a consequence of 1&2) 3) they're more likely to be monomorphized, and the compiler is more able to inline specialized variants of them into different chunks of the code. Among other optimizations!
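
As a rough illustration of points 1 to 3, here is a sketch of what "profile, then specialize behind a guard" looks like when written out by hand. Everything here (`CallSiteProfile`, `profiled_add`, `add_guarded_for_integers`) is hypothetical; a real JIT does this in generated machine code, not in Ruby.

```ruby
# Hypothetical call-site profile. Real VMs typically keep comparable
# data in inline caches rather than in a Ruby object.
class CallSiteProfile
  def initialize
    @seen = Hash.new(0)
  end

  # Record the classes of the arguments observed at this call site.
  def record(*args)
    @seen[args.map(&:class)] += 1
  end

  # True when exactly one combination of argument classes has been seen.
  def monomorphic?
    @seen.size == 1
  end
end

# Generic path: a full dynamic dispatch on `+`.
def generic_add(a, b)
  a + b
end

# Profiling phase: run the generic code while recording what shows up.
def profiled_add(profile, a, b)
  profile.record(a, b)
  generic_add(a, b)
end

# Shape of the code a JIT might emit once the site looks monomorphic:
# a cheap type guard, a fast path, and a fallback when the guard fails.
def add_guarded_for_integers(a, b)
  if a.instance_of?(Integer) && b.instance_of?(Integer)
    a + b # a real JIT would emit a native integer add here
  else
    generic_add(a, b)
  end
end
```

An ahead-of-time C compiler building the interpreter never sees which argument types a particular Ruby call site receives, so it has nothing to base this kind of specialization on.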

Jweb_Guru 11 hours ago | parent | next [-]

If you give the JIT compiler unlimited time with the code, then maybe. For real large applications, optimized JIT compiled code tends to lag behind AOT optimized C or Rust code, though I guess you could argue that these differences are language / runtime related rather than compiler related.

uticus a day ago | parent | prev [-]

> ...they have more profiling information to be able to work from... more type information... more likely to be monomorphized, and the compiler is more able to inline specialized variants of them into different chunks of the code.

this is fascinating to me. i always assumed C had everything in the language that was needed for the compiler to use. in other words, the compiler may have a lot to work through, but the pieces are all available. but this makes it sound like JIT'd functions provide more info to the compiler (more pieces to work with). is there another language besides C that does have language features to indicate to the compiler how to make things as performant as possible?

dhruvrajvanshi a day ago | parent | next [-]

A very simple way to think about it is that if an intrinsic is written in C, the JIT can't easily inline it, whereas all Ruby code can be inlined. Inlining is the most important optimization because it enables other optimizations.

It's not necessarily the fact that C doesn't have enough information, it's just that the JIT can reason about Ruby code better than it can about C code. To the JIT, C code is just some function which does things and the only thing it can do with it is to call it.

On the other hand, a Ruby function's bytecode is available to the JIT, so if it sees fit, it can copy-paste the function body into the call site and eliminate the function call overhead. Further, after the inlining, it can apply a lot of further optimizations across what was previously a function boundary.

In theory, you could have a way to "compile" the C intrinsics into the JIT's IR directly and that would also give you similar results.
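
Written out by hand, the transformation looks roughly like this. It is only a sketch: a JIT does this on bytecode or IR, not on source, and the method names are made up.

```ruby
# A small helper the JIT can see into because it is plain Ruby.
def square(x)
  x * x
end

# Before inlining: two method calls per invocation.
def sum_of_squares(a, b)
  square(a) + square(b)
end

# After inlining: the body of `square` pasted into the call site.
def sum_of_squares_inlined(a, b)
  (a * a) + (b * b)
end

# Once inlined, optimizations can cross the old call boundary. If the
# first argument is known to be 1, square(1) folds to the constant 1.
def one_plus_square(b)
  1 + (b * b)
end
```

If `square` were an opaque C function, the first form is the only one available: the JIT can call it but cannot rewrite around it.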

foobazgt a day ago | parent | prev | next [-]

JITs have runtime information that static compilers do not. Sometimes that's not a huge benefit, but it can often have massive performance implications. For example, a JIT can inline dynamically loaded code into your own code. That sounds unusual, but it's actually ultra-common in practice. For example, this shows up in something as mundane and simple as configurable logging.
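
A sketch of the logging case, with hypothetical class and method names. The logger is picked at run time, so a static compiler has to keep the check, while a JIT that only ever observes `QuietLogger` could inline `debug?` as `false` and drop the branch.

```ruby
# Logger used when debug output is off.
class QuietLogger
  def debug?
    false
  end

  def debug(_message); end
end

# Logger that keeps messages in memory, standing in for a real sink.
class RecordingLogger
  attr_reader :messages

  def initialize
    @messages = []
  end

  def debug?
    true
  end

  def debug(message)
    @messages << message
  end
end

# The concrete logger class is only known at run time (e.g. from config).
def build_logger(level)
  level == "debug" ? RecordingLogger.new : QuietLogger.new
end

# Hot loop with a logging check on every iteration.
def total_with_logging(items, logger)
  total = 0
  items.each do |item|
    logger.debug("adding #{item}") if logger.debug?
    total += item
  end
  total
end
```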

MobiusHorizons 19 hours ago | parent | prev | next [-]

The C code in question is most likely interpreter code that is incredibly generic, meaning it is very branchy based on data that is only known at runtime, and therefore can’t be optimized at compile time. A JIT has the benefit of running the compiler at runtime, when the data is known.
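
As a toy version of that difference (hypothetical names, and closures standing in for generated machine code): the generic routine re-decides what to do on every call, while the specializer decides once, when the operation is known, and hands back code with no branch left in it.

```ruby
# Generic, interpreter-style routine: branches on `op` on every call.
def generic_apply(op, a, b)
  case op
  when :add then a + b
  when :sub then a - b
  when :mul then a * b
  else raise ArgumentError, "unknown op: #{op}"
  end
end

# Toy "specializer": the branch on `op` runs once, at specialization
# time, and the returned lambda contains only the selected operation.
def specialize(op)
  case op
  when :add then ->(a, b) { a + b }
  when :sub then ->(a, b) { a - b }
  when :mul then ->(a, b) { a * b }
  else raise ArgumentError, "unknown op: #{op}"
  end
end
```

For example, `specialize(:mul)` can be created once outside a loop and called inside it, which mirrors how a JIT compiles once and then reuses the specialized code.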

adgjlsfhk1 18 hours ago | parent | prev [-]

C is actually a pretty hard language to compile well. C is a very weakly typed language (e.g. malloc returns a void* that the user manually casts to the type they intended), and exposes raw pointers to the user, which makes analysis for compilers really annoying.

Jweb_Guru 3 hours ago | parent [-]

C also has lots of undefined behavior that lets compilers make assumptions they have a very hard time proving in safe languages. C++ takes this even further with stuff like TBAA. Sure it doesn't give the compiler as much to work with as something like Rust's pervasive restrict or Haskell's pervasive immutability, but on the other hand the compiler assuming things like "every array index is in bounds and infallible" exposes tons of opportunities for autovectorization etc. I think people exaggerate how hard C is to optimize, at least compared to languages like Java and especially compared to languages like Ruby which let users do things like iterate through all the GC roots.

adgjlsfhk1 2 hours ago | parent | next [-]

UB is very much a double-edged sword for compilers. On the one hand, it makes lots of simple optimizations much easier, but on the other, it makes lots of inter-procedural optimizations much harder (since the compiler must be incredibly careful not to introduce UB that the user didn't introduce themselves).

There is no compiler that actually uses all of the things that the standard allows it to do (especially wrt atomics), because if it did, everyone's code would break, and figuring out which code transforms were legal would be ridiculously difficult.

> at least compared to languages like Java and especially compared to languages like Ruby

I hope you didn't take from my previous comment that I think Java is a good language from this perspective. The fact that Java even gets half-decent performance is a miracle given how bad the JVM model is. Ruby is a language I'm really interested to try out since IMO it was the language that first managed a modicum of optimization with Python-like expressiveness.

steveklabnik 2 hours ago | parent | prev [-]

C also has TBAA, by the way. Lots of people disable it though.