hinkley 7 months ago

> It’s very rare for code to allocate exactly the same type of object many times in a row, so the class of the instance local variable will change quite frequently.

That’s dangerous thinking, because constructor calls tend to follow a bimodal distribution.

A graph of calls or objects will contain either a large number of unique objects, layers of alternating objects, or a lot of one type of object. Any map function for instance will tend to return a bunch of the same object. When the median and the mean diverge like this, your thinking about perf gets muddy. An inline cache will make bulk allocations in list comprehensions faster. It won’t make creating DAGs faster. One is better than none.

munificent 7 months ago | parent | next [-]

> Any map function for instance will tend to return a bunch of the same object.

Yes, but if it ends up creating any ephemeral objects in the process of determining those returned objects, then the allocation sequence is still not homogeneous. In Ruby, according to the article, even calling a constructor with named arguments allocates, so it's very easy to still end up cycling through allocating different types.

At the same time, the callsite for any given `.new()` invocation will almost always be creating an instance of the exact same class. The target expression is nearly always just a constant name. That makes it a prime candidate for good inline caching at those callsites.

tenderlove 7 months ago | parent | next [-]

> Yes, but if it ends up creating any ephemeral objects in the process of determining those returned objects, then the allocation sequence is still not homogeneous.

Yes! People might do `map` transformations, but it's very common to do other stuff at the same time. Any other allocations during that transformation would ruin cache hit rate.

> At the same time, the callsite for any given `.new()` invocation will almost always be creating an instance of the exact same class. The target expression is nearly always just a constant name. That makes it a prime candidate for good inline caching at those callsites.

Yes again!

titzer 7 months ago | parent | prev [-]

This is why it's imperative that inline caches learn and adapt to the observed behavior. As long as learning is cheap, identifies profitable cases effectively, and backs off for polymorphic and megamorphic scenarios, it's a win.
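A rough Ruby sketch of what "learn, then back off" could look like for a single callsite. Everything here is hypothetical: the `AdaptiveSiteCache` class, the `MAX_POLYMORPHIC` cutoff of 4, and the `:cached_shape` placeholder are illustrative assumptions, not how any particular VM implements its inline caches.

```ruby
# Hypothetical per-callsite cache that adapts to the classes it observes.
class AdaptiveSiteCache
  MAX_POLYMORPHIC = 4 # assumed cutoff, not taken from any real VM

  attr_reader :state

  def initialize
    @entries = {}           # class => cached allocation info
    @state = :uninitialized
  end

  # Returns true on a cache hit, false when the slow path was taken.
  def lookup(klass)
    return false if @state == :megamorphic # backed off: always slow path
    return true if @entries.key?(klass)

    learn(klass)
    false
  end

  private

  # Record a newly seen class, or give up once too many have been seen.
  def learn(klass)
    if @entries.size >= MAX_POLYMORPHIC
      @entries.clear
      @state = :megamorphic
    else
      @entries[klass] = :cached_shape # placeholder for real cached data
      @state = @entries.size == 1 ? :monomorphic : :polymorphic
    end
  end
end

site = AdaptiveSiteCache.new
first  = site.lookup(String) # miss: the site learns String
second = site.lookup(String) # hit: the site is monomorphic
```

The point of the megamorphic state is that a site seeing too many classes stops paying for learning at all and goes straight to the slow path.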

VM implementer intuition only goes so far, and as the internet is the greatest fuzzer invented, you're definitely going to encounter programs that break your best-laid plans.

munificent 7 months ago | parent [-]

> This is why it's imperative that inline caches learn and adapt to the observed behavior.

True, but if you only have a single bottleneck cache site for all constructor invocations across the program, the only reasonable thing that callsite can learn is "wow, every single constructed class goes through here".

That's why it makes sense to have a separate cache at every `.new()` location.
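To make the difference concrete, here is a toy Ruby model, not the real VM mechanism. It assumes a hypothetical `LastClassCache` that remembers only the last class it saw, and an allocation trace where each `Point` is preceded by a temporary `Hash` (standing in for the keyword-argument allocation mentioned above). The site names `:hash_site` and `:point_site` are made up for illustration.

```ruby
# Hypothetical model: the cache remembers the last class it saw and counts
# a hit when the next allocation it records has the same class.
class LastClassCache
  attr_reader :hits, :misses

  def initialize
    @last = nil
    @hits = 0
    @misses = 0
  end

  def record(klass)
    if @last.equal?(klass)
      @hits += 1
    else
      @misses += 1
      @last = klass
    end
  end
end

Point = Struct.new(:x, :y)

# Allocation trace of a `map` that builds a temporary Hash before each Point.
trace = Array.new(1000) { [Hash, Point] }.flatten

# One shared cache for every allocation in the program.
global = LastClassCache.new
trace.each { |klass| global.record(klass) }

# One cache per allocating callsite.
per_site = { hash_site: LastClassCache.new, point_site: LastClassCache.new }
trace.each do |klass|
  site = klass.equal?(Hash) ? :hash_site : :point_site
  per_site[site].record(klass)
end

puts "global hits:     #{global.hits} of #{trace.size}"
puts "point_site hits: #{per_site[:point_site].hits} of 1000"
```

In this model the shared cache never hits because the classes alternate, while each per-site cache misses once and then hits on every later allocation.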

titzer 7 months ago | parent [-]

Yeah, then you want context-sensitive ICs which are indexed by the callsite. JSC gets some of this by profiling in higher tiers, where inlining might have occurred.

masklinn 7 months ago | parent | prev [-]

> One is better than none.

Not necessarily. An inline cache is cheap but it's not free, even less so when it also comes with the expense of moving Class#new from C to Ruby. It's probably not worth speeding up the 1% at the expense of the 99%.

> An inline cache will make bulk allocations in list comprehensions faster.

Only if such comprehensions create exactly one type of object, if they create two it's going to slow them down, and if they create zero (just do data extraction) it won't do anything.

hinkley 7 months ago | parent [-]

> Only if such comprehensions create exactly one type of object,

We just had this conversation maybe a month ago. If it’s 50-50 then you are correct. However, if it’s skewed then it depends. I can’t recall what ratio was found to be workable, but it was more than 50% and less than or equal to 90%.
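One way to see why the skew matters, under simplifying assumptions that are mine and not from that earlier conversation: a single-entry cache that remembers the last class seen, and allocations that are independently class A with probability `p` and class B otherwise. A hit then needs two consecutive allocations of the same class, so the expected hit rate is `p² + (1 - p)²`. The helper name `expected_hit_rate` is hypothetical, and this does not reproduce the workable ratio mentioned above, which also depends on the relative cost of a hit and a miss.

```ruby
# Expected hit rate of a single-entry "last class seen" cache when each
# allocation is independently class A with probability p, else class B.
def expected_hit_rate(p)
  p * p + (1 - p) * (1 - p)
end

[0.5, 0.7, 0.9].each do |p|
  puts format("p=%.1f  expected hit rate=%.2f", p, expected_hit_rate(p))
end
```

At 50-50 the cache hits only half the time under this model, and the hit rate climbs as the mix becomes more lopsided.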