Remix.run Logo
scottlamb 4 days ago

> Somehow relatedly, I still remember the first time I heard about profile-guided optimization which is essentially the same but for all of your code at once (well, same idea, not sure it's aggressive enough to reach the same result as the anecdote you shared).

Exactly: profile-guided optimization is pretty awesome if you have the right infrastructure. You can get maybe 15% speedup on your program without going deep into all the code. But it has limits. IIUC, it gathers information about which code is hot, including which branches are taken, and enables certain compiler optimizations based on that. [1] But it's not going to make huge algorithm changes, it's not going to change data structure definitions [2], and it's not going to prepare good microbenchmark input for you to compare your new algorithm against the current one.

Also, this article is about Java, and profile-guided optimization is something Java just has baked into its JIT already.

[1] I don't know exactly which, and they likely vary by compiler version. But I think a variety of stuff to better use cache: breaking code apart into likely-"hot" vs likely-"cold" pages, deciding which loops are worth aligning, and where inlining is worthwhile. Also where branches should be optimized for likely vs unlikely vs unpredictable (conditional instructions). When it's worth autovectorizing. But it's not as good at SIMD as good as a human who puts two weeks into it, and SIMD usage is constrained by the inability to change data structures.

[2] in theory, in Rust's default #[repr(Rust)] it could reorder a struct's members, but it's not going to say change from array-of-struct to struct-of-array format or hashmap to btree.