anlsh 3 days ago

Oh neat, a post I actually know something about! I worked a lot on userfaultfd performance for GCE's live migration post-copy a couple years ago. Or more specifically, I worked on mechanisms to avoid it entirely: due to lock contention in the kernel, faults become veeeerry slow as the number of vcpus scales, and as it happens VMs these days can have a lot of vcpus.

shayonj 3 days ago

That's very interesting! I was noticing page fault storms on live migrations as well, and I wonder if that's what you were running into / mentioning here regarding the lock contention.
