jasonjayr 3 days ago

This strikes me as something Kubernetes could handle if it supported it. You can already use affinity to ensure workloads stay together on the same machines. If K8s were NUMA-aware, you could extend that affinity/anti-affinity mechanism down to the core/socket level.

EDIT: aaaand ... I commented before reading the article, which describes this very mechanism.

jauntywundrkind 3 days ago

It'd be great to see Kubernetes make more extensive use of cgroups & especially nested cgroups, imo. The cpuset affinity should build into that layer nicely, imo. More broadly, Kubernetes' desire to schedule everything itself, to fit the workloads intelligently to ensure they run successfully, feels like an anti-pattern when the kernel has a much more aggressive way to let you trade off and define priorities and bound resources; it sucks having the ultra lo-fi kube take. I want the kernel's "let it fail" version where nested cgroups get to fight it out according to their allocations.
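
To make "fight it out according to their allocations" concrete, here is a rough sketch of how cgroup v2 `cpu.weight` splits CPU time through a nested hierarchy when every leaf is busy: at each level, siblings divide their parent's share in proportion to their weights. The `Cgroup` class, the `cpu_shares` helper, the pod names, and the weights are all made up for illustration; it is a toy model of the arithmetic, not how the kubelet or the kernel scheduler is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Cgroup:
    """Hypothetical stand-in for a cgroup v2 node; not a real API."""
    name: str
    weight: int = 100  # cgroup v2 cpu.weight defaults to 100
    children: list["Cgroup"] = field(default_factory=list)


def cpu_shares(node, total=1.0, prefix=""):
    """Return {leaf path: fraction of CPU}, assuming every leaf is fully busy.

    Each level splits its parent's share among siblings in proportion
    to their weights, so a leaf's share is the product down its path.
    """
    path = f"{prefix}/{node.name}" if prefix else node.name
    if not node.children:
        return {path: total}
    weight_sum = sum(child.weight for child in node.children)
    shares = {}
    for child in node.children:
        shares.update(cpu_shares(child, total * child.weight / weight_sum, path))
    return shares


# Illustrative hierarchy loosely shaped like the QoS-class layout; weights are invented.
tree = Cgroup("kubepods", children=[
    Cgroup("burstable", weight=300, children=[
        Cgroup("pod-a", weight=200),
        Cgroup("pod-b", weight=100),
    ]),
    Cgroup("besteffort", weight=100, children=[
        Cgroup("pod-c"),
    ]),
])

for path, share in cpu_shares(tree).items():
    print(f"{path}: {share:.0%}")
```

Under those assumed weights, pod-a ends up with half the CPU under full contention and the other two pods split the rest evenly. If a sibling goes idle, the kernel hands its time to whoever is still runnable, which this toy model does not capture.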

Really enjoyed this amazing write-up on how Kube does use cgroups. Seems like the QoS controls do give some top-level cgroups that pods then nest inside of. That's something, at least! https://martinheinz.dev/blog/91