kragen 4 days ago

What if the kernel handled unimplemented instruction faults by migrating the process to a core that does implement the instruction and restarting the faulting instruction?
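
As a toy illustration of the idea, here is a user-space Python model, not kernel code. `Core`, `Process`, `run`, and the `"avx512"` extension label are all made-up names for this sketch; a real kernel would do this in its illegal-instruction trap handler and would pick the target core through the scheduler rather than taking the first capable one.

```python
from dataclasses import dataclass


class UnimplementedInstruction(Exception):
    """Stands in for the hardware fault raised on an unsupported opcode."""


@dataclass
class Core:
    name: str
    extensions: frozenset

    def execute(self, instr):
        # instr is (mnemonic, required_extension or None)
        _, needed = instr
        if needed is not None and needed not in self.extensions:
            raise UnimplementedInstruction(needed)


@dataclass
class Process:
    program: list
    pc: int = 0
    core: Core = None
    migrations: int = 0


def run(proc, cores):
    """Run to completion, migrating on unimplemented-instruction faults."""
    while proc.pc < len(proc.program):
        try:
            proc.core.execute(proc.program[proc.pc])
        except UnimplementedInstruction as fault:
            needed = fault.args[0]
            capable = [c for c in cores if needed in c.extensions]
            if not capable:
                raise  # no core implements it: a genuine illegal instruction
            proc.core = capable[0]   # simplistic choice, assumed for the sketch
            proc.migrations += 1
            continue                 # pc unchanged, so the instruction restarts
        proc.pc += 1
```

A program with one extension-dependent instruction, started on a core without that extension, migrates exactly once and then finishes on the capable core.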

MBCook 3 days ago

What if that core isn’t free? What if it’s not going to be free for a long time?

That could be a recipe for random long stalls for some processes.

kragen 3 days ago

I don't think avoiding such pathological cases would be that hard. See https://news.ycombinator.com/item?id=45178286

mrheosuper 3 days ago

> What if that core isn’t free

Just context switch it, the way you run two programs on a single-core CPU.

kragen 3 days ago

It's correct to point out that you could end up in a situation where your "big" cores are all heavily loaded and your "small" cores with fewer instructions are all idle. That's unavoidable if your whole workload needs the AVX512 instructions or whatever, but it could be catastrophic if your OS just mistakenly thinks it does. But that doesn't seem unavoidable; see my comments further down the thread.

Rohansi 3 days ago

Sounds great for performance.

kragen 3 days ago

Would this be more or less costly than a page fault? It seems like it would be easy to arrange for it to happen very rarely unless none of your cores support all the instructions.

Rohansi 3 days ago

Most likely similar. What would the correct behavior be for the scheduler to avoid hitting it in the future? Flag the process as needing the X instruction set extension so it only runs on the high-performance cores?

kragen 3 days ago

Yeah, although maybe the flag should decay after a while? You want to avoid either spending significant percentages of your time trying to run processes that make no progress because they need unavailable instructions or delaying processes significantly because they are waiting for resources they no longer need.

This sounds a little bit subtle, in the way most operating system policy problems are subtle, but far from intractable. Most of the time all your processes are making progress and either all your cores are busy or you don't have enough runnable processes to keep them busy. In the occasional case where this is not true, you can try optimistically deflagging processes that have made some progress since they were last flagged. Worst case, you context switch an idle core to a process that immediately faults. If your load average is 256 you could maybe do this 256 times in a row at most, at a cost of around a microsecond each? Maybe you have wasted a millisecond on a core that would have been idle?
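
The back-of-the-envelope bound above can be written out explicitly. This uses only the two figures from the paragraph (a load average of 256 and roughly a microsecond per wasted context switch); both are rough assumptions rather than measurements, and the function name is invented for the sketch.

```python
LOAD_AVERAGE = 256      # runnable processes, the figure used above
SWITCH_COST_US = 1.0    # assumed cost of one wasted context switch, in microseconds


def worst_case_wasted_us(runnable, switch_cost_us):
    """Upper bound: every runnable process is tried once and faults immediately."""
    return runnable * switch_cost_us


wasted_us = worst_case_wasted_us(LOAD_AVERAGE, SWITCH_COST_US)
wasted_ms = wasted_us / 1000
```

Under those assumptions the total comes to 256 microseconds, about a quarter of a millisecond, so "a millisecond" is a comfortable upper bound.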

And you probably want the flag lifetime to be on the order of a second normally, so you're not forced to make suboptimal scheduling decisions by outdated flags in order to avoid that microsecond of wasted context switching.
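
A minimal sketch of the decaying flag, assuming a one-second lifetime as suggested above. `Task`, `eligible_cores`, and `FLAG_LIFETIME_S` are hypothetical names for illustration, and time is passed in explicitly so the behavior is easy to follow. A real scheduler would also do the optimistic deflagging described earlier, which this sketch leaves out.

```python
from dataclasses import dataclass, field

FLAG_LIFETIME_S = 1.0  # assumed: "on the order of a second"


@dataclass
class Task:
    name: str
    # extension name -> time (in seconds) at which the flag was last set
    flags: dict = field(default_factory=dict)

    def flag(self, extension, now):
        """Record that the task faulted on an instruction from `extension`."""
        self.flags[extension] = now

    def live_flags(self, now):
        """Drop flags older than the lifetime and return what is still needed."""
        self.flags = {ext: t for ext, t in self.flags.items()
                      if now - t < FLAG_LIFETIME_S}
        return set(self.flags)


def eligible_cores(task, cores, now):
    """Return the names of cores implementing every live flagged extension.

    `cores` maps a core name to the set of extensions it implements.
    """
    needed = task.live_flags(now)
    return [name for name, exts in cores.items() if needed <= exts]
```

With `cores = {"big0": {"avx512"}, "little0": set()}`, a task flagged for `"avx512"` at time 0 is restricted to `big0` at time 0.5 and becomes eligible for both cores again at time 1.5, once the flag has expired.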