tux3 11 hours ago
My first instinct for a poorly predicted branch would be to use a conditional move. This isn't always a win: you prevent the CPU from speculating down the wrong path, but you also prevent it from speculating down the correct one.

If you really don't care about the failure path and really don't mind unmaintainable low-level hacks, I can think of a few ways to get creative.

First, there's the whole array of anti-uarch-speculation-exploit tricks in the kernel that you can use as inspiration to control what the CPU is allowed to speculate. These little bits of assembly were reviewed by engineers from Intel and AMD, so they can't stop working without also breaking the kernel.

Another idea is to take inspiration from anti-reverse-engineering tricks: make the failure path an actual exception. I don't mean software stack unwinding; I mean divide by your boolean and then call your send function unconditionally. If the boolean is true, it costs nothing, because the result of the division is unused and we just speculate past it. If the boolean is false, the CPU raises a divide-by-zero exception, and this invisible branch will never be predicted by the CPU. Then your exception handler recovers and calls the cold path.
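To make the conditional-move option concrete, here's a minimal sketch in C. `select_branchless` is a made-up helper, and whether the compiler emits a `cmov`, the mask arithmetic as written, or a branch anyway is an assumption you'd have to check in the generated assembly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns if_true when cond is set, else if_false,
 * with no conditional jump in the source. */
static inline uint64_t select_branchless(bool cond,
                                         uint64_t if_true,
                                         uint64_t if_false)
{
    /* cond == true  -> mask is all ones
     * cond == false -> mask is zero */
    uint64_t mask = (uint64_t)0 - (uint64_t)cond;
    return (if_true & mask) | (if_false & ~mask);
}
```

Both operands have to be ready before the select can complete, which is the trade-off: nothing to mispredict, but nothing to speculate ahead on either.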
sparkie 6 hours ago
We could potentially use a conditional move and an unconditional jump to make the branch target predictor do the work instead, and flood it with a bunch of targets which are intended to mispredict. E.g., we could give 255 different paths for abandon and select one randomly:
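As a rough sketch of that layout in C (illustrative only: `resolve`, `send_path`, `init_targets`, the `abandon_*` stubs and the counters are hypothetical names, the random value is passed in as `rnd` instead of being fetched from `random`, and an optimizer may merge the identical stubs, so a real version would likely need assembly):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative counters standing in for the real send/abandon work. */
static int sent_count;
static int abandon_count;

typedef void (*path_fn)(bool ok);

/* Expand m over 256 base-4 suffixes: 0000, 0001, ..., 3333. */
#define R4(m, p)  m(p##0) m(p##1) m(p##2) m(p##3)
#define R16(m, p) R4(m, p##0) R4(m, p##1) R4(m, p##2) R4(m, p##3)
#define R64(m, p) R16(m, p##0) R16(m, p##1) R16(m, p##2) R16(m, p##3)
#define R256(m)   R64(m, 0) R64(m, 1) R64(m, 2) R64(m, 3)

/* One distinct abandon stub per suffix, each a separate call target. */
#define STUB(n) static void abandon_##n(bool ok) { (void)ok; abandon_count++; }
R256(STUB)

/* Slot 0 target: rechecks the condition, because a failed resolve
 * lands here whenever the low byte of the random value is 0. */
static void send_path(bool ok)
{
    if (!ok) {
        abandon_count++;
        return;
    }
    sent_count++;
}

#define ENTRY(n) abandon_##n,
static path_fn targets[256] = { R256(ENTRY) };

/* Overwrite slot 0 with the send path; slots 1..255 stay abandon stubs. */
static void init_targets(void)
{
    targets[0] = send_path;
}

static void resolve(bool ok, uint32_t rnd)
{
    uint32_t mask = (uint32_t)ok - 1u;   /* 0 when ok, all ones otherwise */
    uint32_t idx  = rnd & 0xffu & mask;  /* ok -> slot 0, else low byte of rnd */
    targets[idx](ok);                    /* one indirect call, no conditional jump */
}
```

The index is computed with mask arithmetic in place of a literal `cmov`, so the only branch the predictor sees on the way in is the indirect call.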
Assuming no inherent bias in the low byte produced by `random`, there's only a ~1/255 chance that an abandon branch will correctly predict, though this is also true for the send branch. The conditional branch in send, though, should only mispredict 1/256 times (when `random` returns 0).

If we're sending significantly more often than 1 in 256 calls to resolve, it may be possible to train the branch target predictor to prefer the send branch, as it will correctly predict this branch more often than the others, which are chosen randomly. This would depend on how the branch target predictor is implemented in the processor.