markisus 4 hours ago

I wish there were more details on this part.

> Missing @cython.cdivision(True) inserts a zero-division check before every floating-point divide in the inner loop. Millions of branches that are never taken.

I thought never-taken branches were essentially free. Does this mean something in the loop is messing with the branch predictor?
pavpanchekha 4 hours ago

They're cheap but not free, especially at the front end of the CPU, where it's just a lot more instructions to churn through. What the branch predictor gets you is that a correctly predicted branch, which would otherwise cause a pipeline bubble, executes like straight-line code. It's a bit like a tracing JIT. But you still have a bunch of extra instructions to, like, compute the branch predicate.
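
As a rough sketch of where those extra instructions come from, here are two hand-written C loops. This is not Cython's actual generated code, and the names scale_checked and scale_unchecked are made up for illustration. The first loop tests the divisor on every iteration, roughly what you'd expect without cdivision(True); the second is the plain C division you'd expect with it.

```c
#include <stddef.h>

/* Hypothetical stand-in for the default (checked) loop: every iteration
   compares the divisor with zero before dividing. Returns 0 on success,
   or -1 at the first zero divisor (where Cython would instead raise
   ZeroDivisionError). */
int scale_checked(const double *num, const double *den, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (den[i] == 0.0) {   /* extra compare + branch, almost never taken */
            return -1;
        }
        out[i] = num[i] / den[i];
    }
    return 0;
}

/* Hypothetical stand-in for the cdivision(True) loop: plain C division,
   with no test on the divisor. */
void scale_unchecked(const double *num, const double *den, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = num[i] / den[i];
    }
}
```

Both give the same results on nonzero divisors. The checked version still has to issue the compare and the conditional jump on every iteration even when the predictor gets it right every time, and the early exit may also make the loop harder for the compiler to vectorize.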