▲ | brucehoult 4 days ago | |
> this branch has been taken 15/16 of the last times we've hit it

That is kind of how it worked more than 30 years ago (pre 1995), but not since, at least in OoO CPUs. In fact it was found that having more than a 2-bit saturating counter doesn't help, because when the situation changes it takes too many bad predictions in a row before the counter starts predicting that, actually, this branch is not being taken any more.

What both the Pentium Pro and PowerPC 604 (the first OoO designs in each family) had was a global history of how you GOT TO the current branch. The Pentium Pro had 4 bits of taken/not-taken history for the last four conditional branches, and this was used to decide which 2-bit counter to use for a given branch instruction. The PowerPC 604 used 6 bits of history.

The Pentium Pro algorithm for combining the branch address with the history (XOR them!) is called "gshare". The PPC604 did something a little bit different but I'm not sure what. By the PPC750 Motorola was using basically the same gshare algorithm as Intel.

There are newer and better algorithms today (exactly what is somewhat secret in leading-edge CPUs), but gshare is simple and is common in low-end in-order and small OoO CPUs to this day. The Berkeley BOOM core uses a 13-bit branch history. I think early SiFive in-order cores such as the E31 and U54 used 10 bits.
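For anyone who wants to see the XOR indexing concretely, here is a minimal Python sketch of a gshare-style predictor. It is an illustration under assumptions, not any real CPU's design: the names `GsharePredictor` and `count_mispredictions`, the table size, the `pc >> 2` shift (which assumes 4-byte aligned instructions), and the initial counter state are all made up for the example.

```python
class GsharePredictor:
    """Toy gshare: XOR the branch address with global history to pick a 2-bit counter."""

    def __init__(self, history_bits=4, index_bits=10):
        # Both sizes are illustrative assumptions, not taken from any real core.
        self.history_mask = (1 << history_bits) - 1
        self.index_mask = (1 << index_bits) - 1
        self.history = 0                         # global taken/not-taken shift register
        self.counters = [1] * (1 << index_bits)  # 2-bit counters, start "weakly not taken"

    def _index(self, pc):
        # Assumes 4-byte aligned instructions, so the low two address bits are dropped.
        return ((pc >> 2) ^ self.history) & self.index_mask

    def predict(self, pc):
        # Counter values 2 and 3 mean "predict taken".
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)  # saturate at 3
        else:
            self.counters[i] = max(0, self.counters[i] - 1)  # saturate at 0
        # Shift the actual outcome into the global history.
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


def count_mispredictions(predictor, pc, outcomes):
    """Run one branch through the predictor and count wrong guesses."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong


if __name__ == "__main__":
    # A branch that strictly alternates taken / not taken.
    pattern = [i % 2 == 0 for i in range(100)]
    print("with 4 bits of history:", count_mispredictions(GsharePredictor(4), 0x40, pattern))
    print("lone 2-bit counter:   ", count_mispredictions(GsharePredictor(0), 0x40, pattern))
```

Setting `history_bits=0` degenerates this into a single 2-bit counter per table entry, which makes it easy to compare against the history-indexed version on a pattern like an alternating branch.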
▲ | duskwuff 4 days ago |
Fair point, I oversimplified a bit. Either way, what matters is that it's dynamic.