Hmm, that's interesting. The code as written only has one branch, the if statement (well, two, the while loop exit clause as well). My mental model of the branch predictor was that for each branch, the CPU maintained some internal state like "probably taken/not taken" or "indeterminate", and it "learned" by executing the branch many times.

But that's clearly not right, because apparently the specific data it's branching off matters too? Like, "test memory location X, and branch at location Y", and it remembers both the specific memory location and which specific branch branches off of it? That's really impressive, I didn't think branch predictors worked like that.

Or does it learn the exact pattern? "After the pattern ...0101101011000 (each 0/1 representing the branch not taken/taken), it's probably 1 next time"?

rayiner 16 hours ago | parent | next [-]

Your mental model is close. Predictors generally work by having some sort of table of predictions and indexing into that table (usually using some sort of hashing) to obtain the predictions.

The simplest thing to do is use the address of the branch instruction as the index into the table. That way, each branch instruction maps onto a (not necessarily unique) entry in the table. Each entry is usually a two-bit saturating counter whose four states run from strongly not-taken to strongly taken, so the two middle states act as a weak, easily flipped prediction.
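To make the table idea concrete, here is a minimal sketch of an address-indexed predictor. The class name, the `table_bits` parameter, the use of the low address bits as the "hash", and the initial counter value are all assumptions for illustration, not a description of any real CPU.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters indexed by branch address (sketch)."""

    def __init__(self, table_bits=10):
        self.mask = (1 << table_bits) - 1
        # Counter values 0..3; 0-1 predict not taken, 2-3 predict taken.
        # Starting at 1 ("weakly not taken") is an arbitrary choice.
        self.table = [1] * (1 << table_bits)

    def _index(self, branch_addr):
        # Low address bits stand in for the hash; distinct branches can alias.
        return branch_addr & self.mask

    def predict(self, branch_addr):
        return self.table[self._index(branch_addr)] >= 2

    def update(self, branch_addr, taken):
        i = self._index(branch_addr)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)  # saturate at 3
        else:
            self.table[i] = max(0, self.table[i] - 1)  # saturate at 0


def count_mispredictions(predictor, branch_addr, outcomes):
    """Replay a list of True/False outcomes for one branch and count misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(branch_addr) != taken:
            misses += 1
        predictor.update(branch_addr, taken)
    return misses
```

Two branches whose addresses share the same low bits land on the same counter, which is the "not necessarily unique" part.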

But you can add additional information to the key. For example, a gselect predictor maintains a shift register with the outcome of the last M branches. Then it combines that shift register along with the address of the branch instruction to index into the table: https://people.cs.pitt.edu/~childers/CS2410/slides/lect-bran... (page 9). That means that the same branch instruction will map to multiple entries of the table, depending on the pattern of branches in the shift register. So you can get different predictions for the same branch depending on what else has happened.
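A rough sketch of that indexing scheme, assuming the history bits are simply concatenated with the low bits of the branch address. The names `GselectPredictor`, `addr_bits`, and `history_bits` and the table sizes are made up for illustration.

```python
class GselectPredictor:
    """2-bit counters indexed by (branch address bits, global history) (sketch)."""

    def __init__(self, addr_bits=6, history_bits=4):
        self.addr_bits = addr_bits
        self.history_bits = history_bits
        self.history = 0  # shift register holding the last M branch outcomes
        self.table = [1] * (1 << (addr_bits + history_bits))

    def _index(self, branch_addr):
        addr_part = branch_addr & ((1 << self.addr_bits) - 1)
        # Concatenate address bits and history bits to form the table index.
        return (addr_part << self.history_bits) | self.history

    def predict(self, branch_addr):
        return self.table[self._index(branch_addr)] >= 2

    def update(self, branch_addr, taken):
        i = self._index(branch_addr)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
        # Shift the newest outcome into the history register.
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask
```

Calling `_index` for the same address before and after an `update` returns different entries, because the history register has changed in between.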

That, for example, lets you predict small-iteration loops. Say you have a loop inside a loop, where the inner loop iterates 4 times. So you’ll have a taken branch (back to the loop header) three times but then a not-taken branch on the fourth. If you track that in the branch history shift register, you might get something like this (with 1s being taken branches):

11101110

If you use this to index into a large enough branch table, the table entries corresponding to the shift register ending in “0111” will have a prediction that the branch will be not taken (i.e. the next outcome will be a 0) while the table entries corresponding to the shift register ending in say “1110” will have a prediction that the next branch will be taken.
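The loop example can be replayed in a few lines. This is a toy simulation of a single branch, so the table is indexed by history alone; the function name, the starting counter value, and the 250 repetitions are arbitrary assumptions.

```python
def mispredictions(outcomes, history_bits):
    """Count misses for ONE branch using 2-bit counters indexed by the last
    `history_bits` outcomes (0 bits means a single shared counter)."""
    table = [1] * (1 << history_bits)
    mask = (1 << history_bits) - 1
    history = 0
    misses = 0
    for taken in outcomes:
        counter = table[history]
        if (counter >= 2) != taken:
            misses += 1
        table[history] = min(3, counter + 1) if taken else max(0, counter - 1)
        history = ((history << 1) | int(taken)) & mask
    return misses


# Inner loop branch: taken three times, then not taken, repeated.
pattern = [True, True, True, False] * 250

no_history = mispredictions(pattern, history_bits=0)
with_history = mispredictions(pattern, history_bits=4)
print(no_history, with_history)
```

Without history, the counter mispredicts the loop exit on every pass. With four bits of history, the entry reached after "0111" learns not-taken and the misses stop after a short warm-up.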

So the basic principle of having a big table of branch predictions can be extended in many ways by using various information to index into the table.

	fc417fc802 3 hours ago | parent [-]
		It seems like that would struggle with detecting how many layers of branching to pay attention to. Imagine the two nested loops surrounded by a randomized one. Wouldn't that implementation keep hitting patterns it hadn't seen before? Obviously that must be a solved problem; I'd be curious to know what the solution is.

jcalvinowens 14 hours ago | parent | prev | next [-]

Check out [1]: it has the most thorough description of branch prediction I've ever seen (chapter 3), across a lot of historical and current CPUs. It is mostly empirical, so you do have to take it with a grain of salt sometimes (the author acknowledges this).

Supposedly the branch prediction on modern AMD CPUs is far more sophisticated, based on [2] (a citation pulled from [1]).

[1] https://www.agner.org/optimize/microarchitecture.pdf

[2] https://www.cs.utexas.edu/%7Elin/papers/hpca01.pdf

stingraycharles 2 hours ago | parent | prev | next [-]

> My mental model of the branch predictor was that for each branch, the CPU maintained some internal state like "probably taken/not taken" or "indeterminate", and it "learned" by executing the branch many times.

I always figured the algorithm was much simpler: it would just predict the same outcome as the branch's last execution, which should work fairly well.
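For comparison, that "same as last time" scheme is easy to sketch. The function name and the initial guess of not-taken are assumptions for illustration.

```python
def last_outcome_mispredictions(outcomes, initial=False):
    """Predict that a branch repeats its previous outcome; count the misses."""
    last = initial
    misses = 0
    for taken in outcomes:
        if last != taken:
            misses += 1
        last = taken
    return misses


# A branch that is always taken is missed only on its first execution.
always_taken = last_outcome_mispredictions([True] * 100)

# A 4-iteration loop branch: missed at every loop exit and every re-entry.
loop_branch = last_outcome_mispredictions([True, True, True, False] * 250)
print(always_taken, loop_branch)
```

It does work well on heavily biased branches, but on a short loop it misses twice per pass (once on exit, once on re-entry), which is part of why counters and history get added on top.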

Didn’t realize it used the input value as well, which to me makes no sense — the whole point is to avoid having to inspect the value. This article raises more questions than answers, I’m intrigued now.

LPisGood 16 hours ago | parent | prev | next [-]

There are many branch prediction algorithms out there. They range from fun architecture papers that try to use machine learning to static predictors that don’t adapt to prior outcomes at all.

gpderetta 16 hours ago | parent | prev [-]

Typical branch predictors can both learn patterns (even very long patterns) and use branch history (the probability of a branch being taken depends on the path taken to reach that branch). They don't normally look at data other than branch addresses (and targets for indirect branches).

jeffbee 16 hours ago | parent [-]

They can't. The data that would be needed isn't available at the time the prediction is made.

	1718627440 16 hours ago | parent [-]
		Yeah, otherwise you wouldn't need to predict anything.