adgjlsfhk1 a day ago

The part that's really weird is that on modern CPUs predicted branches are free iff they're sufficiently rare (<1 out of 8 instructions or so). But if you have too many, you will be bottlenecked on the branches, since you aren't allowed to speculate past a 2nd (3rd on Zen 5 without hyperthreading?) branch.

dzaima a day ago

The limiting thing isn't necessarily speculation, but more just the number of branches per cycle, i.e. the number of non-contiguous locations the processor has to fetch from the L1 / uop cache (and whose targets the branch predictor has to determine). You hit that limit with unconditional branches too.

gpderetta 7 hours ago

Indeed, the limit is on taken branches, which is why making the most likely case fall through is often an optimization.
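
To make the fall-through point concrete, here is a rough sketch in C, assuming GCC or Clang. The LIKELY/UNLIKELY macros and the sum_nonnegative function are made up for illustration; __builtin_expect is only a hint, and the compiler is free to ignore it, or to turn the branch into a conditional move or vectorize the loop, so check the generated assembly before trusting it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macros. __builtin_expect is a GCC/Clang builtin;
 * on other compilers these fall back to the plain condition. */
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Sums the non-negative entries of v and counts the negative ones.
 * Negative entries are assumed to be rare. */
int64_t sum_nonnegative(const int32_t *v, size_t n, size_t *errors)
{
    int64_t total = 0;
    *errors = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(v[i] < 0)) {
            /* Rare path: the hint asks the compiler to lay this out
             * away from the hot path, so the common case can fall
             * through without a taken branch. */
            (*errors)++;
            continue;
        }
        /* Common path: intended to be the fall-through. */
        total += v[i];
    }
    return total;
}
```

The hint changes only code layout, not the result: with the input {1, 2, -3, 4} the function returns 7 and reports one error either way. The loop's back edge is still a taken branch on every iteration, so this only removes the extra taken branch inside the body.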