I do not understand how branch target prediction (BTP) differs from branch prediction (BP). Yes, I understand that BP evaluates whether a conditional is true/false, but surely this implicitly determines the “target” instruction too?
If I predict the first branch of an IF to be true, then surely I have just determined the branch target as well (the code within the predicted IF branch)?
4
Indirect jumps / calls (through a function pointer or a switch using a jump table) don’t have the branch target available when an instruction is decoded. Even without a cache miss or something, jmp rax has to fetch the value of rax from the register file (or forwarding network). The Branch Target Buffer predicts the target address way ahead of this, so code fetch can start ASAP. A sophisticated BTB can recognize patterns, like an indirect jump that alternates between two targets. Good BTB performance is critical for indirect jumps. (Combined with the usual need for good normal branch prediction (taken vs. not-taken) for conditional indirect branches.)
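To make the idea of predicting an indirect jump's target concrete, here is a minimal sketch in C of a "last target" BTB: a small direct-mapped table indexed by the address of the jump instruction. Everything here (the names Btb, btb_predict, btb_update, btb_count_hits, and the 16-entry size) is hypothetical and chosen for illustration; it does not describe how any particular CPU implements its BTB.

```c
#include <stdint.h>
#include <stddef.h>

#define BTB_ENTRIES 16  /* hypothetical size; must be a power of two */

typedef struct {
    uint64_t tag;     /* address of the jump instruction */
    uint64_t target;  /* target observed the last time it executed */
    int valid;
} BtbEntry;

typedef struct {
    BtbEntry entries[BTB_ENTRIES];
} Btb;

/* Pick a table slot from the low bits of the jump's address. */
static size_t btb_index(uint64_t pc) {
    return (size_t)(pc & (BTB_ENTRIES - 1));
}

/* Returns 1 and writes *target if the BTB has a prediction for pc, else 0. */
int btb_predict(const Btb *btb, uint64_t pc, uint64_t *target) {
    const BtbEntry *e = &btb->entries[btb_index(pc)];
    if (e->valid && e->tag == pc) {
        *target = e->target;
        return 1;
    }
    return 0;
}

/* Record the target the jump actually went to. */
void btb_update(Btb *btb, uint64_t pc, uint64_t actual_target) {
    BtbEntry *e = &btb->entries[btb_index(pc)];
    e->tag = pc;
    e->target = actual_target;
    e->valid = 1;
}

/* Replay a sequence of actual targets for one indirect jump at pc and
   count how many times the prediction was both present and correct. */
int btb_count_hits(Btb *btb, uint64_t pc, const uint64_t *targets, size_t n) {
    int hits = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t guess;
        if (btb_predict(btb, pc, &guess) && guess == targets[i]) {
            hits++;
        }
        btb_update(btb, pc, targets[i]);
    }
    return hits;
}
```

With this simple scheme, a jump that always goes to the same place is predicted correctly after its first execution, while a jump that strictly alternates between two targets is mispredicted every time. That is why a predictor that recognizes patterns helps for the alternating case.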
A BTB can also predict where a jump is going before the jmp instruction is even decoded. This is important on an architecture like x86, where the variable-length instruction encoding makes it impossible to scan the instruction stream for upcoming branches ahead of what’s currently being decoded. There are multiple fetch / pre-decode pipeline stages ahead of the point where a jmp is fully decoded.
See this SO question for an example of the BTB speeding up normal unconditional direct jmp instructions on an Intel Broadwell. A loop with far fewer than 4096 jmps runs at 1 jmp per ~3 cycles, while a large loop (with many more jmps than the BTB can hold) runs slower, like 1 jmp per ~12 cycles.
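The capacity effect can be sketched with a toy model in C. This assumes a direct-mapped table and jumps placed 4 bytes apart at made-up addresses; real BTBs are organized differently, and the function name btb_steady_state_misses is hypothetical. It only shows why a loop with more jumps than entries keeps missing, and does not attempt to reproduce the cycle counts above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Walk a loop containing n_jumps jump instructions `passes` times through a
   direct-mapped BTB with `entries` slots, and return the number of BTB misses
   in the final pass. Returns -1 if entries is 0 or allocation fails. */
long btb_steady_state_misses(size_t entries, size_t n_jumps, int passes) {
    if (entries == 0) {
        return -1;
    }
    uint64_t *tags = calloc(entries, sizeof *tags);  /* 0 means "empty" */
    if (tags == NULL) {
        return -1;
    }
    long misses = 0;
    for (int p = 0; p < passes; p++) {
        misses = 0;  /* only the last pass is reported */
        for (size_t j = 0; j < n_jumps; j++) {
            /* Hypothetical layout: one jump every 4 bytes from 0x1000. */
            uint64_t pc = 0x1000 + 4 * (uint64_t)j;
            size_t slot = (size_t)((pc / 4) % entries);
            if (tags[slot] != pc) {
                misses++;          /* no usable prediction for this jump */
                tags[slot] = pc;   /* the new jump evicts the old entry */
            }
        }
    }
    free(tags);
    return misses;
}
```

In this model, 1000 jumps in a 4096-entry table all hit once the table has warmed up, while 8192 jumps evict each other and miss on every lookup.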
3
Put simply:
Branch Prediction predicts the answer to “Will I branch?”
Branch Target Prediction predicts the answer to “Where will I branch to?”
Both of these are considered at the same time. An optimal CPU will be able to predict not only whether or not a branch will happen, but also where it will branch to. This allows it to start pushing the instructions it predicts are coming up through the pipeline.
Consider the following loop in assembly:
loop: ADDI R1, R1, #-1 ;subtract 1 from the value stored in R1
BNEZ R1, loop ; if the value of R1 is not 0, go to the code at label "loop"
SUBI R2, R2, #2 ; completely unrelated to the loop
This code will repeatedly subtract 1 from the value stored in R1, until the value in R1 is 0. Pretty simple, if not practical. Roughly equivalent C code might read (strictly, the assembly decrements before it tests, like a do-while):
while(a != 0)
a--;
According to Wikipedia:
In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure.
In this example, the branch predictor will try to guess if the value of R1 is, in fact, 0, by various methods covered in the Wikipedia article. All the branch predictor does is determine “Yes, this branch is going to be taken”, or “No, this branch will not be taken.”
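As one illustration of such a method, here is a sketch in C of a 2-bit saturating counter, a simple and widely described direction-prediction scheme, applied to the BNEZ in the loop above. The names TwoBitCounter, counter_predict, counter_update, and loop_mispredictions are hypothetical, and the sketch assumes R1 starts out positive. It is not meant to represent what any specific CPU does.

```c
/* 2-bit saturating counter: states 0 and 1 predict "not taken",
   states 2 and 3 predict "taken". */
typedef struct {
    unsigned state;
} TwoBitCounter;

/* Yes/no answer: will the branch be taken? */
int counter_predict(const TwoBitCounter *c) {
    return c->state >= 2;
}

/* Move one step toward the actual outcome, saturating at 0 and 3. */
void counter_update(TwoBitCounter *c, int taken) {
    if (taken) {
        if (c->state < 3) c->state++;
    } else {
        if (c->state > 0) c->state--;
    }
}

/* Run the example loop with R1 starting at initial_r1 and return how many
   times the BNEZ was mispredicted. Assumes initial_r1 > 0; returns 0 otherwise. */
int loop_mispredictions(TwoBitCounter *c, int initial_r1) {
    if (initial_r1 <= 0) {
        return 0;
    }
    int r1 = initial_r1;
    int wrong = 0;
    for (;;) {
        r1 -= 1;                    /* ADDI R1, R1, #-1 */
        int taken = (r1 != 0);      /* BNEZ R1, loop    */
        if (counter_predict(c) != taken) {
            wrong++;
        }
        counter_update(c, taken);
        if (!taken) {
            break;                  /* fall through to SUBI */
        }
    }
    return wrong;
}
```

Starting from state 0 with R1 = 10, the counter mispredicts the first two taken branches while it warms up, plus the final not-taken one. If the same loop runs again with the counter left as it was, only the loop exit is mispredicted.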
On the other hand, a Branch Target Predictor will take the results of the Branch Predictor, and give the address that the program is going to jump to.
Using the example above, once the branch predictor has said whether the BNEZ instruction is going to take the branch, the branch target predictor gives the address of the next instruction to execute after the BNEZ instruction. If the branch is predicted taken, this is the ADDI instruction at the label loop; if it is predicted not taken, it is the SUBI instruction after the branch instruction.
The Branch Predictor predicts the result of a comparison. The Branch Target Predictor gives where the program is going because of a branch.
Direct branches (and direct jumps, for that matter) are typically Program Counter (PC) relative. For these, the Branch Target Predictor's answer corresponds to taking the offset (given by the branch instruction) and adding it to the current program counter. This gives the address of the instruction to execute if the branch is taken.
Branch Predictors give a Yes/No answer to the question “Am I going to branch?”. The Branch Target Predictor needs that Yes/No answer to determine the answer to “WHERE am I going after the branch?”.
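Here is a small sketch in C of how the two answers combine into a single next fetch address for a PC-relative conditional branch. The function names and the 4-byte instruction size are assumptions made for illustration. The offset is taken relative to the branch's own address here; some ISAs measure it from the following instruction instead.

```c
#include <stdint.h>

/* Target of a PC-relative branch: the branch's address plus a signed offset.
   (Assumption: the offset is relative to the branch instruction itself.) */
uint64_t pc_relative_target(uint64_t branch_pc, int64_t offset) {
    return branch_pc + (uint64_t)offset;
}

/* Combine "will I branch?" with "where to?" into the next address to fetch. */
uint64_t next_fetch_address(uint64_t branch_pc, int predicted_taken,
                            uint64_t predicted_target, uint64_t instr_size) {
    if (predicted_taken) {
        return predicted_target;        /* e.g. back to the label loop */
    }
    return branch_pc + instr_size;      /* fall through to the next instruction */
}
```

For example, if ADDI sits at 0x1000, BNEZ at 0x1004, and SUBI at 0x1008, then pc_relative_target(0x1004, -4) is 0x1000. A "taken" prediction fetches from 0x1000 next, and a "not taken" prediction fetches from 0x1008.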
1