Given branch prediction, and also the effect of compiler optimizations, which code tends to offer superior performance?
Note that bRareExceptionPresent represents an uncommon condition. It is not the normal path of logic.
/* MOST COMMON path must branch around IF clause */
bool SomeFunction(bool bRareExceptionPresent)
{
    // abort before function
    if (bRareExceptionPresent)
    {
        return false;
    }
    /* .. function primary body .. */
    return true;
}
/* MOST COMMON path does NOT branch */
bool SomeFunction(bool bRareExceptionPresent)
{
    if (!bRareExceptionPresent)
    {
        /* .. function primary body .. */
    }
    else
    {
        return false;
    }
    return true;
}
In today’s world, it doesn’t matter much, if at all.
Dynamic branch prediction, something that has been studied for decades (see An Analysis of Dynamic Branch Prediction Schemes on System Workloads, published in 1996), is fairly commonplace.
An example of this can be found in the ARM processor. From the Arm Info Center on Branch Prediction:
To improve the branch prediction accuracy, a combination of static and dynamic techniques is employed.
The question then is “what is dynamic branch prediction in the ARM processor?” Continued reading under Dynamic branch prediction shows that it uses a 2-bit prediction scheme (described in the paper) that builds up information about whether the branch is strongly or weakly taken, or strongly or weakly not taken.
Over time (and by time I mean a few passes through that block) this builds up information as to which way the code will go.
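To make the 2-bit scheme concrete, here is a minimal sketch of a saturating counter in C. The names (predictor_state, predict_taken, predictor_update) are made up for illustration, and this is only the textbook form of the idea; it does not model how the ARM hardware indexes its prediction table or which state a new entry starts in.

```c
#include <stdbool.h>

/* The four states of a textbook 2-bit saturating counter (illustrative names). */
enum predictor_state {
    STRONGLY_NOT_TAKEN = 0,
    WEAKLY_NOT_TAKEN   = 1,
    WEAKLY_TAKEN       = 2,
    STRONGLY_TAKEN     = 3
};

/* Predict "taken" when the counter is in one of the two upper states. */
bool predict_taken(enum predictor_state s)
{
    return s >= WEAKLY_TAKEN;
}

/* Move one step towards the observed outcome, saturating at both ends. */
enum predictor_state predictor_update(enum predictor_state s, bool taken)
{
    if (taken)
        return s == STRONGLY_TAKEN ? s : (enum predictor_state)(s + 1);
    return s == STRONGLY_NOT_TAKEN ? s : (enum predictor_state)(s - 1);
}
```

Starting from the worst case for the guard clause (STRONGLY_TAKEN), two not-taken outcomes are enough to flip the prediction, which is where the "few passes through that block" comes from.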
Static prediction, by contrast, looks only at the code itself and at the direction of the conditional branch, that is, whether it jumps back to a previous instruction or forward to one further on in the code:
The scheme used in the ARM1136JF-S processor predicts that all forward conditional branches are not taken and all backward branches are taken. Around 65% of all branches are preceded by enough non-branch cycles to be completely predicted.
As mentioned by Sparky, this is based on the understanding that loops, more often than not, loop. A loop branches backwards (it has a branch at the end of the loop to restart it at the top), and that branch is normally taken.
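The static rule quoted above boils down to a single address comparison. As a rough sketch (static_predict_taken and its parameters are hypothetical names, and real hardware works from the sign of the branch offset rather than from two full addresses):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * "Backward taken, forward not taken": a conditional branch whose target
 * lies at or before the branch itself is assumed to be a loop branch and
 * is predicted taken; a forward branch is predicted not taken.
 */
bool static_predict_taken(uintptr_t branch_addr, uintptr_t target_addr)
{
    return target_addr <= branch_addr;
}
```

Under this rule, the forward branch that skips a guard clause is predicted not taken the first time it is seen, so the rare early return is assumed not to happen.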
The danger of trying to second-guess the compiler is that you don’t know how that code is actually going to be compiled (and optimized). And for the most part, it doesn’t matter. With dynamic prediction, after two passes through the function the processor will predict a skip over the guard statement for the premature return. If the cost of two flushed pipelines is critical to performance, there are other things to worry about.
The time it takes to read one style over the other is likely of greater importance. Make the code clean so that a human can read it, because the compiler is going to do just fine no matter how messy or idealized the code you write is.
My understanding is that the first time the CPU encounters a branch, it will predict (if supported) that forward branches are not taken and backwards branches are. The rationale for this is that loops (which typically branch backwards) usually repeat, so their backward branch is assumed to be taken.
On some processors, you can give a hint in the assembly instruction as to which path is the more likely. Details of this escape me at the moment.
Additionally, some C compilers also support static branch prediction hints, so that you can tell the compiler which branch is more likely. In turn, it may reorganize the generated code, or use modified instructions to take advantage of this information (or even just flat out ignore it).
__builtin_expect((long)!!(x), 1L) /* GNU C to indicate that <x> will likely be TRUE */
__builtin_expect((long)!!(x), 0L) /* GNU C to indicate that <x> will likely be FALSE */
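Applied to the guard clause from the question, the hint might look something like the sketch below. UNLIKELY is a hypothetical wrapper macro, not something the compiler provides, and the fallback branch is an assumption for compilers that do not define __GNUC__. Whether the hint changes the generated code at all depends on the compiler and optimization level.

```c
#include <stdbool.h>

/* Hypothetical wrapper: use the GNU C hint when available, otherwise a no-op. */
#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect((long)!!(x), 0L)
#else
#define UNLIKELY(x) (!!(x))
#endif

bool SomeFunction(bool bRareExceptionPresent)
{
    /* Tell the compiler the early return is the rare path. */
    if (UNLIKELY(bRareExceptionPresent))
    {
        return false;
    }
    /* .. function primary body .. */
    return true;
}
```

The hint does not change what the function returns; it only gives the compiler a reason to lay out the common path as the fall-through.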
Hope this helps.