The macros often known as likely and unlikely help the compiler know whether an if is usually going to be entered or skipped. Using them results in some (rather minor) performance improvements.
I started using them recently, and I'm not sure how often such hints should be used. I currently use them in error-checking ifs, which are usually marked as unlikely. For example:
mem = malloc(size);
if (unlikely(mem == NULL))
    goto exit_no_mem;
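For reference, these macros are conventionally defined as thin wrappers around GCC's __builtin_expect() builtin; the definition below is a common one (the names themselves are just a convention, not part of the compiler):

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)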
It seems OK, but error-checking ifs happen quite often, and consequently so does the use of these macros.
My question is: is it too much to have likely and unlikely on every error-checking if?
While we're at it, where else are they commonly used?
In my current usage, it's in a library that abstracts away the real-time subsystem, so that programs become portable between RTAI, QNX and others. That said, most of the functions are rather small and directly call one or two other functions; many are even static inline functions.
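To give a rough idea of the kind of wrapper involved (a hypothetical sketch: the struct and the names rtos_sem_wait and abstract_sem_wait are made up for illustration, and the unlikely macro above is assumed):

struct abstract_sem { void *native; };

extern int rtos_sem_wait(void *native_sem);   /* stand-in for the underlying RTOS call */

static inline int abstract_sem_wait(struct abstract_sem *sem)
{
    int err = rtos_sem_wait(sem->native);     /* direct call into the RTOS */
    if (unlikely(err != 0))                   /* error path, rarely taken */
        return -1;
    return 0;
}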
So, first of all, it's not an application I can profile. It doesn't make sense to "identify bottlenecks", since it's a library, not a standalone application.
Second, it's more a case of "I know this is unlikely, so I might as well tell the compiler". I don't actively try to optimize the if.
Do you need performance so badly that you're willing to pollute your code with this? It's a minor optimization.
- Does the code run in a tight loop?
- Does your application have performance problems?
- Have you profiled your application and determined that this particular loop costs a lot of CPU time?
Unless you can answer yes to all the above, don't bother with stuff like this.
Edit: in response to the edit in the question. Even when you can't profile, you can usually estimate the hotspots. A memory allocation function that is called by everyone is a good candidate, especially since a single use of the macro there covers the whole library.
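For instance, one annotated check in a central allocation wrapper covers every caller that goes through it (a hypothetical sketch, assuming the unlikely macro from the question; lib_alloc and the failure counter are made-up names):

#include <stdlib.h>

static unsigned long lib_alloc_failures;      /* hypothetical bookkeeping */

static inline void *lib_alloc(size_t size)
{
    void *mem = malloc(size);
    if (unlikely(mem == NULL))                /* one hint, shared by all callers */
        lib_alloc_failures++;                 /* centralized failure accounting */
    return mem;
}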
If you're writing for x86/x64 (and are not using 20-year-old CPUs), the performance gain from using __builtin_expect() will be negligible, if any. The reason is that modern x86/x64 CPUs (not 100% sure about Atom, though) have dynamic branch prediction, so essentially the CPU "learns" which way a branch usually goes. Sure, this information can be stored only for a limited number of branches; however, there are only two cases possible. If (a) it is a "frequently used" branch, your program will benefit from that dynamic branch prediction, and if (b) it is a "rare" branch, you won't really see any realistic performance hit from mispredictions of such rare branches (20 CPU cycles of branch misprediction are not TOO bad if it happens once in a blue moon).
NB: this does NOT imply that the importance of branch misprediction on modern x86/x64 got lower: any branch with a 50-50 chance of being taken will still incur a penalty (IIRC 10-20 CPU cycles), so in inner loops branches may still need to be avoided. It is only the importance of __builtin_expect() on x86/x64 which has diminished (IIRC, around 10-15 years ago or so), mostly because of dynamic branch prediction.
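To illustrate that last point with a generic sketch (not tied to the question's code): an unpredictable 50-50 branch in an inner loop can often be replaced with branch-free arithmetic, which compilers typically lower to a conditional move or select:

#include <stddef.h>

/* Branchy version: mispredicts roughly half the time on random data. */
long sum_positive_branchy(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] > 0)
            sum += a[i];
    return sum;
}

/* Branch-free version: the condition becomes a data dependency instead. */
long sum_positive_branchless(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (a[i] > 0) ? a[i] : 0;   /* usually compiled without a branch */
    return sum;
}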
NB2: for other platforms beyond x86/x64, YMMV.