Some JVMs compile Java bytecode into native machine code, and we know there are many optimizations they can apply when doing so. I recently learned that a branch can stall the CPU and hurt performance significantly when the CPU mispredicts it.
Does anyone know whether any JVM uses the runtime statistics it collects to generate machine code that is easier for the CPU to predict correctly?
1
No, HotSpot does not add hints to the hardware branch predictor, as stated on the OpenJDK mailing list:
It’s been considered, and decided against. The platforms which openjdk
currently targets all have decent to spectacular hardware branch
prediction. Those that didn’t, such as Niagara 1, ignored prediction
bits. Conclusion is that it’s not worth complicating the code with, as
David says, ‘magic macros’.
3
My guess is that prediction hints at the machine-instruction level are at best noise and at worst a detriment (wasted instruction bytes) on modern out-of-order, speculatively executing architectures. Adding them would be like telling the CPU to dumb down: to stop doing the intelligent things it is already designed to do.
Secondly, the degree to which branch prediction can be improved depends on the cause of the misprediction, and on how easily one can measure its performance effect or observe the tendency of the branch.
However, I think that the existing bag of JIT optimization tricks can already improve branch prediction to a certain extent, even without the help of CPU branch misprediction counters.
Just a very simple code example:
    public void repeatHistory(int value)
    {
        if (value == 1492)
        {
            landing();
        }
        else if (value == 1776)
        {
            ratifying();
        }
    }
Suppose that repeatHistory() is called many times. When the sampling-based performance monitor analyzes the call-stack statistics, it may find that, for whatever reason, repeatHistory() calls ratifying() more often than it calls landing(). Based on this observation, the next pass of JIT code generation for repeatHistory() will take this into account and perform one or more optimizations:
- Move the check for (value == 1776) ahead of the check for (value == 1492).
- Attempt to inline the ratifying() method into the branch in repeatHistory().
- If repeatHistory() is called from a loop, try to unroll the loop, or inline repeatHistory() into that loop.
- And many others.
After applying one optimization, it is often necessary to analyze again to see if more optimizations can be applied, because a successful one will often open the door to more opportunities.
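To make the first two optimizations in the list more concrete, here is a rough source-level picture of what the result might look like if you wrote it by hand. This is only a sketch: a real JIT works on bytecode or an internal representation, not on Java source, and may lay out the code quite differently. The class name BranchOrderSketch, the method repeatHistoryHot1776(), and the two counters are made up for illustration; the counters simply stand in for whatever landing() and ratifying() really do.

```java
class BranchOrderSketch {
    // Hypothetical counters standing in for the real work of each method.
    static int landings = 0;
    static int ratifications = 0;

    static void landing() { landings++; }

    static void ratifying() { ratifications++; }

    // The original method: the 1492 check comes first.
    static void repeatHistory(int value) {
        if (value == 1492) {
            landing();
        } else if (value == 1776) {
            ratifying();
        }
    }

    // Hand-written approximation of a profile-guided version, assuming
    // the profile showed 1776 to be the common case:
    //   1. the hot check (1776) is tested first;
    //   2. the body of ratifying() is inlined into the branch.
    static void repeatHistoryHot1776(int value) {
        if (value == 1776) {
            ratifications++; // inlined body of ratifying()
        } else if (value == 1492) {
            landing();
        }
    }

    public static void main(String[] args) {
        int[] inputs = {1776, 1776, 1492, 1776, 2024, 1776};

        for (int v : inputs) {
            repeatHistory(v);
        }
        System.out.println("original:  landings=" + landings
                + " ratifications=" + ratifications);

        landings = 0;
        ratifications = 0;
        for (int v : inputs) {
            repeatHistoryHot1776(v);
        }
        System.out.println("reordered: landings=" + landings
                + " ratifications=" + ratifications);
    }
}
```

Both versions do the same work for every input; only the order of the comparisons and the call overhead differ. The reordering is safe here because the two conditions are mutually exclusive and have no side effects, which is the kind of fact a compiler has to establish before it can make such a change.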