This is regarding compilers.
Do compilers perform run-time dependency checks to decide whether to vectorize a loop? In other words, does the compiler trace through the logic as it would at run time to determine whether a loop can be vectorized?
Put differently: when the compiler compiles code with auto-vectorization enabled (the default) and the output is vectorized code targeting, say, AVX instructions, at what point did it perform the dependency checks?
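A minimal C loop of the kind the question is about. Whether dst and src overlap is unknown at compile time, so a cross-iteration dependency can only be ruled out at run time (function and parameter names here are just for illustration):

```c
#include <stddef.h>

/* With -O3, GCC and Clang can auto-vectorize this loop. Because dst
   and src might alias, the compiler typically emits a run-time overlap
   check and keeps a scalar fallback path for the case where they do
   overlap ("loop versioning"). Declaring the pointers `restrict`
   instead lets the compiler skip that run-time check entirely. */
void add_arrays(float *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```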
For a compiler to do run-time checks (to find warnings and errors that cannot be found at compile time, including checks for loop vectorization), it starts to get dangerously close to the halting problem.
Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist. A key part of the proof was a mathematical definition of a computer and program, which became known as a Turing machine; the halting problem is undecidable over Turing machines. It was one of the first decision problems proved to be unsolvable.
Consider the following Java program:
public class Halting {
    public static void main(String... args) {
        while (isCondition()) {
        }
    }

    private static boolean isCondition() {
        return true;
    }
}
The compiler reports no errors or warnings for this program. It would have to trace through the logic to see that the loop is infinite. And what’s to stop the theoretical run-time checker from getting stuck in an infinite loop the very same way? The compiler can’t afford to get stuck in an infinite loop itself, and no analysis is smart enough to avoid that in general.
Notice, on the other hand, that this does have an error:
public class Halting {
    public static void main(String... args) {
        String foo = null;
        System.out.println(foo.charAt(0));
    }
}
This yields the warning:
Null pointer access: The variable foo can only be null at this location
Such checks are the best the compiler can do. But notice that the following version produces no warning, even though it has exactly the same problem:
public class Halting {
    public static void main(String... args) {
        String foo = getFoo();
        System.out.println(foo.charAt(0));
    }

    private static String getFoo() {
        return null;
    }
}
I don’t think any do, but it should be possible. For example, GCC 4.8 (scroll down to the IA-32/x86-64 section) has both a __builtin_cpu_supports built-in function and function multiversioning support, which a developer can use to write different code for different architectures. I don’t see why a compiler couldn’t use these same mechanisms to generate different code paths automatically and select between them at run time.
For example, manually (GCC implements this target-attribute form of function multiversioning for C++):
#include <assert.h>

__attribute__ ((target ("default")))
int foo (void)
{
    // Fallback version for any CPU.
    return 1;
}

__attribute__ ((target ("sse4.2")))
int foo (void)
{
    // Version selected on SSE4.2-capable CPUs.
    return 2;
}

int main (void)
{
    int (*p)(void) = &foo;
    // Both the direct call and the call through the pointer dispatch
    // to the same version at run time.
    assert ((*p)() == foo ());
    return 0;
}