Besides a faster register allocation algorithm and some trade-offs in control- and data-flow analysis for optimization purposes, which components/phases of a JIT compiler differ from those of a traditional ahead-of-time compiler?
The main differences between an AOT and a JIT compiler are resources and information.
An AOT compiler has effectively infinite resources. It can use as much RAM as it likes, and take as much time as it wants. (Note that this is only theoretically true. Pragmatically, people don't like long compile times. Plus, compilers are now typically embedded into IDEs, where they provide instant feedback as you type, so at least lexing, parsing, semantic analysis, type inference, type checking, macro expansion, etc., basically everything except the actual code generation and optimization, has to happen very fast and with low memory usage.)
A JIT compiler OTOH has to "steal" its resources from the running application. (Again, theoretically. Pragmatically, the JIT compiler has to work hardest when a lot of new code is introduced into the system, which is typically when the application starts. At that point, loading configuration files and setting up object graphs is the bottleneck, not the JIT compiler.)
A JIT compiler has much more information available than an AOT compiler, and it doesn't have to work hard to get it. An AOT compiler can only get static information about the code. Static analysis algorithms are usually very expensive (often at least O(n²) in time or space, sometimes exponential), and they don't even work reliably, because many of them are equivalent to solving the Halting Problem (Class Hierarchy Analysis, Escape Analysis, and Dead Code Elimination, for example).
A JIT compiler OTOH doesn't run into the Halting Problem, because it doesn't do static analysis. And it doesn't have to run expensive algorithms: want to know whether a method is being overridden or not so that you can potentially inline it? You don't need to run Class Hierarchy Analysis; just look at the classes, they're all there. Or better yet: don't even bother, just inline it anyway, and if it turns out you were wrong about it not being overridden, un-inline it again. Want to know whether a reference escapes a local scope or not so that you can potentially allocate it on the stack? Don't bother: just allocate it on the stack, tag it, and when the tag shows up somewhere else, re-allocate it on the heap. And since you only compile code when it is running, Dead Code Elimination is totally trivial, because dead code will never run and thus never be compiled.
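The "inline it anyway and un-inline if wrong" idea can be sketched in plain Python. This is a toy model, not any real JIT's machinery: the cache speculates that the receiver is always the same class, takes a guarded fast path when the speculation holds, and "de-optimizes" (falls back to a normal lookup and re-speculates) the moment it breaks. All class and function names here are invented for illustration.

```python
# Toy speculative dispatch with a guard, mimicking how a JIT inlines a
# method under the assumption "this call site only ever sees one class".

class Animal:
    def speak(self):
        return "..."

class Dog(Animal):
    def speak(self):
        return "woof"

class Cat(Animal):
    def speak(self):
        return "meow"

def make_speculative_call():
    cache = {"cls": None, "method": None}

    def call_speak(obj):
        if type(obj) is cache["cls"]:
            # Guard passed: the speculated class matched, so use the
            # cached method directly (the "inlined" fast path).
            return cache["method"](obj)
        # Guard failed: "de-optimize" by doing a normal lookup, then
        # re-speculate on the class we just saw.
        cache["cls"] = type(obj)
        cache["method"] = type(obj).speak
        return cache["method"](obj)

    return call_speak

call_speak = make_speculative_call()
print(call_speak(Dog()))  # guard miss, re-speculates on Dog: "woof"
print(call_speak(Dog()))  # guard hit, fast path: "woof"
print(call_speak(Cat()))  # guard miss, falls back and re-speculates: "meow"
```

A real JIT makes the guard nearly free (a single pointer comparison against the cached class) and the fast path a direct, inlined call; the shape of the logic is the same.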
So, the basic difference is that the analysis in a JIT compiler can be simpler, because it has a lot of information available that an AOT compiler doesn't have, and the optimization and code generation must be simpler, because it has far fewer resources available. Note, however, that a JIT compiler can nonetheless perform much more aggressive optimizations than an AOT compiler can, because it doesn't necessarily have to prove the optimizations correct. If an optimization turns out to be wrong, it can always de-optimize again. (Not all JITs do this; the CLR JIT, for example, is incapable of de-optimizing, but the HotSpot JIT in the Oracle JDK does.) Speculative Inlining is one such optimization that is only possible in a de-optimizing JIT.
One thing that is peculiar about JITs is that very often the languages they compile are designed to be easily compilable by a machine (e.g. JVM bytecode, CPython bytecode, Rubinius bytecode, LLVM IR, CLI CIL, Dalvik bytecode), whereas the languages that a typical AOT compiler compiles are designed to be easily readable by humans (e.g. Ruby, Python, Java). But I understood your question to be about an AOT vs. a JIT for the same language, so none of this applies. Obviously, if you compare compilers for different languages, there will be a lot of differences, and many of those will be totally unrelated to the difference between JIT and AOT and more related to the differences between the two languages.
Since that is a very broad question, I’ll stick to conceptual differences rather than implementation details:
- A JIT compiler can theoretically perform more platform-specific optimizations, since it doesn't have to worry about producing a binary that's compatible with as many machines as possible.
- A JIT compiler has access to some information only available at runtime, which means it can do things like produce an optimized integer-only version of your function because it has only seen integer arguments so far (this makes it a good fit for dynamic languages), or try to use less memory because your system doesn't have much left.
- An AOT compiler never needs to be installed on your users' machines. A JIT compiler (and the runtime or VM it usually comes with) needs to be there every time they run a program that uses it.
- An AOT compiler typically looks at your entire program, or at least very large portions of it, allowing it to do things like check for type errors across modules (which is valuable in static languages that rigidly define the types each function accepts) and remove code it knows will never be called.
- Because AOT compilation only happens once, the executable's performance is more predictable and repeatable, even in situations where a JIT may perform better on average.
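The second bullet, the "integer-only version of your function" idea, can be sketched as a guarded fast path in Python. This is a hand-rolled illustration, not how any particular JIT implements it; both function names are invented, and the `sum` call stands in for what a real JIT would emit as a tight machine-code integer loop.

```python
# Toy runtime type specialization: while every argument seen so far has
# been an int, route calls through an int-only version, with a guard
# that falls back to generic code on anything else.

def generic_sum(values):
    total = values[0]
    for v in values[1:]:
        total = total + v           # works for ints, floats, strings, ...
    return total

def int_only_sum(values):
    return sum(values)              # stand-in for a tight int-only loop

def make_specialized():
    state = {"int_only": True}

    def smart_sum(values):
        if state["int_only"] and all(type(v) is int for v in values):
            return int_only_sum(values)   # guarded, specialized fast path
        state["int_only"] = False         # saw a non-int: stop speculating
        return generic_sum(values)

    return smart_sum

smart_sum = make_specialized()
print(smart_sum([1, 2, 3]))    # int fast path: 6
print(smart_sum(["a", "b"]))   # guard fails, generic path: "ab"
```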
Most JIT compilers take an intermediate or bytecode language as their input. For that reason, the structure of a JIT compiler is closer to that of a traditional assembler than to that of an AOT compiler.
Lexical analysis and parsing are much simpler, because the input language is much simpler than your typical high-level programming language.
Semantic analysis is often absent entirely, trusting that the bytecode doesn't try to do the impossible (such as mixing integer and floating-point operations without the required conversions).
What AOT and JIT compilers have in common are the optimisations and code generation, although a JIT compiler can't perform lengthy optimisations, but can easily do optimisations that require information about how the code is actually being used.