This isn’t meant to be a subjective question or a request for advice on the best path to take. I’m looking for an objective list of things I must know before I can pick up a book on compiler theory and understand it.
What level of mathematics and related skills is required?
The key question really is: at what level? I’m going to assume a fairly introductory one if you’re asking this question. I have only done one course on compilers, so perhaps my knowledge is too elementary, but here’s what is useful to know:
- Basic 32-bit assembler, including how the stack works, the usage of `ebp`, and the different kinds of `jmp`s
- Good understanding of data structures: trees, linked lists, and some of the common algorithms associated with them (binary search, for example)
- Basic understanding of complexity theory, mainly the big O notation
- Knowledge of basic regular expressions
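To give an idea of why regular expressions are on that list: the first stage of most compilers splits source text into tokens, and each token kind can be described by a regex. Here is a minimal sketch in Python. The token names and the toy language are my own assumptions, not from any particular textbook.

```python
import re

# Hypothetical token kinds for a toy expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]

# One combined pattern; each alternative is a named group.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("age = 4 + 5")` yields an identifier, an operator, a number, an operator, and a number, which is the kind of stream a parser then consumes.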
To be perfectly honest, if you want to be able to pick up any book on compiler theory and read it, you should first have read an introduction to compilers and preferably implemented something basic yourself (so you get a good idea of how it works: parsing source code, generating parse trees, and code generation). After that, you can look at optimisations (e.g. `age = 4 + 5` would be compiled directly as `age = 9`: the `4+5` computation would be done once at compile time and never again).
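That optimisation is usually called constant folding. A rough sketch of how it might look on a tiny parse tree, in Python; the node classes (`Num`, `Var`, `BinOp`, `Assign`) are hypothetical and only exist for this illustration:

```python
from dataclasses import dataclass
import operator


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class Assign:
    target: str
    expr: object


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Return a copy of the tree with constant subexpressions evaluated."""
    if isinstance(node, BinOp):
        # Fold the children first, so nested constants collapse bottom-up.
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Num) and isinstance(right, Num):
            return Num(OPS[node.op](left.value, right.value))
        return BinOp(node.op, left, right)
    if isinstance(node, Assign):
        return Assign(node.target, fold(node.expr))
    return node  # Num and Var are already as simple as they get
```

Here `fold(Assign("age", BinOp("+", Num(4), Num(5))))` gives back `Assign("age", Num(9))`, while anything involving a `Var` is left for run time.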
Use different compilers and understand what a compiler can do, as a black box, before you venture into writing one. Try different compilers from the command line: see what options they have and what their effects are, what they have in common, etc.
After that, you need to know a few things:
- Basic arithmetic
- Data structures
- Graph theory
- Computer architecture
- Formal languages (might or might not be covered in detail by the compiler book)
- Some advanced math might be required if you venture into code optimization
Compilers are much, much simpler than they’re usually perceived to be. You do not need any advanced knowledge. Understanding some basics of graph theory might help (but it is certainly not a prerequisite). Understanding term rewriting systems would also be beneficial. But the best approach is to just start learning compiler theory and pick up what is missing along the way.
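For a feel of what term rewriting means in this context, here is a small sketch in Python: expressions are nested tuples, and a handful of algebraic rules are applied until nothing changes. The tuple encoding and the specific rules are assumptions chosen for illustration; real rewriting systems are more general.

```python
def rewrite_once(term):
    """Apply the first matching rule at the root of term, or return it unchanged."""
    if isinstance(term, tuple):
        op, a, b = term
        if op == "+" and b == 0:
            return a  # x + 0 -> x
        if op == "+" and a == 0:
            return b  # 0 + x -> x
        if op == "*" and b == 1:
            return a  # x * 1 -> x
        if op == "*" and a == 1:
            return b  # 1 * x -> x
        if op == "*" and (a == 0 or b == 0):
            return 0  # x * 0 -> 0 (valid for integers)
    return term


def normalize(term):
    """Rewrite subterms first, then the root, until no rule applies."""
    if isinstance(term, tuple):
        op, a, b = term
        term = (op, normalize(a), normalize(b))
    reduced = rewrite_once(term)
    return reduced if reduced == term else normalize(reduced)
```

With these rules, `normalize(("+", ("*", "x", 1), 0))` reduces to `"x"`. Many compiler passes (simplification, lowering, instruction selection) can be viewed as rule sets of this shape.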
Many things that are discussed in detail in the popular compiler textbooks are totally irrelevant to real-world practice and even to bleeding-edge compiler research. Chances are, you’ll never need to touch formal languages (alas, parsing is overrated). There is no need to understand computer architecture in detail (although it might help). No complicated data structures are required beyond very simple trees and DAGs.
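To show how simple such a DAG can be, here is a sketch in Python in which repeated subexpressions share one node instead of being duplicated as they would be in a tree. The `DagBuilder` class and its method names are hypothetical, made up for this example.

```python
class DagBuilder:
    """Builds expression nodes, reusing an existing node for any repeated subexpression."""

    def __init__(self):
        self.nodes = []  # node id -> ("leaf", name) or (operator, left id, right id)
        self.index = {}  # same key -> node id, used to detect repeats

    def _intern(self, key):
        # Create a node only the first time this exact key is seen.
        if key not in self.index:
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

    def leaf(self, name):
        return self._intern(("leaf", name))

    def op(self, operator, left, right):
        return self._intern((operator, left, right))


# (a + b) * (a + b): the sum is built twice but stored once.
builder = DagBuilder()
a = builder.leaf("a")
b = builder.leaf("b")
first_sum = builder.op("+", a, b)
second_sum = builder.op("+", a, b)
root = builder.op("*", first_sum, second_sum)
```

Both sums end up with the same node id, so the whole expression takes four nodes rather than the seven a plain tree would use.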