Optimizing code generation for expressions in a compiler

I don’t know if this question has a simple answer or not, but I just have to ask.

I’m reading this tutorial (I don’t know if this is a well-known series, but it’s name is ‘Let’s build a compiler!’ by Jack Crenshaw). It’s a nice series where you gradually learn how to construct a compiler (it is a learn-by-doing thing, really).

In the second chapter already, I felt the need to optimize some things. In the series, some concessions are made, in order to keep things simple: there is no difference between lexical scanner (although I believe it is introduced later in the series) and the code generation: they are done in one go. Also, the author heavily relies on the stack to create the code to handle expressions.
(Also, the output is assembly code, I don’t know if this is normal for compilers, so I just thought I’d mention it…)

For example, parsing the expression 5 + (7 * 3 – 21 / 3) (and storing the result in eax) results roughly in the following code (I use x86 asm, as opposed to the series, so this is the assembly code my compiler gave):

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
<code> push 5 ; These lines is generated by a simple number parser
push 7 ; This parser pushes its result to the stack
push 3 ; (like most other parsing functions)
pop eax ; Code for multiplication
pop ebx
mul ebx
push eax
push 21 ; Result of parsing these numbers
push 3
pop ebx ; Division code
pop eax
div ebx
push eax
pop ebx ; Subtraction code
pop eax
sub eax, ebx
push eax
pop eax ; Addition code
pop ebx
add eax, ebx
push eax
</code>
<code> push 5 ; These lines is generated by a simple number parser push 7 ; This parser pushes its result to the stack push 3 ; (like most other parsing functions) pop eax ; Code for multiplication pop ebx mul ebx push eax push 21 ; Result of parsing these numbers push 3 pop ebx ; Division code pop eax div ebx push eax pop ebx ; Subtraction code pop eax sub eax, ebx push eax pop eax ; Addition code pop ebx add eax, ebx push eax </code>
    push 5   ; These lines is generated by a simple number parser
    push 7   ; This parser pushes its result to the stack
    push 3   ; (like most other parsing functions)

    pop eax  ; Code for multiplication
    pop ebx
    mul ebx
    push eax

    push 21  ; Result of parsing these numbers
    push 3

    pop ebx  ;  Division code
    pop eax
    div ebx
    push eax

    pop ebx  ; Subtraction code
    pop eax
    sub eax, ebx
    push eax

    pop eax  ; Addition code
    pop ebx
    add eax, ebx
    push eax

As you can see, the code for handling the addition, subtraction, multiplication, etc. can stay the same each time the operation is used. This is very convenient, but of course, the assembly code generated is very slow and esoteric. When writing by hand, one would rather write something like:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
<code> mov eax, 21
div 3
mov ebx, eax
mov eax, 7
mul 3
sub eax, ebx
add 5
</code>
<code> mov eax, 21 div 3 mov ebx, eax mov eax, 7 mul 3 sub eax, ebx add 5 </code>
    mov eax, 21
    div 3
    mov ebx, eax
    mov eax, 7
    mul 3
    sub eax, ebx
    add 5

My question is if there is any way to optimize the compiler output. Of course there is, but is there any specific way to make my code more like the second, ie: not using the stack so intensively, but instead relying on registers to pass values.

I myself have been thinking about parsing the whole expression and representing it internally as a tree-like structure (I supposethis is the kind of thing a lexical parser does) before generating the code, but I’m not entirely sure how to proceed.
On the other hand, there is peephole optimization and the possibility to optimize the assembler code after it has been generated, but I have no idea how effective this will be. Generating better code in the first place seems logical and more feasible to me.

I know there are many techniques available to optimize my code (for example, constant folding, which would reduce the whole expression to 19. I have deliberately not used this kind of optimization to bring my point across better), I am hoping for a single technique that is well-suited for this kind of stuff.

3

Understand that Jack Crenshaw’s tutorial is about the basics of compiler design. It is not intended to take you all the way to production quality in one swell foop. There are other basic texts available: the various dragon books are well-known. I personally like Wirth’s text: it is almost as approachable as Crenshaw’s.

For production quality code generation, you have to do a lot more. The canonical reference here is Wulf’s “Design of an Optimizing Compiler”. Surprisingly, Amazon lists several used copies available at reasonable prices.

Also note that generating optimal, or even sort-of-optimized, code for the x86 architecture is a lot like breeding elephants: it takes lots of time, lots of noise, lots of effort, lots of bellowing, and there is a real risk of getting crushed underfoot.

2

“Let’s build a compiler” is a great series for teaching the principles of parsing, but compiler output is not its strong point. The way real compilers deal with this is to separate the parser from the code generator. The parser produces an Abstract Syntax Tree (example), a tree of in-memory objects that represent the meaning of the code. The AST is passed to a code generator, which produces output from it, but optimizations can be applied, either in between the creation of the AST and the code generation or during the code generation phase.

For example, your code generator could have one or more internal member variables called “current variable” that keeps track of the last variable(s) that the code operated on, and as long as the code continues to work on the same variables, it doesn’t need to write out the pops and pushes.

One example of optimization before code generation would be if the tree contains a node for an arithmetic operation, and both operands are constant values, (3 * 5) the compiler could perform the calculation at compile time and then replace the node with a constant value node (15) instead of writing out code to perform the calculation.

2

A common method when you have a very, very simple compiler is called “peephole optimisation”: You Look at tiny bits of code and see if you can replace them with something better.

Every time you see a push followed closely by a pop, you can change that to a mov. Even if there are instructions in between (you need to examine what exactly is allowed in between). No code is needed at all if you push and pop the same register. That changes your code to

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
<code>push 5
mox eax, 3 ; Code for multiplication
mov ebx, 7
mul ebx
push eax
mov ebx, 3
mov eax, 21
div ebx
mov ebx, eax
pop eax
sub eax, ebx
pop ebx
add eax, ebx
push eax
</code>
<code>push 5 mox eax, 3 ; Code for multiplication mov ebx, 7 mul ebx push eax mov ebx, 3 mov eax, 21 div ebx mov ebx, eax pop eax sub eax, ebx pop ebx add eax, ebx push eax </code>
push 5
mox eax, 3 ; Code for multiplication
mov ebx, 7
mul ebx
push eax
mov ebx, 3
mov eax, 21
div ebx
mov ebx, eax
pop eax
sub eax, ebx
pop ebx
add eax, ebx
push eax

Next, your mul and div instructions have constant arguments. If both arguments are constant you replace the operation with a mov. If one is a constant you can optimise x/1, x/(power of two) becomes a shift, x * 0 is 0, x * 1 is x, x * 2 adds x+x, x * (power of two) becomes a shift. You cannot replace the instruction setting the constant yet because it could be used by other code. Now the code is

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
<code>push 5
mox eax, 3 ; Code for multiplication
mov ebx, 7
mov eax, 21
push eax
mov ebx, 3
mov eax, 21
mov eax, 7
mov ebx, eax
pop eax
sub eax, ebx
pop ebx
add eax, ebx
push eax
</code>
<code>push 5 mox eax, 3 ; Code for multiplication mov ebx, 7 mov eax, 21 push eax mov ebx, 3 mov eax, 21 mov eax, 7 mov ebx, eax pop eax sub eax, ebx pop ebx add eax, ebx push eax </code>
push 5
mox eax, 3 ; Code for multiplication
mov ebx, 7
mov eax, 21
push eax
mov ebx, 3
mov eax, 21
mov eax, 7
mov ebx, eax
pop eax
sub eax, ebx
pop ebx
add eax, ebx
push eax

Next you remove moves to registers that are overwritten, making it

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
<code>push 5
mov eax, 21
push eax
mov eax, 7
mov ebx, eax
pop eax
sub eax, ebx
pop ebx
add eax, ebx
push eax
</code>
<code>push 5 mov eax, 21 push eax mov eax, 7 mov ebx, eax pop eax sub eax, ebx pop ebx add eax, ebx push eax </code>
push 5
mov eax, 21
push eax
mov eax, 7
mov ebx, eax
pop eax
sub eax, ebx
pop ebx
add eax, ebx
push eax

If a register is set to x and then moved to another register, then we can do that mov directly. And throw unused movs away again. And optimise push/pop.

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
<code>mov ebx, 7
mov eax, 21
sub eax, ebx
mov ebx, 5
add eax, ebx
push eax
</code>
<code>mov ebx, 7 mov eax, 21 sub eax, ebx mov ebx, 5 add eax, ebx push eax </code>
mov ebx, 7
mov eax, 21
sub eax, ebx
mov ebx, 5
add eax, ebx
push eax

Sub/add can be optimised like mul/div for constants, so you get

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
<code>mov ebx, 5
mov eax, 14
push 14
</code>
<code>mov ebx, 5 mov eax, 14 push 14 </code>
mov ebx, 5
mov eax, 14
push 14

The mov to ebx and eax cannot be removed yet because they might be used later.

PS Mentioned elsewhere is “instruction selection”. You first ask yourself: If I had to write assembler code for this, what code would I write? And then you implement producing that code. As an example, x86 processors have an instruction that takes one register times 1,2,4 or 8, plus optionally another register, plus a constant. Now if you want to multiply x by 81, you can use two of these instructions because one can multiply by 9. If you needed 21x and 38x that’s actually an interesting problem if a plain multiplication is slow.

Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa Dịch vụ tổ chức sự kiện 5 sao Thông tin về chúng tôi Dịch vụ sinh nhật bé trai Dịch vụ sinh nhật bé gái Sự kiện trọn gói Các tiết mục giải trí Dịch vụ bổ trợ Tiệc cưới sang trọng Dịch vụ khai trương Tư vấn tổ chức sự kiện Hình ảnh sự kiện Cập nhật tin tức Liên hệ ngay Thuê chú hề chuyên nghiệp Tiệc tất niên cho công ty Trang trí tiệc cuối năm Tiệc tất niên độc đáo Sinh nhật bé Hải Đăng Sinh nhật đáng yêu bé Khánh Vân Sinh nhật sang trọng Bích Ngân Tiệc sinh nhật bé Thanh Trang Dịch vụ ông già Noel Xiếc thú vui nhộn Biểu diễn xiếc quay đĩa Dịch vụ tổ chức tiệc uy tín Khám phá dịch vụ của chúng tôi Tiệc sinh nhật cho bé trai Trang trí tiệc cho bé gái Gói sự kiện chuyên nghiệp Chương trình giải trí hấp dẫn Dịch vụ hỗ trợ sự kiện Trang trí tiệc cưới đẹp Khởi đầu thành công với khai trương Chuyên gia tư vấn sự kiện Xem ảnh các sự kiện đẹp Tin mới về sự kiện Kết nối với đội ngũ chuyên gia Chú hề vui nhộn cho tiệc sinh nhật Ý tưởng tiệc cuối năm Tất niên độc đáo Trang trí tiệc hiện đại Tổ chức sinh nhật cho Hải Đăng Sinh nhật độc quyền Khánh Vân Phong cách tiệc Bích Ngân Trang trí tiệc bé Thanh Trang Thuê dịch vụ ông già Noel chuyên nghiệp Xem xiếc khỉ đặc sắc Xiếc quay đĩa thú vị
Trang chủ Giới thiệu Sinh nhật bé trai Sinh nhật bé gái Tổ chức sự kiện Biểu diễn giải trí Dịch vụ khác Trang trí tiệc cưới Tổ chức khai trương Tư vấn dịch vụ Thư viện ảnh Tin tức - sự kiện Liên hệ Chú hề sinh nhật Trang trí YEAR END PARTY công ty Trang trí tất niên cuối năm Trang trí tất niên xu hướng mới nhất Trang trí sinh nhật bé trai Hải Đăng Trang trí sinh nhật bé Khánh Vân Trang trí sinh nhật Bích Ngân Trang trí sinh nhật bé Thanh Trang Thuê ông già Noel phát quà Biểu diễn xiếc khỉ Xiếc quay đĩa
Thiết kế website Thiết kế website Thiết kế website Cách kháng tài khoản quảng cáo Mua bán Fanpage Facebook Dịch vụ SEO Tổ chức sinh nhật