Code Generation
Introduction
Front end → Code optimizer → Code generator
Issues in the Design of Code Generator
• Input to the code generator
• Target Programs
• Memory Management
• Instruction Selection
• Register allocation
• Choice of evaluation order
• Approaches to code generation
Issues in the Design of Code Generator
• The most important criterion for a code generator is that it
produces correct code.
• Input to the code generator
– The IR produced by the front end, plus the symbol table
– We assume the front end produces a low-level IR, i.e. the values
of the names in it can be manipulated directly by machine instructions.
– Syntactic and semantic errors have already been detected.
• The target program
– Common target architectures are RISC, CISC, and stack-based
machines.
– In this chapter we use a very simple RISC-like computer, with the
addition of some CISC-like addressing modes.
• Memory management: The code generator maps names in
the source program to the addresses of data objects in
run-time memory.
• Instruction selection: The nature of the target machine's
instruction set determines the difficulty of instruction
selection. The quality of the generated code is judged by
its speed and size.
• Register allocation: Instructions involving register
operands are usually shorter and faster than those
involving operands in memory.
• Evaluation order: The order in which computations
are performed can affect the efficiency of the target
code. Some computation orders require fewer
registers to hold intermediate results than others.
• Approaches to code generation: Given the premium
on correctness, an important design goal is a code
generator that can be easily implemented, tested, and
maintained.
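To make the evaluation-order point concrete, here is a minimal sketch that counts the registers an expression tree needs under two orders. It assumes a simplified machine in which every operand must first be loaded into a register; the names Node, leaf, regs_left_first and regs_best_order are made up for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Expression tree node: a leaf (variable) or a binary operator."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def leaf(name: str) -> Node:
    return Node(name)


def regs_left_first(n: Node) -> int:
    """Registers needed when the left operand is always evaluated first."""
    if n.left is None:
        return 1  # a leaf is loaded into one register
    left = regs_left_first(n.left)
    right = regs_left_first(n.right)
    # The left result occupies one register while the right side is evaluated.
    return max(left, right + 1)


def regs_best_order(n: Node) -> int:
    """Registers needed when the more demanding operand is evaluated first."""
    if n.left is None:
        return 1
    left = regs_best_order(n.left)
    right = regs_best_order(n.right)
    return left + 1 if left == right else max(left, right)


# a + (b + (c + d))
tree = Node("+", leaf("a"), Node("+", leaf("b"), Node("+", leaf("c"), leaf("d"))))
print(regs_left_first(tree), regs_best_order(tree))
```

For this right-leaning tree, strict left-to-right evaluation keeps a and b waiting in registers while c + d is computed, whereas starting from the deepest subtree does not.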
Complexity of mapping
• x = y + z
LD R0, y
ADD R0, R0, z
ST x, R0
• a = b + c; d = a + e
LD R0, b
ADD R0, R0, c
ST a, R0
LD R0, a // redundant: the value of a is already in R0
ADD R0, R0, e
ST d, R0
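The cost of translating each statement in isolation can be sketched as follows. This is an illustration only: gen_naive and drop_redundant_loads are hypothetical helpers, statements are given as (x, y, op, z) tuples, and every statement is assumed to use R0.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def gen_naive(stmts):
    """Translate each statement x = y op z on its own, always through R0."""
    code = []
    for x, y, op, z in stmts:
        code += [f"LD R0, {y}", f"{OPCODES[op]} R0, R0, {z}", f"ST {x}, R0"]
    return code


def drop_redundant_loads(code):
    """Drop `LD R0, v` when the previous instruction was `ST v, R0`."""
    prefix = "LD R0, "
    out = []
    for ins in code:
        if out and ins.startswith(prefix) and out[-1] == f"ST {ins[len(prefix):]}, R0":
            continue  # R0 still holds v, so the load is unnecessary
        out.append(ins)
    return out


block = [("a", "b", "+", "c"), ("d", "a", "+", "e")]
for ins in drop_redundant_loads(gen_naive(block)):
    print(ins)
```

The second pass only recognizes a load that immediately follows the matching store; a real code generator would track register contents instead of patching the output afterwards.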
Register allocation
• Two subproblems
– Register allocation: selecting the set of variables that will reside in
registers at each point in the program
– Register assignment: selecting the specific register in which each variable will reside
• Complications imposed by the hardware architecture
– Example: register pairs for multiplication and division
• t = a + b; t = t * c; t = t / d
L R1, a
A R1, b
M R0, c
D R0, d
ST R1, t
• t = a + b; t = t + c; t = t / d
L R0, a
A R0, b
A R0, c
SRDA R0, 32
D R0, d
ST R1, t
A simple target machine model
• Load operations: LD r, x and LD r1, r2
• Store operations: ST x, r
• Computation operations: OP dst, src1, src2
• Unconditional jumps: BR L
• Conditional jumps: Bcond r, L, for example BLTZ r, L
Addressing Modes
• Variable name: x
• Indexed address: a(r), e.g. LD R1, a(R2) sets
R1 = contents(a + contents(R2))
• Integer indexed by a register: e.g. LD R1, 100(R2)
• Indirect addressing modes: *r and *100(r)
• Immediate constant addressing mode: e.g. LD R1, #100
Mode                Form    Address                          Added cost
Absolute            M       M                                1
Register            R       R                                0
Indexed             C(R)    C + contents(R)                  1
Indirect register   *R      contents(R)                      0
Indirect indexed    *C(R)   contents(C + contents(R))        1
Literal             #C      (the source is the constant C)   1
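As a rough sketch of how these modes resolve, the function below returns the value that a load from a given operand would fetch. The machine state is modelled with plain dictionaries, and load_value, regs, mem and addr_of are names invented for this example; only non-negative integer constants are handled.

```python
import re


def load_value(operand, regs, mem, addr_of):
    """Value that `LD r, operand` would place in r.

    regs: register name -> contents; mem: address -> contents;
    addr_of: variable name -> address (all hypothetical state).
    """
    operand = operand.strip()
    if operand.startswith("#"):            # literal: #C
        return int(operand[1:])
    if operand.startswith("*"):            # indirect: one more memory lookup
        return mem[load_value(operand[1:], regs, mem, addr_of)]
    if operand in regs:                    # register: R
        return regs[operand]
    m = re.fullmatch(r"(\w+)\((\w+)\)", operand)
    if m:                                  # indexed: C(R) or a(R)
        base, reg = m.groups()
        base_addr = int(base) if base.isdigit() else addr_of[base]
        return mem[base_addr + regs[reg]]
    return mem[addr_of[operand]]           # absolute: variable name


regs = {"R1": 0, "R2": 16}
addr_of = {"a": 1000, "i": 2000}
mem = {2000: 2, 1016: 42, 116: 1016, 16: 7}

print(load_value("a(R2)", regs, mem, addr_of))     # contents(a + contents(R2))
print(load_value("*100(R2)", regs, mem, addr_of))  # contents(contents(100 + contents(R2)))
```

Treating `*` as "apply the inner mode, then look the result up in memory" keeps the indirect forms to a single recursive case.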
b = a[i]
LD R1, i // R1 = i
MUL R1, R1, 8 // R1 = R1 * 8
LD R2, a(R1) // R2 = contents(a + contents(R1))
ST b, R2 // b = R2
a[j] = c
LD R1, c // R1 = c
LD R2, j // R2 = j
MUL R2, R2, 8 // R2 = R2 * 8
ST a(R2), R1 // contents(a + contents(R2)) = R1
x = *p
LD R1, p // R1 = p
LD R2, 0(R1) // R2 = contents(0 + contents(R1))
ST x, R2 // x = R2
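Conditional-jump three-address instruction
if x < y goto L
LD R1, x // R1 = x
LD R2, y // R2 = y
SUB R1, R1, R2 // R1 = R1 - R2
BLTZ R1, M // if R1 < 0 jump to M
Here M is the label of the first machine instruction generated for
the three-address instruction labeled L.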
costs associated with the addressing modes
• LD R0, R1 cost = 1
• LD R0, M cost = 2
• LD R1, *100(R2) cost = 3
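A small calculator makes the rule behind these numbers explicit: one unit for the instruction itself plus the added cost of each operand's addressing mode. This is a sketch under assumptions. The added costs are copied from the addressing-mode table and passed as a parameter, because an indirect indexed operand such as *100(R2) comes out at 2 under that table while the list above charges 3. The names mode_of and instruction_cost are illustrative, and any operand spelled R followed by digits (or SP) is taken to be a register.

```python
import re

# Added cost per addressing mode, as listed in the addressing-mode table.
ADDED_COST = {
    "register": 0,
    "absolute": 1,
    "indexed": 1,
    "indirect_register": 0,
    "indirect_indexed": 1,
    "literal": 1,
}


def mode_of(operand):
    """Classify one operand by its written form."""
    if operand.startswith("#"):
        return "literal"
    indirect = operand.startswith("*")
    body = operand.lstrip("*")
    if re.fullmatch(r"\w+\(\w+\)", body):
        return "indirect_indexed" if indirect else "indexed"
    if re.fullmatch(r"R\d+|SP", body):
        return "indirect_register" if indirect else "register"
    if indirect:
        raise ValueError(f"unsupported operand: {operand}")
    return "absolute"


def instruction_cost(instr, added_cost=ADDED_COST):
    """1 for the instruction word plus the added cost of every operand."""
    _, _, rest = instr.partition(" ")
    operands = [o.strip() for o in rest.split(",") if o.strip()]
    return 1 + sum(added_cost[mode_of(o)] for o in operands)


for ins in ["LD R0, R1", "LD R0, M", "ADD R0, R0, z"]:
    print(ins, instruction_cost(ins))
```

Summing instruction_cost over a generated sequence gives a quick way to compare two candidate translations of the same block.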
Addresses in the Target Code
Return to caller
• In the callee: BR *0(SP)
• In the caller: SUB SP, SP, #caller.recordsize
Target code for stack allocation
A Simple Code Generator
• Here we consider an algorithm that generates
code for a single basic block. It considers each
three-address instruction in turn and keeps
track of which values are in which registers,
so that it can avoid generating unnecessary
loads and stores.
Principal uses of registers
• In most machine architectures, some or all of the
operands of an operation must be in registers in order
to perform the operation.
• Registers make good temporaries - places to hold the
result of a subexpression while a larger expression is
being evaluated, or more generally, a place to hold a
variable that is used only within a single basic block.
• Registers are used to hold values that are computed in
one basic block and used in other blocks.
• Registers are often used to help with run-time storage
management, for example, to manage the run-time
stack, including the maintenance of stack pointers and
possibly the top elements of the stack itself.
Generic code Generation Algorithm
The code generation algorithm takes as input a sequence of
three-address statements. For each statement of the form
x = y op z, it performs the following actions:
1. Invoke a function getreg to determine the
location L where the result of the computation y
op z should be stored. L will usually be a register.
2. Consult the address descriptor for y to determine
y’, the current location of y. If the value of y is not
already in L, generate the instruction MOV y’, L to
place a copy of y in L.
3. Generate the instruction op z’, L, where z’ is a
current location of z.
4. Update the address descriptor of x to indicate
that x is in location L. If L is a register, update its
descriptor to indicate that it contains the value
of x, and remove x from all other register
descriptors.
5. If the current values of y and/or z have no next
uses and are in registers, alter the register
descriptors to indicate that those registers will no
longer contain y and/or z, respectively.
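The five steps can be sketched in Python as below. This is one possible reading, not a definitive implementation. It emits two-operand instructions in the `OP source, destination` form used in the steps. The getreg strategy (reuse y's register if y is dead afterwards, else take an empty register, else spill the first register) and the stores emitted at the end of the block for names in live_out are assumptions of this sketch. gen_block, live_out and the descriptor dictionaries are hypothetical names.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def gen_block(stmts, live_out, registers=("R0", "R1")):
    """Generate code for one basic block of (x, y, op, z) statements."""
    reg_desc = {r: set() for r in registers}  # register -> names it holds
    addr_desc = {}                            # name -> set of locations
    code = []

    def locations(name):
        # Assumption: a name not yet assigned in this block lives in memory.
        return addr_desc.setdefault(name, {name})

    def where(name):
        regs = sorted(loc for loc in locations(name) if loc in reg_desc)
        return regs[0] if regs else name      # prefer a register copy

    def has_next_use(name, i):
        for x, y, _, z in stmts[i + 1:]:
            if name in (y, z):
                return True
            if name == x:
                return False
        return name in live_out

    def getreg(i, x, y):
        for r in registers:                   # reuse y's register if y is dead
            if reg_desc[r] == {y} and (y == x or not has_next_use(y, i)):
                return r
        for r in registers:                   # otherwise an empty register
            if not reg_desc[r]:
                return r
        r = registers[0]                      # otherwise spill a register
        for name in sorted(reg_desc[r]):
            if name not in locations(name):
                code.append(f"MOV {r}, {name}")
                locations(name).add(name)
        return r

    for i, (x, y, op, z) in enumerate(stmts):
        L = getreg(i, x, y)                               # step 1
        if L not in locations(y):                         # step 2
            for name in reg_desc[L]:
                locations(name).discard(L)
            reg_desc[L] = set()
            code.append(f"MOV {where(y)}, {L}")
        code.append(f"{OPCODES[op]} {where(z)}, {L}")     # step 3
        for name in reg_desc[L]:                          # step 4
            locations(name).discard(L)
        for r in registers:
            reg_desc[r].discard(x)
        reg_desc[L] = {x}
        addr_desc[x] = {L}
        for name in {y, z}:                               # step 5
            if name != x and not has_next_use(name, i):
                for r in registers:
                    if name in reg_desc[r]:
                        reg_desc[r].discard(name)
                        locations(name).discard(r)

    for name in sorted(live_out):             # store live names held only in registers
        if name not in locations(name):
            code.append(f"MOV {where(name)}, {name}")
            locations(name).add(name)
    return code


block = [("t", "a", "-", "b"), ("u", "a", "-", "c"),
         ("v", "t", "+", "u"), ("d", "v", "+", "u")]
for ins in gen_block(block, live_out={"d"}):
    print(ins)
```

In this block t and v are dead after their single use, so getreg keeps reusing R0 and the whole computation needs only two registers and one final store.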
Register and Address Descriptors
The code generation algorithm uses descriptors to keep track of
register contents and the addresses of names.
1. Register descriptors: A register descriptor is a pointer to a list
containing information about the current contents of each register.
Initially, all the registers are empty.