Unit 5

Uploaded by

Shirley Andrina

UNIT V

Code generation: Machine-dependent code generation, object code forms, generic code
generation algorithm, Register allocation, and assignment. Using DAG representation of
Block.

CODE GENERATION
The final phase in the compiler model is the code generator. It takes as input an intermediate representation of
the source program and produces as output an equivalent target program. The code generation techniques
presented below can be used whether or not an optimizing phase occurs before code generation.

The code generated by the compiler is object code in some lower-level language,
for example, assembly language.
The source code written in a higher-level language is transformed into lower-level
object code, which should have the following minimum properties:
-It should carry the exact meaning of the source code.
-It should be efficient in terms of CPU usage and memory management.

ISSUES IN THE DESIGN OF A CODE GENERATOR

The code generator converts the intermediate representation of the source code into a form that can be
readily executed by the machine. A code generator is expected to generate correct code.
It should be designed so that it can be easily implemented, tested, and maintained.

The main issues in the design of a code generator are:

Input to the code generator
Target program
Memory management
Instruction selection
Register allocation
Evaluation order

(i) Input to the code generator: The input to the code generator is the intermediate code generated
by the front end, along with information in the symbol table that determines the run-time addresses
of the data objects denoted by the names in the intermediate representation. Intermediate code
may be represented as quadruples, triples, indirect triples, postfix notation, syntax trees,
DAGs, etc. The code generation phase proceeds on the assumption that the input is free from
all syntactic and static semantic errors, that the necessary type checking has taken place, and that
type-conversion operators have been inserted wherever necessary.

(ii) Target program: The target program is the output of the code generator. The output may be
absolute machine language, relocatable machine language, or assembly language.
Absolute machine language as output has the advantage that it can be placed in
a fixed memory location and executed immediately. For example, WATFIV
is a compiler that produces absolute machine code as output.
Relocatable machine language as output allows subprograms and subroutines
to be compiled separately. Relocatable object modules can be linked together and
loaded by a linking loader, at the added expense of linking and loading.
Assembly language as output makes code generation easier. We can generate
symbolic instructions and use the macro facilities of the assembler in generating code,
but an additional assembly step is needed after code generation.

(iii) Memory management: Mapping the names in the source program to the addresses of data
objects is done jointly by the front end and the code generator. A name in a three-address statement
refers to the symbol table entry for the name. From the symbol table entry, a relative address
can be determined for the name.
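As a rough illustration of how a relative address can be derived from symbol table information, the sketch below lays names out one after another in declaration order. The function name, the (name, width) input format and the absence of alignment padding are all assumptions made for this example; they are not taken from the notes.

```python
def assign_relative_addresses(declarations):
    """Give each declared name an offset from the start of its data area.

    declarations: list of (name, width_in_bytes) pairs in declaration order.
    Alignment and nested scopes are ignored in this sketch.
    """
    symbol_table = {}
    offset = 0
    for name, width in declarations:
        # The relative address of a name is the total width of the names before it.
        symbol_table[name] = {"width": width, "offset": offset}
        offset += width
    return symbol_table, offset


table, total_size = assign_relative_addresses([("a", 4), ("b", 8), ("c", 4)])
print(table["b"]["offset"], total_size)
```

The code generator would then turn such an offset into an operand address, for example relative to the base of a static data area or an activation record.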

(iv) Instruction selection: Selecting the best instructions improves the efficiency of the program. The
uniformity and completeness of the instruction set are important factors. Instruction speeds and machine
idioms also play a major role when efficiency is considered. If we do not care about the efficiency of the
target program, instruction selection is straightforward: each three-address statement is translated by a
fixed code template. For example, the two three-address statements below would be translated into the code
sequence that follows them:
P:=Q+R
S:=P+T

MOV Q, R0
ADD R, R0
MOV R0, P
MOV P, R0
ADD T, R0
MOV R0, S
Here the fourth instruction is redundant: it reloads into R0 the value of P that the third
instruction has just stored from R0. The result is an inefficient code sequence. A given
intermediate representation can be translated into many code sequences, with significant cost
differences between them. Knowledge of instruction costs is needed in order to design good
sequences, but accurate cost information is often difficult to obtain.
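The redundant reload in the example can be removed by a small pass over the generated code. The following is only a sketch: the tuple encoding (op, source, destination) and the function name are assumptions for illustration, and the check is safe only for straight-line code in which the second instruction is not the target of a jump.

```python
def drop_redundant_moves(code):
    """Drop a MOV that only copies back what the previous MOV just copied.

    code: list of (op, source, destination) tuples, e.g. ("MOV", "R0", "P").
    """
    result = []
    for instr in code:
        op, src, dst = instr
        if result:
            prev_op, prev_src, prev_dst = result[-1]
            # After "MOV R0, P", both R0 and P hold the same value,
            # so a following "MOV P, R0" changes nothing.
            if op == "MOV" and prev_op == "MOV" and prev_src == dst and prev_dst == src:
                continue
        result.append(instr)
    return result


naive = [
    ("MOV", "Q", "R0"),
    ("ADD", "R", "R0"),
    ("MOV", "R0", "P"),
    ("MOV", "P", "R0"),   # redundant reload of P
    ("ADD", "T", "R0"),
    ("MOV", "R0", "S"),
]
improved = drop_redundant_moves(naive)
for op, src, dst in improved:
    print(f"{op} {src}, {dst}")
```

A pass like this only repairs the output afterwards; a code generator that tracks which values are already in registers can avoid emitting the reload in the first place.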
(v) Register allocation issues: Instructions with register operands are usually shorter and faster
than those with memory operands, so efficient utilization of registers is important. The use of
registers is subdivided into two sub-problems:
1. During register allocation, we select the set of variables that will reside in
registers at each point in the program.
2. During a subsequent register assignment phase, we pick the specific register in which each
such variable will reside.

To understand the concept, consider the following three-address code sequence:
t:=a+b
t:=t*c
t:=t/d
An efficient machine code sequence keeps t in R0 throughout and stores it only once:
MOV a, R0
ADD b, R0
MUL c, R0
DIV d, R0
MOV R0, t

(vi) Evaluation order: The code generator decides the order in which computations are
performed. This order affects the efficiency of the target code: among the many possible
orders, some require fewer registers to hold intermediate results than others.
However, picking the best order in the general case is a difficult, NP-complete problem.

Approaches to code generation issues: A code generator must always generate correct code.
Correctness takes on special significance because of the number of special cases that a code
generator might face. Some of the design goals of a code generator are:
Correct
Easily maintainable
Testable
Efficient

Disadvantages in the design of a code generator:

Limited flexibility: Code generators are typically designed to produce a specific type of code,
and as a result, they may not be flexible enough to handle a wide range of inputs or generate
code for different target platforms. This can limit the usefulness of the code generator in certain
situations.
Maintenance overhead: Code generators can add a significant maintenance overhead to a
project, as they need to be maintained and updated alongside the code they generate. This can
lead to additional complexity and potential errors.
Debugging difficulties: Debugging generated code can be more difficult than debugging hand-
written code, as the generated code may not always be easy to read or understand. This can make
it harder to identify and fix issues that arise during development.
Performance issues: Depending on the complexity of the code being generated, a code
generator may not be able to generate optimal code that is as performant as hand-written code.
This can be a concern in applications where performance is critical.
Learning curve: Code generators can have a steep learning curve, as they typically require a
deep understanding of the underlying code generation framework and the programming
languages being used. This can make it more difficult to onboard new developers onto a project
that uses a code generator.

Over-reliance: There is a risk of over-reliance on generated code, to the point where
developers are no longer able to write code manually when necessary. This can limit the
flexibility and creativity of a development team, and may also result in lower quality
code overall.

MACHINE DEPENDENT CODE OPTIMIZATIONS

Machine-dependent optimization uses information about the limits and special features of the target
machine to produce code that is shorter or that executes more quickly on that machine.
The code produced by the compiler should take advantage of the special features of the target
machine. For example, consider code intended for machines of the PDP-11 family.
These computers have autoincrement and autodecrement addressing modes.
When an instruction uses the autoincrement mode, the contents of the register are incremented
after being used. The register is incremented by one for byte instructions and by two for word
instructions.
The use of these modes reduces the code needed for pushing and popping stacks.
The PDP-11 computers also have machine-level instructions to increment (INC) or decrement
(DEC) by one a value stored in memory.
Whenever possible, INC and DEC should be used instead of creating a constant with
value 1 and adding it to or subtracting it from the value stored in memory.
The PDP-11 machines have left- and right-shift operations. Shifting the bits one position to the left
is equivalent to multiplying by 2. Since shifting is faster than multiplication or division, more efficient
code is generated if multiplication and division by powers of 2 are implemented with shift
operations.
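The idioms above can be pictured as a small selection routine that prefers the special instruction when the constant allows it. This is a sketch under stated assumptions: the function name is invented, SHL stands for a hypothetical single-bit left shift, and the generic ADD/SUB/MUL forms with a # constant are illustrative rather than real PDP-11 syntax. Whether a run of shifts actually beats a multiply depends on the target machine.

```python
def select_instructions(op, dest, constant):
    """Choose instructions for 'dest := dest op constant' using machine idioms."""
    if op == "+" and constant == 1:
        return [f"INC {dest}"]
    if op == "-" and constant == 1:
        return [f"DEC {dest}"]
    # A positive power of two has exactly one bit set.
    if op == "*" and constant > 0 and constant & (constant - 1) == 0:
        shifts = constant.bit_length() - 1
        # One single-bit shift per factor of 2 (none at all for constant == 1).
        return [f"SHL {dest}"] * shifts
    generic = {"+": "ADD", "-": "SUB", "*": "MUL"}[op]
    return [f"{generic} #{constant}, {dest}"]


print(select_instructions("+", "count", 1))
print(select_instructions("*", "x", 8))
print(select_instructions("*", "x", 6))
```

Note that 6 is a multiple of 2 but not a power of 2, so it falls through to the generic multiply.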

OBJECT CODE FORMS

The output of code generation is object code (machine code), which is normally classified into the
following forms:
1. Absolute Code
2. Relocatable machine Code
3. Assembler Code
Absolute Code:- Producing an absolute machine language program as output has the advantage that it can
be placed in a fixed location in memory and immediately executed. Programs can be compiled and executed
quickly.
Relocatable Machine Code:- Producing a relocatable machine language program (often called an object
module) as output allows subprograms to be compiled separately. For example, a set of relocatable object
modules can be linked together and loaded for execution by a linking loader. If the target machine does not
handle relocation automatically, the compiler must provide explicit relocation information to the loader to
link the separately compiled program segments.
Assembler Code:- Producing an assembly-language program as output makes the process of code
generation somewhat easier. We can generate symbolic instructions and use the macro facilities of the
assembler to help generate code. However, generating assembler code as output makes the overall
process slower, because the output still needs to be assembled, linked and loaded.
GENERIC CODE GENERATION ALGORITHM

A simple code generation algorithm is one that generates code for a single basic block.
It considers each three-address instruction in turn and keeps track of what values are in what
registers so it can avoid generating unnecessary loads and stores.
One of the primary issues during code generation is how to use registers effectively.
There are four principal uses of registers:

In most machine architectures some or all of the operands of an operation must be in registers in
order to perform the operation.
Registers make good temporaries i.e. places to hold the result of sub expression while a larger
expression is being evaluated.
Registers are used to hold values that are computed in one basic block and used in other blocks.
Registers are often used to help with run-time storage management, for example registers are used
to manage the run-time stack.
Let us assume that some set of registers is available to hold the values that are used within the block.
Typically this set does not include all the registers of the machine, since some registers are
reserved for global variables and for managing the stack.
The code generation algorithm considers each three-address instruction in turn and decides what loads
are necessary to get the needed operands into registers. After generating the loads, it generates the operation
itself. If the result needs to be stored into a memory location, it also generates that store. To make
these decisions we require a data structure that tells us which program variables currently have
their value in a register, and in which register or registers.
The desired data structure has the following descriptors:
o For each available register, a register descriptor keeps track of the variable names whose current
value is in that register.
o For each program variable an address descriptor keeps track of the location or locations where the
current value of that variable can be found. Where the location may be a stack location or a register
or a memory address.
o An essential part of the algorithm is the function getReg(I) which selects registers for each memory
location associated with the three address instruction I.
o The function getReg has access to the register and address descriptors for all the variables of the
basic block.
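Machine Instructions for operations
For a three address instruction such as x = y + z, do the following:
1. Use getReg(x = y + z) to select registers for x, y and z. Call these registers Rx, Ry, and Rz.
2. If y is not in Ry (according to the register descriptor for Ry), then issue an instruction LD Ry, y',
where y' is one of the memory locations for y (according to the address descriptor for y).
3. Similarly, if z is not in Rz, issue an instruction LD Rz, z', where z' is a location for z.
4. Issue the instruction ADD Rx, Ry, Rz.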

In the above three address instruction x = y + z we shall treat + as a generic operator and
ADD as the equivalent machine instruction.
Thus, when we implement the operation, the value of y must be in the second register and
the value of z must be in the third register of the ADD instruction.
Managing Register and Address Descriptors
As the code generation algorithm issues load, store and other machine instructions, it needs to update the
register and address descriptors. The rules are as follows:
1. For the instruction LD R, x
a) Change the register descriptor for register R so it holds only x.
b) Change the address descriptor for x by adding register R as an additional location.
2. For the instruction ST x, R, change the address descriptor for x to include its own memory location.
3. For an operation such as ADD Rx, Ry, Rz implementing a three address instruction x = y + z
a) Change the register descriptor for Rx so that it holds only x.
b) Change the address descriptor for x so that its only location is Rx. The memory location for x is
no longer in the address descriptor for x.
c) Remove Rx from the address descriptor of any variable other than x.
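The bookkeeping above can be sketched as a toy generator for one basic block. Everything named here is an assumption for illustration: the class BlockCodeGen, its methods, and in particular the register-picking policy (prefer an empty register, otherwise evict the first one not needed by the current instruction), which stands in for getReg and is much cruder than a real one. It needs at least three registers and handles only statements of the form x = y op z.

```python
class BlockCodeGen:
    """Toy code generator for statements x = y op z inside one basic block."""

    OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

    def __init__(self, registers, variables):
        self.code = []
        self.reg_desc = {r: set() for r in registers}   # register -> variables it holds
        self.addr_desc = {v: {v} for v in variables}    # variable -> locations (v = its memory)

    def _store(self, var, reg):
        # ST x, R: x's own memory location becomes a valid location again (rule 2).
        self.code.append(f"ST {var}, {reg}")
        self.addr_desc[var].add(var)

    def _free(self, reg):
        # Before reusing reg, save every variable whose only copy lives in it.
        for var in sorted(self.reg_desc[reg]):
            if self.addr_desc[var] == {reg}:
                self._store(var, reg)
            self.addr_desc[var].discard(reg)
        self.reg_desc[reg] = set()

    def _pick(self, busy):
        # Simplified stand-in for getReg: empty register first, else evict one.
        candidates = [r for r in self.reg_desc if r not in busy]
        for reg in candidates:
            if not self.reg_desc[reg]:
                return reg
        self._free(candidates[0])
        return candidates[0]

    def _operand_reg(self, var, busy):
        for reg, held in self.reg_desc.items():
            if var in held:
                return reg                       # value already in a register
        reg = self._pick(busy)
        self.code.append(f"LD {reg}, {var}")
        self.reg_desc[reg] = {var}               # rule 1a
        self.addr_desc[var].add(reg)             # rule 1b
        return reg

    def gen(self, x, y, op, z):
        ry = self._operand_reg(y, busy=set())
        rz = self._operand_reg(z, busy={ry})
        rx = self._pick(busy={ry, rz})
        self.code.append(f"{self.OPCODES[op]} {rx}, {ry}, {rz}")
        for held in self.reg_desc.values():
            held.discard(x)                      # older copies of x are now stale
        for var, locations in self.addr_desc.items():
            if var != x:
                locations.discard(rx)            # rule 3c
        self.reg_desc[rx] = {x}                  # rule 3a
        self.addr_desc[x] = {rx}                 # rule 3b

    def finish(self, live_on_exit):
        # At the end of the block, store live variables that exist only in registers.
        for var in live_on_exit:
            if var not in self.addr_desc[var]:
                reg = next(r for r, held in self.reg_desc.items() if var in held)
                self._store(var, reg)


cg = BlockCodeGen(["R1", "R2", "R3"], ["a", "b", "c", "d", "t"])
cg.gen("t", "a", "+", "b")
cg.gen("t", "t", "*", "c")
cg.gen("t", "t", "/", "d")
cg.finish(["t"])
print("\n".join(cg.code))
```

Because the descriptors record that t is already in a register, the second and third statements load only c and d, and t is stored once at the end of the block. A better getReg would also reuse the register of t for the result instead of picking a fresh one each time.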
REGISTER ALLOCATION AND ASSIGNMENT
Local register allocation
Local register allocation works only within a basic block. It follows a top-down approach:
Assign registers to the most heavily used variables
Traverse the block
Count uses
Use count as a priority function
Assign registers to higher priority variables first
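The top-down scheme can be sketched in a few lines. The function name, the (dest, operand1, operand2) statement format and the register names R0, R1, ... are assumptions for this example; it counts every occurrence of a name, whether defined or used, and breaks ties alphabetically.

```python
from collections import Counter


def allocate_local(block, num_registers):
    """Assign registers to the most frequently referenced variables of one block.

    block: list of (dest, operand1, operand2) three-address statements.
    Returns a dict mapping the chosen variables to register names.
    """
    counts = Counter()
    for dest, left, right in block:
        counts.update([dest, left, right])       # traverse the block, count references
    # The count is the priority; ties are broken by name to keep the result stable.
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    return {var: f"R{i}" for i, var in enumerate(ranked[:num_registers])}


block = [("t", "a", "b"), ("t", "t", "c"), ("t", "t", "d")]
assignment = allocate_local(block, 2)
print(assignment)
```

Variables that do not receive a register stay in memory and are loaded and stored around each reference.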
Need for global register allocation

Local allocation does not take into account that some instructions (e.g. those in loops) execute more
frequently. It also forces us to store and load at basic block endpoints, since each block has no
knowledge of the context of the others.
Global allocation is needed to find the live range(s) of each variable and the area(s) where the
variable is used or defined. The cost of spilling depends on the frequencies and locations of uses.
Register allocation depends on:
Size of live range
Number of uses/definitions
Frequency of execution
Number of loads/stores needed.
Cost of loads/stores needed.
Usage Counts:
A simple method of determining the savings to be realized by keeping variable x in a register for the
duration of loop L is to recognize that in our machine model we save one unit of cost for each reference to
x if x is in a register. An approximate formula for the benefit to be realized from allocating a register to x
within a loop L is:

$$\sum_{\text{blocks } B \in L} \bigl(\mathrm{use}(x, B) + 2 \cdot \mathrm{live}(x, B)\bigr)$$

where,
-use(x, B) is the number of times x is used in B prior to any definition of x;
-live(x, B) is 1 if x is live on exit from B and is assigned a value in B, and 0 otherwise.

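For a loop consisting of blocks B1 to B4, each entry below is use(x, B) + 2*live(x, B):

       B1      B2      B3      B4     Total
a =  (0+2) + (1+0) + (1+0) + (0+0)  =  4
b =  (2+0) + (0+0) + (0+2) + (0+2)  =  6
c =  (1+0) + (0+0) + (1+0) + (1+0)  =  3
d =  (1+2) + (1+0) + (1+0) + (1+0)  =  6
e =  (0+2) + (0+0) + (0+2) + (0+0)  =  4
f =  (1+0) + (0+2) + (1+0) + (0+0)  =  4

With three registers R0, R1 and R2 available, the variables with the largest totals are served first:
R1 can be used by b
R2 can be used by d
R0 can be used by a or e or f, which tie at 4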
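The totals in the example can be recomputed directly from the usage-count definition. In the sketch below the per-block (use, live) pairs are read off the example above; the function name and the dictionary layout are assumptions for illustration.

```python
def register_benefit(per_block):
    """Benefit of keeping a variable in a register for a whole loop.

    per_block: one (use, live) pair per basic block of the loop, where use is
    the number of uses before any definition in the block and live is 1 if the
    variable is assigned in the block and live on exit, else 0.
    """
    return sum(use + 2 * live for use, live in per_block)


# (use, live) for blocks B1..B4, taken from the worked example.
loop = {
    "a": [(0, 1), (1, 0), (1, 0), (0, 0)],
    "b": [(2, 0), (0, 0), (0, 1), (0, 1)],
    "c": [(1, 0), (0, 0), (1, 0), (1, 0)],
    "d": [(1, 1), (1, 0), (1, 0), (1, 0)],
    "e": [(0, 1), (0, 0), (0, 1), (0, 0)],
    "f": [(1, 0), (0, 1), (1, 0), (0, 0)],
}
benefits = {var: register_benefit(blocks) for var, blocks in loop.items()}
ranking = sorted(benefits, key=lambda v: (-benefits[v], v))
print(benefits)
print(ranking[:3])
```

With three registers, b and d are chosen first; the third register goes to one of a, e or f, and the alphabetical tie-break used here happens to pick a.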

DIRECTED ACYCLIC GRAPH

A Directed Acyclic Graph (DAG) is a tool that depicts the structure of a basic block, shows how
values flow through the block, and exposes opportunities for optimization. A DAG makes
transformations on basic blocks easy to apply. In a DAG:

Leaf nodes represent identifiers, names or constants.

Interior nodes represent operators.
Interior nodes also represent the results of expressions or the identifiers/name where the values are
to be stored or assigned.
Input: A basic block
Output: A DAG for the basic block containing the following information:
1. A label for each node. For leaves, the label is an identifier. For interior nodes, an operator symbol.
2. For each node a list of attached identifiers to hold the computed values.
Each statement of the block has one of the following forms:
Case (i) x := y OP z
Case (ii) x := OP y
Case (iii) x := y
Method:
Step 1: If node(y) is undefined, create a leaf labeled y and let node(y) be this leaf.
For case (i), if node(z) is undefined, create a leaf labeled z in the same way.
Step 2: For case (i), determine whether there is a node labeled OP whose left child is node(y) and right
child is node(z) (this check detects common subexpressions). If not, create such a node. Let n be the
node found or created.
For case (ii), determine whether there is a node labeled OP with the single child node(y). If not, create
such a node. Let n be the node found or created.
For case (iii), let n be node(y).
Step 3: Delete x from the list of attached identifiers for node(x). Append x to the list of attached
identifiers for the node n found in step 2 and set node(x) to n.
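The three steps can be sketched as follows. The class DagBuilder and its fields are hypothetical names chosen for this example; nodes are identified by their index, and a dictionary keyed on (operator, children) plays the role of the search for an existing node in step 2.

```python
class DagBuilder:
    """Builds a DAG for one basic block, one statement at a time."""

    def __init__(self):
        self.nodes = []       # node id -> (label, tuple of child ids)
        self.attached = []    # node id -> identifiers attached to that node
        self.current = {}     # identifier -> node id, i.e. node(x)
        self.lookup = {}      # (operator, child ids) -> id of an existing interior node

    def _new(self, label, children):
        self.nodes.append((label, children))
        self.attached.append([])
        return len(self.nodes) - 1

    def _node(self, name):
        # Step 1: create a leaf for the name if node(name) is still undefined.
        if name not in self.current:
            self.current[name] = self._new(name, ())
        return self.current[name]

    def assign(self, x, y, op=None, z=None):
        """Handle x := y op z, x := op y (z omitted) or x := y (op and z omitted)."""
        operands = [y] if z is None else [y, z]
        children = tuple(self._node(name) for name in operands)
        if op is None:
            n = children[0]                      # case (iii): reuse node(y)
        else:
            key = (op, children)
            if key not in self.lookup:           # step 2: reuse or create the operator node
                self.lookup[key] = self._new(op, children)
            n = self.lookup[key]
        # Step 3: move x from its old node to n.
        old = self.current.get(x)
        if old is not None and x in self.attached[old]:
            self.attached[old].remove(x)
        self.attached[n].append(x)
        self.current[x] = n
        return n


dag = DagBuilder()
n1 = dag.assign("T1", "4", "*", "I0")
n3 = dag.assign("T3", "4", "*", "I0")   # same operator and children: node is shared
print(n1 == n3, dag.attached[n1], len(dag.nodes))
```

Because T1 and T3 end up attached to the same node, the second multiplication is recognized as a common subexpression and need not be computed again.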
Problem 1: Construct a DAG for the following code
t0 = a + b
t1 = t0 + c
d = t0 + t1

Problem 2: Construct a DAG for the following code

T1 := 4*I0
T2 := a[T1]
T3 := 4*I0
T4 := b[T3]
T5 := T2 * T4
T6 := prod + T5
prod:= T6
T7 := I0 + 1
I0 := T7
if I0 <= 20 goto 1

Applications of Directed Acyclic Graphs:
A DAG identifies common subexpressions, i.e. expressions that are computed more than once.
A DAG identifies the names that are used within the block as well as the names that are computed
outside the block.
A DAG identifies which statements in the block compute values that could be used outside the block.
Code can be represented by a DAG that describes the inputs and outputs of each of
the arithmetic operations performed within the code; this representation allows the compiler to
perform common subexpression elimination efficiently.
Several programming languages describe systems of values that are linked together by a directed acyclic
graph. When one value changes, its successors are recalculated; each value in the DAG is evaluated
as a function of its predecessors.
