Compiler Design (16CS303) : Welcome

This document provides an introduction to intermediate code generation in compiler design. It discusses how intermediate codes are machine independent but close to machine instructions. Generating intermediate code eliminates the need for a new compiler for each machine and makes optimizations easier. Common intermediate representations include graphs, trees and 3-address code using quadruple or triple notation. The document provides examples of implementing intermediate code as 3-address code statements.

Uploaded by

Thanmayee Thanu

WELCOME

COMPILER DESIGN
(16CS303)

by
S. Shivaprasad
Assistant Professor
Department of CSE
VFSTR Deemed to be University
INTERMEDIATE CODE GENERATION

Dept. of CSE, VFSTR University Compiler Design


INTERMEDIATE CODE GENERATION (ICG)
• Intermediate codes are machine-independent codes, but they are close to machine instructions.

ADVANTAGES
• If a compiler translates the source language directly to its target machine language, with no option for generating intermediate code, then a full native compiler is required for each new machine.
• Intermediate code eliminates the need for a new full compiler for every unique machine.
• It becomes easier to improve code performance by applying code optimization techniques to the intermediate code.

• The intermediate language can be one of many different languages; the designer of the compiler decides which one to use.
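TYPES OF INTERMEDIATE LANGUAGES

GRAPHICAL REPRESENTATIONS:

CONSIDER THE ASSIGNMENT a := b * -c + b * -c

[Figure: the abstract syntax tree (AST) and the directed acyclic graph (DAG) for this assignment. In the AST the subtree for b * -c (a * node over b and uminus c) appears twice under +; in the DAG it appears once and both operands of + refer to it.]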
EXAMPLE OF THREE-ADDRESS CODE
Example: a := b * -c + b * -c

Code for the AST:
t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5

Code for the DAG:
t1 := - c
t2 := b * t1
t5 := t2 + t2
a := t5
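The two listings for a := b * -c + b * -c can be reproduced with a short Python sketch. This is only an illustration, not the method used in the slides: the tuple encoding of the expression tree and the names gen_tac and translate_assignment are assumptions. With sharing enabled the temporaries are numbered consecutively, so the sum ends up in t3 rather than t5.

```python
from itertools import count

def gen_tac(node, code, temps, cache=None):
    """Emit three-address statements for an expression tree.

    node is a variable name (str) or a tuple: ("uminus", child) or (op, left, right).
    Returns the name that holds the value of node.
    Passing a dict as cache reuses temporaries for repeated subtrees (DAG-style).
    """
    if isinstance(node, str):
        return node
    if cache is not None and node in cache:
        return cache[node]
    args = [gen_tac(child, code, temps, cache) for child in node[1:]]
    temp = f"t{next(temps)}"
    if node[0] == "uminus":
        code.append(f"{temp} := - {args[0]}")
    else:
        code.append(f"{temp} := {args[0]} {node[0]} {args[1]}")
    if cache is not None:
        cache[node] = temp
    return temp

def translate_assignment(target, expr, share=False):
    """Return the list of three-address statements for target := expr."""
    code = []
    result = gen_tac(expr, code, count(1), {} if share else None)
    code.append(f"{target} := {result}")
    return code

term = ("*", "b", ("uminus", "c"))   # b * -c
expr = ("+", term, term)             # b * -c + b * -c
tree_code = translate_assignment("a", expr)              # AST: six statements
dag_code = translate_assignment("a", expr, share=True)   # DAG: four statements
print("\n".join(tree_code))
print("\n".join(dag_code))
```

Sharing the repeated subtree is what shortens the DAG listing: b * -c is computed once and used for both operands of +.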
IMPLEMENTATION
THREE-ADDRESS CODE (QUADRUPLES)
A quadruple is:
x := y op z
where x, y and z are names, constants or compiler-generated temporaries, and op is any operator.
• We may also use the following notation for quadruples (a much better notation, because it looks like a machine code instruction):
op y,z,x
which applies operator op to y and z, and stores the result in x.
• We use the term “three-address code” because each statement usually contains three addresses (two for operands, one for the result).


THREE-ADDRESS STATEMENTS (CONT.)

Binary Operator: op y,z,result or result := y op z
where op is a binary arithmetic or logical operator. This binary operator is applied to y and z, and the result of the operation is stored in result.
Ex: add a,b,c
    gt a,b,c
    addr a,b,c
    addi a,b,c

Unary Operator: op y,,result or result := op y
where op is a unary arithmetic or logical operator. This unary operator is applied to y, and the result of the operation is stored in result.
Ex: uminus a,,c
    not a,,c
    inttoreal a,,c
THREE-ADDRESS STATEMENTS (CONT.)

Move Operator: mov y,,result or result := y
where the content of y is copied into result.
Ex: mov a,,c
    movi a,,c
    movr a,,c

Unconditional Jumps: jmp ,,L or goto L
We jump to the three-address statement with the label L, and execution continues from that statement.
Ex: jmp ,,L1 // jump to L1
    jmp ,,7 // jump to the statement 7
THREE-ADDRESS STATEMENTS (CONT.)
Conditional Jumps:
jmprelop y,z,L or if y relop z goto L
We jump to the three-address statement with the label L if the result of y relop z is true, and execution continues from that statement. If the result is false, execution continues from the statement following this conditional jump.
Ex: jmpgt y,z,L1 // jump to L1 if y>z
    jmpgte y,z,L1 // jump to L1 if y>=z
    jmpe y,z,L1 // jump to L1 if y==z
    jmpne y,z,L1 // jump to L1 if y!=z
The relational operator can also be a unary operator:
    jmpnz y,,L1 // jump to L1 if y is not zero
    jmpz y,,L1 // jump to L1 if y is zero
    jmpt y,,L1 // jump to L1 if y is true
    jmpf y,,L1 // jump to L1 if y is false
THREE-ADDRESS STATEMENTS (CONT.)

Procedure Parameters: param x,, or param x
Procedure Calls: call p,n, or call p,n
where x is an actual parameter and the procedure p is invoked with n parameters.
THREE-ADDRESS STATEMENTS (CONT.)

Indexed Assignments:
move y[i],,x or x := y[i]
move x,,y[i] or y[i] := x

Address and Pointer Assignments:
move addr y,,x or x := &y
move cont y,,x or x := *y
TYPES OF THREE ADDRESS CODES
IMPLEMENTATIONS OF 3-ADDRESS STATEMENTS
• Quadruples

Three-address code:
t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5

      op      arg1  arg2  result
(0)   uminus  c           t1
(1)   *       b     t1    t2
(2)   uminus  c           t3
(3)   *       b     t3    t4
(4)   +       t2    t4    t5
(5)   :=      t5          a

Pros: Statements can be moved around
Cons: Too much space is wasted
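As a rough illustration of why quadruples can be moved around, the table for a := b * -c + b * -c can be stored as a list of records and executed in two different orders. This is a minimal sketch: the Quad record and the run function are hypothetical names, and only the operators that occur in the table are handled.

```python
from collections import namedtuple

# One record per quadruple: operator, two operands, and an explicit result name
Quad = namedtuple("Quad", ["op", "arg1", "arg2", "result"])

quads = [
    Quad("uminus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("uminus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad(":=", "t5", None, "a"),
]

def run(quads, env):
    """Execute straight-line quadruples over the variables in env; return the final values."""
    env = dict(env)
    for q in quads:
        if q.op == "uminus":
            env[q.result] = -env[q.arg1]
        elif q.op == ":=":
            env[q.result] = env[q.arg1]
        elif q.op == "*":
            env[q.result] = env[q.arg1] * env[q.arg2]
        elif q.op == "+":
            env[q.result] = env[q.arg1] + env[q.arg2]
        else:
            raise ValueError(f"unknown operator {q.op}")
    return env

# Quadruples (2)-(3) do not use the results of (0)-(1), so the two pairs can swap places.
reordered = [quads[2], quads[3], quads[0], quads[1], quads[4], quads[5]]

print(run(quads, {"b": 2, "c": 3})["a"])
print(run(reordered, {"b": 2, "c": 3})["a"])
```

Because every quadruple names its result explicitly, reordering independent statements does not invalidate any operand reference. The price is the extra result field in every entry.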
IMPLEMENTATIONS OF 3-ADDRESS STATEMENTS
• Triples

Three-address code:
t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5

      op      arg1  arg2
(0)   uminus  c
(1)   *       b     (0)
(2)   uminus  c
(3)   *       b     (2)
(4)   +       (1)   (3)
(5)   assign  a     (4)

Pros: Space is not wasted
Cons: Statements cannot be moved
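The relationship between the two encodings can be sketched as a conversion from quadruples to triples. The function name quads_to_triples is hypothetical, and the sketch assumes that every non-assignment result is a compiler temporary. Temporary names are replaced by the position of the triple that computes them, which is why a triple cannot be moved without renumbering the references to it.

```python
def quads_to_triples(quads):
    """Convert (op, arg1, arg2, result) quadruples into (op, arg1, arg2) triples.

    References to temporaries become the integer position of the defining triple.
    Assumption: every result other than an assignment target is a temporary.
    """
    defined_at = {}   # temporary name -> index of the triple that computes it
    triples = []
    for op, arg1, arg2, result in quads:
        a1 = defined_at.get(arg1, arg1)
        a2 = defined_at.get(arg2, arg2)
        if op == ":=":
            # An assignment keeps its target name and points at the value's triple
            triples.append(("assign", result, a1))
        else:
            defined_at[result] = len(triples)
            triples.append((op, a1, a2))
    return triples

quads = [
    ("uminus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("uminus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    (":=", "t5", None, "a"),
]
triples = quads_to_triples(quads)
for index, triple in enumerate(triples):
    print(index, triple)
```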
OTHER TYPES OF 3-ADDRESS STATEMENTS
• Ternary operations such as
x[i] := y and x := y[i]
require two or more entries in the triple structure, e.g.

x[i] := y
      op      arg1  arg2
(0)   []=     x     i
(1)   assign  (0)   y

x := y[i]
      op      arg1  arg2
(0)   =[]     y     i
(1)   assign  x     (0)
IMPLEMENTATIONS OF 3-ADDRESS STATEMENTS
• Indirect Triples

Three-address code:
t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5

Statement list (pointers into the triple table):
      op
(0)   (14)
(1)   (15)
(2)   (16)
(3)   (17)
(4)   (18)
(5)   (19)

Triple table:
       op      arg1  arg2
(14)   uminus  c
(15)   *       b     (14)
(16)   uminus  c
(17)   *       b     (16)
(18)   +       (15)  (17)
(19)   assign  a     (18)

Pros: Statements can be moved
Cons: Two memory accesses are required
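A small sketch can show both properties of indirect triples: statements are reordered by permuting the statement list only, and each step costs two lookups (statement list, then triple table). The dictionary layout and the evaluate function are assumptions for illustration. Operands are restricted to variable names so that integers can stand for triple indices.

```python
# Triple table, keyed by the positions used on the slide (14..19)
triples = {
    14: ("uminus", "c", None),
    15: ("*", "b", 14),
    16: ("uminus", "c", None),
    17: ("*", "b", 16),
    18: ("+", 15, 17),
    19: ("assign", "a", 18),
}
# Statement list: execution order, stored as pointers into the triple table
statements = [14, 15, 16, 17, 18, 19]

def evaluate(statements, triples, env):
    """Run the triples in the order given by the statement list."""
    env = dict(env)
    values = {}   # triple index -> value computed by that triple

    def operand(x):
        # An int refers to another triple's value; a str is a variable name
        return values[x] if isinstance(x, int) else env[x]

    for index in statements:              # first access: statement list
        op, arg1, arg2 = triples[index]   # second access: triple table
        if op == "uminus":
            values[index] = -operand(arg1)
        elif op == "*":
            values[index] = operand(arg1) * operand(arg2)
        elif op == "+":
            values[index] = operand(arg1) + operand(arg2)
        elif op == "assign":
            env[arg1] = operand(arg2)
        else:
            raise ValueError(f"unknown operator {op}")
    return env

# Moving statements only permutes the pointer list; the triples stay untouched.
reordered = [16, 17, 14, 15, 18, 19]
print(evaluate(statements, triples, {"b": 2, "c": 3})["a"])
print(evaluate(reordered, triples, {"b": 2, "c": 3})["a"])
```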
CONVERSION OF POPULAR PROGRAMMING LANGUAGE CONSTRUCTS INTO 3-ADDRESS CODE

a = b + c + d

The three-address code for the given expression is:
(1) t1 = b + c
(2) t2 = t1 + d
(3) a = t2

Source:
a = b * c + b * d;

Three-address code:
t1 = b * c;
t2 = b * d;
t3 = t1 + t2;
a = t3;

Source:
if (a < b + c)
    a = a - c;
c = b * c;

Three-address code:
t1 = b + c;
t2 = a < t1;
ifz t2 goto l0;
t3 = a - c;
a = t3;
l0: t4 = b * c;
c = t4;

• Write three-address code for the following expression:
-(a * b) + (c + d) - (a + b + c + d)
The three-address code for the given expression is:
(1) t1 = a * b
(2) t2 = uminus t1
(3) t3 = c + d
(4) t4 = t2 + t3
(5) t5 = a + b
(6) t6 = t3 + t5
(7) t7 = t4 - t6

• Write three-address code for the following expression:
If A < B then 1 else 0
The three-address code for the given expression is:
(1) if (a < b) goto (4)
(2) t1 = 0
(3) goto (5)
(4) t1 = 1
(5)

• Write three-address code for the following expression:
If A < B and C < D then t = 1 else t = 0
The three-address code for the given expression is:
(1) if (a < b) goto (3)
(2) goto (4)
(3) if (c < d) goto (6)
(4) t = 0
(5) goto (7)
(6) t = 1
(7)
WHILE LOOP

Source:
a=3;
b=4;
i=0;
while(i<n){
    a=b+1;
    a=a*a;
    i++;
}
c=a;

In three-address code:
a=3;
b=4;
i=0;
l1:
var1=i<n;
if(var1) goto l2;
goto l3;
l2: var2=b+1;
a=var2;
var3=a*a;
a=var3;
i++;
goto l1;
l3: c=a;
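To check that the three-address version of the while loop behaves like the source, the statements can be run through a tiny interpreter. This is a sketch under stated assumptions: the tuple encoding (copy, add, mul, lt, if, goto, label) and the execute function are made up for illustration and are not a notation from the slides.

```python
# The while-loop translation, one tuple per three-address statement
program = [
    ("copy", "a", 3),
    ("copy", "b", 4),
    ("copy", "i", 0),
    ("label", "l1"),
    ("lt", "var1", "i", "n"),      # var1 = i < n
    ("if", "var1", "l2"),          # if (var1) goto l2
    ("goto", "l3"),
    ("label", "l2"),
    ("add", "var2", "b", 1),       # var2 = b + 1
    ("copy", "a", "var2"),
    ("mul", "var3", "a", "a"),     # var3 = a * a
    ("copy", "a", "var3"),
    ("add", "i", "i", 1),          # i++
    ("goto", "l1"),
    ("label", "l3"),
    ("copy", "c", "a"),
]

def execute(program, env):
    """Interpret the statements with a program counter; return the final variables."""
    env = dict(env)
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "label"}

    def val(x):
        # Strings are variable names, anything else is a constant
        return env[x] if isinstance(x, str) else x

    pc = 0
    while pc < len(program):
        ins = program[pc]
        kind = ins[0]
        pc += 1
        if kind == "copy":
            env[ins[1]] = val(ins[2])
        elif kind == "add":
            env[ins[1]] = val(ins[2]) + val(ins[3])
        elif kind == "mul":
            env[ins[1]] = val(ins[2]) * val(ins[3])
        elif kind == "lt":
            env[ins[1]] = val(ins[2]) < val(ins[3])
        elif kind == "if":
            if val(ins[1]):
                pc = labels[ins[2]]
        elif kind == "goto":
            pc = labels[ins[1]]
        elif kind != "label":
            raise ValueError(f"unknown statement {kind}")
    return env

print(execute(program, {"n": 2})["c"])   # loop body runs twice
print(execute(program, {"n": 0})["c"])   # loop body is skipped
```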
FOR LOOP

Source:
a=3;
b=4;
for(i=0;i<n;i++){
    a=b+1;
    a=a*a;
}
c=a;

In three-address code:
a=3;
b=4;
i=0;
l1:
t1=i<n;
if(t1) goto l2;
goto l3;
l4: i++;
goto l1;
l2: t2=b+1;
a=t2;
t3=a*a;
a=t3;
goto l4;
l3: c=a;
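ARRAY ACCESS

Source:
x=a[y, z]

In three-address code (the constants imply rows of 20 elements and elements 4 bytes wide):
t1=y*20
t2=t1+z
t3=t2*4
t4=base address of a
x=t4[t3]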
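The address arithmetic in this translation is the usual row-major offset computation. The sketch below mirrors the temporaries t1 to t3; the function name element_offset and the default values (20 columns, 4-byte elements) are assumptions read off the constants in the code above.

```python
def element_offset(row, col, num_cols=20, width=4):
    """Byte offset of a[row, col] from the base address of a row-major array."""
    t1 = row * num_cols   # elements in the rows that come before this one
    t2 = t1 + col         # index of the element in the flattened array
    t3 = t2 * width       # scale by the element size in bytes
    return t3

print(element_offset(2, 3))   # (2*20 + 3) * 4
```

The value of x is then loaded from the base address of a plus this offset.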
ASSIGNMENTS

1. CONVERT INTO 3-ADDRESS CODE

x:=1;
y:=x+10;
while (x<y) {
    x:=x+1;
    if (x%2==1) then y:=y+1;
    else y:=y-2;
}

2. Explain the various intermediate forms.

3. How is 3-address code implemented in memory? Explain with examples.

4. What are the different types of intermediate code?

SEMANTIC ANALYSIS PHASE
Semantic analysis
• It adds semantic information to the parse tree and performs certain checks based on this information.
• It logically follows the parsing phase, in which the parse tree is generated.

Functionalities of Semantic Analyzer
• Type checking: The process of verifying and enforcing the constraints of types is called type checking. This may occur either at compile time (a static check) or at run time (a dynamic check). Static type checking is a primary task of the semantic analysis carried out by a compiler.
• Uniqueness checking: Whether a variable name is unique within its scope.
• Type coercion: Whether some mixing of types is allowed. This is done in languages which are not strongly typed, and it can be done dynamically as well as statically.
• Name checks: Check whether any variable has a name which is not allowed, e.g. a name that is the same as a reserved word (such as int in Java).
• Disambiguating overloaded operators: If an operator is overloaded, its meaning in each particular use must be determined here, because the code generation phase comes next.
BEYOND SYNTAX ANALYSIS
• The parser cannot catch all program errors.
• There is a level of correctness that is deeper than syntax analysis.

Example 1
string x; int y;
y=x+3
The use of x is a type error.

Example 2
int a, b;
a=b+c
c is not declared.

Example 3
float x = 10.1;
float y = x*30;
Here the integer constant 30 has to be coerced to float.
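The first two examples can be caught with a symbol table and a simple type rule. The following is a minimal sketch, not a real compiler pass: the function names are hypothetical, statements are limited to the form target = left + right, and the rule is deliberately strict (no coercion is allowed, so all three types must match).

```python
def type_of(operand, symbols, errors):
    """Look up the type of an operand; integer literals have type int."""
    if isinstance(operand, int):
        return "int"
    if operand not in symbols:
        errors.append(f"{operand} is not declared")
        return None
    return symbols[operand]

def check_assignment(target, left, right, symbols):
    """Check `target = left + right` against the symbol table; return the error messages."""
    errors = []
    types = [type_of(x, symbols, errors) for x in (target, left, right)]
    # Strict rule assumed here: the target and both operands must share one type
    if None not in types and len(set(types)) > 1:
        errors.append(f"type error: {types[0]} = {types[1]} + {types[2]}")
    return errors

# Example 1: string x; int y; y = x + 3
print(check_assignment("y", "x", 3, {"x": "string", "y": "int"}))
# Example 2: int a, b; a = b + c
print(check_assignment("a", "b", "c", {"a": "int", "b": "int"}))
```

Both programs are syntactically valid, so a parser accepts them; the errors only show up once declarations and types are consulted.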
COMPILER NEEDS TO KNOW?
Context-sensitive Grammar

CONTEXT-SENSITIVE ANALYSIS
Why is context-sensitive analysis hard?
• answers depend on values, not syntax
• questions and answers involve non-local information
• answers may involve computation

Several alternatives:
• abstract syntax tree (attribute grammars): specify non-local computations; automatic evaluators
• symbol tables: central store for facts; express checking code
• language design: simplify language; avoid problems
The plain parse tree constructed in the parsing phase is generally of no use to a compiler, as it does not carry any information about how to evaluate the tree. For example:
E → E + T

The above CFG production has no semantic rule associated with it, so on its own it cannot help in making any sense of the production.
Semantics
• Semantics of a language provide meaning to its constructs, like tokens and syntax structure.
• Semantics help interpret symbols, their types, and their relations with each other.
• Semantic analysis judges whether the syntax structure constructed in the source program derives any meaning or not.
CFG + semantic rules = Syntax Directed Definitions
Attribute Grammar
• An attribute grammar is a special form of context-free grammar where some additional information (attributes) is appended to one or more of its non-terminals in order to provide context-sensitive information.
• An attribute grammar is a medium to provide semantics to the context-free grammar, and it can help specify the syntax and semantics of a programming language.
• It can pass values or information among the nodes of a tree.

Example:
E → E + T { E.value = E.value + T.value }

Based on the way the attributes get their values, they can be broadly divided into two categories: synthesized attributes and inherited attributes.
Synthesized attributes:
These attributes get values from the attribute values of their child nodes.

S → ABC
• If S is taking values from its child nodes (A, B, C), then it is said to be a synthesized attribute, as the values of A, B and C are synthesized to S.
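The rule E.value = E.value + T.value from the attribute grammar example is synthesized: the value of a node is computed from the values of its children. A minimal sketch of bottom-up evaluation follows, assuming a hypothetical tuple encoding ("+", left, right) for E → E + T nodes and plain numbers for leaves.

```python
def value(node):
    """Compute the synthesized attribute `value` of a parse-tree node from its children."""
    if isinstance(node, (int, float)):
        return node                        # leaf: the value comes from the token itself
    op, left, right = node
    if op == "+":
        # E -> E1 + T  { E.value = E1.value + T.value }
        return value(left) + value(right)
    raise ValueError(f"unknown operator {op}")

# Parse tree for 1 + 2 + 3, grouped to the left as E -> E + T dictates
tree = ("+", ("+", 1, 2), 3)
print(value(tree))
```

The children are always evaluated before their parent, which is the order in which a bottom-up parser reduces them.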

Inherited attributes:
In contrast to synthesized attributes, inherited attributes can take values from the parent and/or siblings.

S → ABC
• A can get values from S, B and C. B can take values from S, A, and C. Likewise, C can take values from S, A, and B.
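A common textbook illustration of an inherited attribute is a declaration such as int a, b, c, where the type has to travel from the declaration down to every identifier in the list. That example is not from these slides; the grammar D → T L, L → id , L and the function name declare are assumptions used only to sketch the idea.

```python
def declare(type_name, names):
    """D -> T L : the list L inherits the type from its sibling T (L.inh = T.type)."""
    symbol_table = {}

    def visit_list(remaining, inherited_type):
        # L -> id , L1 : L1.inh = L.inh (the value is passed from parent to child)
        if not remaining:
            return
        symbol_table[remaining[0]] = inherited_type
        visit_list(remaining[1:], inherited_type)

    visit_list(names, type_name)
    return symbol_table

print(declare("int", ["a", "b", "c"]))
```

Here the information flows downwards and sideways in the tree, the opposite direction from a synthesized attribute.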
• Synthesized attributes
• Inherited attributes
Types of SDT
• S-attributed SDT
• L-attributed SDT

S-attributed SDT
• If an SDT uses only synthesized attributes, it is called an S-attributed SDT.
• S-attributed SDTs are evaluated in bottom-up parsing, as the values of the parent nodes depend upon the values of the child nodes.
• Semantic actions are placed at the rightmost place of the RHS.
Ex:
A -> XY
A.a = f(X,Y)

L-attributed SDT
• If an SDT uses both synthesized attributes and inherited attributes, with the restriction that an inherited attribute can inherit values from the parent and left siblings only, it is called an L-attributed SDT.
• Attributes in L-attributed SDTs are evaluated in a depth-first, left-to-right manner.
• Semantic actions can be placed anywhere in the RHS.
Ex:
A -> XYZ
A.a = f(X,Y,Z)
Y.b = f(X)
Z.c = f(X,Y)
A -> BC { B.s = A.s }

A -> LM { L.i = f(A.i), M.i = f(L.s), A.s = f(M.s) }

A -> QR { R.i = f(A.i), Q.i = f(R.i), A.s = f(Q.s) }
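The three productions above can be classified mechanically. The sketch below is a simplified, symbol-level check and not a full definition: the function name classify and the rule encoding are assumptions, attribute names are ignored, and any attribute assigned to a symbol on the right-hand side is treated as inherited. Such an attribute may depend only on the head and on symbols to its left.

```python
def classify(head, body, rules):
    """Classify the semantic rules of one production head -> body.

    rules is a list of (target_symbol, [source_symbols]).
    Symbol-level approximation: attribute names are not inspected.
    """
    if all(target == head for target, _ in rules):
        return "S-attributed"          # only the head's attributes are computed
    for target, sources in rules:
        if target == head:
            continue                   # synthesized attribute of the head: any child may be used
        position = body.index(target)
        allowed = {head, *body[:position]}   # parent and left siblings only
        if not set(sources) <= allowed:
            return "not L-attributed"
    return "L-attributed"

print(classify("A", ["B", "C"], [("B", ["A"])]))
print(classify("A", ["L", "M"], [("L", ["A"]), ("M", ["L"]), ("A", ["M"])]))
print(classify("A", ["Q", "R"], [("R", ["A"]), ("Q", ["R"]), ("A", ["Q"])]))
```

Under these assumptions the first two productions pass the check, while the third fails because Q.i depends on R, which stands to the right of Q.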
