THEORETICAL COMPUTER SCIENCE
For T.Y.B.Sc. Computer Science : Semester – V
[Course Code CS - 356 : Credits - 2]
CBCS Pattern
As Per New Syllabus, Effective from June 2021
Price ₹ 360.00
N5866
THEORETICAL COMPUTER SCIENCE ISBN 978-93-5451-193-6
Second Edition : September 2022
© : Author
The text of this publication, or any part thereof, should not be reproduced or transmitted in any form or stored in any
computer storage system or device for distribution including photocopy, recording, taping or information retrieval system or
reproduced on any disc, tape, perforated media or other information storage device etc., without the written permission of Author
with whom the rights are reserved. Breach of this condition is liable for legal action.
Every effort has been made to avoid errors or omissions in this publication. In spite of this, errors may have crept in. Any
mistake, error or discrepancy noted and brought to our notice shall be taken care of in the next edition. It is notified
that neither the publisher nor the author or seller shall be responsible for any damage or loss of action to anyone, of any kind, in
any manner, therefrom. The reader must cross-check all the facts and contents with original Government notification or
publications.
DISTRIBUTION CENTRES
PUNE
Nirali Prakashan Nirali Prakashan
(For orders outside Pune) (For orders within Pune)
S. No. 28/27, Dhayari Narhe Road, Near Asian College 119, Budhwar Peth, Jogeshwari Mandir Lane
Pune 411041, Maharashtra Pune 411002, Maharashtra
Tel : (020) 24690204; Mobile : 9657703143 Tel : (020) 2445 2044; Mobile : 9657703145
Email : [email protected] Email : [email protected]
MUMBAI
Nirali Prakashan
Rasdhara Co-op. Hsg. Society Ltd., 'D' Wing Ground Floor, 385 S.V.P. Road
Girgaum, Mumbai 400004, Maharashtra
Mobile : 7045821020, Tel : (022) 2385 6339 / 2386 9976
Email : [email protected]
DISTRIBUTION BRANCHES
DELHI BENGALURU NAGPUR
Nirali Prakashan Nirali Prakashan Nirali Prakashan
Room No. 2 Ground Floor Maitri Ground Floor, Jaya Apartments, Above Maratha Mandir, Shop No. 3,
4575/15 Omkar Tower, Agarwal Road No. 99, 6th Cross, 6th Main, First Floor, Rani Jhanshi Square,
Darya Ganj, New Delhi 110002 Malleswaram, Bengaluru 560003 Sitabuldi Nagpur 440012 (MAH)
Mobile : 9555778814/9818561840 Karnataka; Mob : 9686821074 Tel : (0712) 254 7129
Email : [email protected] Email : [email protected] Email : [email protected]
[email protected] | www.pragationline.com
Also find us on www.facebook.com/niralibooks
Preface …
The book has its own unique features. It brings out the subject in a very simple and lucid
manner for easy and comprehensive understanding of the basic concepts. The book covers
theory of Finite Automaton, Regular Expressions and Languages, Context-free Grammars and
Languages, Pushdown Automata and Turing Machine.
A special word of thanks to Shri. Dineshbhai Furia and Mr. Jignesh Furia for
showing full faith in me to write this text book. I also thank Mr. Amar Salunkhe and
Mr. Akbar Shaikh of M/s Nirali Prakashan for their excellent co-operation.
I also thank Ms. Chaitali Takle for Graphic Designing, Mr. Ravindra Walodare, Mr. Sachin
Shinde, Mr. Ashok Bodke, Mr. Moshin Sayyed and Mr. Nitin Thorat.
Although every care has been taken to check mistakes and misprints, any errors,
omission and suggestions from teachers and students for the improvement of this text book
shall be most welcome.
Author
Syllabus …
1. Finite Automaton (10 Lectures)
• Introduction – Symbol, Alphabet, String, Prefix and Suffix of Strings, Formal Language,
Operations on Languages.
• Deterministic Finite Automaton – Definition, DFA as Language Recognizer, DFA as Pattern
Recognizer.
• Non-deterministic Finite Automaton – Definition and Examples.
• NFA To DFA (Myhill-Nerode Method)
• NFA with ∈ – Transitions Definition and Examples.
• NFA with ∈ – Transitions to DFA and Examples
• Finite Automaton with Output – Mealy and Moore Machine, Definition and Examples.
• Minimization of DFA, Algorithm and Problem using Table Method.
2. Regular Expressions and Languages (06 Lectures)
• Regular Expressions (RE) − Definition and Examples.
• Regular Expressions Identities.
• Regular Language − Definition and Examples.
• Conversion of RE to FA − Examples.
• Pumping Lemma for Regular Languages and Applications.
• Closure Properties of Regular Languages.
3. Context-Free Grammars and Languages (10 Lectures)
• Grammar − Definition and Examples.
• Derivation − Reduction − Definition and Examples.
• Chomsky Hierarchy.
• CFG − Definition and Examples. LMD, RMD, Parse Tree.
• Ambiguous Grammar – Concept and Examples.
• Simplification of CFG − Removing Useless Symbols, Unit Production, ∈-Production and
Nullable Symbol.
• Normal Forms – Greibach Normal Form (GNF) and Chomsky Normal Form (CNF).
• Regular Grammar – Definition.
• Left Linear and Right Linear Grammar − Definition and Example.
• Equivalence of FA and Regular Grammar.
• Construction of Regular Grammar Equivalent to a given DFA.
• Construction of a FA from the given Right Linear Grammar.
4. Pushdown Automata (05 Lectures)
• Definition of PDA and Examples.
• Construction of PDA using Empty Stack and Final State Method – Examples using Stack
Method.
• Definition DPDA and NPDA, their Correlation and Examples of NPDA.
• CFG (in GNF) to PDA – Method and Examples.
5. Turing Machine (05 Lectures)
• The Turing Machine Model, Definition and Design of TM.
• Problems on Language Recognizers.
• Language Accepted by TM.
• Types of Turing Machines (Multitrack TM, Two-way TM, Multitape TM, Non-deterministic
TM).
• Introduction to LBA (Basic Model) and CSG, (Without Problems).
Contents …
Finite Automaton
Objectives …
To study Basic Concepts in Theoretical Computer Science
To learn Finite Automata
To understand Deterministic Finite Automata
To learn Non-deterministic Finite Automaton
1.0 INTRODUCTION
• Theoretical computer science is a division of general computer science and mathematics. It
focuses on more mathematical and abstract aspects of computing.
• Theoretical computer science includes the theory of computation. In this chapter, we
introduce mathematical terms necessary for understanding the automata theory.
• The language we have defined above is the formal language. The word 'formal' refers to the
fact that all the rules for the language are explicitly stated in terms of what strings of symbols
can occur, and which are the valid sentences.
• A formal language is considered as symbols on paper and not as an expression of ideas, as in a
natural language like English. The rules are called formal rules.
• The term formal used here emphasizes that it is the form of the string of symbols, we are
interested in, not the meaning.
Operations on Languages:
• In this section we study various operations on languages.
Union:
• Union of languages L1 and L2 is the language (L) containing all strings of L1 and all strings
of L2.
L = L1 ∪ L2
Examples:
(i) If L1 = {a, b} and L2 = {c, d} then
L = L1 ∪ L2 = {a, b, c, d} (finite set)
(ii) If L1 = a* b and L2 = b* then
L = L1 ∪ L2
Here, L1 = {b, ab, aab, aaab …}
L2 = {∈, b, bb, bbb, … }
L = L1 ∪ L2 = {∈, b, ab, bb, aab, bbb, aaab …} (infinite set)
(iii) If L1 = {a^m b^m | m ≥ 1} and L2 = {b^m | m ≥ 0}
Here, L1 = {ab, aabb, aaabbb …}
and L2 = {∈, b, bb, bbb …} then
L = L1 ∪ L2 = the language containing every string of the form a^m b^m (m ≥ 1) together with
every string of b*.
Intersection:
• Intersection of languages L1 and L2 is the language L containing common strings of L1 and
L2.
L = L1 ∩ L2
Examples:
(i) If L1 = {a, b} and L2 = {a, c} then
L = L1 ∩ L2 = {a}
(ii) If L1 = {aa, bb, aab} and L2 = {aa, aab} then
L = L1 ∩ L2
= {aa, aab}
(iii) If L1 = a* and L2 = b* then
L = L1 ∩ L2 = {∈} (the empty string ∈ is the only common string)
1.3
Theoretical Computer Science Finite Automaton
(iv) If L1 contains equal number of a's and b's over Σ = {a, b} and L2 = a* b* find
intersection of L1 and L2.
Here, L1 = EQUAL = {∈, ab, ba, aabb, abab, abba, baab, baba, bbaa, aaabbb …}
and L2 = {∈, a, b, ab, aabb, aaabbb …}
∴ L1 ∩ L2 = {∈, ab, aabb, aaabbb …}
L = L1 ∩ L2 = {a^n b^n | n ≥ 0}
(v) If L1 contains all strings starting with 'a' over {a, b} and L2 contains all strings ending
with 'a' over {a, b} then L = L1 ∩ L2 contains all strings that start with 'a' and end
with 'a', including the single-symbol string 'a'.
∴ L = {a, aa, aba, abba, abbaa … }
Concatenation:
• The concatenation of languages L1 and L2 is the language L containing every string formed by
writing a string of L1 followed by a string of L2, without a space between them.
L = L1 L2
Examples:
(i) L1 = {a}
L2 = {b}
L = L1 L2 = {ab}
(ii) L1 = {a, aa, aaa}
L2 = {bb, bbb}
Then, L = L1 L2 = {abb, abbb, aabb, aabbb, aaabb, aaabbb}
(iii) L1 = {a, bb, bab} and L2 = {a, ab}
L = L1 L2 = {aa, aab, bba, bbab, baba, babab}
(iv) L1 = {a, bb, bab} and L2 = {∈, bbbb}
L = L1 L2 = {a, bb, bab, abbbb, bbbbbb, babbbbb}
(v) L1 = {∈, x} and L2 = y*
Here, L2 = {∈, y, yy, yyy, …}
which is an infinite set.
L = L1 L2 = {∈, y, yy, yyy, …, x, xy, xyy, xyyy, …}
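The three operations above can be tried out on finite languages with a short Python sketch. It assumes a language is represented as a Python set of strings and the empty string ∈ as "" (the names union, intersection, concatenation and EPSILON are chosen here only for illustration). Infinite languages such as y* cannot be stored this way, so only the finite examples are reproduced.

```python
from itertools import product

EPSILON = ""  # the empty string, written ∈ in the text


def union(l1, l2):
    """All strings that belong to L1 or to L2."""
    return l1 | l2


def intersection(l1, l2):
    """All strings common to L1 and L2."""
    return l1 & l2


def concatenation(l1, l2):
    """Every string uv with u taken from L1 and v taken from L2."""
    return {u + v for u, v in product(l1, l2)}


print(sorted(union({"a", "b"}, {"c", "d"})))
print(sorted(intersection({"aa", "bb", "aab"}, {"aa", "aab"})))
print(sorted(concatenation({"a", "bb", "bab"}, {EPSILON, "bbbb"})))
```

Because ∈ is the empty string, concatenating with {∈, bbbb} keeps every string of L1 unchanged and also adds its copy followed by bbbb, as in example (iv).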
• FA consists of a finite set of states and a set of transitions from state to state that occur on
input symbol chosen from an alphabet Σ.
• Computer itself can be viewed as a finite state system. The state of the central processor,
main memory and auxiliary storage at any time is one of a very large but fixed number of
states.
• A finite automaton is an abstract model of a digital computer, that is, a formal description of the
behavior of the computer system. It has three components, namely the input tape, the control unit
and the output.
• Using the abstract model, the behavior of the actual system can be understood and the system can be
built to perform various activities.
• The pictorial/graphical presentation/block diagram of finite automata is shown in Fig. 1.1.
The FA contains:
o Input Tape: It is divided into a number of cells (blocks or squares). Each cell contains a
single symbol from the input alphabet Σ.
o Finite State Control: The finite automaton has a finite set of states, one of which is the start
state, designated q0, and at least one of which is a final state. The remaining states are
denoted by q1, q2, q3 … qn. The tape reader reads the cells one by one from left to right,
one input symbol at a time. Based on the current input symbol, the state
can change.
o Output: The output of the FA is accept or reject, depending on the input. When the end of
the input is reached, the control unit is in either an accepting or a rejecting state.
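[Figure: block diagram of a finite automaton with an input tape, a finite state control (states q0 to q7) and an output (accept/reject)]
[Fig. 1.3: automaton with states OFF (start) and ON and the input Push]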
• Finite Automaton can be classified into two types namely, Deterministic Finite Automaton
(DFA) and Non-deterministic Finite Automaton (NFA).
• In DFA, for each input symbol, one can determine the state to which the machine will move.
Hence, it is called Deterministic Automaton.
• As it has a finite number of states, the machine is called Deterministic Finite
Machine or Deterministic Finite Automaton.
• In NFA, for a particular input symbol, the machine can move to any combination of the states
in the machine. In other words, the exact state to which the machine moves cannot be
determined. Hence, it is called Non-deterministic Automaton.
• As it has finite number of states, the machine is called Non-deterministic Finite
Machine or Non-deterministic Finite Automaton.
o The transition equation δ(q3, 0) = q2, shows that there is a transition from state q3 to q2 on
input 0.
o The transition equation δ(q3, 1) = q3, shows that there is a transition from state q3 to q3 on
input 1.
3. Transition Table: A DFA can also be represented by a transition table. A transition table is
the tabular representation of the transition system of the automaton. A transition table is also
known as a transition function table or state table.
Example: The transition diagram and its equivalent transition table are shown below:
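[Figure: a transition diagram and its equivalent transition table; the rows of the table are states and the columns are input symbols]
[Fig. 1.4: transition diagram of a DFA with start state q0 and states q1 and q2 over {0, 1}]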
6. Transition Function:
• A DFA may also be represented by its transition function δ. The transition function of the DFA in
Fig. 1.4 is as follows:
δ(q0, 0) = q1
δ(q0, 1) = q0
δ(q1, 0) = q1
δ(q1, 1) = q2
δ(q2, 0) = q2
δ(q2, 1) = q2
• Here, the first argument of the transition function δ represents the present state, the second
argument represents the input symbol and the right hand side represents the next state.
For example, in the equation δ(q0, 0) = q1, q0 is the present state, 0 is the input symbol and q1
is the next state.
• The DFA accepts a string x if the sequence of transitions corresponding to the symbols of x
leads from start state to a final state. [Oct. 18]
• Finite control representation of DFA is shown in Fig. 1.5.
[Fig. 1.7: DFA as a Recognizer (the source string is the input to the DFA)]
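The transition function listed for Fig. 1.4 and the acceptance rule stated above can be sketched in Python as follows. The dictionary DELTA copies the six equations from the text. The set FINALS is an assumption made only for this sketch, because the text does not list the final states of Fig. 1.4.

```python
# Transition function of the DFA of Fig. 1.4, copied from the equations in the text.
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}

# ASSUMPTION: the final states of Fig. 1.4 are not given in the text;
# q2 is used here purely for illustration.
FINALS = {"q2"}


def run(delta, start, string):
    """Return the state reached from `start` after reading `string`."""
    state = start
    for symbol in string:
        state = delta[(state, symbol)]
    return state


def accepts(delta, start, finals, string):
    """A DFA accepts a string if the run ends in a final state."""
    return run(delta, start, string) in finals


print(run(DELTA, "q0", "0011"))
print(accepts(DELTA, "q0", FINALS, "01"))
```

Each step looks up exactly one next state, which is what makes the automaton deterministic.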
• Let us consider a DFA given in Fig. 1.8.
[Fig. 1.8: transition diagram of a DFA with states q0, q1, q2 and q3 over {0, 1}]
• Fig. 1.9 shows the processing of the strings 1, 00, 01, 010, 011, 1010 and 0100. The processing of the two
strings 00 and 0101 terminates in q0, which is the final state, and the remaining strings terminate
in a non-final state. Therefore, the strings 00 and 0101 are accepted and the rest are not accepted.
q0 →(0) q1 →(1) q2 →(0) q3 →(1) q0 (Accepted)
1 0 0 1
q3 q0 q1 q2
(Accepted)
[Fig. 1.10: Skeleton DFA with states named q0 (start) and q1 (accept) and a transition from q0 to q1 on 'a']
Step 4 : Identify the transitions not defined in Step 3:
Step (i) : δ(q1, a) = ? Move q0 to q1 on ‘a’, then think of transition from q1 on 'a'.
δ(q1, a) = q1
[Fig. 1.11: q0 moves to q1 on 'a'; the transition from q1 on 'a' returns to q1]
Step 5 : Construct the DFA. The DFA can be obtained by skeleton DFA and transitions
obtained from previous step. The DFA is defined as:
M = (Q, Σ, δ, q0, F)
where,
o Q = {q0, q1}
o Σ = {a}
o q0 is the start state
o F = {q1}
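o δ is shown in Fig. 1.12 using the transition diagram and table
[Fig. 1.12: transition diagram with start state q0, accepting state q1, a transition from q0 to q1 on 'a' and a loop on q1 on 'a']
δ a
→ q0 q1
* q1 q1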
• The string is accepted by the NFA, because it leads from the initial state to a final state (q4), i.e.
q0 →(0) q0 →(1) q0 →(0) q3 →(0) q4 →(1) q4
Definition of NFA:
• NFA is denoted by 5-tuple (Q, Σ, δ, q0, F), where
o Q is a finite set of states.
o Σ is a finite set of symbols called the alphabet.
o δ is the transition function, where δ: Q × Σ → 2^Q (here the power set of Q, written 2^Q, is
taken because in an NFA a transition from a state can lead to any combination of
states of Q).
o q0 is the initial state from where any input is processed (q0 ∈ Q).
o F is a set of final state/states of Q (F ⊆ Q).
• The function δ can be extended to a function δ̂ mapping Q × Σ* to 2^Q, defined as
follows:
1. δ̂(q, ∈) = {q}
2. δ̂(q, wa) = {p | for some state r in δ̂(q, w), p is in δ(r, a)}. This means that, starting in state q
and reading the input string w followed by the input symbol a, the NFA can be in state p if one possible
state it can be in after reading w is r, and from r it may go to p upon reading a. Note
that δ̂(q, a) = δ(q, a). Thus we may write δ in place of δ̂.
3. It is also useful to extend δ to arguments in 2^Q × Σ* by δ(P, w) = ∪ δ(q, w), the union being
taken over all q in P, for each set of states P ⊆ Q.
EXAMPLES
Example 1: Consider NFA of Fig. 1.10 whose transition function δ is as shown in Table 1.8.
Table 1.8
Inputs
δ 0 1
q0 {q0, q3} {q0, q1}
q1 φ {q2}
q2 {q2} {q2}
q3 {q4} φ
q4 {q4} {q4}
Let the input be 01001.
δ (q0, 0) = {q0, q3}
δ (q0, 01) = δ (δ (q0, 0), 1) = δ ({q0, q3}, 1)
= δ (q0, 1) ∪ δ (q3, 1) = {q0, q1}
δ (q0, 010) = δ (δ (q0, 01), 0) = δ ({q0, q1}, 0)
= δ (q0, 0) ∪ δ (q1, 0)
= {q0, q3} ∪ φ
= {q0, q3}
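The same computation can be checked mechanically. The sketch below stores Table 1.8 as a Python dictionary and applies the extended transition function one symbol at a time to a set of states (the function name delta_hat is chosen here for illustration). Run on the full input 01001, it yields a set that contains q4, the final state mentioned earlier.

```python
# Table 1.8: NFA transition function, each entry is a set of next states.
DELTA = {
    ("q0", "0"): {"q0", "q3"}, ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): set(),        ("q1", "1"): {"q2"},
    ("q2", "0"): {"q2"},       ("q2", "1"): {"q2"},
    ("q3", "0"): {"q4"},       ("q3", "1"): set(),
    ("q4", "0"): {"q4"},       ("q4", "1"): {"q4"},
}


def delta_hat(states, string):
    """Set of states the NFA can be in after reading `string` from any state in `states`."""
    current = set(states)
    for symbol in string:
        following = set()
        for q in current:
            following |= DELTA[(q, symbol)]  # union over all current states
        current = following
    return current


for prefix in ("0", "01", "010", "0100", "01001"):
    print(prefix, sorted(delta_hat({"q0"}, prefix)))
```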
[Fig. 1.22: Equivalent DFA with states [q0] (start), [q1] and [q0, q1]]
The machine can have only one initial state but more than one final state.
Note: The final states of the DFA are those subsets which include at least one final state of the NFA.
Example 2: Find a deterministic acceptor equivalent to
M = ({q0, q1, q2}, {a, b}, δ, q0, {q2})
Where, δ is as given by Table 1.12.
Table 1.12: State table
State/Σ a b
→ q0 {q0, q1} {q2}
q1 {q0} {q1}
q2 φ {q0, q1}
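A minimal sketch of the subset construction applied to Table 1.12 is given below. It builds only the subsets reachable from [q0], which is a common shortcut rather than the full power set 2^Q, and the function and variable names are illustrative. The empty set φ, when reached, plays the role of a dead state.

```python
from collections import deque

# Table 1.12: NFA transition function of Example 2.
NFA_DELTA = {
    ("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q2"},
    ("q1", "a"): {"q0"},       ("q1", "b"): {"q1"},
    ("q2", "a"): set(),        ("q2", "b"): {"q0", "q1"},
}
ALPHABET = ("a", "b")
NFA_FINALS = {"q2"}


def subset_construction(delta, alphabet, start, finals):
    """Return (states, transitions, start, finals) of the DFA reachable from [start]."""
    start_set = frozenset({start})
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of the NFA moves of every state in the current subset.
            target = frozenset(p for q in current for p in delta[(q, symbol)])
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is final if it contains at least one final state of the NFA.
    dfa_finals = {s for s in seen if s & finals}
    return seen, dfa_delta, start_set, dfa_finals


states, dfa_delta, dfa_start, dfa_finals = subset_construction(
    NFA_DELTA, ALPHABET, "q0", NFA_FINALS
)
for (subset, symbol), target in sorted(
    dfa_delta.items(), key=lambda item: (sorted(item[0][0]), item[0][1])
):
    print(sorted(subset), symbol, "->", sorted(target))
```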
[Fig. 1.23: equivalent DFA with states [q0] (start), [q0, q1], [q1, q2], [q2] and [q1] and transitions on a and b]
Example 3: Construct a deterministic finite automaton equivalent to
M = ({q0, q1, q2, q3}, {a, b}, δ, q0, {q3})
where δ is given by Table 1.14.
Table 1.14: State table for Example 3
State/Σ a b
→ q0 {q0, q1} {q0}
q1 {q2} {q1}
q2 {q3} {q3}
q3 φ {q2}
Solution: Let Q = {q0, q1, q2, q3}. Then the deterministic automaton M1 equivalent to M is
given by M1 = (2^Q, {a, b}, δ, [q0], F)
where F consists of:
[q3], [q0, q3], [q1, q3], [q2, q3], [q0, q1, q3], [q0, q2, q3], [q1, q2, q3]
and [q0, q1, q2, q3]
and where δ is defined by the state table given by Table 1.15.
1.6 NFA WITH ∈-TRANSITIONS [April 16, 17, 18, 19, Oct. 18]
• It is an NFA that includes transitions on the empty input, i.e. '∈' (epsilon). We can extend an NFA
by introducing '∈-moves' that allow us to make a transition on the empty string.
• There would be an edge labeled ∈ between two states, and this edge allows a transition from
one state to another even without receiving an input symbol. This is another mechanism that
allows an NFA to be in multiple states at once.
• Constructing such an NFA is easy, but ∈-moves add no extra power: every NFA with ∈-moves has
an equivalent NFA without ∈-moves. The NFA
with ∈-moves is given by M = (Q, Σ, δ, q0, F) where δ is defined as Q × (Σ ∪ {∈}) → 2^Q.
Definition:
• NFA with ∈-moves is denoted by a 5-tuple (Q, Σ, δ, q0, F),
where Q : finite set of states
Σ : finite input alphabet
q0 : initial state contained in Q
F : set of final states ⊆ Q
and δ : transition function or state function mapping
Q × (Σ ∪ {∈}) to 2^Q
i.e. δ : Q × (Σ ∪ {∈}) → 2^Q
Basically, it is an NFA with a few transitions on the empty input '∈'.
Note : DFA can never have ∈-moves.
Example:
• Let us consider the FA shown in Fig. 1.24. This FA accepts a language consisting any
number of 0's followed by any number of 1's followed by any number of 2's.
• For example, the strings ∈, 0, 00, 01, 1, 11, 2, 22, 12, 122, 012, 0012, etc. are all valid strings
of this language.
• Now, find δ̂ for the resultant NFA.
δ̂(q0, 0) = ∈-closure (δ (δ̂(q0, ∈), 0))
= ∈-closure (δ ({q0, q1, q2}, 0))
= ∈-closure (δ (q0, 0) ∪ δ (q1, 0) ∪ δ (q2, 0))
= ∈-closure ({q0} ∪ φ ∪ φ)
= ∈-closure ({q0}) = {q0, q1, q2}
δ̂(q0, 1) = ∈-closure (δ (δ̂(q0, ∈), 1))
= ∈-closure (δ ({q0, q1, q2}, 1))
= ∈-closure (δ (q0, 1) ∪ δ (q1, 1) ∪ δ (q2, 1))
= ∈-closure (φ ∪ {q1} ∪ φ)
= ∈-closure (q1)
= {q1, q2}
δ̂(q0, 2) = ∈-closure (δ (δ̂(q0, ∈), 2))
= ∈-closure (δ ({q0, q1, q2}, 2))
= ∈-closure (δ (q0, 2) ∪ δ (q1, 2) ∪ δ (q2, 2))
= ∈-closure (φ ∪ φ ∪ {q2})
= ∈-closure (q2)
= {q2}
δ̂(q1, 0) = ∈-closure (δ (δ̂(q1, ∈), 0))
= ∈-closure (δ ({q1, q2}, 0))
= ∈-closure (δ (q1, 0) ∪ δ (q2, 0))
= ∈-closure (φ ∪ φ)
= ∈-closure (φ)
= φ
δ̂(q1, 1) = ∈-closure (δ (δ̂(q1, ∈), 1))
= ∈-closure (δ ({q1, q2}, 1))
= ∈-closure (δ (q1, 1) ∪ δ (q2, 1))
= ∈-closure ({q1} ∪ φ)
= ∈-closure (q1)
= {q1, q2}
• Similarly, we can obtain,
δ̂(q1, 2) = {q2}
δ̂(q2, 0) = φ
δ̂(q2, 1) = φ
δ̂(q2, 2) = {q2}
• The resultant δ̂ is as shown in the transition Table 1.17.
Table 1.17
δ 0 1 2
q0 {q0, q1, q2} {q1, q2} {q2}
q1 φ {q1, q2} {q2}
q2 φ φ {q2}
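Table 1.17 can be reproduced with a short sketch of ∈-closure and δ̂. The moves on 0, 1 and 2 are taken from the computation above. The two ∈-moves (q0 to q1 and q1 to q2) are an assumption inferred from the ∈-closures used in that computation, since Fig. 1.24 is not reproduced here. EPS is simply the label chosen in this sketch for ∈.

```python
EPS = ""  # label used in this sketch for an ∈-move

DELTA = {
    ("q0", "0"): {"q0"},
    ("q1", "1"): {"q1"},
    ("q2", "2"): {"q2"},
    # ASSUMPTION: ∈-moves inferred from ∈-closure(q0) = {q0, q1, q2}
    # and ∈-closure(q1) = {q1, q2}.
    ("q0", EPS): {"q1"},
    ("q1", EPS): {"q2"},
}


def eps_closure(states):
    """All states reachable from `states` using only ∈-moves (including themselves)."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for p in DELTA.get((q, EPS), set()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure


def delta_hat(q, symbol):
    """∈-closure(δ(∈-closure(q), symbol)), as in the worked computation."""
    moved = set()
    for r in eps_closure({q}):
        moved |= DELTA.get((r, symbol), set())
    return eps_closure(moved)


for state in ("q0", "q1", "q2"):
    print(state, [sorted(delta_hat(state, s)) for s in "012"])
```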
δ̂(q0, b) = ∈-closure (δ (δ̂(q0, ∈), b))
= ∈-closure (δ (q0, b) ∪ δ (q1, b))
= ∈-closure (q2 ∪ φ)
= ∈-closure (q2)
= {q1, q2}
δ̂(q1, a) = ∈-closure (δ (δ̂(q1, ∈), a))
= ∈-closure (δ (q1, a))
= ∈-closure (q1)
= {q1}
δ̂(q1, b) = ∈-closure (δ (δ̂(q1, ∈), b))
= ∈-closure (δ (q1, b))
= ∈-closure (φ)
= φ
δ̂(q2, a) = ∈-closure (δ (δ̂(q2, ∈), a))
= ∈-closure (δ (q1, a) ∪ δ (q2, a))
= ∈-closure (q1 ∪ φ)
= {q1}
δ̂(q2, b) = ∈-closure (δ (δ̂(q2, ∈), b))
= ∈-closure (δ (q1, b) ∪ δ (q2, b))
= ∈-closure (φ ∪ q2)
= {q1, q2}
Therefore equivalent NFA is as shown in the Fig. 1.27.
Table 1.19: Transition table
δ a b
q0 {q1} {q1, q2}
q1 {q1} φ
q2 {q1} {q1, q2}
Solution: The initial state of the equivalent DFA is ∈-closure (q0) = {q0, q1, q2, q4, q7} = A
(say), since these are exactly the states reachable from state q0 via a path in which every edge is
labeled ∈, and q0 itself is in ∈-closure (q0).
Here we mark set A, i.e. set x = A, and compute T, the set of states of N reached by transitions
on 'a' from members of A. Among the states q0, q1, q2, q4 and q7, only states q2 and q7 have
such transitions, to q3 and q8 respectively, so
y = ∈-closure ({q3, q8}) = {q1, q2, q3, q4, q6, q7, q8}
= B (say)
Now on input symbol 'b', from the members of A, q5 is the only reachable state.
∴ y = ∈-closure ({q5}) = {q1, q2, q4, q5, q6, q7}
= C (say)
Now, we continue this process with the unmarked sets B and C.
From B on input 'a' we get,
y = ∈-closure ({q3, q8}) = B
From B on input 'b' we get,
y = ∈-closure ({q5, q9} )
= {q1, q2, q4, q5, q6, q7, q9}
= D (say)
From C on input 'a' we get,
y = ∈-closure ({q3, q8}) = B
From C on input 'b' we get,
y = ∈-closure ({q5}) = C
From D on input 'a' we get,
y = ∈-closure ({q3, q8}) = B
From D on input 'b' we get,
y = ∈-closure ({q5, q10})
= {q1, q2, q4, q5, q6, q7, q10}
= E (say)
From E on input 'a' we get,
y = ∈-closure ({q3, q8}) = B
From E on input 'b' we get,
y = ∈-closure ({q5}) = C
The transition table of the resulting DFA is shown in Table 1.20. This DFA accepts
the same language as the NFA and has fewer states.
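The transitions derived step by step above (from A, B, C, D and E on 'a' and 'b') can be collected into a table and traced as in the following sketch. Only the transitions computed in the solution are used. No accepting state is marked, because the final state of the original NFA is not stated in this part of the text.

```python
# DFA transitions exactly as derived in the solution above.
DFA = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}


def trace(string, start="A"):
    """List of DFA states visited while reading `string`, beginning with the start state."""
    path = [start]
    for symbol in string:
        path.append(DFA[path[-1]][symbol])
    return path


print(" -> ".join(trace("abb")))
print(" -> ".join(trace("babb")))
```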
[Fig. 1.35: NFA with states p (start), q, r and s over {0, 1}]
Solution: Initial state of DFA is p = A (say).
A →(0) {q, r, s} = B
A →(1) {p, q, r} = C
B →(0) {r, s} = D (∵ q on 0 = r and r on 0 = s)
B →(1) {p, q, r} = C
C →(0) {q, r, s} = B
C →(1) {p, q, r} = C
D →(0) {s} = E
D →(1) {p} = A
E →(0) φ
E →(1) {p} = A
[Fig. 1.36: Equivalent DFA with states A (start), B, C, D and E]
Example 1: Construct a Moore machine to determine the residue (remainder) mod 3 for a
binary number.
Solution: (i) If 'i' is a binary number and if we write '0' after 'i' then its value becomes '2i'.
e.g. Consider binary number
i = '1' (value = 1)
If we write '0' after it,
'10' = 2 i.e. value = 2 × 1
As another example, if we consider
i = '100' (value = 4)
then i.0 = 1000 has value = 8
= 2×4
(ii) If we write '1' after 'i', where 'i' is any binary number, its value becomes '2i + 1'.
e.g. If i = '1' (value = 1)
i.1 = '11' = 3 i.e. value = 2 × 1 + 1 = 3
As another example, if
i = '100' (value 4)
i.1 = 1001 = 9
i.e. value = 2 × 4 + 1 = 9
As we are constructing a machine to determine remainder, when we divide any binary
number by 3, the different remainder values that we can have are 0, 1 and 2.
Table 1.24: Different remainder values
Remainder value (R) | 0 | 1 | 2
When we write '0' after i (which has remainder R) and divide by 3, the remainder = 2R mod 3 | 0 | 2 | 1
When we write '1' after i (which has remainder R) and divide by 3, the remainder = (2R + 1) mod 3 | 1 | 0 | 2
In above Table 1.24, if say, the remainder of previous result is '2' i.e. '10' and if we write '0'
after it, it becomes '4' i.e. '100'. Now if we divide it by '3' the remainder will be '1'.
To construct a Moore machine, we can associate three different residue values with three
different states; i.e. '0' residue value with state 'q0', residue '1' with 'q1' and so on as described in
machine function stated in Table 1.25.
Table 1.25: Machine function for Moore machine λ: Q → ∆
State Q q0 q1 q2
Output ∆ 0 1 2
Now, Table 1.24 can be viewed as the state table or transition table for the required Moore
machine, as given in Table 1.26, and the resultant Moore machine is given in the Fig. 1.37.
Table 1.26: Transition table for required Moore machine δ: Q × Σ → Q
δ 0 1 Output λ
q0 q0 q1 0
q1 q2 q0 1
q2 q1 q2 2
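Tables 1.25 and 1.26 can be simulated directly. The sketch below is one possible encoding, with the dictionary and function names chosen for illustration. It prints the output sequence of the Moore machine, whose last symbol is the residue mod 3 of the binary number read.

```python
# Table 1.26: transition function, and Table 1.25: output function.
DELTA = {
    "q0": {"0": "q0", "1": "q1"},
    "q1": {"0": "q2", "1": "q0"},
    "q2": {"0": "q1", "1": "q2"},
}
OUTPUT = {"q0": 0, "q1": 1, "q2": 2}


def moore_outputs(bits, start="q0"):
    """Outputs of the Moore machine: one for the start state, then one per input bit."""
    state = start
    outputs = [OUTPUT[state]]
    for bit in bits:
        state = DELTA[state][bit]
        outputs.append(OUTPUT[state])
    return outputs


def residue_mod_3(bits):
    """The last output is the remainder of the whole binary number."""
    return moore_outputs(bits)[-1]


print(moore_outputs("1001"))
print(residue_mod_3("1001"))
```

A Moore machine produces n + 1 outputs for an input of length n, because the start state already emits its output before any bit is read.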
Fig. 1.37: Moore Machine finding residue mod 3 for Binary Number
Example 2: Design a Moore machine for a language over {0, 1} which outputs '*' if string
contains '11' in it and outputs '#' otherwise.
Solution:
Fig. 1.38
Table 1.27: Transition table
δ Output
Q 0 1 λ
q0 q0 q1 #
q1 q0 q2 #
q2 q2 q2 *
Example 3: Design a Moore machine to get 1's complement of a given binary string.
Solution:
Fig. 1.39
Table 1.28: Transition table
δ Output
Q 0 1 λ
q0 q2 q1 0
q1 q2 q1 0
q2 q2 q1 1
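Tables 1.27 and 1.28 have the same shape, so a single generic routine can run both machines. The following sketch assumes the tables are encoded as nested dictionaries, and the names run_moore, DETECT_11 and COMPLEMENT are illustrative. For the 1's complement machine, the first output symbol belongs to the start state q0 and is dropped to obtain the complemented string.

```python
def run_moore(delta, output, start, string):
    """Return the full output string of a Moore machine (start-state output included)."""
    state = start
    produced = [output[state]]
    for symbol in string:
        state = delta[state][symbol]
        produced.append(output[state])
    return "".join(produced)


# Table 1.27: outputs '*' once the substring 11 has been seen, '#' otherwise.
DETECT_11 = (
    {"q0": {"0": "q0", "1": "q1"},
     "q1": {"0": "q0", "1": "q2"},
     "q2": {"0": "q2", "1": "q2"}},
    {"q0": "#", "q1": "#", "q2": "*"},
)

# Table 1.28: 1's complement of a binary string.
COMPLEMENT = (
    {"q0": {"0": "q2", "1": "q1"},
     "q1": {"0": "q2", "1": "q1"},
     "q2": {"0": "q2", "1": "q1"}},
    {"q0": "0", "q1": "0", "q2": "1"},
)


def ones_complement(bits):
    """Drop the start-state output; the rest is the complemented string."""
    delta, output = COMPLEMENT
    return run_moore(delta, output, "q0", bits)[1:]


print(run_moore(*DETECT_11, "q0", "0110"))
print(ones_complement("1010"))
```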
Fig. 1.40
If in 'P0' we get '0', the machine should remain in the same state and should produce output 'y', as we got
double zeros. It should remain in the same state because the string may be of the form "0000".
Similarly, in 'P1' if it gets '1', it should go to itself with output 'y', indicating it has got double '1's.
This state is reflected in Fig. 1.40 (b).
Now, in 'P0' if we get '1', the machine should make a transition to 'P1', which will be checking for a
second consecutive '1', and in 'P1' if we get '0', it should make the transition from 'P1' to 'P0', which
will be looking for a second consecutive zero. Here ends the process of considering transitions on
both '0' and '1' from every state. Fig. 1.40 (c) shows the final Mealy machine with the given
requirements.
Example 2: Design a Mealy machine to find out 2's complement of a given binary number.
Solution: Steps to be followed are:
(i) Read bit by bit from LSB.
(ii) Keep the bits unchanged till you get first '1'; do not replace this '1' by '0'. Keep that also
unchanged.
(iii) The remaining bits are to be changed from '0' to '1' and from '1' to '0'.
The design here requires again two states Q = {q0, q1} with q0 as initial state, 'q0' will read '0's
without replacing them till it gets first '1'. On getting '1', it makes a transition to 'q1' which now
replaces each coming '0' by '1' and '1' by '0'.
The Mealy machine can be drawn as in Fig. 1.41.
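The three steps above translate into a two-state Mealy machine that can be sketched as follows. The table MEALY encodes the behaviour described in the text (copy bits up to and including the first '1', then invert). Since the machine reads from the LSB, the sketch reverses the string before and after processing, and the function name is illustrative.

```python
# (state, input bit) -> (next state, output bit)
MEALY = {
    ("q0", "0"): ("q0", "0"),  # leading 0s from the LSB are copied
    ("q0", "1"): ("q1", "1"),  # the first 1 is copied, then move to q1
    ("q1", "0"): ("q1", "1"),  # every later bit is inverted
    ("q1", "1"): ("q1", "0"),
}


def twos_complement(bits):
    """2's complement of `bits` (written MSB first), computed LSB first."""
    state = "q0"
    out = []
    for bit in reversed(bits):
        state, produced = MEALY[(state, bit)]
        out.append(produced)
    return "".join(reversed(out))


print(twos_complement("0100"))
print(twos_complement("0110"))
```

Unlike a Moore machine, the Mealy machine emits exactly one output symbol per input symbol, so the result has the same length as the input.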
Note: Consider both Moore and Mealy machines shown in Fig. 1.42, also consider the input
sequence "1010" of length = 4 i.e., length (input string) = n = 4.
(i) Output sequence for Moore machine is,
Similarly,
(d) δ' ([p0, y], 0) = [p0, y]
δ' ([p0, y], 1) = [p1, n]
λ' ([p0, y]) = y
(e) δ' ([p1, n], 0) = [δ (p1, 0), λ (p1, 0)]
= [p0, n]
δ' ([p1, n], 1) = [δ (p1, 1), λ (p1, 1)]
= [p1, y]
λ' ([p1, n]) = n
Similarly,
(f) δ' ([p1, y], 0) = [p0, n]
δ' ([p1, y], 1) = [p1, y]
λ' ([p1, y]) = y
• The transition graph for the equivalent Moore machine is as shown in the Fig. 1.43 (b). We can
re-label the states if we want. Out of [q0, n] and [q0, y], we have to arbitrarily select one state
as the initial state, and here we have chosen [q0, n] as the initial state.
• We can remove state [q0, y] as there are no incoming edges to this state. After removing this
state and renaming the remaining states, we get the final Moore machine as shown in the
Fig. 1.43 (c).
Mealy Machine vs. Moore Machine: [April 17]
Sr. No. | Mealy Machine | Moore Machine
1. | In a Mealy machine the output depends on the present state as well as the present input. | In a Moore machine the output depends only upon the present state.
2. | Generally, it has fewer states than the equivalent Moore machine. | Generally, it has more states than the equivalent Mealy machine.
3. | In a Mealy machine, if the input changes, the output also changes. | In a Moore machine, if the input changes, the output does not change until the state changes.
4. | They react faster to inputs. | They react slower to inputs (one clock cycle later).
5. | It is difficult to design. | It is easy to design.
6. | In a Mealy machine the output is placed on transitions. | In a Moore machine the output is placed on states.
Fig. 1.44
Solution:
δ a b
q0 q1 q2
q1 q1 q3
q2 q0 q1
q3 q1 q4
q4 q3 q1
Here q2 and q4 are final states. Therefore, mark X at entries (q2, q0), (q2, q1), (q3, q2), (q4, q0),
(q4, q1) and (q4, q3) as shown in Table 1.32.
Table 1.32
δ (q2, a) = q0 ∉F δ(q2, b) = q1 ∉ F
∴ q2, q0 are distinguishable
δ (q0, a) = q1 ∉ F δ (q0, b) = q2 ∈F
Once we mark entry X for final with nonfinal state i.e. (final, nonfinal); then
First we find D0. Table 1.33 shows D0. In this we mark entry (p, q) as X if pD0 q i.e. either q
∈ F and p ∉ F or q ∉ F and p ∈ F.
Check for those states which are not distinguishable using D1. First we copy the distinguishable
states, i.e. those marked as X in D0, from Table 1.33. Now q0 is not yet distinguished from q1, so we
check q0 D1 q1. As δ (q0, b) ∈ F and δ (q1, b) ∉ F, q0 is distinguishable from q1 with a string of length
1, i.e. q0 D1 q1. Also, δ (q1, b) ∉ F and δ (q3, b) ∈ F, so q1 D1 q3 as shown in the Table 1.34.
Table 1.33
We do not get any other distinguishable pair with D1. We must repeat this till table
Di–1 = table Di. As D0 ≠ D1, we find D2, i.e. with strings of length 2. As Σ = {a, b}, the possible strings
are aa, ab, bb and ba. Now, again copy the distinguishable states from D1. The table D2 remains the same
as D1. Hence, we get [q0 q3] and [q2 q4] as equivalent states. Thus the minimal DFA is as shown in
the Fig. 1.45.
Fig. 1.45
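The table method used in this solution can be sketched in Python as below. The sketch marks pairs repeatedly until nothing changes, instead of building the tables D0, D1, D2 level by level, so it is a variant of the procedure in the text that should give the same final set of marked pairs. All names are illustrative.

```python
from itertools import combinations

STATES = ["q0", "q1", "q2", "q3", "q4"]
ALPHABET = ["a", "b"]
DELTA = {
    "q0": {"a": "q1", "b": "q2"},
    "q1": {"a": "q1", "b": "q3"},
    "q2": {"a": "q0", "b": "q1"},
    "q3": {"a": "q1", "b": "q4"},
    "q4": {"a": "q3", "b": "q1"},
}
FINALS = {"q2", "q4"}


def distinguishable_pairs(states, alphabet, delta, finals):
    """Table-filling: return every pair of states marked X."""
    # D0: one state final, the other non-final.
    marked = {
        frozenset((p, q))
        for p, q in combinations(states, 2)
        if (p in finals) != (q in finals)
    }
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for symbol in alphabet:
                successors = frozenset((delta[p][symbol], delta[q][symbol]))
                # Mark (p, q) if some symbol leads to an already marked pair.
                if len(successors) == 2 and successors in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked


def equivalent_pairs(states, alphabet, delta, finals):
    """Pairs left unmarked are equivalent and can be merged."""
    marked = distinguishable_pairs(states, alphabet, delta, finals)
    return [
        (p, q) for p, q in combinations(states, 2)
        if frozenset((p, q)) not in marked
    ]


print(equivalent_pairs(STATES, ALPHABET, DELTA, FINALS))
```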
We conclude that the equivalent states are a = e, b = h and d = f, since the pairs (a, e), (b, h) and (d, f)
are not distinguishable. Hence, the minimal finite automaton is as
shown in the Fig. 1.47.
[Fig. 1.58: DFA]
Table 1.42: Transition table
δ 0 1 2
q0 q1 q2 q3
q1 q4 q2 q3
q2 q1 q4 q3
q3 q1 q2 q4
q4 q4 q4 q4
Example 11: Construct a DFA over {0, 1} such that every even position is occupied by '0'
and odd by '1'.
Solution:
Example 19: Design a DFA which checks whether a given decimal number is even or not using
its binary representation.
Solution: The given decimal number is even when its binary representation has '0'
at the LSB, e.g. 100, 110, etc.
Now, when the number is entered from the LSB, we can have a DFA as shown below.
M = (Q, Σ, δ, S, F0) such that
Q = {q0, q1} , Σ = {0, 1}
S = {q0} , F0 = {q1}
L (M) = {∈, 0, 01, 0101, 0111, 01001, ……}
M = (Q, Σ, δ, q0, F)
where Q = {q0, q1, q2},
Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
δ : Q × Σ → Q
q0 is the start state
F = {q0}
Table 1.51: Transition table
δ 0 1 2 3 4 5 6 7 8 9
q0 q0 q1 q2 q0 q1 q2 q0 q1 q2 q0
q1 q1 q2 q0 q1 q2 q0 q1 q2 q0 q1
q2 q2 q0 q1 q2 q0 q1 q2 q0 q1 q2
δ̂(q, ∈) = q and δ̂(q, wa) = δ (δ̂(q, w), a)
Consider 873,
δ̂(q0, 8) = q2
δ̂(q0, 87) = δ (δ̂(q0, 8), 7) = δ (q2, 7) = q0
δ̂(q0, 873) = δ (δ̂(q0, 87), 3) = δ (q0, 3) = q0
∴ q0 ∈ F ∴ String 873 is accepted by DFA.
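Table 1.51 can be run as it stands. In the sketch below each row of the table is stored as a list indexed by the digit read. Judging from the entries, the state appears to record the remainder mod 3 of the digits read so far, which is an observation made here rather than a statement from the text. The name accepts is illustrative.

```python
# Table 1.51: row for each state, indexed by the decimal digit 0..9.
ROWS = {
    "q0": ["q0", "q1", "q2", "q0", "q1", "q2", "q0", "q1", "q2", "q0"],
    "q1": ["q1", "q2", "q0", "q1", "q2", "q0", "q1", "q2", "q0", "q1"],
    "q2": ["q2", "q0", "q1", "q2", "q0", "q1", "q2", "q0", "q1", "q2"],
}
START = "q0"
FINALS = {"q0"}


def accepts(digits):
    """Run the DFA of Table 1.51 on a string of decimal digits."""
    state = START
    for digit in digits:
        state = ROWS[state][int(digit)]
    return state in FINALS


print(accepts("873"))
print(accepts("874"))
```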
Example 21: Design a DFA which checks whether a binary number is divisible by 5 or not
and also show the string 1110011 is accepted.
Solution:
δ̂(q, ∈) = q
and δ̂(q, wa) = δ (δ̂(q, w), a)
Consider string 1110011.
δ̂(q0, 1) = q1
δ̂(q0, 11) = δ (δ̂(q0, 1), 1) = δ (q1, 1) = q3
δ̂(q0, 111) = δ (δ̂(q0, 11), 1) = δ (q3, 1) = q2
δ̂(q0, 1110) = δ (δ̂(q0, 111), 0) = δ (q2, 0) = q4
δ̂(q0, 11100) = δ (δ̂(q0, 1110), 0) = δ (q4, 0) = q3
δ̂(q0, 111001) = δ (δ̂(q0, 11100), 1) = δ (q3, 1) = q2
δ̂(q0, 1110011) = δ (δ̂(q0, 111001), 1) = δ (q2, 1) = q0
q0 is final state. ∴ String is accepted.
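The trace above is consistent with a DFA whose state q_r records the remainder r of the bits read so far, with δ(q_r, b) = q_((2r + b) mod 5). This rule is an assumption inferred from the trace, since the transition diagram is not reproduced here. Under that assumption the machine can be sketched as:

```python
def delta(remainder, bit):
    """Next remainder after appending `bit`: the value doubles and the bit is added."""
    return (2 * remainder + int(bit)) % 5


def state_trace(bits):
    """Sequence of remainders (state indices) visited after each bit."""
    remainder = 0
    visited = []
    for bit in bits:
        remainder = delta(remainder, bit)
        visited.append(remainder)
    return visited


def accepts(bits):
    """Accept when the run ends in q0, i.e. the remainder is 0."""
    remainder = 0
    for bit in bits:
        remainder = delta(remainder, bit)
    return remainder == 0


print(state_trace("1110011"))
print(accepts("1110011"))
```

The printed state indices can be compared with the states q1, q3, q2, q4, q3, q2, q0 obtained in the worked trace.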
Example 22: Design a DFA which accepts odd number of 1’s and even number of 0’s over
{0, 1}.
Solution:
We have, δ̂(q, ∈) = q and δ (q, a) = q'
δ̂(q, ωa) = δ (δ̂(q, ω), a)
Consider a string 0110100.
δ (q0, 0) = q3
δ (q0, 01) = δ (δ (q0, 0), 1) = δ (q3, 1) = q2
δ (q0, 011) = δ (δ (q0, 01), 1) = δ (q2, 1) = q3
δ (q0, 0110) = δ (δ (q0, 011), 0) = δ (q3, 0) = q0
δ (q0, 01101) = δ (δ (q0, 0110), 1) = δ (q0, 1) = q1
δ (q0, 011010) = δ (δ (q0, 01101), 0) = δ (q1, 0) = q2
δ (q0, 0110100) = δ (δ (q0, 011010), 0) = δ (q2, 0) = q1
Hence we get q1 ∈ F
∴ String is accepted by DFA.
Example 23: Design a DFA which accepts a language
L = {x | x has neither “aa” nor “bb” as a substring}
Solution:
Example 25: Design a DFA which accepts all strings over a, b, c such that if it starts with a,
then it should contain even number of ‘b’s. Else if it starts with ‘c’, then it should contain
substring ‘cba’ in it.
Solution:
Example 26: Design a DFA which accepts all strings over {0, 1, 2} such that if it starts with
‘0’ then contains substring ‘01’ else it should contain substring ‘112’ in it.
Solution:
Fig. 1.75: FA
M = (Q, Σ, δ, q0, F)
where Q = {q0, q1, q2, q3}
Σ = {C, H, A, P, T, E, R}
δ : Q × Σ → Q
q0 = q0
F = {q3}
L (M) = {CAP, ACAP, ACAPTER, CAHTCPCACAPR, …}
Table 1.59: Transition table
δ C H A P T E R
q0 q1 q0 q0 q0 q0 q0 q0
q1 q1 q0 q2 q0 q0 q0 q0
q2 q1 q0 q0 q3 q0 q0 q0
q3 q3 q3 q3 q3 q3 q3 q3
Consider a string HCPCAPR.
δ (q0, H) = q0
δ (q0, HC) = δ (δ (q0, H), C) = δ (q0, C) = q1
δ (q0, HCP) = δ (δ (q0, HC), P) = δ (q1, P) = q0
δ (q0, HCPC) = δ (δ (q0, HCP), C) = δ (q0, C) = q1
δ (q0, HCPCA) = δ (δ (q0, HCPC), A) = δ (q1, A) = q2
δ (q0, HCPCAP) = δ (δ (q0, HCPCA), P) = δ (q2, P) = q3
∴ δ (q0, HCPCAPR) = δ (δ (q0, HCPCAP), R) = δ (q3, R) = q3
As q3 ∈ F, ∴ String is accepted by DFA.
Example 29: Design a NFA which accepts all strings containing even number of ‘a’s over
{a, b}.
Solution:
Table 1.60
δ a b
q0 q1 q0
q1 q0 q1
Consider string ‘babab’.
δ (q0, b) = q0
δ (q0, ba) = δ (δ (q0, b), a) = δ (q0, a) = q1
δ (q0, bab) = δ (δ (q0, ba), b) = δ (q1, b) = q1
δ (q0, baba) = δ (δ (q0, bab), a) = δ (q1, a) = q0
∴ δ (q0, babab) = δ (δ (q0, baba), b) = δ (q0, b) = q0
As q0 ∈ F, ∴ String is accepted by FA.
Example 30: Design a NFA which accepts the language with strings having even number of
‘a’s and odd number of ‘b’s over {a, b}.
Solution:
Example 32: Design an NFA that accepts strings over {a, b} whose second last symbol is ‘a’.
Solution:
M = (Q, Σ, δ, q0, F)
Q = {q0, q1, q2, q3}
Σ = {a, b, p, q}
q0 = q0
F = {q3}
L (M) = {a, b, p, q, abbpp, qq}
Table 1.64: Transition table
δ a b p q
q0 q0 φ φ φ
q1 φ q1 φ φ
q2 φ φ q2 φ
q3 φ φ φ q3
Example 34: Construct a DFA to accept the language
L = {ω | ω is of even length and begins with 01}
Solution:
Solution:
Table 1.67
1. Draw the grid.
2. Mark the entries with a hyphen '-' as shown above.
3. From the transition table we can directly find the pairs q D0 p such that either q ∈ F and p ∉ F, or q ∉ F
and p ∈ F. Mark these entries 'X'. This process is continued for strings of length
i ≥ 1, i.e. D1, D2, ….
First we mark some entries with D0 from the transition table by checking non-final and
final states. For the remaining entries we proceed with longer strings as follows:
(i) E D2 A
E -0→ H -0→ G -0→ G        E -1→ F -1→ G -1→ E
A -0→ B -0→ G -0→ G        A -1→ F -1→ G -1→ E
In both cases the states reached (G on 000, E on 111) are the same and non-final, so we can't
distinguish A and E for i = 2.
(ii) G D2 A
G -0→ G -0→ G -0→ G        G -1→ E -1→ F -1→ G
A -0→ B -0→ G -0→ G        A -1→ F -1→ G -1→ E
distinguishable
(iii) H D2 B
H -0→ G -0→ G -0→ G        H -1→ C -1→ C -1→ C
B -0→ G -0→ G -0→ G        B -1→ C -1→ C -1→ C
can't distinguish
(iv) F D2 D
F -0→ C -0→ A -0→ B        F -1→ G -1→ E -1→ F
D -0→ C -0→ A -0→ B        D -1→ G -1→ E -1→ F
can't distinguish
(v) G D2 E
G -0→ G -0→ G -0→ G        G -1→ E -1→ F -1→ G
E -0→ H -0→ G -0→ G        E -1→ F -1→ G -1→ E
distinguishable
Therefore we will combine the following states which are non-distinguishable.
(a) A and E
(b) B and H
(c) F and D
C is final state.
Transition table for minimized DFA is
         δ        0        1
Start  [A E]    [B H]    [F D]
       [B H]    [G]      [C]
Final  [C]      [A E]    [C]
       [F D]    [C]      [G]
       [G]      [G]      [A E]
Fig. 1.84
Example 2: Construct the minimized DFA for the following DFA.
Table 1.68
δ 0 1
Start a b a
b a c
c d b
Final *d d a
e d f
f g e
g f g
h g d
Solution:
Table 1.69
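(vi) g D2 f
g -0→ f -0→ g -0→ f        g -1→ g -1→ g -1→ g
f -0→ g -0→ f -0→ g        f -1→ e -1→ f -1→ e
Similarly, c D2 b is also distinguishable. For the table above, however, the pairs (a, g), (b, f)
and (c, e) are never marked at any length: on 0 and on 1 each of these pairs moves either to the
same state or to another of these unmarked pairs. These states are therefore equivalent and can be
merged, giving the states [a g], [b f], [c e] and d. The state h is not reachable from the start
state a and can be dropped.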
Example 3: Construct minimal DFA for the following DFA.
δ 0 1
Start q0 q4 q0
Final q1 q1 q0
q2 q1 q3
q3 q7 q2
q4 q0 q5
q5 q1 q4
q6 q7 q1
q7 q3 q7
Solution: q0 D0 q1
q0 -0→ q4 ∉ F
q1 -0→ q1 ∈ F, distinguishable, mark X.
q0 D0 q2
q0 -0→ q4 ∉ F        q0 -1→ q0
q2 -0→ q1 ∈ F        q2 -1→ q3
Mark X.
Similarly mark all D0 entries, then the D1 entries,
e.g. q3 D1 q0
q3 -0→ q7 -0→ q3        q3 -1→ q2 -1→ q3
q0 -0→ q4 -0→ q0        q0 -1→ q0 -1→ q0
Both reachable states are different. Hence, proceed with D2 entries.
(i) q3 D2 q0
q3 -0→ q7 -0→ q3 -0→ q7        q3 -1→ q2 -1→ q3 -1→ q2
q0 -0→ q4 -0→ q0 -0→ q4        q0 -1→ q0 -1→ q0 -1→ q0
Again distinguishable
(ii) q4 D2 q0
q4 -0→ q0 -0→ q4 -0→ q0        q4 -1→ q5 -1→ q4 -1→ q5
q0 -0→ q4 -0→ q0 -0→ q4        q0 -1→ q0 -1→ q0 -1→ q0
(iii) q7 D2 q0
q7 -0→ q3 -0→ q7 -0→ q3        q7 -1→ q7 -1→ q7 -1→ q7
q0 -0→ q4 -0→ q0 -0→ q4        q0 -1→ q0 -1→ q0 -1→ q0
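Similarly, check
q5 D2 q2
q3 D2 q4
q3 D2 q7
q4 D2 q7
Of these entries, q3 D2 q7 and q4 D2 q7 are distinguishable (for example by the string 10), whereas
the pairs (q7, q0), (q5, q2) and (q3, q4) are never marked and are therefore equivalent, as shown below.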
Table 1.70
Fig. 1.85
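The marking procedure used in these examples (D0 entries first, then repeated passes for longer strings) can be sketched as a table-filling loop. The code below is an illustrative sketch, not the book's own algorithm listing; it uses the transition table of Example 3, and the function names are assumptions made for this example.

```python
from itertools import combinations

# DFA of Example 3 (final state q1); each row is state: (move on 0, move on 1).
DELTA = {
    "q0": ("q4", "q0"), "q1": ("q1", "q0"), "q2": ("q1", "q3"), "q3": ("q7", "q2"),
    "q4": ("q0", "q5"), "q5": ("q1", "q4"), "q6": ("q7", "q1"), "q7": ("q3", "q7"),
}
FINAL = {"q1"}

def distinguishable_pairs(delta, final):
    """Mark (p, q) if exactly one of them is final (D0), then keep marking pairs
    whose successors on some symbol are already marked, until nothing changes."""
    marked = {frozenset(pq) for pq in combinations(delta, 2)
              if (pq[0] in final) != (pq[1] in final)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(delta, 2):
            if frozenset((p, q)) in marked:
                continue
            for a in range(2):
                succ = frozenset((delta[p][a], delta[q][a]))
                if len(succ) == 2 and succ in marked:
                    marked.add(frozenset((p, q)))
                    changed = True
                    break
    return marked

def equivalent_pairs(delta, final):
    """Pairs that are never marked, i.e. states that can be merged."""
    marked = distinguishable_pairs(delta, final)
    return sorted(tuple(sorted(pair)) for pair in combinations(delta, 2)
                  if frozenset(pair) not in marked)

print(equivalent_pairs(DELTA, FINAL))
```

The pairs printed are exactly the blank entries left in the grid after marking; merging each such pair gives the states of the minimized DFA.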
Example 2: Construct a Moore machine that outputs valid or invalid for a language
L = a (a + b)* b.
Solution:
Fig. 1.86
Example 3: Design a Mealy machine for input (a + b + c)* if input ends with 'bac', print A
else print B.
Solution:
[State diagram: q0 (start) -b/B→ q1 -a/B→ q2 -c/A→ q3; all remaining transitions on a, b and c output B]
Fig. 1.87
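Since the diagram is only partly recoverable, the following sketch uses a hypothetical reconstruction of the machine: state qk is taken to mean "the last k symbols read match the first k symbols of bac". Only the main path q0, q1, q2, q3 is taken from the figure; the other transitions are assumptions.

```python
# Hypothetical transition table (see the note above); outputs are produced on the edges.
DELTA = {
    "q0": {"a": "q0", "b": "q1", "c": "q0"},
    "q1": {"a": "q2", "b": "q1", "c": "q0"},
    "q2": {"a": "q0", "b": "q1", "c": "q3"},
    "q3": {"a": "q0", "b": "q1", "c": "q0"},
}

def mealy_output(word, state="q0"):
    """One output symbol per input symbol: A when the input read so far ends with bac."""
    out = []
    for ch in word:
        nxt = DELTA[state][ch]
        out.append("A" if nxt == "q3" else "B")  # only the edge q2 -c-> q3 prints A
        state = nxt
    return "".join(out)

print(mealy_output("abbacbac"))
```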
Example 4: Design a Moore machine for binary input sequence such that if it has a substring
101, the machine output A, if it has substring 110 then outputs B else outputs C.
Solution:
Fig. 1.88
Example 5: Construct a Moore machine to convert each occurrence of substring 100 by 101.
[State diagram: Moore machine with states q0 (start), q1, q2 and q3; the edge labels are not recoverable from the source]
Fig. 1.89
Example 6: Design Mealy machine for a binary input sequence such that if it has a substring
101, the machine outputs A, if it has a substring 110, the machine outputs B otherwise it
outputs C.
Solution:
Fig. 1.90
EXAMPLES
Example 1: Construct NFA for language L, where L = {a (a + b)* b}.
Solution:
Fig. 1.91
Example 2: Construct DFA equivalent to NFA
Fig. 1.92
Solution: (1) To find the initial state, find the ∈-closure of q0.
∈-closure (q0) = {q0, q1} = A (say)
From A on 0
∈-closure {q0, q1, q2} = {q0, q1, q2, q3} = B
A on input 1
∈-closure {q1} = {q1} = C
B on 0
∈-closure {q0, q1, q2} = B
B on 1
∈-closure {q1} = C
C on 0
∈-closure {q1, q2} = {q1, q2, q3} = D
C on 1
∈-closure (φ) = φ
D on 0
∈-closure {q1, q2} = D
D on 1
∈-closure {q3} = {q3} = E
E on 0
∈-closure (φ) = φ
E on 1
∈-closure {q3} = E
Equivalent DFA is
Fig. 1.93
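The procedure followed above (take the ∈-closure of the start state, then for each new set compute the moves on every input and close again) can be sketched as follows. Because Fig. 1.92 is not reproduced, the sketch uses a small hypothetical ∈-NFA for 0*1* rather than the NFA of this example; all names are illustrative.

```python
EPS = ""  # label used here for ∈-moves

# Hypothetical ∈-NFA for 0*1*: q0 loops on 0, q0 -∈-> q1, q1 loops on 1; q1 is final.
NFA = {
    ("q0", "0"): {"q0"},
    ("q0", EPS): {"q1"},
    ("q1", "1"): {"q1"},
}
START, FINAL, SIGMA = "q0", {"q1"}, "01"

def eps_closure(states):
    """All states reachable from `states` using only ∈-moves."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in NFA.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def subset_construction():
    """Build the DFA whose states are ∈-closed sets of NFA states."""
    start = eps_closure({START})
    dfa, todo = {}, [start]
    while todo:
        S = todo.pop()
        if S in dfa:
            continue
        dfa[S] = {}
        for a in SIGMA:
            moved = set().union(*(NFA.get((q, a), set()) for q in S))
            T = eps_closure(moved)
            dfa[S][a] = T
            todo.append(T)
    finals = {S for S in dfa if S & FINAL}
    return start, dfa, finals

start, dfa, finals = subset_construction()
for S, row in dfa.items():
    print(sorted(S), {a: sorted(T) for a, T in row.items()})
```

The empty set plays the role of φ in the worked solution: it is the dead state from which no final state can be reached.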
Example 3: Construct Mealy machine over {0, 1} which toggles its input.
Solution:
Fig. 1.94
Table 1.71
Output table (λ):
     0  1
q0   1  0
i.e. λ (q0, 0) = 1 and λ (q0, 1) = 0
Example 4: Construct NFA without ∈ for language L, where L = (0 + 1)* 01.
Solution:
Fig. 1.95
Table 1.72
δ 0 1
q0 {q0, q1} {q0}
q1 φ {q2}
q2 φ φ
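An NFA can be simulated by tracking the set of states it may be in after each symbol. The sketch below does this for the NFA of this example, under the assumption that the final state q2 has no outgoing moves, so that the accepted language is exactly (0 + 1)* 01; the function names are illustrative.

```python
# NFA for (0 + 1)* 01: q0 is the start state, q2 the final state.
# Assumption: q2 has no outgoing moves, so acceptance means "ends with 01".
NFA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}

def reachable(word, start="q0"):
    """Set of states the NFA can be in after reading word."""
    current = {start}
    for ch in word:
        current = set().union(*(NFA.get((q, ch), set()) for q in current))
    return current

def accepts(word):
    return "q2" in reachable(word)

print(reachable("1101"), accepts("1101"))
```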
Example 5: Construct DFA equivalent to given NFA.
Fig. 1.96
Solution: ∈-closure (q0) = {q0} → Initial state of DFA (say A)
From A on input a i.e. q0 on a is
∈-closure {q1} = {q0, q1, q2} = B
From A on input b is
∈-closure {q0, q1} = {q0, q1, q2} = B
B on input a
∈-closure {q1, q2} = {q1, q2, q0} = B
B on input b is
∈-closure {q0, q1, q2} = {q0, q1, q2} = B
No more states are added.
∴ DFA is
Fig. 1.97
Example 6: Construct Moore machine to generate 1's complement of binary number.
Solution: M = (Q, Σ, ∆, δ, λ, q0)
Fig. 1.98
Fig. 1.100
Example 9: Construct Moore machine which outputs even or odd according to number of a’s
encountered is even or odd.
Solution: M = (Q, Σ, ∆, δ, λ, q0)
Fig. 1.101
Table 1.73
δ a b λ
q0 q1 q0 Even
q1 q0 q1 Odd
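Table 1.73 can be run directly. The sketch below is one way to simulate it; the names DELTA, LAM and moore_outputs are illustrative. Note that a Moore machine produces the output of its start state before any input is read, so an input of length n yields n + 1 outputs.

```python
# Table 1.73: DELTA gives the next state, LAM the output attached to each state.
DELTA = {"q0": {"a": "q1", "b": "q0"}, "q1": {"a": "q0", "b": "q1"}}
LAM = {"q0": "Even", "q1": "Odd"}

def moore_outputs(word, state="q0"):
    """Output of the start state first, then one output per input symbol."""
    outputs = [LAM[state]]
    for ch in word:
        state = DELTA[state][ch]
        outputs.append(LAM[state])
    return outputs

print(moore_outputs("abab"))
```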
Example 10: Construct DFA for accepting strings over {a, b, c} that start with a and do not
have the substring “bac”.
Solution:
Fig. 1.102
Example 11: Construct DFA equivalent to NFA
Fig. 1.103
Solution: ∈-closure (q0) = {q0, q1, q2} = Initial state = A
From A on input 0
∈-closure {q0, q1, q2} = {q0, q1, q2} = A
From A on input 1
∈-closure {q1, q2} = {q0, q1, q2} = A
No more states are added. Therefore, DFA is
Fig. 1.104
Fig. 1.105
Example 13: Construct Mealy machine to convert each occurrence of substring 101 by 100
over alphabet {0, 1}.
Solution:
Fig. 1.106
Table 1.74
δ 0 1
q0 q0 q1
q1 q2 q1
q2 q0 q0
λ (q0, 0) = 0
λ (q0, 1) = 1
λ (q1, 0) = 0
λ (q1, 1) = 1
λ (q2, 0) = 0
λ (q2, 1) = 0
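The machine given by Table 1.74 and the λ values above can be simulated in a few lines. This is a minimal sketch with illustrative names; unlike a Moore machine, a Mealy machine produces exactly one output symbol per input symbol.

```python
# Table 1.74 (next state) and the listed λ values (output on each transition).
DELTA = {"q0": {"0": "q0", "1": "q1"}, "q1": {"0": "q2", "1": "q1"}, "q2": {"0": "q0", "1": "q0"}}
LAM = {"q0": {"0": "0", "1": "1"}, "q1": {"0": "0", "1": "1"}, "q2": {"0": "0", "1": "0"}}

def mealy(word, state="q0"):
    """Emit LAM[state][symbol] for each symbol, then follow DELTA."""
    out = []
    for ch in word:
        out.append(LAM[state][ch])
        state = DELTA[state][ch]
    return "".join(out)

print(mealy("1101"))
```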
Example 14: Construct NFA for ab* + ba*.
Solution:
Fig. 1.107
Example 15: Construct DFA for a language over {0, 1, 2} which starts with “00”, ends with
“22” and having substring “11” in it.
Solution:
Fig. 1.108
Example 16: Minimize the following DFA.
Fig. 1.109
Solution: Step 1: Table 1.75
F = {q2}
So mark X at the entries for q2 versus all non-final states.
Fig. 1.110
Example 17: Construct a Mealy machine equivalent to the following Moore machine.
Table 1.76
States 0 1 Output
q0 q1 q2 1
q1 q3 q2 0
q2 q2 q1 1
q3 q0 q3 1
Solution:
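One standard way to carry out this conversion is to keep the transitions unchanged and attach to each transition the output of the state it enters, i.e. λ'(q, a) = λ (δ (q, a)). The sketch below applies this rule to Table 1.76; the names are illustrative, and the output the Moore machine produces for its start state before any input is dropped by the conversion.

```python
# Moore machine of Table 1.76: next state on 0 and on 1, plus the output of each state.
MOORE_DELTA = {"q0": ("q1", "q2"), "q1": ("q3", "q2"), "q2": ("q2", "q1"), "q3": ("q0", "q3")}
MOORE_OUT = {"q0": "1", "q1": "0", "q2": "1", "q3": "1"}

def moore_to_mealy(delta, out):
    """Keep the transitions; the Mealy output on (q, a) is the Moore output of delta(q, a)."""
    return {q: tuple((nxt, out[nxt]) for nxt in row) for q, row in delta.items()}

MEALY = moore_to_mealy(MOORE_DELTA, MOORE_OUT)
for q, row in MEALY.items():
    print(q, row)   # (next state, output) on input 0, then on input 1
```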
Fig. 1.112
Example 18: Construct FA for L = {∈}.
Solution:
Start
Fig. 1.113
Example 19: Define Moore machine. Design a Moore machine to change all vowels to '$'
and the remaining 21 letters of the alphabet to '#'.
Solution:
[State diagram: two output states, '#' (entered on b, c, d, …, z) and '$' (entered on a, e, i, o, u)]
Fig. 1.114
Example 20: Construct a DFA which accepts odd number of 1's and even number of 0's over
{0, 1}.
Solution:
[State diagram: states q0, q1, q2 and q3 with transitions on 0 and 1; q2 is the final state]
Fig. 1.115
[State diagram with ∈-moves and transitions on a and b; not recoverable from the source]
Fig. 1.116
Example 22: Design Mealy machine to determine the residue (remainder) mod 3 for a
decimal number.
Solution:
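[State diagram: three states for the remainders 0, 1 and 2. Recoverable edge labels: self-loops 0,3,6,9/0, 0,3,6,9/1 and 0,3,6,9/2; other edges 1,4,7/0, 2,5,8/0, 2,5,8/1 and 2,5,8/2]
Fig. 1.117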
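The edge labels are consistent with a machine whose state is the remainder so far and whose output on each transition is the new remainder. Under that reading (an assumption, since the diagram is only partly recoverable), the machine can be sketched as:

```python
def mealy_mod3(number):
    """State r is the remainder so far; reading digit d moves to (10*r + d) % 3,
    and that new remainder is also the output on the transition."""
    r, outputs = 0, []
    for d in number:
        r = (10 * r + int(d)) % 3
        outputs.append(r)
    return outputs

print(mealy_mod3("873"))
```

The last output is the residue of the whole number. Because 10 leaves remainder 1 on division by 3, the next state depends only on (r + d) mod 3, which is why the digits group as {0, 3, 6, 9}, {1, 4, 7} and {2, 5, 8} on the edges.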
Example 23: Construct DFA for the following NFA with ∈-moves.
[State diagram: NFA with ∈-moves, states q0 (start), q1 and q2, transitions on a and b]
Fig. 1.118
Solution:
[State diagram: DFA with states A and B, transitions on a and b]
Fig. 1.119
PRACTICE QUESTIONS
Q.I Multiple Choice Questions:
1. Which uses mathematical and logical methods to understand the nature of computation
and to solve fundamental problems arising through the everyday practical use of
computer systems?
(a) Computer Science (CS) (b) Theoretical Computer Science (TCS)
(c) Science Theory (ST) (d) Computer Science Theory (CST)
2. The word ‘automata’ (plural and the singular is ‘automaton’), a ______ provides the
simplest model of a computing device.
(a) Finite Automaton (FA) (b) Finite Grammar (FG)
(c) Finite Language (FL) (d) None of the mentioned
3. Which is a finite, non empty set of symbols?
(a) a language (b) a grammar
(c) an alphabet (d) All of the mentioned
4. Which is an entity or individual objects, which can be any letter, alphabet or any
picture like 1, x, y, #?
(a) symbols (b) alphabets
(c) strings (d) All of the mentioned
5. Which is a finite collection of symbols from the alphabet?
(a) symbol (b) alphabet
(c) string (d) language
6. Which is a subset of Σ* for some alphabet Σ?
(a) symbol (b) alphabet
(c) string (d) language
7. Which is a set of rules to define a valid sentence in any language?
(a) grammar (b) language
(c) symbol (d) string
8. Noam Chomsky classified (Chomsky hierarchy) the grammar into following types
depending on the production rules:
(a) Type 0: Type 0 grammar is a phrase structure grammar without any restriction. All
grammars are type 0 grammar. For example, Turing machine.
(b) Type 1: Type 1 grammar is called context-sensitive grammar. For example, Linear
Bounded Automata (LBA).
(c) Type 2: Type 2 grammar is called context-free grammar. In the LHS of the
production, there will be no left or right context. For example, Push down automata
(PDA).
(d) Type 3: Type 3 grammar is called regular grammar. For example, Finite Automata
(FA).
(e) All of the mentioned
23. The transition ______ is basically a tabular representation of the transition function
which takes two arguments (a state and a symbol) and returns a state (the "next state").
24. The set of rules for constructing a language is called the ______ for that language.
25. In the Moore machine, the output depends only on the present ______.
26. A ______ of a string is formed by taking any number of symbols from the end of the
string.
27. A ______ language is a set of strings of symbols drawn from a finite alphabet.
Answers
1. self-acting 2. automaton 3. symbol 4. Alphabets
5. states 6. Recognizer 7. String 8. Length
9. language 10. NFA 11. Empty 12. Terminal
13. Finite Automaton (FA) 14. NFA with ∈ move 15. Left Hand Side and Right Hand Side 16. deterministic
11. The Moore machine was proposed by Edward F. Moore in IBM around 1960.
12. By minimizing, we can get a minimized DFA with minimum number of states and
transitions which produces that particular language.
13. A DFA with minimized states needs less time to manipulate a regular expression.
14. The Myhill–Nerode theorem is used to minimize finite automata.
15. A transition graph or a transition system is a finite directed labeled graph in which each
vertex (or node) represents a state and the directed edges indicate the transition of a state
and the edges are labeled with input/output.
16. An FA has an infinite number of states.
17. A transition diagram or state transition diagram is a directed graph for FA.
18. NFA with ε can be converted to NFA without ε, and this NFA without ε can be
converted to DFA.
19. Maximization of FA means reducing the number of states from given FA.
20. In NFA, when a specific input is given to the current state, the machine goes to multiple
states. It can have zero, one or more than one move on a given input symbol.
21. In DFA, when a specific input is given to the current state, the machine goes to only one
state. DFA has only one move on a given input symbol.
22. In Mealy Machine the output depends both on the current state and the current input and
in Moore Machine the output depends only on the current state.
23. A finite automaton recognizes regular language.
24. A prefix of a string is the string formed by taking any number of symbols of the string.
Answers
1. (T) 2. (T) 3. (T) 4. (T) 5. (F) 6. (T) 7. (T) 8. (T) 9. (T) 10. (T)
11. (F) 12. (T) 13. (T) 14. (T) 15. (T) 16. (F) 17. (T) 18. (T) 19. (F) 20. (T)
21. (T) 22. (T) 23. (T) 24. (T)
Q.IV Answer the following Questions:
(A) Short Answer Questions:
1. What is symbol?
2. Define alphabet.
3. Define FA.
4. What is prefix and suffix?
5. Define formal language.
6. List operations on languages.
7. Define grammar.
8. What are DFA and NFA?
9. Define minimization of FA.
10. Give the purpose of Mealy and Moore machines.
11. Compare DFA and NFA (any two points).
(ii)
(iii)
(iii)
(iv)
Fig. 1.87
(v)
(vi)
[State diagram: NFA with states 1, 3, 4 and 5 and transitions on a, b and ∈; not fully recoverable from the source]
16. Design a Mealy machine for the following: for input from (a + b + c)* if input ends in
"bac" then print A, else print B.
17. Design a Mealy machine to get 1's complement of a given binary string.
18. Design a Moore machine to get 1's complement of a given binary string.
19. Design a Moore machine for binary input sequence such that if it has a substring 101, the
machine outputs A, if it has a substring 110, the machine outputs B, otherwise it outputs
C.
20. Design a Mealy machine which outputs EVEN or ODD according to number of 1's
encountered is even or odd over {0, 1}.
21. Write a mapping of δ in case of NFA and DFA.
22. Differentiate between Moore and Mealy machine.
23. Convert the following NFA with ∈-moves to DFA.
[State diagram: NFA with ∈-moves over {a, b}, states q0 (start) to q5]
24. Find minimum state FA equivalent to the following DFA, M = ({q0, …, q5}, {a, b}, δ, q0,
{q4, q5}).
[State diagram: DFA with states q0 (start) to q5 and transitions on a and b]
October 2016
1. Define NFA. [1 M]
Ans. Refer to Section 1.4.
2. Define proper suffix with the help of an example. [1 M]
Ans. Refer to Page 1.2.
3. Write the mapping of λ function in Mealy Machine. [1 M]
Ans. Refer to Section 1.8.
4. Construct a DFA to accept the set of all strings over ∑ = {a, b, c} such that the string
starts with ‘ac’ and not having ‘cab’ as substring in it. [5 M]
Ans. L = {ac, aca, acb, acc . . . . . . .}
The TD is as follows.
M = (Q, ∑, δ, q0, F)
Q = {q0, q1, q2, q3, q4, q5}
∑ = {a, b, c}
q0 = {q0}
F = {q2, q3, q4}
δ Q\∑ a b c
q0 q1 q5 q5
q1 q5 q5 q3
q2 q2 q2 q3
q3 q4 q2 q3
q4 q2 q5 q3
q5 q5 q5 q5
5. Convert the following NFA with ∈ moves to DFA. [5 M]
6. Construct a Mealy machine of a language L over ∑ = {a, b} which outputs ‘$’ if string
ends with ‘aba’, outputs ‘#’ if string ends with ‘bab’, otherwise outputs ‘*’. [5 M]
Ans. Mealy Machine
         δ            λ
Q\∑    a    b       a    b
q0     q1   q4      *    *
q1     q1   q2      *    *
q2     q3   q4      $    *
q3     q1   q6      *    #
q4     q5   q4      *    *
q5     q1   q6      *    #
q6     q3   q4      $    *
7. Differentiate between DFA and NFA. [5M]
Ans. Refer to Section 1.21.
April 2017
1. Give the mapping of 'δ' function of NFA with ∈ moves. [1 M]
Ans. Refer to Section 1.6.
2. If A = {∈}. Find the value of |A|. [1 M]
Ans. |A| = 1.
3. Differentiate between Moore and Mealy machine. [1 M]
Ans. Refer to Page 1.43.
4. Construct a DFA to accept the set of all strings over ∑ = {0, 1, 2} such that the string
ends with '012' or '20'. [5 M]
Ans. L = {012, 20, 0012, 020 …}
[Transition diagram: DFA with states q0 (start), q1, q2, q3, q4 and q5 over {0, 1, 2}]
Ans. Refer to Section 1.5.
6. Construct a Moore machine for a language L over ∑ = {0, 1} which outputs '$' if string
ends with '100', outputs '#' if string ends with '001', otherwise outputs '*'. [5 M]
Ans. Refer to Section 1.8.
7. Minimize the following DFA:
M = ({q0, q1, q2, q3, q4, q5, q6, q7}, {0, 1}, δ, q0, {q1}) where, δ is given by: [5 M]
δ 0 1
→ q0 q4 q0
*q1 q1 q0
q2 q1 q3
q3 q7 q2
q4 q0 q5
q5 q1 q4
q6 q7 q1
q7 q3 q7
Ans. Refer to Section 1.9.
October 2017
1. Define suffix of a string. Give one example. [1 M]
Ans. Refer to Page 1.2.
2. "DFA cannot have more than one final states". Justify. [1 M]
Ans. Refer to 1.2.
[State diagram: FA with states q0 (start), q1, q2, q3 and q4 and transitions on 0 and 1]
Ans. Refer to Section 1.9.
7. Construct Mealy machine to convert each occurrence of substring 101 by 100 over
alphabet {0, 1}. [4 M]
Ans. Refer to Section 1.8.
October 2018
1. Define suffix of a string. Give one example. [1 M]
Ans. Refer to Page 1.2.
2. Compare ‘λ’ function of Mealy and Moore machine. [1 M]
Ans. Refer to Section 1.8.
3. Write down the ‘∈-closure’ of each state from the following FA: [3 M]
[State diagram: FA with ∈-moves, states q0 (start), q1 and q2, transitions on 0, 1 and ∈]
Ans. Refer to Section 1.6.
CHAPTER 2
Regular Expressions and Languages
Objectives …
To study Basic Concepts in Regular Expressions
To learn Regular Languages
2.0 INTRODUCTION
• The language accepted by a finite automaton can be described by a simple expression
called a Regular Expression. Regular expressions are a compact way to represent such languages.
• The languages described by regular expressions are referred to as Regular languages. A
regular expression can also be described as a pattern that defines a set of strings.
• Regular expressions are used to match character combinations in strings. String-searching
algorithms use such patterns for find and find-and-replace operations on strings.
2.1 REGULAR EXPRESSION (RE) [April 16, 17, 18, 19, Oct. 17, 18]
• The languages accepted by finite automata are described or represented by simple
expressions called regular expressions.
• The languages that are associated with these regular expressions are called regular and are
also said to be defined by finite automata.
• Regular expressions are also referred to as rational expressions. A regular expression is generally
a sequence of characters that specifies a pattern used to find strings in a language.
Operations of Sets of Strings:
• Regular expressions represent simple language-denoting formulas, based on the operations of
concatenation, union and closure.
• Let, Σ be a finite set of symbols and let L1, L2 and L be sets of strings from Σ*. Then we can
define following operations on these sets.
1. Concatenation: Concatenation of L1 and L2 is denoted by,
L1 L2 = {xy | x is in L1 and y is in L2}
respectively.
• If 'r' is a regular expression, then the language represented by 'r' is denoted by L (r).
1. If r = a+b
L (r) = {a, b}
2. If r = ab
L (r) = {ab}
3. If r = a*
L (r) = {∈, a, aa, aaa, aaaa, …}
Here, ∈ stands for zero occurrences of 'a'. Hence, 'a*' denotes an infinite set of strings.
• If r = (ab)*, then L (r) = {∈, ab, abab, ababab, …}
• If r = a* b*, then L (r) = {∈, a, b, ab, aab, abb, aabb, …}
• If r = (a + b)*, then L (r) = {∈, a, b, ba, ab, baa, abb, …}
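The sets listed above can be checked by enumerating short strings and keeping those that match. The sketch below uses Python's re module, in which union is written '|' instead of the '+' used in this chapter; the helper name language is an illustrative choice, and the translation is deliberately simple (it does not handle ∈ or φ inside an expression).

```python
import re
from itertools import product

def language(r, alphabet, max_len):
    """Strings over `alphabet` of length <= max_len that match r.
    The book's union sign '+' is rewritten as '|' for Python's re syntax."""
    pattern = re.compile(r.replace("+", "|").replace(" ", ""))
    words = ("".join(p) for n in range(max_len + 1) for p in product(alphabet, repeat=n))
    return [w for w in words if pattern.fullmatch(w)]

print(language("a*", "a", 3))
print(language("(ab)*", "ab", 4))
print(language("a* b*", "ab", 2))
```

The empty string printed first in each list corresponds to ∈.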
2.1.2 Examples
• In this section we will study some examples of regular expressions.
Example 1: Define, using a regular expression, the language in which every word begins and ends
with 'a' and has only b's in between.
Solution: Σ = {a, b}, start symbol and end symbol = 'a'. The regular expression is ab*a + a.
Example 2: Describe the language consisting of all strings over Σ = {0, 1} with at least two
consecutive 0's, using regular expression.
Solution: Σ = {0, 1}
∴ Σ* = {∈, 0, 1, 00, 01, 10, 11, …}
Here, we want at least one occurrence of "00" and any number of trailing 1's and 0's and any
number of leading 1's and 0's.
Therefore, the regular expression is,
r = (0 + 1)* 00 (0 + 1)*
Example 3: If L (r) = the set of all strings over {0, 1} ending with "011", find regular
expression r.
Solution: The regular expression is,
r = (0 + 1)* 011
Example 4: Define language over {0, 1} containing all possible combinations of
0's and 1's but not having two consecutive 0's.
Solution: (1 + 10)* is language containing the strings starting with 1 and not having two
consecutive 0's. Similarly, the language containing the strings starting with '0' and not having two
consecutive 0's is represented by 0 · (1 + 10)*.
If we combine both, we get the required regular expression as
(1 + 10)* + 0 (1 + 10)* i.e. (0 + ∈) (1 + 10)*.
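A regular expression obtained by reasoning like this can be sanity-checked by brute force on all short strings. The sketch below compares (0 + ∈) (1 + 10)* with the property "no two consecutive 0's"; in Python's re syntax '0?' plays the role of (0 + ∈) and '|' the role of '+'. The function names are illustrative, and such a check is evidence for short strings only, not a proof.

```python
import re
from itertools import product

# (0 + ∈)(1 + 10)* written in Python's re syntax.
PATTERN = re.compile(r"0?(1|10)*")

def no_double_zero(w):
    return "00" not in w

def mismatches(max_len):
    """Strings up to max_len on which the regular expression and the property disagree."""
    words = ("".join(p) for n in range(max_len + 1) for p in product("01", repeat=n))
    return [w for w in words if bool(PATTERN.fullmatch(w)) != no_double_zero(w)]

print(mismatches(10))
```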
Example 5: Define language containing all strings of a's and b's containing at least one
combination of double letters using regular expression.
Solution: Regular expression, r = (a + b)* (aa + bb) (a + b)*.
Example 6: Define language containing all strings over {0, 1} having at most one pair of 0's
or at most one pair of 1's.
Solution: Here we have four combinations
1. Consecutive 0's and 1's not present.
2. Only one pair of consecutive 0's present.
3. Only one pair of consecutive 1's present.
4. One pair of consecutive 0's and 1's present.
The regular expression is,
r = (01)* 00 (10)* + (10)* 11 (01)* + (01)* + (10)* + (10)* 0 (10)* + (01)* 1 (01)*
Example 16: Write a regular expression to denote a language having strings not containing
substring “01” over Σ = {0, 1}.
Solution: r = 1* 0*
L = {∈, 0, 1, 10, 100, 000, ......}
Example 17: Find regular expression for a language consists of strings over {a, b} which are
either all b’s or strings begin with a and followed with any number of b’s only or an empty
string.
Solution: L = {∈, b, bb, bbb, a, ab, abb, abbb, ......}
r = b* + ab*
= (∈ + a) b*
Example 18: Find regular expression for a language consists of string over {0, 1} whose 3rd
digit from right end is always 1.
Solution: r = (0 + 1)* 1 (0 + 1) (0 + 1)
Example 19: Write regular expression for a language consists of string having length
divisible by 3 over {a}.
Solution: r = (aaa)*
Example 20: Write regular expression for a language containing strings that do not have either
“aa” or “bb” as a substring over {a, b}.
Solution: L = {∈, a, b, ab, bab, aba, ......}
r = (b + ∈) (ab)* (a + ∈)
Example 21: Write regular expression for a language consisting of string such that total
number of b’s in each string is divisible by 3 over {a, b}.
Solution: L = {∈, a, aa, abbb, bbba, ababab, ......}
r = (a* b a* ba* ba*)* + a*
Example 22: Write a regular expression for a language of strings with total number of 0’s
are even over {0, 1}.
Solution: r = 1* + (1* 01* 01*)*
Example 23: Write regular expression to denote a language L over {a, b} such that all the
strings do not have a substring “ab”.
Solution: r = b* a* (L = {∈, a, b, bb, ba, baa, …})
Example 24: Write regular expression for language containing strings without a substring
“abb” and “bba”.
Solution: r = b* + (b + ∈) (a + ab)*
∴ L = {∈, a, b, ab, ba, abab, aab, bb, ......}
Example 25: Write regular expression for language containing strings in which every block
of 4 consecutive symbols contain at least two a’s over {a, b}.
Solution: For a block of 4, at least two a’s are required; among the 4 positions any two can be a’s.
There are C(4, 2) = 6 possible ways.
∴ R = r1 + r2 + r3 + r4 + r5 + r6
where r1 = (a + b) (a + b) (a) (a)
r2 = (a) (a) (a + b) (a + b)
r3 = (a) (a + b) (a) (a + b)
r4 = (a + b) (a) (a) (a + b)
r5 = (a + b) (a) (a + b) (a)
r6 = (a) (a + b) (a + b) (a)
∴ R = [(a + b) (a + b) aa] + [aa (a + b) (a + b)] + [a (a + b) a (a + b)]
+ [(a + b) aa (a + b)] + [(a + b) a (a + b) a] + [a (a + b) (a + b) a]
Example 26: Write regular expression for language containing strings without consecutive
0’s or 1’s or without consecutive 0’s followed by without consecutive 1’s over {0, 1}.
Solution: r = (1 + 01)* + (0 + 01)*
Example 27: Define the language of the following regular expression :
(a + b)* a (∈ + bbb)
Solution: It defines the language made up of zero or more occurrences of a or b ending with
“a” or “abbb”.
Example 28: Define the language of following regular expression:
((0 + 1) 0)*
Solution: It defines the language containing set of strings of even length in which “0” is at
even position.
Example 29: Find the regular expression for the set of strings not having "101" as a
substring.
Solution: In this, we should take care that if 1 is followed by 0 (i.e. 10) then it should be
followed by another 0 or nothing to avoid (101). Thus the regular expression is 0* (1* (00+)*)*.
Example 30: Find the regular expression for the set of all strings such that every block of
four consecutive symbols contains at least two zeros.
Solution: This means that we always have to examine four consecutive symbols and see that at least two
zeros are there, i.e. if A, B, C, D are the four symbols then at least two of them should be zero:
AB, or BC, or CD, or AC, or AD, or BD.
2. φR = Rφ = φ
3. ∈R = R∈ = R
4. ∈* = ∈ and φ* = ∈
5. R+R=R
6. R* R* = R*
7. RR* = R* R = R⁺
8. (R*)* = R*
9. ∈ + RR* = R* = ∈ + R* R
12. (P + Q) R = PR + QR
13. R (P + Q) = RP + RQ
Arden’s Theorem:
• Arden’s theorem can be used to find the regular expression of a finite automaton. It states that if P and Q
are two regular expressions over Σ and P does not contain ∈, then the equation R = Q + RP has a unique
solution given by R = QP*.
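That R = QP* does satisfy the equation can be checked by substituting it into the right-hand side, using only the distributive law and the identity ∈ + R*R = R* from the list above. The short derivation below shows this check; it verifies that QP* is a solution and does not by itself prove uniqueness.

```latex
\begin{align*}
Q + RP &= Q + (QP^{*})P \\
       &= Q(\epsilon + P^{*}P) \\
       &= QP^{*} = R \qquad (\text{since } \epsilon + P^{*}P = P^{*})
\end{align*}
```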
Let b + aa* b = P
and a + ba* b = Q
∴ L.H.S. = P + PQ* Q
= P (∈ + Q* Q)
= PQ* (since ∈ + R* R = R*)
= (b + aa* b) (a + ba* b)*
= (∈ + aa*) b (a + ba* b)*
= a* b (a + ba* b)*
= R.H.S.
= 0* (∈ + 11*)
= 0* (1*) = 0*1*
= R.H.S.
• Theorem: Let r be a regular expression. There exists an NFA with ∈-transition that accepts
L (r).
• Proof: The languages accepted by FA are precisely the languages denoted by regular
expression.
2.3.1 Conversion from RE to FA
• In this section we will study how to convert the RE to FA. Consider the pictorial
representation in Fig. 2.1 of RE to FA.
Case 1:
Fig. 2.1
Case 2: If r = r1 + r2 (for union)
there are NFAs, M1 = (Q1, Σ1, δ1, q1, {f1})
and M2 = (Q2, Σ2, δ2, q2, {f2}) with
L (M1) = L (r1) and L (M2) = L (r2)
Construct, M = (Q1 ∪ Q2 ∪ {q0, f0}, Σ1 ∪ Σ2, δ, q0, {f0})
q0 = new initial state, f0 = new final state.
L (M) = L (M1) ∪ L (M2)
Fig. 2.2
Case 3: If r = r1 r2 (for concatenation)
M1 and M2 are same as above,
L (M) = L (M1) L (M2)
Fig. 2.3
Case 4: If r = r1* (for closure)
Let M1 = (Q1, Σ1, δ1, q1, {f1}) and L (M1) = L (r1)
Construct, M = (Q1 ∪ {q0, f0}, Σ1, δ, q0, {f0})
Fig. 2.4
• Any path from q0 to f0 is either a direct path from q0 to f0 on ∈, or a path from q0 to q1 on ∈,
followed by some number (possibly zero) of paths from q1 to f1 and back to q1 on ∈, each labeled by a
string in L (M1), followed by a final path from q1 to f1 on a string in L (M1) and then to f0 on ∈.
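The four cases can be turned into a small construction routine. The sketch below is one possible encoding, not the book's own: an NFA fragment is a triple (start, final, table), states are fresh integers, and ∈-moves are labeled with the empty string. For concatenation it links f1 to q2 by an ∈-move, which is one common variant of Case 3.

```python
from itertools import count

EPS = ""            # label used here for ∈-moves
_ids = count()      # fresh state numbers

def _new():
    return next(_ids)

def _merge(*tables):
    out = {}
    for t in tables:
        for key, targets in t.items():
            out.setdefault(key, set()).update(targets)
    return out

def symbol(a):
    """Case 1: a single symbol a."""
    s, f = _new(), _new()
    return s, f, {(s, a): {f}}

def union(m1, m2):
    """Case 2: new start q0 and new final f0 joined to both machines by ∈-moves."""
    (s1, f1, t1), (s2, f2, t2) = m1, m2
    q0, f0 = _new(), _new()
    return q0, f0, _merge(t1, t2, {(q0, EPS): {s1, s2}, (f1, EPS): {f0}, (f2, EPS): {f0}})

def concat(m1, m2):
    """Case 3: the final state of M1 is linked to the start state of M2 by an ∈-move."""
    (s1, f1, t1), (s2, f2, t2) = m1, m2
    return s1, f2, _merge(t1, t2, {(f1, EPS): {s2}})

def star(m1):
    """Case 4: q0 -∈-> q1, f1 -∈-> q1 (repeat), f1 -∈-> f0 and q0 -∈-> f0 (zero times)."""
    s1, f1, t1 = m1
    q0, f0 = _new(), _new()
    return q0, f0, _merge(t1, {(q0, EPS): {s1, f0}, (f1, EPS): {s1, f0}})

def accepts(m, word):
    start, final, table = m
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in table.get((q, EPS), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen
    current = closure({start})
    for ch in word:
        step = set()
        for q in current:
            step |= table.get((q, ch), set())
        current = closure(step)
    return final in current

# 01* + 1, built as union(concat(0, star(1)), 1)
M = union(concat(symbol("0"), star(symbol("1"))), symbol("1"))
print(accepts(M, "0111"), accepts(M, "10"))
```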
Example 1: Construct NFA for regular expression 01* + 1.
Solution: 01* + 1 = r1 + r2, where r1 = 01* and r2 = 1.
(a)
NFA for r1 = r3 r4 , where r3 = 0 and r4 = 1*.
NFA for r3 is,
(b)
NFA for r4 is,
(c)
NFA for r1 = 01* is,
(d)
(e)
Fig. 2.5
Example 2: Construct a FA equivalent to regular expression (1* + 0)*.
Solution:
Fig. 2.6
Example 3: Construct a FA equivalent to regular expression (01 + 10)* + 11.
Solution: (01 + 10)* + 11 = r1 + r2, where r1 = (01 + 10)* and r2 = 11.
NFA for r1 is M1 as shown below.
Fig. 2.8
Solution: If the start symbol is 'a', we can find different expressions to reach to final state q3
from q0 via q1 and are as follows :
a (ba)* a (a + b)* and a (ba)* bb (a + b)* .
Similarly, if the start symbol is 'b' then we have,
b (ab)* b (a + b)* and b (ab)* aa (a + b)*
Therefore, the required regular expression is obtained by ORing all the above expressions,
i.e. a (ba)* a (a + b)* + a (ba)* bb (a + b)* + b (ab)* b (a + b)* + b (ab)* aa (a + b)*.
Example 5: For the DFA shown in Fig. 2.9, find the regular expression.
Fig. 2.9
Solution : The regular expression is 1* 00 (0 + 1)* + (1* 01)* 00 (0 + 1)*.
Example 6: For the DFA shown in Fig. 2.10, find the regular expression.
Fig. 2.10
Solution: The required regular expression is (0 + 1) (0 + 1)*.
Example 7: Language L1 consists of strings beginning with 'a' and language L2 consists of
strings ending with 'a'.
(a) What is L1 intersection L2 ?
(b) Write regular expression for L1 and L2 and L1 ∩ L2.
(c) Draw FA.
Solution:
(a) L1 intersection L2 is all words starting with 'a' and ending with 'a'.
(b) L1 : r1 = a (a + b)*
L2 : r2 = (a + b)* a
L1 ∩ L2 : r3 = a (a + b)* a + a.
(c)
Fig. 2.11
Example 8: Give regular expression of L1 ∩ L2 if
L1 : all strings of even length
L2 : starting with 'b'.
Solution: L1 intersection L2 is all strings of even length and starting with 'b'.
L1 = ((a + b) (a + b))*
L2 = b (a + b)*
Now, for intersection the string should start with 'b' and length is to be even. Hence, 'b'
should be concatenated by a string of odd length.
((a + b) (a + b))* is even length, adding an extra symbol we get the odd length.
Hence, ((a + b) (a + b))* (a + b)
Therefore, the required regular expression is
b ((a + b) (a + b))* (a + b).
• By using FA and RE, we have been able to define many languages, but a language which
cannot be defined by a regular expression is called a non-regular set.
• Every language is either regular or non-regular. The pumping lemma is a powerful tool for
proving that certain languages are non-regular.
2.4.1 Applications
• We use the pumping lemma to prove that a given language is not regular.
1. Select the language L which we wish to prove non-regular.
2. Assume L is regular and let n be the number of states in the corresponding FA.
3. Select a string z in L such that |z| ≥ n. Use the pumping lemma to write z = uvw with
|uv| ≤ n and |v| ≥ 1.
4. Find a suitable integer i such that uv^i w ∉ L.
• This contradicts our assumption and hence L is non-regular.
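The four steps above can be explored mechanically for a small, assumed value of n. The sketch below is in Python; the function names and the choice n = 4 are illustrative and not from the text. It enumerates every split z = uvw permitted by the lemma and checks that each one can be pumped out of the language. A finite check of this kind only illustrates the argument; the proof itself must hold for every n.

```python
from typing import Callable, Iterator, Tuple

def decompositions(z: str, n: int) -> Iterator[Tuple[str, str, str]]:
    """Yield every split z = u v w with |uv| <= n and |v| >= 1."""
    for uv_len in range(1, min(n, len(z)) + 1):
        for u_len in range(uv_len):
            yield z[:u_len], z[u_len:uv_len], z[uv_len:]

def every_split_fails(z: str, n: int, in_lang: Callable[[str], bool],
                      max_i: int = 3) -> bool:
    """True if each allowed split has some i in 0..max_i with u v^i w not in L."""
    return all(
        any(not in_lang(u + v * i + w) for i in range(max_i + 1))
        for u, v, w in decompositions(z, n)
    )

def is_0n1n(s: str) -> bool:
    """Membership test for {0^n 1^n | n >= 1}."""
    half = len(s) // 2
    return len(s) >= 2 and len(s) % 2 == 0 and s == "0" * half + "1" * half

n = 4                      # assumed number of states, for illustration only
z = "0" * n + "1" * n      # the chosen string, |z| >= n
print(every_split_fails(z, n, is_0n1n))
```

For a regular language such as 0*, the same function returns False, because at least one split can be pumped without leaving the language.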
Example 1: Show that the set L = {0^{i^2} | i is an integer, i ≥ 1}, which consists of all strings of
0's whose length is a perfect square, is not regular.
Solution:
Case 1:
1. Assume L is a regular set and let n be the integer in the pumping lemma. Let z = 0^{n^2}.
2. By the pumping lemma, 0^{n^2} may be written as uvw where 1 ≤ |v| ≤ n and uv^i w is in L
for all i.
3. Let i = 2 (say).
The length of uv^2 w is |uvvw| > n^2 always. Since |v| is at least 1 and at most n,
the maximum length of the string is n^2 + n.
Hence, n^2 < |uv^2 w| ≤ n^2 + n < (n + 1)^2, i.e. the length of uv^2 w lies properly between n^2 and
(n + 1)^2 and is thus not a perfect square. Thus uv^2 w is not in L, a contradiction.
Therefore, L is not regular.
Case 2: L = {0, 0000, 000000000, ......}
1. Assume L is a regular set.
2. Let z = 0000 (any word from the language).
3. Split z into uvw such that |uv| ≤ |z| and |v| ≥ 1.
Suppose u = 0, v = 00, w = 0.
4. Let i = 2.
uv^i w = uv^2 w
= 0 (00)^2 0
= 000000 ∉ L
∴ uv^i w ∉ L
This particular split illustrates the idea; the argument of Case 1 covers every possible split.
Hence, L is non-regular.
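Example 2: Show that L = {a^p | p is prime} is non-regular.
Solution:
Case 1:
1. Let L be regular and n be the number of states in the FA accepting L.
2. Let p be a prime number with p ≥ n, and let z = a^p. By the pumping lemma, z = uvw with
u = a^m, v = a^k and w = a^{p – (m + k)}, where k ≥ 1 and m + k ≤ n (the simplest case is k = 1, i.e. v = a).
So, |uv| ≤ n and |v| ≥ 1.
By pumping lemma, uv^i w is in L for all i.
Let i = p + 1
∴ a^m (a^k)^{p+1} a^{p–m–k} ∈ L
a^{m + k(p+1) + p – m – k} = a^{p(k+1)} ∈ L
but p(k + 1) is not a prime number, since p ≥ 2 and k + 1 ≥ 2. For k = 1 this is a^{2p}.
So, uv^{p+1} w ∈ L is a contradiction. Hence, L is not regular.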
Fig. 2.13
Hence, L is a regular language.
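Example 5: Show that L = {0^n 1^n | n ≥ 1} is not regular.
Solution:
1. Assume L is regular and let n be the integer in the pumping lemma.
2. Let z = 0^n 1^n, then |z| = 2n > n.
By pumping lemma, split z = uvw such that |uv| ≤ n and |v| ≥ 1.
3. Since |uv| ≤ n, both u and v lie within the leading 0's, so v = 0^k for some k ≥ 1.
For example, with n = 3, z = 000111 and one possible split is u = 0, v = 00, w = 0111;
|uv| = 3 ≤ n satisfies the condition.
4. Let i = 2.
uv^i w = uv^2 w
= 0 (00)^2 0111
= 00000111 ∉ L
In general, uv^2 w = 0^{n+k} 1^n has more 0's than 1's, so it is not in L.
∴ L is not regular.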
Example 6: Prove that L = {a^n b^{n+1} | n > 0} is non-regular.
Solution: Let n = 1:
a^n b^{n+1} = ab^2
|ab^2| = 1 + 2 = 3
Let n = 2: |a^2 b^3| = 5
All strings of this language are of odd length.
1. Assume L is regular.
2. z = a^3 b^4, |z| = 7
Split z into uvw such that |uv| ≤ n, |v| ≥ 1.
Let u = aa, v = ab, w = bbb.
z = aa ab bbb
= u v w
3. Let i = 2: uv^i w = uv^2 w
= (aa) (ab)^2 bbb
= aa ab ab bbb ∉ L
∴ Language L is not regular.
• Proof: Let L be L(M). Some of the states of this FA M are final states and some are not. Let us
reverse the status of each state, i.e. if it was a final state, make it non-final, and if it was a
non-final state, make it final. If an input formerly ended in a non-final state, it now ends in a
final state and vice-versa. This new FA accepts exactly the strings that were not accepted by the
original FA, i.e. it accepts the language L'. Therefore L' is a regular
set.
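The construction in the proof can be sketched in a few lines of Python. The DFA below (strings over {a, b} that end with 'a') and the names `accepts` and `complement_finals` are assumptions made for illustration; the only step taken from the proof is swapping final and non-final states. The swap is correct only when the DFA is complete, i.e. has a transition for every state and symbol.

```python
from typing import Dict, Set, Tuple

Transitions = Dict[Tuple[str, str], str]

def accepts(delta: Transitions, start: str, finals: Set[str], w: str) -> bool:
    """Run the DFA on w and report whether it stops in a final state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals

def complement_finals(states: Set[str], finals: Set[str]) -> Set[str]:
    """Swap final and non-final states; the DFA must be complete."""
    return states - finals

# Hypothetical complete DFA over {a, b} accepting strings that end with 'a'.
states = {"q0", "q1"}
delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q1", ("q1", "b"): "q0"}
finals = {"q1"}
co_finals = complement_finals(states, finals)

for w in ["", "a", "ab", "bba"]:
    print(repr(w), accepts(delta, "q0", finals, w),
          accepts(delta, "q0", co_finals, w))
```

Each string is accepted by exactly one of the two machines, which is what the proof claims.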
Example: Construct DFA for a language over {a, b} which contains all strings not having
"bba" as a substring.
Solution: First construct a DFA which accepts all strings containing the substring "bba".
Fig. 2.14
Now, complement the final states. We get the required DFA as shown in Fig. 2.15.
Fig. 2.15
• Theorem: The regular sets are closed under intersection (i.e. if L1 and L2 are regular sets,
then L1 ∩ L2 is also a regular set).
• Proof: By De Morgan's law for sets of any kind,
L1 ∩ L2 = (L1' + L2')'
• This means the language L1 ∩ L2 consists of all words that are not in either L1' or L2'. Since
L1 and L2 are regular sets, L1' and L2' are regular sets, as proved by theorem 3.4.2. By
theorem 3.4.1, since L1' and L2' are regular, L1' + L2' is also a regular set.
Again by theorem 3.4.2, (L1' + L2')' is a regular set.
Therefore, L1 ∩ L2 is a regular set.
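The proof above goes through De Morgan's law. A different, commonly used technique is the product construction, which runs both DFAs in parallel and accepts when both accept. The Python sketch below uses that technique, not the one in the proof; the two small DFAs (strings starting with 'a', strings ending with 'a') and all names are illustrative assumptions.

```python
from typing import Dict, Set, Tuple

DFA = Tuple[Dict[Tuple[str, str], str], str, Set[str]]  # (delta, start, finals)

def intersect(d1: DFA, d2: DFA, alphabet: str):
    """Product DFA: a pair state is final when both components are final."""
    (t1, s1, f1), (t2, s2, f2) = d1, d2
    start = (s1, s2)
    delta, finals, todo, seen = {}, set(), [start], {start}
    while todo:
        p, q = todo.pop()
        if p in f1 and q in f2:
            finals.add((p, q))
        for c in alphabet:
            nxt = (t1[(p, c)], t2[(q, c)])
            delta[((p, q), c)] = nxt
            if nxt not in seen:          # explore only reachable pair states
                seen.add(nxt)
                todo.append(nxt)
    return delta, start, finals

def accepts(dfa, w: str) -> bool:
    delta, state, finals = dfa
    for c in w:
        state = delta[(state, c)]
    return state in finals

# Hypothetical DFAs: 's' start, 'y' seen leading a, 'n' dead state.
starts_a = ({("s", "a"): "y", ("s", "b"): "n", ("y", "a"): "y",
             ("y", "b"): "y", ("n", "a"): "n", ("n", "b"): "n"}, "s", {"y"})
ends_a = ({("e0", "a"): "e1", ("e0", "b"): "e0",
           ("e1", "a"): "e1", ("e1", "b"): "e0"}, "e0", {"e1"})

both = intersect(starts_a, ends_a, "ab")
print([w for w in ["a", "ab", "aba", "ba", ""] if accepts(both, w)])  # ['a', 'aba']
```

The accepted strings are those that both start and end with 'a', matching the regular expression a (a + b)* a + a of Example 7.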
Example: Construct DFA for L = L1 ∩ L2 over {0, 1}, where
L1 = starting with 0 and ending with 11
L2 = containing substring "010" in it.
Solution: L = L1 ∩ L2 = All strings starting with 0, ending with 11 and having substring 010 in it.
DFA of L is as shown in the Fig. 2.16.
Fig. 2.16
• Kleene Closure: Given an alphabet Σ, we wish to define a language in which any string of
letters from Σ is a word, even the null string. This language we shall call the closure of the
alphabet. It is denoted by writing a star (asterisk) after the name of the alphabet as a
superscript.
• The Σ* is a notation called Kleene star or Kleene closure.
• Example:
1. If Σ = {x} then
Σ* = {∈, x, xx, xxx, …}
2. If Σ = {0, 1} then
Σ* = {∈, 0, 1, 00, 01, 10, 11, 000, 011 …}
3. If Σ = {a, b, c} then
Σ* = {∈, a, b, c, aa, bb, cc, ac, ab, ba, bc, ca, cb, aaa …}
4. If S = {aa, b} then
S* = {∈, aa, b, aab, baa, bbb, aaaa, aabb, …}
The string aabaaab is not in S* since it contains a run of a's of odd length 3.
5. If S = {a, ab} then
S* = {∈, a, ab, aa, aab, aba, aaaa, aaab, aaba, abab, aaaab, abaaa …}
Here, for each word in S* every b must have an 'a' immediately to its left.
6. If S = {10, 1} then
S* = {∈, 10, 1, 1010, 11, 101, 110 …}
Therefore, Kleene closure of a language L is denoted by L* and is defined as
L* = ∪_{i=0}^{∞} L^i
• Positive Closure: If S is a set of strings not including ∈, then S+ is the language S* without
the word ∈. If S is a language that does contain ∈, then S+ = S*.
• This plus operation is called positive closure. Therefore, positive closure of a language L is denoted by L+
and is defined as
L+ = ∪_{i=1}^{∞} L^i
• Example:
1. If S = {10, 1} then
S+ = {10, 1, 1010, 11, 101, 110 …}
2. If S = {aa, b} then
S+ = {aa, b, aab, baa, bbb, aaaa … }
• Closure operator can be applied to infinite set or finite set.
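Since S* and S+ are usually infinite, a program can only list them up to a length bound. The Python sketch below does this by building L^1, L^2, … in turn; the function name `closure_up_to` and the bound are illustrative assumptions, and the empty string "" plays the role of ∈.

```python
def closure_up_to(S, max_len, positive=False):
    """Words of S* (or S+ if positive=True) with length <= max_len."""
    level = {""}                              # L^0 = {epsilon}
    result = set() if positive else {""}      # S+ starts from L^1, S* from L^0
    while True:
        # L^(i+1) = L^i . S, keeping only words within the length bound
        level = {x + s for x in level for s in S if len(x + s) <= max_len}
        if level <= result:                   # nothing new: later levels add nothing
            return result
        result |= level

S = {"aa", "b"}
print(sorted(closure_up_to(S, 3), key=lambda w: (len(w), w)))
print(sorted(closure_up_to(S, 3, positive=True), key=lambda w: (len(w), w)))
print("aabaaab" in closure_up_to(S, 7))       # False
```

The last line confirms the remark made for S = {aa, b}: aabaaab cannot be split into pieces aa and b.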
EXAMPLES
Example 1: Find the regular expression for the set of strings not having "101" as a substring.
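Solution: In this, we should take care that if 1 is followed by 0 (i.e. 10) then it should be
followed by another 0 or nothing to avoid (101). Thus the regular expression is 0* (1* 000*)* 1* (∈ + 0).
Here every block of 1's that is followed later by another 1 is separated from it by at least two 0's, and a single 0 after a 1 is allowed only at the end of the string.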
Example 2: Find the regular expression for the set of all strings such that every block of four
consecutive symbols contains at least two zeros.
Solution: This means that in every four consecutive symbols we must find at least two
zeros, i.e. if A, B, C, D are four symbols then any two of these symbols should be zero,
i.e. AB, or BC, or CD, or AC or AD or BD.
Thus the regular expression is
[00 (0 + 1) (0 + 1) | 0 (0 + 1) 0 (0 + 1) | 0 (0 + 1) (0 + 1) 0 |
(0 + 1) 00 (0 + 1) | (0 + 1) 0 (0 + 1) 0 | (0 + 1) (0 + 1) 00]*
Example 3: Find the regular expression for the set of all strings such that the fifth symbol
from the right end is 1.
Solution: The regular expression is (0 + 1)* 1 (0 + 1) (0 + 1) (0 + 1) (0 + 1).
Example 4: Prove L = {a^n b^n a^n} is not regular.
Solution: By pumping lemma, w = a^n b^n a^n is split into xyz (say)
w ∈ L ∴ xyz ∈ L.
Case 1: x = ∈
y = a
z = b^n a^n
∴ xy^i z = a^i b^n a^n, which is not in L for i ≠ n.
Thus we can show that this is not regular.
Case 2: x = a^n
y = b
z = a^n
∴ w = a^n b^i a^n, again for this i ≠ n, w generates a word which is
not in L.
Case 3: x = a^n b^n
y = a
z = ∈
∴ w = a^n b^n a^i
Thus, for any value of i ≠ n, w will not generate valid words. Thus it is not regular.
Case 4: x = ∈
y = aba
z = ∈
∴ w = (aba)^i
For i = 1, w = aba ∈ L
For i = 2, w = abaaba is not in L.
Thus, the language is not regular from case 1 through case 4.
∴ L = {a^n b^n a^n} is not regular.
Example 5: Show that L = {a^n | b^n}, with valid words L = {a, b, … }, satisfies the pumping
lemma.
Solution: Here again y can be either a or b.
Case 1: x = ∈
y = a
z = ∈
Thus, w = a^i and for all i ≥ 1, w is a word which is in L.
Case 2: x = ∈
y = b
z = ∈
∴ w = b^i
Again for this, w generates words which are in L.
Thus pumping never leads out of L, which is consistent with L being regular. The pumping lemma alone cannot prove regularity; L = {a^n | b^n} is regular because it is denoted by the regular expression aa* + bb*.
Example 6: Draw a FA for a language that will not accept strings "aba" and "abb" over
alphabet Σ = {a, b}.
Solution: The FA that accepts only the strings "aba" and "abb" is shown below.
Fig. 2.17
An FA that accepts all strings other than "aba" and "abb" is complement of above language
as shown below.
Fig. 2.18
Here, we have to reverse the final / non-final status of the states.
(b)
Fig. 2.19 (b)
∴ NFA for R is
Fig. 2.20
EXAMPLES
Example 1: Describe the language L = {a^n b^n | n ≥ 1}.
Solution: Σ = {a, b}
L = {ab, aabb, aaabbb, ……}
The language contains strings with some number of a's followed by an equal number of b's.
Fig. 2.21
Example 10: Construct NFA without ∈ for language L = (0 + 1)* 01
Solution:
Fig. 2.22
Example 11: Write smallest possible string accepted by the following regular expression :
(ab + ba*)* b
Solution: L = (ab + ba*)* b
Smallest string is b.
Example 12: Construct FA for the following regular expression :
ab (a + b)* + ba (a + b)*
Solution:
Fig. 2.23
Fig. 2.24
Example 15: Write smallest possible string accepted by the regular expression :
a (a + b)* ab
Solution: aab is the smallest string.
Example 16: Construct FA for regular expression ((a + b)* + abb)*
Solution:
Fig. 2.25
Example 17: Describe the language L = {a^n b^n | n ≥ 1}.
Solution: L = {ab, aabb, aaabbb, ......}
Each string consists of n a's followed by the same number of b's.
Example 18: Construct NFA for 01* (10)* + 1
Solution: L = {0, 1, 01, 010, 0110, 011010, ......}
Fig. 2.26
Example 19: Write the smallest possible string accepted by regular expression :
01 + (0 + 1) 01*
Solution: Smallest string = 01, 00, 10.
Example 20: If L1 = a* b* + b* a* and L2 = (a + b)*, find L1 ∩ L2.
Solution: L1 = {∈, a, b, ab, ba, aab, bba, ......}
L2 = {∈, a, b, ab, ba, aab, bba, aba, bab, ......}
Since L2 contains every string over {a, b}, L1 ⊆ L2 and hence L1 ∩ L2 = L1 = a* b* + b* a*.
Example 21: Find the language L of the following expression : (a* b*) a.
Solution: L = {a, aa, ba, aba, aaba, bba, ......}
The language contains strings of the form a^i b^j a (i, j ≥ 0): zero or more a's, then zero or more b's, ending with a single a.
Fig. 2.27
Example 23: Write smallest possible string generated by regular expression :
a (a + b) b*
Solution: Smallest string = aa or ab
Example 24: Draw FA equivalent to regular expression : a (a + b)* + b (b + a)*
Solution:
Fig. 2.28
Example 25: Give two kinds of operations that can be carried out on regular language.
Solution: (i) Union, (ii) Intersection. (Refer section 3.6.2 for examples)
Example 26: Write the smallest string generated by regular expression.
b (a*b + ab*) c
Solution: bac or bbc.
PRACTICE QUESTIONS
Q.I Multiple Choice Questions:
1. Which is a set of strings of symbols over an alphabet?
(a) Expression (b) Grammar
(c) Language (d) None of the mentioned
2. A language is generated from the rules of a,
(a) expression (b) grammar
(c) language (d) None of the mentioned
3. Which can be defined as a language or string accepted by an FA.
(a) Regular Expression (RE) (b) Regular Grammar (RG)
(c) Regular Language (RL) (d) None of the mentioned
4. The machine format of regular expression is,
(a) Push down automata (b) Turing machine
(c) Finite Automata (FA) (d) None of the mentioned
5. The language of all words with at least 2 a’s can be described by the regular
expression,
(a) (a + b)*ab*a(a + b)* (b) b*ab*a(a + b)*
(c) (ab)*a (d) None of the mentioned
6. The pumping lemma for regular expression is used to prove that,
(a) Certain sets are not regular (b) Regular grammar don’t produce RE
(c) Certain sets are regular (d) Certain regular grammar produce RE
7. Which of the languages is accepted by the following FA?
(a) a(a + bb*a*)*b* (b) b(a + bba*)*a*
(c) a(a + bba*)*b* (d) None of the mentioned
8. Which of the strings do not belong to the regular expression (ba + baa)*aaba,
(a) babaabaaaba (b) baaaba
(c) baaaaba (d) babababa
9. A finite automata recognizes,
(a) Context sensitive language (b) Context-free language
(c) Regular language (d) None of the mentioned
10. Which of the following regular expressions describe the language over {0, 1}
consisting of strings that contain exactly two 1’s?
(a) 0*10*10* (b) (0 + 1)*1(0 + 1)*1(0 + 1)*
(c) 0*110* (d) (0 + 1)*11(0 + 1)*
11. Given the language L = {ab, aa, baa}, which of the following strings is not in L*?
(a) baaaaabaa (b) baaaaabaaaab
(c) aaaabaaaa (d) abaabaaabaa
12. Which one of the following regular expressions is not equivalent to the regular
expression (a + b + c)*?
(a) (a*b*c*)* (b) (a*b* + c*)*
(c) (a* + b* + c*)* (d) ((ab)* + c*)*
13. The set of all strings over {a, b} of even length is represented by the regular
expression,
(a) (ab + aa + bb + ba)* (b) (a + b)* (a* + b)*
(c) (aa + bb)* (d) (ab + ba)*
14. The set of strings over {a, b} having exactly 3a's is represented by the regular
expression,
(a) b*aaa (b) b*ab*ab*a
(c) ab*ab*a (d) b*ab*ab*ab*
15. {a^{2n} | n ≥ 0} is represented by regular expression
(a) (aa)* (b) a*
(c) aa*a (d) a*a*
16. (0* 1*) is equivalent to,
(a) (0 + 1)* (b) (01)*
(c) (10)* (d) None of the mentioned.
Answers
1. (c) 2. (b) 3. (a) 4. (c) 5. (a) 6. (a) 7. (b) 8. (d) 9. (c) 10. (a)
11. (b) 12. (d) 13. (a) 14. (d) 15. (a) 16. (d)
Q.II Fill in the Blanks:
1. A regular ______ is a formal language that can be expressed using a regular
expression.
2. ______ is accepted by the machine called Finite Automata (FA).
3. ______ is used to prove that given language is not regular.
4. The language specified by a ______ is referred to as a regular language.
5. ______ are those languages that are described by regular expressions and can be
accepted by FA.
6. ______ of regular expressions are used for simplifying RE.
7. Pumping lemma is a powerful tool for proving that certain languages are ______.
8. The property which describes when we combine any two elements of the set and the
result is also included in the set is called as______.
Answers
1. language 2. RE 3. Pumping Lemma 4. regular expression
5. Regular languages 6. Identities 7. non-regular 8. closure
Q.III State True or False:
1. The languages accepted by Finite Automata (FA) are called as regular expressions.
2. The regular languages are those languages that can be constructed from the three set
operations viz., Union (∪), Concatenation (⋅) and Kleene closure (*)
3. Pumping Lemma should never be used to show a language is regular.
4. A regular expression is written as RE or regex or regexp.
5. To prove a given language is regular we use Pumping Lemma.
6. The language that is accepted by FA is known as Regular language.
7. A regular expression can be defined as, a language or string accepted (recognisable) by
FA.
8. Every FA can have regular expression.
9. FA can accept only non-regular sets.
10. (R*)* = R*.
11. Regular sets are closed under union.
12. Every finite subset of ∑* is a regular language.
13. Every regular language over ∑ is finite.
14. The regular expression (0 + 1)* 2* = 0* (1 + 2)*.
15. aa* + bb* = (a + b)*.
16. The Language {0,1}* - {0101} is regular.
Answers
1. (F) 2. (T) 3. (T) 4. (T) 5. (F) 6. (T) 7. (T) 8. (T) 9. (F) 10. (T)
11. (T) 12. (T) 13. (F) 14. (F) 15. (F) 16. (T)
Q.IV Answer the following Questions:
(A) Short Answer Questions:
1. Define regular language.
2. Define regular expression.
3. List any four identities of regular expression.
4. What is purpose of Pumping Lemma?
5. Develop the finite automata for the regular expression (a*ab + ba)*a.
6. Give closure properties of regular languages.
8. Find L = L1 L2 where
L1 = a* and L2 = b*.
9. Find L = L1 ∪ L2 where
L1 = {a^n b^n | n ≥ 1}
L2 = {a^n b^n c^i | n ≥ 1, i ≥ 1}
UNIVERSITY QUESTIONS AND ANSWERS
April 2016
1. Write the regular expression for the following FA: [1 M]
Ans. r = (1 + 0)*.0.
2. Give any two identities of regular expression. [1 M]
Ans. Refer to Section 2.1.3.
April 2017
1. Define regular expression. [1 M]
Ans. Refer to Section 2.1.
2. Which tool is used to prove that the language is not regular? [1 M]
Ans. Refer to Section 2.2.
3. Define Kleene Closure. [1 M]
Ans. Refer to Page 2.2, Point (2).
CHAPTER
3
3.0 INTRODUCTION
• A language consists of a finite or infinite set of sentences. Infinite language is specified by a
grammar.
• A grammar consists of a finite non-empty set of rules or productions, which specify the
syntax of language.
• A number of grammars may generate the same language while imposing different structures on
the sentences of that language.
• Even human languages have rules of grammar. For example, the English language has
a grammar, and that grammar has rules.
• Another method for language specification is to have a machine, called an acceptor,
determine whether a given sentence belongs to the language.
• In 1959, Chomsky catalogued the hierarchy of grammars according to the structure of their
productions, which we will discuss further.
• Context-Free Grammar (CFG) is a formal grammar which is used to generate all possible
patterns of strings in a given formal language.
• The context-free languages are the languages which can be represented by CFG.
3.1 GRAMMAR
• Grammar can be defined as a set of formal rules for generating syntactically correct sentences
of the particular language for which it is written.
Theoretical Computer Science Context-Free Grammars and Languages
• Any type of language, formal (like 'C') or natural (like English), is required to have a
grammar, which defines the syntactically correct statement formats. In other words, the
grammar is the syntactic definition of the language, i.e. the
grammar defines the syntax of a language.
• For example: If we want to generate an English statement "Dog runs", we have to use the
following rules:
< sentence > → < noun > < verb >
< noun > → Dog
< verb > → runs
• These rules describe how the sentence can be generated as 'noun' followed by 'verb' and so
on. There are many such rules which can be defined for English language and collectively
can be called as grammar for the language.
• Grammar normally consists of two types of basic elements, namely Terminal symbols and
Non-terminal symbols or variables.
1. Terminal symbols are those which are the constituents of the generated sentence, which
we have generated using a grammar. For example, in the above example, 'Dog' and 'runs'
are terminal symbols.
2. Non-terminal symbols are those which take part in the formation or generation of the
statement, but are not part of the generated statement like terminal symbols. For
example, 'sentence', 'noun' and 'verb' are non-terminals in the above example, which are
not in the generated statement, as
< sentence > → < noun > < verb > → Dog runs
where each arrow is read as "gives".
• The rules of the grammar are called as productions or production rules or syntactical rules.
Example:
Sentence = Omkar ate an apple.
We use the following rules for sentence.
<sentence> → <subject> <predicate>
<predicate> → <verb> <article> <noun>
<subject> → <noun>
<noun> → Omkar | apple
<verb> → ate
<article> → an
|ab| = 2
There are two terminal nodes arranged
from left to right.
Fig. 3.1
Example 2: Draw the derivation tree for the string "001100" using the following grammar
G:
G = ({S, A}, {0, 1}, {S → 0AS | 0, A → S1A | SS | 10}, S)
Solution: Since S is the start symbol, string is derived from S.
S ⇒ 0AS ⇒ 0S1AS ⇒ 001AS ⇒ 00110S ⇒ 001100
The derivation tree is shown in the Fig. 3.2.
Fig. 3.2
⇒ bbAA
⇒ bbaSA
⇒ bbaaBA
⇒ bbaabA
⇒ bbaaba
2. Rightmost derivation:
S ⇒ bA
⇒bbAA
⇒bbAa
⇒bbaSa
⇒bbaaBa
⇒bbaaba
Example 2: Consider the following grammar: G = ({S, A}, {a, b}, P, S)
Where, P is S → AbaaA | aA
A → Aa | Ab | ∈
Find the leftmost and rightmost derivation for the string “abaabb”.
Solution: 1. Leftmost derivation:
S ⇒ AbaaA
⇒AabaaA
⇒abaaA
⇒abaaAb
⇒abaaAbb
⇒abaabb
2. Rightmost derivation:
S ⇒ AbaaA
⇒AbaaAb
⇒AbaaAbb
⇒Abaabb
⇒Aabaabb
⇒abaabb
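The two derivations of Example 2 can be replayed step by step. The Python sketch below assumes, for illustration, that upper-case letters are non-terminals and that "" stands for ∈; the function name `derive` is not from the text. It applies each production to the leftmost (or rightmost) non-terminal and refuses a step that names a different non-terminal.

```python
def derive(start, steps, leftmost=True):
    """Apply productions (lhs, rhs) in order, always to the leftmost
    (or rightmost) non-terminal; returns the list of sentential forms."""
    form, trace = start, [start]
    for lhs, rhs in steps:
        variables = [i for i, ch in enumerate(form) if ch.isupper()]
        if not variables:
            raise ValueError(f"no non-terminal left in {form}")
        pos = variables[0] if leftmost else variables[-1]
        if form[pos] != lhs:
            raise ValueError(f"{lhs} is not the chosen non-terminal in {form}")
        form = form[:pos] + rhs + form[pos + 1:]
        trace.append(form)
    return trace

# Productions of Example 2: S -> AbaaA | aA, A -> Aa | Ab | epsilon
left = derive("S", [("S", "AbaaA"), ("A", "Aa"), ("A", ""),
                    ("A", "Ab"), ("A", "Ab"), ("A", "")])
right = derive("S", [("S", "AbaaA"), ("A", "Ab"), ("A", "Ab"),
                     ("A", ""), ("A", "Aa"), ("A", "")], leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

Both traces end in abaabb, and the intermediate forms are the ones listed in the solution.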
3.2.3 Reduction
• Reduction means derivation in reverse. The process starts from a sentence. It finds a substring
which matches the RHS of a production rule.
• The matched substring is called a handle; it is replaced with the LHS of the same production
rule. This is called reduction. The process repeats until we get the starting non-terminal symbol.
This process is called handle pruning.
• Reduction is used in bottom-up parsing.
3.3 CHOMSKY HIERARCHY [April 16, 17, Oct. 16, 17, 18]
• Linguist Noam Chomsky defined a hierarchy of languages, in terms of complexity. This four-
level hierarchy, called the Chomsky hierarchy, corresponds to four classes of machines.
• Each higher level in the hierarchy incorporates the lower levels i.e., anything that can be
computed by a machine at the lowest level can be computed by a machine at the next highest
level.
• The Chomsky hierarchy classifies grammars according to the form of their productions into
the following levels:
1. Type 0 (unrestricted grammar),
2. Type 1 (context-sensitive grammar),
3. Type 2 (context-free grammar), and
4. Type 3 (regular grammar).
• Let us see above grammars in detail.
1. Type 0 (Unrestricted Grammar ):
• There are no restrictions on the productions of grammar of this type. This type of grammar
permits productions of the form α → β with α ≠ ∈, where 'α' and 'β' are sentential forms i.e.
any combinations of any number of terminals and non-terminals i.e. α, β ∈ (V ∪ T)* but α
≠ ∈. Such grammar is called as unrestricted grammar.
• Unrestricted grammar generates the recursively enumerable languages or every type 0
language forms a recursively enumerable set. i.e. we can construct Turing Machine (TM) to
recognize the sentences generated by this type of grammar.
• For example: Grammar, G = (V, T, P, S)
where V = {S, B, C}, T = {a, b, c}
and P = S → SB
SB → BC
B→a
• Table 3.1 shows the Chomsky hierarchy of grammars and the machines which are acceptors
of the languages they generate.
Table 3.1: The Chomsky hierarchy of grammars and validating machines for languages
Type | Name of languages generated | Production restriction X → Y | Acceptor | Example of application
0 | Phrase-structure = recursively enumerable | X = any string with non-terminals; Y = any string | TM | Computers
1 | Context sensitive | X = any string with non-terminals; Y = any string as long as or longer than X | TM with bounded (not infinite) tape, called linear bounded automata (LBA) | Computers
2 | Context-free (CFL) | X = one non-terminal; Y = any string | PDA | Programming languages, statements, compilers
3 | Regular | X = one non-terminal; Y = tN or Y = t, where t is terminal and N is non-terminal | FA | Text editors
• From the above table we see that regular languages are accepted by the machine or
mathematical model called Finite Automata (FA), context-free languages (CFLs) are
accepted by Pushdown Automata (PDA), context-sensitive languages are accepted by Linear
Bounded Automata (LBA), and recursively enumerable languages are accepted by Turing Machines (TM).
• This form is called Backus Naur Form (BNF) since left hand side of the production rule
contains only one non-terminal.
S = start symbol of a grammar
• For example: Grammar of expression,
1. E → E + E | E * E | (E) | id
Here, V = {E}
T = {+, *, (, ), id}
There are 4 production rules and E is start symbol.
2. S → AB
A → a
B → b
Here, V = {S, A, B}
T = {a, b}
There are three production rules and S is start symbol.
Conventions Regarding Grammars (used in this Chapter):
1. The capital letters A, B, C, D, E and S denote variables, S is the start symbol.
2. Lower case letters a, b, c, d, e and digits are terminals.
3. The capital letters X, Y and Z denote symbols that may be either terminal or variable.
4. The lower case letters u, v, w, x, y and z denote string of terminals.
5. The lower case Greek letters α, β and γ denote string of variables and terminals i.e. (V
∪ T)*.
6. "→" symbol stands for production, For Example: S → a.
7. "⇒" symbol stands for process of derivation.
"⇒*" stands for deriving in any number of steps.
8. If A → α1, A → α2 … A → αk are productions for variable A of some grammar then we
can express it using notation A → α1 | α2 | α3 | … |αk.
3.4.2 Examples
• Consider the grammar:
S → AB
A → a
B → b
Here, S → AB, A → a, B → b are production rules.
S ⇒ AB ⇒ aB ⇒ ab (the string ab is derived from S; this is called a derivation).
And S ⇒* ab denotes the derivation of ab from S in any number of steps.
(a) Parse Tree for Leftmost Derivation (b) Parse Tree for Rightmost Derivation
Fig. 3.3: Parse Tree
Since the word "ab" has only one parse tree, the grammar is unambiguous
(only one leftmost derivation is possible).
Hence, we can say that if some string has more than one leftmost or rightmost derivation, then
the CFG is ambiguous.
Example 2: Consider the following CFG:
S → aB | aA
A → aAB | a | b
B → Abb | b
The leftmost derivation for the string "aaabbbbb"
S ⇒ aA ⇒ aaAB ⇒ aaaABB ⇒ aaabBB ⇒ aaabAbbB ⇒ aaabbbbB ⇒ aaabbbbb
Fig. 3.5
Clearly the grammar is ambiguous. So find the precedence and associativity of each operator.
Operator Precedence Associativity
+, – 2 left
*, / 4 left
↑ (exponent) 6 right
• The first level of production rules contains the lowest precedence operator. Lower priority
symbols are closer to the start symbol.
• Add new non-terminal symbols T and F in the above CFG.
• If operator is left associative, the production rule of that grammar is left recursive.
• If operator is right associative, then the production rule of that grammar is right
recursive.
Consider +, – which are having lower priority and left associativity, so production rule
should be left recursive.
E → E+T|E–T
E → T
Then for operator *, /, productions are added.
T → T * F | T/F
T → F
id has no associativity.
F → id
So unambiguous grammar becomes,
G = (V, T, P, S)
where V = {E, T, F} T = {+, –, *, /, id}
P is E → E+T|E–T|T
T → T * F | T/F | F
F → id
Now consider string id + id * id, we get the following derivation.
E ⇒ E+T
⇒ T+T
⇒ F+T
⇒ id + T
⇒ id + T * F
⇒id + F * F
⇒id + id * F
⇒id + id * id
Fig. 3.6
So the above grammar is unambiguous.
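The effect of layering the grammar by precedence can be seen in a small evaluator. The Python sketch below is one possible realisation, not the method of the text: integer literals stand in for id, and each left-recursive rule (E → E + T, T → T * F) is written as a loop, which is a usual way to keep left associativity in a recursive-descent parser. All function names are illustrative.

```python
import re

def tokenize(text):
    """Split the input into integer literals and the operators + - * /."""
    return re.findall(r"\d+|[-+*/]", text)

def parse_expr(tokens, pos=0):
    """E -> E + T | E - T | T, with the left recursion unrolled into a loop."""
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in "+-":
        op = tokens[pos]
        rhs, pos = parse_term(tokens, pos + 1)
        value = value + rhs if op == "+" else value - rhs
    return value, pos

def parse_term(tokens, pos):
    """T -> T * F | T / F | F, again as a loop."""
    value, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in "*/":
        op = tokens[pos]
        rhs, pos = parse_factor(tokens, pos + 1)
        value = value * rhs if op == "*" else value / rhs
    return value, pos

def parse_factor(tokens, pos):
    """F -> id (an integer literal stands in for id)."""
    return int(tokens[pos]), pos + 1

print(parse_expr(tokenize("2 + 3 * 4"))[0])   # 14: * binds tighter than +
print(parse_expr(tokenize("8 - 3 - 2"))[0])   # 3: - associates to the left
```

Because T is reached only through E, every * or / is grouped before the surrounding + or –, which is the precedence the grammar encodes.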
EXAMPLES
Example 1: Consider CFG, G = ({S}, {a}, {S → aS, S → ∈}, S). Find the language L(G).
Solution: Firstly consider production S → ∈.
So S ⇒ ∈. Thus ∈ is in L(G).
Now for all n ≥ 1,
1. S ⇒ aS ⇒ a.
2. S ⇒ aS ⇒ aaS ⇒ aa
3. S ⇒ aS ⇒ aaS ⇒ aaaS ⇒ aaa
4. S ⇒ aS ⇒ aaS ⇒ aaaS ⇒ aaaaS ⇒ aaaa and so on.
The string a^n comes from n applications of production 1 followed by one application of
production 2.
The language generated by this CFG is a*.
Thus, L(G) = {∈, a, aa, …}.
Example 2: Find CFL associated with CFG given below:
G = ({S}, {0}, {S → SS}, S).
Solution: In this case, L(G) = φ. This is because only production in G is S → SS and in
production there is no terminal symbol.
Example 3: Find CFL associated with CFG
G = ({S}, {a, b}, {S → aSb | ab}, S)
Solution: 1. S ⇒ ab (using production 2)
2. S ⇒ aSb ⇒ aabb
3. S ⇒ aSb ⇒ aaSbb ⇒ aaabbb and so on.
Thus language L(G) = {ab, aabb, aaabbb, …}
i.e. L(G) = {a^n b^n | n ≥ 1}
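The languages of Examples 1 to 3 can also be listed by a program that expands sentential forms breadth-first. The Python sketch below is a bounded enumeration under stated assumptions: upper-case letters are non-terminals, "" stands for ∈, and the parameter `slack` (a hypothetical cap on how far a sentential form may exceed the length bound) is needed so that rules such as S → SS terminate; grammars with many vanishing non-terminals may need a larger value.

```python
from collections import deque

def language_up_to(productions, start, max_len, slack=4):
    """Terminal strings of length <= max_len derivable from start."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # position of the leftmost non-terminal, if any
        idx = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if idx is None:
            found.add(form)
            continue
        for rhs in productions.get(form[idx], []):
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(not ch.isupper() for ch in new)
            # prune forms that are already too long to yield a short word
            if (terminals <= max_len and len(new) <= max_len + slack
                    and new not in seen):
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda w: (len(w), w))

print(language_up_to({"S": ["aS", ""]}, "S", 3))      # Example 1
print(language_up_to({"S": ["SS"]}, "S", 4))          # Example 2
print(language_up_to({"S": ["aSb", "ab"]}, "S", 6))   # Example 3
```

The three calls give ['', 'a', 'aa', 'aaa'], [] and ['ab', 'aabb', 'aaabbb'], in agreement with a*, φ and {a^n b^n | n ≥ 1} up to the chosen bounds.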
Example 10: Construct a CFG for each of the language defined by the following regular
expressions: (i) ab*, (ii) a* b*, (iii) (baa + abb)*.
Solution: 1. The required CFG for ab* is
S → aB
B → bB | ∈
2. The required CFG for a* b* is
S → AB
A → aA | ∈
B → bB | ∈
3. The required CFG for (baa + abb)* is
S → AS | BS | ∈
A → baa
B → abb
where 'S' is start symbol and T = {a, b}.
Example 11: Construct a grammar for L = {a^n b^n c^m | n ≥ 1, m ≥ 0}.
Solution: The required grammar will be
G = ({S, A}, {a, b, c}, P, S)
where P = {S → A | Sc
A → ab | aAb }
Example 12: Construct a CFG for language L = L1 ∩ L2
where L1 = {a^n b a^n | n ≥ 1}
L2 = All strings having odd length over {a, b}.
Solution: Every string a^n b a^n in L1 has odd length 2n + 1, therefore, L1 ∩ L2 = L1.
Required CFG for L is ({S}, {a, b}, {S → aSa | aba}, S)
Note: All the grammars we discussed in this chapter are CFGs. The property of a CFG is that
all productions are of the form: one non-terminal → string of T and V. This form of production
is called Backus Naur Form (BNF).
Example 13: Let G = ({S, A}, {a, b}, P, S),
where P is S → aAa
A → aAa | b construct the language.
Solution:
S ⇒ aAa ⇒ aba
S ⇒ aAa ⇒ aaAaa ⇒ aabaa
S ⇒ aAa ⇒ aaAaa ⇒ aaaAaaa
⇒ aaabaaa
:
:
L (G) = {a^n b a^n | n ≥ 1}
Example 14: Construct a CFG for a language in which all strings with no consecutive b's but
may or may not with consecutive a's.
Solution: Here, L = {∈, a, b, aa, ba, ab, ......}
The grammar G will be
G = ({S, A}, {a, b}, P, S)
where P is S → aS | bA | a | b | ∈
A → aS | a | ∈
Example 15: Construct the CFG for the language containing the strings with at least one occurrence of “aaa”
over {a, b}.
Solution: G = ({S, A}, {a, b}, P, S)
where P is S → AaaaA
A → aA | bA | ∈
Example 16: Find the CFL generated by the following grammar.
S → AB
A → aA | bB | a
B → Ba | Bb | a
Solution: 1.
S ⇒ AB ⇒ aa
2. S ⇒AB
⇒aAB ⇒ aaa
3. S ⇒AB
⇒bBB
⇒bBaB
⇒baaa
So, L = {aa, aaa, baaa, baab, ……}
L = {w | w containing atleast one occurrence of two consecutive a's}
Example 17: Write CFG for a language containing strings having at least one occurrence of
“00” over {0, 1}. [April 16]
Solution: The regular expression for language is
(0 + 1)* 00 (0 + 1)*
So CFG is S → ABA
A → 0A | 1A | ∈
B → 00
OR S → A00A
A → 0A | 1A | ∈
Fig. 3.7
∴ 001100 is the yield of above language.
• The productions in a context-free grammar that contain useless symbols are called useless
productions. The grammar that we obtain after deleting useless production rules is
called a reduced context-free grammar.
• Let G = (V, T, P, S) be a grammar. A symbol X is useful if there is a derivation
S ⇒* αXβ ⇒* w for some α, β and w, where w ∈ T*; otherwise X is useless.
Symbol X is useful if
1. Some string must be derivable from X.
2. 'X' must appear in the derivation of at least one string derivable from 'S' (start symbol).
3. It should not occur in any sentential form that contains a variable from which no
terminal string can be derived.
Example 1: Consider the grammar,
S → AB | a
A → a
Solution: Here, variable 'B' is not deriving any string of terminals. Thus, it is an useless
symbol. For removing 'B' to get reduced grammar, we should delete all productions for which 'B'
is appearing on the right hand side. Hence, we drop only S → AB to get simplified grammar
without useless symbol as
S → a
A → a
Here 'A' derives a string of terminals, but after dropping S → AB the symbol 'A' cannot occur in any derivation of
a string derivable from 'S'. Therefore A is also a useless symbol.
Hence, the simplified grammar without useless symbol is
G = ({S}, {a}, {S → a}, S)
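The two checks used in this example (does a symbol derive a terminal string, and can it be reached from S) can be written as a short procedure. The Python sketch below is one way to do it, under the assumptions that upper-case letters are non-terminals and "" stands for ∈; the function name `remove_useless` and the dictionary layout are illustrative. The order matters: non-generating symbols are removed first, unreachable ones afterwards.

```python
def remove_useless(productions, start):
    """productions: dict non-terminal -> list of right-hand sides (strings)."""
    def usable(body, allowed):
        return all(not ch.isupper() or ch in allowed for ch in body)

    # Step 1: non-terminals that derive some terminal string ("generating")
    generating, changed = set(), True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in generating and any(usable(b, generating) for b in bodies):
                generating.add(head)
                changed = True
    kept = {head: [b for b in bodies if usable(b, generating)]
            for head, bodies in productions.items() if head in generating}

    # Step 2: non-terminals reachable from the start symbol
    reachable, todo = {start}, [start]
    while todo:
        for body in kept.get(todo.pop(), []):
            for ch in body:
                if ch.isupper() and ch not in reachable:
                    reachable.add(ch)
                    todo.append(ch)
    return {head: bodies for head, bodies in kept.items() if head in reachable}

print(remove_useless({"S": ["AB", "a"], "A": ["a"]}, "S"))
```

For the grammar of Example 1 the result is {'S': ['a']}: B is dropped because it derives nothing, and A is then dropped because it is no longer reachable from S.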
Example 2: Consider a grammar,
S → AB | BC
A → aAa | aAb
B → bB | b
D → dD | d
Solution: Consider production S → AB. As 'A' cannot derive any terminal string, it is a
useless symbol. So drop production S → AB. Similarly, for S → BC, symbol 'C' does not derive
any terminal string, drop S → BC.
Hence start symbol itself does not derive any string. Grammar is useless.
Example 3: Construct a grammar without useless symbols for the grammar
S → AB | CA
B → BC | AB
A → a
C → aB | b
D → SS | d
Solution: Consider S → AB, here B is useless because it cannot derive any terminal string.
Hence remove all productions which contain 'B' in RHS of production.
• A production of the form A → B, where A and B are both non-terminals, is called a unit
production. All other productions, including ∈-productions, are non-unit productions.
• Elimination Rule: If A → B is a unit production or if there is a chain of unit productions
leading from A to B such as A ⇒ X1 ⇒ X2 ⇒ … B where all Xi's are non-terminals, then
introduce new production(s) according to the following rule.
If the non-unit productions for 'B' are
B → α1 | α2 | …
where α1, α2 … are all sentential forms (not containing only one non-terminal)
then, create the productions for 'A' as
A → α1 | α2 | …
Example 1: Consider the grammar,
S → A | bb
A → B|b
B → S|a
Solution: As we can see there is a chain of unit productions S → A → B → S.
As A → B and B → S | a, we can add new productions to A as
A → S|a
Hence the grammar becomes,
S → A | bb
A → S|a|b
Still there is a unit production S → A, removing this we get S → S | a | b | bb.
There is one more unit production, S → S. It can be removed directly because both sides are
the same symbol.
Therefore, the equivalent grammar without unit production will be,
S → a | b | bb
Example 2: Consider the grammar G as,
A → B
B → a|b
Find the equivalent grammar without unit production.
Solution: By rule, B → α1 | α2 where α1 = a and α2 = b
∴ Reduced grammar without unit production is A → a | b.
Example 3: Consider the grammar G as,
S → Saab | A
A → Sbba | B
B → aS | bS | a | b
Solution: First substitute all B-productions in A
S → Saab | A
A → Sbba | aS | bS | a | b
Now substitute all A-productions in S.
S → Saab | Sbba | aS | bS | a | b
This is the grammar without unit production.
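The substitution carried out in these examples can be sketched in Python. This is a minimal sketch under assumptions: the name eliminate_unit and the dictionary encoding (variable mapped to a list of bodies, each body a tuple of symbols) are illustrative choices, not notation from the text.

```python
def eliminate_unit(grammar):
    """grammar: dict mapping each variable to a list of bodies (tuples)."""
    variables = set(grammar)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    result = {}
    for a in grammar:
        # Variables reachable from a through chains of unit productions.
        reach, stack = {a}, [a]
        while stack:
            x = stack.pop()
            for body in grammar[x]:
                if is_unit(body) and body[0] not in reach:
                    reach.add(body[0])
                    stack.append(body[0])
        # a inherits every non-unit body of every variable in the chain.
        bodies = []
        for b in grammar:
            if b in reach:
                for body in grammar[b]:
                    if not is_unit(body) and body not in bodies:
                        bodies.append(body)
        result[a] = bodies
    return result
```

For the grammar of Example 3, the S entry of the result contains exactly Saab, Sbba, aS, bS, a and b. The sketch keeps the A and B entries as well; whether they are still needed is a separate question of useless symbols.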
3.7.1 Chomsky Normal Form (CNF) [April 16, 17, 18, 19, Oct. 17, 18]
• In CNF, there are restrictions on the length of the right-hand side and on the type of symbols used on
the right-hand side of production rules.
• Any CFL without ∈ is generated by a grammar in which all productions are of the form A →
BC or A → a, where A, B and C are variables and 'a' is a terminal. This type of grammar is
said to be in CNF.
• Note that any CFL that does not contain ∈ as a word has a CFG in CNF that generates
exactly that language.
• However, if the CFL contains ∈, then when we convert the CFG into CNF, the ∈-word drops
out of the language while all other words stay the same.
Example 1: Convert the following CFG into CNF:
S → aSa | bSb | a | b | aa | bb
Solution: In CNF, we have only two types of productions A → BC or A → a.
We add two productions A → a and B → b and get,
S → ASA | BSB | a | b | AA | BB
A → a
B → b
Consider S → ASA which is not in CNF
S → D1A
D1 → AS
Similarly, S → BSB which is not in CNF
S → D2B
D2 → BS
Thus the equivalent grammar in CNF is
S → D1A | D2B | a | b | AA | BB
A → a
B → b
D1 → AS
D2 → BS
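A quick way to confirm that a converted grammar really is in CNF is to test every production against the two allowed shapes, A → BC and A → a. The following is a minimal sketch; the name is_cnf and the dictionary encoding of the grammar are assumptions made for illustration.

```python
def is_cnf(grammar):
    """grammar: dict mapping each variable to a list of bodies (tuples).
    Every key of the dict is treated as a variable; anything else is a terminal."""
    variables = set(grammar)
    for bodies in grammar.values():
        for body in bodies:
            two_variables = len(body) == 2 and all(s in variables for s in body)
            one_terminal = len(body) == 1 and body[0] not in variables
            if not (two_variables or one_terminal):
                return False
    return True
```

The original grammar of Example 1 fails the check because of S → aSa, while the final grammar with D1 and D2 passes it.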
Example 2: Convert the following grammar into CNF:
S → aAab | Aba
A → aS | bB
B → ASb | a
Solution: Thus the required grammar in CNF is
S → CaD1 | AD3
D1 → AD2
D2 → Ca Cb
D3 → Cb Ca
A → Ca S | Cb B
B → A D4 | a
D4 → S Cb
Ca → a
Cb → b
Example 3: Convert the following grammar into CNF,
S → ABA
A → aA | ∈
B → bB | ∈
Solution: First remove ∈-productions to obtain a grammar for L(G) – {∈}. After removing
∈-productions as per the rule, we get the new grammar
S → ABA | AB | BA | AA | A | B
A → aA | a
B → bB | b
Then by eliminating unit-productions S → A and S → B, we get grammar as,
S → ABA | AB | BA | AA | aA | a | bB | b
A → aA | a
B → bB | b
Now, convert the grammar into CNF.
So final grammar in CNF will be as follows:
S → D1 A | AB | BA | AA | Ca A | a | Cb B | b
A → Ca A | a
B → Cb B | b
D1 → AB
Ca → a
Cb → b
Example 4: Convert the following grammar into CNF,
S → bA | aB
A → bAA | aS | a
B → aBB | bS | b
3.7.2 Greibach Normal Form (GNF) [Oct. 16, 17, 18, April 17, 19]
• In GNF, there is a restriction on the positions in which terminals and variables can appear on
the right-hand side of production rules.
• In GNF, the right-hand side of every production must start with a single terminal, followed by
any number of variables.
• Every CFL L without ∈ can be generated by a grammar in which every
production is of the form A → aα, where 'A' is a variable, 'a' is a terminal and α is a
(possibly empty) string of variables only. A grammar of this type is said to be in GNF.
• Lemma 1: A production with variable A on the left is called an A-production.
Let G = (V, T, P, S) be a CFG. Let A → α1Bα2 be a production in P and B → β1 | β2 | … | βr
be the set of all B-productions. Let G1 = (V, T, P1, S) be obtained by deleting the production
A → α1Bα2 from P and adding the productions A → α1β1α2 | α1β2α2 | … | α1βrα2.
Then L(G) = L(G1).
• Lemma 2: Let G = (V, T, P, S) be a CFG. Let A → Aα1 | Aα2 | … | Aαr be the set of
A-productions for which A is the leftmost symbol of the RHS. Let A → β1 | β2 | … | βs be the
remaining A-productions. Let G1 = (V ∪ {B}, T, P1, S) be the CFG formed by adding the
variable B to V and replacing all the A-productions by the productions:
1. A → βi and A → βiB, for 1 ≤ i ≤ s
2. B → αi and B → αiB, for 1 ≤ i ≤ r
Then, L(G) = L(G1)
Note : Grammar should be in CNF before converting into GNF.
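The replacement described in Lemma 2 is mechanical, so it can be sketched as a small Python function. This is an illustrative sketch only: the name remove_left_recursion, the tuple encoding of bodies and the caller-supplied name for the new variable are assumptions.

```python
def remove_left_recursion(a, bodies, new_var):
    """Apply Lemma 2 to the A-productions of variable a.
    bodies: list of tuples of symbols; new_var: name of the added variable B."""
    # Bodies of the form A alpha_i contribute alpha_i; the rest are the beta_i.
    alphas = [body[1:] for body in bodies if body and body[0] == a]
    betas = [body for body in bodies if not body or body[0] != a]
    if not alphas:
        return {a: list(bodies)}  # nothing to do, no left recursion
    a_rules = betas + [beta + (new_var,) for beta in betas]
    b_rules = alphas + [alpha + (new_var,) for alpha in alphas]
    return {a: a_rules, new_var: b_rules}
```

For instance, calling it on A2 → A2 A2 A1 | aA1 | b with the new variable B2 gives A2 → aA1 | b | aA1B2 | bB2 and B2 → A2 A1 | A2 A1 B2.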
Example 1: Construct a grammar in GNF equivalent to grammar
S → AA | a
A → SS | b
Solution: Observe that the CFG is in CNF. If we rename S as A1 and A as A2 respectively,
then productions will be
A1 → A2 A2 | a
A2 → A1 A1 | b
We leave A2 → b as it is in the required form.
Now, consider A2 → A1 A1. To convert this we use Lemma 1 and replace the first A1 on the
RHS of A2 → A1 A1 by the bodies of the A1-productions, which gives
A2 → A2 A2 A1
A2 → aA1
(Note : Here we have been considering production A2 → A1 A1 because production is in the
form Ai → Aj α, j < i).
Now the production A2 → aA1 is in the required form.
But we need Lemma 2 for A2 → A2 A2 A1, as it is of the form A → Aα.
The A2-productions to which Lemma 2 is applied are,
A2 → A2 A2 A1 | aA1 | b
Here, β 1 = aA1, β 2 = b, α = A2 A1
∴ Adding new non-terminal B2, we get,
A2 → aA1 | b
A2 → aA1B2 | bB2
B2 → A2 A1
B2 → A2 A1 B2
Now, all A2 productions are in the required form.
Now we will have to consider A1 production,
A1 → A2 A2 | a
Applying Lemma 1, i.e. replacing the leading A2 on the RHS of A1 → A2 A2 by all A2-productions, we get,
A1 → aA1A2 | bA2 | aA1B2A2 | bB2A2 | a
We get, A3 → A3 A1 A4 | 1
Now, applying Lemma 2, we get,
A3 → 1 | 1 B1
B1 → A1 A4 | A1 A4 B1
Now convert all productions in GNF.
A3 and A4 productions are already in GNF.
Applying Lemma 1 on A2 productions we get,
A2 → 1 A1 | 1 B1 A1
Applying Lemma 1 on A1 productions, we get,
A1 → 1 A1 A3 | 1 B1 A1 A3 | 1 A1 A4 | 1 B1 A1 A4 | 1
We get B1 productions as
B1 → 1 A1 A3 A4 | 1 B1 A1 A3 A4 | 1 A1 A4 A4 | 1 B1 A1 A4 A4 | 1 A4 |
1 A1 A4 A4 B1 | 1 B1 A1 A4 A4 B1 | 1 A1 A3 A4 B1 | 1 B1 A1 A3 A4 B1 |
1 A4 B1
Thus the required grammar in GNF is
A1 → 1 A1 A3 | 1 B1 A1 A3 | 1 A1 A4 | 1 B1 A1 A4 | 1
A2 → 1 A1 | 1 B1 A1
A3 → 1 | 1 B1
A4 → 1
B1 → 1 A1 A3 A4 | 1 B1 A1 A3 A4 | 1 A1 A4 A4 | 1 B1 A1 A4 A4 |
1 A4 | 1 A1 A4 A4 B1 | 1 B1 A1 A4 A4 B1 | 1 A1 A3 A4 B1 |
1 B1 A1 A3 A4 B1 | 1 A4 B1
Example 3: Convert the following grammar G to GNF:
A1 → A2 A3
A2 → A3 A1 | b
A3 → A1 A2 | a
Solution: Given grammar is in CNF.
Only the production A3 → A1 A2 is not in the required form,
since j < i (i.e. 1 < 3).
Applying Lemma 1 on the A3 production, we get,
A1 → A2 A3
A2 → A3 A1 | b
A3 → A2 A3 A2 | a
3.9 LEFT LINEAR AND RIGHT LINEAR GRAMMARS [April 16, 18]
• Regular grammar is classified as Left-linear grammar and Right-linear grammar.
o Left-linear grammar (Definition): If all productions of a CFG are of the form A →
Bw, or A → w or A → ∈ where A, B are variables and w is a string of terminals then we
say that grammar is a left-linear grammar.
o Right linear grammar (Definition): If all productions of a CFG are of the form A →
wB or A → w or A → ∈ where A, B and w are same as above then grammar is called a
right-linear grammar.
• A grammar that is either right-linear or left-linear is called a regular grammar. A grammar
in which at most one non-terminal can occur on the right side of any production
rule is called a linear grammar.
Example 1: The language 0 (10)* is generated by the right-linear grammar S → 0 A,
A → 10 A | ∈ and left-linear grammar S → S 10 | 0.
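That the two grammars of Example 1 describe the same language can be checked on short strings by enumerating derivations. The sketch below is illustrative: the name generate, the string encoding of bodies (upper-case letters are variables, the empty string stands for ∈) and the length bound are assumptions, and the search is only guaranteed to stop for grammars, such as these, in which every production adds a terminal or removes the variable.

```python
from collections import deque

def generate(grammar, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.
    grammar: dict mapping a variable (one upper-case letter) to a list of
    bodies written as strings; "" stands for the empty string."""
    seen, words = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the first variable in the sentential form, if any.
        idx = next((i for i, c in enumerate(form) if c.isupper()), None)
        if idx is None:
            words.add(form)
            continue
        for body in grammar[form[idx]]:
            new = form[:idx] + body + form[idx + 1:]
            terminals = sum(not c.isupper() for c in new)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

right_linear = {"S": ["0A"], "A": ["10A", ""]}
left_linear = {"S": ["S10", "0"]}
```

Both grammars yield the strings 0, 010 and 01010 when the bound is 5, which agrees with the regular expression 0 (10)*.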
Example 2:
A → a, A → aB, A → ∈
where, A and B are non-terminals and a is terminal.
Example 3:
S → 00A | 11S
A → 0A | 1A | 0 | 1
where, S and A are non-terminals and 0 and 1 are terminals.
Left-linear grammar examples:
A → a, A → Ba, A → ∈
where,
A and B are non-terminals, a is terminal and ∈ is empty string.
S → A00 | S11
A → A0 | A1 | 0 | 1
where, S and A are non-terminals and 0 and 1 are terminals.
Fig. 3.8
Fig. 3.9
Solution: Equivalent regular grammar is
G = ({q0, q1, q2}, {a, b}, p, q0)
where p is {q0 → b q0 | a q1
q1 → b q1 | a q2 | a
q2 → a q1 | b q2 | b}
Example 3: Construct regular grammar for the DFA shown in the Fig. 3.10.
Fig. 3.10
Fig. 3.11
Fig. 3.12
Fig. 3.13
EXAMPLES
Example 1: Construct regular grammar for a language over {a, b} consisting of all strings
starting with a and having ba as a substring in it.
Fig. 3.14
Solution:
(i) δ (q0, a) = q1 q1 ∉ F q0 → aq1
δ (q0, b) = φ
(ii) δ (q1, a) = q1 q1 ∉ F q1 → aq1
δ (q1, b) = q2 q2 ∉ F q1 → bq2
(iii) δ (q2, a) = q3 q3 ∈ F q2 → aq3 | a
δ (q2, b) = q2 q2 ∉ F q2 → bq2
(iv) δ (q3, a) = q3 q3 ∈ F q3 → aq3 | a
δ (q3, b) = q3 q3 ∈ F q3 → bq3 | b
∴ G = ({q0, q1, q2, q3}, {a, b}, P, q0)
P is:
q0 → aq1
q1 → aq1 | bq2
q2 → aq3 | bq2 | a
q3 → aq3 | bq3 | a | b
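The rule applied in the table above (for δ(q, a) = p add q → ap, and also add q → a when p is a final state) can be sketched in Python. The function name dfa_to_grammar and the dictionary encoding of δ are assumptions made for illustration; the sketch also does not add an ∈-production, which would be needed if the start state were final.

```python
def dfa_to_grammar(delta, finals):
    """delta: dict mapping (state, symbol) to the next state.
    Undefined transitions (φ in the text) are simply absent from the dict.
    Returns a dict mapping each state to its right-linear bodies."""
    productions = {}
    for (state, symbol), target in delta.items():
        rules = productions.setdefault(state, [])
        rules.append(symbol + target)      # q -> a p
        if target in finals:
            rules.append(symbol)           # q -> a, because p is final
    return productions

# Transitions of the DFA used in Example 1 (q3 is the only final state).
delta_example1 = {
    ("q0", "a"): "q1",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q3", ("q2", "b"): "q2",
    ("q3", "a"): "q3", ("q3", "b"): "q3",
}
```

Running the sketch on delta_example1 with finals {"q3"} reproduces the productions P listed above, up to the order of the alternatives.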
Example 2: Construct regular grammar for the language over {0, 1} of strings that start with 00,
end with 1 and have 10 as a substring.
Fig. 3.15
Solution:
(i) δ (q0, 0) = q1 q1 ∉ F q0 → 0q1
δ (q0, 1) = φ
(ii) δ (q1, 0) = q2 q2 ∉ F q1 → 0q2
δ (q1, 1) = φ
(iii) δ (q2, 0) = q2 q2 ∉ F q2 → 0q2
δ (q2, 1) = q3 q3 ∉ F q2 → 1q3
(iv) δ (q3, 0) = q4 q4 ∉ F q3 → 0q4
δ (q3, 1) = q3 q3 ∉ F q3 → 1q3
(v) δ (q4, 0) = q4 q4 ∉ F q4 → 0q4
δ (q4, 1) = q5 q5 ∈ F q4 → 1q5 | 1
(vi) δ (q5, 0) = q4 q4 ∉ F q5 → 0q4
δ (q5, 1) = q5 q5 ∈ F q5 → 1q5 | 1
∴ G = ({q0, q1, q2, q3, q4, q5}, {0, 1}, P, q0)
∴ P is:
q0 → 0q1
q1 → 0q2
q2 → 0q2 | 1q3
q3 → 0q4 | 1q3
q4 → 0q4 | 1q5 | 1
q5 → 0q4 | 1q5 | 1
Example 3: Construct regular grammar for L = L1 ∩L2.
L1 = All strings over {a, b, c} having equal no. of a's and c's
L2 = {a^n b c^n | n ≥ 0 and n ≤ 5}
Solution:
Fig. 3.16
(i) δ (q0, a) = q1 q1 ∉ F q0 → aq1
δ (q0, b) = φ
δ (q0, c) = φ
(ii) δ (q1, a) = q4 q4 ∉ F q1 → aq4
δ (q1, b) = q2 q2 ∉ F q1 → bq2
δ (q1, c) = φ
(iii) δ (q2, a) = φ
δ (q2, b) = φ
δ (q2, c) = q3 q3 ∈ F q2 → cq3 | c
(iv) δ (q3, a) = φ
δ (q3, b) = φ
δ (q3, c) = φ
(v) δ (q4, a) = q7 q7 ∉ F q4 → aq7
δ (q4, b) = q5 q5 ∉ F q4 → bq5
δ (q4, c) = φ
(vi) δ (q5, a) = φ
δ (q5, b) = φ
δ (q5, c) = q6 q6 ∉ F q5 → cq6
(vii) δ (q6, a) = φ
δ (q6, b) = φ
δ (q6, c) = q3 q3 ∈ F q6 → cq3 | c
(viii) δ (q7, a) = q10 q10 ∉ F q7 → aq10
δ (q7, b) = q8 q8 ∉ F q7 → bq8
δ (q7, c) = φ
(ix) δ (q8, a) = φ
δ (q8, b) = φ
δ (q8, c) = q9 q9 ∉ F q8 → cq9
(x) δ (q9, a) = φ
δ (q9, b) = φ
δ (q9, c) = q6 q6 ∉ F q9 → cq6
Similarly, we get,
q10 → aq13 | bq11
q11 → cq12
q12 → cq9
q13 → bq14
q14 → cq15
q15 → cq12
∴ G = ({q0, q1, q2, q3, q4, q5, q6, q7, q8, q9, q10, q11, q12, q13, q14, q15}, {a, b, c}, P, q0)
P is:
q0 → aq1
q1 → aq4 | bq2
q2 → cq3 | c
q4 → aq7 | bq5
q5 → cq6
q6 → cq3 | c
q7 → aq10 | bq8
q8 → cq9
q9 → cq6
q10 → aq13 | bq11
q11 → cq12
q12 → cq9
q13 → bq14
q14 → cq15
q15 → cq12
Example 4: Construct a regular grammar for a language over {a, b, c} starting with a and
having odd no. of b's.
Fig. 3.17
(i) δ (q0, a) = q1 q1 ∉ F q0 → aq1
δ (q0, b) = φ
δ (q0, c) = φ
(ii) δ (q1, a) = q1 q1 ∉ F q1 → aq1
δ (q1, b) = q2 q2 ∈ F q1 → bq2 | b
δ (q1, c) = q1 q1 ∉ F q1 → cq1
(iii) δ (q2, a) = q2 q2 ∈ F q2 → aq2 | a
δ (q2, b) = q1 q1 ∉ F q2 → bq1
δ (q2, c) = q2 q2 ∈ F q2 → cq2 | c
The grammar is: G = ({q0, q1, q2}, {a, b, c}, P, q0)
P is:
q0 → aq1
q1 → aq1 | bq2 | cq1 | b
q2 → aq2 | bq1 | cq2 | a | c
Example 5: Construct regular grammar for the language {a^(2n) | n ≥ 1}.
Solution: Explanation.
When n = 1
a^(2n) = a^(2 × 1) = a^2 = aa
When n = 2
a^(2 × 2) = a^4 = aaaa
When n = 3
a^(2 × 3) = a^6 = aaaaaa
∴ We draw a DFA which accepts only a non-zero even number of a's.
Fig. 3.18
∴ Regular grammar is
q0 → aq1
q1 → aq2 | a
q2 → aq1
EXAMPLES
Example 1: Define nullable symbol.
Solution: If N is any non-terminal in a CFG and N ⇒* ∈ (in particular, if N → ∈ is a production), then N is nullable.
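The set of nullable symbols can be computed by repeatedly marking every variable that has a body consisting only of nullable symbols. The following is a minimal sketch; the name nullable_symbols and the encoding (an empty tuple stands for an ∈-production) are assumptions made for illustration.

```python
def nullable_symbols(grammar):
    """grammar: dict mapping each variable to a list of bodies (tuples).
    The empty tuple () encodes an epsilon production."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A body of only nullable symbols (or an empty body) makes head nullable.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable
```

For the grammar S → ABA, A → aA | ∈, B → bB | ∈, the sketch reports S, A and B as nullable, since A and B derive ∈ directly and S derives ∈ through ABA.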
Example 2: What is the yield of following derivation tree?
Fig. 3.19
Solution: The yield of derivation tree is abba.
Example 3: Convert the following grammar into GNF.
S → AB
A → BS | b
B → SA | a
Solution: Substituting S-productions in B, we get,
B → ABA | a
Substitute A-productions in B.
B → BSBA | bBA | a
Grammar becomes
S → AB
A → BS | b
B → BSBA | bBA | a
3. B → aA bA
B → Ca A Cb A
B → XQ
X → Ca A (already added)
Q → Cb A
4. B → aAB
B → Ca A B
B → XB
5. A → Ca B
6. A → Ca B Cb
A → RCb
R → Ca B
∴ Equivalent grammar in CNF is
S → XY | PS
X → Ca A
Y → Cb B
P → BCb
B → XQ | XB | b
Q → Cb A
A → Ca B | RCb | a
R → Ca B
Ca → a
Cb → b
Example 5: Define useless symbols.
Solution: Let G = (V, T, P, S) be a grammar. A symbol X is useful if there is a derivation
S ⇒* αXβ ⇒* w, where α, β ∈ (V ∪ T)* and w ∈ T*.
Otherwise X is useless.
Example 6: What is the yield of following derivation tree?
Fig. 3.20
Solution: The yield is aaba.
Example 7: Construct a CFG for the language over {a, b} of all strings with an equal number of
a's and b's.
Solution: G = (V, T, P, S)
where V = {S, A, B} T = {a, b}
P is S → aB | bA
A → a | aS | bAA
B → b | bS | aBB
Example 8: Convert the following grammar into GNF.
S → AB
A → SB | a
B → AB | b
Solution: Replace all S-productions in A-productions.
We get S → AB
A → ABB | a By lemma 1
B → AB | b
Now apply lemma 2 for A-productions.
[If A → Aαi | βi, then A → βi | βi A' and A' → αi | αi A']
A → ABB | a, where α = BB and β = a
A → aA' | a
A' → BBA' | BB
Now all A-productions are in GNF, replace all A-productions in S-production, we get
S → aA'B | aB
A → aA' | a
A' → BBA' | BB
B → AB | b
Now replace all A-productions in B-productions
S → aA'B | aB
A → aA'| a
A' → BBA'| BB
B → aA'B | aB | b
Now replace all B-productions in A'-productions we get
S → aA'B | aB
A → aA' | a
A' → aA'BBA' | aA'BB | aBBA' | aBB | bBA' | bB
B → aA'B | aB | b
The grammar is in GNF with total productions = 13.
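The final grammar can be checked against the GNF shape A → aα, where a is a terminal and α is a string of variables, with a few lines of Python. This is only a sketch: the names tokens and is_gnf and the string encoding of bodies (a trailing apostrophe belongs to the preceding symbol, as in A') are assumptions made for illustration.

```python
def tokens(body):
    """Split a body such as "aA'BBA'" into symbols, attaching ' to A'."""
    out = []
    for ch in body:
        if ch == "'" and out:
            out[-1] += "'"
        else:
            out.append(ch)
    return out

def is_gnf(grammar):
    """grammar: dict mapping each variable to a list of bodies (strings)."""
    variables = set(grammar)
    for bodies in grammar.values():
        for body in bodies:
            symbols = tokens(body)
            # The body must start with a terminal ...
            if not symbols or symbols[0] in variables:
                return False
            # ... and continue with variables only.
            if any(s not in variables for s in symbols[1:]):
                return False
    return True

final_grammar = {
    "S": ["aA'B", "aB"],
    "A": ["aA'", "a"],
    "A'": ["aA'BBA'", "aA'BB", "aBBA'", "aBB", "bBA'", "bB"],
    "B": ["aA'B", "aB", "b"],
}
```

The sketch accepts final_grammar, whose bodies add up to the 13 productions counted above, and rejects the original grammar because S → AB starts with a variable.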
Fig. 3.21
Solution: bbaaba is the yield of tree.
Substitute b by B, we get
S → aAS | a
A → aASBA | aASS | aSBA | aS | bA
B → b
The grammar is now in GNF.
Example 23: Remove unit production from the following grammar.
S → A | bb
A → B|b
B → S|a
Solution: S → A | bb
A → B|b
B → A | bb | a
Remove unit production from B-production.
S → A | bb
A → A | bb | a | b (A is useless in A → A)
Substitute A-production in S.
S → A | bb
A → bb | a | b
After substituting we get
S → bb | a | b
Example 24: What are the types of grammar in the Chomsky hierarchy?
Solution: Type 0(unrestricted)
Type 1 (context-sensitive)
Type 2 (context-free)
Type 3 (regular)
Example 25: Define inherently ambiguous context free languages.
Solution: If some word of a language has more than one leftmost derivation or more
than one rightmost derivation in a grammar, then that grammar is ambiguous. A CFL for which
every CFG is ambiguous is said to be an inherently ambiguous CFL.
Example 26: Construct CFG for each of the following languages:
(i) ab* a , (ii) 0*1*2*, (iii) {a^n b^n c^m | n ≥ 1, m ≥ 0}.
Solution: (i) ab*a: Grammar is
S → aBa
B → bB | ∈
Fig. 3.22
Solution: abba.
Example 34: Construct CFG for language L = {a^n b^n c^m d^m | m, n ≥ 1}.
Solution: S → AB
A → aAb | ab
B → cBd | cd
(i) S ⇒ AB
⇒ abcd
(ii) S ⇒ AB
⇒ aAbcBd
⇒ aabbccdd
S → D2D3
D → AA'
D2 → A'C
D3 → A'A'
D → B'A'D | A'B'D | A'A'
D → B'D4 | A'D5 | A'A'
D4 → A'D
D5 → B'D
Example 38: Construct CFG for L = L1 ∪ L2 where L1 = {anb | n ≥ 1} and L2 = {0n | nn n ≥
1}
Solution: L1 ∪ L2 = all strings of L1 and L2
CFG of L1 is S1 → Ab
A → aA | b
CFG of L2 is S2 → AB
A → 0A | 0
B → 1B | 1
CFG of L1 ∪ L2 is S → S1 | S2
Example 39: Rewrite the following CFG after eliminating unit productions.
S → aAb | A
A → B|b
B → ∈
D → F
F → 01 | B
Solution: S → aAb | b | ∈
A → b|∈
D → 01 | ∈
Example 40: What is the yield of the following derivation tree?
Fig. 3.23
Solution: abbb.
Fig. 3.24
Solution: q0 → aq1 | bq3
q1 → aq1 | bq2 | b
q2 → aq2 | bq2 | a | b
q3 → aq3 | bq4 | b
q4 → aq4 | bq3 | a
Example 48: Define left linear and right linear grammar.
Solution: Left linear: A → Bw or A → w or A → ∈.
Right linear: A → wB or A → w or A → ∈.
PRACTICE QUESTIONS
Q.I Multiple Choice Questions:
1. Which grammars define the context free languages?
(a) Context Free Grammars (CFG)
(b) Regular Grammars (RG)
(c) Context Sensitive Grammars (CSG)
(d) Unrestricted Grammars (URG)
2. The language generated by a CFG is called a,
(a) Context-Automata Language (CAL)
(b) Context-Free Language (CFL)
(c) Context-Regular Language (CRL)
(d) None of the mentioned
3. The productions (rules of a grammar) are applied iteratively to generate a string or
language this process is called as,
(a) language (b) Alphabet
(c) derivation (d) expression
4. Which process yields a grammar containing fewer non-terminals and productions, so that the
time complexity of the language-generating process becomes less for the reduced
grammar?
(a) derivation (b) expression
(c) automata (d) reduction
5. Which grammars generate the regular languages? Such a grammar restricts its rules to
a single non-terminal on the left hand side. The right hand side consists of either a
single terminal or a string of terminals with a single non-terminal on the left or the
right end.?
(a) Context Free Grammars (CFG)
(b) Regular Grammars (RG)
(c) Context Sensitive Grammars (CSG)
(d) Unrestricted Grammars (URG)
6. Which is the tree representation of deriving a string of a CFL from a given context-free grammar?
(a) parse tree (b) derivation tree
(c) Both (a) and (b) (d) None of the mentioned
7. If we replace only the rightmost non-terminal by some production rule at each step of
the generating process of the language, then the derivation is called as a,
(a) leftmost derivation (b) rightmost derivation
(c) linear derivation (d) None of the mentioned
(iii) S → AB
A→B
B → 0B | 1 | ∈
9. Convert following CFG into CNF:
(i) S → aSd | aAd
A → bAc | bc
(ii) S → 01S1 | 0 | 0A1
A → 1S | 0AA1
(iii) S → abAB | bAda
A → baB | a
B → CAb | Bb
(iv) S → aAbB | BbS
B → aAbA | aAB | b
A → aB | aBb | a
(v) S → aAab | Aba
A → aS | bB
B → ASB | a
10. Convert following CFG into GNF:
(i) S → 0AB | A | B
A → S0B | 1B | 1
B → A1 | 0
(ii) S → AB
A → BSB | BB | b
B → aAb | a
(iii) S → aSd | aAd
A → bAc | bc
(iv) S → aA | b
A → SA | a
(v) S → AB
A → BS | b
B → SA | a
(vi) S → AaB | a
A → SBb | bA
B → Ba | b
(vii) S → 0A0 | 1B1 | BB
A→C
B→S|A
C→S|∈
(viii) S → AAA | B
A → aA | B
B→∈
11. Consider following grammar:
S → A *B | *A
A→#B|B#
B → *A | #
For the string "# * # # * #", find
(i) Leftmost derivation
(ii) Rightmost derivation
(iii) Parse tree.
12. Write a short note on: Simplification of CFG.
13. Describe Chomsky hierarchy with four types of grammars.
14. What is ambiguous grammar? Explain its concept with example.
15. Explain equivalence of FA and regular grammar with example.
16. How is a regular grammar equivalent to a given DFA constructed?
17. What is normal form? Explain Greibach Normal Form (GNF) and Chomsky Normal
Form (CNF) with example. Also compare them.
18. Construct CFG accepting following sets:
(i) L = {0^n 1^m | n, m ≥ 0, n is not equal to m}
(ii) L = {a^m b^n c^(m–n) | m ≥ 1, m > n}
(iii) L = {a^m b^n c^m | n ≥ 1, m ≥ 0}
(iv) The set of all strings with exactly twice as many b's as a's.
(v) L = L1 ∪ L2 where
L1 = {a^n b | n ≥ 0}
L2 = {0^m 1^n 2^(n+1) | m, n ≥ 1}
19. Construct leftmost and rightmost derivations for ababa and draw parse tree.
S → AS | a
A → SA | b
20. Show that the grammar
S → a | abSb | aAb
A → bS | aAAb is ambiguous.
21. For the following grammar, find an equivalent grammar with no unit production:
S → AB
A→a
B→C|b
C→D
D→E
E→a
22. Eliminate ∈-productions from the following grammar:
S → AB | ∈
A → aASb | a
B → bS
23. Construct regular grammar for the following languages:
(i) Set of all strings over {a, b} such that if it starts with 'a' then it ends with "ab" and if
it starts with 'b' then it contains even number of a's in it.
(ii) (a + b)* aba (a + b)* bb
(iii) {a^(2n) | n ≥ 1}
(iv) The set of all strings over {0, 1} beginning with "0".
(v) {a^l b^m c^n | l, m, n ≥ 1}
(vi) 01* ((01) (10))* + 1 (10)*.
(vii) L = L1 ∩ L2, where
L1 = All strings over {a, b, c} having equal number of a's and c's.
L2 = {a^n b c^n | n ≥ 0 and n ≤ 5}
24. Construct regular grammar for DFA in Fig. 4.19 to 4.21
(i)
(ii)
(iii)
October 2016
1. State the machines used for CFL and CFG. [1 M]
Ans. Refer to Sections 3.1 and 3.2.
2. Define ambiguous grammar. [1 M]
Ans. Refer to Section 3.5.
3. Construct CFG for language L = {a^n b^n c^m d^r | m, n, r ≥ 1}. [4 M]
Ans. Refer to Section .
4. Define type-2 grammar. [1 M]
Ans. Refer to Section 3.3, Point (3).
October 2017
1. Define ambiguous grammar. [1 M]
Ans. Refer to Section 3.5.
2. Consider the following grammar:
S → ADa
A→ a
D→ d
The grammar is in CNF. Justify. [1 M]
Ans. Refer to Section 3.7.1.
3. Convert the following grammar in GNF:
S → aAS | a
A → SbA | SS | bA [5 M]
Ans. Refer to Section 3.7.2.
4. Construct CFG for the language L = L1L2
where, L1 = {a^n b | n ≥ 0}
L2 = {b^m c | m ≥ 0}. [4 M]
Ans. Refer to Examples on Pages 3.14 to 3.15.
5. Write a short note on Chomsky's hierarchy. [4 M]
Ans. Refer to Section 3.3.
6. Consider the following grammar:
S → AB | aD | a
A→ a
D → aD | aDD
Remove useless symbols and rewrite the grammar. [2 M]
Ans. Refer to Section 3.6.1.
April 2018
1. Define right linear grammar. [1 M]
Ans. Refer to Section 3.9.
2. State lemma 1 for converting a CFG to GNF. [1 M]
Ans. Refer to Examples on 3.31 to 3.33.
3. Construct the following CFG into Chomsky Normal Form (CNF):
S → aSa|bSb|a|b|aa|bb [5 M]
Ans. Refer to Section 3.7.1.
October 2018
1. Define ambiguous grammar. [1 M]
Ans. Refer to Section 3.5.
2. What are the types of grammar in Chomsky hierarchy? [1 M]
Ans. Refer to Section 3.3.
3. Convert the following grammar in CNF: [5 M]
S → ABA
A → aA/∈ (epsilon)
B → bB/∈ (epsilon)
Ans. Refer to Section 3.7.1.
4. Convert the following grammar in GNF: [5 M]
S → AB | A
A → BS
B → AI | I
Ans. Refer to Section 3.7.2.
5. Explain types of regular grammar. [2 M]
Ans. Refer to Section 3.8.
6. Construct CFG for: [4 M]
(a) {a^n b^m | n, m ≥ 0}
(b) {a^n b^n c^i | n > i, i ≥ 0}
Ans. Refer to Examples on Pages 3.14 to 3.15.
April 2019
1. Define context sensitive grammar. [1 M]
Ans. Refer to Section 3.3, Point (2).
2. State lemma 2 for converting a CFG to GNF. [1 M]
Ans. Refer to Section 3.7 and Examples on 3.31 to 3.33.
3. Construct CFG for the following: [5 M]
(i) L1 = {0^n 1^n 2^m | n > 1, m > 0}
(ii) L2 = {0^n 1^m | n, m > 0}
Ans. Refer to Examples on Pages 3.14 to 3.15.
CHAPTER
4
Pushdown Automata
Objectives …
To study Basic Concepts in Pushdown Automata
To learn Construction of Pushdown Automata
To understand Deterministic and Non-deterministic Pushdown Automata
4.0 INTRODUCTION
• We have seen finite automata (abstract machine) with the following dual property:
1. For each regular language, there is at least one machine that runs successfully only on
the input string from that language.
2. For each machine in the class the set of words it accepts is a regular set.
• We are now considering a different class of languages but we want to answer the same
questions, so we would again like to find a machine formulation.
• We are looking for a mathematical model of some class of machines that corresponds to
CFLs i.e., there should be at least one machine that accepts each CFL and the language
accepted by each machine is context-free. We want CFL-acceptor just as FA's are regular
language recognizers and acceptors.
• To build these new machines, we start with our old FAs and add a stack and an input tape to
make them more powerful. An FA with a stack (LIFO) is called a Pushdown Automaton (PDA).
• A pushdown automaton implements a context-free grammar in much the same way as a DFA
implements a regular grammar.
• The pushdown automaton is a finite automaton with an additional tape, which behaves like a
stack. The PushDown Automata (PDA) is the machine format of the context-free language.
• Input tape is infinitely long in one direction to hold any possible input. The tape has a first
location for the first letter of the input, then a second location and so on.
• The locations into which we put the input letters are called cells as shown in the Fig. 4.1.
Fig. 4.1
4.1
Theoretical Computer Science Pushdown Automata
• As we process this tape on the machine we read one letter at a time and eliminate each as it is
used.
• When we reach the first blank cell we stop, i.e. rest of tape is also blank. We read from left to
right and never go back to a cell that was read before.
• As a part of our new pictorial representation for FAs, let us introduce the symbols shown
below.
Fig. 4.2
• An accept state is a final state and a reject state is a non-final state. A read state is shown by a
diamond shaped box as shown below.
Fig. 4.3
(Note: Symbol ∆ stands for blank).
• The finite automaton that accepts all words ending in the letter a is as shown in the Fig. 4.4.
Fig. 4.4
• The FA in the new symbolism is shown in the Fig. 4.5.
Fig. 4.5
• Our machine is still an FA. We have chosen this representation because we now want to add
an additional component, called a pushdown stack (last-in-first-out), to our machine. The only
stack operations allowed to us are push and pop.
• Popping an empty stack, like reading an empty tape, gives us the blank character ∆. We
include the states as:
Fig. 4.6
• The edges coming out of a pop state are labeled in the same way as the edges from a read
state.
Basic Concept of Pushdown Automata:
• Pushdown Automata is a finite automaton with extra memory called stack which helps
Pushdown automata to recognize Context Free Languages.
• The term "pushdown" refers to the fact that the stack can be regarded as being "pushed
down" like a tray dispenser at a cafeteria, since the operations never work on elements other
than the top element.
• Fig. 4.7 shows diagram for PDA. The components of it are described below:
1. The input tape contains the input symbols. The tape is divided into a number of squares.
Each square/block contains a single input character. The string placed in the input tape is
traversed from left to right. The two end sides of the input string contain an infinite
number of blank symbols.
2. The reading head scans each square in the input tape and reads the input from the tape.
The head moves from left to right. The input scanned by the reading head is sent to the
finite control of the PDA.
3. The finite control can be considered as a control unit of a PDA. An automaton always
resides in a state. The reading head scans the input from the input tape and sends it to the
finite control.
4. A stack is a temporary storage of stack symbols. Every move of the PDA indicates one
of the following to the stack
o One stack symbol may be added to the stack (push)
o One stack symbol may be deleted from the top of the stack (pop)
The stack is the component of the PDA which differentiates it from the finite automata.
In the stack, there is always a symbol z0 which denotes the bottom of the stack.
Fig. 4.7: Components of a PDA (I/P tape, reading head, finite control and stack with bottom symbol z0)
Example: Consider the following PDA shown in the Fig. 4.8.
Fig. 4.8
Solution: The question is what language is represented by the above PDA? To find this, we
will analyze the PDA. Let us consider input string aabb. We assume that this string has been put
on the tape.
We must begin at start and then read the symbol 'a' from the input tape. The push a state tells us
to push an a onto the stack as shown in Fig. 4.9 (b). We now read another a and proceed as
before. See Fig. 4.9 (c).
After the second push a, we return to the same read state again. Here we read the letter b and
take the b edge out of this state down to the pop state on the left. The pop state takes the top
element off the stack as shown in Fig. 4.9 (d). The next symbol read is b; we return to the pop
state and pop the top element off the stack as shown in Fig. 4.9 (e), leaving the stack empty.
The next symbol read is ∆, which leads to the accept state. We observe that the language of
words accepted by this machine is exactly {a^n b^n | n = 0, 1, 2, …}.
(b) After reading first 'a' (c) After reading second 'a'
(d) After reading first 'b' (e) After reading second 'b'
Fig. 4.9
This language is non-regular and is generated by a CFG. We constructed a machine (PDA)
which accepts this language. So PDAs are more powerful than FAs.
An FA cannot keep track of the count n in {a^n b^n}. A PDA has a primitive
memory unit (the stack), so it can keep track of how many a's are read at the beginning.
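The way the stack counts the a's can be imitated directly in Python. This is a minimal sketch of the push/pop idea for {a^n b^n | n ≥ 0}, not a transcription of Fig. 4.8; the function name accepts_anbn is an assumption made for illustration.

```python
def accepts_anbn(word):
    """Return True iff word has the form a^n b^n with n >= 0."""
    stack = []
    i = 0
    # Read phase 1: push one symbol for every leading a.
    while i < len(word) and word[i] == "a":
        stack.append("a")
        i += 1
    # Read phase 2: pop one symbol for every following b.
    while i < len(word) and word[i] == "b":
        if not stack:
            return False  # more b's than a's
        stack.pop()
        i += 1
    # Accept only if the whole input was read and the stack is empty.
    return i == len(word) and not stack
```

A finite automaton has no counterpart to the unbounded list stack, which is why it cannot perform this matching for arbitrary n.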
Difference between PDA and FA: [April 16, 18, Oct. 16, 18]
1. The length of the path formed by a given input: If a string of seven letters is fed into
an FA, it follows a path exactly seven edges long. In a PDA, the path could be longer or
shorter.
For example, following PDA accepts only the language of all words beginning with an a.
Fig. 4.10
No matter how long the input string, the path is only one or two edges long.
2. PDA accepts regular or non-regular language. FA accepts regular language.
3. PDA is more powerful than FA because it has unlimited memory units (stack); so it can
remember arbitrarily long strings.
4. A DFA can remember a finite amount of information, but a PDA can remember an
infinite amount of information.
∴ PDA M = ({q0, q1}, {a, b}, {A}, δ, q0, ∈, φ), where δ is defined as
δ (q0, a, ∈) = (q0, A)
δ (q0, a, A) = (q0, AA)
δ (q0, b, A) = (q1, ∈)
δ (q1, b, A) = (q1, ∈)
δ (q1, ∈, ∈) = (q1, ∈)
6. If the device is in state q2 and a red plate is on the top of stack, then plate is popped and
stack is empty.
We write PDA as follows:
PDA M = ({q1, q2}, {0, 1, c}, {R, B, G}, δ, q1, R, φ)
where δ is δ (q1, 0, R) = {(q1, BR)}
δ (q1, 0, B) = {(q1, BB)}
δ (q1, 0, G) = {(q1, BG)}
δ (q1, 1, G) = {(q1, GG)}
δ (q1, 1, B) = {(q1, GB)}
δ (q1, c, R) = {(q2, R)}
δ (q1, c, B) = {(q2, B)}
δ (q1, c, G) = {(q2, G)}
δ (q2, 0, B) = {(q2, ∈)}
δ (q2, 1, G) = {(q2, ∈)}
δ (q2, ∈, R) = {(q2, ∈)}
Consider the string "01c10".
The sequence of moves is as follows:
(q1, 01c10, R) |– (q1, 1c10, BR) as δ (q1, 0, R) = {(q1, BR)}
|– (q1, c10, GBR) as δ (q1, 1, B) = {(q1, GB)}
|– (q2, 10, GBR) as δ (q1, c, G) = {(q2, G)}
|– (q2, 0, BR) as δ (q2, 1, G) = {(q2, ∈)}
|– (q2, ∈, R) as δ (q2, 0, B) = {(q2, ∈)}
|– (q2, ∈, ∈) as δ (q2, ∈, R) = {(q2, ∈)}
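The sequence of moves can also be checked mechanically. The following is a minimal Python sketch, not part of the original construction: the transitions are copied from δ above, while the function name `run`, the encoding of the stack as a string with its top on the left, and the use of "" for ∈ are assumptions made only for this illustration.

```python
# (state, input symbol, stack top) -> (next state, string that replaces the top)
# "" as input symbol stands for an ∈-move; "" as replacement means "pop".
DELTA = {
    ("q1", "0", "R"): ("q1", "BR"),
    ("q1", "0", "B"): ("q1", "BB"),
    ("q1", "0", "G"): ("q1", "BG"),
    ("q1", "1", "G"): ("q1", "GG"),
    ("q1", "1", "B"): ("q1", "GB"),
    ("q1", "c", "R"): ("q2", "R"),
    ("q1", "c", "B"): ("q2", "B"),
    ("q1", "c", "G"): ("q2", "G"),
    ("q2", "0", "B"): ("q2", ""),
    ("q2", "1", "G"): ("q2", ""),
    ("q2", "", "R"): ("q2", ""),
}


def run(word, state="q1", stack="R"):
    """Return the list of IDs (state, remaining input, stack) and the verdict."""
    ids = [(state, word, stack)]
    while stack:
        top = stack[0]
        if word and (state, word[0], top) in DELTA:
            # Move that consumes one input symbol.
            state, push = DELTA[(state, word[0], top)]
            word = word[1:]
        elif (state, "", top) in DELTA:
            # ∈-move: the input is left untouched.
            state, push = DELTA[(state, "", top)]
        else:
            break  # no move is possible
        stack = push + stack[1:]
        ids.append((state, word, stack))
    # Acceptance by empty stack: input consumed and stack empty.
    return ids, (word == "" and stack == "")


ids, accepted = run("01c10")
for step in ids:
    print(step)
print("accepted" if accepted else "rejected")
```

Running the sketch on "01c10" prints one ID per move, ending with the empty stack; a string such as "01c01" gets stuck in q2 and is rejected.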
The pictorial representation of PDA is shown in the Fig. 4.12.
Fig. 4.12
4.8
Theoretical Computer Science Pushdown Automata
4. δ (q1, 0, 0) = (q1, ∈)
δ (q1, 1, 1) = (q1, ∈)
In state q1, match input symbols against the top symbols on the stack. Then pop the
symbol.
5. Pop symbols until the z0 marker is reached. Then pop z0, so that the stack is empty.
Fig. 4.14
δ (q0, (, R) = (q0, AR)
δ (q0, (, A) = (q0, AA)
δ (q0, ), A) = (q1, ∈)
δ (q1, ), A) = (q1, ∈)
δ (q1, ∈, R) = (q1, ∈)
Fig. 4.15
We construct PDA M = ({q0, q1, q2, q3}, {a, b}, {R, A}, δ, q0, R, φ), where δ is defined as
follows:
δ (q0, a, R) = (q0, AR)
δ (q0, a, A) = (q0, AA)
δ (q0, b, A) = (q1, A)
δ (q1, b, A) = (q2, ∈)
δ (q2, b, A) = (q1, A)
(q1 and q2 form a loop: one A is popped for every two b's)
δ (q2, b, R) = (q3, ∈)
δ (q2, ∈, R) = (q3, ∈)
Simulation of string "aabbbbb":
(q0, aabbbbb, R) |– (q0, abbbbb, AR)
|– (q0, bbbbb, AAR)
|– (q1, bbbb, AAR)
|– (q2, bbb, AR)
|– (q1, bb, AR)
|– (q2, b, R)
|– (q3, ∈, ∈) = accept
Solution:
Fig. 4.16
The PDA is
M = (Q, Σ, Γ, δ, q0, z0, φ)
where Q = {q0, q1, q2, q3, q4}
Σ = {0, 1, 2, 3}
Γ = {0, 1, R}
and δ mapping is
δ (q0, 0, R) = (q0, 0R) → Push 1st 0
δ (q0, 0, 0) = (q0, 00) → Push all 0’s
δ (q0, 1, 0) = (q1, 10) → Push first 1
δ (q1, 1, 1) = (q1, 11) → Push all 1’s
δ (q1, 2, 1) = (q2, 1) → Change state without push or pop
δ (q2, 2, 1) = (q3, 1) → Change state without push or pop
δ (q3, 2, 1) = (q3, ∈)
δ (q3, 3, 0) = (q4, ∈)
δ (q4, 3, 0) = (q4, ∈)
δ (q4, ∈, R) = (q4, ∈) = accept
Simulation of string 012223:
(q0, 012223, R) |– (q0, 12223, 0R)
|– (q1, 2223, 10R)
|– (q2, 223, 10R)
|– (q3, 23, 10R)
|– (q3, 3, 0R)
|– (q4, ∈, R)
|– (q4, ∈) = accept
Example 7: Construct a PDA to check whether a given string is a correct prefix expression or
not.
Solution: Logic:
• All operators are pushed onto the stack.
• When an operand is read, an operator is popped from the stack.
Fig. 4.17
PDA is M = (Q, Σ, Γ, δ, q0, z0, φ)
Q = {q0, q1, q2, q3}
Σ = {operator, operand}
Γ = {operator, z0}
δ is δ (q0, operator, z0) = (q0, operator z0)
δ (q0, operator, operator) = (q0, operator operator)
δ (q0, operand, operator) = (q1, operator)
(On reading the 1st operand, the stack is not changed)
δ (q1, operand, operator) = (q2, ∈)
δ (q2, operand, operator) = (q3, ∈)
δ (q3, ∈, z0) = (q3, ∈)
Simulation of string "+ * a b c":
(q0, + * a b c, z0) |– (q0, * a b c, + z0)
|– (q0, a b c, * + z0)
|– (q1, b c, * + z0)
|– (q2, c, + z0)
|– (q3, ∈, z0)
|– (q3, ∈, ∈): string is accepted
Example 8: Construct a PDA to check whether a given expression is in postfix form or not.
Solution: Logic:
• All operands are pushed onto the stack.
• When an operator is read, it replaces the two topmost operands by their result, so the net effect on the stack is that one operand is popped.
Fig. 4.18
PDA is M = (Q, Σ, Γ, δ, q0, z0, φ)
where Q = {q0, q1, q2, q3}
Σ = {operator, operand}
Γ = {operand, z0}
δ is δ (q0, ∈, z0) = (q1, z0)
δ (q1, operand, z0) = (q1, operand z0)
δ (q1, operand, operand) = (q1, operand operand)
δ (q1, operator, operand) = (q1, ∈)
δ (q1, ∈, operand) = (q2, ∈)
δ (q2, ∈, z0) = (q2, ∈) = accept
Simulation of string "a b * c +": The string is operand operand operator operand operator.
(q0, a b * c +, z0) |– (q1, a b * c +, z0)
|– (q1, b * c +, a z0)
|– (q1, * c +, b a z0)
|– (q1, c +, a z0)
|– (q1, +, c a z0)
|– (q1, ∈, a z0)
|– (q2, ∈, z0)
|– (q2, ∈, ∈) = accept
Example 9: Construct a PDA to accept all those strings containing an equal number of a's and
b's.
Solution: L = {ab, abab, abba, aabb, bbaa, baba, ……}
Logic:
• When a is read at state q0, push a and change the state to q1. If b is read, then push b and
change the state to q2.
• At state q1, when a is read, push a. At state q2, when b is read, push b.
• When the PDA is at state q1 (surplus a's are on the stack) and the current input symbol is b, then
pop a.
• When the PDA is at state q2 (surplus b's are on the stack) and it reads a, then pop b.
• When only z0 is on the stack, the a's and b's read so far are equal in number; the next a is pushed with a move to q1 and the next b is pushed with a move to q2.
• When the string is over and the stack is empty, the string is accepted.
PDA is, δ (q0, a, z0) = (q1, az0)
δ (q1, a, a) = (q1, aa)
δ (q1, b, a) = (q1, ∈)
δ (q1, a, z0) = (q1, az0)
δ (q1, b, z0) = (q2, bz0)
δ (q0, b, z0) = (q2, bz0)
δ (q2, b, b) = (q2, bb)
δ (q2, a, b) = (q2, ∈)
δ (q2, b, z0) = (q2, bz0)
δ (q2, a, z0) = (q1, az0)
δ (q1, ∈, z0) = (q3, ∈)
δ (q2, ∈, z0) = (q3, ∈)
δ (q3, ∈, ∈) = (q3, ∈) = accept
Simulation of string "abba":
(q0, abba, z0) |– (q1, bba, az0)
|– (q1, ba, z0)
|– (q2, a, bz0)
|– (q2, ∈, z0)
|– (q3, ∈, ∈) = accept
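The idea of this PDA can be tried out with a short Python sketch. It is an illustration under stated assumptions rather than the definitive machine: the stack symbol z0 is written as "Z", "" stands for ∈, and the three transitions marked as assumed are added so that strings in which the stack returns to z0 before the end (such as ab or abab) are handled.

```python
# (state, input symbol, stack top) -> (next state, string that replaces the top)
DELTA = {
    ("q0", "a", "Z"): ("q1", "aZ"),
    ("q0", "b", "Z"): ("q2", "bZ"),
    ("q1", "a", "a"): ("q1", "aa"),
    ("q1", "b", "a"): ("q1", ""),
    ("q1", "b", "Z"): ("q2", "bZ"),
    ("q2", "b", "b"): ("q2", "bb"),
    ("q2", "a", "b"): ("q2", ""),
    ("q2", "a", "Z"): ("q1", "aZ"),
    ("q2", "", "Z"): ("q3", ""),
    # Assumed for this sketch: moves needed when only z0 is on the stack.
    ("q1", "a", "Z"): ("q1", "aZ"),
    ("q2", "b", "Z"): ("q2", "bZ"),
    ("q1", "", "Z"): ("q3", ""),
}


def accepts(word, state="q0", stack="Z"):
    """Accept by empty stack; the ∈-move is taken only when the input is over."""
    while stack:
        top = stack[0]
        if word and (state, word[0], top) in DELTA:
            state, push = DELTA[(state, word[0], top)]
            word = word[1:]
        elif not word and (state, "", top) in DELTA:
            state, push = DELTA[(state, "", top)]
        else:
            break
        stack = push + stack[1:]
    return word == "" and stack == ""


for w in ["abba", "ab", "abab", "bbaa", "aab"]:
    print(w, accepts(w))
```

Because the ∈-move is only tried at the end of the input, the simulation stays deterministic even though the table contains both reading moves and an ∈-move for z0.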
Example 10: Show that {a^n b^n | n ≥ 1} ∪ {a^m b^2m | m ≥ 1} cannot be accepted by a
deterministic PDA, i.e. it requires an NPDA.
Solution: L = {ab, aabb, abb, aabbbb, ……}
The language is the union of two languages: strings in which n a's are followed by n b's, and
strings in which every a is matched by two b's.
Logic:
• The PDA pushes all a's.
• When it reads the first b, it recognizes that the a's are over. It then guesses non-deterministically: either it pops one 'a' for every b (to check for equal a's and b's), or it pops one 'a' for every two b's.
PDA is, δ (q0, a, z0) = (q0, Az0)
δ (q0, a, A) = (q0, AA)
δ (q0, b, A) = {(q1, A), (q4, ∈)}
δ (q1, b, A) = (q2, ∈)
δ (q2, b, A) = (q1, A)
δ (q2, ∈, z0) = (q3, ∈)
δ (q4, b, A) = (q4, ∈)
δ (q4, ∈, z0) = (q3, ∈)
Simulation of "abb":
(q0, abb, z0) |– (q0, bb, Az0)
|– (q1, b, Az0)
|– (q2, ∈, z0)
|– (q3, ∈, ∈) = accept
Simulation of "ab":
(q0, ab, z0) |– (q0, b, Az0)
|– (q1, ∈, Az0): rejected (the stack is not empty)
or |– (q4, ∈, z0) |– (q3, ∈, ∈): accepted
Because δ (q0, b, A) offers two moves, the given PDA is an NPDA.
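Non-determinism means that a string is accepted if at least one sequence of moves empties the stack. A hedged Python sketch of this idea follows; it explores all IDs breadth-first. The use of a separate state q4 for the a^n b^n branch, the symbol "Z" for z0 and the function name `accepts` are assumptions of this illustration.

```python
from collections import deque

# (state, input symbol or "" for ∈, stack top) -> set of (next state, replacement)
DELTA = {
    ("q0", "a", "Z"): {("q0", "AZ")},
    ("q0", "a", "A"): {("q0", "AA")},
    ("q0", "b", "A"): {("q1", "A"), ("q4", "")},  # the non-deterministic guess
    ("q1", "b", "A"): {("q2", "")},   # second b of a pair: pop one A
    ("q2", "b", "A"): {("q1", "A")},  # first b of the next pair: no pop
    ("q2", "", "Z"): {("q3", "")},
    ("q4", "b", "A"): {("q4", "")},   # one A per b (assumed separate branch)
    ("q4", "", "Z"): {("q3", "")},
}


def accepts(word):
    """Breadth-first search over IDs; accept if some path empties the stack."""
    start = ("q0", word, "Z")
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True
        if not stack:
            continue  # stack emptied too early: this branch dies
        top, below = stack[0], stack[1:]
        moves = []
        if rest:
            moves += [(m, rest[1:]) for m in DELTA.get((state, rest[0], top), ())]
        moves += [(m, rest) for m in DELTA.get((state, "", top), ())]
        for (nxt, push), new_rest in moves:
            cfg = (nxt, new_rest, push + below)
            if cfg not in seen:
                seen.add(cfg)
                queue.append(cfg)
    return False


for w in ["ab", "abb", "aabb", "aabbbb", "aabbb"]:
    print(w, accepts(w))
```

A deterministic simulator would have to commit to one branch at the first b; the search keeps both branches alive, which is exactly what the NPDA is allowed to do.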
Example 11: Construct PDA for language L = {0^n 1^m 2^(n+m) | n, m ≥ 1}
Solution: Logic:
• Read all 0's at state q0 and push an A onto the stack for each 0.
• Read all 1's at state q1 and push a B onto the stack for each 1.
• For every 2 that is read, pop one symbol: first the B's (state q2), then the A's (state q3).
• When the input is over and only z0 remains on the stack, the machine moves to state q4 and the string is accepted.
δ (q0, 0, z0) = (q0, Az0)
δ (q0, 0, A) = (q0, AA)
δ (q0, 1, A) = (q1, BA)
δ (q1, 1, B) = (q1, BB)
δ (q1, 2, B) = (q2, ∈)
δ (q2, 2, B) = (q2, ∈)
δ (q2, 2, A) = (q3, ∈)
δ (q3, 2, A) = (q3, ∈)
δ (q3, ∈, z0) = (q4, ∈)
Example 12: Design a PDA for checking the acceptance of strings of the form a^n b^n c^n.
Solution: This is a two-stack problem. We have to compare the a's with the b's and then the b's with the c's, so it is a double-check problem. With a single stack we cannot solve such a problem: we can push all a's onto the stack and pop one of them for each b, but by the time we reach the c's the stack is empty and we have nothing left to compare against. A single stack can only check for equal numbers of a's and b's.
Using two stacks, all a's can be pushed onto one stack and all b's onto the other stack. For every c that is read, we pop one symbol from each stack. When we finish reading the input, both stacks should be empty; the string is then accepted.
Fig. 4.19: Two-stack PDA for L = {a^n b^n c^n | n ≥ 0}
A PDA with two stacks is more powerful than a PDA with one stack: some languages, such as the context-sensitive language above, are not accepted by any one-stack PDA. In the above figure, (1) represents stack 1 and (2) represents stack 2.
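The two-stack idea can be sketched in a few lines of Python. This is an illustration only: the function name `accepts_anbncn` and the `phase` variable (which plays the role of the finite control) are assumptions, not names from the figure.

```python
def accepts_anbncn(word):
    """Check membership in {a^n b^n c^n | n >= 0} with two stacks."""
    stack1, stack2 = [], []   # stack 1 holds a's, stack 2 holds b's
    phase = "a"               # finite control: which block is being read
    for ch in word:
        if ch == "a" and phase == "a":
            stack1.append("a")
        elif ch == "b" and phase in ("a", "b"):
            phase = "b"
            stack2.append("b")
        elif ch == "c" and phase in ("b", "c") and stack1 and stack2:
            phase = "c"
            stack1.pop()      # one a and one b are matched by each c
            stack2.pop()
        else:
            return False      # wrong order, or more c's than a's / b's
    return not stack1 and not stack2


for w in ["", "abc", "aabbcc", "aabbc", "aabcc", "abcabc"]:
    print(repr(w), accepts_anbncn(w))
```

With only `stack1`, the count of a's would be used up while reading the b's, which is the obstacle described in the solution above.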
EXAMPLES
Example 1: Construct a PDA for CFG
S → aAA
A → aS | bS | a
Solution: We construct PDA M = ({q0}, {a, b}, {S, A}, δ, q0, S, φ), where δ is defined as
follows:
δ (q0, a, S) = {(q0, AA)}
δ (q0, a, A) = {(q0, S), (q0, ∈)}
δ (q0, b, A) = {(q0, S)}
Consider the string "abaaaa", which is derived from the CFG, and check by IDs whether it is
accepted by the constructed PDA.
(q0, abaaaa, S) |– (q0, baaaa, AA)
|– (q0, aaaa, SA) |– (q0, aaa, AAA)
|– (q0, aa, AA) |– (q0, a, A) |– (q0, ∈, ∈)
Therefore, the string generated by the CFG is also accepted by the PDA.
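The construction used here (each production A → aα becomes the move δ (q0, a, A) ∋ (q0, α)) is mechanical, so it can be sketched in Python. The names `PRODUCTIONS`, `build_delta` and `accepts` are assumptions of this sketch; it relies on every production starting with a terminal, as in the grammar of this example, and omits the single state q0 from the keys.

```python
# Grammar of the example: every body is one terminal followed by non-terminals.
PRODUCTIONS = {"S": ["aAA"], "A": ["aS", "bS", "a"]}


def build_delta(productions):
    """Production A -> a alpha becomes the move (a, A) -> alpha."""
    delta = {}
    for head, bodies in productions.items():
        for body in bodies:
            delta.setdefault((body[0], head), []).append(body[1:])
    return delta


def accepts(word, delta, start="S"):
    """Depth-first search over IDs (remaining input, stack); accept by empty stack."""
    todo = [(word, start)]
    while todo:
        rest, stack = todo.pop()
        if not rest and not stack:
            return True
        # Every stack symbol still needs at least one input letter.
        if not rest or not stack or len(stack) > len(rest):
            continue
        for push in delta.get((rest[0], stack[0]), []):
            todo.append((rest[1:], push + stack[1:]))
    return False


delta = build_delta(PRODUCTIONS)
print(delta)
print(accepts("abaaaa", delta), accepts("ab", delta))
```

The printed table matches the three δ entries listed above, and the search reproduces the ID sequence for "abaaaa" among the branches it explores.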
Example 4: Show with example that if the language is accepted by CFG then the same
language is accepted by PDA.
Solution: Consider CFG in CNF.
S → SB
S → AB
A → CC
B → b
C → a
Let, Γ = {S, A, B, C}
Σ = {a, b}
Initially the stack contains the symbol S, and it always contains only non-terminals. At each step there are two
possibilities (NPDA): either we replace the removed non-terminal by two non-terminals, or we do
not replace it at all and go to a read state to read a terminal from the tape.
Consider string "aab" generated by leftmost derivation.
S ⇒ AB ⇒ CCB ⇒ aCB ⇒ aaB ⇒ aab.
Now, we shall trace how the word "aab" can be accepted by PDA.
Leftmost Derivation    State      Stack           Tape
–                      Start      ∈               aab
S                      PUSH S     S               aab
–                      POP (S)    ∈               aab
–                      PUSH B     B               aab
S ⇒ AB                 PUSH A     AB (TOS = A)    aab
–                      POP (A)    B               aab
–                      PUSH C     CB              aab
A ⇒ CC                 PUSH C     CCB             aab
–                      POP (C)    CB              aab
–                      Read a     CB              aab
–                      POP (C)    B               aab
–                      Read a     B               aab
–                      POP (B)    ∈               aab
–                      Read b     ∈               aab
–                      ACCEPT     ∈               ∈
Both the stack and the tape are empty. Thus the string is accepted by the PDA by empty stack, i.e. it belongs to N (M).
Thus, we construct a PDA as shown in the Fig. 4.20.
Fig. 4.20
Example 5: Construct a PDA to check whether the given expression is a valid postfix
expression. Assume all operators to be binary.
Solution: If all operators are assumed to be binary, a postfix expression has the form
operand1 operand2 operator, e.g. ab * c +, ab +.
Fig. 4.21
Step 1 : Read the first symbol (operand).
Step 2 : Read next symbol and push it (operand).
Step 3 : Read next symbol, if it is operator then pop the symbol from the stack. If the
symbol popped is operand then go back to step 2, else if symbol popped is ∆ then
go to Reject. If the next input symbol read is ∆, pop the symbol from the stack. If
it is ∆, go to Accept, otherwise go to reject. The PDA is shown in the Fig. 4.21.
Example 6: Construct a PDA for a language in which all the strings have at least one
occurrence of aa.
Solution:
Fig. 4.22
A string of the language may start with a or b. When we read b, the next symbol may be either a or b.
When we read a, we check the next symbol: if it is 'b', we continue reading the next symbol;
if it is 'a' (here we get two consecutive a's), we go to the accept state. The language is
{aa, baa, baab, baaba, …}
PDA is δ (q0, a, z0) = (q1, Az0)
δ (q1, a, A) = (q2, ∈)
δ (q1, b, A) = (q1, BA)
δ (q1, a, B) = (q1, AB)
δ (q1, b, B) = (q1, BB)
Note: q2 is the final state; the language is accepted by final state.
EXAMPLES
Example 1: Construct PDA for L = {a^x b^y c^z | x = 2y + z, y, z ≥ 1}
Solution: δ (q0, a, z0) = (q0, az0)
δ (q0, a, a) = (q0, aa)
δ (q0, b, a) = (q1, ∈)
δ (q1, b, a) = (q1, ∈)
δ (q1, c, a) = (q2, ∈)
δ (q0, ∈, z0) = (q2, ∈, ∈) = accept.
PRACTICE QUESTIONS
Q.I Multiple Choice Questions:
1. Which is the machine format of the context-free language?
(a) Pushdown Automata (PDA) (b) Finite Automaton (FA)
(c) Push Automata (PA) (d) None of the mentioned
2. If a PDA in any state, for a single input symbol and a single stack symbol, has only a single
move, then the PDA is called a,
(a) Deterministic Finite Automaton (DFA)
(b) Deterministic Pushdown Automata (DPDA)
(c) Non-Deterministic Pushdown Automata (NPDA)
(d) None of the mentioned
3. The difference between FA and PDA is in,
(a) Finite control (b) Reading head
(c) Stack (d) Input tape
4. Which of the following is not possible algorithmically?
(a) NPDA to DPDA (b) NFA to DFA
(c) RE to CFG (d) CFG to PDA
5. A PDA can be formulated using how many tuples,
(a) 7 (b) 5
(c) 6 (d) 9
6. The Instantaneous Description (ID) of PDA is,
(a) the identity of that PDA
(b) describes the identity of that PDA always nonzero
(c) describe the configuration of PDA at a given instant (answer) always zero
(d) All of the mentioned
7. Which is the machine accepting a context-free language?
(a) PDA (b) FA
(c) DFA (d) NFA
April 2017
1. Define ID for PDA. [1 M]
Ans. Refer to Page 4.6.
2. Construct a PDA for L = {0^m 1^n 2^n 0^m | m ≥ 1, n ≥ 1}. [5 M]
Ans. Refer to Section 4.1, Examples.
October 2017
1. Name the type of languages accepted by Pushdown Automata. [1 M]
Ans. Refer to Page 4.6.
2. Construct PDA for language: L = {a^m b^n | m > n ≥ 1} [5 M]
Ans. Refer to Section 4.1, Examples.
April 2018
1. State two differences between PDA and FA. [1 M]
Ans. Refer to Page 4.5.
2. Construct a PDA that accepts the language generated by S → aS | aSbS | a. [4 M]
Ans. Refer to Section 4.1, Examples.
October 2018
1. Write a mapping of δ in PDA. [5 M]
Ans. Refer to Section 4.1.
2. Construct PDA for language: [1 M]
L = {a^n b^(2n+1) | n > 1}.
Ans. Refer to Section 4.1, Examples.
3. Differentiate between FA and PDA. [2 M]
Ans. Refer to Page 4.5.
April 2019
1. Construct PDA for L = {a^n b^(2n+1) | n > 1} [4 M]
Ans. Refer to Section 4.1, Examples.
2. Construct PDA equivalent to the given CFG: [1 M]
S→bAB|aB
A→aAB|a
B→aBB|b
Ans. Refer to Section 4.4, Examples.
CHAPTER
5
Turing Machine
Objectives …
To study Basic Concepts in Turing Machine
To learn Design of Turing Machine
To understand Linear Bounded Automaton (LBA)
5.0 INTRODUCTION
• We have seen various machines such as FA and PDA. To overcome their limitations, we
require a more powerful machine, called the Turing machine.
• The Turing machine is a simple mathematical model of a modern digital computer. This machine
was introduced by Alan Turing in 1936.
• The different machines are compared in Table 5.1.
Table 5.1
Language Defined By       Acceptor or Recognizer   Non-determinism = Determinism?   Example of Application
1. Regular expression     Finite automata          Yes                              Text editors
2. Context-free grammar   Pushdown automata        No                               Programming language statements, compilers
3. Type 0 grammar         Turing machine           Yes                              Computers
• A Turing machine can be constructed to accept a given language or to carry out some
algorithm. The machine may take its own output as input for further operation; hence there is no
distinction between the input and output sets.
• The machine can choose the current location and also decides whether to move left or right
in the memory.
• Today, the Turing machine has become the accepted formalization of an effective procedure.
The Turing machine is equivalent in computing power to the digital computer as we know it
today, and also to the most general mathematical notions of computation.
5.1
Theoretical Computer Science Turing Machine
Fig. 5.2 (figure labels: finite control, head)
• The present symbol under the head is a5 and the present state is q3, so a5 is written to the right of
q3. The non-blank symbols to the left of a5 form the string a1 a2 a3 a4, which is written to the left of
q3.
• The sequence of non-blank symbols to the right of a5 is a6 a7. Thus, the ID is a1 a2 a3 a4 q3 a5 a6 a7,
with α = a1 a2 a3 a4 and β = a5 a6 a7, where the head is scanning a5.
TM Moves:
• δ (q, x_i) induces a change in the ID of the TM. Such a change in ID is called a move.
• Let x_1 x_2 x_3 … x_n be the input string to be processed and let the symbol under the tape head be x_i.
1. Let δ (q, x_i) = (p, y, L).
The ID before processing the symbol x_i is
x_1 x_2 … x_{i-1} q x_i x_{i+1} … x_n.
After processing x_i, the ID is
x_1 x_2 … x_{i-2} p x_{i-1} y x_{i+1} … x_n.
This is represented as
x_1 x_2 … x_{i-1} q x_i … x_n |– x_1 x_2 … x_{i-2} p x_{i-1} y x_{i+1} … x_n.
2. If δ (q, x_i) = (p, y, R), then the change in ID is represented by
x_1 x_2 … x_{i-1} q x_i … x_n |– x_1 x_2 … x_{i-1} y p x_{i+1} … x_n.
Note: The symbol |–* denotes the reflexive transitive closure of relation |–.
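The two kinds of moves can be expressed as a small function on IDs. The Python sketch below is only an illustration: an ID is assumed to be stored as the triple (left part, state, right part) with one character per tape symbol, and the name `move` is hypothetical.

```python
def move(left, state, right, delta, blank="B"):
    """Apply one move to the ID  left state right  (the head scans right[0])."""
    symbol = right[0] if right else blank
    new_state, written, direction = delta[(state, symbol)]
    rest = right[1:]
    if direction == "L":
        if not left:
            raise ValueError("head would move off the left end of the tape")
        # The head steps onto the last symbol of the left part.
        return left[:-1], new_state, left[-1] + written + rest
    # direction == "R": the written symbol joins the left part.
    return left + written, new_state, rest


# ID  ab q cd  with delta(q, c) = (p, y, L)  gives  a p byd
print(move("ab", "q", "cd", {("q", "c"): ("p", "y", "L")}))
# ID  ab q cd  with delta(q, c) = (p, y, R)  gives  aby p d
print(move("ab", "q", "cd", {("q", "c"): ("p", "y", "R")}))
```

Applying `move` repeatedly until no entry of δ applies yields the relation |–* on IDs.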
• The Turing machine is a simple mathematical model which handles two types of problems:
1. Problems of language recognition (class of languages).
2. Problems of arithmetical operations (class of integer functions).
• Examples on Language Recognition:
1. These examples check whether a string or word belongs to the language or not.
2. If a language is accepted by a finite automaton, then it is also accepted by a TM.
3. If a language is accepted by a pushdown automaton, then it is also accepted by a TM.
4. All regular sets and context-free languages are accepted by TMs.
5. Some languages which are neither regular nor context-free are also accepted
by TMs.
• The following examples show that TMs accept regular as well as non-regular
languages.
Example 1: Construct the TM which accepts the regular language a* b*.
Solution: L = {∈, a, b, aa, bb, ab, aab, ……}
Logic:
1. Read all a's using state q0.
2. If the string is over, then accept the string; otherwise scan all b's using state q1. State q1 does
not accept any 'a'.
We design TM = (Q, Σ, Γ, δ, q0, B, F)
where Q = {q0, q1, q2}
Σ = {a, b}
Γ = {a, b, B}, F = {q2} and δ is
Table 5.2
State   a            b            B
q0      (q0, a, R)   (q1, b, R)   (q2, B, R)
q1      –            (q1, b, R)   (q2, B, R)
q2      final state
Fig. 5.3
The instantaneous description for "aabbb" is
q0aabbbB |– aq0abbbB |– aaq0bbbB
|– aabq1bbB |– aabbq1bB
|– aabbbq1B |– aabbbBq2 → Final state
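A transition table of this kind can be executed directly. The following Python sketch is a minimal simulator under stated assumptions: the entries are those of Table 5.2 with the 'a' column of q1 left undefined, q2 is taken as the only final state, and the names `run`, `DELTA` and `FINAL` are introduced for this illustration only.

```python
# (state, scanned symbol) -> (next state, symbol written, head direction)
DELTA = {
    ("q0", "a"): ("q0", "a", "R"),
    ("q0", "b"): ("q1", "b", "R"),
    ("q0", "B"): ("q2", "B", "R"),
    ("q1", "b"): ("q1", "b", "R"),
    ("q1", "B"): ("q2", "B", "R"),
}
FINAL = {"q2"}


def run(word, max_steps=1000):
    """Return (accepted, list of IDs written as strings)."""
    tape, head, state = list(word) + ["B"], 0, "q0"
    trace = []
    for _ in range(max_steps):
        trace.append("".join(tape[:head]) + state + "".join(tape[head:]))
        if state in FINAL:
            return True, trace
        key = (state, tape[head])
        if key not in DELTA:
            return False, trace      # no move defined: the machine halts and rejects
        state, tape[head], direction = DELTA[key]
        head += 1 if direction == "R" else -1
        if head < 0:
            return False, trace      # fell off the left end
        if head == len(tape):
            tape.append("B")         # the tape is extended with blanks on demand
    return False, trace


accepted, trace = run("aabbb")
print(" |- ".join(trace))
print(accepted, run("aba")[0])
```

The printed IDs follow the same steps as the instantaneous description above; the simulator shows one extra trailing blank because it extends the tape whenever the head moves past its end.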
Example 2: Construct the TM which accepts the language of strings that start with 0 and end with 1.
Solution: L = {01, 001, 011, 01011, ……}
It is a regular language, since it is represented by the regular expression
0 (0 + 1)* 1
Logic:
• At state q0, read the first symbol: if it is 0, change the state to q1 (1 cannot be the
first symbol).
• At state q1, when 0 is read, stay in q1 and move right.
• At state q1, when 1 is read, change the state to q2 and move right.
• At state q2, when 0 is read, change the state to q1; when 1 is read, the state does not
change. Keep moving right until the end of the string; if a blank is read in q2, the string is accepted.
Fig. 5.4
TM is as follows:
M = ({q0, q1, q2, q3}, {0, 1}, {0, 1, B}, δ, q0, B, {q3})
δ is
Table 5.3
State 0 1 B
q0 (q1, 0, R) – –
q1 (q1, 0, R) (q2, 1, R) –
q2 (q1, 0, R) (q2, 1, R) (q3, B, R)
q3 Final state
|– XX Y q1 1 |– XX q2 YY |– X q2 X YY
|– XX q0 YY |– XX Y q3 Y |– XX YY q3
|– XX YY B q4
As TM enters in q4, the string '0011' is accepted.
Example 4: Design a TM to recognize all strings consisting of even number of 1's. Assume
that the string is made up of only 1's.
Solution: We will construct TM as follows:
1. Let q0 be the initial state. The machine M will enter q1 after reading a '1' and it will
replace 1 by B.
2. In state q1, on reading a 1, the machine goes back to q0 after replacing that 1 with B.
3. q0 will be the final state.
Thus, if number of 1's are even in input string, then after scanning all the symbols in the
input, the machine will be in q0 which is the final state and the string is accepted, otherwise
reject.
Hence, TM M = ({q0, q1}, {1}, {1, B}, δ, q0, B, {q0}), where δ function is as shown in the
Table 5.5.
Table 5.5: Transition table for example 5.4
State 1 B
q0 (q1, B, R) Accept
q1 (q0, B, R) Reject
The ID for the word "1111" will be
q0 1111 B |– B q1 111 B
|– BB q0 11 B
|– BBB q1 1 B
|– BBBB q0 B accept.
The ID for the word "111" will be
q0 111 B |– B q1 11 B
|– BB q0 1 B
|– BBB q1 B.
Here, q1 is not a final state, therefore the word "111" is rejected.
Example 5: Construct a TM for language L = {a^m b^n | n ≥ m and m ≥ 1}.
Solution: The steps involved can be written as:
1. TM starts in state q0.
2. If the symbol a is read, it is replaced by X, the state changes from q0 to q1 and the head moves
right in search of the first b, skipping a's and Y's.
3. Replace that b by Y, change the state to q2 and move left until the first X is found; then
move right (state q0).
4. If the symbol is 'a', repeat the cycle from step 2; if the symbol is Y, all a's are matched, so move
right over the Y's (state q3).
5. If the current symbol is 'b', there is at least one extra b; check that the
remaining string consists only of b's followed by a blank symbol (state q4). If the current symbol
is 'B', the string is accepted as m = n.
We construct TM as follows:
M = ({q0, q1, q2, q3, q4, q5}, {a, b}, {a, b, X, Y, B}, δ, q0, B, {q5}), where δ function is as
shown in the Table 5.6.
Table 5.6
State   a            b            X            Y            B
q0      (q1, X, R)   –            –            (q3, Y, R)   –
q1      (q1, a, R)   (q2, Y, L)   –            (q1, Y, R)   –
q2      (q2, a, L)   –            (q0, X, R)   (q2, Y, L)   –
q3      –            (q4, b, R)   –            (q3, Y, R)   (q5, B, R)
q4      –            (q4, b, R)   –            –            (q5, B, R)
q5      final state
The TM process for string aabbb is as follows:
q0 aabbb |– X q1 abbb |– X a q1 bbb |– X q2 a Y bb
|– q2 X a Y bb |– X q0 a Y bb |– XX q1 Y bb
|– XX Y q1 bb |– XX q2 YY b |– X q2 X YY b
|– XX q0 YY b |– XX Y q3 Y b |– XX YY q3 b
|– XX YY b q4 B |– XX YY b B q5
As TM enters in q5, the string is accepted.
5. If leftmost X is read at state q3, state changes to q0 and X remains as it is and the cycle
repeats from step 1 again.
6. At q0 when symbol Y is read the state q4 enters and in q4 when B is read then string is
accepted. So we construct TM as follows:
M = ({q0, q1, q2, q3, q4, q5}, {a, b}, {a, b, X, Y, Z, B}, δ, q0, B, {q5}), where δ function is
shown in the Table 5.7.
Table 5.7
State a b X Y Z B
q0 (q1, X, R) – – (q4, Y, R) –
q1 (q1, a, R) (q2, Y, R) – (q1, Y, R) –
q2 (q3, Z, L) (q2, b, R) – – (q2, Z, R)
q3 (q3, a, L) (q3, b, L) (q0, X, R) (q3, Y, L) (q3, Z, L)
q4 – – – (q4, Y, R) (q4, Z, R) (q5, B, R)
q5 final state
Computation for string "aabbaa" is as follows:
q0 aabbaa |– X q1 abbaa |– X a q1 bbaa |– X a Y q2 baa |– X a Y b q2 aa
|– X a Y q3 b Z a |– X a q3 Y b Z a |– X q3 a Y b Z a |– q3 X a Y b Z a
|– X q0 a Y b Z a |– XX q1 Y b Z a |– XXYY q2 Z a
|– XX YY Z q2 a |– XX YY q3 ZZ |– XX Y q3 Y ZZ
|– XX q3 YY ZZ |– X q3 X YY ZZ |– XX q0 YY ZZ
|– XX Y q4 Y ZZ |– XX YY q4 ZZ |– XX YY Z q4 ZB
|– XX YY ZZ q4B |– XX YY ZZ B q5
String is accepted.
Example 7: Design a TM to recognize well-formedness of parenthesis.
Solution: We construct TM, M as follows:
M = ({q0, q1, q2, q3}, { (, )}, { (, ), X, Y, B}, δ, q0, B, {q3}), where δ function is shown in the
Table 5.8.
Table 5.8
State ( ) X Y B
q0 (q0, (, R) (q1, X, L) (q0, X, R) (q0, Y, R) (q2, B, L)
q1 (q0, Y, R) – (q1, X, L) (q1, Y, L) –
q2 – – (q2, X, L) (q2, Y, L) (q3, B, R)
q3 final state – – – –
Example 9: Design a TM which will replace every occurrence of a substring '11' by '10'
over {0, 1}.
Solution: We will construct TM as follows:
1. Let q0 be the initial state. At q0 when we read 1, change the state to q1 without replacing
the symbol. At q0 when we read 0, the state does not change.
2. At state q1, when we read 1 (substring 11 is found), replace 1 by 0 and change the state
back to q0. At state q1, when we read 0, also change the state back to q0, since the 1 just read
was not followed by another 1. When we read B, go to state q2, which is the final state.
TM is M = ({q0, q1, q2}, {0, 1}, {0, 1, B}, δ, q0, B, {q2})
where, δ is shown in Table 5.10.
Table 5.10
State   0            1            B
q0      (q0, 0, R)   (q1, 1, R)   (q2, B, R)
q1      (q0, 0, R)   (q0, 0, R)   (q2, B, R)
q2      final state
The ID for the word “011011” is
q0 011011B |– 0q0 11011B |– 01q1 1011B
|– 010q0 011B |– 0100q0 11B
|– 01001q1 1B |– 010010q0 B
|– 010010 q2 = accept
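Since the head of this machine only ever moves right, its effect on the tape can be reproduced by a single left-to-right pass. The Python sketch below illustrates this; the name `rewrite` is hypothetical, and sending q1 back to q0 on reading 0 is an assumption of the sketch, made so that only two adjacent 1's trigger a replacement.

```python
# (state, scanned symbol) -> (next state, symbol written); the move is always R.
DELTA = {
    ("q0", "0"): ("q0", "0"),
    ("q0", "1"): ("q1", "1"),   # remember that a 1 was just read
    ("q1", "0"): ("q0", "0"),   # assumed: the remembered 1 was not followed by a 1
    ("q1", "1"): ("q0", "0"),   # substring 11 found: write 0 over the second 1
}


def rewrite(word):
    """Return the tape contents when the machine reaches the blank and halts in q2."""
    state, out = "q0", []
    for symbol in word:
        state, written = DELTA[(state, symbol)]
        out.append(written)
    return "".join(out)


print(rewrite("011011"))
print(rewrite("101"), rewrite("111"))
```

For "011011" the result agrees with the final ID above. For "111" the sketch rewrites only the first pair, because after a replacement the machine starts looking for a fresh pair.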
|– Xq0 aYbZcB
|– XXq1 YbZcB |– XXYq1 bZcB
|– XXYYq2 ZcB
|– XXYYZq2 cB |– XXYYq3 ZZB
|– XXYq3 YZZB
|– XXq3 YYZZB |– Xq3 XYYZZB
|– XXq0 YYZZB
|– XXYq4 YZZB
|– XXYYq4 ZZB
|– XXYYZZq4 B |– XXYYZZq5 = accept
Example 11: Design a TM for language {wwR | w ∈ (0 + 1)*}
Solution: Logic:
If the first symbol read is 0, then the last symbol must also be 0; if the first symbol read is 1, then the last symbol must be 1. Matched symbols are replaced by X.
Fig. 5.6
The TM is
M = ({q0, q1, q2, q3, q4, q5, q6}, {0, 1}, {0, 1, X, B}, δ, q0, B, F) where δ is as:
Table 5.12
State 0 1 X B
q0 (q1, X, R) (q4, X, R) (q6, X, R) –
q1 (q1, 0, R) (q1, 1, R) (q2, X, L) (q2, B, L)
q2 (q3, X, L) – –
q3 (q3, 0, L) (q3, 1, L) (q0, X, R) –
q4 (q4, 0, R) (q4, 1, R) (q5, X, L) (q5, B, L)
q5 – (q3, X, L) – –
q6 accept state (q6, X, R) (q6, B, R)
Consider string “0110”.
q0 0110B |– Xq1110B
|– X1q110B |– X11q10B
|– X110q1B | – X11q2 0B
|– X1q3 1XB |– Xq3 11XB
q4 Final state
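M repeatedly replaces its leading 0 by a blank, then searches right for a 1 followed by a 0 and
changes that 0 to 1. Next, M moves left until it encounters a blank and then repeats the cycle. The
repetitions end if
1. Searching right for a 0, M encounters a blank. Then all n 0's to the right of the 1 have been changed
to 1's and n + 1 of the leading 0's have been changed to B. M replaces these 1's by B and one B by 0,
leaving m – n 0's on the tape.
2. Beginning a cycle, M finds no leading 0 to change to B. Then n ≥ m, so the proper difference m – n is 0,
and M replaces all remaining 1's and 0's by B.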
The function δ is as described below in the Table 5.15.
Table 5.15
State 0 1 B
q0 (q1, B, R) (q5, B, R) –
q1 (q1, 0, R) (q2, 1, R) –
q2 (q3, 1, L) (q2, 1, R) (q4, B, L)
q3 (q3, 0, L) (q3, 1, L) (q0, B, R)
q4 (q4, 0, L) (q4, B, L) (q6, 0, R)
q5 (q5, B, R) (q5, B, R) (q6, B, R)
q6 final state
If a 1 is encountered in state q0 instead of a 0, M enters q5 to erase the rest of
the tape, then enters q6 and halts. State q4 moves left, changing all 1's to B's, until it
encounters a B. That B is changed back to 0, state q6 is entered and M halts.
Computation for input 0010 (m – n = 2 – 1 = 1) is shown below.
q0 0010 |– B q1 010 |– B 0 q1 10 |– B 01 q2 0 |– B 0 q3 11 |– B q3 011 |– q3 B 011
|– B q0 011 |– BB q1 11 |– BB 1 q2 1 |– BB 11 q2 |– BB 1 q4 1 |– BB q4 1 |– B q4
|– B 0 q6 → accept.
If the input is 0100, the moves of the TM are shown below.
q0 0100 |– B q1 100 |– B 1 q2 00 |– B q3 110
|– q3 B 110 |– B q0 110 |– BB q5 10 |– BBB q5 0
|– BBBB q5 |– BBBBB q6
Here the tape is blank, i.e. output = 0.
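Table 5.15 can be run directly to check that the machine computes the proper difference. The Python sketch below is an illustration under stated assumptions: the input is taken to be 0^m 1 0^n as in the two computations above, the result is read off as the number of 0's left on the tape, and the name `monus` is hypothetical.

```python
# Table 5.15: (state, scanned symbol) -> (next state, symbol written, direction)
DELTA = {
    ("q0", "0"): ("q1", "B", "R"), ("q0", "1"): ("q5", "B", "R"),
    ("q1", "0"): ("q1", "0", "R"), ("q1", "1"): ("q2", "1", "R"),
    ("q2", "0"): ("q3", "1", "L"), ("q2", "1"): ("q2", "1", "R"),
    ("q2", "B"): ("q4", "B", "L"),
    ("q3", "0"): ("q3", "0", "L"), ("q3", "1"): ("q3", "1", "L"),
    ("q3", "B"): ("q0", "B", "R"),
    ("q4", "0"): ("q4", "0", "L"), ("q4", "1"): ("q4", "B", "L"),
    ("q4", "B"): ("q6", "0", "R"),
    ("q5", "0"): ("q5", "B", "R"), ("q5", "1"): ("q5", "B", "R"),
    ("q5", "B"): ("q6", "B", "R"),
}


def monus(m, n, max_steps=10_000):
    """Run the machine on 0^m 1 0^n and count the 0's left when it halts in q6."""
    tape = list("0" * m + "1" + "0" * n) + ["B"]
    head, state = 0, "q0"
    for _ in range(max_steps):
        if state == "q6":
            return tape.count("0")
        state, tape[head], direction = DELTA[(state, tape[head])]
        head += 1 if direction == "R" else -1
        if head == len(tape):
            tape.append("B")  # extend the tape with a blank on the right
    raise RuntimeError("step limit reached")


print(monus(2, 1), monus(1, 2), monus(3, 1))
```

The first two calls correspond to the inputs 0010 and 0100 traced above.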
Example 2: Design a TM to multiply two unary numbers.
Solution:
B 1 1 1 * 1 1 = B
• A multi-tape TM has a finite set of states Q, an initial state q0, a set Γ of tape symbols, a set of
final states F, F ⊆ Q and a blank symbol b ∉ ∑.
• Consider K tapes, each divided into cells. Fig. 5.8 represents a multi-tape TM.
Fig. 5.9 (figure label: finite control)
5.3 LINEAR BOUNDED AUTOMATON (LBA) [April 16, 19, Oct. 16]
• The machine that accepts context-sensitive languages is the Linear Bounded Automaton (LBA).
• An LBA is a special type of Turing machine with restricted tape space. The name 'linear
bounded' indicates that the usable tape is bounded by a linear function of the input length.
Difference between TM and LBA: [Oct. 16, 18]
1. In a TM the storage on the tape is not restricted: the tape head can move in both directions over
unbounded storage. In an LBA, the tape space occupied by the input string is the only tape space that may be used,
so the storage is restricted in size.
2. In an LBA, a linear function is used to restrict (to bound) the length of the tape.
3. In an LBA, the computation is done between end markers. In a TM, no end markers are
present; the end of the string is recognized when a blank (B) occurs.
• An LBA is a non-deterministic TM which has a single tape whose length is not infinite but
bounded by a linear function of the length of the input string.
Definition: LBA, M = (Q, Σ, Γ, δ, q0, B, ⊄, $, F)
where Q = Finite set of states
Σ = Input alphabet ∪ {⊄, $}
Γ = Tape alphabet
q0, F, B are the same as in the basic TM model.
⊄ and $ are two special symbols, ⊄, $ ∈ Σ.
• Both symbols are end markers. ⊄ is called the left-end marker which is entered in the
leftmost cell of the input tape and prevents the R/W head from getting off the left-end of the
tape.
• The $ is called the right-end marker which is entered in the rightmost cell of the input tape
and prevents the R/W head from getting off the right end of the tape. Both the end markers
should not appear on any other cell within the input tape, and the R/W head should not print
any other symbol over both the end markers.
5.3.1 Basic model of LBA
• The Fig. 5.10 shows the model of Linear Bounded Automata.
The TM is as follows:
Table 5.18
State 1 B
q0 (q0, 1, R) (q1, 1, R)
q1 (q1, 1, R) (q2, 1, R)
q2 (q2, 1, R) (q3, B, L)
q3 (q4, B, L) –
q4 (q5, B, L) –
q5 ← accept
Example 3: Construct TM which will subtract two unary numbers.
Solution:
Fig. 5.11
f (m, n) = m – n ∀ m ≥ n (proper subtraction)
Logic:
1. Skip all 1’s till B at state q0.
2. When B occurs change the state and search for last B. Change state to q2.
3. Change 1 to B and move to left at q2.
4. Repeat process till all 1’s are over.
Table 5.19
State 1 B
q0 (q0, 1, R) (q1, B, R)
q1 (q1, 1, R) (q2, B, L)
q2 (q3, B, L) (q7, B, L)
q3 (q3, 1, L) (q4, B, L)
q4 (q4, 1, L) (q5, B, R)
q5 (q0, B, R) (q6, 1, L)
q6 – (q7, B, L)
q7 ← accept
Output is,
B B B 1 1 1 B B B B ---
Fig. 5.12
Logic:
• At state q0, when 1st 0 is read, change the state q1 and 0 to X, move right.
• At q1, when 0 is read, change the state to q2 and 0 to X, move right.
• At q2 skip all 0’s till machine reads 1. State is changed to q3 and 1 changes to Y.
• At q3, move to left, skipping all 0’s till machine reads X. Change the state to q0.
This process repeats.
Table 5.22
State 0 1 X Y B
q0 (q1, X, R) – – (q4, Y, R) –
q1 (q2, X, R) – – – –
q2 (q2, 0, R) (q3, Y, L) – (q2, Y, R) –
q3 (q3, 0, L) – (q0, X, R) (q3, Y, L) –
q4 – – – (q4, Y, R) accept
Consider instantaneous description for string “001”.
q0 001B |– Xq101B
|– XXq2 1B
|– Xq3 XYB |– XXq0YB
|– XXYq4B |– XXYq4 → accept
Example 7: Design a TM that computes the remainder when a number is divided by 3.
Solution: First we draw the finite automaton for the machine and then we design the TM.
The finite automaton is,
Fig. 5.13
Here we get 3 remainders: 0, 1 and 2. So the machine has 3 states, in which we get the outputs 0, 1
and 2.
Let us replace (0, 3, 6, 9) by X
(1, 4, 7) by Y
(2, 5, 8) by Z
State q0 moves the machine to the right end of the input. State q1 moves the machine back till
it reaches to rightmost 1. In q2 machine moves back complementing each input symbol, till it
reaches blank and we get 2's complement of string.
PRACTICE QUESTIONS
Q.I Multiple Choice Questions:
1. Which is an accepting device which accepts the languages (recursively enumerable set)
generated by type 0 grammars and was invented in 1936 by Alan Turing?
(a) Turing Machine (TM) (b) Finite Machine (FM)
(c) Grammar Machine (GM) (d) Language Machine (LM)
2. The TM, is defined by how many tuples?
(a) 5 (b) 8
(c) 7 (d) 6
3. A TM consists of,
(a) input tape (b) read-write (R/W) head
(c) finite control (d) All of the mentioned
4. Which TM has multiple tapes where each tape is accessed with a separate head?
(a) Multi-tape (b) Multi-track
(c) nondeterministic TM (d) Two-way TM
5. Which is a multi-track non-deterministic TM with a tape of some bounded finite
length?
(a) Linear Finite Automaton (LFA) (b) Linear Bounded Automaton (LBA)
(c) Context Free Automaton (CFA) (d) None of the mentioned
6. A multi-track TM can be formally described as,
(a) 6 tuples (b) 8 tuples
(c) 7 tuples (d) 5 tuples
7. The mathematical notation for TM is,
(a) M = (Q, Σ, Γ, δ, q0, F) (b) M = (Q, Σ, Γ, δ, q0, B, ⊄, $, F)
(c) M = (Q, Σ, Γ, δ, q0, B, F) (d) None of the mentioned
8. The ID of the TM remembers the following at a given instance of time:
(a) The cell currently being scanned by the read–write head
(b) The state of the machine, and
(c) The contents of all the cells of the tape, starting from the rightmost cell up to at
least the last cell, containing a non-blank symbol and containing all cells up to the
cell being scanned
(d) All of the mentioned
9. The TM accepts all the language even though they are recursively,
(a) context-free (b) regular
(c) enumerable (d) All of the mentioned
Answers
1. (a) 2. (c) 3. (d) 4. (a) 5. (b) 6. (a) 7. (c) 8. (d) 9. (c)
Q.II Fill in the Blanks:
1. A ______ is a mathematical model consists of an infinite length tape divided into cells
on which input is given.
2. In TM each cell can store only ______ symbol.
3. A ______ is a non-deterministic TM which has a single tape whose length is not infinite
but bounded by a linear function of the length of the input string.
4. A Turing Machine (TM) accepts a language if it enters into a final state for any input
string ω and a language is recursively ______ (generated by Type-0 grammar) if it is
accepted by a TM.
5. The Turing machine, proposed by A.M. Turing in 1936, is the machine format of
______ language, i.e., all types of languages are accepted by the TM.
6. The instantaneous description (ID) of a Turing machine remembers the contents of all
cells from the rightmost to at least the leftmost, the cell currently being scanned by the
read–write head and the state of the machine at a given ______ of time.
7. The Turing machine, in short TM, is defined by 7 tuples M = ______.
8. In a ______ TM, a single tape head reads n symbols from n tracks in one step. It
accepts recursively enumerable languages, just as a normal single-track, single-tape
Turing Machine does.
9. The computation of a non-deterministic TM is a tree of configurations that can be
reached from the ______ configuration.
10. According to the ______ hierarchy, a Type-0 language is called an unrestricted language
(URG).
11. A linear bounded automaton can be defined as an 8-tuple M = ______.
Answers
1. Turing Machine (TM) 2. one 3. Linear Bounded Automaton (LBA) 4. enumerable
5. unrestricted 6. instance 7. (Q, Σ, Γ, δ, q0, B, F) 8. multi-track
9. start 10. Chomsky 11. (Q, Σ, Γ, δ, q0, B, ¢, $, F)
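The 7-tuple M = (Q, Σ, Γ, δ, q0, B, F) and the instantaneous description (ID) asked about above can be tried out with a small simulator. The following is a minimal Python sketch, not the book's own construction. It assumes a deterministic single-tape machine, a tape that is unbounded in both directions, and rejection whenever δ has no move. Q, Σ and Γ are left implicit in the keys and values of δ. The example machine for {a^n b^n | n ≥ 1}, the state names q0 to q4, the markers X and Y, and the helper names run_tm and describe are illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

BLANK = "B"  # the blank symbol B of the 7-tuple

# delta: (state, scanned symbol) -> (next state, symbol written, move "L" or "R")
Delta = Dict[Tuple[str, str], Tuple[str, str, str]]


def describe(tape: Dict[int, str], state: str, head: int) -> str:
    """Build one ID: the tape contents with the state placed before the scanned cell."""
    cells = [i for i, s in tape.items() if s != BLANK] + [head]
    lo, hi = min(cells), max(cells)
    left = "".join(tape.get(i, BLANK) for i in range(lo, head))
    right = "".join(tape.get(i, BLANK) for i in range(head, hi + 1))
    return f"{left}[{state}]{right}"


def run_tm(delta: Delta, q0: str, finals: Set[str], w: str,
           max_steps: int = 10_000) -> Tuple[bool, List[str]]:
    """Simulate the machine on w; return (accepted?, sequence of IDs)."""
    tape = dict(enumerate(w))          # sparse tape: cell index -> symbol
    state, head = q0, 0
    ids: List[str] = []
    for _ in range(max_steps):
        ids.append(describe(tape, state, head))
        if state in finals:
            return True, ids           # entered a final state: accept
        key = (state, tape.get(head, BLANK))
        if key not in delta:
            return False, ids          # no move defined: halt and reject
        state, symbol, move = delta[key]
        tape[head] = symbol
        head += 1 if move == "R" else -1
    raise RuntimeError("step limit reached; the machine may not halt")


# Hypothetical example machine for {a^n b^n | n >= 1}: mark one a as X,
# find the matching b and mark it Y, return, and repeat.
ANBN: Delta = {
    ("q0", "a"): ("q1", "X", "R"),
    ("q1", "a"): ("q1", "a", "R"),
    ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "b"): ("q2", "Y", "L"),
    ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "a"): ("q2", "a", "L"),
    ("q2", "X"): ("q0", "X", "R"),
    ("q0", "Y"): ("q3", "Y", "R"),
    ("q3", "Y"): ("q3", "Y", "R"),
    ("q3", BLANK): ("q4", BLANK, "R"),
}

accepted, ids = run_tm(ANBN, "q0", {"q4"}, "aabb")
print(accepted)
print(" |- ".join(ids))
```

Each entry of ids is one ID: it records the state, the scanned cell (the first symbol after the bracketed state), and the non-blank portion of the tape, which are exactly the three items listed in the ID question above.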
October 2016
1. Define Multi-tape Turing Machine. [1 M]
Ans. Refer to Section 5.2.2.
2. Construct a TM for a language L, where L = {a^{m+n} b^m c^n | m, n ≥ 1} [5 M]
Ans. Refer to Section 5.1.3, Examples.
3. Differentiate between TM and LBA. [3 M]
Ans. Refer to Page 5.20.
April 2017
1. Write the tuples of Turing Machine. [1 M]
Ans. Refer to Section 5.1.2.
2. Construct a TM for L = {wcw^R | w ∈ (a + b)*}. [5 M]
Ans. Refer to Section 5.1.3, Examples.
October 2017
1. Define non-deterministic Turing Machine. [1 M]
Ans. Refer to Section 5.2.4.
April 2018
1. Design a TM to recognize well-formedness of parentheses ( ). [4 M]
Ans. Refer to Section 5.1.3.
October 2018
1. Define Turing Machine (T.M.). [1 M]
Ans. Refer to Section 5.1.2.
April 2019
1. Write the tuples of LBA. [1 M]
Ans. Refer to Section 5.3.
2. Construct TM for language: L = {a^m b^n | n > m, m > 1} [5 M]
Ans. Refer to Section 5.1.3, Examples.
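As an illustration of what a "construct a TM" answer looks like, the April 2018 question on well-formed parentheses can be sketched as a transition table and run in Python. This is one possible design under stated assumptions, not the solution given in Section 5.1.3: the input uses only the symbols ( and ), the tape is unbounded in both directions, a missing move means reject, and the empty string is treated as well-formed. The state names q0 to q3, the marker X, and the names PAREN_DELTA and is_well_formed are hypothetical.

```python
BLANK = "B"

# (state, scanned symbol) -> (next state, symbol written, head move)
PAREN_DELTA = {
    ("q0", "("): ("q0", "(", "R"),      # skip "(" while searching right for ")"
    ("q0", "X"): ("q0", "X", "R"),      # skip cells that are already matched
    ("q0", ")"): ("q1", "X", "L"),      # cross off ")" and look left for its partner
    ("q0", BLANK): ("q2", BLANK, "L"),  # end of input: begin the final check
    ("q1", "X"): ("q1", "X", "L"),
    ("q1", "("): ("q0", "X", "R"),      # cross off the nearest "(" to the left
    ("q2", "X"): ("q2", "X", "L"),      # final check: only X may remain
    ("q2", BLANK): ("q3", BLANK, "R"),  # reached the left end: accept
}


def is_well_formed(w: str, max_steps: int = 100_000) -> bool:
    """Run the table on w; q3 is the only final state."""
    tape = dict(enumerate(w))
    state, head = "q0", 0
    for _ in range(max_steps):
        if state == "q3":
            return True
        move = PAREN_DELTA.get((state, tape.get(head, BLANK)))
        if move is None:
            return False                # no move defined: halt and reject
        state, tape[head], direction = move
        head += 1 if direction == "R" else -1
    raise RuntimeError("step limit reached")


for s in ["()", "(())()", "(()", ")("]:
    print(repr(s), is_well_formed(s))
```

An unmatched ")" is rejected in q1 when the head runs off the left end without finding "(", and an unmatched "(" is rejected in q2 during the final leftward check.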