Backus Forma PDF

To modify NFAs to recognize context-free languages, we need to add a stack, which is unbounded. This gives us pushdown automata (PDAs). PDAs extend NFAs with a stack. The finite state control of a PDA operates similarly to an NFA, with transitions between states. However, PDAs can also perform stack operations - pushing symbols onto the stack and popping symbols from the stack. This allows PDAs to recognize languages defined by context-free grammars by simulating the derivations of the grammar on the stack. By pushing and popping symbols during transitions, the PDA can verify that the input string is a valid sentential form that can be derived from the start symbol. Thus,

Uploaded by

yugirio

Backus-Naur Form (BNF)

Backus-Naur Form (BNF) is a notation technique used to describe recursively the syntax of
– programming languages
– document formats
– communication protocols
– etc.

⟨digit⟩ ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
⟨unsigned integer⟩ ::= ⟨digit⟩ | ⟨unsigned integer⟩⟨digit⟩
⟨integer⟩ ::= ⟨unsigned integer⟩ | + ⟨unsigned integer⟩ | − ⟨unsigned integer⟩
⟨letter⟩ ::= a | b | c | ...
⟨identifier⟩ ::= ⟨letter⟩ | ⟨identifier⟩⟨letter⟩ | ⟨identifier⟩⟨digit⟩
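
The recursive rules above translate almost line by line into recursive recognisers. The following is a minimal sketch, not part of the original slides; it assumes that "a | b | c | ..." means the lowercase ASCII letters and accepts both the ASCII hyphen and the typographic minus as the sign.

```python
DIGITS = set("0123456789")
# Assumption: the elided letter list "a | b | c | ..." is a..z.
LETTERS = set("abcdefghijklmnopqrstuvwxyz")


def is_unsigned_integer(s):
    # <unsigned integer> ::= <digit> | <unsigned integer><digit>
    if len(s) == 1:
        return s in DIGITS
    return len(s) > 1 and is_unsigned_integer(s[:-1]) and s[-1] in DIGITS


def is_integer(s):
    # <integer> ::= <unsigned integer> | + <unsigned integer> | − <unsigned integer>
    if s[:1] in ("+", "-", "−"):
        return is_unsigned_integer(s[1:])
    return is_unsigned_integer(s)


def is_identifier(s):
    # <identifier> ::= <letter> | <identifier><letter> | <identifier><digit>
    if len(s) == 1:
        return s in LETTERS
    return len(s) > 1 and is_identifier(s[:-1]) and (s[-1] in LETTERS or s[-1] in DIGITS)
```

Each function peels off the last symbol, exactly as the left-recursive rules do, so an identifier must start with a letter while an unsigned integer may not be empty.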

BNF was designed in the 1950–60s to define the syntax of the programming language ALGOL

in fact, this is an example of a context-free grammar (Chomsky, 1956)


Fundamentals of Computing (7) 1
Compilers
convert a high-level language into a machine-executable language

For example, ((3 + 4) ∗ (6 + 7)) is converted into

LOAD 3 in register 1
LOAD 4 in register 2
ADD contents of register 2 into register 1
LOAD 6 in register 3
LOAD 7 in register 4
ADD contents of register 3 into register 4
MULTIPLY register 1 by register 4



Defining languages recursively

Example 1. L = {a^n b^n | n ≥ 0}

Basis: ε ∈ L (the empty word is in L)              L → ε (r1)

Induction: if w is a word in L, then so is awb     L → aLb (r2)

BNF notation: L ::= ε | aLb

(r1), (r2) are understood as (substitution) rules (or productions) that generate
all words in L

For example, the word aabb is generated (or derived) as follows:

L ⇒ aLb           replace L with aLb by rule (r2)
aLb ⇒ aaLbb       replace L with aLb by rule (r2)
aaLbb ⇒ aaεbb     replace L with ε by rule (r1)

Thus we obtain the derivation L ⇒ aLb ⇒ aaLbb ⇒ aaεbb = aabb

a word w can be derived using (r1) and (r2) if, and only if, w ∈ L
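
The derivation and the membership claim can be checked mechanically. The sketch below is an illustration only (the function names `derive` and `in_L` are made up here): `derive` applies (r2) n times and then (r1), and `in_L` is a recursive test that mirrors the basis and induction steps.

```python
def derive(n):
    """Return the sentential forms of the derivation L => ... => a^n b^n."""
    forms = ["L"]
    for _ in range(n):
        # rule (r2): replace L with aLb
        forms.append(forms[-1].replace("L", "aLb"))
    # rule (r1): replace L with the empty word
    forms.append(forms[-1].replace("L", ""))
    return forms


def in_L(w):
    """Membership test following the recursive definition of L."""
    if w == "":  # basis, rule (r1)
        return True
    # induction, rule (r2): w must look like a w' b with w' in L
    return len(w) >= 2 and w[0] == "a" and w[-1] == "b" and in_L(w[1:-1])


print(" => ".join(derive(2)))
```

For n = 2 the forms are L, aLb, aaLbb, aabb, which is the derivation shown above with the ε step already simplified.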
Palindromes

Example 2. Define the language P of palindromes over {0, 1}


(a palindrome is a string that reads the same forward and backward, e.g., ‘madamimadam’
or ‘Damn. I, Agassi, miss again. Mad’)

Basis: ε ∈ P, 0 ∈ P, 1 ∈ P
    P → ε (r1)
    P → 0 (r2)
    P → 1 (r3)
Induction: if w is a word in P, then so are 0w0 and 1w1
    P → 0P0 (r4)
    P → 1P1 (r5)

BNF notation: P ::= ε | 0 | 1 | 0P0 | 1P1

Construct a derivation of 01010

Exercise. Use the Pumping Lemma to show that P is not regular



Context-free grammars

A context-free grammar (CFG) consists of 4 components G = (V, Σ, R, S)

– V is a finite set of symbols called variables (or nonterminals)
  each variable represents a language (such as L and P in Examples 1, 2)

– S ∈ V is a start variable
  other variables in V represent auxiliary languages we need to define S

– Σ is a finite set of symbols called terminals (V ∩ Σ = ∅)
  terminals give alphabets of languages (such as {a, b} and {0, 1} in Examples 1, 2)

– R is a finite set of rules (or productions) of the form A → w
  where A is a variable and w is a string of variables and terminals
  rules give a recursive definition of the language

Informally: to generate a string of terminal symbols from G, we:


– Begin with the start variable.
– Apply one of the productions with the start symbol on the left-hand side,
replacing the start symbol with the right-hand side of the production
– Repeat selecting variables and replacing them with the right-hand side of some
corresponding production, until all variables have been replaced by terminal symbols
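
The informal procedure above can be sketched as a small random generator. This is an illustration under stated assumptions, not a construction from the slides: the grammar is encoded as a dictionary from single-character variables to lists of right-hand sides, every other character counts as a terminal, and the name `generate` and the step limit are invented for the example.

```python
import random

# Grammar of Example 1, L ::= ε | aLb (the dictionary encoding is an assumption).
RULES = {"L": ["", "aLb"]}


def generate(rules, start, rng, max_steps=50):
    """Repeatedly replace a variable by a right-hand side until none are left."""
    form = start
    for _ in range(max_steps):
        positions = [i for i, c in enumerate(form) if c in rules]
        if not positions:
            return form  # only terminals remain: a word of the language
        i = rng.choice(positions)         # select a variable
        rhs = rng.choice(rules[form[i]])  # select one of its productions
        form = form[:i] + rhs + form[i + 1:]
    return None  # gave up: variables were still present after max_steps


rng = random.Random(0)
print([generate(RULES, "L", rng) for _ in range(5)])
```

Every word returned for this grammar has the form a^n b^n; the step limit only guards against runs that keep choosing the recursive rule.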
CFGs: derivations and languages

Let G = (V, Σ, R, S) be a CFG

For strings u and v of variables and terminals, we say that:

v is derivable from u in one step in G, written u ⇒¹_G v, if
v can be obtained from u by replacing some occurrence of A in u with w,
where A → w is a rule in R

v is derivable from u in G, written u ⇒_G v, if there are u1, u2, ..., uk such that

u ⇒¹_G u1 ⇒¹_G u2 ⇒¹_G · · · ⇒¹_G uk ⇒¹_G v     (derivation of v from u in G)

The language of the grammar G consists of all words over Σ that are derivable
from the start variable S

L(G) = {w ∈ Σ∗ | S ⇒_G w}

L(G) is a context-free language
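
One-step derivability and a bounded enumeration of L(G) can be sketched as follows. The encoding (single-character variables as dictionary keys) and the names `one_step` and `language_up_to` are assumptions made for this example. The search prunes forms with too many terminals, which is safe because terminals never disappear in a CFG derivation; the additional cap on the total length of a form is a heuristic that keeps the search finite and may miss words for grammars that need many variables at once.

```python
from collections import deque


def one_step(u, rules):
    """All v derivable from u in one step: replace one occurrence of a
    variable A in u by w, for some rule A -> w."""
    return {u[:i] + w + u[i + 1:]
            for i, A in enumerate(u) if A in rules
            for w in rules[A]}


def language_up_to(rules, start, max_len):
    """Words of L(G) of length <= max_len, found by breadth-first search
    over sentential forms."""
    def terminals(u):
        return sum(1 for c in u if c not in rules)

    seen, queue, words = {start}, deque([start]), set()
    while queue:
        u = queue.popleft()
        if terminals(u) == len(u):  # no variables left
            words.add(u)
            continue
        for v in one_step(u, rules):
            # heuristic cap on len(v); sound pruning on the terminal count
            if v not in seen and terminals(v) <= max_len and len(v) <= 2 * max_len + 2:
                seen.add(v)
                queue.append(v)
    return words


PALINDROMES = {"P": ["", "0", "1", "0P0", "1P1"]}
print(sorted(language_up_to(PALINDROMES, "P", 3)))
```

Run on the palindrome grammar of Example 2, the search returns exactly the binary palindromes of length at most 3.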


Nonpalindromes

Example 3. Define the language N of nonpalindromes over {0, 1}

Basis: 0w1 ∈ N and 1w0 ∈ N, for any w ∈ {0, 1}∗
  (we have to define the language A = {0, 1}∗ of all binary words as well)
Induction: if w is in N, then so are 0w0 and 1w1

This language can be defined by the following grammar G:

N → 0A1      A → ε
N → 1A0      A → 0A
N → 0N0      A → 1A
N → 1N1

BNF: N ::= 0A1 | 1A0 | 0N0 | 1N1      A ::= ε | 0A | 1A

Test: is 0010 derivable in G from N?

N ⇒¹_G 0N0 ⇒¹_G 00A10 ⇒¹_G 00ε10 = 0010

More tests: N ⇒_G 1011 ?   0NA0 ⇒_G 001A0 ?   N ⇒_G A ?
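
As a sanity check on the grammar, the rules for N can be read as a recursive membership test and compared with the direct definition of a nonpalindrome. This is a sketch with an invented function name; it relies on the fact that A derives every binary word, so the rules N → 0A1 and N → 1A0 accept any word whose first and last symbols differ.

```python
def in_N(w):
    """Is the binary word w derivable from N in the grammar G above?"""
    if len(w) < 2:
        return False           # every rule for N produces at least two terminals
    if w[0] != w[-1]:
        return True            # N -> 0A1 | 1A0, and A derives any middle part
    return in_N(w[1:-1])       # N -> 0N0 | 1N1


print(in_N("0010"), in_N("1011"), in_N("0110"))
```

On all short binary words the test agrees with the check w != reversed(w), which supports the claim that G defines the nonpalindromes.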
Regular languages are context-free

Example 4: show that the language of the regular expression 0∗1(0 ∪ 1)∗ is context-free

This language can be defined by the following grammar:

S → A1B
A → ε
A → 0A
B → ε
B → 0B
B → 1B

BNF: S ::= A1B
     A ::= ε | 0A
     B ::= ε | 0B | 1B

Every regular language is also a context-free language

it is also easy to encode DFAs as CFGs (states as variables, transitions as rules)
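
The encoding of DFAs as CFGs can be sketched in a few lines: a transition from q to r on symbol a becomes the rule Q → aR, and every accepting state q gets the rule Q → ε. The function name, the data layout and the state names q0, q1 are assumptions for this illustration; the DFA used is one possible automaton for 0∗1(0 ∪ 1)∗.

```python
def dfa_to_cfg(delta, accepting):
    """delta maps (state, symbol) to the next state.
    Returns a dict: state (used as a variable) -> list of right-hand sides,
    each right-hand side being a tuple of symbols."""
    rules = {}
    for (q, a), r in delta.items():
        rules.setdefault(q, []).append((a, r))  # transition q -a-> r gives Q -> aR
        rules.setdefault(r, [])
    for q in accepting:
        rules.setdefault(q, []).append(())      # accepting state gives Q -> ε
    return rules


# Hypothetical DFA for 0*1(0 ∪ 1)*: q0 is the start state, q1 is accepting.
delta = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q1", ("q1", "1"): "q1"}
for variable, bodies in dfa_to_cfg(delta, {"q1"}).items():
    print(variable, "::=", " | ".join("".join(b) or "ε" for b in bodies))
```

The start variable of the resulting grammar is the variable of the DFA's start state (q0 here).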
Applications of CFGs

Consider the language of the CFG S ::= ε | (S) | SS


can you describe it in English?

The language of this CFG consists of all strings of ‘(’ and ‘)’
with balanced parentheses
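
A direct check for this language is short. The sketch below (function name chosen here) keeps a counter of currently open parentheses; with only one kind of bracket the counter plays the role that the height of a stack would play.

```python
def balanced(s):
    """True if s consists only of '(' and ')' and the parentheses are balanced."""
    depth = 0
    for c in s:
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
            if depth < 0:      # a ')' with no matching '(' before it
                return False
        else:
            return False       # symbol outside the alphabet
    return depth == 0          # every '(' has been closed


print(balanced("(())()"), balanced(")("))
```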

CFGs are used to

– describe fragments of natural languages in linguistics (N. Chomsky)

– describe programming languages and markup languages (HTML)
  (and other recursive concepts in Computer Science)

– perform syntactic analysis in compilers:
  before a compiler can do anything, it scans the input program (a string of ASCII characters)
  and determines the syntactic structure of the program. This process is called parsing.

– give document type definitions in XML


Problem

How can NFAs be modified so that they can recognise context-free languages?



Pushdown automata

A (nondeterministic) pushdown automaton (PDA) is like an NFA,
except that it has a stack that can be used to record a potentially unbounded
amount of information (in some special way)

[Figure: a PDA consists of a finite control, a reading head that scans the input (a b a b b c ...) left to right, read only, and a pushdown stack. The finite control can push symbols onto the top of the stack or pop them off the top of the stack (last-in-first-out).]

A stack is a last in, first out abstract data type and data structure



PDA for {a^n b^n | n ≥ 0}

– Read symbols from the input; as each a is read, push it onto the stack

– As soon as b’s are seen, pop an a off the stack for each b read

– If reading the input is finished exactly when the stack becomes empty,
accept the input

– Otherwise reject the input


– How to test for an empty stack? (bottom)
  Push initially some special symbol, say ⊥, on the stack

The PDA (start state q1):

  q1 → q2   on ε, ε/⊥
  q2 → q2   on a, ε/a
  q2 → q3   on b, a/ε
  q3 → q3   on b, a/ε
  q3 → q4   on ε, ⊥/ε

A transition from q to r labelled a, x/α (α a string) means:
if the PDA is in state q, reads a from the input and symbol x is on top of the stack,
then the PDA replaces x with α and moves to state r
(as before, a and x can be ε)
what is the language of this automaton if we ignore the stack?
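
The informal strategy for {a^n b^n | n ≥ 0} can be sketched with an explicit stack. This is an illustration rather than the automaton itself: the function name is invented, and it assumes that the empty input is accepted, which for the four-state PDA corresponds to treating q1 as accepting (the extracted diagram does not show which states are accepting).

```python
def accepts_anbn(w):
    """Direct simulation of the push/pop strategy for a^n b^n."""
    stack = ["⊥"]                       # push the bottom marker first
    i = 0
    while i < len(w) and w[i] == "a":   # as each a is read, push it
        stack.append("a")
        i += 1
    while i < len(w) and w[i] == "b" and stack[-1] == "a":
        stack.pop()                     # pop an a off the stack for each b read
        i += 1
    # accept if the input is finished exactly when only ⊥ is left
    return i == len(w) and stack == ["⊥"]


print([w for w in ["", "ab", "aabb", "aab", "abb", "ba"] if accepts_anbn(w)])
```

Once the second loop has started, any further a stops the run with unread input, so words such as abab are rejected.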
Exercise

For Σ = {a, b}, design a PDA and a CFG for the language

L = {w ∈ Σ∗ | w contains an equal number of a’s and b’s}

– The strategy will be to keep the excess symbols, either a’s or b’s, on the stack

– One state will represent an excess of a’s

– Another state will represent an excess of b’s

– We can tell when the excess switches from one symbol to the other because
at that point the stack will be empty (⊥ on top)

– In fact, when the stack is empty, we may return to the start state



Exercise (cont.)
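CFG: S ::= ε | aSb | bSa | SS

PDA (start state marked >, two further states 'a>b' and 'b>a'):

  start → a>b   on a, ε/⊥
  a>b → a>b     on a, ε/a
  a>b → a>b     on b, a/ε
  a>b → start   on b, ⊥/ε
  start → b>a   on b, ε/⊥
  b>a → b>a     on b, ε/b
  b>a → b>a     on a, b/ε
  b>a → start   on a, ⊥/ε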
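
The excess-symbol strategy can be simulated directly. The sketch below is one reading of the three-state automaton (start state, a state for an excess of a's, a state for an excess of b's); the state names and the assumption that the start state is the accepting state are mine. The first excess symbol is recorded as ⊥ and each further one as a or b.

```python
def accepts_equal(w):
    """Simulate the three-state PDA for words with equally many a's and b's."""
    state, stack = "start", []
    for c in w:
        if c not in ("a", "b"):
            return False
        if state == "start":
            stack.append("⊥")                   # first excess symbol: push ⊥
            state = "a>b" if c == "a" else "b>a"
        else:
            excess = "a" if state == "a>b" else "b"
            if c == excess:
                stack.append(excess)            # the excess grows: push
            elif stack.pop() == "⊥":            # the excess shrinks: pop
                state = "start"                 # ⊥ was on top: excess is now zero
    return state == "start"


print(accepts_equal("abba"), accepts_equal("aab"))
```

The simulation is deterministic because in each state the next move is fixed by the input symbol and the top of the stack.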



A formal definition of PDAs

A PDA is a 6-tuple A = (Q, Σ, Γ, δ, s, F ) where (cf. the definition of NFAs)

– Q is a finite set of states

– Σ is a finite set, the input alphabet

– Γ is a finite set, the stack alphabet

– s ∈ Q is the initial state

– F ⊆ Q is the set of accepting states



– δ is a transition relation consisting of ‘instructions’ of the form (q, a, x), (r, α)
where q, r are states, a is a symbol from Σ (input) or ε, x is a symbol from Γ (stack) or ε,
and α is a word over Γ (stack), meaning intuitively that
if (1) A is in state q reading input symbol a on the input tape and
(2) symbol x is on the top of the stack,
then the PDA can (nondeterminism!)
(a) pop x off the stack and push α onto the stack (the first symbol in α is on the top),
(b) move its head right one cell past the a (if a is not ε) and enter state r
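
The definition can be turned into a small simulator that explores all computations breadth-first. This is a sketch under stated assumptions: instructions are pairs ((q, a, x), (r, alpha)) of Python strings with '' standing for ε, the stack is a string whose first character is the top, and the exploration stops after `max_configs` configurations because a PDA with ε-moves may have infinitely many. The example instructions are those of the PDA for {a^n b^n | n ≥ 0}, with q1 and q4 assumed to be the accepting states.

```python
from collections import deque


def pda_accepts(delta, start, accepting, w, max_configs=100_000):
    """Is there a computation that reads all of w and ends in an accepting state?"""
    first = (start, 0, "")              # (state, input position, stack)
    seen, queue = {first}, deque([first])
    while queue and len(seen) <= max_configs:
        q, i, stack = queue.popleft()
        if i == len(w) and q in accepting:
            return True
        for (p, a, x), (r, alpha) in delta:
            # the instruction applies if the state matches, a is next on the
            # input (or a is ε) and x is on top of the stack (or x is ε)
            if p != q or not w.startswith(a, i) or not stack.startswith(x):
                continue
            cfg = (r, i + len(a), alpha + stack[len(x):])  # pop x, push alpha
            if cfg not in seen:
                seen.add(cfg)
                queue.append(cfg)
    return False


ANBN = [
    (("q1", "", ""), ("q2", "⊥")),
    (("q2", "a", ""), ("q2", "a")),
    (("q2", "b", "a"), ("q3", "")),
    (("q3", "b", "a"), ("q3", "")),
    (("q3", "", "⊥"), ("q4", "")),
]
print(pda_accepts(ANBN, "q1", {"q1", "q4"}, "aabb"))
```

Keeping a set of visited configurations avoids exploring the same configuration twice, which matters once several nondeterministic branches reach the same point.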



Computations of PDAs

Configuration of PDA A: (state, word on tape, stack)

Computation of PDA A on input w: (can be many computations!)

(s, au, ε)     s is the initial state, w = au and the stack is empty
    ↓          if A contains an instruction (s, a, ε), (r, xy) then
(r, u, xy)     r is the next state, head scans first symbol in u, stack is xy
    ↓          if A contains an instruction (r, ε, x), (q, ε) then
(q, u, y)      q is the next state, head scans first symbol in u, stack is y
   ...
(t, ε, α)      if t is accepting (t ∈ F), then the computation is accepting
               (similar to computations of NFAs)

Computations can also get stuck, end with non-accepting states, or even loop

Exercise: design PDA recognising the language over {(, )} with balanced parentheses
Using nondeterminism

Design a PDA recognising the language L = {a^i b^j c^k | i = j or i = k}

L contains strings such as aabbc, aabcc, but not abbcc

Idea: start by reading and pushing the a’s. When the a’s are done, the PDA can match
them with either the b’s or the c’s. Here we use nondeterminism!

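The PDA (start state q1):

  q1 → q2   on ε, ε/⊥
  q2 → q2   on a, ε/a
  q2 → q3   on ε, ε/ε
  q3 → q3   on b, a/ε
  q3 → q4   on ε, ⊥/ε
  q4 → q4   on c, ε/ε
  q2 → q5   on ε, ε/ε
  q5 → q5   on b, ε/ε
  q5 → q6   on ε, ε/ε
  q6 → q6   on c, a/ε
  q6 → q7   on ε, ⊥/ε

this language cannot be recognised by a deterministic PDA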
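
The nondeterministic guess can be sketched by running both branches and accepting if either one succeeds. This is an illustration with invented function names, not a general PDA simulator: each branch pushes the a's and pops one a for every occurrence of its chosen symbol (b for the branch through q3 and q4, c for the branch through q5, q6 and q7).

```python
def run_branch(w, popper):
    """One deterministic branch: match the a's against the symbol `popper`."""
    stack, last = ["⊥"], "a"
    for ch in w:
        # letters must be a, b or c and appear in the order a...b...c
        if ch not in ("a", "b", "c") or ch < last:
            return False
        last = ch
        if ch == "a":
            stack.append("a")
        elif ch == popper:
            if stack[-1] != "a":
                return False       # more poppers than a's
            stack.pop()
        # the third symbol is read without touching the stack
    return stack == ["⊥"]          # all a's have been matched


def accepts(w):
    """Accept if some branch accepts (this is the nondeterministic choice)."""
    return run_branch(w, "b") or run_branch(w, "c")


print(accepts("aabbc"), accepts("aabcc"), accepts("abbcc"))
```

Trying both branches works here because there are only two guesses; the PDA itself makes the guess with an ε-move out of q2.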
CFGs and PDAs

Context-free languages are precisely the languages recognised by pushdown automata

– There is an algorithm that, given any CFG G, constructs a PDA A such that L(A) = L(G)

– There is an algorithm that, given any PDA A, constructs a CFG G such that L(G) = L(A)

The following languages are not context-free:

– {ww | w ∈ {0, 1}∗}

– {a^n b^n c^n | n ≥ 0}

– {a^(2^n) | n ≥ 0}

this can be shown using an analogue of the pumping lemma for context-free languages



Unrestricted grammars (not examinable)

An unrestricted grammar consists of 4 components G = (V, Σ, R, S)

– V is a finite set of variables
– S ∈ V is a start variable
– Σ is a finite set of terminals (V ∩ Σ = ∅)
– R is a finite set of rules (or productions) of the form α → β
  where α and β are strings of variables and terminals (in CFGs, α is a variable!)

For strings u and v of variables and terminals, we say that

v is derivable from u in one step in G, written u ⇒¹_G v, if
v can be obtained from u by replacing some substring α in u with β,
where α → β is a rule in R

Example. The grammar G: S → aBSc, S → abc, Ba → aB, Bb → bb
generates the (non-context-free) language {a^n b^n c^n | n ≥ 1}

S ⇒¹_G aBSc ⇒¹_G aBabcc ⇒¹_G aaBbcc ⇒¹_G aabbcc
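
Derivations in this example grammar can be explored by plain string rewriting. The sketch below uses invented names and depends on a property of this particular grammar: no rule makes a string shorter, so forms longer than the target can be discarded and the search always terminates. For unrestricted grammars in general no such bound exists.

```python
from collections import deque

RULES = [("S", "aBSc"), ("S", "abc"), ("Ba", "aB"), ("Bb", "bb")]


def one_step(u, rules):
    """All strings obtained from u by replacing one occurrence of a left-hand side."""
    out = set()
    for lhs, rhs in rules:
        i = u.find(lhs)
        while i != -1:
            out.add(u[:i] + rhs + u[i + len(lhs):])
            i = u.find(lhs, i + 1)
    return out


def derivable(target, rules, start="S"):
    """Breadth-first search for a derivation of target from start.
    Pruning by length is only sound because no rule here shortens a string."""
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in one_step(u, rules):
            if len(v) <= len(target) and v not in seen:
                seen.add(v)
                queue.append(v)
    return False


print(derivable("aabbcc", RULES), derivable("aabbc", RULES))
```

The search confirms, for instance, that aabbcc is derivable while the empty word and abcabc are not.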
Testing membership in languages

Problem: given a string w and a language L, decide whether w is in L

– for L given by a DFA: simulate the DFA processing of w;
  the test takes time proportional to |w|

– for L given by an NFA with k states:
  the test can be done in time proportional to |w| × k^2
  (each input symbol can be processed by taking the previous set of (at most k) states and
  looking at the successors of each of these states)

– for L given by a CFG of size k: the test can be done in time proportional to |w|^3 × k^2

– for L given by an unrestricted grammar:
  the problem cannot be solved by any mechanical procedure
  (such as a computer program)
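
The NFA and CFG cases can be sketched as follows. Both functions are illustrations under assumptions that the slides do not state: the NFA has no ε-moves, and the CFG test is the standard CYK table-filling method, which requires the grammar to be in Chomsky normal form (rules A → BC and A → a only). The example NFA (words containing a 1) and the normal-form grammar for {a^n b^n | n ≥ 1} are made up for the demonstration.

```python
def nfa_accepts(delta, start, accepting, w):
    """Track the set of states the NFA can be in after each input symbol."""
    current = {start}
    for c in w:
        current = {r for q in current for r in delta.get((q, c), ())}
    return bool(current & accepting)


def cyk(w, unit, binary, start):
    """unit: terminal -> set of variables A with a rule A -> terminal.
    binary: list of triples (A, B, C), one per rule A -> BC."""
    n = len(w)
    if n == 0:
        return False  # this sketch ignores a possible rule S -> ε
    # table[i][l] holds the variables that derive the substring w[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, c in enumerate(w):
        table[i][1] = set(unit.get(c, ()))
    for l in range(2, n + 1):              # substring length
        for i in range(n - l + 1):         # start position
            for k in range(1, l):          # split point
                for A, B, C in binary:
                    if B in table[i][k] and C in table[i + k][l - k]:
                        table[i][l].add(A)
    return start in table[0][n]


# Hypothetical NFA over {0, 1}: in state p, a 1 may move to the accepting state q.
NFA = {("p", "0"): {"p"}, ("p", "1"): {"p", "q"},
       ("q", "0"): {"q"}, ("q", "1"): {"q"}}
# Normal-form grammar for {a^n b^n | n ≥ 1}: S -> AB | AX, X -> SB, A -> a, B -> b
UNIT = {"a": {"A"}, "b": {"B"}}
BINARY = [("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")]

print(nfa_accepts(NFA, "p", {"q"}, "0010"), cyk("aabb", UNIT, BINARY, "S"))
```

The three nested loops over substring length, start position and split point are where the cubic dependence on |w| comes from; the innermost loop over the rules contributes the grammar-size factor.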

Is it possible to design a formal model of computation that would
capture capabilities of any computer program?