Fundamentals of Data Structures in C - Ellis Horowitz, Sartaj Sahni
Algorithm
• Definition: An algorithm is a finite set of
instructions that, if followed, accomplishes a
particular task. In addition, all algorithms must
satisfy the following criteria:
1) Input. Zero or more quantities are externally supplied.
2) Output. At least one quantity is produced.
3) Definiteness. Each instruction is clear and
unambiguous.
4) Finiteness. If we trace out the instructions of an
algorithm, then for all cases, the algorithm terminates
after a finite number of steps.
5) Effectiveness. Every instruction must be basic enough
to be carried out, in principle, by a person using only
pencil and paper. It is not enough that each operation
be definite as in criterion 3); it must also be feasible.
Example: Selection Sort
• Suppose we must devise a program that sorts a
collection of n ≥ 1 integers.
C++ Program for Selection
Sort
void sort (int *a, const int n)
// sort the n integers a[0] to a[n-1] into non-decreasing order
{
  for (int i = 0; i < n; i++)
  {
    int j = i;
    // find smallest integer in a[i] to a[n-1]
    for (int k = i + 1; k < n; k++)
      if (a[k] < a[j]) j = k;
    // interchange
    int temp = a[i]; a[i] = a[j]; a[j] = temp;
  }
}
Selection Sort (Cont.)
• Theorem 1.1: sort(a, n) correctly sorts
a set of n ≥ 1 integers; the result
remains in a[0] … a[n-1] such that a[0]
≤ a[1] ≤ … ≤ a[n–1].
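A minimal driver for the sort function above (the array contents here are only illustrative):

#include <cstdio>

void sort(int *a, const int n); // defined above

int main()
{
  int a[] = {26, 5, 77, 1, 61};
  sort(a, 5);
  for (int i = 0; i < 5; i++) printf("%d ", a[i]); // prints: 1 5 26 61 77
  return 0;
}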
Example: Binary Search
• Assume that we have n ≥ 1 distinct integers that are already
sorted and stored in the array a[0] … a[n-1]. Our task is to
determine if the integer x is present and if so to return j such
that x = a[j]; otherwise return -1. By making use of the fact that
the set is sorted, we conceive the following efficient method:
Let left and right, respectively, denote the left and right ends of
the list to be searched. Initially, left = 0 and right = n – 1. Let
middle = (left + right) / 2 be the middle position in the list. If we
compare a[middle] with x, we obtain one of the three results:
(1) x < a[middle]. In this case, if x is present, it must be in the
positions between 0 and middle – 1. Therefore, we set right to
middle – 1.
(2) x == a[middle]. In this case, we return middle.
(3) x > a[middle]. In this case, if x is present, it must be in the
positions between middle+1 and n-1. So, we set left to middle+1.
Algorithm for Binary Search
int BinarySearch (int *a, const int x, const int n)
// Search the sorted array a[0], … , a[n-1] for x
{
for (initialize left and right; while there are more elements;)
{
let middle be the middle element;
switch (compare (x, a[middle])) {
  case '>': set left to middle+1; break;
  case '<': set right to middle-1; break;
  case '=': found x;
} // end of switch
} // end of for
not found;
} // end of BinarySearch
C++ Program for Binary Search
int BinarySearch (int *a, const int x, const int n)
// Search the sorted array a[0], … , a[n-1] for x
{
for (int left = 0, right = n - 1; left <= right;)
{
int middle = (left + right) / 2;
switch (compare (x, a[middle])) {
  case '>': left = middle + 1; break;  // x > a[middle]
  case '<': right = middle - 1; break; // x < a[middle]
  case '=': return middle;             // x == a[middle]
} // end of switch
} // end of for
return -1;
} // end of BinarySearch
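The compare helper is not shown on the slides; a minimal version consistent with how it is called above might look like this (the name and signature are assumptions):

char compare (int x, int y)
{
  if (x > y) return '>';
  if (x < y) return '<';
  return '=';
}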
Recursive Algorithms
int BinarySearch (int *a, const int x, const int left, const int right)
{
if (left <= right)
{
int middle = (left + right) / 2;
if (x < a[middle]) return BinarySearch(a, x, left, middle-1);
else if (x > a[middle]) return BinarySearch(a, x, middle+1, right);
return middle;
} // end if
return -1;
} // end of BinarySearch
Recursive Algorithms(cont.)
Recursive program:
#include <stdio.h>

int rfib(int n);

int main()
{
  int n = 10;
  printf("%d", rfib(n));
  return 0;
}

int rfib(int n)
{
  if (n == 1 || n == 2) return 1;
  return rfib(n-1) + rfib(n-2);
}
Performance Analysis
• Space Complexity: The space
complexity of a program is the
amount of memory it needs to run to
completion.
• Time Complexity: The time
complexity of a program is the
amount of computer time it needs to
run to completion.
Space Complexity
• A fixed part that is independent of the
characteristics of the inputs and outputs. This
part typically includes the instruction space,
space for simple variables and fixed-size
component variables, space for constants, etc.
• A variable part that consists of the space needed
by component variables whose size is dependent
on the particular problem instance being solved,
the space needed by referenced variables, and
the recursion stack space.
• The space requirement S(P) of any program P is written as S(P) = c + SP(instance characteristics), where c is a constant representing the fixed part and SP is the variable part.
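• Example (a sketch; the exact per-call sizes depend on the compiler): for the recursive rsum function shown later, each call stores the parameters a and n plus a return address on the recursion stack, and the recursion depth for input size n is n+1, so Srsum(n) grows linearly in n; the iterative sum has Ssum(n) = 0.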
Time Complexity
• The time, T(P), taken by a program P
is the sum of the compile time and
the run (or execution) time. The
compile time does not depend on the
instance characteristics. We focus on
the run time of a program, denoted
by tp (instance characteristics).
Time Complexity in C++
• General statements in a C++ program (step count):
– Comments 0
– Declarative statements 0
– Expressions and assignment statements 1
– Iteration statements N
– Switch statement N
– If-else statement N
– Function invocation 1 or N
– Memory management statements 1 or N
– Function statements 0
– Jump statements 1 or N
Time Complexity (Cont.)
• Note that a step count does not
necessarily reflect the complexity of
the statement.
• Step per execution (s/e): The s/e of
a statement is the amount by which
count changes as a result of the
execution of that statement.
Step Count of Iterative
Example
float sum (float *a, const int n)
{
float s = 0;
count++; // count is global
for (int i = 0; i < n; i++)
{
count++; // for for
s += a[i];
count++; // for assignment
}
count++; // for last time of for
count++; // for return
return s;
}
Step Count of Iterative
Example (Simplified)
void sum (float *a, const int n)
{
for (int i = 0; i < n; i++)
count += 2;
count +=3;
}
Step Count of Recursive
Example
float rsum (float *a, const int n)
{
count++; // for if conditional
if (n <= 0) {
count++; // for return
return 0;
}
else {
count++; // for return
return (rsum(a, n–1) + a[n-1]);
}
}
Assume trsum(0) = 2
trsum(n) = 2 + trsum(n-1)
         = 2 + 2 + trsum(n-2)
         = 2*2 + trsum(n-2)
         ...
         = 2n + trsum(0)
         = 2n + 2
Matrix Addition Example
void add (matrix a, matrix b, matrix c, int m, int n)
{
  for (int i = 0; i < m; i++)
    for (int j = 0; j < n; j++)
      c[i][j] = a[i][j] + b[i][j];
}
Step Count of Matrix
Addition Example
void add (matrix a, matrix b, matrix c, int m, int n)
{
for (int i = 0; i < m; i++)
{
count++; // for for i
for (int j = 0; j < n; j++)
{
count++; // for for j
c[i][j] = a[i][j] + b[i][j];
count++; // for assignment
}
count++; // for last time of for j
}
count++; // for last time of for i
}
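Each iteration of the outer loop contributes 2n + 2 steps (two per inner-loop iteration, one for the last test of the j loop, and one for the i loop test), and the final count++ for the last test of the i loop brings the total to 2mn + 2m + 1.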
Step Count of Matrix Addition Example (Simplified)
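A sketch of the simplified counting version, consistent with the counts above:

void add (matrix a, matrix b, matrix c, int m, int n)
{
  for (int i = 0; i < m; i++)
  {
    for (int j = 0; j < n; j++)
      count += 2; // inner-loop test plus assignment
    count += 2;   // last test of the j loop plus the i loop test
  }
  count++;        // last test of the i loop
}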
Step Table of Matrix Addition Example
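A reconstruction of the step table, consistent with the counts above:

Statement                        s/e   Frequency   Total steps
for (int i = 0; i < m; i++)      1     m+1         m+1
for (int j = 0; j < n; j++)      1     m(n+1)      mn+m
c[i][j] = a[i][j] + b[i][j];     1     mn          mn
Total                                              2mn+2m+1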
Fibonacci Numbers
• The Fibonacci sequence of numbers starts
as 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, …
Each new term is obtained by taking the
sum of the two previous terms. If we call
the first term of the sequence F0 then F0 =
0, F1 = 1, and in general
Fn = Fn-1 + Fn-2 , n ≥ 2.
C++ Program of Fibonacci
Numbers
1 void fibonacci (int n)
2 // compute the Fibonacci number Fn
3 {
4 if (n <= 1) cout << n << endl; // F0 = 0, and F1 = 1
5 else { // compute Fn
6 int fn; int fnm2 = 0; int fnm1 = 1;
7 for (int i = 2; i <= n; i++)
8 {
9 fn = fnm1 + fnm2;
10 fnm2 = fnm1;
11 fnm1 = fn;
12 } // end of for
13 cout << fn << endl;
14 } // end of else
15 } // end of fibonacci
Step Count of Fibonacci
Program
• Two cases:
– n = 0 or n = 1
• Line 4 regarded as two lines: 4(a) and 4(b), total step
count in this case is 2
– n>1
• Line 4(a), 6, and 13 are each executed once
• Line 7 gets executed n times.
• Lines 8 – 12 get executed n-1 times each.
• Line 6 has s/e of 2 and the remaining lines have an
s/e of 1.
• Total steps for the case n > 1 is 4n + 1
Asymptotic Notation
• Determining step counts help us to compare the
time complexities of two programs and to predict
the growth in run time as the instance
characteristics change.
• But determining exact step counts can be very difficult, and since the notion of a step count is itself inexact, it may not be worth the effort to compute exact step counts.
• Definition [Big “oh”]: f(n) = O(g(n)) iff there exist
positive constants c and n0 such that f(n) ≤ cg(n)
for all n, n ≥ n0
Examples of Asymptotic
Notation
• 3n + 2 = O(n)
3n + 2 ≤ 4n for all n ≥ 3
• 100n + 6 = O(n)
100n + 6 ≤ 101n for all n ≥ 10
• 10n^2 + 4n + 2 = O(n^2)
10n^2 + 4n + 2 ≤ 11n^2 for all n ≥ 5
Asymptotic Notation (Cont.)
Theorem 1.2: If f(n) = a_m n^m + … + a_1 n + a_0, then f(n) = O(n^m).
Proof:
f(n) ≤ Σ_{i=0}^{m} |a_i| n^i
     = n^m Σ_{i=0}^{m} |a_i| n^{i-m}
     ≤ n^m Σ_{i=0}^{m} |a_i|   for n ≥ 1.
Therefore, f(n) = O(n^m).
Asymptotic Notation (Cont.)
• Definition [Omega]: f(n) = Ω(g(n)) iff there exist positive constants c and n0 such that f(n) ≥ cg(n) for all n, n ≥ n0.
• Definition [Theta]: f(n) = Θ(g(n)) iff there exist positive constants c1, c2, and n0 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n, n ≥ n0.
Practical Complexities
• If a program P has complexity Θ(n) and a program Q has complexity Θ(n^2), then, in general, we can assume program P is faster than Q for sufficiently large n.
• However, caution needs to be used on
the assertion of “sufficiently large”.
Function Values

log n   n   n log n   n^2   n^3   2^n
0       1   0         1     1     2
1       2   2         4     8     4
2       4   8         16    64    16
3       8   24        64    512   256
Chap 2: Arrays

ADT
• In C++, a class consists of four components:
– class name
– data members : the data that makes up
the class
– member functions : the set of
operations that may be applied to the
objects of class
– levels of program access : public,
protected and private.
Definition of Rectangle
#ifndef RECTANGLE_H
#define RECTANGLE_H
class Rectangle {
public:
  Rectangle();
  ~Rectangle();
  int GetHeight();
  int GetWidth();
private:
  int xLow, yLow, height, width;
};
#endif
Array
char A[3][4]; // row-major

Logical structure: 3 rows x 4 columns. Physical structure (row-major order):
A[0][0], A[0][1], A[0][2], A[0][3], A[1][0], A[1][1], A[1][2], A[1][3], A[2][0], ..., A[2][3]

Mapping:
A[i][j] = A[0][0] + i*4 + j
The Operations on 1-dim Array
– Store: writes a new value into a given position of the array, e.g., A[3] = 45
– Complexity: O(1)
Polynomial Representations

Representation 1
private:
  int degree; // degree ≤ MaxDegree
  float coef [MaxDegree + 1];

Representation 2
private:
  int degree;
  float *coef;
Polynomial::Polynomial(int d)
{
degree = d;
coef = new float [degree+1];
}
Representation 3
class Polynomial; // forward declaration
class term {
friend Polynomial;
private:
float coef; // coefficient
int exp; // exponent
};
private:
static term termArray[MaxTerms];
static int free;
int Start, Finish;
index   0     1   2   3    4   5
coef    2     1   1   10   3   1
exp     1000  0   4   3    2   0
Polynomial Addition
Store only the nonzero terms (univariate polynomials).
– Add the following two polynomials:
A(x) = 2x^1000 + 1
B(x) = x^4 + 10x^3 + 3x^2 + 1

A_coef: 2, 2, 1          B_coef: 4, 1, 10, 3, 1   (entry 0 holds the number of terms)
C_coef: 5, 2, 1, 10, 3, 2
C_exp:  1000, 4, 3, 2, 0
Polynomial Addition
Polynomial Polynomial::Add(Polynomial B)
// return the sum of A(x) (in *this) and B(x)
{
  Polynomial C; int a = Start; int b = B.Start; C.Start = free; float c;
  while ((a <= Finish) && (b <= B.Finish))
    switch (compare(termArray[a].exp, termArray[b].exp)) {
      case '=':
        c = termArray[a].coef + termArray[b].coef;
        if (c) NewTerm(c, termArray[a].exp);
        a++; b++;
        break;
      case '<':
        NewTerm(termArray[b].coef, termArray[b].exp);
        b++;
        break;
      case '>':
        NewTerm(termArray[a].coef, termArray[a].exp);
        a++;
    } // end of switch and while
  // add in remaining terms of A(x)
  for (; a <= Finish; a++)
    NewTerm(termArray[a].coef, termArray[a].exp);
  // add in remaining terms of B(x)
  for (; b <= B.Finish; b++)
    NewTerm(termArray[b].coef, termArray[b].exp);
  C.Finish = free - 1;
  return C; // overall complexity: O(m+n)
} // end of Add
Program 2.9 Adding a new
Term
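A sketch of NewTerm consistent with how it is called in Add above (the names follow the earlier representation; the error message is an assumption):

void Polynomial::NewTerm (float c, int e)
// Add a new term to the end of termArray
{
  if (free >= MaxTerms) {
    cerr << "Too many terms in polynomials" << endl;
    return;
  }
  termArray[free].coef = c;
  termArray[free].exp = e;
  free++; // free always indexes the next unused slot
}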
Example matrices (a full 5x3 matrix and a sparse 6x6 matrix):

[ -27   3   4 ]      [ 15   0   0  22   0 -15 ]
[   6  82  -2 ]      [  0  11   3   0   0   0 ]
[ 109 -64  11 ]      [  0   0   0  -6   0   0 ]
[  12   8   9 ]      [  0   0   0   0   0   0 ]
[  48  27  47 ]      [ 91   0   0   0   0   0 ]
                     [  0   0  28   0   0   0 ]
Sparse Matrix Representation
class MatrixTerm {
friend class SparseMatrix;
private:
  int row, col, value;
};
In class SparseMatrix:
private:
int Rows, Cols, Terms;
MatrixTerm smArray[MaxTerms];
Transposing A Matrix
• Intuitive way:
for (each row i)
take element (i, j, value) and store it in (j, i, value)
of the transpose
• More efficient way:
for (all elements in column j)
place element (i, j, value) in position (j, i, value)
Transposing A Matrix
• The Operations on 2-dim Array
– Transpose
[ 1   2   3 ]         [ 1  4  7  10 ]
[ 4   5   6 ]    =>   [ 2  5  8  11 ]
[ 7   8   9 ]         [ 3  6  9  12 ]
[ 10  11  12 ]
    (4 x 3)               (3 x 4)
SparseMatrix SparseMatrix::Transpose()
// return the transpose of a (*this)
{
SparseMatrix b;
b.Rows = Cols; // rows in b = columns in a
b.Cols = Rows; // columns in b = rows in a
b.Terms = Terms; // terms in b = terms in a
if (Terms > 0) // nonzero matrix
{
int CurrentB = 0;
for (int c = 0; c < Cols; c++) // transpose by columns
for (int i = 0; i < Terms; i++)
// find elements in column c
if (smArray[i].col == c) {
b.smArray[CurrentB].row = c;
b.smArray[CurrentB].col = smArray[i].row;
b.smArray[CurrentB].value = smArray[i].value;
        CurrentB++;
      }
  } // end of if (Terms > 0) -- the loops take O(terms*columns) time
  return b;
} // end of transpose
Fast Matrix Transpose
• The O(terms*columns) time becomes O(rows*columns^2) when terms is of the order of rows*columns.
• A better transpose function is given in Program 2.11. It first computes how many terms are in each column of matrix a (i.e., each row of b) before transposing to matrix b. Then it determines the starting point of each row of b. Finally it moves each term from a to b.
Program 2.11 Fast Matrix
Transposing
SparseMatrix SparseMatrix::FastTranspose()
// The transpose of a (*this) is found in O(terms + columns) time.
{
  int *RowSize = new int[Cols];
  int *RowStart = new int[Cols];
  SparseMatrix b;
  b.Rows = Cols; b.Cols = Rows; b.Terms = Terms;
  if (Terms > 0) // nonzero matrix
  {
    // compute RowSize[i] = number of terms in row i of b -- O(columns + terms)
    for (int i = 0; i < Cols; i++) RowSize[i] = 0; // initialize
    for (int i = 0; i < Terms; i++) RowSize[smArray[i].col]++;
    // RowStart[i] = starting position of row i of b -- O(columns - 1)
    RowStart[0] = 0;
    for (int i = 1; i < Cols; i++) RowStart[i] = RowStart[i-1] + RowSize[i-1];
    // move the terms from a to b -- O(terms)
    for (int i = 0; i < Terms; i++)
    {
      int j = RowStart[smArray[i].col]++;
      b.smArray[j].row = smArray[i].col;
      b.smArray[j].col = smArray[i].row;
      b.smArray[j].value = smArray[i].value;
    }
  }
  delete [] RowSize;
  delete [] RowStart;
  return b;
} // end of FastTranspose
Matrix Multiplication
• Definition: Given A and B, where A is m x n and B is n x p, the product matrix Result has dimension m x p. Its [i][j] element is

  result_ij = Σ_{k=0}^{n-1} a_ik * b_kj
Two Dimensional Array Row Major Order

(Figure: the array is stored row by row in one contiguous block A[0], A[1], ..., A[11]: row 0 occupies the first u2 elements, row 1 the next u2 elements, ..., up to row u1-1.)
A character string "Hello World" is stored as: H e l l o   W o r l d \0
String Matching: The Knuth-Morris-Pratt Algorithm

• Definition: If p = p0 p1 ... p(n-1) is a pattern, then its failure function, f, is defined as

  f(j) = largest k < j such that p0 p1 ... pk = p(j-k) p(j-k+1) ... pj, if such a k ≥ 0 exists
       = -1 otherwise
• If a partial match is found such that s(i-j) ... s(i-1) = p0 p1 ... p(j-1) and si ≠ pj, then matching may be resumed by comparing si and p(f(j-1)+1) if j ≠ 0. If j = 0, then we may continue by comparing s(i+1) and p0.
Fast Matching Example
Suppose there exist a string s and a pattern pat = 'abcabcacab'; let's try to match pattern pat in string s.

j     0   1   2   3   4   5   6   7   8   9
pat   a   b   c   a   b   c   a   c   a   b
f    -1  -1  -1   0   1   2   3  -1   0   1

s   = '- a b c a ? . . .'     (mismatch at j = 4)
pat =  'a b c a b c a c a b'  (matching resumes by comparing the failed character with p(f(3)+1) = p1)
float farray[100];
int intarray[200];
...
sort(farray, 100);
sort(intarray, 200);
Template (Cont.)
• Can we use the sort template for the
Rectangle class?
• Well, not directly. We'll need to use operator overloading to implement ">" for the Rectangle class.
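For instance, a sketch of such an overload (comparing by area is an illustrative assumption; any consistent ordering works):

bool operator>(Rectangle& a, Rectangle& b)
{
  // compare rectangles by area (an illustrative choice of key)
  return a.GetHeight() * a.GetWidth() > b.GetHeight() * b.GetWidth();
}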
Stack
• What is a stack? A stack is an
ordered list in which insertions and
deletions are made at one end called
the top. It is also called a Last-In-
First-Out (LIFO) list.
Stack (Cont.)
• Given a stack S = (a0, …, an-1), a0 is the bottom
element, an-1 is the top element, and ai is on top of
element ai-1, 0 < i < n.
(Figure: pushing and popping element a3 on a stack holding a0..a3; the system stack similarly holds local variables and return addresses for nested calls.)

Array representation of a stack:

a0   a1   a2   ...   a(n-1)      <- a(n-1) is the top
 0    1    2   ...    n-1        (array indices)
Queue
• A queue is an ordered list in which all
insertions take place at one end and
all deletions take place at the
opposite end. It is also known as
First-In-First-Out (FIFO) lists.
a0 a1 a2 an-1
front rear
ADT 3.2 Abstract Data Type Queue
template <class KeyType>
class Queue
{
// objects: A finite ordered list with zero or more elements
public:
Queue(int MaxQueueSize = DefaultSize);
// Create an empty queue whose maximum size is MaxQueueSize
Boolean IsFull();
// if number of elements in the queue is equal to the maximum size of
// the queue, return TRUE(1); otherwise, return FALSE(0)
void Add(const KeyType& item);
// if IsFull(), then QueueFull(); else insert item at rear of the queue
Boolean IsEmpty();
// if number of elements in the queue is equal to 0, return TRUE(1)
// else return FALSE(0)
KeyType* Delete(KeyType&);
// if IsEmpty(), then QueueEmpty() and return 0;
// else remove the item at the front of the queue and return a pointer to it
};
Queue Manipulation Issue
• It’s intuitive to use array for
implementing a queue. However,
queue manipulations (add and/or
delete) will require elements in the
array to move. In the worse case, the
complexity is of O(MaxSize).
Shifting Elements in Queue

(Figure: after deletions at the front, the remaining elements must be shifted toward the front of the array, moving front and rear.)
Circular Queue
• To resolve the issue of moving elements in
the queue, circular queue assigns next
element to q[0] when rear == MaxSize – 1.
• Pointer front will always point one position
counterclockwise from the first element in
the queue.
• Queue is empty when front == rear. But it
is also true when queue is full. This will be
a problem.
Circular Queue (Cont.)

(Figure: a circular queue over positions 0..n-1; jobs J1..J4 occupy consecutive slots, and insertion wraps around from position n-1 back to position 0.)
Stack::~Stack() { }
// Bag's destructor is called automatically when a Stack is destroyed; this ensures that array is deleted.
int* Stack::Delete(int& x)
{ if (IsEmpty()) {Empty(); return 0; }
x = array[top--];
return &x;
}
Class Inheritance Example
Bag b(3); // uses Bag constructor to create array of size 3
Stack s(3); // uses Stack constructor to create array of size 3
int x;
b.Delete(x); // uses Bag::Delete, which calls Bag::IsEmpty and Bag::Empty
s.Delete(x);
// uses Stack::Delete, which calls Bag::IsEmpty and Bag::Empty because these
// have not been redefined in Stack.
The Maze Problem
[ ]
Entrance 0 1 0 0 0 1 1 0 0 0 1 1 1 1 1
1 0 0 0 1 1 0 1 1 1 0 0 1 1 1
0 1 1 0 0 0 0 1 1 1 1 0 0 1 1
1 1 0 1 1 1 1 0 1 1 0 1 1 0 0
1 1 0 1 0 0 1 0 1 1 1 1 1 1 1
0 0 1 1 0 1 1 1 0 1 0 0 1 0 1
0 0 1 1 0 1 1 1 0 1 0 0 1 0 1
0 1 1 1 1 0 0 1 1 1 1 1 1 1 1
0 0 1 1 0 1 1 0 1 1 1 1 1 0 1
1 1 0 0 0 1 1 0 1 1 0 0 0 0 0
0 0 1 1 1 1 1 0 0 0 1 1 1 1 0
0 1 0 0 1 1 1 1 1 0 1 1 1 1 0
Exit
Allowable Moves

From position [i][j], eight moves are possible, one for each compass direction:
NW [i-1][j-1], N [i-1][j], NE [i-1][j+1], W [i][j-1], E [i][j+1], SW [i+1][j-1], S [i+1][j], SE [i+1][j+1].

Operator Priorities (operators of the same priority associate left to right):
2: *, /, %
3: +, -
4: <, <=, >=, >
5: ==, !=
6: &&
7: ||
Postfix Notation
Expressions are converted into Postfix notation before
compiler can accept and process them.
X = A/B – C + D * E – A * C
Infix A/B–C+D*E–A*C
Postfix => AB/C-DE*+AC*-
Operation Postfix
T1 = A / B T1C-DE*+AC*-
T2 = T1 - C T2 DE*+AC*-
T3 = D * E T2T3+AC*-
T4 = T2 + T3 T4AC*-
T5 = A * C T4 T5 -
T6 = T4 - T5 T6
Postfix Notation Generation

X = A + B * C / D - E * F        Postfix: ABC*D/+EF*-

(Figure: scanning left to right, the output grows A, AB, ABC, ABC*, ABC*D, ABC*D/, ..., while operators +, *, / are pushed on the stack and popped according to their priorities.)

X = A + (B + C * D) - E          Postfix: ABCD*++E-

(Figure: '(' is pushed and shields the operators inside the parentheses until ')' is reached, at which point * and the inner + are popped.)
Postfix Notation Execution

Postfix: ABC*D/+EF*-

(Figure: operand-stack trace. A, B, C are pushed; * replaces B, C by B*C; / replaces B*C, D by B*C/D; + leaves A+B*C/D; E and F are pushed; * leaves E*F; and - produces the final value A+B*C/D-E*F.)
Multiple Stacks and Queues

(Figure: two stacks can share one array by growing toward each other from the two ends.)
• For more than two stacks, divide the array of size m into n segments, with segment boundaries at positions ⌊m/n⌋-1, 2⌊m/n⌋-1, ..., m-1.
Linked Lists
Review of Sequential Representations
• Previously introduced data structures, including arrays, queues, and stacks, all have the property that successive nodes of a data object are stored a fixed distance apart.
• The drawback of sequential mapping for ordered
lists is that operations such as insertion and
deletion become expensive.
• Sequential representation also tends to be less space-efficient when handling several ordered lists of varying sizes.
Linked List
• A better solution to the aforementioned issues of sequential representations is the linked list.
• Elements in a linked list are not stored sequentially in memory. Instead, they may be scattered anywhere; the list is formed by recording in each element the address of the next element, which links the list together.
• A linked list has a head pointer that points to the
first element of the list.
• By following the links, you can traverse the linked
list and visit each element in the list one by one.
Linked List Insertion
• To insert an element into the three
letter linked list:
– Get a node that is currently unused; let
its address be x.
– Set the data field of this node to GAT.
– Set the link field of x to point to the
node after FAT, which contains HAT.
– Set the link field of the node containing FAT to x.
Linked List Insertion And Deletion
(Figure: the new node containing GAT is linked in between the nodes containing FAT and HAT.)
class ThreeLetterList {
public:
// List Manipulation operations
.
.
private:
ThreeLetterNode *first;
};
Nested Classes
• The Three Letter List problem can also use nested classes to
represent its structure.
class ThreeLetterList {
public:
// List Manipulation operations
.
.
private:
class ThreeLetterNode { // nested class
public:
char data[3];
ThreeLetterNode *link;
};
ThreeLetterNode *first;
};
Pointer Manipulation in C++
• Addition of integers to pointer variable is
permitted in C++ but sometimes it has no logical
meaning.
• Two pointer variables of the same type can be
compared.
– x == y, x != y, x == 0
(Figure: if x and y point to nodes a and b, the assignment x = y makes x point to b as well, while *x = *y copies the contents of node b into node a.)
Define A Linked List
Template
• A linked list is a container class, so its
implementation is a good template candidate.
• Member functions of a linked list should be general enough to apply to all types of objects.
• When some operations are missing in the original
linked list definition, users should not be forced
to add these into the original class design.
• Users should be shielded from the detailed
implementation of a linked list while be able to
traverse the linked list.
Solution => Use of ListIterator
List Iterator
• A list iterator is an object that is used to traverse all
the elements of a container class.
• ListIterator<Type> is declared as a friend of both List<Type> and ListNode<Type>.
• A ListIterator<Type> object is initialized with the
name of a List<Type> object l with which it will be
associated.
• The ListIterator<Type> object contains a private data
member current of type ListNode<Type> *. At all
times, current points to a node of list l.
• The ListIterator<Type> object defines public member
functions NotNull(), NextNotNull(), First(), and Next()
to perform various tests on and to retrieve elements
of l.
Template of Linked Lists
enum Boolean { FALSE, TRUE };
template <class Type> class List;
template <class Type> class ListIterator;
(Figure: a chain with pointers first and last.)
Linked Stacks and Queues

(Figure: a linked stack with pointer top, and a linked queue with pointers front and rear; the link of the last node is 0.)
Revisit Polynomials
a.first -> (3, 14) -> (2, 8) -> (1, 0)        a = 3x^14 + 2x^8 + 1
b.first -> (8, 14) -> (-3, 10) -> (10, 6)     b = 8x^14 - 3x^10 + 10x^6
Polynomial Class Definition
struct Term
// all members of Terms are public by default
{
int coef; // coefficient
int exp; // exponent
void Init(int c, int e) {coef = c; exp = e;};
};
class Polynomial
{
friend Polynomial operator+(const Polynomial&, const Polynomial&);
private:
List<Term> poly;
};
Operating On Polynomials
• With linked lists, it is much easier to
perform operations on polynomials
such as adding and deleting.
– E.g., adding two polynomials a and b
a.first -> (3, 14) -> (2, 8) -> (1, 0)
b.first -> (8, 14) -> (-3, 10) -> (10, 6)

(i) p->exp == q->exp: 3x^14 and 8x^14 combine, giving c.first -> (11, 14)
(ii) p->exp < q->exp: b's term -3x^10 is copied, giving c.first -> (11, 14) -> (-3, 10)
(iii) p->exp > q->exp: a's term 2x^8 is copied, giving c.first -> (11, 14) -> (-3, 10) -> (2, 8)

(Figure: nodes that are no longer needed are returned to the available-space list av.)
Equivalence Class
• For any polygon x, x ≡ x. Thus, ≡ is
reflexive.
• For any two polygons x and y, if x ≡ y,
then y ≡ x. Thus, the relation ≡ is symmetric.
• For any three polygons x, y, and z, if
x ≡ y and y ≡ z, then x ≡ z. The
relation ≡ is transitive.
Equivalence
Definition: A relation ≡ over a set S, is said
to be an equivalence relation over S iff it
is symmetric, reflexive, and transitive over
S.
Example: Suppose the 12 polygons are related by 0 ≡ 4, 3 ≡ 1, 6 ≡ 10, 8 ≡ 9, 7 ≡ 4, 6 ≡ 8, 3 ≡ 5, 2 ≡ 11, and 11 ≡ 0. Then they are partitioned into
three equivalence classes:
{0, 2, 4, 7, 11}; {1 , 3, 5}; {6, 8, 9 , 10}
Equivalence (Cont.)
• Two phases to determine equivalence
– In the first phase the equivalence pairs
(i, j) are read in and stored.
– In phase two, we begin at 0 and find all
pairs of the form (0, j). Continue until
the entire equivalence class containing 0
has been found, marked, and printed.
• Next find another object not yet
output, and repeat the above process.
Equivalence Classes (Cont.)
• If a Boolean array pairs[n][n] is used
to hold the input pairs, then it might
waste a lot of space and its
initialization requires complexity
Θ(n2) .
• The use of linked lists is more efficient in memory usage and has lower complexity, Θ(m+n).
Linked List Representation
(Figure: one list per object; list [i] holds the objects directly equivalenced to i.)
[0]: 11 -> 4    [1]: 3         [2]: 11        [3]: 5 -> 1
[4]: 7 -> 0     [5]: 3         [6]: 8 -> 10   [7]: 4
[8]: 6 -> 9     [9]: 8         [10]: 6        [11]: 0 -> 2
Linked List for Sparse
Matrix
• Sequential representation of sparse matrix suffered from
the same inadequacies as the similar representation of
Polynomial.
• Circular linked list representation of a sparse matrix has
two types of nodes:
– head node: tag, down, right, and next
– entry node: tag, down, row, col, right, value
• Head node i is the head node for both row i and column i.
• Each head node belongs to three lists: a row list, a column list, and a head node list.
• For an n x m sparse matrix with r nonzero terms, the
number of nodes needed is max{n, m} + r + 1.
Node Structure for Sparse Matrices
[  0   0  11   0 ]
[ 12   0   0   0 ]      A 4x4 sparse matrix
[  0  -4   0   0 ]
[  0   0   0 -15 ]
Linked Representation of A Sparse
Matrix
(Figure: the head node list H0..H3 hangs off the matrix head node (4 rows, 4 cols); the entry nodes (0, 2, 11), (1, 0, 12), (2, 1, -4), and (3, 3, -15) are each linked into both their row list and their column list.)
Reading In A Sparse Matrix
• Assume the first line consists of the
number of rows, the number of columns,
and the number of nonzero terms. Then
followed by num-terms lines of input, each
of which is of the form: row, column, and
value.
• Initially, the next field of head node i is used to keep track of the last node in column i. After all nodes have been read in, the head nodes are linked together through their next fields.
Complexity Analysis
• Input complexity: O(max{n, m} + r) = O(n +
m + r)
• Complexity of ~Matrix(): since each node is in only one row list, it suffices to return all the row lists of the matrix. Each row list is circularly linked, so it can be erased in a constant amount of time. The complexity is O(m+n).
Doubly Linked Lists
• The problem with a singly linked list is that if we want to find the node that precedes a node ptr, we have to start at the beginning of the list and search until we find the node whose link field contains ptr.
• To efficiently delete a node, we need to know its
preceding node. Therefore, doubly linked list is
useful.
• A node in a doubly linked list has at least three
fields: left link field (llink), a data field (item),
and a right link field (rlink).
Doubly Linked List
• A head node is also used in a doubly
linked list to allow us to implement
our operations more easily.
(Figure: deleting node x from a doubly linked circular list relinks x->llink->rlink and x->rlink->llink; inserting newnode after p sets its llink to p and its rlink to p->rlink.)
Representation of P(x, y, z)
P(x, y, z) = ((x^10 + 2x^8)y^3 + 3x^8y^2)z^2 + ((x^4 + 6x^3)y^4 + 2y)z

(Figure: the polynomial as a generalized list: the top-level list is in z; each z-coefficient is a list in y; each y-coefficient is a list in x.)
Recursive Algorithms For
Lists
• A recursive algorithm consists of two
components:
– The recursive function (the workhorse);
declared as a private function
– A second function that invokes the
recursive function at the top level (the
driver); declared as a public function.
Program 4.6 Copying A List
// Driver
void GenList::Copy(const GenList& l)
{
first = Copy(l.first);
}
// Workhorse
GenListNode* GenList::Copy(GenListNode *p)
// Copy the nonrecursive list with no shared sublists pointed at by p
{
GenListNode *q = 0;
if (p) {
q = new GenListNode;
q->tag = p->tag;
if (!p->tag) q->data = p->data;
else q->dlink = Copy(p->dlink);
q->link = Copy(p->link);
}
return q;
}
Linked Representation for A = ((a, b), ((c, d), e))

(Figure: list A has two top-level nodes; the first points down to the sublist (a, b), the second points down to ((c, d), e), whose first node in turn points down to (c, d). Atom nodes have tag f, sublist nodes tag t.)

Generalized List Representation Example

(Figure:)
B = (A, A, ())   : two nodes pointing to A and one pointing to the empty list
C = (a, C)       : an atom node for a followed by a node pointing back to C itself
Recursion Trace of GenList::Copy

(Figure: a table showing the level of recursion and the value of p (nodes b, r, s, t, u, v, w, x) as Copy descends into the sublists and returns.)
Important List Functions

depth(s) = 0                                        if s is an atom
depth(s) = 1 + max{depth(x1), ..., depth(xn)}       if s is the list (x1, ..., xn), n ≥ 1
Reference Counts, Shared and Recursive
Lists
• Lists may be shared by other lists for the
purpose of space saving.
• Lists that are shared by other lists create
problems when performing add or delete
functions. For example, in the previous A, B, C, D example, deleting the front node of list A would require list B to update its
• The use of the data field of a head node to
record the reference count can resolve the
aforementioned problem. The list cannot be deleted unless its reference count is 0.
Example of Reference Counts, Shared and Recursive Lists

(Figure:)
X = ()            : a head node with reference count 1
A = (a, (b, c))   : A's head node has count 3 (one external pointer plus two references from B); the sublist (b, c) has its own head node with count 1
B = (A, A, ())    : the empty list's head node has count 1
C = (a, C)        : recursive; C's head node has count 2
Erasing A List Recursively
// Driver
GenList::~GenList()
// Each head node has a reference count. We assume first ≠ 0.
{
Delete(first);
first = 0;
}
// Workhorse
void GenList::Delete(GenListNode* x)
{
x->ref--; // decrement reference count of head node.
if (!x->ref)
{
GenListNode *y = x; // y traverses top-level of x.
while (y->link) { y= y->link; if (y->tag == 1) Delete (y->dlink);}
y->link = av; // Attach top-level nodes to av list
av = x;
}
}
Issue In Erasing Recursive
Lists
• When erasing a recursive list (either directly or indirectly recursive), the reference count does not become 0, so the nodes are never returned to the available list. This causes a memory leak.
Chap 5: Trees
Trees
• Definition: A tree is a finite set of
one or more nodes such that:
– There is a specially designated node
called the root.
– The remaining nodes are partitioned into
n ≥ 0 disjoint sets T1, …, Tn, where each
of these sets is a tree. We call T1, …, Tn
the subtrees of the root.
Pedigree Genealogical Chart

(Figure: Cheryl's ancestors, starting with her parents Kevin and Rosemary; a pedigree chart is a binary tree.)

Lineal Genealogical Chart

(Figure: the Proto Indo-European language family, with branches such as Osco, Umbrian, Spanish, French, Italian, Icelandic, Norwegian, Swedish, Low and High German, and Yiddish; a lineal chart is a tree of arbitrary degree.)
Tree Terminology
• Normally we draw a tree with the root at the top.
• The degree of a node is the number of subtrees
of the node.
• The degree of a tree is the maximum degree of
the nodes in the tree.
• A node with degree zero is a leaf or terminal
node.
• A node that has subtrees is the parent of the
roots of the subtrees, and the roots of the
subtrees are the children of the node.
• Children of the same parents are called siblings.
Tree Terminology (Cont.)
• The ancestors of a node are all the nodes
along the path from the root to the node.
• The descendants of a node are all the
nodes that are in its subtrees.
• Assume the root is at level 1, then the
level of a node is the level of the node’s
parent plus one.
• The height or the depth of a tree is the
maximum level of any node in the tree.
A Sample Tree

Level 1:  A
Level 2:  B  C  D
Level 3:  E  F  G  H  I  J
Level 4:  K  L  M
List Representation of Trees
The tree in previous slide could be written as
(A (B (E (K, L), F), C(G), D(H (M), I, J)))
(Figure: the corresponding list representation drawn with linked nodes.)
Possible Node Structure For A Tree of
Degree
• Lemma 5.1: If T is a k-ary tree (i.e., a tree of degree k) with n nodes, each having a fixed size as in Figure 5.4, then n(k-1) + 1 of the nk child fields are 0, n ≥ 1.

Wasting memory!
Representation of Trees
• Left Child-Right Sibling Representation
– Each node has two links (or pointers).
– Each node only has one leftmost child and one
closest sibling.
(Figure: the sample tree redrawn with each node holding a data field, a leftmost-child link, and a closest-right-sibling link.)
Degree Two Tree Representation
(Figure: rotating the left child-right sibling representation 45 degrees clockwise yields a degree-two tree.)

Binary Tree!
Tree Representations
(Figure: three small trees shown side by side in tree form, left child-right sibling form, and binary tree form.)
Binary Tree
• Definition: A binary tree is a finite set of
nodes that is either empty or consists of a
root and two disjoint binary trees called
the left subtree and the right subtree.
• There is no tree with zero nodes. But
there is an empty binary tree.
• Binary tree distinguishes between the
order of the children while in a tree we do
not.
Binary Tree Examples

(Figure: a skewed binary tree (A, B, C, D, E all linked through left children) and a complete binary tree with nodes A..I.)
The Properties of Binary
Trees
• Lemma 5.2 [Maximum number of nodes]:
1) The maximum number of nodes on level i of a binary tree is 2^(i-1), i ≥ 1.
2) The maximum number of nodes in a binary tree of depth k is 2^k - 1, k ≥ 1.
• Lemma 5.3 [Relation between number of leaf nodes and nodes of degree 2]: For any non-empty binary tree, T, if n0 is the number of leaf nodes and n2 the number of nodes of degree 2, then n0 = n2 + 1.
• Definition: A full binary tree of depth k is a binary tree of depth k having 2^k - 1 nodes, k ≥ 0.
Binary Tree Definition
• Definition: A binary tree with n nodes and
depth k is complete iff its nodes correspond
to the nodes numbered from 1 to n in the full
binary tree of depth k.
level 1:   1
level 2:   2   3
level 3:   4   5   6   7
level 4:   8   9  10  11  12  13  14  15
Array Representation of A Binary
Tree
• Lemma 5.4: If a complete binary tree with
n nodes is represented sequentially, then
for any node with index i, 1 ≤ i ≤ n, we have:
– parent(i) is at ⌊ i/2 ⌋ if i ≠1. If i = 1, i is at the
root and has no parent.
– left_child(i) is at 2i if 2i ≤ n. If 2i > n, then i has no
left child.
– right_child(i) is at 2i + 1 if 2i + 1 ≤ n. If 2i + 1 > n,
then i has no right child.
• Position zero of the array is not used.
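Lemma 5.4 translates directly into constant-time index computations; a minimal sketch (returning 0 for a missing child is an illustrative convention):

// index arithmetic for a complete binary tree stored in a 1-indexed array of n nodes
inline int Parent(int i)            { return i / 2; }                       // valid for i > 1
inline int LeftChild(int i, int n)  { return 2*i <= n ? 2*i : 0; }          // 0: no left child
inline int RightChild(int i, int n) { return 2*i + 1 <= n ? 2*i + 1 : 0; }  // 0: no right child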
Proof of Lemma 5.4 (2)
Assume that for all j, 1 ≤ j ≤ i,
left_child(j) is at 2j. Then two nodes
immediately preceding left_child(i +
1) are the right and left children of i.
The left child is at 2i. Hence, the left
child of i + 1 is at 2i + 2 = 2(i + 1)
unless 2(i + 1) > n, in which case i +
1 has no left child.
Array Representation of Binary
Trees
Complete binary tree:             Skewed tree:
[1] A   [2] B   [3] C             [1] A    [2] B    [4] C
[4] D   [5] E   [6] F             [8] D    [16] E
[7] G   [8] H   [9] I             (all other slots are unused)
Linked Representation
class Tree;
class TreeNode {
friend class Tree;
private:
TreeNode *LeftChild;
char data;
TreeNode *RightChild;
};
class Tree {
public:
// Tree operations
.
private:
TreeNode *root;
};
Node Representation

(Figure: each node holds LeftChild | data | RightChild. The skewed tree is stored as A -> B -> C -> D -> E through left links, with 0 right links; the complete tree links A to B and C, B to D and E, C to F and G, D to H, and E to I, with 0 links for absent children.)
Tree Traversal
• When visiting each node of a tree exactly
once, this produces a linear order for the
node of a tree.
• There are 3 traversals if we adopt the
convention that we traverse left before
right: LVR (inorder), LRV (postorder), and
VLR (preorder).
• When implementing the traversal, a
recursion is perfect for the task.
Binary Tree With Arithmetic Expression

(Figure: the expression tree for A/B*C*D+E: + is the root with right child E; its left subtree is * with right child D; below that, * with right child C; and at the bottom, / with children A and B.)
Tree Traversal
• Inorder Traversal: A/B*C*D+E
=> Infix form
• Preorder Traversal: +**/ABCDE
=> Prefix form
• Postorder Traversal: AB/C*D*E+
=> Postfix form
Inorder traversal
template <class T>
void Tree <T>::Inorder()
{//Driver calls workhorse for traversal of entire tree.
Inorder(root);
}
(Level-order traversal of the expression tree visits: + * E * D / C A B)
Traversal Without A Stack
• Use of parent field to each node.
• Use of two bits per node to
represents binary trees as threaded
binary trees.
Some Other Binary Tree Functions
• With the inorder, postorder, or
preorder mechanisms, we can
implement all needed binary tree
functions. E.g.,
– Copying Binary Trees
– Testing Equality
• Two binary trees are equal if their
topologies are the same and the information
in corresponding nodes is identical.
The Satisfiability
Problem
• Expression Rules
– A variable is an expression
– If x and y are expressions, then x ∧ y, x ∨ y, and ¬x are expressions.
– Parentheses can be used to alter the normal order of evaluation, which is ¬ before ∧ before ∨.
Propositional Formula In A Binary Tree

(Figure: the formula (x1 ∧ ¬x2) ∨ (¬x1 ∧ x3) ∨ ¬x3 drawn as a binary tree with ∨ at the root.)

Evaluating all 2^n truth-value combinations takes O(g·2^n) time, where g is the time to evaluate one combination.
Perform Formula Evaluation
• To evaluate an expression, we can
traverse its tree in postorder.
• To perform evaluation, we assume
each node has four fields
– LeftChild
– data
– value
– RightChild

(Node layout: LeftChild | data | value | RightChild)
First Version of Satisfiability Algorithm
For all 2n possible truth value combinations for the n
variables
{
generate the next combination;
replace the variables by their values;
evaluate the formula by traversing the tree it points
to in postorder;
if (formula.rootvalue()) {cout << combination; return;}
}
cout << "no satisfiable combination";
Evaluating A Formula
void SatTree::PostOrderEval() // Driver
{
PostOrderEval(root);
}
(Figure: a binary tree with root A, children B and C, grandchildren D, E, F, G, and leaves H, I under D.)
Inorder sequence: H, D, I, B, E, A, F, C, G
Threads
• To distinguish between normal
pointers and threads, two boolean
fields, LeftThread and RightThread,
are added to the record in memory
representation.
– t->LeftThread == TRUE
  => t->LeftChild is a thread
– t->LeftThread == FALSE
  => t->LeftChild is a pointer to the left child.
Threads (Cont.)
• To avoid dangling threads, a head
node is used in representing a binary
tree.
• The original tree becomes the left
subtree of the head node.
• Empty binary tree: the head node's LeftThread is TRUE with its left link pointing back to itself, and its RightThread is FALSE with its right link also pointing to itself.
Memory Representation of Threaded Tree of Figure 5.20

(Figure: each node is drawn as LeftThread | LeftChild | data | RightChild | RightThread; internal nodes such as A, B, D carry f (FALSE) flags, leaves such as H, I, E carry t (TRUE) thread flags, and all threads lead through the head node.)
Finding the inorder successor
without stack
• By using the threads, we can perform an
inorder traversal without making use of a
stack.
T* ThreadedInorderIterator::Next()
{ // Return the inorder successor of currentNode
  ThreadedNode<T> *temp = currentNode->rightChild;
  if (!currentNode->rightThread)
    while (!temp->leftThread) temp = temp->leftChild;
  currentNode = temp;
  if (currentNode == root) return 0;
  else return &currentNode->data;
}
Inserting A Node to AThreaded Binary
Tree
• Inserting a node r as the right child of a
node s.
– If s has an empty right subtree, then the insertion is simple and diagrammed in Figure 5.23(a).
– If the right subtree of s is not empty, then this right subtree is made the right subtree of r after insertion. When this is done, r becomes the inorder predecessor of a node that has a LeftThread == TRUE field, and consequently there is a thread which has to be updated to point to r. The node containing this thread was previously the inorder successor of s. Figure 5.23(b) illustrates the insertion for this case.
Insertion of r As A Right Child of s in A
Threaded Binary Tree
s s
r r
before after
Insertion of r As A Right Child of s in A
Threaded Binary Tree (Cont.)
s s
r r r
before after
Program 5.16 Inserting r As The Right
Child of s
void ThreadedTree::InsertRight(ThreadedNode *s, ThreadedNode *r)
// Insert r as the right child of s
{
  r->RightChild = s->RightChild;
  r->RightThread = s->RightThread;
  r->LeftChild = s;
  r->LeftThread = TRUE; // LeftChild is a thread
  s->RightChild = r; // attach r to s
  s->RightThread = FALSE;
  if (!r->RightThread) {
    ThreadedNode *temp = InorderSucc(r); // returns the inorder successor of r
    temp->LeftChild = r;
  }
}
Priority Queues
• In a priority queue, the element to be
deleted is the one with highest (or lowest)
priority.
• An element with arbitrary priority can be
inserted into the queue according to its
priority.
• A data structure supports the above two
operations is called max (min) priority
queue.
Examples of Priority Queues
• Consider a server that serves multiple users. Each user may request a different amount of server time. A priority queue is used to always select the request with the smallest time. Any new user's request is put into the priority queue. This is a min priority queue.
• If each user needs the same amount of time but is willing to pay more money to obtain the service quicker, then this is a max priority queue.
Priority Queue
Representation
• Unordered Linear List
– Addition complexity: O(1)
– Deletion complexity: O(n)
• Chain
– Addition complexity: O(1)
– Deletion complexity: O(n)
• Ordered List
– Addition complexity: O(n)
– Deletion complexity: O(1)
Max (Min) Heap
• Heaps are frequently used to implement
priority queues. The complexity is O(log n).
• Definition: A max (min) tree is a tree in
which the key value in each node is no
smaller (larger) than the key values in its
children (if any). A max heap is a complete
binary tree that is also a max tree. A min
heap is a complete binary tree that is also
a min tree.
Max Heap Examples

(Figure: two max heaps: one with root 14, children 12 and 7, and leaves 10, 8, 6; another with root 9, children 6 and 3, and leaf 5.)
Insertion Into A Max Heap

(Figures: inserting into the max heap with root 20, children 15 and 2, and leaves 14 and 10:
(1) inserting 1 simply places it in the next free leaf position;
(2) inserting 5 places it in the next free position, then bubbles it up above the 2;
(3) inserting 21 bubbles it up past 2 and 20, so 21 becomes the new root.)
Deletion From A Max Heap
template <class T>
void MaxHeap<T>::Pop()
{ // Delete from the max heap
  if (IsEmpty()) throw "Heap is empty.";
  heap[1].~T(); // delete max element
  // remove last element from heap
  T lastE = heap[heapSize--];
  // trickle down: find a place for lastE starting at the root
  int currentNode = 1;
  int child = 2;
  while (child <= heapSize)
  {
    // set child to the larger child of currentNode
    if (child < heapSize && heap[child] < heap[child+1]) child++;
    if (lastE >= heap[child]) break;
    heap[currentNode] = heap[child]; // move the larger child up
    currentNode = child; child *= 2;
  }
  heap[currentNode] = lastE;
}
Deletion From A Max Heap (Cont.)

(Figures: deleting the max 21 removes the root; the last element 2 is trickled down from the root, the larger child 20 moves up, and the resulting heap has root 20, children 15 and 2, and leaves 14 and 10.)
Binary Search Tree
• Definition: A binary search tree is a binary
tree. It may be empty. If it is not empty
then it satisfies the following properties:
– Every element has a key and no two elements
have the same key (i.e., the keys are distinct)
– The keys (if any) in the left subtree are
smaller than the key in the root.
– The keys (if any) in the right subtree are
larger than the key in the root.
– The left and right subtrees are also binary
search trees.
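The definition above gives an O(h) search procedure directly; a minimal sketch whose names follow the Insert code on a later slide:

template <class Type>
BstNode<Type>* BST<Type>::Search(const Element<Type>& x)
// search the binary search tree for an element with key x.key; a sketch
{
  BstNode<Type> *p = root;
  while (p) {
    if (x.key == p->data.key) return p;        // found
    if (x.key < p->data.key) p = p->LeftChild; // keys in the left subtree are smaller
    else p = p->RightChild;                    // keys in the right subtree are larger
  }
  return 0; // not found -- O(h), h = height of the tree
}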
Binary Trees
(Figure: example trees on keys 30, 5, 40, 2, ... and 60, 20, 70, 65, 80, ..., illustrating which trees satisfy the binary search tree property.)
Insertion Into A Binary Search Tree
template <class Type>
Boolean BST<Type>::Insert(const Element<Type>& x)
// insert x into the binary search tree
{
// Search for x.key, q is the parent of p
BstNode<Type> *p = root; BstNode<Type> *q = 0;
while(p) {
q = p;
if (x.key == p->data.key) return FALSE; // x.key is already in tree
if (x.key < p->data.key) p = p->LeftChild;
else p = p->RightChild;
} // the search loop is O(h)
// Perform insertion
p = new BstNode<Type>;
p->LeftChild = p->RightChild = 0; p->data = x;
if (!root) root = p;
else if (x.key < q->data.key) q->LeftChild = p;
else q->RightChild = p;
return TRUE;
}
Deletion From A Binary Search Tree
• Delete a leaf node
– A leaf node which is a right child of its parent
– A leaf node which is a left child of its parent
• Delete a non-leaf node
– A node that has one child
– A node that has two children
• Replaced by the largest element in its left subtree, or
• Replaced by the smallest element in its right subtree
• Again, the delete function has complexity of
O(h)
Deleting From A Binary Search Tree

(Figures: deleting the leaf 35 from the tree rooted at 30 simply drops the node; deleting a node with one child splices the child into its place; deleting a node with two children, e.g. the root 30, replaces it with the largest element of its left subtree (5) or the smallest element of its right subtree.)
Joining and Splitting Binary Trees
• C.ThreeWayJoin(A, x, B): Creates a binary
search tree C that consists of binary
search tree A, B, and element x.
• C.TwoWayJoin(A, B): Joins two binary
search trees A and B to obtain a single
binary search tree C.
• A.Split(i, B, x, C): Binary search tree A splits into three parts: B, a binary search tree that contains all elements of A that have key less than i; if A contains a key i, then this element is copied into x and a pointer to x is returned; and C, a binary search tree that contains all records of A that have key larger than i.
ThreeWayJoin(A, x, B)

(Figure: A (rooted at 30: 5, 40, 2, 35, 80) and B (rooted at 90: 85, 94, 84, 92) are joined with x = 81 as the new root.)

TwoWayJoin(A, B)

(Figure: the same A and B joined without a middle element: the largest key in A, 80, is removed from A and becomes the new root.)

A.Split(i, B, x, C)

(Figures: splitting A at i = 30 returns x = 30, with B holding the keys smaller than 30 and C the keys larger than 30; splitting at i = 80 walks down the tree, appending the small halves to B and the large halves to C as it goes.)
Selection Trees
• When merging k ordered sequences (assume non-decreasing order) into a single sequence, the most intuitive way is probably to perform k - 1 comparisons each time to select the smallest among the first elements of the k sequences. This goes on until every number in every sequence has been visited.
• There should be a better way to do this.
Winner Tree
• A winner tree is a complete binary tree in
which each node represents the smaller of
its two children. Thus the root represents
the smallest node in the tree.
• Each leaf node represents the first record
in the corresponding run.
• Each non-leaf node in the tree represents
the winner of its right and left subtrees.
Winner Tree For k = 8

(Figures: a complete binary tree whose eight leaves hold the current first record of each run; each internal node holds the smaller of its two children, so the root holds the overall winner, 6 (from run 4). After 6 is output, the next record of run 4, 15, enters its leaf, and the tournament is replayed only along the path from that leaf to the root, making 8 (run 5) the new overall winner.)

Loser Tree

(Figure: the same tournament stored as a loser tree: each internal node records the loser of its match, and node 0 above the root records the overall winner, 8.)
Forests
• Definition: A forest is a set of n ≥ 0
disjoint trees.
• When we remove the root from a tree, we get a forest. E.g., removing the root of a binary tree gives a forest of two trees.
Transforming A Forest Into A Binary
Tree
• Definition: If T1, …, Tn is a forest of
trees, then the binary tree corresponding
to this forest, denoted by B(T1, …, Tn),
– is empty if n = 0
– has root equal to root (T1); has left subtree
equal to B(T11, T12,…, T1m), where T11, T12,…, T1m
are the subtrees of root (T1); and has right
subtree B(T2, …, Tn).
Set Representation
• Trees can be used to represent sets.
• Disjoint set union: If Si and Sj are
two disjoint sets, then their union Si
∪Sj = {all elements x such that x is in
Si or Sj}.
• Find(i). Find the set containing
element i.
Possible Tree Representation of Sets

(Figure: S1 = {0, 6, 7, 8} with root 0; S2 = {4, 1, 9} with root 4; S3 = {2, 3, 5} with root 2.)

Possible Representations of S1 ∪ S2

(Figure: either 0's tree becomes a subtree of root 4, or 4's tree becomes a subtree of root 0.)
Unions of Sets
• To obtain the union of two sets, just
set the parent field of one of the
roots to the other root.
• To figure out which set an element is
belonged to, just follow its parent
link to the root and then follow the
pointer in the root to the set name.
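In their simplest form, these two operations work directly on the parent array shown on the following slides; a minimal sketch (the array size is illustrative):

const int n = 10;
int parent[n];   // parent[i] = -1 marks a root

void SimpleUnion(int i, int j)
{ // i and j are the roots of two different trees
  parent[i] = j;    // make j the parent of root i
}

int SimpleFind(int i)
{ // follow parent links until a root (negative entry) is reached
  while (parent[i] >= 0) i = parent[i];
  return i;
}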
Data Representation for S1, S2, S3

(Figure: a set-name table maps S1, S2, S3 to pointers to the roots 0, 4, 2 of the corresponding trees.)
Array Representation of S1, S2, S3
• We could use an array for the set
name. Or the set name can be an
element at the root.
• Assume set elements are numbered 0
through n-1.
i [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
parent -1 4 -1 2 -1 2 0 0 0 4
Union-Find Operations
• For a set of n elements, each initially in a set of its own, the result of the union sequence below is a degenerate tree.
• The time complexity of the following
union-find operation is O(n2).
union(0, 1), union(1, 2), …, union(n-2, n-1)
find(0), find (1), …, find(n-1)
• The complexity can be improved by using the weighting rule for union.
Degenerate Tree

(Figure: union(0, 1), ..., union(n-2, n-1) produce a chain of n nodes; each find(0) must then traverse the whole chain, so the sequence of finds costs O(n^2).)
Weighting Rule
• Definition [Weighting rule for union(i, j)]: If the number of nodes in the tree with root i is less than the number in the tree with root j, then make j the parent of i; otherwise make i the parent of j.
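A sketch of union with the weighting rule, keeping the negated tree size in each root's parent entry (this matches the [-3], [-4] style counts in the later figures):

void WeightedUnion(int i, int j)
{ // union the sets with roots i and j, i != j
  // parent[i] and parent[j] hold minus the number of nodes in their trees
  int temp = parent[i] + parent[j];   // -(size of the combined tree)
  if (parent[i] > parent[j]) {        // tree i has fewer nodes
    parent[i] = j;
    parent[j] = temp;
  }
  else {                              // tree j has no more nodes than tree i
    parent[j] = i;
    parent[i] = temp;
  }
}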
Trees Obtained Using The Weighting Rule

(Figure: starting from n singletons, union(0, 1), union(0, 2), ..., union(0, n-1) now attach each new singleton directly under root 0, so the tree height stays 2 instead of degenerating.)
Weighted Union
• Lemma 5.5: Assume that we start with a
forest of trees, each having one node. Let T
be a tree with m nodes created as a result
of a sequence of unions each performed
using function WeightedUnion. The height
of T is no greater than ⌊ log 2 m ⌋ +1 .
• For the processing of an intermixed
sequence of u – 1 unions and f find
operations, the time complexity is O(u +
f*log u).
Trees Achieving Worst-Case Bound

(Figures: (a) eight singleton sets 0..7; (b) height-2 trees following union(0, 1), union(2, 3), union(4, 5), and union(6, 7); (c) height-3 trees following union(0, 2) and union(4, 6), and a height-4 tree after union(0, 4).)
(Figure: the resulting 8-node tree [-8] with root 0, before and after CollapsingFind(7): after collapsing, 7 and every node on its find path become children of the root.)
Analysis of WeightedUnion and
CollapsingFind
• The use of the collapsing rule roughly doubles the time for an individual find. However, it reduces the worst-case time over a sequence of finds (a sketch of CollapsingFind follows this slide).
• Lemma 5.6 [Tarjan and Van Leeuwen]:
Assume that we start with a forest of
trees, each having one node. Let T(f, u) be
the maximum time required to process any
intermixed sequence of f finds and u unions.
Assume that u ≥ n/2. Then
k1(n + fα(f + n, n)) ≤ T(f, u) ≤ k2(n + fα(f + n, n))
for some positive constants k1 and k2.
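A sketch of find with the collapsing rule: a first pass locates the root, and a second pass repoints every node on the path directly at it:

int CollapsingFind(int i)
{
  int root = i;
  while (parent[root] >= 0) root = parent[root]; // first pass: locate the root
  while (i != root) {                            // second pass: collapse the path
    int t = parent[i];
    parent[i] = root;
    i = t;
  }
  return root;
}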
Revisit Equivalence Class
• The aforementioned techniques can be applied to the equivalence
class problem.
• Assume initially all n polygons are in an equivalence class of their
own: parent[i] = -1, 0 ≤ i < n.
– First, we must determine the sets that contain i and j.
– If the two are in different sets, then the two sets are to be replaced
by their union.
– If the two are in the same set, then nothing needs to be done, since they are already in the same equivalence class.
– So we need to perform two finds and at most one union.
• If we have n polygons and m equivalence pairs, we need
– O(n) to set up the initial n-tree forest.
– 2m finds
– at most min{n-1, m} unions.
• If WeightedUnion and CollapsingFind are used, the time complexity is O(n + m·α(2m, min{n-1, m})).
– This seems slightly worse than the result of Section 4.7 (O(m+n)), but this scheme demands less space.
Example 5.5
(Figures: initially the twelve polygons 0..11 are singleton trees, each with parent value [-1]. Processing the equivalence pairs with WeightedUnion merges them step by step, through intermediate trees whose roots carry negated counts such as [-3] and [-4], into three trees containing {0, 2, 4, 7, 11}, {6, 8, 9, 10}, and {1, 3, 5}.)
Uniqueness of A Binary Tree
• In section 5.3 we introduced
preorder, inorder, and postorder
traversal of a binary tree. Now
suppose we are given a sequence (e.g.,
inorder sequence BCAEDGHFI), does
the sequence uniquely define a binary
tree?
Constructing A Binary Tree From Its
Inorder Sequence
(Figure: with inorder sequence B, C, A, E, D, G, H, F, I and A as the root, {B, C} must lie in the left subtree and {E, D, G, H, F, I} in the right subtree; recursing, C becomes the right child of B, and so on.)
Constructing A Binary Tree From Its
Inorder Sequence (Cont.)
(Figure: the binary tree whose nodes, numbered 1..9 in preorder, are A = 1, B = 2, C = 3, D = 4, E = 5, F = 6, G = 7, H = 8, I = 9.)
Preorder: 1, 2, 3, 4, 5, 6, 7, 8, 9
Inorder: 2, 3, 1, 5, 4, 7, 8, 6, 9
Distinct Binary Trees

(Figure: the five distinct binary trees with n = 3 nodes. In general, a tree with n nodes has a root, a left subtree with i nodes, and a right subtree with n-i-1 nodes, giving the recurrence b_n = Σ_i b_i · b_{n-i-1}, with b_0 = 1.)
Distinct Binary Trees (Cont.)

• Let B(x) = Σ_{i≥0} b_i x^i be the generating function for the number of binary trees.
• The recurrence relation gives

  x·B(x)^2 = B(x) - 1

  B(x) = (1 - sqrt(1 - 4x)) / (2x)

  B(x) = (1/(2x)) (1 - Σ_{n≥0} C(1/2, n) (-4x)^n) = Σ_{m≥0} C(1/2, m+1) (-1)^m 2^{2m+1} x^m

  b_n = (1/(n+1)) · C(2n, n),  hence  b_n = O(4^n / n^{3/2})
Chap 6: Graphs
Konigsberg Bridge Problem
• The river Pregel flows around the island Kneiphof and then divides into two.
• Four land areas A, B, C, D have this river
on their borders.
• The four lands are connected by 7 bridges
a – g.
• Determine whether it is possible to walk across all the bridges exactly once, returning to the starting land area.
Konigsberg Bridge Problem (Cont.)

(Figure: the four land areas A, B, C, D (B is the island Kneiphof) connected by the seven bridges a..g, and the corresponding multigraph with one vertex per land area and one edge per bridge.)
Euler’s Graph
• Define the degree of a vertex to be the
number of edges incident to it
• Euler showed that there is a walk starting
at any vertex, going through each edge
exactly once and terminating at the start
vertex iff the degree of each vertex is
even. This walk is called Eulerian.
• There is no Eulerian walk for the Konigsberg bridge problem, since all four vertices are of odd degree.
Application of Graphs
• Analysis of electrical circuits
• Finding shortest routes
• Project planning
• Identification of chemical compounds
• Statistical mechanics
• Genetics
• Cybernetics
• Linguistics
• Social Sciences, and so on …
Definition of A Graph
• A graph, G, consists of two sets, V and E.
– V is a finite, nonempty set of vertices.
– E is set of pairs of vertices called edges.
• The vertices of a graph G can be represented as
V(G).
• Likewise, the edges of a graph, G, can be
represented as E(G).
• Graphs can be either undirected graphs or
directed graphs.
• For an undirected graph, a pair of vertices (u, v) or (v, u) represents the same edge.
• For a directed graph, a directed pair <u, v> has u
as the tail and the v as the head. Therefore, <u, v>
and <v, u> represent different edges.
Three Sample Graphs
(Figure: G1 is the complete undirected graph on V(G1) = {0, 1, 2, 3}; G2 is a tree-shaped undirected graph on V(G2) = {0, 1, 2, 3, 4, 5, 6}; G3 is a directed graph on V(G3) = {0, 1, 2} with edges <0, 1>, <1, 0>, <1, 2>.)

(a) (Figure: some of the subgraphs of G3.)
Connected Graph
• Two vertices u and v are connected in an
undirected graph iff there is a path from
u to v (and v to u).
• An undirected graph is connected iff for
every pair of distinct vertices u and v in
V(G) there is a path from u to v in G.
• A connected component of an undirected graph is a maximal connected subgraph.
• A tree is a connected acyclic graph.
Strongly Connected Graph
• A directed graph G is strongly
connected iff for every pair of
distinct vertices u and v in V(G),
there is directed path from u to v
and also from v to u.
• A strongly connected component is a
maximal subgraph that is strongly
connected.
Graphs with Two Connected Components

(Figure: G4 consists of two connected components, H1 on vertices {0, 1, 2, 3} and H2 on vertices {4, 5, 6, 7}.)

Strongly Connected Components of G3

(Figure: G3 has two strongly connected components: {0, 1} and {2}.)
Degree of A Vertex
• Degree of a vertex: The degree of a vertex is the
number of edges incident to that vertex.
• If G is a directed graph, then we define
– in-degree of a vertex: is the number of edges for which
vertex is the head.
– out-degree of a vertex: is the number of edges for which the
vertex is the tail.
• For a graph G with n vertices and e edges, if d_i is the degree of vertex i in G, then the number of edges of G is

  e = (Σ_{i=0}^{n-1} d_i) / 2
Abstract Data Type Graph
class Graph
{
// objects: A nonempty set of vertices and a set of
undirected edges
// where each edge is a pair of vertices
public:
Graph(); // Create an empty graph
void InsertVertex(Vertex v);
void InsertEdge(Vertex u, Vertex v);
void DeleteVertex(Vertex v);
void DeleteEdge(Vertex u, Vertex v);
};
Adjacency Matrices

(Figure: the adjacency matrices of G1 (4x4, all off-diagonal entries 1), G3 (3x3, with rows 0 1 0 / 1 0 1 / 0 0 0), and G4 (8x8, block-diagonal, reflecting its two connected components).)
Adjacency Lists

HeadNodes for G1:
[0] -> 3 -> 1 -> 2
[1] -> 2 -> 3 -> 0
[2] -> 1 -> 3 -> 0
[3] -> 0 -> 1 -> 2
(a) G1

HeadNodes for G3:
[0] -> 1
[1] -> 2 -> 0
[2] -> (null)
(b) G3
Adjacency Lists (Cont.)

HeadNodes for G4:
[0] -> 2 -> 1
[1] -> 3 -> 0
[2] -> 0 -> 3
[3] -> 1 -> 2
[4] -> 5
[5] -> 6 -> 4
[6] -> 5 -> 7
[7] -> 6
(c) G4
Sequential Representation of Graph G4
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
9 11 13 15 17 18 20 22 23 2 1 3 0 0 3 1 2 5 6 4 5 7 6
Inverse Adjacency Lists for
G3
[0] -> 1
[1] -> 0
[2] -> 1
Multilists
• In the adjacency-list representation of an
undirected graph, each edge (u, v) is
represented by two entries.
• Multilists: To be able to determine the
second entry for a particular edge and
mark that edge as having been examined,
we use a structure called multilists.
– Each edge is represented by one node.
– Each node will be in two lists.
Orthogonal List Representation for G3

(Figure: head nodes for vertices 0, 1, 2 are shown twice; each edge node <tail, head> is linked into both the list of its tail's out-edges and the list of its head's in-edges.)
Adjacency Multilists for G1
HeadNodes
[0] -> N0      N0: edge (0, 1), links N1, N3
[1] -> N0      N1: edge (0, 2), links N2, N3
[2] -> N1      N2: edge (0, 3), links 0,  N4
[3] -> N2      N3: edge (1, 2), links N4, N5
               N4: edge (1, 3), links 0,  N5
               N5: edge (2, 3), links 0,  0

The lists are:
Vertex 0: N0 -> N1 -> N2
Vertex 1: N0 -> N3 -> N4
Vertex 2: N1 -> N3 -> N5
Vertex 3: N2 -> N4 -> N5
Weighted Edges
• Very often the edges of a graph have
weights associated with them.
– distance from one vertex to another
– cost of going from one vertex to an
adjacent vertex.
– To represent weight, we need additional
field, weight, in each entry.
– A graph with weighted edges is called a
network.
Graph Operations
• A general operation on a graph G is to
visit all vertices in G that are
reachable from a vertex v.
– Depth-first search
– Breadth-first search
Depth-First Search
• Starting from a vertex v, an unvisited vertex w adjacent to v is selected and a depth-first search from w is initiated.
• When the search operation has reached a vertex
u such that all its adjacent vertices have been
visited, we back up to the last vertex visited that
has an unvisited vertex w adjacent to it and
initiate a depth-first search from w again.
• The above process repeats until no unvisited
vertex can be reached from any of the visited
vertices.
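A compact sketch of depth-first search on adjacency lists (the vector-based representation and the global visited array are assumptions):

#include <vector>

const int n = 8;                 // number of vertices
std::vector<int> adj[n];         // adj[v] lists the vertices adjacent to v
bool visited[n];                 // all false before the first call

void DFS(int v)
{
  visited[v] = true;             // visit v
  for (int w : adj[v])           // consider each vertex w adjacent to v
    if (!visited[w]) DFS(w);     // backing up happens automatically on return
}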
Graph G and Its Adjacency Lists

(Figure: the 8-vertex graph G, with 0 at the top, 1 and 2 below it, 3, 4, 5, 6 below them, and 7 at the bottom.)

HeadNodes
[0] -> 1 -> 2
[1] -> 0 -> 3 -> 4
[2] -> 0 -> 5 -> 6
[3] -> 1 -> 7
[4] -> 1 -> 7
[5] -> 2 -> 7
[6] -> 2 -> 7
[7] -> 3 -> 4 -> 5 -> 6
Analysis of DFS
• If G is represented by its adjacency
lists, the DFS time complexity is
O(e).
• If G is represented by its adjacency
matrix, then the time complexity to
complete DFS is O(n2).
Breadth-First Search
• Starting from a vertex v, visit all unvisited
vertices adjacent to vertex v.
• Unvisited vertices adjacent to these newly
visited vertices are then visited, and so on.
• If an adjacency matrix is used, the BFS
complexity is O(n²).
• If adjacency lists are used, the time
complexity of BFS is O(e).
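A matching BFS sketch, using the same illustrative vector<vector<int>> adjacency-list representation as the DFS sketch above:

#include <queue>
#include <vector>
using namespace std;

void BFS(const vector<vector<int>>& adj, int v)
{
    vector<bool> visited(adj.size(), false);
    queue<int> q;
    visited[v] = true;
    q.push(v);
    while (!q.empty()) {
        int u = q.front(); q.pop();   // visit u
        for (int w : adj[u])          // enqueue unvisited neighbors of u
            if (!visited[w]) { visited[w] = true; q.push(w); }
    }
}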
A Complete Graph and Three of Its Spanning Trees
(Figure omitted: a complete graph on four vertices and three of its spanning trees.)
Depth-First and Breadth-First Spanning Trees
(Figure: for the 8-vertex graph G above, (a) the DFS(0) spanning tree and (b) the BFS(0) spanning tree.)
Spanning Tree
(Figure: (a) a connected graph on vertices 0-9 and (b) its biconnected components.)
Biconnected Components
• Definition: A vertex v of G is an articulation point
iff the deletion of v, together with the deletion
of all edges incident to v, leaves behind a graph
that has at least two connected components.
• Definition: A biconnected graph is a connected
graph that has no articulation points.
• Definition: A biconnected component of a
connected graph G is a maximal biconnected
subgraph H of G. By maximal, we mean that G
contains no other subgraph that is both
biconnected and properly contains H.
Biconnected Components
(Cont.)
• Two biconnected components of the same graph can have at
most one vertex in common.
• No edge can be in two or more biconnected components.
• The biconnected components of G partition the edges of G.
• The biconnected components of a connected, undirected
graph G can be found by using any depth-first spanning tree
of G.
• A nontree edge (u, v) is a back edge with respect to a
spanning tree T iff either u is an ancestor of v or v is an
ancestor of u.
• A nontree edge that is not a back edge is called a cross edge.
• No graph can have cross edges with respect to any of its
depth-first spanning trees.
Biconnected Components (Cont.)
• The root of the depth-first spanning tree is an
articulation point iff it has at least two children.
• Any other vertex u is an articulation point iff it
has at least one child, w, such that it is not
possible to reach an ancestor of u using a path
composed solely of w, descendants of w, and a
single back edge.
• Define low(w) as the lowest depth-first number
that can be reached from w using a path of
descendants followed by, at most, one back edge:
  low(w) = min{ dfn(w), min{ low(x) | x is a child of w },
                min{ dfn(x) | (w, x) is a back edge } }
• u is an articulation point iff u is
either the root of the spanning tree
and has two or more children or u is
not the root and u has a child w such
that low(w) ≥ dfn(u).
Depth-First Spanning Tree
(Figure omitted: a depth-first spanning tree of the 10-vertex graph above, with dfn values written beside the nodes and the back edges drawn as broken lines.)
dfn and low values for the Spanning Tree
vertex 0 1 2 3 4 5 6 7 8 9
dfn 5 4 3 1 2 6 7 8 10 9
low 5 1 1 1 1 6 6 6 10 9
Minimal Cost Spanning Tree
• The cost of a spanning tree of a weighted,
undirected graph is the sum of the costs
(weights) of the edges in the spanning tree.
• A minimum-cost spanning tree is a spanning tree of
least cost.
• Three greedy-method algorithms are available for
obtaining a minimum-cost spanning tree of a
connected, undirected graph:
– Kruskal’s algorithm
– Prim’s algorithm
– Sollin’s algorithm
Kruskal’s Algorithm
• Kruskal’s algorithm builds a minimum-cost
spanning tree T by adding edges to T one at a
time.
• The algorithm selects the edges for inclusion in T
in nondecreasing order of their cost.
• An edge is added to T if it does not form a cycle
with the edges that are already in T.
• Theorem 6.1: Let G be any undirected, connected
graph. Kruskal’s algorithm generates a minimum-
cost spanning tree.
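A sketch of Kruskal's algorithm using a simple union-find structure to detect cycles; the Edge type and helper names are illustrative assumptions, not from the slides.

#include <algorithm>
#include <vector>
using namespace std;

struct Edge { int u, v, cost; };

int Find(vector<int>& parent, int x)          // root of x's component
{ return parent[x] == x ? x : parent[x] = Find(parent, parent[x]); }

vector<Edge> Kruskal(int n, vector<Edge> edges)
{
    sort(edges.begin(), edges.end(),          // nondecreasing cost order
         [](const Edge& a, const Edge& b) { return a.cost < b.cost; });
    vector<int> parent(n);
    for (int i = 0; i < n; i++) parent[i] = i;
    vector<Edge> T;                           // edges accepted into the tree
    for (const Edge& e : edges) {
        int ru = Find(parent, e.u), rv = Find(parent, e.v);
        if (ru != rv) {                       // e joins two components: no cycle
            parent[ru] = rv;
            T.push_back(e);
            if ((int)T.size() == n - 1) break;
        }
    }
    return T;
}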
Stages in Kruskal’s Algorithm
(Figure omitted: stages (a)-(g) on a 7-vertex network; the edges of cost 10, 12, 14, 16, 22, and 25 are accepted in that order, while edges that would create a cycle are rejected.)
Prim’s Algorithm
(Figure omitted: stages (a)-(g) of Prim’s algorithm on the same 7-vertex network; the tree grows from vertex 0, adding one lowest-cost edge at a time, e.g. the edges of cost 10, 25, 22, 12, 16, and 14.)
Shortest Paths
• Usually, the highway structure can be
represented by graphs with vertices
representing cities and edges representing
sections of highways.
• Edges may be assigned weights to
represent the distance or the average
driving time between two cities connected
by a highway.
• Often, for most drivers, it is desirable to
find the shortest path from the
originating city to the destination city.
Single Source/All Destinations:
Nonnegative Edge Costs
• Let S denote the set of vertices to which the
shortest paths have already been found.
1) If the next shortest path is to vertex u, then the path
begins at v, ends at u, and goes through only vertices
that are in S.
2) The destination of the next path generated must be
the vertex u that has the minimum distance among all
vertices not in S.
3) The vertex u selected in 2) becomes a member of S.
• The algorithm was first given by Edsger Dijkstra,
so it is often called Dijkstra's algorithm.
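A sketch of the method under the three rules above, assuming a cost adjacency matrix in which missing edges hold a large INF value; the O(n²) scan for the closest vertex mirrors the description of S. Names are illustrative.

#include <vector>
#include <limits>
using namespace std;

vector<int> ShortestPath(const vector<vector<int>>& length, int v)
{
    const int n = length.size(), INF = numeric_limits<int>::max() / 2;
    vector<int> dist(n, INF);
    vector<bool> inS(n, false);
    dist[v] = 0;
    for (int iter = 0; iter < n; iter++) {
        int u = -1;                      // closest vertex not yet in S
        for (int i = 0; i < n; i++)
            if (!inS[i] && (u < 0 || dist[i] < dist[u])) u = i;
        inS[u] = true;                   // u's shortest path is now final
        for (int w = 0; w < n; w++)      // relax the edges leaving u
            if (!inS[w] && dist[u] + length[u][w] < dist[w])
                dist[w] = dist[u] + length[u][w];
    }
    return dist;
}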
Single Source/All Destinations: General
Weights
• When negative edge lengths are permitted, the graph must not
have cycles of negative length.
• When there are no cycles of negative length, there is a shortest
path between any two vertices of an n-vertex graph that has at
most n-1 edges on it.
1. If the shortest path from v to u with at most k, k > 1, edges has no
   more than k − 1 edges, then dist^k[u] = dist^(k−1)[u].
2. If the shortest path from v to u with at most k, k > 1, edges has
   exactly k edges, then it is comprised of a shortest path from v to
   some vertex j followed by the edge <j, u>. The path from v to j has
   k − 1 edges, and its length is dist^(k−1)[j].
• The distances can therefore be computed with the recurrence:
  dist^k[u] = min{ dist^(k−1)[u], min_i { dist^(k−1)[i] + length[i][u] } }
• The algorithm is also referred to as the Bellman and Ford
Algorithm.
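A sketch of this recurrence, assuming the edges are given as (i, u, length) triples; the Arc type and function name are illustrative.

#include <vector>
#include <limits>
using namespace std;

struct Arc { int i, u, length; };

vector<int> BellmanFord(int n, const vector<Arc>& arcs, int v)
{
    const int INF = numeric_limits<int>::max() / 2;
    vector<int> dist(n, INF);
    dist[v] = 0;
    for (int k = 1; k <= n - 1; k++)      // shortest paths have <= n-1 edges
        for (const Arc& a : arcs)         // dist[u] = min(dist[u], dist[i]+len)
            if (dist[a.i] != INF && dist[a.i] + a.length < dist[a.u])
                dist[a.u] = dist[a.i] + a.length;
    return dist;
}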
Graph and Shortest Paths From Vertex 0 to All Destinations
(Figure omitted: a 6-vertex digraph with edge lengths 50, 10, 35, 15, 10, 20, 20, 30, 15, 3.)
     Path           Length
1)   0, 3             10
2)   0, 3, 4          25
3)   0, 3, 4, 1       45
4)   0, 2             45
(Figure omitted: a digraph whose vertices are cities, including San Francisco, Denver, Los Angeles, New Orleans, Miami, and New York, with flight costs such as 300, 800, 1000, 1200, 1400, and 1700 on the edges, together with its 8 × 8 cost adjacency matrix; blank entries stand for +∞.)
Action of Shortest Path
(Figure: a 7-vertex digraph containing negative-length edges such as −5 and −2.)
(b) dist^k[0..6] computed from source 0:
  k = 1:  0  6  5  5  ∞  ∞  ∞
  k = 2:  0  3  3  5  5  4  ∞
  k = 3:  0  1  3  5  2  4  7
  k = 4:  0  1  3  5  0  4  5
  k = 5:  0  1  3  5  0  4  3
All-Pairs Shortest Paths
• In all-pairs shortest-path problem, we are
to find the shortest paths between all
pairs of vertices u and v, u ≠ v.
– Use n independent single-source/all-destination
problems, with each of the n vertices of G as the
source vertex. The complexity is O(n³) (or
O(n² log n + ne) if Fibonacci heaps are used).
– On graphs with negative edges, the run time will
be O(n⁴) if adjacency matrices are used and
O(n²e) if adjacency lists are used.
All-Pairs Shortest Paths
(Cont.)
• A simpler algorithm with complexity O(n³) is available. It works
  faster when G has edges with negative length, as long as the graph
  has at least c·n edges for some suitable constant c (see the sketch
  below).
  – A^(n−1)[i][j]: the length of the shortest i-to-j path in G
  – A^k[i][j]: the length of the shortest path from i to j going through
    no intermediate vertex of index greater than k
  – A^(−1)[i][j]: is just length[i][j]
1. The shortest path from i to j going through no vertex with index
   greater than k does not go through the vertex with index k, so its
   length is A^(k−1)[i][j].
2. The shortest path goes through vertex k. The path consists of a
   subpath from i to k and another one from k to j.
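A sketch of this all-pairs computation (commonly known as Floyd's algorithm), performed in place on a single matrix; missing edges are assumed to hold a large value chosen so additions cannot overflow.

#include <vector>
using namespace std;

void AllPairs(vector<vector<int>>& A)   // A starts as length[i][j], i.e. A^(-1)
{
    const int n = A.size();
    for (int k = 0; k < n; k++)         // allow k as an intermediate vertex
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) // A^k[i][j] = min of the two cases above
                if (A[i][k] + A[k][j] < A[i][j])
                    A[i][j] = A[i][k] + A[k][j];
}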
(Figure: a 3-vertex digraph with edge lengths 4, 11, 6, 2, 3.)
A^(−1)        A^0           A^1           A^2
 0  4 11       0  4 11       0  4  6       0  4  6
 6  0  2       6  0  2       6  0  2       5  0  2
 3  ∞  0       3  7  0       3  7  0       3  7  0
(b) A^(−1)   (c) A^0       (d) A^1       (e) A^2
Transitive Closure
• Definition: The transitive closure matrix,
denoted A+, of a graph G, is a matrix such
that A+[i][j] = 1 if there is a path of length
> 0 from i to j; otherwise, A+[i][j] = 0.
• Definition: The reflexive transitive closure
matrix, denoted A*, of a graph G, is a
matrix such that A*[i][j] = 1 if there is a
path of length ≥ 0 from i to j; otherwise,
A*[i][j] = 0.
Graph G and Its Adjacency Matrix A, A+, A*
(a) Digraph G: 0 → 1 → 2 → 3 → 4, plus the edge 4 → 2.

A             A+            A*
0 1 0 0 0     0 1 1 1 1     1 1 1 1 1
0 0 1 0 0     0 0 1 1 1     0 1 1 1 1
0 0 0 1 0     0 0 1 1 1     0 0 1 1 1
0 0 0 0 1     0 0 1 1 1     0 0 1 1 1
0 0 1 0 0     0 0 1 1 1     0 0 1 1 1
(b) A        (c) A+        (d) A*
Activity-on-Vertex (AOV) Networks
(Figure omitted: an AOV network of courses C1-C15 with prerequisite edges.)
Figure 6.36 Action of Program 6.11 on an AOV network
(Figure omitted: the network shrinks as vertices with no remaining predecessors are output and removed.)
(Figure: an AOE network from event 0 (start) to event 8 (finish), with activities a1-a11 and durations such as a1 = 6, a2 = 4, a3 = 5, a5 = 1, a6 = 2, a7 = 9, a9 = 4, a10 = 2, a11 = 4.)
event interpretation
0 Start of project
1 Completion of activity a1
4 Completion of activities a4 and a5
7 Completion of activities a8 and a9
8 Completion of project
Adjacency lists for Figure 6.38 (a)
ee [0] [1] [2] [3] [4] [5] [6] [7] [8] Stack
Initial 0 0 0 0 0 0 0 0 0 [0]
output 0 0 6 4 5 0 0 0 0 0 [3,2,1]
output 3 0 6 4 5 0 7 0 0 0 [5,2,1]
output 5 0 6 4 5 0 7 0 11 0 [2,1]
output 2 0 6 4 5 0 7 0 11 0 [1]
output 1 0 6 4 5 5 7 0 11 0 [4]
output 4 0 6 4 5 7 7 0 14 0 [7,6]
output 7 0 6 4 5 7 7 16 14 18 [6]
output 6 0 6 4 5 7 7 16 14 18 [8]
output 8
Chap 7
Sorting
Motivation of Sorting
• The term list here is a collection of records.
• Each record has one or more fields.
• Each record has a key to distinguish one
record from another.
• For example, the phone directory is a list.
Name, phone number, and even address can
be the key, depending on the application or
need.
Sorting
• Two ways to store a collection of records
– Sequential
– Non-sequential
• Assume a sequential list f. To retrieve a
record with key f[i].key from such a list, we
can compare the keys in the order
f[n].key, f[n-1].key, …, f[1].key => sequential search
Example of An Element of A Search List
class Element
{
public:
    int getKey() const { return key; }
    void setKey(int k) { key = k; }
private:
    int key;
    // other fields of the record
    …
};
Sequential Search
O( Σ_{i=1}^{n−1} (i + 1) ) = O(n²)
Insertion Sort Example 1
Initial list: [26  5 37  1 61 11 59 15 48 19]
Final list:   [ 1  5 11 15 19 26 37 48 59 61]
Quick Sort (Cont.)
• In QuickSort(), list[n+1] has been set to have a key
at least as large as the remaining keys.
• Analysis of QuickSort
– The worst case O(n2)
– If each time a record is correctly positioned, the sublist to
its left is of the same size as the sublist to its right.
Assume T(n) is the time taken to sort a list of size n:
T(n) ≤ cn + 2T(n/2), for some constant c
     ≤ cn + 2(cn/2 + 2T(n/4))
     = 2cn + 4T(n/4)
     :
     :
     ≤ cn log₂n + nT(1) = O(n log n)
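A minimal quick sort sketch consistent with this analysis, using the first key of each sublist as the pivot; it is illustrative rather than the book's exact program.

#include <utility>

void QuickSort(int* a, int left, int right)   // sorts a[left..right]
{
    if (left >= right) return;
    int pivot = a[left], i = left, j = right + 1;
    while (true) {
        do i++; while (i <= right && a[i] < pivot); // scan right for key >= pivot
        do j--; while (a[j] > pivot);               // scan left for key <= pivot
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[left], a[j]);        // pivot lands in its final position j
    QuickSort(a, left, j - 1);       // sort the left sublist
    QuickSort(a, j + 1, right);      // sort the right sublist
}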
Lemma 7.1
• Lemma 7.1: Let T_avg(n) be the expected
time for function QuickSort to sort a
list with n records. Then there exists a
constant k such that T_avg(n) ≤ k·n·logₑn
for n ≥ 2.
Analysis of Quick Sort
• Unlike insertion sort (which only needs additional
space for a record), quick sort needs stack space to
implement the recursion.
• If the lists split evenly, the maximum recursion
depth would be log n and the stack space is of O(log
n).
• The worst case is when the lists split into a left
sublist of size n – 1 and a right sublist of size 0 at
each level of recursion. In this case, the recursion
depth is n, and the stack space needed is O(n).
• The worst case stack space can be reduced by a
factor of 4 by realizing that right sublists of size
less than 2 need not be stacked. Asymptotic
reduction in stack space can be achieved by sorting
smaller sublists first. In this case the additional
stack space is at most O(log n).
Quick Sort Variations
• Quick sort using a median of three:
pick the median of the first, middle,
and last keys in the current sublist as
the pivot. Thus, pivot = median{ K_l,
K_((l+r)/2), K_r }.
Decision Tree
• So far both insertion sorting and quick
sorting have worst-case complexity of O(n2).
• If we restrict the question to sorting
algorithms in which the only operations
permitted on keys are comparisons and
interchanges, then O(n logn) is the best
possible time.
• This is done by using a tree that describes
the sorting process. Each vertex of the tree
represents a key comparison, and the
branches indicate the result. Such a tree is
called a decision tree.
Decision Tree for Insertion Sort
(Figure: the decision tree for sorting three records; the root compares K1 ≤ K2, the next levels compare K2 ≤ K3 and K1 ≤ K3, and the six leaves I-VI correspond to the six possible orderings.)
Decision Tree (Cont.)
• Theorem 7.1: Any decision tree that
sorts n distinct elements has a height
of at least ⌈log₂(n!)⌉ + 1.
0 2 4 6 8 a c e g i j k l m n t w z|1 3 5 7 9 b d f h o p q r s u v x y
0 2 4 6 8 a c e g i j k l m n t w z 1 3 5 7 9 b d f h o p q r s u v x y
0 1 2 3 4 5 6 7 8 9 a w v y z x u b c e g i j k|d f h o p q|l m n r s t
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k v z u|y x w o p q|l m n r s t
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k v z u y x w o p q|l m n r s t
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q y x w|v z u r s t
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q r s t|v z u y x w
Analysis of O(1) Space Merge
• Steps 1 and 2 and the swapping of Step 3 each take
O(√n) time and O(1) space.
• The sort of Step 3 can be done in O(n) time and
O(1) space using an insertion sort.
• Step 4 can be done in O(n) time and O(1) space
using a selection sort. (Selection sort sorts m
records using O(m²) key comparisons and O(m)
record moves, so it needs O(n) comparisons, and
the time to move blocks is O(n).)
• If insertion sort is used in Step 4, then the time
becomes O(n^1.5), since insertion sort needs O(m²)
record moves (√n records per block × n record
moves).
Analysis of O(1) Space Merge
(Cont.)
5 26 1 77 11 61 15 59 19 48
1 5 26 77 11 15 59 61 19 48
1 5 11 15 26 59 61 77 19 48
1 5 11 15 19 26 48 59 61 77
Iterative Merge Sort
26  5 | 77  1 | 61 11 | 59 15 | 48 19    input (sublists of size 1)
 5 26 |  1 77 | 11 61 | 15 59 | 19 48    after pass 1
 1  5 26 77 | 11 15 59 61 | 19 48        after pass 2
 1  5 11 15 26 59 61 77 | 19 48          after pass 3
 1  5 11 15 19 26 48 59 61 77            after pass 4
Program 7.11 (Recursive
Merge Sort )
class Element
{
private:
int key;
Field other;
int link;
public:
Element() {link = 0;};
};
26 5 77 1 61 11 59 15 48 19
5 26 77 1 11 59 61 15 19 48
1 5 11 26 59 61 77 15 19 48
1 5 11 15 19 26 48 59 61 77
Heap Sort
• Merge sort needs additional storage space
proportional to the number of records in the file
being sorted, even though its computing time is O(n
log n)
• O(1) merge only needs O(1) space but the sorting
algorithm is much slower.
• We will see that heap sort only requires a fixed
amount of additional storage and achieves worst-
case and average computing time O(n log n).
• Heap sort uses the max-heap structure.
Heap Sort (Cont.)
• For heap sort, first of all, the n
records are inserted into an empty
heap.
• Next, the records are extracted from
the heap one at a time.
• With the use of a special function
adjust(), we can create a heap of n
records faster.
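A hedged sketch of the adjust step and of heap sort built on it, assuming keys are stored in a[1..n] with 1-origin indexing (a[0] unused); it illustrates the technique rather than reproducing the book's Program 7.13 verbatim.

#include <utility>

void Adjust(int* a, int root, int n)    // sift a[root] down into a max heap
{
    int e = a[root], j;
    for (j = 2 * root; j <= n; j *= 2) {
        if (j < n && a[j] < a[j + 1]) j++;  // j points to the larger child
        if (e >= a[j]) break;               // heap property restored
        a[j / 2] = a[j];                    // move the child up one level
    }
    a[j / 2] = e;                           // place e into the final hole
}

void HeapSort(int* a, int n)
{
    for (int i = n / 2; i >= 1; i--)    // build the initial max heap
        Adjust(a, i, n);
    for (int i = n - 1; i >= 1; i--) {  // move the max to the end, re-adjust
        std::swap(a[1], a[i + 1]);
        Adjust(a, 1, i);
    }
}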
Program 7.13 (Adjusting A
Max Heap)
(Figure: (a) the input array [26, 5, 77, 1, 61, 11, 59, 15, 48, 19] and (b) the initial max heap, with 77 at the root.)
Heap Sort Example
(Figure: the heaps after the first two deletions; 61 and then 59 rise to the root as the sorted suffix of the array grows.)
Radix Sort Example
Input: list[1..10] = 271, 93, 33, 984, 55, 306, 208, 179, 859, 9

Pass 1 (ones digit), distributing into the queues e[0..9]/f[0..9]:
  1: 271    3: 93, 33    4: 984    5: 55    6: 306    8: 208    9: 179, 859, 9
Collected: 271, 93, 33, 984, 55, 306, 208, 179, 859, 9

Pass 2 (tens digit):
  0: 306, 208, 9    3: 33    5: 55, 859    7: 271, 179    8: 984    9: 93
Collected: 306, 208, 9, 33, 55, 859, 271, 179, 984, 93

Pass 3 (hundreds digit):
  0: 9, 33, 55, 93    1: 179    2: 208, 271    3: 306    8: 859    9: 984
Collected (sorted): 9, 33, 55, 93, 179, 208, 271, 306, 859, 984
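A sketch of the three-pass process above for decimal keys; vector-of-vector bins stand in for the e/f queues, and the function name is illustrative.

#include <vector>
using namespace std;

void RadixSort(vector<int>& list, int digits)
{
    int divisor = 1;
    for (int d = 0; d < digits; d++, divisor *= 10) {
        vector<vector<int>> bin(10);           // one queue per digit value
        for (int x : list)
            bin[(x / divisor) % 10].push_back(x);
        list.clear();                          // collect the bins in order
        for (const auto& b : bin)              // stable within each bin
            for (int x : b) list.push_back(x);
    }
}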
List And Table Sorts
• Apart from radix sort and recursive merge sort,
all the sorting methods we have looked at so far
require excessive data movement.
• When the amount of data to be sorted is large,
data movement tends to slow down the process.
• It is desirable to modify sorting algorithms to
minimize the data movement.
• Methods such as insertion sort or merge sort
can be modified to work with a linked list rather
than a sequential list. Instead of physical
movement, an additional link field is used to
reflect the change in the position of the record
in the list.
Program 7.16 (Rearranging Records Using
A Doubly Linked List)
i
R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 26 5 77 1 61 11 59 15 48 19
link 9 6 0 2 3 8 5 10 7 1
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 26 5 77 1 61 11 59 15 48 19
link 9 6 0 2 3 8 5 10 7 1
linkb 10 4 5 0 7 2 9 6 1 8
Example 7.9 (Cont.)
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 77 26 61 11 59 15 48 19
link 2 6 0 9 3 8 5 10 7 4
linkb 0 4 5 10 7 2 9 6 4 8
Configuration after first iteration of the for loop of list1, first = 2
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 77 26 61 11 59 15 48 19
link 2 6 0 9 3 8 5 10 7 1
linkb 0 4 5 10 7 2 9 6 1 8
Configuration after second iteration of the for loop of list1, first = 6
Example 7.9 (Cont.)
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 11 26 61 77 59 15 48 19
link 2 6 8 9 6 0 5 10 7 4
linkb 0 4 2 10 7 5 9 6 4 8
Configuration after third iteration of the for loop of list1, first = 8
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 11 15 61 77 59 26 48 19
link 2 6 8 10 6 0 5 9 7 8
linkb 0 4 2 6 7 5 9 10 8 8
Configuration after fourth iteration of the for loop of list1, first = 10
Example 7.10 (Rearranging Records Using
Only One Link)
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 77 26 61 11 59 15 48 19
link 4 6 0 9 3 8 5 10 7 1
Configuration after first iteration of the for loop of list1, first = 2
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 77 26 61 11 59 15 48 19
link 4 6 0 9 3 8 5 10 7 1
Configuration after second iteration of the for loop of list1, first = 6
Example 7.10 (Rearranging Records Using
Only One Link)
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 11 26 61 77 59 15 48 19
link 4 6 6 9 3 0 5 10 7 1
Configuration after third iteration of the for loop of list1, first = 8
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 11 15 61 77 59 26 48 19
link 4 6 6 8 3 0 5 9 7 1
Configuration after fourth iteration of the for loop of list1, first = 10
Example 7.10 (Rearranging Records Using
Only One Link)
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 11 15 19 77 59 26 48 61
link 4 6 6 8 10 0 5 9 7 3
Configuration after fifth iteration of the for loop of list1, first = 1
i R1 R2 R3 R4 R5 R6 R7 R8 R9 R10
key 1 5 11 15 19 26 59 77 48 61
link 4 6 6 8 1 8 5 0 7 3
Configuration after sixth iteration of the for loop of list1, first = 9
Table Sort
• The list-sort technique is not well suited for quick
sort and heap sort.
• One can maintain an auxiliary table, t, with one entry
per record. The entries serve as an indirect
reference to the records.
• Initially, t[i] = i. When interchanges are required,
only the table entries are exchanged.
• Sometimes it may be necessary to physically
rearrange the records according to the
permutation specified by t.
Table Sort (Cont.)
• The function to rearrange records
corresponding to the permutation t[1],
t[2], …, t[n] can be considered as an
application of a theorem from
mathematics:
– Every permutation is made up of disjoint
cycles. The cycle for any element i is made
up of i, t[i], t²[i], …, t^k[i], where
t^j[i] = t[t^(j−1)[i]], t⁰[i] = i, and t^k[i] = i.
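A sketch of the cycle-following rearrangement just described; it is illustrative rather than the book's Program 7.18, and it uses 0-origin indexing with key and t as parallel arrays.

#include <vector>
using namespace std;

void Rearrange(vector<int>& key, vector<int>& t)
{
    const int n = key.size();
    for (int i = 0; i < n; i++) {
        if (t[i] == i) continue;        // record already in place
        int saved = key[i], j = i;      // follow the cycle starting at i
        while (t[j] != i) {
            key[j] = key[t[j]];         // pull in the record that belongs here
            int next = t[j];
            t[j] = j;                   // mark position j as done
            j = next;
        }
        key[j] = saved;                 // close the cycle
        t[j] = j;
    }
}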
Program 7.18 (Table Sort)
R1 R2 R3 R4 R5 R6 R7 R8
key 35 14 12 42 26 50 31 18
t 3 2 8 5 7 1 4 6
Initial configuration
key 12 14 18 42 26 35 31 50
t 1 2 3 5 7 6 4 8
key 12 14 18 26 31 35 42 50
t 1 2 3 4 5 6 7 8
Total record moves: Σ_{l, k_l ≠ 0} (k_l + 1), summed over the nontrivial cycles l of the permutation.
run1 (1 – 750), run2 (751 – 1500), run3 (1501 – 2250),
run4 (2251 – 3000), run5 (3001 – 3750), run6 (3751 – 4500)
Example 7.12 (Cont.)
• t_IO = t_s + t_l + t_rw
  t_IS = time to internally sort 750 records
  n·t_m = time to merge n records from the input buffers to the output buffer
  t_s = maximum seek time
  t_l = maximum latency time
  t_rw = time to read or write one block of 250 records
Example 7.12 (Cont.)
Operation: (1) read 18 blocks of input (18·t_IO), internally sort (6·t_IS),
and write 18 blocks (18·t_IO).
Time: 36·t_IO + 6·t_IS
(Figure omitted: a binary merge tree; the external nodes are run lengths such as 2, 4, 5, and 15, and the internal nodes M1-M4 are merge steps labeled with their costs.)
Huffman Function
class BinaryTree {
public:
    BinaryTree(BinaryTree bt1, BinaryTree bt2) {
        root = new BinaryTreeNode;      // combined tree gets a new root
        root->LeftChild = bt1.root;
        root->RightChild = bt2.root;
        root->weight = bt1.root->weight + bt2.root->weight;
    }
private:
    BinaryTreeNode *root;
};
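Since the slide's BinaryTree class is abbreviated, here is a hedged sketch of the overall Huffman construction that computes only the total weighted cost, using a standard-library min priority queue; the function name HuffmanCost is an illustrative assumption.

#include <queue>
#include <vector>
#include <functional>
using namespace std;

long HuffmanCost(const vector<long>& weights)
{
    priority_queue<long, vector<long>, greater<long>> pq(weights.begin(),
                                                         weights.end());
    long cost = 0;
    while (pq.size() > 1) {
        long a = pq.top(); pq.pop();   // the two trees of smallest weight
        long b = pq.top(); pq.pop();
        cost += a + b;                 // combining creates a node of weight a+b
        pq.push(a + b);
    }
    return cost;
}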
(Figure: Huffman trees (a)-(e) built from the weights 2, 3, 5, 7, 9, 13; repeatedly combining the two smallest trees yields internal weights 5, 10, 16, 23, and a root of weight 39.)
Chap 8
Hashing
Symbol Table
• Symbol tables are used widely in many applications.
– a dictionary is a kind of symbol table
– a data dictionary is a symbol table used in database management
• In general, the following operations are performed
on a symbol table
– determine if a particular name is in the table
– retrieve the attribute of that name
– modify the attributes of that name
– insert a new name and its attributes
– delete a name and its attributes
Symbol Table (Cont.)
• Popular operations on a symbol table include search,
insertion, and deletion
• A binary search tree could be used to represent a
symbol table.
– The complexities for the operations are O(n).
• A technique called hashing can provide a good
performance for search, insert, and delete.
• Instead of using comparisons to perform search,
hashing relies on a formula called the hash function.
• Hashing can be divided into static hashing and
dynamic hashing
Static Hashing
Hash Tables
• Definition: The identifier density of a
hash table is the ratio n/T, where n is
the number of identifiers in the table
and T is the total number of possible
identifiers. The loading density or
loading factor of a hash table is α =
n/(s·b), where s is the number of slots
per bucket and b is the number of buckets.
Hash Tables (Cont.)
• Two identifiers, I1, and I2, are said to be
synonyms with respect to h if h(I1) = h(I2).
• An overflow occurs when a new identifier i is
mapped or hashed by h into a full bucket.
• A collision occurs when two non-identical
identifiers are hashed into the same bucket.
• If the bucket size is 1, collisions and
overflows occur at the same time.
Example 8.1
(Figure: a hash table ht with 26 buckets and two slots per bucket; bucket 0 holds A and A2, bucket 3 holds D, and bucket 6 holds GA and G.)
If no overflow occurs, the time required for hashing depends only on the
time required to compute the hash function h. But identifiers sharing a
first character cause a large number of collisions and overflows!
Hash Function
• A hash function, h, transforms an identifier, x, into
a bucket address in the hash table.
• Ideally, the hash function should be both easy to
compute and result in very few collisions.
• Also because the size of the identifier space, T, is
usually several orders of magnitude larger than the
number of buckets, b, and s is small, overflows
necessarily occur. Hence, a mechanism to handle
overflow is needed.
Hash Function (Cont.)
• Generally, a hash function should not
produce a biased use of the hash table.
• With a uniform hash function, a random x
has an equal chance of hashing into any
of the b buckets.
Mid-Square
• Mid-Square function, hm, is computed by
squaring the identifier and then using an
appropriate number of bits from the middle
of the square to obtain the bucket address.
• Since the middle bits of the square usually
depend on all the characters in the
identifier, different identifiers are
expected to result in different hash
addresses with high probability.
Division
• Another simple hash function is using the
modulo (%) operator.
• An identifier x is divided by some number M
and the remainder is used as the hash
address for x.
• The bucket addresses are in the range of 0
through M-1.
• If M is a power of 2, then hD(x) depends
only on the least significant bits of x.
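A minimal sketch of division hashing, assuming the identifier is first folded into an unsigned integer (the radix-64 style fold below is an illustrative choice, and M is assumed to be a prime table size).

unsigned int HashD(const char* x, unsigned int M)
{
    unsigned int h = 0;
    for (; *x; x++)
        h = h * 64 + (unsigned char)*x;  // fold characters with radix 64
    return h % M;                        // bucket address in 0 .. M-1
}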
Division (Cont.)
• Suppose x and y are permutations of each other, say x = x₁x₂ and
  y = x₂x₁, with each character encoded as C(·) using radix 64. If p = 3,
  then
  (f_D(x) − f_D(y)) % p = (64·C(x₁) + C(x₂) − 64·C(x₂) − C(x₁)) % p
                        = (63·C(x₁) − 63·C(x₂)) % p = 0,
  since 63 % 3 = 0, so x and y hash to the same bucket.
Division (Cont.)
• Programs in which many variables are
permutations of each other would again
result in a biased use of the table and hence
in many collisions.
– In the previous example, 64 % 3 = 1 and 64 % 7 = 1.
• To avoid the above problem, M needs to be a
prime number. Then, the only factors of M
are M and 1.
Folding
• The identifier x is partitioned into several
parts, all but the last being of the same
length.
• All partitions are added together to obtain
the hash address for x.
– Shift folding: different partitions are added
together to get h(x).
– Folding at the boundaries: identifier is folded
at the partition boundaries, and digits falling
into the same position are added together to
obtain h(x). This is similar to reversing every
other partition and then adding.
Example 8.2
• x = 12320324111220 is partitioned into parts three decimal digits long:
  P1 = 123, P2 = 203, P3 = 241, P4 = 112, P5 = 20.
• Shift folding:
  h(x) = Σ_{i=1}^{5} Pi = 123 + 203 + 241 + 112 + 20 = 699
• Folding at the boundaries: every other partition is reversed
  (P2 → 302, P4 → 211) before adding:
  h(x) = 123 + 302 + 241 + 211 + 20 = 897
Digit Analysis
• This method is useful for a static file in which all
the identifiers in the table are known in advance.
• Each identifier x is interpreted as a number using
some radix r.
• The digits of each identifier are examined.
• Digits having most skewed distributions are
deleted.
• Enough digits are deleted so that the number of
remaining digits is small enough to give an address
in the range of the hash table.
Overflow Handling
• There are two ways to handle
overflow:
– Open addressing
– Chaining
Open Addressing
• Assumes the hash table is an array
• The hash table is initialized so that
each slot contains the null identifier.
• When a new identifier is hashed into a
full bucket, find the closest unfilled
bucket. This is called linear probing or
linear open addressing
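A sketch of search under linear probing; the empty-slot marker -1 and the function name are illustrative assumptions, not part of the slides.

// Search a table ht of size b for x, where hx = h(x); empty slots hold -1.
// Returns the slot containing x, or -1 if x is not in the table.
int LinearSearch(const int* ht, int b, int x, int hx)
{
    for (int j = 0; j < b; j++) {
        int slot = (hx + j) % b;         // wrap around the table
        if (ht[slot] == x) return slot;  // found x
        if (ht[slot] == -1) return -1;   // empty slot: x is not present
    }
    return -1;                           // every slot examined: table full
}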
Example 8.3
• Assume a 26-bucket table with one slot per bucket and the
  following identifiers: GA, D, A, G, L, A2, A1, A3, A4, Z, ZA, E.
  Let the hash function h(x) = first character of x.
• When entering G, G collides with GA and is entered at ht[7]
  instead of ht[6].
• Resulting table: ht[0] = A, ht[1] = A2, ht[2] = A1, ht[3] = D,
  ht[4] = A3, ht[5] = A4, ht[6] = GA, ht[7] = G, ht[8] = ZA,
  ht[9] = E, ht[11] = L, ht[25] = Z.
Open Addressing (Cont.)
• When linear open address is used to handle
overflows, a hash table search for identifier
x proceeds as follows:
– compute h(x)
– examine identifiers at positions ht[h(x)], ht[h(x)
+1], …, ht[h(x)+j], in this order, until one of the
following conditions holds:
• ht[h(x)+j]=x; in this case x is found
• ht[h(x)+j] is null; x is not in the table
• We return to the starting position h(x); the table is
full and x is not in the table
Linear Probing
(Figure: the completed table from Example 8.3, ending with ht[25] = Z.)
Quadratic Probing
• One of the problems of linear open addressing is
that it tends to create clusters of identifiers.
• These clusters tend to merge as more identifiers
are entered, leading to bigger clusters.
• A quadratic probing scheme reduces the growth of
clusters. A quadratic function of i is used as the
increment when searching through buckets.
• Perform the search by examining buckets h(x),
(h(x)+i²)%b, and (h(x)−i²)%b for 1 ≤ i ≤ (b−1)/2.
• When b is a prime number of the form 4j+3, for j
an integer, the quadratic search examines every
bucket in the table.
Rehashing
• Another way to control the growth of
clusters is to use a series of hash
functions h1, h2, …, hm. This is called
rehashing.
• Buckets hi(x), 1 ≤ i ≤ m are examined in
that order.
Chaining
• We have seen that linear probing performs poorly because the
search for an identifier involves comparisons with identifiers
that have different hash values.
– e.g., the search for ZA involves comparisons with the buckets
ht[0] – ht[7], none of which can possibly collide with ZA.
• Unnecessary comparisons can be avoided if all the synonyms
are put in the same list, with one list per bucket.
• As the size of the list is unknown beforehand, it is best to
use a linked chain (see the sketch below).
• Each chain has a head node. Head nodes are stored
sequentially.
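A sketch of a chained table under these assumptions; the ChainNode type and helper names are illustrative.

struct ChainNode { int key; ChainNode* link; };

// Search the chain headed at ht[hx] for x, where hx = h(x).
bool ChainSearch(ChainNode* const* ht, int b, int x, int hx)
{
    for (ChainNode* p = ht[hx]; p; p = p->link)  // walk the synonym chain
        if (p->key == x) return true;
    return false;
}

// Insert x at the front of its chain; all synonyms share one list.
void ChainInsert(ChainNode** ht, int hx, int x)
{
    ht[hx] = new ChainNode{x, ht[hx]};
}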
Hash Chain Example
ht
0 A4 A3 A1 A2 A 0
1 0
2 0
3 D 0
4 E 0
5 0
6 G GA 0
7 0
8 0
Average search length is (6*1+3*2+1*3+1*4+1*5)/12
9 0 =2
10 0
11 L 0
25 ZA Z 0
Chaining (Cont.)
• The expected number of identifier comparisons can
be shown to be ≈ 1 + α/2, where α is the loading
density n/b (b = number of head nodes). For α = 0.5
it is 1.25, and if α = 1 it is about 1.5.
• Another advantage of this scheme is that only the
b head nodes must be sequential and reserved at
the beginning.
• The scheme only allocates other nodes when they
are needed. This could reduce the overall space
requirement for some loading densities, despite
the space needed for the links.
Hash Functions
• Theoretically, the performance of a hash table
depends only on the method used to handle
overflows and is independent of the hash function,
as long as a uniform hash function is used.
• In reality, there is a tendency to make a biased use
of identifiers.
• Many identifiers in use have a common suffix or
prefix or are simple permutations of other
identifiers.
– Therefore, different hash functions would give different
performance.
Average Number of Bucket Accesses Per Identifier Retrieved
(Table fragment: only the shift-fold row survived extraction: 1.33, 21.75, 1.48, 65.10, 1.40, 77.01, 1.51, 118.57.)
Dynamic Hashing
Identifier  Binary representation
A0          100 000
A1          100 001
B0          101 000
B1          101 001
C0          110 000
C1          110 001
C2          110 010
C3          110 011
C5          110 101
• Now put these identifiers into a table of four pages. Each page can hold
at most two identifiers, and each page is indexed by two-bit sequence
00, 01, 10, 11.
• Now place A0, B0, C2, A1, B1, and C3 in a binary tree, called a trie, which
branches based on the least significant bit at the root. If the bit is 0, the
upper branch is taken; otherwise, the lower branch is taken. Repeat this
with the next least significant bit at the next level.
A Trie To Hold Identifiers
(Figures: (a) a two-level trie on four pages holding A0/B0, C2, A1/B1, and C3; (b) after inserting C5, the page holding A1/B1 overflows and splits on a third bit; (c) after inserting C1, a further split is needed.)
Issues of Trie Representation
• From the example, we find two major
factors that affect the retrieval
time:
– Access time for a page depends on the
number of bits needed to distinguish the
identifiers.
– If identifiers have a skewed distribution,
the tree is also skewed.
Extensible Hashing
• Fagin et al. present a method, called extensible
hashing, for solving the above issues.
– A hash function is used to avoid skewed distribution. The
function takes the key and produces a random set of
binary digits.
– To avoid long search down the trie, the trie is mapped to a
directory, where a directory is a table of pointers.
– If k bits are needed to distinguish the identifiers, the
directory has 2^k entries indexed 0, 1, …, 2^k − 1.
– Each entry contains a pointer to a page.
Trie Collapsed Into Directories
(Figure omitted: the trie mapped onto directories of 4, 8, and 16 pointers (k = 2, 3, 4); several directory entries point at the same page, such as the pages holding A0/B0, A1/B1, C2, C3, and C5.)
(Figure: the two-level trie shown next to the equivalent two-bit directoryless table; entries 00, 01, 10, 11 lead to the pages holding A0/B0, C2, A1/B1, and C3.)
Directoryless Scheme Overflow Handling
(Figure: inserting C5 chains an overflow page to the page holding A1 and B1; the table then grows page by page, and identifiers, including those on overflow pages, are rehashed into the new pages.)
Figure 8.14 The rth Phase of Expansion of Directoryless Method
(Figure: q pages already split, the pages not yet split, and the pages added so far; the phase begins with 2^r pages.)
Analysis of Directoryless
Hashing
• The advantage of this scheme is that for many
retrievals the time is one access for those
identifiers that are in the page directly addressed
by the hash function.
• The problem is that for others, substantially more
than two accesses might be required as one moves
along the overflow chain.
• Also when a new page is added and the identifiers
split across the two pages, all identifiers including
the overflows are rehashed.
• Hence, the space utilization is not good, about 60%.
(shown by Litwin).
Chap 9
Priority Queues
Operations supported by a priority queue
(Figure: two extended binary trees, one with internal nodes A-F and one with G-J; they are used to illustrate shortest(x) below.)
Leftist Trees (Cont.)
• Let x be a node in an extended binary tree. Let
LeftChild(x) and RightChild(x), respectively,
denote the left and right children of the internal
node x.
• Define shortest(x) to be the length of a shortest path
from x to an external node. It is easy to see that
shortest(x) satisfies the following recurrence:
  shortest(x) = 0, if x is an external node;
  shortest(x) = 1 + min{ shortest(LeftChild(x)),
                         shortest(RightChild(x)) }, otherwise.
Shortest(x) of Extended Binary Trees
(Figure: the two trees above with shortest values beside the nodes: 2 at A and G, 2 at B, 1 at C, H, I, and 1 at the leaves D, E, F, J.)
Leftist Tree Definition
Definition: A leftist tree is a binary tree such
that if it is not empty, then
shortest(LeftChild(x)) ≥
shortest(RightChild(x)) for every internal
node x.
Lemma 9.1: Let x be the root of a leftist tree
that has n (internal) nodes.
(a) n ≥ 2^shortest(x) − 1
(b) The rightmost root-to-external-node path is
the shortest root-to-external-node path. Its
length is shortest(x).
Class Definition of A Leftist
Tree
template<class KeyType>
class MinLeftistTree : public MinPQ<KeyType> {
public:
    // constructor
    MinLeftistTree(LeftistNode<KeyType>* init = 0) : root(init) {}
    // the three min-leftist tree operations
    void Insert(const Element<KeyType>&);
    Element<KeyType>* DeleteMin(Element<KeyType>&);
    void MinCombine(MinLeftistTree<KeyType>*);
private:
    LeftistNode<KeyType>* MinUnion(LeftistNode<KeyType>*, LeftistNode<KeyType>*);
    LeftistNode<KeyType>* root;
};
Definition of A Min (Max) Leftist Tree
(Figure: two min leftist trees with shortest values beside the nodes; one contains the keys 2, 7, 50, 11, 80, 13, 20 and the other the keys 2, 9, 8, 12, 10, 18, 15.)
Min (Max) Leftist Tree (Cont.)
• Like any other tree structures, the popular
operations on the min (max) leftist trees are
insert, delete, and combine.
• The insert and delete-min operations can
both be done by using the combine operation.
– e.g., to insert an element x into a min leftist tree,
we first create a min leftist tree that contains
the single element x. Then we combine the two
min leftist trees.
– To delete the min element from a nonempty min
leftist tree, we combine the min leftist trees
root->LeftChild and root->RightChild and delete
the node root.
Combine Leftist Trees
• To combine two leftist trees:
– First, a new binary tree containing all elements in
both trees is obtained by following the rightmost
paths in one or both trees.
– Next, the left and right subtrees of nodes are
interchanged as necessary to convert this binary
tree into a leftist tree.
• The complexity of combining two leftist
trees is O(log n)
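A hedged sketch of the combine (union) step, assuming a node type carrying key, shortest, and child pointers; names are illustrative rather than the book's exact MinUnion.

struct LNode { int key, shortest; LNode *left, *right; };

LNode* MinUnion(LNode* a, LNode* b)
{
    if (!a) return b;
    if (!b) return a;
    if (b->key < a->key) { LNode* t = a; a = b; b = t; } // smaller key stays root
    a->right = MinUnion(a->right, b);    // merge along the rightmost path
    if (!a->left || a->left->shortest < a->right->shortest) {
        LNode* t = a->left;              // swap children to restore the
        a->left = a->right;              // leftist property
        a->right = t;
    }
    a->shortest = (a->right ? a->right->shortest : 0) + 1;
    return a;
}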
Combining The Min Leftist Trees
(Figure omitted: the step-by-step combination of two min leftist trees with roots 2 and 5; the rightmost paths are merged and children are swapped wherever the leftist property would be violated.)
Binomial Heaps
• A binomial heap is a data structure that
supports the same functions as those
supported by leftist trees.
• Unlike leftist trees, where an individual
operation can be performed in O(log n) time,
it is possible that certain individual
operations performed on a binomial heap may
take O(n) time.
• By amortizing part of the cost of expensive
operations over the inexpensive ones, the
amortized complexity of an individual
operation is either O(1) or O(log n),
depending on the type of operation.
Cost Amortization
• Given a sequence of operations I1, I2, D1, I3, I4, I5, I6, D2,
I7. Assume each insert operation costs one time unit and D1
and D2 operations take 8 and 10 time units, respectively.
• The total cost to perform the sequence of operations is 25.
• If we charge some actual cost of an operation to other
operations, this is called cost amortization.
• In this example, the amortized cost of I1 – I6 is 2 time
units each, I7 costs one, and D1 and D2 cost 6 each.
• Now suppose we can prove that no matter what sequence of
insert and delete-min operations is performed, we can charge
costs in such a way that the amortized cost of each insertion is
no more than 2 and that of each deletion is no more than 6. We
can claim that the sequence of insert/delete-min operations
has cost no more than 2*i + 6*d.
• With the actual cost, we conclude that the sequence cost is no
more than i+10*d.
• Combining the above two bounds, we obtain min{2*i+6*d,
i+10*d}.
Binomial Heaps
• Binomial heaps have min binomial heap and max binomial
heap.
• We refer to the min binomial heap as B-heap.
• B-heap can perform an insert and a combine operation
in O(1) actual and amortized time and a delete-min
operation with O(log n) amortized time.
• A node in a B-heap has the following data members:
– degree: the number of children it has
– child: a pointer to any one of its children; all the children
form a circular list
– link: used to maintain the singly linked circular list of
siblings
– data
• The roots of the min trees that comprise a B-heap are
linked to form a singly linked circular list. The B-heap
is then pointed at by a single pointer min to the min
tree root with smallest key.
B-Heap Example
(Figure: a B-heap whose min-tree roots 8, 3, and 1 are on a circular list, with min pointing at the tree rooted at 1; the trees contain keys such as 12, 7, 16, 10, 5, 4, 15, 30, 9, 6, 20.)
Insertion Into A B-Heap
• An element x can be inserted into a B-heap
by first putting x into a new node and then
inserting this node into the circular list
pointed at by min. The operation is done in
O(1) Time.
• To combine two nonempty B-heaps, combine
the top circular lists of each into a single
circular list.
• The new combined B-heap pointer is the min
pointer of one of the two trees, depending on
which has the smaller key.
• Again the combine operation can be done in
O(1) time.
The Tree Structure After Deletion of Min From B-Heap
(Figure: after deleting the min element 1, its subtrees 12, 7, and 16 join the remaining min trees 8 and 3 on the root list.)
Deletion of Min Element
• If min is 0, then the B-heap is empty. No
delete operations can be performed.
• If min is not 0, the node it points at is deleted
from the circular list. The new B-heap consists of
the remaining min trees and the subtrees of the
deleted root.
• To form the new B-heap, min trees with the
same degrees are joined in pairs. The min
tree whose root has the larger key becomes
the subtree of the other min tree.
Joining Two Degree-One Min Trees
(Figure: the degree-one trees rooted at 7 and 3 are joined; since 7 > 3, the tree rooted at 7 becomes a subtree of 3.)
Joining Two Degree-Two Min Trees
(Figure: the degree-two trees rooted at 12 and 3 are joined the same way, and min now points at the root 3.)
Program 9.12 Steps In A Delete-Min
Operation
template<class KeyType>
Element<KeyType>*Binomial<KeyType>::DeleteMin(Element<KeyType>&
x)
(Figure: the min trees before and after the delete-min operation; trees of equal degree are joined in pairs.)
Decrease Key
• To decrease the key in node b, do the
following:
(1) Reduce the key in b
(2) If b is not a min tree root and its key is smaller
than that in its parent, then delete b from its
doubly linked list and insert it into the doubly
linked list of min tree roots.
(3) Change min to point to b if the key in b is
smaller than that in min.
F-Heap After The Reduction of 15 by 4
(Figure: key 15 becomes 11; its node is cut from its parent 12 and added to the min-tree root list, which now holds the roots 8, 3, 1, and 11.)
Cascading Cut
• Because of the new delete and decrease-key
operations, an F-heap is not necessarily a
binomial tree. Therefore, the analysis of
Theorem 9.1 is no longer true for F-heaps
if no restructuring is done.
• To ensure that each min tree of degree k
has at least c^k nodes, for some c, c > 1, each
delete and decrease-key operation must
be followed by a particular step called a
cascading cut.
• The data member ChildCut is used to
assist the cascading cut step.
• The ChildCut data member is used only for
nodes that are not min-tree roots.
Cascading Cut (Cont.)
• ChildCut of node x is TRUE iff one of the children of
node x was cut off after the most recent time x was
made the child of its current parent.
• Whenever a delete or decrease-key operation deletes
a node q that is not a min tree root from its doubly
linked list, then the cascading cut step is invoked.
• During this step, we examine the nodes on the path
from the parent p of the deleted node q up to the nearest
ancestor of the deleted node with ChildCut = FALSE.
• If there is no such ancestor, then the path goes from p
to the root of the min tree containing p.
• All nonroot nodes on this path with ChildCut data
member TRUE are deleted from their respective
doubly linked list and added to the doubly linked list of
min tree root nodes of the F-heap.
• If the path has a node with ChildCut set to FALSE,
then it is changed to TRUE.
A Cascading Cut Example
(Figure: decreasing a key detaches a subtree; walking up from its parent, every node with ChildCut = TRUE is also cut and added to the root list, and the first ancestor found with ChildCut = FALSE has it set to TRUE.)
F-Heap Analysis
Min-Max Heap
(Figure: a min-max heap; min level {7}, max level {70, 40}, min level {30, 9, 10, 15}, max level {45, 50, 30, 20, 12}.)
Min-Max Heap (Cont.)
(Figure: the same heap with j, the position for the next insertion, at the end of the last level.)
Min-Max Heap After Inserting Key 5
(Figure: 5 becomes the new root; min levels {5} and {30, 9, 7, 15}, max levels {70, 40} and {45, 50, 30, 20, 12, 10}.)
Min-Max Heap After Inserting Key 80
(Figure: 80 rises to a max level; min levels {7} and {30, 9, 10, 15}, max levels {70, 80} and {45, 50, 30, 20, 12, 40}.)
Program 9.3 Insertion Into A Min-Max
Heap
template <class KeyType>
void MinMaxHeap<KeyType>::Insert(const Element<KeyType>& x)
// insert x into the min-max heap
{
if (n==MaxSize) {MinMaxFull(); return;}
n++;
int p =n/2; // p is the parent of the new node
if(!p) {h[1] = x; return;} // insert into an empty heap
switch(level(p)) {
case MIN:
if (x.key < h[p].key) { // follow min levels
h[n] = h[p];
VerifyMin(p, x);
}
else { VerifyMax(n, x); } // follow max levels
break;
case MAX:
if (x.key > h[p].key) { // follow max levels
h[n] = h[p];
VerifyMax(p, x);
}
else { VerifyMin(n, x); } // follow min levels
break;
}
}
Program 9.4 Searching For The Correct Max Node For Insertion
(Figure: a min-max heap with min level {12}, max level {70, 40}, min level {30, 9, 10, 15}, max level {45, 50, 30, 20}.)
Deletion of the Min Element
(Cont.)
• When delete the smallest key from the min-max heap, the
root has the smallest key (key 7). So the root is deleted.
• The last element, with key 12, is also deleted from the min-max
heap and then reinserted into it. Two cases must be
considered:
– The root has no children. In this case x is to be inserted into
the root.
– The root has at least one child. Now the smallest key in the
min-max heap is in one of the children or grandchildren of the
root. Assume node k has the smallest key, then following
conditions must be considered:
• x.key ≤ h[k].key: x may be inserted into the root.
• x.key > h[k].key and k is a child of the root: since k is a max node, it
has no descendants with key larger than h[k].key, so node k has
no descendants with key larger than x.key. The element h[k]
may be moved to the root, and x can be inserted into node k.
• x.key > h[k].key and k is a grandchild of the root: h[k] is moved to the
root. Let p be the parent of k. If x.key > h[p].key, then h[p] and x are
to be interchanged.
Min-Max Heap After Deleting Min Element
(Figure: after deleting 7 and reinserting 12: min levels {9} and {30, 12, 10, 15}, max levels {70, 40} and {45, 50, 30, 20}.)
Deaps
• A deap is a double-ended heap that
supports the double-ended priority
operations insert, delete-min, and
delete-max.
• Similar to min-max heap but deap is
faster on these operations by a
constant factor, and the algorithms are
simpler.
Deaps (Cont.)
• Definition: A deap is a complete binary tree that
is either empty or satisfies the following
properties:
(1) The root contains no element
(2) The left subtree is a min heap.
(3) The right subtree is a max heap.
(4) If the right subtree is not empty, then let i be
any node in the left subtree. Let j be the
corresponding node in the right subtree. If such
a j does not exist, then let j be the node in the
right subtree that corresponds to the parent
of i. The key in node i is less than or equal to
that of j.
A Deap Example
(Figure: the root contains no element; the min heap 5, 10, 8, 15, 19, 9 is on the left, and the max heap 45, 25, 40, 30, 20 is on the right.)
Deap (Cont.)
• From Deap’s definition, it’s obvious that for an n-element
deap, the min element is the root of min heap and the max
element is the root of the max heap.
• If n = 1, then the min and max elements are the same and
are in the root of the min heap.
• Since deap is a complete binary tree, it may be stored as an
implicit data structure in a one-dimension array similar to
min, max, min-max heaps.
• In the case of a deap, position 1 of the array is not used,
so an n-element deap occupies n + 1 elements of an array.
• If i is a node in the min heap of the deap, its corresponding
node in the max heap is j = i + 2^(⌊log₂ i⌋ − 1).
5 45
10 8 25 40
15 19 9 30 20 j
i
Figure 9.8: Deap Structure After Insertion of 4
(Figure: the min heap becomes 4, 5, 8, 15, 10, 9, 19; the max heap stays 45, 25, 40, 30, 20.)
Figure 9.8: Deap Structure After Insertion of 30
(Figure: the min heap stays 5, 10, 8, 15, 19, 9; the max heap becomes 45, 30, 40, 30, 20, 25.)
Program 9.7: Inserting Into A Deap
(Figure: a deap with min heap 8, 10, 9, 15, 19, 20, 30 and max heap 45, 25, 40.)
Part I – AVL Trees
Unbalanced Binary Search Tree
(Figure: an unbalanced binary search tree on the keys 1-12, rooted at 5.)
Number of comparisons needed to search for NOV: 6.
Average number of comparisons: 3.5
Skew Binary Search Tree
If the keys are entered in lexicographic order, the tree
skews into a chain: 1 -> 2 -> 3 -> … -> 8.
Binary Search Tree
Consider a balanced binary search tree as illustrated,
rooted at 6 with children 4 and 9.
Average number of comparisons: 3.1
Binary Search Trees:
Balanced vs. Unbalanced
The average and maximum search times are
minimized if the binary search tree is
maintained as a complete binary tree at
all times.
The search time becomes O(log n) for an n-node binary
search tree.
AVL tree (1962)
Balanced binary search tree with respect to the
heights of subtrees.
Any retrievals can be performed in O(log n)
Height-Balanced
Definition
An empty tree is height-balanced.
If T is nonempty binary tree with TL and TR as
its left and right subtrees respectively.
T is height-balanced iff
• TL and TR are height-balanced, and
• |h_L − h_R| ≤ 1, where h_L and h_R are the heights of T_L and
T_R, respectively.
Examples
(Figure: the unbalanced tree and the balanced tree from the previous slides, with each node labeled by its balance factor; the balanced tree is height-balanced, the skewed one is not.)
Construction of an AVL Tree
Consider inserting the numbers 8, 9, 10, 2, 1, 5, 3, 6, 4, 7, 11, 12 in
that order. At each step, rebalancing (if any) is done at the nearest
ancestor A whose balance factor becomes ±2:

Insert 8:  the tree is just 8.
Insert 9:  9 becomes the right child of 8; no rotation needed.
Insert 10: 10 lands in the right subtree of the right subtree of A = 8,
           so an RR rotation makes 9 the root, with children 8 and 10.
Insert 2:  2 becomes the left child of 8; no rotation needed.
Insert 1:  1 lands in the left subtree of the left subtree of A = 8, so
           an LL rotation makes 2 the root of that subtree, with
           children 1 and 8.
Insert 5:  5 lands in the right subtree of the left subtree of A = 9, so
           an LR rotation makes 8 the root, with left child 2
           (children 1, 5) and right child 9 (right child 10).
Insert 3:  no rotation needed.
Insert 6:  no rotation needed.
Insert 4:  4 lands in the left subtree of the right subtree of A = 2, so
           an RL rotation makes 3 the root of that subtree, with
           children 2 (child 1) and 5 (children 4, 6).
Insert 7:  7 lands in the right subtree of the left subtree of A = 8, so
           an LR rotation makes 5 the overall root, with left child 3
           (children 2, 4) and right child 8 (children 6 and 9, where 6
           has right child 7 and 9 has right child 10).
Insert 11: 11 lands in the right subtree of the right subtree of A = 9,
           so an RR rotation makes 10 the root of that subtree, with
           children 9 and 11.
Insert 12: 12 becomes the right child of 11; the tree stays balanced.
Rotation Types (1)
Suppose Y is the new node.
LL: Y is inserted in the left subtree of the left subtree of A.
(Diagram: B, the left child of A, becomes the root of the subtree; A
becomes B's right child, and B's old right subtree T_B becomes A's left
subtree.)
Rotation Types (2)
LR: Y is inserted in the right subtree of the left subtree of A.
(Diagram: C, the right child of B, becomes the root of the subtree, with
B as its left child and A as its right child; C's old subtrees T_CL and
T_CR become B's right and A's left subtrees.)
Rotation Types (3)
RR: Y is inserted in the right subtree of the right subtree of A.
(Diagram: the mirror image of LL; B, the right child of A, becomes the
root of the subtree, with A as its left child.)
Rotation Types (4)
RL: Y is inserted in the left subtree of the right subtree of A.
(Diagram: the mirror image of LR; C, the left child of B, becomes the
root of the subtree, with A as its left child and B as its right child.)
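The LL case translates directly into pointer manipulation. The sketch below uses a minimal illustrative node type (the class on the next slide adds data and a balance factor); RR is the mirror image, and updating balance factors is omitted.

struct Node { Node *leftChild, *rightChild; };

Node* RotateLL(Node* a)              // returns b, the new subtree root
{
    Node* b = a->leftChild;
    a->leftChild = b->rightChild;    // T_B, b's old right subtree, moves under a
    b->rightChild = a;               // a becomes b's right child
    return b;
}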
The Class Definition of AVL
Tree
class AvlNode {
friend class AVL;
public:
    AvlNode(int k) { data = k; bf = 0; leftChild = NULL; rightChild = NULL; }
private:
    int data;
    int bf;        // stores the balance factor of the node
    AvlNode *leftChild, *rightChild;
};

class AVL {
public:
    AVL() : root(0) {}
    bool Search(int key);
    bool Insert(int key);
    bool Delete(int key);
private:
    AvlNode *root;
};
Phase 1
bool AVL::Insert(int key)
{
if (!root)
{
root = new AvlNode(key);
return true;
}
}
// After a rotation, the new subtree root b must be reattached to pa,
// the parent of the old subtree root a:
if (pa == NULL)
    root = b;
else if (a == pa->leftChild)
    pa->leftChild = b;
else
    pa->rightChild = b;
return true;
}
(Diagram: an LL rotation at a = 9; b = 2 becomes the new subtree root and is linked back into place through pa.)
Data Structures
Chapter 11: Multiway Search Trees
m-way Search Trees
Definition: An m-way search tree either is
empty or satisfies the following properties:
(1) The root has at most m subtrees and has
the following structure:
n, A0, (K1, A1), (K2, A2), …, (Kn, An)
where the Ai, 0 ≤ i ≤ n < m, are pointers to
subtrees, and the Ki, 1 ≤ i ≤ n < m, are key
values.
(2) Ki < Ki+1, 1 ≤ i < n.
(3) Let K0 = −∞ and Kn+1 = +∞. All key values in
the subtree Ai are less than Ki+1 and greater
than Ki, 0 ≤ i ≤ n.
(4) The subtrees Ai, 0 ≤ i ≤ n, are also m-way
search trees.
A Three-way Search Tree
(Figure: tree T with root a holding keys 20, 40; children b (10, 15), c (25, 30), d (45, 50); node e (28) is a child of c.)
node   schematic format
a      2, b, (20, c), (40, d)
b      2, 0, (10, 0), (15, 0)
c      2, 0, (25, e), (30, 0)
d      2, 0, (45, 0), (50, 0)
e      1, 0, (28, 0)
Searching an m-way Search Tree
Suppose we search an m-way search tree T for
the key value x. By searching the keys of the
root, we determine i such that Ki ≤ x < Ki+1.
– If x = Ki, the search is complete.
– If x ≠ Ki, x must be in subtree Ai if x is in T
at all.
We proceed to search for x in subtree Ai and
continue the search until we find x or
determine that x is not in T.
Maximum number of nodes in a tree of degree m
and height h:
  Σ_{i=0}^{h−1} m^i = (m^h − 1)/(m − 1)
B-trees (1)
Definition: A B-tree of order m is an m-way
search tree that either is empty or satisfies
the following properties:
(1) 2 ≤ (number of children of the root) ≤ m
(2) For all nodes other than the root node and
external nodes:
⌈m/2⌉ ≤ (number of children) ≤ m
(3) All external nodes are at the same level.
(Figure: a 2-3-4 tree, i.e. a B-tree of order 4, with root 50, children holding 10, 20 and 70, 80, and leaf keys 5, 7, 8, 30, 40, 60, 75, 85, 90, 92.)
Insertion into a B-tree of Order 5
(Figure: a B-tree of order 5 with root keys 320, 540; inserting into a full node such as 380 382 395 406 412 splits it, and the middle key 395 moves up to its parent, which becomes 395 430 480.)
Deletion from a B-tree of Order 5
(i) Rotation: shift a key from the parent and a sibling.
(Figure: a deletion borrows a key through the parent node 80 120 150, which becomes 80 126 150.)
(ii) Combination: take a key from the parent and combine with a sibling.
(Figure: two children merge, and the parent 80 126 150 shrinks by one key.)
(iii) Combine, then rotate for its parent.
(Figure: with parent keys 60, 170 over children A-E holding 65 72 | 87 96 | 153 162 | 173 178 | 187 202, deleting 65 forces a combination followed by a rotation; the parent becomes 60, 180.)
(iv) Combine, then combine for its parent.
(Figure: with root keys 60, 180, 300 over node G, deleting 173 causes a double combination; the root becomes 60, 300 and G becomes 30 50 | 150 180 | 220 280.)
Deletion in a B-tree
The combination process may be done up to the
root.
If the root has more than one key,
Done
If the root has only one key
remove the root
The height of the tree decreases by 1.
(Figure: a B+-tree; index nodes A = (20, 40), B = (10), C = (30), D = (70, 80); the data nodes (leaves) hold 2 4 6 | 12 16 18 | 20 25 | 32 36 | 40 50 60 | 71 72 | 80 82 84.)
Insertion into the B+-tree
(a) Initial B+-tree: as in the figure above.
(b) Inserting 14 splits the leaf 12 16 18 into 12 14 and 16 18, and the key 16 is inserted into the index node B, giving B = (10, 16).
(c) Inserting 86 splits the leaf 80 82 84 into 80 82 and 84 86, with 84 inserted into D; D itself splits, and a new root G = (40) is created over A = (20) and F = (80).
Deletion from a B+-tree
(a) Initial B+-tree: as above.
(b) Deleting 71 leaves its leaf deficient; a key is borrowed from the sibling leaf 80 82 84, the leaf becomes 72 80, and the separating key in D becomes 82.
(c) Deleting 80 makes the leaf deficient again; this time it is combined with its sibling into 72 82 84, and D shrinks to the single key 70.
Deleting 32 (1)
(a) Initial B+-tree (root G = 40 over A = 20 and F = 80).
(b) Deleting 32 leaves node C deficient.
Deleting 32 (2)
(c) C borrows from its sibling B through the parent: key 16 moves up into A and key 20 moves down into C.
6 36 60
Deleting 86 (1)
(a) Initial B+-tree (root G = 40 over A = 20 and F = 80).
(b) Deleting 86 makes leaf E deficient; E is combined with its sibling.
Deleting 86 (2)
(c) The combination makes F deficient in turn.
(d) A, G, and F are combined into a single root A = (20, 40), reducing the height of the tree by one.