sorting_bound
Comparison-Based Sorting
These algorithms sort by comparing pairs of values.
insertion_sort, mergesort, and quicksort are all comparison-based sorting algorithms.
A comparison compares two values, e.g., is A[0] < A[1]? Is A[0] < A[4]?
Recall insertion sort:
4 3 1 5 2    Is 3 < 4?
...
3 4 1 5 2    Is 1 < 4? Is 1 < 3?
...
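As a sketch, we can instrument insertion sort to count the comparisons it makes; the function name and counter are illustrative, not part of the slides.

```python
def insertion_sort_counting(a):
    """Insertion sort that also counts its key comparisons."""
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        # Shift larger elements right; each "<" test is one comparison.
        while j > 0:
            comparisons += 1
            if a[j] < a[j - 1]:
                a[j], a[j - 1] = a[j - 1], a[j]
                j -= 1
            else:
                break
    return a, comparisons

result, count = insertion_sort_counting([4, 3, 1, 5, 2])
```

On the slide's input [4, 3, 1, 5, 2], the first two passes make exactly the comparisons shown above (3 < 4, then 1 < 4 and 1 < 3).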
Theorem: Any deterministic comparison-based sorting algorithm requires Ω(n log n) time.
Proof:
Hmm …
We can represent the comparisons made by a comparison-based sorting algorithm as a decision tree.
Suppose we want to sort three items in A.
[Figure: a decision tree. Each internal node is one comparison; each leaf is a final ordering, e.g. A[0], A[1], A[2] or A[1], A[0], A[2].]
Since every one of the n! possible orderings must appear at some leaf, the tree has at least n! leaves, so its depth (the worst-case number of comparisons) is at least log2(n!) = Ω(n log n).
Counting sort
Runtime: O(n + k)
Suppose A consists of 8 ints ranging from 0 to 3.
counting_sort(A, 4)
A       0 0 3 1 1 3 1 0
First pass: scan A, incrementing counts[A[i]] at each step:
counts  1 0 0 0 → 2 0 0 0 → 2 0 0 1 → 2 1 0 1 → 2 2 0 1 → 2 2 0 2 → 2 3 0 2 → 3 3 0 2
Second pass: for each key in order, append it counts[key] times:
result  0 0 0 → 0 0 0 1 1 1 → 0 0 0 1 1 1 3 3
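The trace above can be written as runnable Python. This minimal sketch sorts plain ints (no attached values), matching the slides' example:

```python
def counting_sort(a, k):
    """Sort a list of ints in range 0..k-1 in O(n + k) time."""
    counts = [0] * k
    for x in a:              # first pass: tally each key
        counts[x] += 1
    result = []
    for key in range(k):     # second pass: emit keys in order
        result.extend([key] * counts[key])
    return result

counting_sort([0, 0, 3, 1, 1, 3, 1, 0], 4)  # → [0, 0, 0, 1, 1, 1, 3, 3]
```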
Bucket sort
algorithm bucket_sort(A, k, num_buckets):
    # A consists of n (key, value) pairs,
    # with keys ranging from 0 to k-1
    buckets = [[] for _ in range(num_buckets)]
    for key, value in A:
        buckets[get_bucket(key)].append((key, value))
    if num_buckets < k:
        for bucket in buckets:
            stable_sort(bucket) by key
    result = concatenation of the buckets, in order
    return result
[Figure: example keys distributed across buckets, e.g. 0, 1 | 12, 13, 15, 16, 17 | 27, 28.]
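The pseudocode above can be sketched as runnable Python. Here get_bucket is realized as equal-width key ranges (an assumption; the slides leave it abstract), and Python's stable list.sort stands in for stable_sort:

```python
def bucket_sort(a, k, num_buckets):
    """Sort (key, value) pairs with keys in 0..k-1, stably."""
    width = -(-k // num_buckets)  # ceiling division: keys per bucket
    buckets = [[] for _ in range(num_buckets)]
    for key, value in a:
        buckets[key // width].append((key, value))  # get_bucket(key)
    if num_buckets < k:
        for bucket in buckets:
            bucket.sort(key=lambda pair: pair[0])   # stable sort by key
    result = []
    for bucket in buckets:
        result.extend(bucket)
    return result
```

Because list.sort is stable and pairs enter the buckets in input order, equal keys keep their original relative order in the output.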
Bucket sort, case (2)
Why O(n log n) in case (2)?
With multiple keys per bucket, a single bucket might receive all of the inserted keys.
Suppose the bucket_sort caller specifies k = 3000 and num_buckets = 10, but then inserts elements that all fall in the same bucket:
A = [380, 370, 340, 320, 410, …] would require stable_sort to sort all of the elements in the original list, since they all land in the same bucket.
[Figure: buckets covering 0-299, 300-599, 600-899, …, 2700-2999; the keys 380, 370, 340, 320, 410, … all land in the 300-599 bucket.]
Radix sort
algorithm radix_sort(A, d, k):
    # A consists of n d-digit ints, with
    # digits ranging from 0 to k-1
    for j = 0 to d-1:
        A_j = A converted to (key, value) pairs, where
              key is the jth digit of value
        result = bucket_sort(A_j, k, k)
        A = result
    return A

Runtime: O(d(n + k))
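The pseudocode above can be sketched as runnable Python. Rather than calling out to bucket_sort, this version inlines one stable bucket pass per digit; the digit helper is ours, not from the slides:

```python
def digit(x, j, k):
    """The jth-least-significant base-k digit of x."""
    return (x // k**j) % k

def radix_sort(a, d, k):
    """Sort n d-digit ints with base-k digits in O(d(n + k)) time."""
    for j in range(d):
        buckets = [[] for _ in range(k)]
        for x in a:                      # one stable pass on digit j
            buckets[digit(x, j, k)].append(x)
        a = [x for bucket in buckets for x in bucket]
    return a

radix_sort([31, 5, 210, 14, 125], 3, 10)  # → [5, 14, 31, 125, 210]
```

Stability of each pass is what makes the least-significant-digit-first order correct: ties on digit j preserve the order established by digits 0..j-1.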
Radix sort
Suppose A consists of 8 3-digit ints, with digits ranging from 0 to 9.
radix_sort(A, 3, 10)
j = 0:  A_j = (1, 031) (5, 005) (0, 210) (4, 014) … (5, 125)
j = 1:  A_j = (1, 210) (3, 031) (1, 014) (0, 005) … (7, 477)
j = 2:  A_j = (0, 005) (2, 210) (0, 014) (1, 125) … (0, 095)
[Table: operation runtimes for the structures so far (O(1), O(n), O(log n)); Insert/Delete given a pointer to the element, otherwise O(n) to search for it.]
Tree Terminology
[Figure: a tree with root 5; 5's children are 3 and 6; 3's children are 2 and 4; 6's child is 7; 2's child is 1.]
Each circle is a "vertex" or "node," and it has a key (e.g. key 3).
The vertex without a parent is the "root" (here, 5).
The left child of 3 is 2; the right child of 3 is 4.
The parent of 1 is 2; the ancestors of 1 are 2, 3, and 5.
Binary Search Trees
Binary Trees are trees in which each vertex has at most 2 children.
Binary Search Trees are Binary Trees such that:
Every left descendant of a vertex has a key less than that vertex's key.
Every right descendant of a vertex has a key greater than that vertex's key.
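The definition above can be sketched in Python. This is a minimal, hypothetical Node class with insert and search, assuming distinct keys as on the slides:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the BST rooted at root; return the (new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def search(root, key):
    """Return the node with the given key, or None if absent."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root

root = None
for key in [5, 3, 6, 2, 4, 7, 1]:
    root = insert(root, key)
```

Both operations walk one root-to-leaf path, so their cost is proportional to the depth of the tree.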
Building BSTs by Example
Insertion order: 7 4 1 5 3 6 2
Insertion order: 4 1 3 2 5 7 6
[Figures: the BSTs built by inserting the keys in each order. The same set of keys 1-7 can form many different valid BSTs, depending on insertion order.]
[Figures: search, insert, and delete on a BST with root 5 (e.g. inserting a new key 3.1, deleting a vertex x).]
Runtime Analysis
[Figure: a balanced BST on keys 1-7 with root 5, next to a degenerate chain 5 → 4 → 2 → 1.]
But the chain is also a valid BST, and the depth of the tree is n, resulting in a runtime of O(n) for search.
In what order would we need to insert vertices to generate this tree?
What To Do?
We could keep track of the depth of the tree. If it gets too tall,
re-do everything from scratch.
At least Ω(n) every so often …
Yoink!
[Figure: rotating the edge between X and Y. Before: X is the root, with left child Y and subtrees A, B, C. After: Y is the root, with right child X, and subtree B has moved from Y over to X. The intermediate picture, where Y briefly holds three subtrees, is not binary!]
Idea 1: Rotations
Maintain the BST property, and move some of the vertices
(but not all of them) around.
Yoink!
[Figure: a right rotation at the root. Before: root 6 with left child 4 (children 2 (1, 3) and 5) and right child 7. After: root 4 with children 2 (1, 3) and 6 (5, 7).]
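The rotation above can be sketched in a few lines of Python. The Node class and right_rotate helper here are ours; right_rotate returns the new subtree root:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def right_rotate(x):
    """Rotate right at x (x.left must exist); return the new root."""
    y = x.left
    x.left = y.right   # y's old right subtree moves under x
    y.right = x        # x becomes y's right child ("yoink")
    return y

# The slide's example: rotating at root 6 makes 4 the new root.
tree = Node(6, Node(4, Node(2, Node(1), Node(3)), Node(5)), Node(7))
tree = right_rotate(tree)
```

A rotation touches only O(1) pointers and preserves the BST ordering, since everything in B stays between Y and X.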
Idea 2: Proxy for Balance
Maintaining perfect balance is too difficult.
Instead, let’s determine some proxy for balance.
i.e., if the tree satisfies some property, then it’s “pretty balanced.”
We can maintain this property using rotations.
Red-Black Trees
There exist many ways to achieve this proxy for balance, but
here we’ll study the red-black tree.
1. Every vertex is colored red or black.
2. The root vertex is a black vertex.
3. A NIL child is a black vertex.
4. The child of a red vertex must be a black vertex.
5. For all vertices v, all paths from v to its NIL descendants have the same
number of black vertices.
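To make the five rules concrete, here is a sketch of a rule checker in Python. The RBNode class and function names are ours; None stands in for NIL and is treated as black (rule 3):

```python
RED, BLACK = "red", "black"

class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def check(node):
    """Return the black count down to NIL if rules 3-5 hold, else raise."""
    if node is None:
        return 1  # a NIL child counts as one black vertex (rule 3)
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("rule 4: red vertex with a red child")
    lh, rh = check(node.left), check(node.right)
    if lh != rh:
        raise ValueError("rule 5: unequal black counts")
    return lh + (1 if node.color == BLACK else 0)

def is_red_black(root):
    if root is not None and root.color != BLACK:
        return False  # rule 2: the root must be black
    try:
        check(root)
        return True
    except ValueError:
        return False
```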
Red-Black Trees by Example
[Figures: small trees checked against rules 1-5. One example has a red root, violating rule 2; one has a red vertex with a red child, violating rule 4; one has root-to-NIL paths with different numbers of black vertices, violating rule 5.]
Red-Black Trees
Maintaining these properties maintains a “pretty balanced”
BST.
The black vertices are balanced.
The red vertices are “spread out.”
We can maintain this property as we insert/delete vertices,
using rotations.
[Figure: an example red-black tree on keys 1-8.]
Red-Black Trees
To see why a red-black tree is “pretty balanced,” consider how long one root-to-NIL path can be relative to another: the black vertices are balanced, and one path could be at most twice as long as the others if we pad it with red vertices.
[Figure: a path padded with alternating red vertices. Not actually a valid coloring; just used for demonstration purposes.]
Red-Black Trees
Lemma: Let b(x) be the black-height of x: the number of black vertices on any path from x (exclusive) down to a NIL. Then the number of non-NIL descendants of x (including x) is at least 2^b(x) - 1.
Proof:
To prove this statement, we proceed by induction.
For our base case, note that a NIL node has b(x) = 0 and at least 2^0 - 1 = 0
non-NIL descendants.
For our inductive step, let d(x) be the number of non-NIL descendants of x. Each child of x has black-height at least b(x) - 1. Then:
d(x) = 1 + d(x.left) + d(x.right)
     ≥ 1 + (2^(b(x)-1) - 1) + (2^(b(x)-1) - 1)
     = 2^b(x) - 1
Thus, the number of non-NIL descendants of x is at least 2^b(x) - 1. ◼
Red-Black Trees
Theorem: A Red-Black Tree has height h ≤ 2 log2(n+1) = O(log n).
Proof:
By our lemma, the number of non-NIL descendants of the root r is at least 2^b(r) - 1.
Notice that on any root-to-NIL path there are no two consecutive red vertices
(otherwise the tree violates rule 4), so the number of black vertices is at least
the number of red vertices. Thus, b(r) is at least half of the height. Then
n ≥ 2^b(r) - 1 ≥ 2^(h/2) - 1, and hence h ≤ 2 log2(n+1). ◼
insert in Red-Black Trees
algorithm rb_insert(root, key_to_insert):
    x = search(root, key_to_insert)
    v = new red vertex with key_to_insert
    if key_to_insert > x.key:
        x.right = v
    if key_to_insert < x.key:
        x.left = v
    fix things up, if necessary    # What does that mean?
insert in Red-Black Trees
What does “if necessary” mean? Suppose we want to insert(1).
[Figures: a tree with black root 5 and children 3 and 6, where 3 is red; 1 is inserted as 3's left child.]
If we insert 1 as a red vertex, its parent 3 is red, so the result violates rule 4.
What if we insert a black vertex instead? Then every path through 1 gains an extra black vertex, so the result violates rule 5.
So it seems we’re happy if the parent of the inserted vertex is black.
But there’s an issue if the parent of the inserted vertex is red.
insert in Red-Black Trees
algorithm rb_insert(root, key_to_insert):
    x = search(root, key_to_insert)
    v = new red vertex with key_to_insert
    if key_to_insert > x.key:
        x.right = v
        recolor(v)
    if key_to_insert < x.key:
        x.left = v
        recolor(v)
    if key_to_insert == x.key:
        return

Runtime: O(log n)
insert in Red-Black Trees
algorithm recolor(v):
    p = parent(v)
    if p.color == black:
        return
    # Below, assume p is the left child of its parent;
    # the mirrored case swaps left and right.
    grand_p = p.parent
    uncle = grand_p.right
    if uncle.color == red:
        p.color = black
        uncle.color = black
        grand_p.color = red
        recolor(grand_p)
    else:  # uncle.color == black
        p.color = black
        grand_p.color = red
        right_rotate(grand_p)  # yoink!

Runtime: O(log n)
Red-Black Trees
Since we can maintain the red-black properties in O(log n) time per operation,
insert, delete, and search all take O(log n) time.
YAY!