Introduction To Algorithms: Prof. Shafi Goldwasser
6.046
Lecture 6
Prof. Shafi Goldwasser
September 24, 2001
How fast can we sort?
All the sorting algorithms we have seen so far
are comparison sorts: only use comparisons to
determine the relative order of elements.
• E.g., insertion sort, merge sort, quicksort,
heapsort.
The best worst-case running time that we’ve
seen for comparison sorting is O(n lg n).
Q: Is O(n lg n) the best we can do?
A: Yes, as long as we use comparison sorts.
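Upper Bound:
$\exists A,\ \forall I,\ A(I) = P(I)$ and $\mathrm{Time}(A, I) \le T_{\mathrm{upper}}(|I|)$
Lower Bound:
$\forall A,\ \exists I,\ A(I) \ne P(I)$ or $\mathrm{Time}(A, I) \ge T_{\mathrm{lower}}(|I|)$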
How?
Decision trees help us.
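Decision-tree example
Sort a1, a2, …, an (tree shown for n = 3).
A node i:j compares ai with aj; the left branch is taken when ai ≤ aj,
the right branch when ai > aj. Each leaf is the resulting order.

                  1:2
               /       \
            2:3         1:3
           /   \       /   \
        123    1:3   213    2:3
              /   \        /   \
           132    312    231    321

class InsertionSortAlgorithm {
    // Sorts a in place; B holds the element being inserted.
    void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int B = a[i];
            int j = i;
            while ((j > 0) && (a[j-1] > B)) {
                a[j] = a[j-1];
                j--; }
            a[j] = B; }}}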
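The decision-tree view can be checked mechanically. The following Python sketch (Python is used only for illustration, and the function names are hypothetical) runs insertion sort on every permutation of three distinct keys and records the outcome of each comparison. Under these assumptions, the 3! = 6 inputs should follow 6 distinct root-to-leaf paths, and the longest path is the height of the tree.

```python
from itertools import permutations

def insertion_sort_trace(keys):
    """Sort a copy of keys; return (sorted list, tuple of comparison outcomes)."""
    a = list(keys)
    outcomes = []
    for i in range(1, len(a)):
        b = a[i]
        j = i
        while j > 0:
            greater = a[j - 1] > b      # one comparison = one internal tree node
            outcomes.append(greater)
            if not greater:
                break
            a[j] = a[j - 1]
            j -= 1
        a[j] = b
    return a, tuple(outcomes)

def decision_paths(n):
    """Map each input permutation of 1..n to its root-to-leaf path."""
    return {p: insertion_sort_trace(p)[1] for p in permutations(range(1, n + 1))}

paths = decision_paths(3)
height = max(len(path) for path in paths.values())
print(len(set(paths.values())), "distinct paths, height", height)
```

Two different permutations can never share a path, because the algorithm would then rearrange both inputs identically and at most one result could be sorted.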
Lower bound for decision-tree sorting
Theorem. Any decision tree that can sort n
elements must have height $\Omega(n \lg n)$.
Proof. The tree must contain at least $n!$ leaves, since
there are $n!$ possible permutations. A height-$h$
binary tree has at most $2^h$ leaves. Thus, $n! \le 2^h$, and so
$$
\begin{aligned}
h &\ge \lg(n!) && \text{(lg is monotonically increasing)} \\
  &\ge \lg\left((n/e)^n\right) && \text{(Stirling's formula)} \\
  &= n \lg n - n \lg e \\
  &= \Omega(n \lg n).
\end{aligned}
$$
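As a numerical sanity check on the bound, the short Python sketch below compares the exact value ceil(lg(n!)) with the weaker Stirling-based estimate n lg n - n lg e. The function names are illustrative and not part of the lecture.

```python
import math

def lower_bound_comparisons(n):
    """Smallest h with 2**h >= n!, i.e. ceil(lg(n!))."""
    return math.ceil(math.log2(math.factorial(n)))

def stirling_estimate(n):
    """The weaker bound n lg n - n lg e used in the proof."""
    return n * math.log2(n) - n * math.log2(math.e)

for n in (3, 10, 100, 1000):
    print(n, lower_bound_comparisons(n), round(stirling_estimate(n), 1))
```

For n = 3 the exact bound is 3 comparisons, which matches the height of the three-element decision tree.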
Lower bound for comparison sorting
Corollary. Heapsort and merge sort are
asymptotically optimal comparison sorting
algorithms.
Sorting lower bound
Is there a faster algorithm?
What if we use a different model of computation?
class InsertionSortAlgorithm {
    // Sorts a in place; B holds the element being inserted.
    void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int B = a[i];
            int j = i;
            while ((j > 0) && (a[j-1] > B)) {
                a[j] = a[j-1];
                j--; }
            a[j] = B; }}}
Sorting in linear time
Counting sort: No comparisons between elements.
• Input: A[1 . . n], where A[j] ∈ {1, 2, …, k}.
• Output: B[1 . . n], sorted.
• Auxiliary storage: C[1 . . k] .
Counting sort
for i ← 1 to k
    do C[i] ← 0
for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1    ⊳ C[i] = |{key = i}|
for i ← 2 to k
    do C[i] ← C[i] + C[i–1]    ⊳ C[i] = |{key ≤ i}|
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] – 1
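The pseudocode translates almost line for line into a runnable form. The following is a minimal Python sketch, assuming integer keys in 1..k as stated in the input specification. It keeps the 1-based indexing of C by leaving C[0] unused and converts to 0-based positions only when writing into B.

```python
def counting_sort(A, k):
    """Return a sorted copy of A, whose keys are integers in 1..k."""
    n = len(A)
    C = [0] * (k + 1)                # C[0] unused so indices match the pseudocode
    for key in A:                    # now C[i] = number of keys equal to i
        C[key] += 1
    for i in range(2, k + 1):        # now C[i] = number of keys <= i
        C[i] += C[i - 1]
    B = [None] * n
    for j in range(n - 1, -1, -1):   # right to left, as in the pseudocode
        key = A[j]
        B[C[key] - 1] = key          # -1 converts a 1-based position to 0-based
        C[key] -= 1
    return B

print(counting_sort([4, 1, 3, 4, 3], 4))   # prints [1, 3, 3, 4, 4]
```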
Counting-sort example
Indices:  1 2 3 4 5        1 2 3 4
A:        4 1 3 4 3    C:
B:

Loop 1
for i ← 1 to k
    do C[i] ← 0
A: 4 1 3 4 3    C: 0 0 0 0

Loop 2
for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1    ⊳ C[i] = |{key = i}|
j = 1:  C: 0 0 0 1
j = 2:  C: 1 0 0 1
j = 3:  C: 1 0 1 1
j = 4:  C: 1 0 1 2
j = 5:  C: 1 0 2 2

Loop 3
for i ← 2 to k
    do C[i] ← C[i] + C[i–1]    ⊳ C[i] = |{key ≤ i}|
C: 1 0 2 2 before the loop; C' is the array after each step.
i = 2:  C': 1 1 2 2
i = 3:  C': 1 1 3 2
i = 4:  C': 1 1 3 5

Loop 4
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] – 1
C: 1 1 3 5 before the loop; "·" marks an empty slot in B.
j = 5:  B: · · 3 · ·    C': 1 1 2 5
j = 4:  B: · · 3 · 4    C': 1 1 2 4
j = 3:  B: · 3 3 · 4    C': 1 1 1 4
j = 2:  B: 1 3 3 · 4    C': 0 1 1 4
j = 1:  B: 1 3 3 4 4    C': 0 1 1 3
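The walkthrough can be reproduced programmatically. The sketch below is an instrumented variant of counting sort, written in Python for illustration; the generator name and the snapshot format are assumptions. It yields the contents of C[1..k] and B after every step of the four loops for the same input A = 4 1 3 4 3 with k = 4.

```python
def counting_sort_trace(A, k):
    """Yield (loop number, C[1..k], B) after every step of counting sort."""
    C = [0] * (k + 1)                # C[0] unused, as in the 1-based pseudocode
    B = [None] * len(A)
    yield 1, C[1:], list(B)          # Loop 1: C zeroed
    for key in A:                    # Loop 2: count each key
        C[key] += 1
        yield 2, C[1:], list(B)
    for i in range(2, k + 1):        # Loop 3: prefix sums
        C[i] += C[i - 1]
        yield 3, C[1:], list(B)
    for key in reversed(A):          # Loop 4: place keys, right to left
        B[C[key] - 1] = key
        C[key] -= 1
        yield 4, C[1:], list(B)

for loop, C, B in counting_sort_trace([4, 1, 3, 4, 3], 4):
    print(f"Loop {loop}: C = {C}  B = {B}")
```

Empty slots of B appear as None in the printed snapshots.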
Analysis
Θ(k)    for i ← 1 to k
            do C[i] ← 0
Θ(n)    for j ← 1 to n
            do C[A[j]] ← C[A[j]] + 1
Θ(k)    for i ← 2 to k
            do C[i] ← C[i] + C[i–1]
Θ(n)    for j ← n downto 1
            do B[C[A[j]]] ← A[j]
               C[A[j]] ← C[A[j]] – 1
Total: Θ(n + k)
Running time
If $k = O(n)$, then counting sort takes $\Theta(n)$ time.
• But, sorting takes $\Omega(n \lg n)$ time!
• Where’s the fallacy?
Answer:
• Comparison sorting takes $\Omega(n \lg n)$ time.
• Counting sort is not a comparison sort.
• In fact, not a single comparison between
elements occurs!
Stable sorting
Counting sort is a stable sort: it preserves
the input order among equal elements.
A: 4 1 3 4 3
B: 1 3 3 4 4
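Stability only becomes visible when equal keys carry some distinguishing data. As a sketch, the Python example below attaches a hypothetical label to each key of the array A above and sorts the pairs by key with counting sort. The labels, the function name, and the `key` parameter are illustrative assumptions.

```python
def stable_counting_sort(records, k, key=lambda r: r[0]):
    """Sort records by an integer key in 1..k, keeping ties in input order."""
    C = [0] * (k + 1)
    for r in records:                # count occurrences of each key
        C[key(r)] += 1
    for i in range(2, k + 1):        # C[i] = number of records with key <= i
        C[i] += C[i - 1]
    B = [None] * len(records)
    for r in reversed(records):      # the right-to-left scan is what preserves order
        C[key(r)] -= 1
        B[C[key(r)]] = r             # after the decrement, C holds the 0-based slot
    return B

cards = [(4, "a"), (1, "b"), (3, "c"), (4, "d"), (3, "e")]
print(stable_counting_sort(cards, 4))
# prints [(1, 'b'), (3, 'c'), (3, 'e'), (4, 'a'), (4, 'd')]
```

Among equal keys, "c" stays ahead of "e" and "a" stays ahead of "d", exactly as in the input.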
[Figure: punched card. Produced by the WWW Virtual Punch-Card Server.]
Choosing r
Choose r to minimize T(n, b):
• Increasing r means fewer passes, but as
r >> lg n, the time grows exponentially.
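This excerpt does not define T(n, b). Purely as an illustration, assume the usual cost model for radix sort on n keys of b bits each, split into r-bit digits: one counting-sort pass per digit, each costing about n + 2^r steps. The formula and the function names below are assumptions, not taken from the slides. Under that model the trade-off can be tabulated as follows.

```python
import math

def radix_cost(n, b, r):
    """Assumed model: ceil(b / r) counting-sort passes, each costing n + 2**r."""
    passes = math.ceil(b / r)
    return passes * (n + 2 ** r)

def best_digit_width(n, b):
    """Digit width r in 1..b with the smallest modelled cost."""
    return min(range(1, b + 1), key=lambda r: radix_cost(n, b, r))

n, b = 2 ** 16, 32
for r in (1, 8, 16, 24, 32):
    print(r, radix_cost(n, b, r))
print("best r under this model:", best_digit_width(n, b))
```

In this model small r costs many passes, while r far above lg n makes the 2^r term dominate, which matches the bullet above.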
Figure: Replica of punch card from the 1900 U.S. census [Howells 2000].
Hollerith’s tabulating system
• Pantograph card punch
• Hand-press reader
• Dial counters
• Sorting box
Figure from [Howells 2000].
Origin of radix sort
Hollerith’s original 1889 patent alludes to a
most-significant-digit-first radix sort:
“The most complicated combinations can readily be
counted with comparatively few counters or relays by first
assorting the cards according to the first items entering
into the combinations, then reassorting each group
according to the second item entering into the
combination, and so on, and finally counting on a few
counters the last item of the combination for each group of
cards.”
Least-significant-digit-first radix sort seems to be
a folk invention originated by machine operators.
Web resources on punched-card technology
• Doug Jones’s punched card index
• Biography of Herman Hollerith
• The 1890 U.S. Census
• Early history of IBM
• Pictures of Hollerith’s inventions
• Hollerith’s patent application (borrowed
from Gordon Bell’s CyberMuseum)
• Impact of punched cards on U.S. history
Operation of the sorter
• An operator inserts a card into
the press.
• Pins on the press reach through
the punched holes to make
electrical contact with mercury-
filled cups beneath the card.
• Whenever a particular digit
value is punched, the lid of the
corresponding sorting bin lifts.
• The operator deposits the card
into the bin and closes the lid.
• When all cards have been processed, the front panel is opened, and
the cards are collected in order, yielding one pass of a stable sort.
Figure: Hollerith Tabulator, Pantograph, Press, and Sorter
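One pass of the sorter behaves like a stable distribution of cards into bins keyed on a single digit. As a rough Python sketch (the card values, the function names, and the decimal digit layout are illustrative assumptions), repeating such passes from the least significant digit upward gives the least-significant-digit-first radix sort mentioned earlier.

```python
def sorter_pass(cards, digit):
    """Drop each card into the bin for one decimal digit, then collect the bins in order."""
    bins = [[] for _ in range(10)]
    for card in cards:
        bins[(card // 10 ** digit) % 10].append(card)   # arrival order is kept within a bin
    return [card for bin_ in bins for card in bin_]

def lsd_radix_sort(cards, num_digits):
    """Run one stable pass per digit, least significant digit first."""
    for digit in range(num_digits):
        cards = sorter_pass(cards, digit)
    return cards

print(lsd_radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
# prints [329, 355, 436, 457, 657, 720, 839]
```

The sketch assumes non-negative integers with at most `num_digits` decimal digits. Correctness depends on each pass being stable, since later passes must not disturb the order established by earlier ones among cards that share a digit.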