Randomized Algorithms
CS648
Lecture 1
Deterministic Algorithm
[Figure: Input → Algorithm → Output]
The output as well as the running time are functions only of the input.
EXAMPLE 1 : APPROXIMATE MEDIAN
Approximate Median
Problem:
Given an array A[] storing $n$ distinct numbers, compute the element with rank $n/2$ (the median).
Best Deterministic Algorithm:
• “Median of Medians” algorithm
• Running time: $O(n)$
• Lower bound: $\Omega(n)$
Reason: Every element must be read. Otherwise, the unread element could be the median element.
Approximate Median
Definition:
Given an array A[] storing $n$ numbers and $\epsilon > 0$,
compute an element whose rank is in the range $[(1-\epsilon)n/2,\ (1+\epsilon)n/2]$.
Best Deterministic Algorithm:
• Running time: $O(n)$
• No faster deterministic algorithm possible for approximate median
Reason: For $\epsilon = 1/2$, at least $n/2$ elements must be read. Otherwise, the ½-approximate median elements could all be unread elements.
½-Approximate median
[Figure: elements of A arranged in increasing order of values, with ranks $n/4$ and $3n/4$ marked.]
If we pick an element uniformly at random,
it is a ½-approximate median of the array with probability ½.
How to boost the success probability to be close to 1?
Pick a few elements and return their median.
½-Approximate median
A Randomized Algorithm
Rand-Approx-Median(A[])
1. Let k be a positive integer (to be fixed later on);
2. S ← ∅;
3. For i = 1 to k
4.     j ← random(1, n);   // returns a number chosen uniformly at random from [1..n]
5.     x ← A[j];
6.     S ← S ∪ {x};
7. Sort S.
8. Report the median of S.
The output may be wrong on a few occasions.
Running time: $O(k \log k)$
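The steps of Rand-Approx-Median can be sketched in Python as follows. This is a minimal sketch, not the lecture's own code: the function name `rand_approx_median`, the `rng` parameter, and the use of a list rather than a set for S (so that repeated draws are kept) are assumptions made for illustration.

```python
import random

def rand_approx_median(a, k, rng=random):
    """Return the median of k elements drawn uniformly at random from a.

    Assumption: sampling is with replacement, and S is kept as a list
    so that an element drawn twice is counted twice.
    """
    sample = [a[rng.randrange(len(a))] for _ in range(k)]  # steps 3-6
    sample.sort()                                          # step 7
    return sample[len(sample) // 2]                        # step 8

if __name__ == "__main__":
    data = list(range(1000))
    random.shuffle(data)
    print(rand_approx_median(data, 101))
```

Sorting the k sampled elements dominates the cost, which is where the $O(k \log k)$ running time comes from; the size of the array does not enter at all.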
EXAMPLE 2 : RANDOMIZED QUICK SORT
                            Merge Sort                Quick Sort
Inventor                    John von Neumann, 1945    C. A. R. Hoare, 1960
Average case comparisons    $n \log_2 n$              $1.39\, n \log_2 n$
Worst case comparisons      $n \log_2 n$              $n(n-1)$
QuickSort
When the input is stored in an array
QuickSort(A, l, r)
{ If (l < r)
      x ← A[l];
      i ← Partition(A, l, r, x);
      QuickSort(A, l, i − 1);
      QuickSort(A, i + 1, r)
}
Quick sort versus Merge Sort
In Practice
Input: a random permutation of $n$ numbers.
No. of repetitions:
This table was reported as a part of an assignment by a student in the Data Structures course (taught in 2014).
                                                   n = 100    n = 1000    n ≥ 10000
No. of times Merge sort outperformed Quick sort    0.1%       0.02%       0%
Reasons:
• Overhead of copying in Merge Sort
• Cache misses
A serious problem with QuickSort
• Distribution sensitive: the time taken depends upon the initial permutation of A.
Can we make QuickSort distribution insensitive?
Randomized QuickSort
When the input is stored in an array
QuickSort(A, l, r)
{ If (l < r)
      x ← an element selected uniformly at random from A[l..r];
      i ← Partition(A, l, r, x);
      QuickSort(A, l, i − 1);
      QuickSort(A, i + 1, r)
}
• Distribution insensitive: the time taken does not depend on the initial permutation of A.
• The time taken depends upon the random choices of pivot elements.
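A minimal Python sketch of Randomized QuickSort follows. The partition scheme (Lomuto-style, with the random pivot first swapped to the front), the function name and the comparison counter are assumptions made for illustration; the slides do not specify how Partition is implemented.

```python
import random

def randomized_quicksort(a, lo, hi, rng=random):
    """Sort a[lo..hi] in place; return the number of element comparisons."""
    if lo >= hi:
        return 0
    p = rng.randint(lo, hi)          # pivot position, uniform over [lo, hi]
    a[lo], a[p] = a[p], a[lo]        # move the pivot to the front
    x = a[lo]
    i = lo                           # a[lo+1..i] holds elements smaller than x
    for j in range(lo + 1, hi + 1):
        if a[j] < x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[lo], a[i] = a[i], a[lo]        # pivot lands at its final position i
    comparisons = hi - lo            # one comparison per non-pivot element
    comparisons += randomized_quicksort(a, lo, i - 1, rng)
    comparisons += randomized_quicksort(a, i + 1, hi, rng)
    return comparisons

if __name__ == "__main__":
    data = list(range(1000))         # already sorted: bad for a fixed first-element pivot
    print(randomized_quicksort(data, 0, len(data) - 1))
```

Running this repeatedly on the same input gives different comparison counts, which illustrates that the cost depends on the random pivot choices and not on the initial permutation.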
What makes Randomized Quick sort popular ?
No. of repetitions =
This table was reported as a part of an assignment by a student in the Data Structures course (taught in 2014).
No. of times run time exceeds average by 100 1000
190 49 22 10 3
28 17 12 3 0
2 1 1 0 0
0 0 0 0 0
Inference:
As $n$ increases, the chances of deviation from the average case decrease,
so the reliability of quick sort increases.
The analysis of Randomized Quick sort
Theorem [Colin McDiarmid, 1991]:
Prob. the run time of Randomized Quick sort exceeds the average by $x\%$ $= n^{-\frac{x}{100} \ln \ln n}$
Prob. the run time of Randomized Quick sort is double the average for $n \ge 10^6$ is $< 10^{-15}$
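As a quick arithmetic check, the expression $n^{-\frac{x}{100} \ln \ln n}$ can be evaluated directly. The sketch below simply plugs numbers into that expression as stated; the function name `deviation_probability` is an assumption made for illustration.

```python
import math

def deviation_probability(n, x):
    """Evaluate n^(-(x/100) * ln ln n) for array size n and excess x percent."""
    return n ** (-(x / 100.0) * math.log(math.log(n)))

if __name__ == "__main__":
    # x = 100 means "double the average".
    for n in (10**3, 10**5, 10**6):
        print(n, deviation_probability(n, 100))
```

For x = 100 the value drops below $10^{-15}$ once $n$ reaches $10^6$, while it is still above $10^{-15}$ at $n = 10^5$.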
Deterministic Algorithm
[Figure: Input → Algorithm → Output]
The output as well as the running time are functions only of the input.
Randomized Algorithm
[Figure: Input and Random bits → Algorithm → Output]
The output as well as the running time are functions of the input and the random bits chosen.
Possible price: an error in the output on a few occasions, or excess running time on a few occasions.
Why still study Randomized Algorithms?
Compared with deterministic algorithms, they offer:
• Efficiency
• Simplicity
Types of Randomized Algorithms
Randomized Las Vegas Algorithms:
• Output is always correct
• Running time is a random variable
Example: Randomized Quick Sort
Randomized Monte Carlo Algorithms:
• Output may be incorrect with some probability
• Running time is deterministic.
Example: Randomized algorithm for approximate median
½-Approximate median
A Randomized Algorithm
Rand-Approx-Median(A[])
1. Let k be a positive integer (to be fixed later on);
2. S ← ∅;
3. For i = 1 to k
4.     j ← random(1, n);   // returns a number chosen uniformly at random from [1..n]
5.     x ← A[j];
6.     S ← S ∪ {x};
7. Sort S.
8. Report the median of S.
Running time: $O(k \log k)$
A simple probability exercise
A coin with P[HEADS] = ¼ and P[TAILS] = ¾ is tossed $k$ times.
What is the probability that we get at least $k/2$ HEADS?
[Stirling’s approximation for factorial: $n! \approx \sqrt{2\pi n}\,(n/e)^n$]
Probability of getting “at least $k/2$ HEADS in $k$ tosses”
Probability of getting at least $k/2$ heads:
$\sum_{i \ge k/2} \binom{k}{i} (1/4)^i (3/4)^{k-i}$
Using Stirling’s approximation, this is inverse exponential in $k$.
Since , so …
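The exponential decay can be checked numerically. The sketch below computes the exact probability of at least k/2 HEADS when P[HEADS] = ¼ and compares it with one elementary upper bound, $1.5\,(3/4)^{k/2}$. That bound is an assumption of this sketch and not necessarily the one derived in the lecture: for even k, the terms of the sum beyond k/2 shrink by a factor of at least 3 each, and $\binom{k}{k/2} \le 2^k$.

```python
from math import comb

def heads_tail(k):
    """Exact P[at least k/2 HEADS in k tosses] when P[HEADS] = 1/4."""
    threshold = (k + 1) // 2  # smallest integer that is >= k/2
    return sum(comb(k, i) * 0.25**i * 0.75**(k - i)
               for i in range(threshold, k + 1))

def geometric_bound(k):
    """Elementary upper bound 1.5 * (3/4)^(k/2), used here for even k."""
    return 1.5 * 0.75 ** (k / 2)

if __name__ == "__main__":
    for k in (10, 20, 40, 80):
        print(k, heads_tail(k), geometric_bound(k))
```

Both columns shrink geometrically as k grows, which is what "inverse exponential in k" means here.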
Analyzing the error probability of Rand-approx-median
[Figure: elements of A arranged in increasing order of values, with ranks $n/4$ and $3n/4$ marked. The first $n/4$ elements form the Left Quarter and the last $n/4$ the Right Quarter.]
Let us consider an instance of the sample set S.
If the median of S lies outside both quarters, the answer is correct.
When does Rand-approx-median make an error?
If the median of S lies in the Left Quarter or the Right Quarter, the answer is incorrect.
What might have gone wrong with S?
Observation: Rand-approx-median makes an error only if
at least $k/2$ elements got sampled from the Right Quarter (or the Left Quarter).
Pr[an element selected randomly from A is from the Right Quarter] = ¼
Pr[out of $k$ elements sampled from A, at least $k/2$ are from the Right Quarter] = ??
This is exactly the same as the coin tossing exercise we did: at least $k/2$ HEADS in $k$ tosses of a coin with P[HEADS] = ¼.
Theorem:
The Rand-approx-median algorithm fails to report a ½-approximate median
from array A[1..n] with probability at most.
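To get a feel for how large the sample needs to be, the failure probability can be bounded by twice the coin-tossing tail (a union bound over the Left Quarter and the Right Quarter) and the smallest sufficient k found by search. This is a hedged sketch: the function names, the restriction to odd k and the target value are assumptions made for illustration, and the bound is not claimed to be the one stated in the theorem.

```python
from math import comb

def failure_bound(k):
    """Upper bound on Pr[Rand-approx-median errs] for odd sample size k.

    The median of an odd-sized sample lies in a given quarter only if
    at least (k+1)/2 of the k samples fall in that quarter.
    """
    threshold = (k + 1) // 2
    tail = sum(comb(k, i) * 0.25**i * 0.75**(k - i)
               for i in range(threshold, k + 1))
    return min(1.0, 2 * tail)  # union bound over the two quarters

def smallest_sample_size(target):
    """Smallest odd k whose failure bound is at most target."""
    k = 1
    while failure_bound(k) > target:
        k += 2
    return k

if __name__ == "__main__":
    print(smallest_sample_size(1e-12))
```

The sample size found this way depends only on the target error, not on the size of the array.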
What makes Randomized Algorithms so popular?
• [A study by Microsoft in 2011]
Title: Cycles, Cells and Platters: An Empirical Analysis of Hardware Failures on a Million Consumer PCs
Authors: Edmund B. Nightingale, John R. Douceur, Vince Orgovan
Available at: research.microsoft.com/pubs/144888/eurosys84-nightingale.pdf
Event                                                              Probability
The desktop of RM301 crashes during this lecture
RandQsort takes time at least double the average ($n \ge 10^6$)    $< 10^{-15}$
Rand-approx-median returns wrong output                            $< 10^{-12}$
Simplicity
MOTIVATING EXAMPLES FOR
RANDOMIZED ALGORITHMS
Exact Median
Theorem**: Every deterministic algorithm for exact median
must perform at least $(2+\epsilon)n$ comparisons in the worst case, for a tiny constant $\epsilon > 0$.
Theorem: There is a randomized Las Vegas algorithm that computes the exact median
and performs only $1.5n + o(n)$ comparisons in expectation. (Project 1)
**Dorit Dor, Uri Zwick:
On Lower Bounds for Selecting the Median.
SIAM J. Discrete Math. 14(3): 312-325 (2001)
Example 3: Smallest Enclosing Circle
Problem definition: Given $n$ points in a plane, compute
the smallest radius circle that encloses all the points.
Simple exercise : O()
Applications: Facility location problem
Best deterministic algorithm: [Megiddo, 1983]
• $O(n)$ time complexity, too complex, uses advanced geometry
Randomized Las Vegas algorithm: [Welzl, 1991]
• Average $O(n)$ time complexity, very simple, uses elementary geometry (Project 2)
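To show how elementary the randomized approach is, here is a sketch of the randomized incremental construction in the spirit of Welzl's algorithm, written iteratively. It is an illustration under assumptions, not the lecture's code: points are (x, y) tuples, a circle is a (cx, cy, r) tuple, a small tolerance `eps` absorbs floating-point error, and all function names are hypothetical.

```python
import math
import random

def circle_two(a, b):
    """Circle with segment ab as diameter."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, math.dist(a, b) / 2)

def circle_three(a, b, c):
    """Circumcircle of a, b, c (falls back to a diameter if collinear)."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        return max(circle_two(a, b), circle_two(a, c), circle_two(b, c),
                   key=lambda t: t[2])
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy, math.dist((ux, uy), a))

def inside(circle, p, eps=1e-9):
    return math.dist((circle[0], circle[1]), p) <= circle[2] + eps

def smallest_enclosing_circle(points, rng=random):
    pts = list(points)
    rng.shuffle(pts)                      # the only use of randomness
    c = None
    for i, p in enumerate(pts):
        if c is not None and inside(c, p):
            continue
        c = (p[0], p[1], 0.0)             # p must lie on the boundary
        for j in range(i):
            q = pts[j]
            if inside(c, q):
                continue
            c = circle_two(p, q)          # p and q must lie on the boundary
            for r in pts[:j]:
                if not inside(c, r):
                    c = circle_three(p, q, r)
    return c

if __name__ == "__main__":
    print(smallest_enclosing_circle([(0, 0), (2, 0), (1, 2)]))
```

The output is always the correct circle (up to rounding); only the running time depends on the random shuffle, which is what makes this a Las Vegas algorithm.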
Example 4: Minimum Cut
Problem definition: Given a connected graph G=(V,E) on n vertices and m edges, compute
the smallest set of edges whose removal makes G disconnected.
Best deterministic algorithm : [Stoer and Wagner, 1997]
• O() time complexity.
Randomized Monte Carlo algorithm: [Karger, 1993]
• O( log ) time complexity.
• Error probability: for any that we desire
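A Monte Carlo algorithm for minimum cut can be sketched with Karger's random contraction idea: contract randomly chosen edges until two super-vertices remain, count the edges between them, and repeat. The sketch below is the basic contraction procedure, not the faster variant whose running time is quoted above; the function names, the union-find representation and the `trials` parameter are assumptions made for illustration.

```python
import random

def _find(parent, u):
    """Union-find lookup with path halving."""
    while parent[u] != u:
        parent[u] = parent[parent[u]]
        u = parent[u]
    return u

def karger_min_cut(n, edges, trials, rng=random):
    """Return the smallest cut size seen over `trials` random contractions.

    Vertices are 0..n-1 and `edges` is a list of (u, v) pairs of a
    connected graph. The result is never below the true minimum cut,
    but it may be above it with some probability.
    """
    best = None
    for _ in range(trials):
        parent = list(range(n))
        order = list(edges)
        rng.shuffle(order)                # contract edges in random order
        components = n
        for u, v in order:
            if components == 2:
                break
            ru, rv = _find(parent, u), _find(parent, v)
            if ru != rv:                  # skip edges that became self-loops
                parent[ru] = rv
                components -= 1
        cut = sum(1 for u, v in edges
                  if _find(parent, u) != _find(parent, v))
        if best is None or cut < best:
            best = cut
    return best

if __name__ == "__main__":
    two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    print(karger_min_cut(6, two_triangles, trials=200))
```

A single trial finds a fixed minimum cut with probability at least 2/(n(n−1)), so the error probability can be driven down as far as desired by increasing `trials`.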
Example 5: Primality Testing
Problem definition: Given an $n$-bit integer, determine if it is prime.
Applications:
• RSA-cryptosystem,
• Algebraic algorithms
Best deterministic algorithm : [Agrawal, Kayal and Saxena, 2002]
• O( ) time complexity.
Randomized Monte Carlo algorithm: [Rabin, 1980]
• O( ) time complexity.
• Error probability: for any that we desire
For = , this probability is
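Rabin's randomized test is usually presented today as the Miller-Rabin test. The sketch below is a standard textbook formulation, offered as an illustration and not as the lecture's code: the function name, the `rounds` parameter and the preliminary trial division by a few small primes are assumptions. A composite number survives one round with probability at most 1/4, so `rounds` independent rounds leave an error probability of at most $4^{-\text{rounds}}$; a prime is never rejected.

```python
import random

def is_probable_prime(n, rounds=20, rng=random):
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):        # quick trial division
        if n % p == 0:
            return n == p
    d, s = n - 1, 0                        # write n - 1 as d * 2^s with d odd
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)        # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                   # a is a witness: n is composite
    return True

if __name__ == "__main__":
    print(is_probable_prime(2**61 - 1), is_probable_prime(323))
```

The error is one-sided, which makes this a Monte Carlo algorithm: the running time is fixed by `rounds`, while the answer "probably prime" may be wrong with small probability.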
COURSE STRUCTURE
Prerequisites for the course
• Elementary knowledge of probability
• Good knowledge of algorithms (formal analysis, formal proof of correctness, basic algorithm paradigms)
• Ability to work very hard
• Commitment to attend all classes,
unless you have any genuine personal/medical problem
Marks Breakup
Assignments 25%
Quizzes (announced) 20%
Project (involves programming also) 10%
Mid-semester exam 20%
End-semester exam 25%
Project:
• Optional (To be done in groups of 2)
Assignments:
• All theoretical.
• To be done in groups of 2.
Passing criteria:
• At least 25% marks in the exams (Quizzes, Mid-semester exam, End-semester exam).
Contact Details
Office: 307, Department of CSE
Office Hours:
– fix appointment by email (preferably one day in advance).
• Course website will be maintained at moodle.cse.iitk.ac.in
• Entire course material (including all lecture slides) will be available on this
website.
(Optional) Extra class
Time and date: 05:00 pm on 1st August
Venue: RM101.
Agenda:
Revising fundamentals of probability theory from perspective of this course.