Lecture 01
Jin-Yi Cai
Department of Computer Sciences
University of Wisconsin
Madison, WI 53706
Email: [email protected]
1 Genesis
1.1 Turing Machines and Undecidability
1.2 Time, Space and Non-determinism
1.3 Hierarchy Theorems
1.4 Complexity Classes
1.5 Reductions and NP-Completeness
1.6 Polynomial Hierarchy
1.7 Counting Classes
1.8 Complexity Classes - Review
3 Randomization
3.1 Randomized Algorithms – Examples
3.2 Probabilistic inequalities
3.3 Randomized Complexity Classes
3.4 Sipser-Lautemann theorem
3.5 Universal Hashing
3.6 BPP ⊆ $\Sigma_2^p$ - Revisited
4 Sparse Sets
4.1 Sparse set, Polynomial circuits, P/poly
5 Hartmanis Conjectures
7 IP=PSPACE
8 Derandomization
8.1 Pseudorandom Generators
8.2 One-way Functions
8.3 Goldreich-Levin Hardcore Bit
8.4 Construction of Pseudorandom Generators
10 Switching Lemmas
Chapter 1
Genesis
Chapter Outline: Brief overview of complexity theory. Hilbert’s Tenth Problem. Turing
Machines. Undecidability. Cantor’s method of diagonalization. Undecidability of the halting
problem. Time and space bounded Turing machines. Hierarchy Theorems. Complexity
Classes (P, NP, PSPACE). NP-Completeness. Polynomial Hierarchy. Counting class #P.
Easy containment among “classical” complexity classes.
Theory and finite group theory was a discovery to investigate the solvability of equations by
radicals; the Prime Number Theorem was first conjectured by Gauss after many computational
experiments. It is also true that many of the advances made in structural mathematics
have greatly influenced the advances in computational mathematics.
While it can be said that the subject of algorithms is as old as mathematics itself, the
study of algorithms as a subject, rather than the use of them, is a relatively new development.
Perhaps one could trace this beginning to set theory, that most structural of all subjects.
In his study of Fourier series (surely one of the most computational subjects in origin), Cantor gave
birth to a set of ideas that we now call (naive) set theory. Cantor’s ideas are revolutionary in
many aspects. In its basic framework, set theory is highly non-constructive. For example, Cantor gave
a conceptually crisp and simple proof of the existence of transcendental numbers, thereby
inventing his famous diagonalization method. This proof is remarkable in many ways:
Firstly, it is much simpler than the monumental achievement of Hermite and Lindemann
on the transcendence of e and π respectively. Perhaps one can still make the case that
the “real” transcendental number theory is more along the lines of Hermite, Lindemann
and Liouville, and not the mere existence proof by the magic of diagonalization. But even
the most dedicated practitioners of “hard analysis” today will not dismiss the elegance and
efficiency of Cantor’s method. On the other hand, today many interesting computational
problems, such as basis reductions for lattices, simultaneous Diophantine approximations,
and volume estimations of convex bodies, form very active research areas which can be traced
directly to the work of mathematicians such as Dirichlet, Liouville, Hermite and Minkowski.
Secondly, as Kronecker was quick to point out, Cantor’s method is inherently
non-constructive, and in his view, borders on the “philosophical”. In particular it did not conform
to the strictly finitistic and constructive approach that Kronecker had been advocating. To
the end of his days, Kronecker never accepted Cantor’s ideas. The finitists distrusted the method on
philosophical grounds, which is ironic because the finitists were particularly concerned with the
soundness of mathematical foundations, a question which was shown in the following years to be
closely related to computational undecidability, of which Cantor’s diagonalization method is
a forerunner.
Thirdly, the diagonalization method was to find its great application in Turing’s
undecidability proof of the Halting Problem. It subsequently became one of the basic mathematical
tools in recursion theory, and in the founding of complexity theory with the proof of the
time and space hierarchy theorems.
Because of its fundamental importance we will give the diagonalization proof by Cantor.
An algebraic number is a root of a nonzero polynomial with integral coefficients. A non-algebraic
number is called a transcendental number. A set is countable if it can be put into one-to-
one correspondence with the integers. It is clear that the set of all algebraic numbers is
countable, since we can count all integral polynomials, and each polynomial of degree n has
at most n roots.
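The counting of integral polynomials can be made concrete. The following is a minimal sketch, not part of the original notes: it lists every nonzero integer polynomial exactly once, ordered by a "height" (degree plus the sum of the absolute values of the coefficients). The name `integer_polynomials` and the choice of height function are assumptions made for illustration; any ordering with finitely many polynomials per level would serve.

```python
from itertools import count, islice, product

def integer_polynomials():
    """Yield each nonzero integer polynomial once, as a coefficient tuple
    (a_0, ..., a_d) with a_d != 0, ordered by height
    h = d + |a_0| + ... + |a_d| (an illustrative choice of ordering)."""
    for h in count(1):
        for d in range(h):                 # degree d uses up d units of height
            budget = h - d                 # remaining height for the coefficients
            for coeffs in product(range(-budget, budget + 1), repeat=d + 1):
                if coeffs[-1] != 0 and sum(abs(a) for a in coeffs) == budget:
                    yield coeffs

first = list(islice(integer_polynomials(), 6))
print(first)   # [(-1,), (1,), (-2,), (2,), (0, -1), (0, 1)]
```

Since each height level contains only finitely many polynomials, every polynomial appears at some finite position in the list. Listing the (at most n) roots of each polynomial in turn then gives an enumeration of all algebraic numbers.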
Theorem 1.1 The set of real numbers is uncountable; in particular, there are non-algebraic
real numbers.
A curious historical note: In order not to offend Kronecker, who was powerful and some-
what petty at the same time and might block the publication of this work, Cantor had to
phrase his main result strictly on the existence of non-algebraic numbers.
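The diagonal construction itself is easy to illustrate on finite data. The sketch below is an illustration only, assuming the standard form of the argument: given a list of binary sequences, flip the i-th bit of the i-th sequence. The function name `diagonal_complement` and the sample table are hypothetical.

```python
def diagonal_complement(sequences):
    """Given n binary sequences, each of length at least n, return a
    length-n sequence that differs from the i-th one in position i."""
    return [1 - seq[i] for i, seq in enumerate(sequences)]

table = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
d = diagonal_complement(table)
print(d)            # [1, 0, 1, 1]
print(d in table)   # False
```

The same flip applied to an infinite list of infinite sequences yields a sequence missing from the list, which is the heart of the uncountability argument.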
Proof Consider all binary infinite sequences B = {β}, where
$\beta = b_1 b_2 \ldots b_n \ldots,$
For any function $f(n)$, $\mathrm{DTIME}[f] = \{L \mid \text{for some } M,\ L(M) = L \text{ and } \mathrm{time}_M(n) \le f(n) \text{ for all large } n\}$.
For technical reasons we will only consider “nice” functions $f(n)$, called fully time-constructible
functions. This means that there is some TM $M$ which, for every $n$ and any input $x$ of size
$n$, runs in exactly $f(n)$ steps on $x$ and then halts. Almost all reasonable functions $\ge n$
are fully time-constructible, such as $n$, $n^k$, $n^i (\log n)^j$, $2^{(\log n)^k}$, $2^{n^k}$, and
$2^{2^{\cdot^{\cdot^{\cdot^{2^n}}}}}$. (It is a fact in complexity theory that there exist functions
which are not fully time-constructible, but we will not be concerned with that.) In other words,
DTIME[f] consists of
problems that are computable by some TM with running time at most $f(n)$ asymptotically.
By some silly tricks such as enlarging the alphabet set, and the set of finite states, one can
show that any constant factor in f (·) does not matter.
The model of TM is chosen because it is relatively robust. One can show, for
instance, that any k-tape TM running in time $f(n)$ can be simulated by a 2-tape TM in
time $O(f(n) \log f(n))$. For our purposes we will only need the more trivial simulation in
time $O(f(n)^2)$, even by a 1-tape TM. This simulation can be seen easily as follows: Divide
the single tape into $2k$ tracks, and use a large alphabet set with more than $2^{2k}$ symbols, say.
Keep on the single tape the configurations of all k tapes, together with a mark for each head
position. Then one step of the computation of the k-tape TM is simulated by the 1-tape
TM with 2 sweeps. Note that each sweep of the tape area used takes at most $O(f(n))$ steps,
so the $f(n)$ steps of the k-tape TM cost $O(f(n)^2)$ steps in total.
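The track encoding can be sketched in a few lines. This is an illustrative model rather than a full TM simulator: the names `pack_tracks`, `sweep_read` and `sweep_write`, the blank symbol `"_"`, and the representation of a cell as a tuple are all assumptions made for the example.

```python
BLANK = "_"

def pack_tracks(tapes, heads):
    """Encode k tapes and head positions as one tape with 2k tracks.
    Cell j is a tuple (s_1, m_1, ..., s_k, m_k): s_i is the symbol of
    tape i at position j, and m_i is True iff head i sits at position j."""
    width = max(max(len(t) for t in tapes), max(heads) + 1)
    single = []
    for j in range(width):
        cell = []
        for tape, head in zip(tapes, heads):
            cell.append(tape[j] if j < len(tape) else BLANK)
            cell.append(head == j)
        single.append(tuple(cell))
    return single

def sweep_read(single, k):
    """First sweep: scan the tape once, collecting the symbol under each
    marked head. Returns the symbols and the number of cells visited."""
    scanned, steps = [None] * k, 0
    for cell in single:
        steps += 1
        for i in range(k):
            if cell[2 * i + 1]:
                scanned[i] = cell[2 * i]
    return scanned, steps

def sweep_write(single, k, writes, moves):
    """Second sweep, modelled by unpacking and repacking for brevity:
    write writes[i] under head i, then shift its mark by moves[i]."""
    tapes = [[cell[2 * i] for cell in single] for i in range(k)]
    heads = [next(j for j, c in enumerate(single) if c[2 * i + 1]) for i in range(k)]
    for i in range(k):
        tapes[i][heads[i]] = writes[i]
        heads[i] = max(0, heads[i] + moves[i])
    return pack_tracks(tapes, heads)

single = pack_tracks(["ab", "xyz"], [1, 0])
print(sweep_read(single, 2))   # (['b', 'x'], 3)
single = sweep_write(single, 2, ["B", "X"], [1, 1])
print(sweep_read(single, 2))   # (['_', 'y'], 3)
```

Each sweep visits every cell of the used tape area once, which is where the quadratic overhead of the 1-tape simulation comes from.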
Let poly denote the class of polynomials, or simply $n^i + i$, $i = 1, 2, \ldots$. Then the union
$$\mathrm{P} = \bigcup_{f \in \mathrm{poly}} \mathrm{DTIME}[f]$$
is the class of deterministic polynomial time. Clearly this definition is invariant when
restricted to 1-tape TMs. One can similarly define exponential time classes
$$\mathrm{E} = \bigcup_{k>0} \mathrm{DTIME}[2^{kn}]$$
$$\mathrm{EXP} = \bigcup_{k>0} \mathrm{DTIME}[2^{n^k}]$$
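As a quick check on how these classes relate, here is a short worked inequality. It is a sketch added for orientation, taking $k$ to range over positive integers and using only the definitions above.

```latex
% For every integer k >= 1 and every n >= 2 we have n^{k-1} >= 2^{k-1} >= k, hence
\[
  kn \le n^k
  \quad\Longrightarrow\quad
  2^{kn} \le 2^{n^k}
  \quad\Longrightarrow\quad
  \mathrm{DTIME}[2^{kn}] \subseteq \mathrm{DTIME}[2^{n^k}].
\]
% Likewise n^i + i <= 2^n for all large n, so DTIME[n^i + i] is contained in DTIME[2^n].
% Taking unions over i and k:
\[
  \mathrm{P} \subseteq \mathrm{E} \subseteq \mathrm{EXP}.
\]
```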
One can define space complexity similarly. In one aspect it is even simpler, since we can
use k tracks to mimic k tapes and there is no additional space overhead in the simulation.
So we will have one work tape. However in another respect, there is a slight complication.
This happens when we wish to study sublinear space complexity, which includes important
problems. In order to account for sublinear space, one uses a separate read-only input tape,
in addition to a read-write work tape. On the read-only input tape, the input of length n is
written, but these n cells do not count toward space complexity. We only count the number
of tape cells used on the read-write work tape. Thus for space complexity, the standard
model is what is known as an off-line TM, which has one read-only input tape, and one
read-write work tape. Then one can define, in an obvious way,
$$\mathrm{space}_M(n) = \max\{\#\text{ of cells on the work tape used in } M(x) : |x| = n\}.$$
Again we restrict to “nice” functions, called fully space constructible functions. (What’s a
suitable definition?) For any such function $f(n)$, define
$\mathrm{DSPACE}[f] = \{L \mid \text{for some } M,\ L(M) = L \text{ and } \mathrm{space}_M(n) \le f(n) \text{ for all large } n\}$.
We define
$$\mathrm{PSPACE} = \bigcup_{f \in \mathrm{poly}} \mathrm{DSPACE}[f],$$
the class of polynomial space. (There is a reason why we omit the word “deterministic”,
as we shall explain later.) We also have deterministic logarithmic space L = DSPACE[log].
Note again that constant factors do not matter, thus
$\mathrm{DSPACE}[\log] = \bigcup_{c>0} \mathrm{DSPACE}[c \log]$.
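To get a feeling for logarithmic space, consider a problem that is not discussed in these notes but is a common textbook illustration: deciding whether the input is a palindrome. The sketch below treats the string `x` as the read-only input tape and keeps only two indices as work storage; the names `is_palindrome_logspace` and `work_bits` are hypothetical.

```python
def is_palindrome_logspace(x):
    """Decide whether x reads the same in both directions. The input is
    only read, never modified; the work storage is the two indices."""
    i, j = 0, len(x) - 1          # two counters of O(log n) bits each
    while i < j:
        if x[i] != x[j]:
            return False
        i += 1
        j -= 1
    return True

def work_bits(n):
    """Bits needed to store the two indices for an input of length n."""
    return 2 * max(1, n).bit_length()

print(is_palindrome_logspace("abcba"))   # True
print(is_palindrome_logspace("abca"))    # False
print(work_bits(1024))                   # 22
```

The n input cells are not charged, so the space used is the size of the counters, which grows like log n.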
next vertex $v_i$ and verify that $(v_{i-1}, v_i)$ is an edge. We can keep a counter to count up to $n$,
and accept iff within $n$ steps, some $v_k = t$. Note that we only need to keep at all times the
pair of current vertices on the work tape; when we guess for $v_{i+1}$ we need only to remember
$v_i$ (in order to verify that $(v_i, v_{i+1})$ is an edge), but we no longer need to keep $v_{i-1}$ and
all previously guessed vertices. In addition we only need to keep a counter to count up to
$n$. For a graph with $n$ vertices, to name a vertex takes only $\log n$ space. Thus GAP is in
non-deterministic logspace.
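The guess-and-verify procedure can be sketched as follows. This is an illustration under stated assumptions: vertices are numbered 0 to n-1, `edges` is a set of ordered pairs, the start vertex `s` is accepted at once when it equals `t`, and the names `follows_path` and `gap` are hypothetical. Non-determinism is replaced by trying all guess sequences, which takes exponential time; only the storage used along a single path reflects the logspace bound.

```python
from itertools import product

def follows_path(edges, n, s, t, guesses):
    """Run one computation path, with `guesses` as the guessed vertices.
    Only the current vertex and a step counter are stored."""
    current, steps = s, 0
    if current == t:
        return True
    for v in guesses:
        if steps >= n or (current, v) not in edges:
            return False          # counter exhausted, or the guess is not an edge
        current, steps = v, steps + 1
        if current == t:
            return True
    return False

def gap(edges, n, s, t):
    """Deterministic stand-in for non-determinism: accept iff some guess
    sequence of length at most n leads to acceptance."""
    return any(follows_path(edges, n, s, t, g)
               for length in range(n + 1)
               for g in product(range(n), repeat=length))

edges = {(0, 1), (1, 2), (2, 3)}
print(gap(edges, 4, 0, 3))   # True
print(gap(edges, 4, 3, 0))   # False
```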
We will only be concerned with time/space bounded NTMs, and thus we can assume each
computational path terminates within the specified time/space bound. Then for an NTM $N$
and a fully time constructible $f$ we define
$$\mathrm{time}_N(x) = \max\{\, n + 1,\ \min_p \{\#\text{ of steps in } N(x) \text{ along } p,\ \text{and } N(x) \text{ accepts along } p\} \,\},$$
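$$\mathrm{NP} = \bigcup_{f \in \mathrm{poly}} \mathrm{NTIME}[f],$$
$$\mathrm{NPSPACE} = \bigcup_{f \in \mathrm{poly}} \mathrm{NSPACE}[f],$$
and also
$$\mathrm{NL} = \bigcup_{c>0} \mathrm{NSPACE}[c \log].$$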