
Computational Number Theory Lecture Notes

The document consists of lecture notes on Computational Number Theory by N.R. Aravind, covering various topics including polynomials, congruences, quadratic equations, finite fields, and primality testing. It includes historical context, algorithms, and exercises for better understanding. The notes are structured into sections with detailed explanations and applications of number theory concepts.


Computational Number Theory

Lecture Notes

N.R.Aravind
—————————
Acknowledgements

The latex source file was built using the LegrandOrangeBook template (copyright
2022, Goro Akechi) available at book-website.com and licensed under the Creative
Commons Attribution-NonCommercial 4.0 License (the “License”). A copy of the
License is available at https://creativecommons.org/licenses/by-nc-sa/4.0.
Minor modifications were made in the use of the template.
Contents

Preface

I Polynomials in One Variable

1 Two Equations from Ancient Times
1.1 The Cubic Equation
1.1.1 Geometric solution
1.1.2 Algebraic solution and depressed cubics
1.1.3 Solving a depressed cubic
1.1.4 Quartic and higher powers: More history
1.2 The Equation $ax + by = c$
1.2.1 Bezout’s Lemma
1.2.2 Euclid’s Algorithm
1.2.3 The Extended Euclidean algorithm
1.3 Classroom Exercises

2 The Fundamental Theorem of Arithmetic
2.1 Prime numbers
2.2 Euclid’s lemma
2.3 The fundamental theorem

3 Congruences
3.1 Definition and properties
3.2 Arithmetic in $\mathbb{Z}_n$
3.3 Linear congruences and the Chinese Remainder Theorem
3.4 The ring $\mathbb{Z}_n$
3.5 Fermat’s little theorem
3.6 Lagrange’s theorem
3.7 Classroom Exercises
3.8 Euler’s totient function and Euler’s theorem
3.9 Euclid’s algorithm and unique factorization for polynomials
3.10 Classroom Exercises
3.11 The equations $x^d = 1$ and $x^d = a$ in $\mathbb{Z}_p$
3.12 Application: The RSA Algorithm
3.13 Order and primitive roots

4 The Quadratic Equation in $\mathbb{Z}_p$
4.1 Quadratic Residues
4.2 Application: Coin Tossing over a telephone
4.3 The Legendre Symbol
4.4 The equation $x^2 = a$: Two easy cases
4.4.1 $p \equiv 3 \pmod{4}$
4.4.2 $p \equiv 1 \pmod{4}$, $a = -1$
4.5 Wilson’s theorem and the value of $\left(\frac{2}{p}\right)$
4.6 Quadratic Reciprocity
4.7 The Tonelli-Shanks Algorithm
4.7.1 $p \equiv 5 \pmod{8}$
4.7.2 The general case
4.8 Hensel Lifting: From $\mathbb{Z}_p$ to $\mathbb{Z}_{p^k}$
4.9 A second algorithm for finding square-roots

5 Finite Fields
5.1 Groups
5.1.1 Cayley Tables and Isomorphism
5.1.2 Direct products and subgroups
5.1.3 Cosets and Lagrange’s theorem
5.2 Rings
5.3 Fields
5.4 Finite Fields
5.5 Irreducible polynomials in $\mathbb{Z}_p[x]$
5.6 Application: Secret sharing

6 Polynomial Factorization over $\mathbb{Z}_p$
6.1 Phase 1: Finding the square-free part
6.2 Phase 2: Distinct-degree factorization
6.3 Phase 3: Finding irreducible factors of degree $i$

II Quadratic Equations in Two Variables

7 Primality Testing: Before 2002
7.1 Fermat and Mersenne primes
7.1.1 Primes of the form $2^n + 1$
7.1.2 Primes of the form $2^n - 1$
7.2 Testing Fermat’s little theorem
7.3 Fibonacci and Lucas pseudoprimality tests
7.4 The Miller-Rabin Test
7.4.1 Analysis of time complexity
7.4.2 Analysis of correctness probability

8 The Integer Factoring Problem
8.1 Trial Division and Fermat’s Method
8.2 Pollard rho Algorithm
8.3 Dixon’s Algorithm

9 Primality Testing: The AKS algorithm
9.1 A Polynomial Identity
9.2 The Algorithm
9.2.1 Running Time
9.3 Correctness
9.3.1 The Proof Strategy
9.3.2 The Proof
Preface

The purpose of these notes is to present elementary algorithms in number theory from the point of view of solving polynomial equations, primarily over $\mathbb{Z}$ and over $\mathbb{Z}_p$ (the ring of integers modulo $p$, with $p$ prime). The simplest case, namely that of factorizing polynomials in one variable, already uses non-trivial ideas.
The two-variable case is non-trivial even when the degree is restricted to two. We will see (only) some of the classical examples and basic theory of binary quadratic forms. The fundamental problems of primality testing and integer factoring are included in that part (Part II), as they concern the equation $xy = n$.
What about the multivariate cases? Linear equations are solvable efficiently, whether over $\mathbb{Z}$ or $\mathbb{Z}_n$. Factoring multivariate polynomials can also be done efficiently, i.e. in (randomized) polynomial time, over $\mathbb{Z}_p$. This requires some understanding of finite fields, so we shall study finite fields as well as one or two applications.
Over the integers, the picture is very different. The problem of deciding whether an arbitrary polynomial in any number of variables has an integer solution (mentioned by Hilbert among his 23 problems for the twentieth century) was famously shown to be undecidable by Matiyasevich in 1970, following a series of works by Julia Robinson, Martin Davis and Hilary Putnam.
I Polynomials in One Variable

1. Two Equations from Ancient Times

1.1 The Cubic Equation


Solutions to linear and even quadratic equations have been known for a very long time. Around 1800 BC, Egyptians solved quadratic equations by the "method of false position", i.e. by finding successively smaller intervals containing a root. From a clay tablet dated between 1800 BC and 1600 BC, we know that Babylonians of the period knew how to solve quadratic equations exactly.

A natural follow-up to the quadratic equation is: what about cubic equations? Solving the simplest cubic equation, $x^3 = a$, boils down to finding cube-roots, and finding cube-roots numerically has also been known for a long time; for example, Aryabhatta (around AD 500) gave a method for finding both square-roots and cube-roots.

What about general cubic equations? These were first studied by Omar Khayyam (AD 1100), who gave a geometric solution.

1.1.1 Geometric solution


Consider the intersection of the parabola $y = \frac{x^2}{a}$ and the circle $(x - r)^2 + y^2 = r^2$. An intersection point satisfies $\frac{x^4}{a^2} + x^2 - 2rx = 0$; for the intersection point with $x \neq 0$, dividing by $x$ and multiplying by $a^2$ gives $x^3 + a^2 x = 2ra^2$. Thus, taking $p = a^2$ and $q = 2ra^2$, we get a solution to the cubic equation $x^3 + px = q$ when $p > 0$. In Omar Khayyam’s time, negative numbers were avoided, and thus the same equation with a negative value for $p$ would instead be written as $x^3 = px + q$ with $p$ positive. For example, to solve the equation $x^3 + x = 12$, we intersect $y = x^2$ with $(x - 6)^2 + y^2 = 36$. This is illustrated in Figure 1.1.

Omar Khayyam divided the cubic equation into various categories so that the coefficients would be positive, and gave different geometric solutions for them.
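
The construction can be checked numerically. The following is a minimal Python sketch and not part of Khayyam’s method: the function name `khayyam_intersection` and the use of bisection are assumptions made here for illustration. It locates the intersection point with $x \neq 0$ of the parabola $y = x^2/a$ and the circle $(x - r)^2 + y^2 = r^2$, whose $x$-coordinate should solve $x^3 + a^2 x = 2ra^2$.

```python
def khayyam_intersection(a, r, tol=1e-12):
    """Return the intersection point (x, y), x != 0, of y = x^2/a and
    (x - r)^2 + y^2 = r^2, assuming a > 0 and r > 0."""
    # After dividing the intersection condition by x we need a root of
    # g(x) = x^3/a^2 + x - 2r, which is increasing, with g(0) < 0 < g(2r).
    def g(x):
        return x ** 3 / a ** 2 + x - 2 * r

    lo, hi = 0.0, 2.0 * r
    while hi - lo > tol:          # plain bisection on [0, 2r]
        mid = (lo + hi) / 2
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    x = (lo + hi) / 2
    return x, x * x / a


# The example from the text: x^3 + x = 12 corresponds to a = 1, r = 6.
x, y = khayyam_intersection(1, 6)
print(x, x ** 3 + x)                 # the second value should be close to 12
print((x - 6) ** 2 + y ** 2)         # should be close to 36 (point lies on the circle)
```

Bisection is used only because it is easy to verify; any root-finding method would serve equally well here.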
Figure 1.1: Illustration of the geometric solution of $x^3 + x = 12$


1.1.2 Algebraic solution and depressed cubics
While a geometric solution is useful, does the cubic equation have a "closed-form" expression for its solutions in terms of its coefficients, like the quadratic equation does? The answer is yes. A solution to $x^3 + px + q = 0$ is given by:

$$x = \sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}}, \tag{1.1}$$

where $D = \frac{q^2}{4} + \frac{p^3}{27}$.
But how do we obtain this? And what about the more general cubic with a non-zero $x^2$ term? To answer the latter question first, it turns out that we can reduce the solution of any cubic polynomial to the solution of a cubic without the $x^2$ term: such a cubic polynomial is called a depressed cubic.

Consider the polynomial $f(x) = x^3 + ax^2 + bx + c$. Substitute $x = y + r$ to obtain

$$f(x) = y^3 + (3r + a)y^2 + (3r^2 + 2ar + b)y + (r^3 + ar^2 + br + c).$$

If we choose $r = -a/3$, then we obtain a depressed cubic in $y$; let’s call it $g(y)$. Thus to solve $f(x) = 0$, we can solve $g(y) = 0$ and then find the corresponding roots of $f$ via $x = y + r$.
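
As a hedged illustration of the substitution $x = y + r$ with $r = -a/3$, the following Python sketch computes the coefficients of the depressed cubic exactly, using rational arithmetic. The function name `depress_cubic` and the shape of its return value are choices made here, not notation from these notes.

```python
from fractions import Fraction


def depress_cubic(a, b, c):
    """Given f(x) = x^3 + a x^2 + b x + c, return (r, p, q) such that
    f(y + r) = y^3 + p y + q, where r = -a/3."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    r = -a / 3
    # Coefficients read off from the expansion of f(y + r);
    # the y^2 coefficient 3r + a vanishes for this choice of r.
    p = 3 * r * r + 2 * a * r + b
    q = r ** 3 + a * r * r + b * r + c
    return r, p, q


# f(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
r, p, q = depress_cubic(-6, 11, -6)
print(r, p, q)   # r = 2, and g(y) = y^3 - y has roots -1, 0, 1, i.e. x = 1, 2, 3
```

Exact fractions are used so that the vanishing of the $y^2$ coefficient is not obscured by rounding.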

History
Around AD 1500, Scipione del Ferro, professor at the University of Bologna, discovered a formula for depressed cubic equations (cubic equations with a missing $x^2$ term) and, shortly before his death in 1526, communicated it to his disciple, Antonio del Fiore. After del Ferro’s death, around 1535, Fiore issued a challenge to another mathematician, Tartaglia, with a list of 30 problems, all of whose solutions depended on knowing how to solve the cubic equations $x^3 + px = q$ and $x^3 = px + q$.
Interestingly, Tartaglia had, five years earlier, independently figured out solutions to $x^3 + ax^2 = b$ and $x^3 = ax^2 + b$. After accepting the challenge, he soon figured out how to solve the other kind of cubic equation and thus solved all 30 problems posed by Fiore; he himself posed both kinds of equations in the counter-challenge, which Fiore could not solve, and hence won the duel. Here are two of the equations that Tartaglia solved in the duel: $x^3 + x = 12$ and $x^3 + 3x = 15$.
After the duel, Tartaglia was approached by Gerolamo Cardano, who asked him to share his secret, as Cardano was writing a book on arithmetic. [Incidentally, Cardano was also the first to write a book on probability, although he made many mistakes in it.] Initially, Tartaglia refused, but he later shared his formula by means of a poem, on the promise that Cardano would keep it secret.
Cardano kept the secret for some time, but a few things changed his mind. Firstly, he himself figured out how to reduce the most general cubic equation to a depressed cubic; secondly, his student Ferrari figured out how to solve the biquadratic equation, i.e. an equation of degree four. Thirdly, he visited del Ferro’s house, examined his manuscripts, and was convinced that del Ferro was the original discoverer of the solution to the cubic. He published the solutions in his new book Ars Magna, which led to a falling-out between Cardano and Tartaglia.

1.1.3 Solving a depressed cubic


We consider the polynomial $f(x) = x^3 + px + q$. Suppose we write $x = \sqrt[3]{u} + \sqrt[3]{v}$. Then

$$x^3 = u + v + 3\sqrt[3]{uv}\,x. \tag{1.2}$$

Now we notice that if $u + v = -q$ and $3\sqrt[3]{uv} = -p$, then $x = \sqrt[3]{u} + \sqrt[3]{v}$ satisfies $f(x) = 0$. The second condition says $uv = -\frac{p^3}{27}$, so we can find such a pair $u, v$ as the roots of the quadratic polynomial $z^2 + qz - \frac{p^3}{27}$. Thus we find $u = -\frac{q}{2} + \sqrt{D}$, $v = -\frac{q}{2} - \sqrt{D}$, where $D = \frac{q^2}{4} + \frac{p^3}{27}$. The corresponding root is:

$$x = \sqrt[3]{-\frac{q}{2} + \sqrt{D}} + \sqrt[3]{-\frac{q}{2} - \sqrt{D}}.$$
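
The formula can be tried out in floating point. The sketch below is illustrative only: the names `real_cbrt` and `cardano_real_root` are made up here, and the code assumes $D \geq 0$, where both cube roots are real. When $D < 0$ the cubic has three real roots and the formula needs complex cube roots, which this sketch deliberately does not handle.

```python
import math


def real_cbrt(t):
    """Real cube root, valid for negative t as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)


def cardano_real_root(p, q):
    """Return a real root of x^3 + p x + q = 0 via formula (1.1), assuming D >= 0."""
    D = q * q / 4 + p ** 3 / 27
    if D < 0:
        raise ValueError("D < 0: this sketch only covers the case D >= 0")
    s = math.sqrt(D)
    u = -q / 2 + s
    v = -q / 2 - s
    return real_cbrt(u) + real_cbrt(v)


# x^3 + 6x - 20 = 0 has the root x = 2 (here u = 10 + sqrt(108), v = 10 - sqrt(108)).
print(cardano_real_root(6, -20))

# One of the equations from the duel: x^3 + x = 12, i.e. p = 1, q = -12.
x = cardano_real_root(1, -12)
print(x, x ** 3 + x)   # the second value should be close to 12
```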

Exercise 1.1 Find a root of the polynomial $x^3 + x^2 - 10$. ■

1.1.4 Quartic and higher powers: More history


Quartic (fourth-degree) polynomials also have a closed-form expression for their roots, found by Ferrari (as mentioned in the history), and naturally this led to the question of a formula for polynomials of degree five and higher. By formula, we mean an expression using the four standard arithmetic operations plus the operation of taking $n$th roots. No such formula was found despite many attempts, and around 1800, Ruffini (and later Abel) proved that no general formula can exist. This does not, however, rule out individual polynomials having roots expressed in terms of radicals; it only rules out the existence of a common formula that works for all polynomials.
Galois in 1830 figured out the exact conditions under which a polynomial has a radical solution; in particular, most polynomials of degree five and higher do not have closed-form expressions for their roots. The simplest example of such a polynomial is $x^5 - x - 1$. Galois thus solved the problem completely, and the theory he built (further refined by subsequent mathematicians) is called Galois theory.
In a different direction, the fact that every polynomial of degree $n$ has exactly $n$ complex roots, counted with multiplicity (the fundamental theorem of algebra), was proved by d’Alembert, Gauss and Argand around 1800.

1.2 The Equation ax + by = c


In contrast to solving equations over the reals ($\mathbb{R}$) or over the complex numbers ($\mathbb{C}$), the focus in number theory is on equations over the integers ($\mathbb{Z}$) or over the rational numbers ($\mathbb{Q}$).
Such equations are called Diophantine equations in honor of Diophantus (AD 250). Diophantus wrote a book called Arithmetica in which he collected over 200 problems and explained their solutions. He was mainly interested in solutions in terms of positive rationals; for many problems he found general parametric solutions. His book famously inspired Fermat to write in its margins. He was also one of the first persons to use symbolic notation (combined with reasoning in words).

1.2.1 Bezout’s Lemma


The simplest equations over $\mathbb{Z}$ are, as in the case of the reals, linear equations. Aryabhatta was the first to explain how to solve an equation of the form $ax + by = c$; an example that he used was $137x + 10 = 60y$.
First, let’s look at an example without a solution. Consider the equation $4x + 6y = 5$. This has no solution in integers because 2 divides the LHS but not the RHS. In general, we see that if $d \mid a$ and $d \mid b$, then for $ax + by = c$ to have a solution, $d$ must divide $c$. In particular, the greatest common divisor of $a$ and $b$ must divide $c$. This is a necessary condition, but it also turns out to be sufficient.
Theorem 1.1 (Bezout’s Lemma). Let $a, b, c$ be natural numbers. The equation $ax + by = c$ has a solution in integers if and only if $\gcd(a, b)$ divides $c$.

Proof. Necessity was shown above. For sufficiency, it is enough to prove that we can express $\gcd(a, b)$ as $ax + by$ for some integers $x, y$, since multiplying such an expression by $c/\gcd(a, b)$ then gives a solution of $ax + by = c$. Let $d = \gcd(a, b)$. For a given value of $d$, we prove by induction on $a + b$ (over pairs $(a, b)$ of non-negative integers with $\gcd(a, b) = d$) that $d$ can be written as $ax + by$. The base case is when $a + b = d$, i.e. (up to swapping $a$ and $b$) $a = d$ and $b = 0$. In this case, clearly $x = 1, y = 0$ is a solution. Now consider an arbitrary pair $(a, b)$ with $\gcd(a, b) = d$ and, without loss of generality, let $a \geq b > 0$. Then $\gcd(a - b, b) = d$ and $(a - b) + b < a + b$, so by the induction hypothesis there are integers $x, y$ with $d = (a - b)x + by$. This implies that $d = ax + b(y - x)$, and thus $d$ is an integer-linear combination of $a, b$ as desired. This completes the proof. ■

We make some remarks. Firstly, the proof can be made algorithmic by finding successively smaller pairs $(a, b)$ till we reach the pair $(d, 0)$ and then working backwards. A simple recursive algorithm is the following:

Algorithm 1 Recursive-Bezout
procedure Simple-Euclid(a, b)    ▷ Returns (x, y) with ax + by = gcd(a, b)
    if a = 0 then return (0, 1)
    end if
    if b = 0 then return (1, 0)
    end if
    if a ≥ b then
        (x, y) ← Simple-Euclid(a − b, b)    ▷ gcd(a, b) = (a − b)x + by
        return (x, y − x)
    else
        (x, y) ← Simple-Euclid(a, b − a)    ▷ gcd(a, b) = ax + (b − a)y
        return (x − y, y)
    end if
end procedure
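
A direct Python transcription of this subtractive recursion might look as follows. It is a sketch for small inputs only, under the assumption that the smaller argument is always subtracted from the larger one: the recursion depth grows with the quotient $a/b$, so Python’s recursion limit is reached quickly for pairs such as $(10^6, 1)$.

```python
def simple_euclid(a, b):
    """Return (x, y) with a*x + b*y == gcd(a, b), for non-negative a, b (not both 0)."""
    if a == 0:
        return (0, 1)          # gcd(0, b) = b = 0*a + 1*b
    if b == 0:
        return (1, 0)          # gcd(a, 0) = a = 1*a + 0*b
    if a >= b:
        x, y = simple_euclid(a - b, b)   # d = (a - b)x + by = ax + b(y - x)
        return (x, y - x)
    x, y = simple_euclid(a, b - a)       # d = ax + (b - a)y = a(x - y) + by
    return (x - y, y)


x, y = simple_euclid(8, 3)
print(x, y, 8 * x + 3 * y)   # the last value is gcd(8, 3) = 1
```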

Now the second remark: as in Euclid’s algorithm, we can make this more efficient by considering not just $a - b$ but $a - qb$ for $q = \lfloor a/b \rfloor$. We first look at Euclid’s algorithm, which finds the gcd of two numbers in this way.

1.2.2 Euclid’s Algorithm


Euclid (around 350 BC) wrote his algorithm in his famous book The Elements, along with a few other statements in number theory.

Algorithm 2 Euclid’s algorithm

procedure Euclid(a, b)    ▷ Returns gcd(a, b)
    A ← max(a, b), B ← min(a, b)
    while B ≠ 0 do
        r ← A mod B    ▷ gcd(A, B) = gcd(B, r)
        A ← B
        B ← r
    end while
    return A
end procedure
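
For readers who want to run it, here is a short Python rendering of Algorithm 2. The function name `euclid` is chosen here; the comparison with the standard library’s `math.gcd` is included only as a sanity check.

```python
import math


def euclid(a, b):
    """Return gcd(a, b) for non-negative integers a, b, following Algorithm 2."""
    A, B = max(a, b), min(a, b)
    while B != 0:
        r = A % B          # gcd(A, B) = gcd(B, r)
        A, B = B, r
    return A


print(euclid(48, 18), math.gcd(48, 18))   # both values should agree
```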

One of our concerns in this course will be the design of efficient algorithms, often
algorithms running in time polynomial in the input size. How efficient is Euclid’s
algorithm in terms of its input size?
Theorem 1.2 Let $m = \max(\log_2 a, \log_2 b)$. Then the number of iterations in Euclid’s algorithm is at most $2m$. Further, the time complexity of each iteration is at most $O(m \log m)$.

Proof. The last statement follows from the fact that the complexity of each iteration is essentially the cost of integer division, which has the same complexity as integer multiplication. In 2019, Harvey and van der Hoeven announced an $O(n \log n)$ time algorithm to multiply two $n$-bit integers; for comparison, the complexity of FFT-based integer multiplication is $O(n \log^2 n)$. Thus the cost of all the basic arithmetic operations performed on two $n$-bit integers is $\tilde{O}(n)$.
To prove the first statement, let $(A_i, B_i)$ denote the value of the pair after $i$ iterations, with $(A_0, B_0) = (\max(a, b), \min(a, b))$. Further, let $A_i = B_i q_i + r_i$ with $0 \leq r_i < B_i$. Note that $A_{i+1} = B_i$ and $A_{i+2} = B_{i+1} = r_i < B_i$; also $q_i \geq 1$ because $A_i \geq B_i$.
We have $A_i = B_i q_i + A_{i+2} \geq B_i + A_{i+2} > 2A_{i+2}$. The $A_i$ reduce by a factor of at least 2 after every two iterations, so that the number of iterations is at most $2 \log_2 A_0 = 2m$. ■
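
The iteration bound can be observed experimentally. The sketch below (the helper name `euclid_steps` is an assumption of this example) counts loop iterations and compares them with twice the bit length of the larger input, which is at least $2\log_2 A_0$. Consecutive Fibonacci numbers are used because they are the classical slow inputs for Euclid’s algorithm.

```python
def euclid_steps(a, b):
    """Run Euclid's algorithm and return (gcd, number of loop iterations)."""
    A, B = max(a, b), min(a, b)
    steps = 0
    while B != 0:
        A, B = B, A % B
        steps += 1
    return A, steps


# Consecutive Fibonacci numbers: every quotient is 1, so progress is slowest.
g, steps = euclid_steps(89, 55)
print(g, steps, 2 * (89).bit_length())   # gcd, iterations used, and the bound
```

On this input the loop runs 9 times, comfortably below the bound of 14.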

1.2.3 The Extended Euclidean algorithm


We now look at the extended Euclid’s algorithm which, in addition to finding gcd(a, b),
finds two integers x, y such that ax + by = gcd(a, b).

Algorithm 3 Extended Euclid’s algorithm

procedure Extended-Euclid(a, b)    ▷ Finds x, y such that A0·x + B0·y = gcd(a, b), where A0 = max(a, b), B0 = min(a, b)
    A ← max(a, b), B ← min(a, b)
    x ← 1, y ← 0    ▷ A0·x + B0·y = A will be invariant.
    u ← 0, v ← 1    ▷ A0·u + B0·v = B will be invariant.
    while B ≠ 0 do
        r ← A mod B    ▷ gcd(A, B) = gcd(B, r)
        q ← ⌊A/B⌋
        A ← B
        B ← r
        (x, u) ← (u, x − qu)
        (y, v) ← (v, y − qv)
    end while
    return (x, y, A)
end procedure
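
A Python version of Algorithm 3 could be written as below. This is a sketch: the function name is chosen here, and, as in the pseudocode, the returned coefficients refer to the pair (max(a, b), min(a, b)), not to (a, b) in the order given.

```python
def extended_euclid(a, b):
    """Return (x, y, g) with max(a, b)*x + min(a, b)*y == g == gcd(a, b)."""
    A, B = max(a, b), min(a, b)
    x, y = 1, 0     # invariant: A0*x + B0*y == A
    u, v = 0, 1     # invariant: A0*u + B0*v == B
    while B != 0:
        q, r = divmod(A, B)
        A, B = B, r
        x, u = u, x - q * u
        y, v = v, y - q * v
    return x, y, A


x, y, g = extended_euclid(3, 8)
print(x, y, g)            # -1 3 1
print(8 * x + 3 * y)      # 1
```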

How the algorithm works: The key observation is that the pair $(A_{i+1}, B_{i+1})$ is obtained from $(A_i, B_i)$ by a linear transformation, namely:

$$(A_{i+1}, B_{i+1}) = (A_i, B_i) \begin{pmatrix} 0 & 1 \\ 1 & -q_i \end{pmatrix} \tag{1.3}$$

Let $M = \prod_i \begin{pmatrix} 0 & 1 \\ 1 & -q_i \end{pmatrix}$, with the factors multiplied in order of increasing $i$. Then we have $(A_0, B_0) M = (\gcd(a, b), 0)$. Thus the first column of $M$ yields a solution to $A_0 x + B_0 y = \gcd(a, b)$, where $(A_0, B_0) = (\max(a, b), \min(a, b))$. Also, the equality $A_0 x + B_0 y = A$ is maintained as an invariant after every iteration.
An illustration of the algorithm for a = 3, b = 8:
A    B    x    y    u    v    q
8    3    1    0    0    1    2
3    2    0    1    1   -2    1
2    1    1   -2   -1    3    2
1    0   -1    3    3   -8
Thus, we find that the gcd is 1 and (−1, 3) is a solution to 8x + 3y = 1. We may
check that 8x + 3y = A is valid in each iteration.
Exercise 1.2 Find all integers x, y such that 86x + 197y = 1. ■

Exercise 1.3 Find three integers x, y, z such that 6x + 10y + 15z = 1. ■

Exercise 1.4 Find all integers x, y, z such that 2x + 3y + 5z = 0. ■

1.3 Classroom Exercises


Exercise 1.5 Find an integer solution to 32x+57y=1. ■

Solution: We iterate through the extended Euclidean algorithm.


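A    B    x    y    u    v    q
57   32   1    0    0    1    1
32   25   0    1    1   -1    1
25   7    1   -1   -1    2    3
7    4   -1    2    4   -7    1
4    3    4   -7   -5    9    1
3    1   -5    9    9  -16    3
1    0    9  -16  -32   57
The final row gives 57 · 9 + 32 · (−16) = 1, where x = 9 is the coefficient of A0 = 57 and y = −16 is the coefficient of B0 = 32. Thus, an integral solution of 32x + 57y = 1 is x = −16, y = 9.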
Exercise 1.6 Find all integer solutions to 4x+7y=1. ■

Solution: We may find one solution by inspection or using the Euclidean algorithm. For example, x = 2, y = −1 is a solution. We may now use this to find all solutions: if (x, y) is a solution, then we have 4(x − 2) + 7(y + 1) = 0. We observe that 7 divides 4(x − 2), and since gcd(4, 7) = 1, this gives 7|(x − 2). Let x = 7k + 2; this implies that y = −4k − 1. Thus, the general solution is given by {(7k + 2, −4k − 1) : k ∈ Z}.
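
The same reasoning gives the general solution of $ax + by = c$ whenever $\gcd(a, b)$ divides $c$. The following Python sketch packages it; the function names and the `(x0, y0, dx, dy)` return convention are assumptions made for this illustration. Every solution then has the form $(x_0 + k\,dx,\; y_0 + k\,dy)$ for an integer $k$.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for non-negative a, b."""
    a0, b0 = a, b
    x, y, u, v = 1, 0, 0, 1      # invariants: a0*x + b0*y == a, a0*u + b0*v == b
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        x, u = u, x - q * u
        y, v = v, y - q * v
    assert a0 * x + b0 * y == a
    return a, x, y


def solve_linear_diophantine(a, b, c):
    """Return (x0, y0, dx, dy) describing all integer solutions of a*x + b*y = c,
    or None if gcd(a, b) does not divide c. Assumes a, b are positive."""
    g, x, y = extended_gcd(a, b)
    if c % g != 0:
        return None                       # Bezout: no solution exists
    scale = c // g
    return x * scale, y * scale, b // g, -(a // g)


print(solve_linear_diophantine(4, 7, 1))   # (2, -1, 7, -4): solutions (2 + 7k, -1 - 4k)
print(solve_linear_diophantine(4, 6, 5))   # None, since gcd(4, 6) = 2 does not divide 5
```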

Exercise 1.7 Prove that if d|a and d|b, then d|gcd(a, b). ■

Solution: By Bezout’s lemma, there exist integers x, y such that ax + by = gcd(a, b). Since d divides the LHS, it must divide the RHS, which proves the claim.
2. The Fundamental Theorem of Arithmetic

2.1 Prime numbers


A natural number n is defined to be a prime number if it has exactly 2 divisors (namely 1 and n). The sequence of prime numbers begins 2, 3, 5, 7, . . . The first interesting fact about prime numbers is that there are infinitely many of them.
Theorem 2.1 There are infinitely many prime numbers.

2.2 Euclid’s lemma


Lemma 1 If p is prime and p|ab, then p|a or p|b.

Proof. Suppose for contradiction that $p$ divides neither $a$ nor $b$. Since $p$ is prime, $\gcd(p, a) = 1$, and by Bezout’s lemma there exist integers $x, y$ such that $px + ay = 1$. Similarly, $\gcd(p, b) = 1$ and there exist integers $u, v$ such that $pu + bv = 1$. Multiplying the two relations, we get $(px + ay)(pu + bv) = 1$. Expanding the LHS gives $p^2xu + pxbv + payu + ab\,yv$; here $p$ divides each term (the last one because $p \mid ab$), so $p$ divides the LHS. But the RHS is equal to 1, which is a contradiction. This proves the lemma. ■

2.3 The fundamental theorem


Theorem 2.2 Every natural number $n > 1$ can be uniquely factored into a product of prime numbers, i.e. we can write $n = p_1 p_2 \cdots p_k$, where the $p_i$ are prime (not necessarily distinct), and if $n = q_1 q_2 \cdots q_l$ with each $q_i$ prime, then $k = l$ and $\{q_1, q_2, \ldots, q_l\}$ is equal (as a multiset) to $\{p_1, p_2, \ldots, p_k\}$.

Proof. There are two statements to prove: (a) that every natural number larger than 1 can be factored into primes and (b) that the factorization is unique (up to ordering).
We first prove (a) by induction. The first few base cases are 2, 3, 4, which we see have a prime factorization. The induction step: we consider an arbitrary natural number $n > 1$ and inductively assume that $2, \ldots, n - 1$ have a prime factorization. If $n$ is prime, then we are done; otherwise let $n = ab$ with $1 < a, b < n$. By the induction hypothesis, $a$ and $b$ have prime factorizations. The two factorizations may be combined to give a factorization for $n$. This proves (a).
We now prove (b), also by induction. As before, we may verify that (b) holds for the base cases 2, 3, 4. For the induction step, we consider an arbitrary $n > 1$, assuming that (b) holds for all numbers smaller than $n$. Suppose that $n = p_1 p_2 \cdots p_k = q_1 q_2 \cdots q_l$. Since $p_1$ divides the product $q_1 q_2 \cdots q_l$, we apply Euclid’s lemma (repeatedly) to deduce that $p_1$ divides some $q_i$. Since $q_i$ is prime, we have $p_1 = q_i$. Now we apply the induction hypothesis to $n/p_1 = p_2 \cdots p_k = \prod_{j \neq i} q_j$ and conclude that $\{p_2, \ldots, p_k\} = \{q_j : j \neq i\}$ (and $k - 1 = l - 1$). Together with $p_1 = q_i$, this gives $\{p_1, \ldots, p_k\} = \{q_1, \ldots, q_l\}$, completing the proof of (b). ■

As a consequence of the fundamental theorem, we can write every natural number $n > 1$ as $p_1^{e_1} \cdots p_k^{e_k}$ with distinct primes $p_i$ and positive integer exponents $e_i$.
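
To make the prime-power form concrete, here is a small Python sketch that computes it by trial division. Trial division is chosen here purely for simplicity and is only practical for small $n$; the name `factorize` and the dictionary representation (prime mapped to exponent) are assumptions of this example.

```python
def factorize(n):
    """Return {prime: exponent} for an integer n > 1, using trial division."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:              # strip out every copy of the factor p
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                          # whatever remains is a prime larger than sqrt(original n)
        factors[n] = factors.get(n, 0) + 1
    return factors


print(factorize(360))   # {2: 3, 3: 2, 5: 1}, i.e. 360 = 2^3 * 3^2 * 5
```

Only primes ever get recorded as keys: by the time a composite trial divisor is reached, all of its prime factors have already been divided out.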

Exercise 2.1 Show that if gcd(a, b) = 1, then gcd(ab, c) = gcd(a, c)gcd(b, c). Deduce
that if gcd(a, b) = 1 and a|bc, then a|c. ■
3. Congruences

3.1 Definition and properties


Let a, b ∈ Z and n > 1 be a natural number. We say that a ≡ b (mod n) (read as a is
congruent to b modulo n) if a − b is divisible by n.

Examples: 17 ≡ 3 (mod 7), −20 ≡ −8 (mod 3), 360 ≡ 0 (mod 60).

Properties satisfied by congruences:

1. If a ≡ b (mod n) and c ≡ d (mod n), then a + c ≡ b + d (mod n).


2. If a ≡ b (mod n) and c ≡ d (mod n), then ac = bd (mod n).
3. Note that division doesn’t work the same way as reals: in general, ac ≡ bc (mod
n) does not imply that a ≡ b (mod n) OR c ≡ 0 (mod n). A counter-example
is n = 6, a = 4, b = 2, c = 3.
To understand what we can deduce from ac ≡ bc (mod n), we rewrite it as
(a − b)c ≡ 0 (mod n). Now we see that in some cases, we can draw some
conclusions. For example, if gcd(c, n) = 1, then we can conclude that a ≡ b
(mod n). Also, if n is prime, then we can conclude that n divides one of
(a − b), c, i.e. a ≡ b (mod n) or c ≡ 0 (mod n).
4. The congruence relation is an equivalence relation, i.e. it satisfies the following
properties:
(i) Reflexivity: a ≡ a (mod n);
(ii) Symmetry: a ≡ b (mod n) implies that b ≡ a (mod n);
(iii) Transitivity: a ≡ b (mod n) and b ≡ c (mod n) implies that a ≡ c (mod n).
An equivalence relation partitions the set into equivalence classes, in this
case congruence classes. The congruence classes for a given n are {0, ±n, ±2n, ...},
{1, 1 ± n, 1 ± 2n, ...}, ..., {n − 1, n − 1 ± n, n − 1 ± 2n, ...}.
We consider the numbers {0, 1, . . . , n − 1} to be the canonical representatives of
these congruence classes, as these numbers are the remainders when divided
by n.

3.2 Arithmetic in Zn
The basic arithmetic operations of addition and subtraction have time complexity
O(log n); the complexity of multiplication is O(log n log log n); the complexity of
finding a^b (mod n) (by repeated squaring) is O(log b log n log log n), which is
O(log^2 n log log n) when b ≤ n.
An example of exponentiation by repeated squaring: we find 3^100 modulo 35 as
follows: we first find the values of 3^k modulo 35 for k being a power of two.

k             1  2  4   8   16  32  64
3^k (mod 35)  3  9  11  16  11  16  11

We now find:

3^100 ≡ 3^64 · 3^32 · 3^4 (mod 35)
      ≡ 11 × 16 × 11 (mod 35)
      ≡ 11 (mod 35)
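
The repeated-squaring idea can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation; the function name power_mod is our own choice, and Python's built-in pow(a, b, n) computes the same value.

```python
def power_mod(a, b, n):
    """Return a**b mod n by repeated squaring (assumes b >= 0 and n >= 1)."""
    result = 1 % n
    square = a % n              # holds a^(2^i) mod n at step i
    while b > 0:
        if b & 1:               # bit i of b is set: multiply in a^(2^i)
            result = (result * square) % n
        square = (square * square) % n
        b >>= 1
    return result


print(power_mod(3, 100, 35))    # 11, matching the worked example
```

The loop runs once per bit of b, which is where the log b factor in the running time comes from.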

What about division? A reasonable definition of b/a modulo n is to define it as
the integer c ∈ {0, 1, ..., n − 1} such that b ≡ ac (mod n). Ideally, we would like the
solution to be unique: this is possible if gcd(a, n) = 1 (see next section).

3.3 Linear congruences and the Chinese Remainder Theorem

Consider the following linear congruence in one variable:

ax ≡ b(mod n). (3.1)

We can rewrite the above equation as ax = b + ny; thus we see that this equation has
a solution if and only if gcd(a, n) divides b. It is usually convenient to divide this
equation on both sides by gcd(a, n); translating this back to congruences, this means
considering Equation 3.1 when gcd(a, n) = 1. In this case, it turns out that the solution
is unique.
Theorem 3.1 Given a ∈ {0, 1, . . . , n − 1} such that gcd(a, n) = 1, the congruence
ax ≡ b ( mod n) has a unique solution modulo n.

Proof. Suppose that x1 , x2 are two solutions to the given congruence equation. Then
ax1 ≡ b (mod n) and ax2 ≡ b (mod n); subtracting we get:

a(x1 − x2 ) ≡ 0( mod n).

Since gcd(a, n) = 1, this implies that n divides x1 − x2 , i.e. x1 ≡ x2 (mod n). ■
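
As a rough sketch of how the unique solution can be computed when gcd(a, n) = 1, the following Python code uses the extended Euclidean algorithm to find u with au ≡ 1 (mod n) and returns ub mod n. The function names are illustrative choices, and the case gcd(a, n) > 1 is deliberately left out.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and a*u + b*v = g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def solve_linear_congruence(a, b, n):
    """Return the x in {0, ..., n-1} with a*x ≡ b (mod n), assuming gcd(a, n) = 1."""
    g, u, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("this sketch only handles gcd(a, n) = 1")
    return (u * b) % n          # a*u ≡ 1 (mod n), hence a*(u*b) ≡ b (mod n)


x = solve_linear_congruence(5, 3, 7)
print(x, (5 * x) % 7)
```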

Without the assumption that gcd(a, n) = 1, how many solutions does the linear
congruence 3.1 have? You will figure this out in an exercise below!

Exercise 3.1 Find all the solutions in {0, 1, . . . , 24} to 10x ≡ 15 (mod 25). ■

Exercise 3.2 Suppose that gcd(a, n) divides b. How many distinct solutions to
ax ≡ b (mod n), modulo n, are there? Justify your answer. [Hint: You may be
able to first guess the answer from what you find in the previous exercise.] ■

The second type of linear congruence that we look at is simultaneous congruences,
the most general problem being the following: find all x ∈ Z such that x ≡ a_1 (mod
n_1), x ≡ a_2 (mod n_2), ..., x ≡ a_k (mod n_k). That is, we want the integers x that
simultaneously satisfy all the k congruences.
We first look at the simplest case, i.e. k = 2. Consider the following pair of
congruences:

x ≡ a( mod n), x ≡ b( mod m). (3.2)

We have x = a + ny = b + mz; thus we obtain the equation ny − mz = b − a. This
equation has a solution for y, z in integers if and only if gcd(n, m) divides (b − a). As
before, we simplify further and first consider the case that gcd(n, m) = 1. In this
case, 3.2 has a solution, and further, this solution is unique modulo mn.
Theorem 3.2 Let gcd(m, n) = 1. Then the map f : {0, 1, . . . , mn−1} → {0, 1, . . . , m−
1} × {0, 1, . . . , n − 1}, given by f (x) = (x1 , x2 ) where x ≡ x1 (mod m) and x ≡ x2
(mod n), is a bijection.

Proof. The fact that this map is surjective follows from the preceding paragraph:
given (x_1, x_2), we may use the extended Euclidean algorithm to find integers y, z such that
my − nz = x_2 − x_1; then x = x_1 + my = x_2 + nz satisfies x ≡ x_1 (mod m) and x ≡ x_2 (mod n),
and reducing x modulo mn gives a pre-image of (x_1, x_2) in {0, 1, ..., mn − 1}.
From the fact that this map is surjective and the fact that the domain and co-domain
are finite sets of the same size, we may already conclude that f is a bijection.
Nevertheless, we may also prove that f is injective, as follows. Let x, y be two
different numbers in {0, 1, ..., mn − 1}, and assume without loss of generality that x > y.
Then x − y ∈ {1, ..., mn − 1}, hence x − y is
not divisible by mn. Since gcd(m, n) = 1, we can conclude: x − y is not divisible by
m OR x − y is not divisible by n. In either case we have f(x) ≠ f(y), proving that f
is an injective map. ■

Example: Consider m = 3, n = 4. The values of f(0), f(1), ..., f(11) are, in order:
(0, 0), (1, 1), (2, 2), (0, 3), (1, 0), (2, 1), (0, 2), (1, 3), (2, 0), (0, 1), (1, 2), (2, 3).
The above result extends naturally to multiple moduli.
Theorem 3.3 — Chinese Remainder Theorem. Suppose that n_1, ..., n_k are natural
numbers such that gcd(n_i, n_j) = 1 for every i ≠ j. Then the congruences x ≡ a_1
(mod n_1), ..., x ≡ a_k (mod n_k) have a unique solution modulo n_1 ... n_k. Equivalently,
the map f : {0, 1, ..., n_1 ... n_k − 1} → {0, 1, ..., n_1 − 1} × ... × {0, 1, ..., n_k − 1} given
by f(x) = (x_1, ..., x_k) with x ≡ x_i (mod n_i) for all i, is a bijection.
Further, we can find this unique value of x in polynomial time.

Proof Sketch: The first part follows by a repeated application of Theorem 3.2. For
the second part, we may use the Extended Euclidean algorithm to successively solve
pairs of congruences. ■
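
The "successively solve pairs of congruences" step might look as follows in Python. This is a sketch under the assumption that the moduli are pairwise coprime and greater than 1; the names crt_pair and crt are ours, and pow(n, -1, m) (available from Python 3.8) is used as a stand-in for the extended Euclidean algorithm.

```python
def crt_pair(a, n, b, m):
    """Solve x ≡ a (mod n), x ≡ b (mod m) for coprime n, m; return x in {0, ..., n*m - 1}."""
    # Find y with n*y ≡ b - a (mod m); then x = a + n*y satisfies both congruences.
    y = ((b - a) * pow(n, -1, m)) % m
    return (a + n * y) % (n * m)


def crt(residues, moduli):
    """Combine x ≡ residues[i] (mod moduli[i]) one pair at a time."""
    x, n = 0, 1
    for a_i, n_i in zip(residues, moduli):
        x = crt_pair(x, n, a_i, n_i)
        n *= n_i
    return x


print(crt([2, 3], [3, 4]))      # 11, since f(11) = (2, 3) in the example with m = 3, n = 4
```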

3.4 The ring Zn


We define Z_n to be the structure ({0, 1, ..., n − 1}, +, ×), where a + b = c if
a + b ≡ c (mod n) and a × b = c if ab ≡ c (mod n).
When we are working with the elements of Z_n, the congruence relation becomes an
equality; thus, in Z_7, we have: 5 + 4 = 2 and 5 × 4 = 6.
In this language, we may express Theorem 3.2 as: if gcd(m, n) = 1, then Z_mn ≅
Z_m × Z_n. This is read as the two structures being isomorphic; this means that the bijective
map f satisfies: f(x + y) = f(x) + f(y) and f(xy) = f(x)f(y).
We define Z_n^* = {a ∈ Z_n | gcd(a, n) = 1}. If gcd(m, n) = 1, then the bijection from Z_mn
to Z_m × Z_n is also a bijection from Z_mn^* to Z_m^* × Z_n^*.
For a set R with two binary operations (+, ×), we call (R, +, ×) a ring if the following
properties are satisfied:
(a)[Closure] For every a, b ∈ R, we have: a + b ∈ R and a × b ∈ R;
(b)[Identities] We have distinct elements 0, 1 ∈ R such that a + 0 = 0 + a = a and
a × 1 = 1 × a = a for every a ∈ R;
(c)[Associativity] For all a, b, c ∈ R, we have: a + (b + c) = (a + b) + c and a × (b × c) =
(a × b) × c.
(d)[Distributivity] For all a, b, c ∈ R, we have: a(b + c) = ab + ac and (a + b)c = ac + bc.
(e)[Commutativity of addition] For all a, b ∈ R, we have: a + b = b + a.
(f)[Additive inverses] For all a ∈ R, there is an element b such that a + b = 0; we
may just write −a for this element.
If multiplication is also commutative, then we call R a commutative ring.

3.5 Fermat’s little theorem


Theorem 3.4 Let p be a prime. Then, for every natural number a, we have:

a^p ≡ a (mod p). (3.3)

Equivalently, we may say: for every natural a such that p does not divide a, we
have:

a^{p−1} ≡ 1 (mod p). (3.4)

Proof. We give two proofs.
Proof by mathematical induction on a: We prove Eqn 3.3 for every natural number a.
The base case is a = 1; for the induction step, write

(a + 1)^p = a^p + Σ_{i=1}^{p−1} C(p, i) a^{p−i} + 1,

where C(p, i) denotes the binomial coefficient;
observe that every term in the summation is divisible by p and complete using the
induction hypothesis.
Proof by a bijection: We assume that p does not divide a and observe that
a, 2a, ..., (p − 1)a must all have distinct non-zero values modulo p (to see this, consider their
differences). This implies that the set of these values modulo p must be the same (possibly
in different order) as {1, 2, ..., p − 1}. Now we note that a × 2a × ... × (p − 1)a ≡
(p − 1)! (mod p), that is, a^{p−1} (p − 1)! ≡ (p − 1)! (mod p). We may divide by (p − 1)! on both sides; this is possible because
p does not divide (p − 1)!; this gives us the desired result. ■

3.6 Lagrange’s theorem


Theorem 3.5 Let p be a prime. Then a non-zero polynomial f(x) in Z_p[x] has at most
deg(f) roots.

Proof. We prove this by induction on the degree of f, with linear polynomials
being the base case. We already noted that ax − b has a unique root in Z_n when
gcd(a, n) = 1. Thus, if a ≠ 0 and a ∈ Z_p, then the polynomial (ax − b) has exactly
one root.
Now assume that f(x) is a polynomial of degree d > 1 and assume inductively that
the statement of the theorem is true for all polynomials of degree less than d.
If f has no roots, there is nothing to prove; otherwise, let x_0 be one root of f(x).
By the remainder theorem for polynomials, we have:
f(x) = (x − x_0)g(x). Now if x_1 is a root of f, then it must be the case that x_1 = x_0
or g(x_1) = 0. This is because in Z_p, if ab = 0, we must have a = 0 or b = 0.
Thus the number of roots of f is at most one plus the number of roots of g, and
using the induction hypothesis, this value is at most 1 + (d − 1) = d.
Remark: Since x_0 can be a multiple root of f, it may be cleaner/more rigorous
to first write f(x) = (x − x_0)^k h(x) for the largest value of k possible and use the
induction hypothesis on h(x). ■

3.7 Classroom Exercises


1. Find the remainder when 3^1000 is divided by 23.
Solution: We observe that 3^22 ≡ 1 (mod 23) by Fermat’s little theorem. Thus,
if we write 1000 = 22q + r, then we get: 3^1000 ≡ 3^{22q+r} ≡ 3^r (mod 23). We
find that 1000 = 22 × 45 + 10 so that 3^1000 ≡ 3^10 (mod 23), which we find by
repeated squaring to be 8.
2. Let p be prime and a ∈ Z_p, a ≠ 0. Find the number of solutions of ax = 1.
Solution: There’s a unique solution by Theorem 3.1; we denote this by a^{−1}
or 1/a.
3. Let p be prime. Find the number of solutions of xy = 1 in Z_p.
Solution: Using the previous exercise, for every x ≠ 0, there is one solution,
thus the number of solutions is p − 1.
4. Let p be prime and a, b be non-zero elements of Zp . Find the number of
solutions of ax + by = 1.
Solution: For every x, there is a unique value of y; thus the number of
solutions is p.

3.8 Euler’s totient function and Euler’s theorem


Euler’s totient function is defined for natural numbers as: ϕ(n) = |Z_n^*|. As observed
earlier, if gcd(m, n) = 1, then there’s a bijection between Z_mn^* and Z_m^* × Z_n^*, so that
we have ϕ(mn) = ϕ(m)ϕ(n) in this case.
In particular, if n = p_1^{e_1} ... p_k^{e_k} where the p_i are distinct primes, then we have:
ϕ(n) = ∏_i ϕ(p_i^{e_i}).
If p is prime, then ϕ(p^k) = p^k − p^{k−1} = p^k (1 − 1/p). Thus, we obtain:
Theorem 3.6 If n = p_1^{e_1} ... p_k^{e_k}, then

ϕ(n) = n (1 − 1/p_1) ... (1 − 1/p_k).
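
The product formula gives a simple, if slow, way to compute ϕ(n): find the prime factors by trial division and multiply in each factor (1 − 1/p). The Python sketch below assumes n ≥ 1 and is only meant for small inputs; the name euler_phi is our own.

```python
def euler_phi(n):
    """Euler's totient of n >= 1, via trial division and the product formula."""
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:       # strip every copy of the prime p
                n //= p
            result -= result // p   # multiply result by (1 - 1/p)
        p += 1
    if n > 1:                       # whatever remains is a single prime factor
        result -= result // n
    return result


print(euler_phi(360))               # 96
```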

We now present a generalization of Fermat’s little theorem.

Theorem 3.7 [Euler’s Theorem] If gcd(a, n) = 1, then a^{ϕ(n)} ≡ 1 (mod n). Equivalently,
if a ∈ Z_n^*, then a^{ϕ(n)} = 1 in Z_n.

Proof. We generalize the second proof of Fermat’s little theorem. Consider the set
Z_n^*. Firstly, we claim that if a ∈ Z_n^*, then the sets aZ_n^* and Z_n^* are equal, as subsets
of Z_n. To see this observe that if b ∈ Z_n^*, then ab ∈ Z_n^* as well. Thus, aZ_n^* ⊆ Z_n^*.
Also, if b_1, b_2 are two distinct elements of Z_n^*, then ab_1 ≠ ab_2, because b_1 − b_2 ≢ 0
(mod n) and gcd(a, n) = 1. Hence aZ_n^* has ϕ(n) elements and so equals Z_n^*.
Let P = ∏_{x ∈ Z_n^*} x and Q = ∏_{x ∈ aZ_n^*} x. We have: P = Q and Q = a^{ϕ(n)} P (in Z_n). Since
P ∈ Z_n^*, we may cancel P on both sides of the equation a^{ϕ(n)} P = P to obtain the
theorem statement. ■

3.9 Euclid’s algorithm and unique factorization for polynomials

Let p be prime. Given two polynomials f (x), g(x) ∈ Zp [x], we may write

g(x) = f (x)q(x) + r(x) (3.5)

with deg(r(x)) < deg(f(x)). We remark that a decomposition of the form 3.5 does not always exist over Z_n for n
composite, for example, if n = 4 and f(x) = 2x, g(x) = x.
The Extended Euclidean algorithm thus works for polynomials in Z_p[x] in the same way
as for integers; the number of division steps is at most linear in the degree of the polynomials, and the
running time is polynomial in the degree and in the size of the coefficients. In particular, it finds the gcd of two
polynomials, which we define below.
Definition 3.1 Given two polynomials f (x), g(x) ∈ Zp [x], we define the greatest
common divisor (gcd) of f (x) and g(x) as the unique monic polynomial of largest
degree that divides f (x) and g(x).

Why is the monic polynomial of largest degree that divides both polynomials unique?
To see this, we may argue that if h(x) is one such polynomial and t(x) some polynomial
that also divides both f (x) and g(x), then t(x) must divide h(x). From this, the
desired conclusion may be drawn.
As a consequence of the Euclidean algorithm, we also deduce Bezout’s lemma for
polynomials.
Theorem 3.8 — Bezout’s Lemma. Let p be prime and f (x), g(x) be polynomials in
Zp [x]. Then there exist polynomials u(x), v(x) such that f (x)u(x) + g(x)v(x) =
gcd(f (x), g(x)). Further, the Euclidean algorithm finds u(x), v(x) such that
deg(u(x)) < deg(g(x)) and deg(v(x)) < deg(f (x)).

The final analog of natural numbers for polynomials is unique factorization. We say
that a non-constant polynomial f(x) ∈ Z_p[x] is irreducible if f(x) = g(x)h(x) implies that g(x) or
h(x) is a non-zero constant.

Theorem 3.9 Every non-constant polynomial f(x) ∈ Z_p[x] can be factorized into
irreducible polynomials, uniquely up to the order of the factors and multiplication by non-zero constants.

An illustration of Euclid’s algorithm for f(x) = x^3 + x, g(x) = 5x^2 − 13x + 6 over Z_17:

A               B               r         q       u  v        s      t
x^3 + x         5x^2 − 13x + 6  −5x − 14  7x + 8  1  0        0      1
5x^2 − 13x + 6  −5x − 14        0         −x + 2  0  1        1      −7x − 8
−5x − 14        0                                 1  −7x − 8  x − 2  −7x^2 + 6x

Thus, the gcd of x^3 + x and 5x^2 − 13x + 6 in Z_17[x] is (−5x − 14)/(−5) = x − 4. We
further have: (x^3 + x)u(x) + (5x^2 − 13x + 6)v(x) = −5x − 14, where u(x) = 1 and
v(x) = −7x − 8.
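
To experiment with such computations, here is a small Python sketch of division with remainder and of the (non-extended) Euclidean algorithm in Z_p[x]. The representation is an assumption of this sketch: a polynomial is a list of coefficients with the constant term first, so x^3 + x is [0, 1, 0, 1]. The helper names are ours, and pow(c, -1, p) needs Python 3.8 or later.

```python
def trim(f):
    """Drop zero coefficients of the highest-degree terms; [] is the zero polynomial."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f


def poly_divmod(g, f, p):
    """Return (q, r) with g = f*q + r and deg r < deg f, coefficients in Z_p (p prime)."""
    g, f = trim([c % p for c in g]), trim([c % p for c in f])
    if not f:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [0] * max(len(g) - len(f) + 1, 1)
    inv_lead = pow(f[-1], -1, p)            # inverse of the leading coefficient of f
    r = g[:]
    while len(r) >= len(f):
        shift = len(r) - len(f)
        coeff = (r[-1] * inv_lead) % p
        q[shift] = coeff
        for i, c in enumerate(f):           # subtract coeff * x^shift * f from r
            r[i + shift] = (r[i + shift] - coeff * c) % p
        r = trim(r)
    return trim(q), r


def poly_gcd(f, g, p):
    """Monic gcd of f and g in Z_p[x]; assumes f and g are not both zero."""
    f, g = trim([c % p for c in f]), trim([c % p for c in g])
    while g:
        _, r = poly_divmod(f, g, p)
        f, g = g, r
    inv_lead = pow(f[-1], -1, p)
    return [(c * inv_lead) % p for c in f]


# x^3 + x and 5x^2 - 13x + 6 over Z_17: the gcd is x - 4, i.e. x + 13.
print(poly_gcd([0, 1, 0, 1], [6, -13, 5], 17))   # [13, 1]
```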

3.10 Classroom Exercises


1. Find ϕ(360).
Solution: ϕ(360) = ϕ(9)ϕ(8)ϕ(5) = 6 × 4 × 4 = 96.
2. Find the remainder when 7^1000 is divided by 360.
Solution: Algorithmically, repeated squaring is the best way to solve such a
problem; in this case, we may use Euler’s theorem to obtain: 7^1000 ≡ 7^40 (mod
360) and for the reduced exponent, we may apply repeated squaring.
3. Solve the following equations in Z_13.
a. x^2 = 2: No solution.
b. x^2 = 3: x = 4 and x = 9 are the solutions.
c. x^3 = 1: x = 1, x = 3, x = 9 are the solutions.
d. x^5 = 2. We combine this with x^12 = 1 as follows: We find k such that
5k + 12l = 1. Now, this gives us: x = x^{5k+12l} = 2^k.

3.11 The equations x^d = 1 and x^d = a in Z_p

In this section, we assume that p is prime and state all results for Z_p. We consider
solutions of the equation x^d = a, which rest on the consideration of two "opposite"
cases: d|(p − 1) and gcd(d, p − 1) = 1.

Theorem 3.10 Let d|(p − 1). Then the polynomial x^d − 1 has d distinct roots.

Proof. We know that x^{p−1} − 1 has p − 1 distinct roots, namely 1, 2, ..., p − 1. Further,
x^{p−1} − 1 is divisible by x^d − 1 because d|(p − 1). Thus, we can write x^{p−1} − 1 =
(x^d − 1)g(x) for a polynomial g(x) of degree p − 1 − d. If (x^d − 1) has fewer than d
roots, then the number of roots of the RHS would be at most d − 1 + deg(g) < p − 1
(because of Lagrange’s theorem applied to g(x)), which is a contradiction. This
completes the proof. ■

We remark that if f(x) is a polynomial with deg(f) roots, then for every divisor g(x)
of f(x), the polynomial g(x) must have deg(g) roots.

Theorem 3.11 Let gcd(d, p − 1) = 1 and a ∈ Z_p. Then the polynomial x^d − a has a
unique root, which we can find efficiently.

Proof. Since gcd(d, p − 1) = 1, we can find (efficiently) a positive integer k such that:

dk ≡ 1 (mod p − 1).

If a = 0, the unique root is x = 0, so assume a ≠ 0. Raising both sides of x^d = a to the
power k, we obtain x^{dk} = a^k and applying Fermat’s
little theorem, we get x = a^k. Conversely, (a^k)^d = a^{dk} = a, so x = a^k is indeed a root. ■
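
The proof translates directly into a two-line computation. The sketch below assumes p is prime and gcd(d, p − 1) = 1; the name dth_root is our own, and the modular inverse pow(d, -1, p - 1) requires Python 3.8 or later (it raises ValueError when d is not invertible modulo p − 1).

```python
def dth_root(a, d, p):
    """Unique x in Z_p with x^d = a, assuming p prime and gcd(d, p - 1) = 1."""
    k = pow(d, -1, p - 1)       # d*k ≡ 1 (mod p - 1)
    return pow(a, k, p)         # x = a^k


x = dth_root(2, 5, 13)
print(x, pow(x, 5, 13))         # 6 2
```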

Combining the previous ideas, we can obtain the following corollary to Theorem
3.10.
Corollary 3.1 In Z_p, the number of roots of x^d − 1 is equal to gcd(d, p − 1).

The above result is obtained by finding k such that dk ≡ gcd(d, p − 1) (mod (p − 1))
and noting that x^d = 1 raised to the power of k gives: x^{gcd(d,p−1)} = 1, which has
gcd(d, p − 1) solutions; conversely, every solution of x^{gcd(d,p−1)} = 1 satisfies x^d = 1
because gcd(d, p − 1) divides d.

Exercise 3.3 Show that for a ≠ 0, the number of roots of x^d − a in Z_p is either zero or equal
to gcd(d, p − 1). ■

Theorem 3.11 also generalizes as follows, with the theorem statement itself containing
the explanation.

Theorem 3.12 Let n ∈ N. If gcd(d, ϕ(n)) = 1 and gcd(a, n) = 1, then the equation
x^d = a has a unique solution in Z_n, given by x = a^k, where dk ≡ 1 (mod ϕ(n)).

3.12 Application: The RSA Algorithm


The idea behind Theorem 3.12 is used in the RSA algorithm. The original RSA
paper (Rivest, Shamir and Adleman, "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems", 1978) is very readable.
Sections 3,4 of the above paper can be skipped; after Section 2, just go to Section 5.
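
To make the connection with Theorem 3.12 concrete, here is a toy round trip in Python. The primes, the exponent and the message are arbitrary illustrative values chosen by us, far too small to be secure, and the sketch ignores everything a real implementation needs (padding, key generation, and so on).

```python
# Toy parameters, for illustration only.
p, q = 61, 53
n = p * q                       # modulus, 3233
phi = (p - 1) * (q - 1)         # phi(n) = 3120
d = 17                          # public exponent, gcd(d, phi) = 1
k = pow(d, -1, phi)             # private exponent: d*k ≡ 1 (mod phi), Python 3.8+

message = 65                    # any a with gcd(a, n) = 1
cipher = pow(message, d, n)     # "encryption": a^d in Z_n
recovered = pow(cipher, k, n)   # solving x^d = cipher as in Theorem 3.12

print(k, cipher, recovered)
```

Anyone who knows ϕ(n) can compute k; the point of RSA is that computing ϕ(n) from n alone appears to require factoring n.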

3.13 Order and primitive roots


Definition 3.2 We say that an element a ∈ Z_n^* has order d if a^d ≡ 1 (mod n) and
a^k ≢ 1 (mod n) for every positive integer k < d. We denote this by ord_n(a).

The order is well-defined because if a ∈ Z_n^*, then a^{ϕ(n)} ≡ 1 (mod n), and thus
there must exist a smallest positive integer d such that a^d ≡ 1 (mod n).
As an example, we compute the orders of every element in Z_7^*.

a         1  2  3  4  5  6
ord_7(a)  1  3  6  3  6  2
Observation 2 For every a ∈ Z_n^*, we have: ord_n(a) divides ϕ(n).

Proof. We have a^{ϕ(n)} ≡ 1 (mod n). Let d = ord_n(a) and suppose for contradiction
that d does not divide ϕ(n); then we can write ϕ(n) = qd + r where 0 < r < d. Since
a^d ≡ 1 (mod n), we obtain a^r ≡ 1 (mod n), which is a contradiction to the definition
of d as the order. This proves the observation. ■
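
For small moduli the order can be found by brute force, which is enough to reproduce the table for Z_7^*. The sketch below assumes n > 1; the function name order is our own choice.

```python
from math import gcd


def order(a, n):
    """Multiplicative order of a modulo n (n > 1), by brute force."""
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    x, d = a % n, 1
    while x != 1:               # terminates because a^phi(n) ≡ 1 (mod n)
        x = (x * a) % n
        d += 1
    return d


print([order(a, 7) for a in range(1, 7)])   # [1, 3, 6, 3, 6, 2]
```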

In particular, when p is prime and a ∈ Z_p^*, we have ord_p(a) divides (p − 1).
Definition 3.3 We call an element a ∈ Z_n^* a primitive kth-root of unity if ord_n(a) = k.
We call an element a ∈ Z_n^* a primitive root if ord_n(a) = ϕ(n).

One reason that primitive roots are interesting is that their powers generate Z_n^*, that
is if g is a primitive root, then {g, g^2, ..., g^{ϕ(n)}} = Z_n^*. To see this, note that if g^i ≡ g^j
(mod n) with i > j, then g^{i−j} ≡ 1 (mod n); this is not possible if i, j are distinct elements of
{1, ..., ϕ(n)} (because ord_n(g) = ϕ(n) and 0 < i − j < ϕ(n)).
Primitive roots exist only for some moduli; the relevant fact for us is that they exist
when n is prime.
Theorem 3.13 Let p be a prime. Then there is an element g ∈ Z_p^* such that
ord_p(g) = p − 1.
4. The Quadratic Equation in Zp

4.1 Quadratic Residues


The goal of this chapter is to solve the equation x^2 = a in Z_p, where p is prime. The
first question is whether, for a given a, it is solvable at all. To this end, we show the
following for the more general question of when a given element in Z_p is a dth power.

Theorem 4.1 Let d|(p − 1) and let a ∈ Z_p, a ≠ 0. Then x^d = a is solvable if and only if
a^{(p−1)/d} = 1.

Proof. Firstly, suppose that x^d = a is solvable, with a ≠ 0. Let b^d = a for some b ∈ Z_p.
Clearly b ≠ 0. Raising both sides to the power of (p − 1)/d, we obtain: b^{p−1} = a^{(p−1)/d};
since the LHS equals 1, we find that a^{(p−1)/d} = 1 is a necessary condition.
Now we show that it is also a sufficient condition. We begin by noting that if x^d = a
has a solution, then in fact it has d solutions. This is because if b^d = a, then (bc)^d = a
for every c such that c^d = 1; the number of c such that c^d = 1 equals d (by Theorem 3.10).
Thus the map f : Z_p^* → Z_p^* defined as f(x) = x^d has d pre-images for every element
in the range of f. This shows that |Range(f)| = (p − 1)/d. Now, let S = Range(f),
i.e. S is the set of elements which can be expressed as a dth power. Let
T = {a : a^{(p−1)/d} = 1}.
By the first paragraph, we have: S ⊆ T. Further, |T| = (p − 1)/d by Theorem 3.10 (applied with
(p − 1)/d in place of d), so |T| = |S| and hence T = S,
which completes the proof. ■

For arbitrary d, i.e. without the assumption that d|(p − 1), we can still answer the
same question.

Corollary 4.1 Let d be a positive integer, and let a ∈ Z_p, a ≠ 0. Then x^d = a is solvable
if and only if a^{(p−1)/g} = 1, where g = gcd(d, p − 1). (If y^g = a and kd ≡ g (mod p − 1),
then x = y^k satisfies x^d = a.)

4.2 Application: Coin Tossing over a telephone


Suppose that Alice and Bob would like to have a fair coin toss over a distance, such
as via telephone. How can they do this without either of them being able to gain an
unfair advantage? A solution to this problem was suggested by Manuel Blum.

First, let’s consider the following abstract protocol:

1. Alice and Bob agree on a function f, whose domain is two-valued, say {H, T}.
2. Alice tosses a coin, say the result is x ∈ {H, T }.
3. Alice sends f (x) to Bob.
4. Now Bob calls Heads or Tails, let y be the value called by Bob.
5. Bob sends y to Alice.
6. Alice now sends x to Bob; both of them know x, y and know who has won the
toss.

The key properties required of the function f for the above protocol to work are:

(a) Given the value of f (x), Bob cannot find out the value of x. This means that
f (H) must equal f (T ), otherwise Bob can compute each of these two values and see
which one was sent by Alice.

(b) Alice shouldn’t be able to find x_2 ≠ x_1 such that f(x_1) = f(x_2); otherwise she
can always switch to the other value if needed.

The two properties appear to be negations of each other; however the idea is that
the two-valuedness of the domain is indirectly encoded.

The protocol suggested by Blum is the following.

1. Bob chooses two large primes p, q and sends their product n = pq to Alice.
2. Alice picks a random number x ∈ Zn ; with high probability, x ∈ Z∗n .
3. Alice sends y = x^2 ∈ Z_n to Bob.
4. Bob solves the equation x^2 = y (with y being known and x being unknown)
separately in Z_p and in Z_q. Combining the solutions, Bob gets FOUR solutions
in Z_n.
5. Now Bob calls one of the four values, say Bob calls z and sends it to Alice.
6. Alice now sends x to Bob.
7. Bob wins the toss if z = x or z = −x (in Zn ), else Alice wins the toss.

Let’s see the two properties satisfied by the above protocol.

(a) Suppose the four square-roots of y in Z_n are x, −x, x_2, −x_2. Given these four
values, Bob has exactly a 50% chance of guessing one of x, −x.

(b) Alice cannot switch to one of x_2, −x_2 because Alice cannot find the other pair
of square-roots without essentially factoring n. This is because if Alice did find the
value of x_2, then she could factor n by finding gcd(x − x_2, n) and gcd(x + x_2, n), which
would be the values p, q.
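
The factoring step in (b) is easy to check on a toy modulus. In the Python sketch below, n = 77 = 7 × 11 and the square-roots 2 and 9 of 4 are illustrative values chosen by us; the function name is hypothetical.

```python
from math import gcd


def factor_from_roots(x1, x2, n):
    """Given x1^2 ≡ x2^2 (mod n) with x2 ≢ ±x1 (mod n), return two nontrivial factors of n."""
    if (x1 * x1 - x2 * x2) % n != 0:
        raise ValueError("x1 and x2 are not square-roots of the same value")
    return gcd(x1 - x2, n), gcd(x1 + x2, n)


# 2^2 = 4 and 9^2 = 81 ≡ 4 (mod 77), while 9 is neither 2 nor -2 modulo 77.
print(factor_from_roots(2, 9, 77))      # (7, 11)
```

The identity behind this is (x1 − x2)(x1 + x2) ≡ 0 (mod n): each prime factor of n divides one of the two brackets, and neither bracket is divisible by n itself.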

The security of the protocol thus depends on the hardness of factoring an integer.
Currently there is no known polynomial-time algorithm for factoring an integer and
it is believed that such an algorithm is unlikely to exist. If the numbers p, q are large enough,
then Alice would need hundreds of years to factor n should she wish to do so.

4.3 The Legendre Symbol


The Legendre symbol, written (a/p), is defined as follows, for an integer a and an odd prime
p. If p does not divide a, then it is defined as:

(a/p) = +1 if x^2 = a has solutions in Z_p
      = −1 otherwise

If p divides a, then it is defined as (a/p) = 0.
By Theorem 4.1, we obtain the following:
Lemma 3 [Euler’s Criterion] (a/p) ≡ a^{(p−1)/2} (mod p).
The numbers a ∈ Z_p such that (a/p) = 1 are called quadratic residues and the numbers
a ∈ Z_p such that (a/p) = −1 are called quadratic non-residues.
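
Euler's criterion gives a direct way to evaluate the Legendre symbol by a single modular exponentiation. A minimal Python sketch, assuming p is an odd prime (the function name legendre is our own):

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed via Euler's criterion."""
    value = pow(a, (p - 1) // 2, p)         # equals 0, 1 or p - 1
    return -1 if value == p - 1 else value


print([legendre(-1, p) for p in (3, 5, 7, 11, 13)])   # [-1, 1, -1, -1, 1]
```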
We now use Euler’s criterion in the following exercises.
Class Exercises:
1. Find (−1/p) for the following values of p: 3, 5, 7, 11, 13.
Solution: We compute (−1)^{(p−1)/2} for each prime. We see that the above
quantity is 1 if and only if (p − 1)/2 is even, i.e. if p ≡ 1 (mod 4). Thus, we get
the following.

p   (−1/p)
3   −1
5   1
7   −1
11  −1
13  1
2. Find (2/p) for the following values of p: 5, 7, 11, 17.
Solution: We compute 2^{(p−1)/2} for each prime.
(a) p = 5: 2^2 ≡ −1 (mod 5).
(b) p = 7: 2^3 ≡ 1 (mod 7).
(c) p = 11: 2^5 ≡ −1 (mod 11).
(d) p = 17: 2^8 = 16^2 ≡ 1 (mod 17).
Thus, we get the following.

p   (2/p)
5   −1
7   1
11  −1
17  1
3. (a) Does the equation x^3 ≡ 2 (mod 19) have solutions?
(b) Does the equation x^3 ≡ 2 (mod 17) have solutions?
Solution:
(a) We have p = 19 and 3|(p − 1). So we compute 2^{(p−1)/3}, i.e. 2^6 ≡ 64 ≡ 7
(mod 19). Thus, this equation does not have solutions.
(b) We have p = 17 and gcd(3, 16) = 1. Thus, there is a unique solution, given
by x ≡ 2^k (mod 17), where 3k ≡ 1 (mod 16).
More generally, given an arbitrary d and a ≠ 0, to check whether x^d = a has solutions, we
let g = gcd(d, p − 1) and check whether the equation x^g = a has solutions, i.e. whether
a^{(p−1)/g} = 1; if y^g = a and dk ≡ g (mod p − 1), then x = y^k is a solution of x^d = a.
4. Given the values of (a/p) and (b/p), find the corresponding values of (ab/p).
Solution: Because of Euler’s criterion, we find that (ab/p) = (a/p)(b/p). Thus,
we get the following.

(a/p)  (b/p)  (ab/p)
1      1      1
1      −1     −1
−1     1      −1
−1     −1     1

4.4 The equation x^2 = a: Two easy cases


4.4.1 p ≡ 3 (mod 4)
When p ≡ 3 (mod 4), we can solve x^2 = a in Z_p directly, as the following exercise
illustrates.
Exercise 4.1 Solve x^2 = a in Z_79, assuming that it has solutions. ■

Solution: Since a is a quadratic residue, we must have: a^39 ≡ 1 (mod 79). Multiplying
by a on both sides, we get a^40 ≡ a (mod 79), so that x = a^20 is a square-root of
a.

The above idea generalizes for any prime p ≡ 3 (mod 4) because the condition
a^{(p−1)/2} ≡ 1 (mod p) is equivalent to a^{(p+1)/2} ≡ a (mod p) and (p + 1)/2 is even when
p ≡ 3 (mod 4).

Thus, in this case, a^{(p+1)/4} is a square-root of a.
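
In code, this case is a single exponentiation followed by a check. The sketch below assumes p is a prime with p ≡ 3 (mod 4); the function name is ours.

```python
def sqrt_mod_p3(a, p):
    """A square-root of a in Z_p for a prime p ≡ 3 (mod 4); raises if a is a non-residue."""
    if p % 4 != 3:
        raise ValueError("this sketch needs p ≡ 3 (mod 4)")
    x = pow(a, (p + 1) // 4, p)
    if (x * x - a) % p != 0:            # the formula only works for quadratic residues
        raise ValueError("a is not a quadratic residue modulo p")
    return x


print(sqrt_mod_p3(2, 79))               # 9, and indeed 9^2 = 81 ≡ 2 (mod 79)
```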


4.4.2 p ≡ 1 (mod 4), a = −1
A key idea that we will use is that we can find a quadratic non-residue; for this we
can simply sample a random non-zero element b and check if b^{(p−1)/2} ≡ −1 (mod p), as half
of the non-zero elements of Z_p are quadratic non-residues. By sampling several elements, we
can ensure that the probability of success is high.
Lemma 4 There’s an efficient randomized algorithm that finds b ∈ Z_p such that
(b/p) = −1 with high probability.

In particular, once we find such a b, notice that b satisfies b^{(p−1)/2} = −1 and if p ≡ 1
(mod 4), then b^{(p−1)/4} is a square-root of −1.
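
A sketch of the sampling idea in Python follows. It assumes p is an odd prime (and p ≡ 1 (mod 4) for the second function); the function names are our own. Each trial succeeds with probability 1/2, so the loop ends after two trials on average.

```python
import random


def find_non_residue(p):
    """Sample elements of Z_p^* until a quadratic non-residue is found (p an odd prime)."""
    while True:
        b = random.randrange(1, p)
        if pow(b, (p - 1) // 2, p) == p - 1:    # Euler's criterion: b^((p-1)/2) = -1
            return b


def sqrt_of_minus_one(p):
    """A square-root of -1 in Z_p, assuming p is a prime with p ≡ 1 (mod 4)."""
    b = find_non_residue(p)
    return pow(b, (p - 1) // 4, p)


r = sqrt_of_minus_one(13)
print(r, (r * r) % 13)                          # r is 5 or 8, and r^2 ≡ 12 ≡ -1 (mod 13)
```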

4.5 Wilson’s theorem and the value of (2/p)

Theorem 4.2 Let p be a prime. Then (p − 1)! ≡ −1 (mod p).

Proof. It is possible to prove this in several ways. One proof idea is to note that
the numbers in {1, 2, ..., p − 1} may be paired up as (a, a^{−1}), except for 1 and p − 1,
which are their own inverses. The product of each pair is 1, and the product of the
remaining elements is −1.
Another proof idea is to observe that x^{p−1} − 1 = (x − 1) ... (x − (p − 1)) because of
Fermat’s little theorem. Comparing the constant terms gives the result. ■

Using Wilson’s theorem, we derive the following which we shall use in computing
(2/p).
Claim 5 Let r = ((p − 1)/2)!. Then r^2 ≡ −1 (mod p) if p ≡ 1 (mod 4), and r^2 ≡ 1 (mod
p) if p ≡ 3 (mod 4).

Proof. Writing p − 1 = −1, p − 2 = −2, ..., (p + 1)/2 = −(p − 1)/2, we obtain (p − 1)! =
r^2 (−1)^{(p−1)/2}. From this and Wilson’s theorem, the claim follows. ■

We now give the value of (2/p).

Theorem 4.3 Let p be an odd prime. If p ≡ ±1 (mod 8), then 2 is a quadratic
residue modulo p, otherwise, 2 is a quadratic non-residue modulo p.

Proof. All calculations are in Z_p. We write (p − 1)! = ST, where S = 1 × 3 × ... × (p − 2)
and T = 2 × 4 × ... × (p − 1). We note that T = 2^{(p−1)/2} ((p − 1)/2)!. Also, replacing each
odd factor j > (p − 1)/2 of S by −(p − j), where p − j is even, we rewrite
S as S = 1 × (−2) × 3 × ... × ((−1)^{(p−1)/2+1} (p − 1)/2) = ((p − 1)/2)! (−1)^{⌊(p−1)/4⌋}.
Letting r = ((p − 1)/2)! and substituting for S, T, we get:

(p − 1)! = 2^{(p−1)/2} r^2 (−1)^{⌊(p−1)/4⌋}.

The LHS is −1; on the RHS, we know the value of r^2 from the previous claim. Thus
we get an expression for 2^{(p−1)/2}, whose value we may obtain for each of the values
of p modulo 8. ■

4.6 Quadratic Reciprocity


One of the most interesting results about quadratic residues is the law of
quadratic reciprocity, proved by Gauss.
Theorem 4.4 — Law of quadratic reciprocity. If p, q are distinct odd primes, then

(p/q) = (q/p)    if p ≡ 1 (mod 4) or q ≡ 1 (mod 4)
(p/q) = −(q/p)   if p ≡ q ≡ 3 (mod 4)

For example, to calculate (3/97), we can write:

(3/97) = (97/3) = (1/3) = 1.

4.7 The Tonelli-Shanks Algorithm


Before we describe the algorithm, we recall that the main idea in solving x^2 = a
when p is a prime congruent to 3 mod 4 is that we had: a^m = 1, where m = (p − 1)/2
is odd. By multiplying by a on both sides, we get a^{m+1} = a and since m + 1 is even,
a^{(m+1)/2} is a square-root of a.
This fails for primes of the form 1 mod 4; nevertheless a modification of the idea
works. We first illustrate this in the case that p is congruent to 5 mod 8.

4.7.1 p ≡ 5 (mod 8)
Suppose that p ≡ 5 (mod 8). Then, we can write p − 1 = 4m, where m is odd. For
example, if p = 61, we can write p − 1 = 4m with m = 15.
Consider the number a^m in Z_p. If a^m = 1, then our earlier method works, i.e. a^{(m+1)/2}
is a square-root of a. Suppose that a^m ≠ 1. We also know that a^{2m} = a^{(p−1)/2} = 1.
This implies that a^m = −1. How can we use this information?
The key idea is to find a quadratic non-residue, i.e. a number r such that r^{(p−1)/2} = −1.
We can find such an element by sampling random elements from Z_p^* and testing if
they satisfy the condition. Since exactly half of the elements in Z_p^* are quadratic
non-residues, each sample succeeds with probability 1/2.

Lemma 6 Given an odd prime p, there is an efficient randomized algorithm to find an
element r ∈ Z_p such that r is a quadratic non-residue, i.e. r^{(p−1)/2} = −1.
In particular, for our current example, we have: r^{2m} = −1. This means that
(ar^2)^m = 1. Multiplying by a on both sides we obtain: a^{m+1} r^{2m} = a. Now the LHS
has both exponents even so that we find a^{(m+1)/2} r^m as a square-root of a.
Example: Let p = 61 and a = 3. By quadratic reciprocity, we know that (3/61) = (61/3) = 1.
We have p − 1 = 4m with m = 15. We find a^m = 3^15 = −1. We also know from our
earlier calculations that 2 is a quadratic non-residue when p ≡ 5 (mod 8). Thus we
set r = 2. We have r^{2m} = 2^30 = −1, thus 3^15 · 2^30 = 1. Multiplying by 3 on both sides,
we obtain 3^8 · 2^15 = 8 as a square-root of 3. The other square-root is −8 = 53.
To summarize, when p ≡ 5 (mod 8), we have two cases.

If a^m = 1, then a^{(m+1)/2} is a square-root.

If a^m = −1, then a^{(m+1)/2} · 2^m is a square-root.
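
The two cases can be written out as a short Python function. This is a sketch assuming p is a prime with p ≡ 5 (mod 8) and a is a non-zero element of Z_p; it uses r = 2 as the non-residue, as in the example above, and the function name is our own.

```python
def sqrt_mod_p5(a, p):
    """A square-root of a in Z_p for a prime p ≡ 5 (mod 8); raises if a is a non-residue."""
    if p % 8 != 5:
        raise ValueError("this sketch needs p ≡ 5 (mod 8)")
    m = (p - 1) // 4                    # p - 1 = 4m with m odd
    t = pow(a, m, p)
    x = pow(a, (m + 1) // 2, p)
    if t == 1:
        return x
    if t == p - 1:                      # a^m = -1: correct by 2^m, since 2 is a non-residue
        return (x * pow(2, m, p)) % p
    raise ValueError("a is not a quadratic residue modulo p")


print(sqrt_mod_p5(3, 61))               # 8, matching the worked example
```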

4.7.2 The general case


Given an odd prime p, we write p − 1 = 2^t m, with m odd. Given a number a ∈ Z_p
whose square-root we wish to find, the key idea is to find a number b such that

(ab^2)^m = 1.

By multiplying by a on both sides, we then find a^{(m+1)/2} b^m as a square-root of a.

How can we find such a b? Let’s first consider the powers a^m, a^{2m}, ..., a^{2^{t−1} m} = 1.
Let k be the least non-negative integer such that a^{2^k m} = 1. If k = 0, then we are done
(with b = 1), otherwise let a_1 = ar^e, where r is a quadratic non-residue modulo p and
e = 2^{t−k}. Then we have a_1^{2^{k−1} m} = 1. We now find the smallest non-negative integer
k_1 such that a_1^{2^{k_1} m} = 1; note that k_1 < k. By repeating this process, we obtain a
sequence of values a_0 = a, a_1, ..., a_l where a_{i+1} is of the form a_{i+1} = a_i r^{e_i} with e_i
being an even non-negative integer and a_l^m = 1.

Before describing the algorithm formally, we illustrate it with an example.


Example: Solve the equation x^2 = 2 in Z_97.
Solution: We have p − 1 = 96 = 2^5 × 3, so t = 5 and m = 3. With a = 2, we find:
a^3 = 8, a^6 = 64, a^12 = 22, a^24 = −1, a^48 = 1.
We now find r such that r is a quadratic non-residue modulo 97. r = 5 works (as
can be checked using quadratic reciprocity, for example).
We know that r^48 = −1, thus we set a_1 = ar^2 so that a_1^24 = 1. Thus, a_1 = 2 × 25 = 50.
Now we compute a_1^3 = 64, a_1^6 = 22, a_1^12 = −1, a_1^24 = 1. We set a_2 = a_1 r^4 so that a_2^12 = 1.
Thus, a_2 = 50 × 5^4 = 16.
We find a_2^3 = 22, a_2^6 = −1, a_2^12 = 1 and we set a_3 = a_2 r^8 so that a_3^6 = 1. Thus, a_3 =
16 × 5^8 = −1.
Finally, we set a_4 = a_3 r^16 = 61 and we have a_4^3 = 1. Thus, we get (ar^{2+4+8+16})^3 = 1.
From this, we find a square-root of a to be: a^2 r^{3+6+12+24}, which is equal to 83.

Thus, the two square-roots are 14, 83.

We now describe the algorithm formally. Since every computation involves an mth
power, it is convenient to compute mth powers at the beginning of the algorithm
itself.

In the algorithm, findk(z) is a function that returns the least non-negative integer
k such that $z^{2^k} = 1$. This need not exist for every $z \in \mathbb{Z}_p$, but it exists for values z
which are mth powers (on which the function is invoked).

Algorithm 4 Tonelli-Shanks Algorithm


procedure Tonelli-Shanks(a, p)    ▷ Finds x such that $x^2 = a$ in $\mathbb{Z}_p$
    Write $p - 1 = 2^t m$ with m odd.
    $b \leftarrow a^m$
    $k \leftarrow$ findk(b)
    if k = t then return Not a square.
    end if
    $x \leftarrow a^{(m+1)/2}$
    if k = 0 then return x.
    end if
    Find r such that $r^{(p-1)/2} = -1$.
    $s \leftarrow r^m$
    $S \leftarrow s^{2^{t-k}}$
    while k > 0 do
        $b \leftarrow bS$
        $x \leftarrow x s^{2^{t-k-1}}$
        $k \leftarrow$ findk(b)
        $S \leftarrow s^{2^{t-k}}$
    end while
    return x
end procedure

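The invariant $x^2 = ab$ is maintained at the beginning and end of each iteration. The
values of b, k, S, x for our earlier example (a = 2, p = 97) would be as shown in the
table below (with r = 5, so that $s = r^m = 28$). The first row is the state before the
loop, with s itself listed in the S column; each later row is the state after one iteration,
with S being the factor that was multiplied into b in that iteration.

b     k    S     x
8     4    28    4
64    3    8     15
22    2    64    23
96    1    22    17
1     0    96    83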
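
As a sanity check on Algorithm 4, here is a minimal Python sketch of it. The function names are ours, and the non-residue r is found by trying 2, 3, 4, . . . in turn (an assumption made for simplicity; a random search works equally well).

```python
def findk(z, p):
    """Least k >= 0 with z^(2^k) = 1 in Z_p (assumes such a k exists)."""
    k = 0
    while z != 1:
        z = z * z % p
        k += 1
    return k

def tonelli_shanks(a, p):
    """Return x with x*x = a (mod p) for an odd prime p, or None if a is not a square."""
    a %= p
    if a == 0:
        return 0
    t, m = 0, p - 1
    while m % 2 == 0:                 # write p - 1 = 2^t * m with m odd
        t, m = t + 1, m // 2
    b = pow(a, m, p)
    k = findk(b, p)
    if k == t:
        return None                   # a is a quadratic non-residue
    x = pow(a, (m + 1) // 2, p)       # invariant: x^2 = a*b
    if k == 0:
        return x
    r = 2
    while pow(r, (p - 1) // 2, p) != p - 1:
        r += 1                        # smallest quadratic non-residue
    s = pow(r, m, p)
    while k > 0:
        b = b * pow(s, 2 ** (t - k), p) % p
        x = x * pow(s, 2 ** (t - k - 1), p) % p
        k = findk(b, p)
    return x
```

With a = 2 and p = 97 the search picks r = 5, and the successive values of b and x match the table above, ending with x = 83.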

4.8 Hensel Lifting: From Zp to Zpk


We have seen one method to solve the quadratic equation in $\mathbb{Z}_p$; we will now see how
to extend this to solving quadratic equations in $\mathbb{Z}_{p^k}$ for any given positive integer k.
By using the algorithm in the previous section we first find a solution b to $x^2 \equiv a \pmod p$.
To solve the equation $x^2 \equiv a \pmod{p^2}$, we write x = py + b; this gives us:

$(py + b)^2 \equiv a \pmod{p^2}$,

which is equivalent to:

$2pby \equiv -(b^2 - a) \pmod{p^2}$.

Writing $b^2 - a = pc$ for an integer c (since p divides $b^2 - a$), and dividing by p, we get:

$2by \equiv -c \pmod p$.

For an odd prime p, we can now solve this equation as gcd(2b, p) = 1 (assuming
gcd(a, p) = 1).
The same method works for finding a solution modulo $p^{2k}$ given a solution modulo
$p^k$; this is called Hensel Lifting. Thus, from a solution modulo p, we can find a
solution modulo $p^k$ using O(log k) calls to a linear-equation solver.
This method generalizes further to finding the roots of polynomials of arbitrary
degree as well as for factoring polynomials.
We illustrate this with a couple of examples:
We illustrate this with a couple of examples:
Example 1: Solve the equation $x^2 \equiv 5 \pmod{361}$.
Solution: Note that $361 = 19^2$. We first solve $x^2 \equiv 5 \pmod{19}$. We find $5^9 \equiv 1 \pmod{19}$,
so that 5 is a quadratic residue. Multiplying by 5 on both sides, we
obtain $5^{10} \equiv 5$, so that $5^5 \equiv 9$ is a square-root of 5 modulo 19.
Now we write x = 19y + 9 and substitute to obtain:

$(19y + 9)^2 \equiv 5 \pmod{361}$.

Expanding gives $342y + 81 \equiv 5 \pmod{361}$; dividing by 19, that is:

$18y \equiv -4 \pmod{19}$.

Solving this gives: $y \equiv 4 \pmod{19}$, so that x = 85 is a solution.
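
The lifting step from p to $p^2$ for square-roots can be sketched as follows. The name `lift_sqrt` is ours; the sketch assumes an odd prime p not dividing a, and Python 3.8 or later for the modular inverse `pow(x, -1, p)`.

```python
def lift_sqrt(a, p, b):
    """Given b*b = a (mod p), return x with x*x = a (mod p*p)."""
    c = (b * b - a) // p                 # b^2 - a = p*c, exact by assumption
    y = -c * pow(2 * b, -1, p) % p       # solve 2*b*y = -c (mod p)
    return (p * y + b) % (p * p)
```

Calling `lift_sqrt(5, 19, 9)` reproduces the solution x = 85 of Example 1.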


Example 2: Solve the equation $x^3 \equiv 3 \pmod{289}$.

Solution: We have $289 = 17^2$. We first solve the equation $x^3 \equiv 3 \pmod{17}$. The
exponent 3 is co-prime to 16, so we first solve the auxiliary equation $3k + 16l = -1$,
which gives us k = 5, l = −1 as a solution. We now raise our congruence equation to
the 5th power on both sides to obtain:

$x^{15} \equiv 3^5 \pmod{17}$.

Since $x^{16} \equiv 1 \pmod{17}$, this is equivalent to $x \equiv \frac{1}{3^5} \pmod{17}$, which gives us $x \equiv 7 \pmod{17}$.

Now we write x = 17y + 7 to obtain:

$(17y + 7)^3 \equiv 3 \pmod{289}$,

which is equivalent (after simplification) to

$3 \times 49 \times y \equiv -20 \pmod{17}$.

From this we solve for y to find $y \equiv 9 \pmod{17}$, so that $x \equiv 160 \pmod{289}$ is a
solution.
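
The same step works for a general polynomial f: writing x = py + b gives $f'(b)\,y \equiv -f(b)/p \pmod p$. A small sketch follows, with a hypothetical helper `hensel_step` that takes the coefficients of f (lowest degree first) and assumes $f(b) \equiv 0$ and $f'(b) \not\equiv 0$ modulo p.

```python
def hensel_step(coeffs, b, p):
    """Lift a root b of f modulo p to a root modulo p^2; coeffs[i] is the coefficient of x^i."""
    f_b = sum(c * b ** i for i, c in enumerate(coeffs))
    df_b = sum(i * c * b ** (i - 1) for i, c in enumerate(coeffs) if i > 0)
    c = f_b // p                         # f(b) = p*c, exact since f(b) = 0 (mod p)
    y = -c * pow(df_b, -1, p) % p        # solve f'(b)*y = -c (mod p)
    return (b + p * y) % (p * p)
```

For Example 2, `hensel_step([-3, 0, 0, 1], 7, 17)` lifts the root 7 of $x^3 - 3$ modulo 17 to 160 modulo 289.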

4.9 A second algorithm for finding square-roots


We now describe another method for solving the equation

$x^2 = a$

in $\mathbb{Z}_p$.

The algorithm itself is easy to describe and is as follows.

Algorithm 5 Square-root Algorithm


procedure Find Square-root(a, p)    ▷ Finds x such that $x^2 = a$ in $\mathbb{Z}_p$
    Pick random $r \in \mathbb{Z}_p$.
    if $\gcd(x^2 - a, (x - r)^{(p-1)/2} - 1) = (x - b)$ then return b, −b.
    end if
end procedure

The gcd is found by first computing $(x - r)^{(p-1)/2}$ modulo $(x^2 - a)$ (in $\mathbb{Z}_p[x]$) by repeated squaring.
Note that $(sx + t)^2 \equiv 2stx + s^2 a + t^2$ modulo $(x^2 - a)$; this expression may be
used to find the successive squares.

If the gcd is not of the form (x − b) (for instance, if it is 1), then we repeat the algorithm
with another random value of r, until success. The probability that a single attempt succeeds is
approximately 1/2 (at least $(p-1)/(2p)$).

Example: $x^2 = 10$ in $\mathbb{Z}_{41}$

Solution: For r = 1, we find the gcd to be 1; for r = 2, the calculation is as follows,
the congruences being modulo $(x^2 - 10)$.

$(x - 2)^2 \equiv 14 - 4x$
$(x - 2)^4 \equiv (14 - 4x)^2 \equiv 28 + 11x$
$(x - 2)^8 \equiv (28 + 11x)^2 \equiv x + 26$
$(x - 2)^{16} \equiv (x + 26)^2 \equiv 11x + 30$

Thus, $(x - 2)^{20} \equiv (11x + 28)(11x + 30) \equiv 23x$. The gcd of $(x^2 - 10)$ and $(x - 2)^{20} - 1$
is $\gcd(x^2 - 10, 23x - 1) = (x + 16)$.
Thus the square-roots of 10 in $\mathbb{Z}_{41}$ are ±16.
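
The following Python sketch mirrors this computation. It represents sx + t modulo $(x^2 - a)$ as the pair (s, t); the function names are ours, and the sketch assumes that p is an odd prime and a is a non-zero square modulo p. Since the remainder is linear, the gcd step reduces to checking whether the root of sx + (t − 1) is a root of $x^2 - a$.

```python
import random

def mul(u, v, a, p):
    """Multiply s1*x + t1 and s2*x + t2 modulo x^2 - a over Z_p."""
    (s1, t1), (s2, t2) = u, v
    return ((s1 * t2 + s2 * t1) % p, (s1 * s2 * a + t1 * t2) % p)

def sqrt_by_gcd(a, p, rng=random):
    """Return b with b*b = a (mod p) by retrying random r until the gcd is linear."""
    while True:
        r = rng.randrange(p)
        result, base, e = (0, 1), (1, -r % p), (p - 1) // 2
        while e:                               # (x - r)^((p-1)/2) by repeated squaring
            if e & 1:
                result = mul(result, base, a, p)
            base = mul(base, base, a, p)
            e >>= 1
        s, t = result
        if s != 0:
            b = (1 - t) * pow(s, -1, p) % p    # root of s*x + (t - 1)
            if b * b % p == a % p:             # gcd with x^2 - a is (x - b)
                return b
```

For a = 10 and p = 41, whichever value of r succeeds, the returned root is 16 or 25 (= −16).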
5. Finite Fields

In this chapter, we study finite fields which are of importance in several areas of
computer science, such as coding theory, cryptography and complexity theory. This
will also lead us to a factorization algorithm for polynomials in Zp [x].

5.1 Groups
In abstract algebra, the most basic objects are groups.
Definition 5.1 A group is a pair (G, ∗), where G is a set and ∗ is a binary operation
on G which satisfies all of the following properties.
(a) Closure: For all g, h ∈ G, we have g ∗ h ∈ G.
(b) Associativity: For all g, h, k ∈ G, we have: g ∗ (h ∗ k) = (g ∗ h) ∗ k.
(c) Identity: There is a unique element e ∈ G such that for all g ∈ G, we have:
g ∗ e = e ∗ g = g.
(d) Inverse: For every element g ∈ G, there is an element h such that g ∗ h =
h ∗ g = e. This element is usually denoted by g −1 .

Examples of groups:

1. (R, +), (C, +), (Q, +), (Z, +), (Rn , +), (R[x], +)
2. (R \ {0}, ×), (C \ {0}, ×), (Q \ {0}, ×)
3. (Zn , +) (the cyclic group), (Z∗n , ×)
4. For every fixed n, the set of all n × n matrices over R under addition forms a
group.
5. For every fixed n, the set of all non-singular n × n matrices over R under
multiplication forms a group.
6. The set of all permutations of {1, 2, . . . , n} under composition forms a group;
this group is called the symmetric group and is denoted by Sn .
7. (R, ∗) where a ∗ b = a + b + 1.

Non-examples:

1. The set {0, 1, 2} under addition is not a group because it fails the Closure
property (a).
2. (R, −),(C, −) are not groups because they fail the Associativity property (b).
3. The pair (R, ∗) with ∗ being defined as a ∗ b = 1 is not a group because it fails
to have an Identity element.
4. ($2^S$, ∪), ($2^S$, ∩), where $2^S$ is the set of all subsets of a non-empty set S, are not
groups because they fail the existence of an inverse for every element.
5. The sets R, C, Zn are not groups under multiplication but note that we can
obtain groups from them by removing the elements without an inverse (Zero
in the first two cases, elements having a common factor with n in the third).

An important property of groups is cancellation and it follows from the existence of
inverses.

Observation 7 — Cancellation property. If (G, ∗) is a group and a ∗ b = a ∗ c, for
elements a, b, c ∈ G, then b = c. Similarly, if g ∗ h = j ∗ h for elements g, h, j ∈ G, then
g = j.

5.1.1 Cayley Tables and Isomorphism


The Cayley table for a group (G, ∗) is a matrix with rows and columns indexed by
the elements of G and the entry in row g and column h being the element g ∗ h. The
Cayley table defines the operation ∗. From the cancellation property, it follows that
every row (similarly, every column) must contain all the elements of G exactly once
(in some order).
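
For small finite examples, the group axioms and the row property of the Cayley table can be checked by brute force. The sketch below uses our own helper names and is only meant for tiny sets.

```python
from itertools import product

def cayley_table(elems, op):
    """Rows and columns indexed by elems; entry (g, h) is g * h."""
    return [[op(g, h) for h in elems] for g in elems]

def is_group(elems, op):
    """Brute-force check of closure, associativity, identity and inverses."""
    S = set(elems)
    if any(op(g, h) not in S for g, h in product(elems, repeat=2)):
        return False
    if any(op(g, op(h, k)) != op(op(g, h), k) for g, h, k in product(elems, repeat=3)):
        return False
    ids = [e for e in elems if all(op(e, g) == g == op(g, e) for g in elems)]
    if len(ids) != 1:
        return False
    e = ids[0]
    return all(any(op(g, h) == e == op(h, g) for h in elems) for g in elems)

Z6 = list(range(6))
add6 = lambda g, h: (g + h) % 6
table = cayley_table(Z6, add6)
rows_are_permutations = all(sorted(row) == Z6 for row in table)
```

Here `is_group(Z6, add6)` holds and every row of `table` is a permutation of the elements, whereas {0, 1, 2} under ordinary addition fails the closure check.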

Here’s an example: we define a two-element group G = ({a, e}, ∗) (with e being the
identity element) using the following table:

∗    e    a
e    e    a
a    a    e

Now consider the Cayley table of the group $(\mathbb{Z}_2, +)$.

+    0    1
0    0    1
1    1    0

We may notice that the tables of the two groups are identical except for a changing
of the element names (with 0 in place of e and 1 in place of a, + in place of *).
When this happens for two groups, we call them isomorphic. Here’s a more formal
definition.
Definition 5.2 Two groups (G, ∗) and (H, ·) are said to be isomorphic if there
exists a bijection ϕ : G → H such that the following holds:
For every g1 , g2 , g3 ∈ G, g1 ∗ g2 = g3 if and only if ϕ(g1 ) · ϕ(g2 ) = ϕ(g3 ).

Examples of isomorphism:

1. Every group of size 3 is isomorphic to (Z3 , +).


2. The group of all nth roots of unity (under multiplication) is isomorphic to the
group (Zn , +). This group is also known as the cyclic group on n elements, and
is denoted by Cn .

5.1.2 Direct products and subgroups


We now look at two ways to construct new groups from existing groups.
Definition 5.3 Given two groups (G, ∗) and (H, ·), the direct product of G and H
is the group consisting of G × H with the group operation being co-ordinate wise,
i.e. (g1 , h1 ) × (g2 , h2 ) = (g1 ∗ g2 , h1 · h2 ).

It is easy to verify that G × H satisfies the definition of a group.


The direct product is useful to construct groups larger than (and containing a copy of) a given
group. In contrast, a group that is smaller than (and contained in) a given group is
called a subgroup.
Definition 5.4 We say that H is a subgroup of (G, ∗) (written H ≤ G) if H ⊆ G
and (H, ∗) is a group.

From this definition, it is not clear how to construct subgroups of a given group, but
one method is the following. Let S ⊆ G. We define the group generated by S as:

$\langle S \rangle = \{ g_1 * g_2 * \ldots * g_k \mid k \in \mathbb{N} \text{ and } g_1, \ldots, g_k \in S \cup S^{-1} \}$.

As in the case of direct product, we can easily verify that $(\langle S \rangle, *)$ satisfies the
group axioms, and is hence a subgroup of G. Conversely, if H is a subgroup of G
and $\langle S \rangle = H$, then we call S a generating set for H.
Examples of subgroups:

Group                                   Subgroup(s)
(R, +)                                  Q, Z
(Z, +)                                  2Z, 3Z, 4Z, . . .
$(\mathbb{Z}_6, +)$                     $\langle 3 \rangle = \{0, 3\}$, $\langle 2 \rangle = \{0, 2, 4\}$
$(\mathbb{Z}_p^*, \times)$              Quadratic residues
$(\mathbb{Z}_2 \times \mathbb{Z}_2, +)$ $\langle (1, 1) \rangle = \{(1, 1), (0, 0)\}$
$GL_n(\mathbb{R})$                      $SL_n(\mathbb{R})$

$GL_n(\mathbb{R})$: Real, invertible n × n matrices; $SL_n(\mathbb{R})$: Real n × n matrices with determinant 1

5.1.3 Cosets and Lagrange’s theorem


In this section, we prove the following result, which is the main tool from group
theory that we will use.
Theorem 8 — Lagrange’s Theorem. Let G be a finite group and H be a subgroup of
G. Then |H| divides |G|. In particular, for all g ∈ G, we have: $g^{|G|} = e$.
Before we begin the proof, we first define cosets which we will need.
Definition 5.5 Given a group (G, ∗) and a subgroup H of G, a left coset of H is a
set of the form g ∗ H = {g ∗ h|h ∈ H} for g ∈ G.

For example, for G = (Z, +) and H = 4Z, a coset is 3 + H = {3 + 4k : k ∈ Z}. In fact,
there are exactly four left cosets, namely 4Z, 4Z + 1, 4Z + 2, 4Z + 3. For example,
the coset 7 + H is the same as 3 + H.


Lemma 9 If (G, ∗) is a group and H is a subgroup of G, then for all g, h ∈ G, we
have: g ∗ H = h ∗ H or g ∗ H ∩ h ∗ H = ∅.
We first prove Lagrange’s theorem using the lemma (which we will prove subse-
quently).

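Proof of Theorem 8: Lemma 9 implies that the left cosets of H form a partition of
G (every g ∈ G lies in its own coset g ∗ H, since e ∈ H). Thus, we can write
$G = H \cup g_1 * H \cup \ldots \cup g_{r-1} * H$ for some $g_1, \ldots, g_{r-1}$, the union being disjoint.
Each coset has the same size, i.e. |H| (by the cancellation property, h ↦ g ∗ h is a bijection from
H to g ∗ H). Thus, we obtain: r|H| = |G|, i.e. |H| divides |G|, as desired.
The second part follows by considering, for g ∈ G, the subgroup $H = \{g, g^2, \ldots, g^k = e\}$ (where
k is the least natural number with $g^k = e$). We then have: k divides |G|, and hence $g^{|G|} = e$. ■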
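
The partition into cosets is easy to observe experimentally in $(\mathbb{Z}_n^*, \times)$. The sketch below uses our own helper names; in a finite group, closing a set under multiplication is enough to obtain the generated subgroup.

```python
from math import gcd

def generated_subgroup(gens, n):
    """Subgroup of (Z_n^*, x) generated by gens, built by closing {1} under multiplication."""
    H, frontier = {1}, [1]
    while frontier:
        h = frontier.pop()
        for g in gens:
            x = h * g % n
            if x not in H:
                H.add(x)
                frontier.append(x)
    return H

def left_cosets(H, n):
    """All distinct cosets g*H for g in Z_n^*."""
    G = [g for g in range(1, n) if gcd(g, n) == 1]
    return {frozenset(g * h % n for h in H) for g in G}

H = generated_subgroup([3], 13)      # the subgroup {1, 3, 9} of Z_13^*
cosets = left_cosets(H, 13)
```

Here |H| = 3, there are 4 cosets of size 3 each, and 3 · 4 = 12 = $|\mathbb{Z}_{13}^*|$, as the theorem predicts.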

5.2 Rings
Definition 5.6 A ring is a triple (R, +, ·) with +, · being binary operations on R,
satisfying the following properties.
1. (R, +) is an abelian group; its identity element is denoted by 0.
2. (R, ·) is associative and has an identity element which is denoted by 1.
3. For all a, b, c ∈ R, we have: a · (b + c) = a · b + a · c and (a + b) · c = a · c + b · c.

Examples of rings: In each example, the addition and multiplication are the
natural operators.
1. Z, R, C
2. Z[x], R[x]
3. Zn , Zn [x] (for every fixed n)
4. Zn [x]/(f (x)), i.e. the ring of polynomials in Zn [x] modulo f (x)
5. The ring of n × n real matrices (for each fixed n)

5.3 Fields
A field is a ring in which multiplication is commutative and every non-zero element has a
multiplicative inverse.
Definition 5.7 A field is a triple (F, +, ·) with +, · being binary operations on F,
satisfying the following properties.
1. (F, +) is an abelian group; its identity element is denoted by 0.
2. (F \ {0}, ·) is an abelian group, whose identity element is denoted by 1.
3. For all a, b, c ∈ F , we have: a · (b + c) = a · b + a · c and (a + b) · c = a · c + b · c.

Examples of fields: R, Q, C, $\mathbb{Z}_p$ (p prime). Another important example for us will
be: $\mathbb{Z}_p[x]/(f(x))$, where f(x) is an irreducible polynomial in $\mathbb{Z}_p[x]$.
Fields are highly structured objects with many nice properties. Here are two useful
ones:

• If F is a field, then the Euclidean algorithm and Bezout’s Lemma work for
polynomials in F [x].
• If F is a field, then Gaussian elimination (and all of linear algebra) work over
F in the same way as for reals.
Theorem 5.1 If F is a field, then F [x]/(f (x)) is a field if and only if f (x) is an
irreducible polynomial in F [x].

Proof. Suppose that f(x) is an irreducible polynomial. Let g(x) ∈ F[x]/(f(x)) be
a non-zero polynomial of degree less than deg(f). Then gcd(g(x), f(x)) = 1 and in F[x],
we can apply Bezout’s Lemma to obtain polynomials u(x) and v(x) such that
f(x)u(x) + g(x)v(x) = 1. That is, g(x)v(x) ≡ 1 (mod f(x)), so that g(x) has a
multiplicative inverse in F[x]/(f(x)). This shows that F[x]/(f(x)) is a field.
Conversely, suppose that f(x) is not irreducible, and let f(x) = g(x)h(x) for polynomials
g(x), h(x) of degree at least one and less than deg(f). Then g(x)h(x) ≡ 0 (mod f(x)) with
g(x), h(x) both non-zero in F[x]/(f(x)), so g(x) does not have a multiplicative inverse in
F[x]/(f(x)) and hence F[x]/(f(x)) is not a field. This completes the proof. ■

5.4 Finite Fields


Facts about finite fields: Let $q = p^k$ where p is prime and k ∈ N.
1. There exists a finite field of size q; such a field may be constructed as $\mathbb{Z}_p[x]/(f(x))$
for some irreducible polynomial f(x) of degree k in $\mathbb{Z}_p[x]$.
2. Any 2 fields of size q are isomorphic. Thus, we often write $\mathbb{F}_q$ to denote the
field of size q.
3. $\mathbb{F}_q^*$ is cyclic.
4. For every $a \in \mathbb{F}_q$, we have: $a^q = a$.
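
To make facts 1, 3 and 4 concrete, here is a small sketch of $\mathbb{F}_9$ built as $\mathbb{Z}_3[x]/(x^2 + 1)$. The choice of the irreducible polynomial $x^2 + 1$ and all names are ours; the pair (c0, c1) stands for c0 + c1·x.

```python
from itertools import product

P = 3  # F_9 modelled as Z_3[x]/(x^2 + 1)

def mul(u, v):
    """(a0 + a1*x)(b0 + b1*x) reduced using x^2 = -1."""
    (a0, a1), (b0, b1) = u, v
    return ((a0 * b0 - a1 * b1) % P, (a0 * b1 + a1 * b0) % P)

def power(u, e):
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result

def order(u):
    """Multiplicative order of a non-zero element."""
    k, v = 1, u
    while v != (1, 0):
        v, k = mul(v, u), k + 1
    return k

F9 = list(product(range(P), repeat=2))
generators = [u for u in F9 if u != (0, 0) and order(u) == 8]
```

Every element satisfies $a^9 = a$, and the element 1 + x (the pair `(1, 1)`) is one of the generators of the cyclic group $\mathbb{F}_9^*$.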

5.5 Irreducible polynomials in Zp[x]


Theorem 10 In $\mathbb{Z}_p[x]$, we have the following:

(a): If f(x) is an irreducible polynomial of degree k, then f(x) divides $x^{p^k} - x$.

(b): $x^{p^k} - x = \prod_{d \mid k} \prod \{ f(x) : f(x) \text{ is monic and irreducible, } \deg(f) = d \}$.

Proof. (a) Let $q = p^k$. Then $\mathbb{Z}_p[x]/(f(x)) \cong \mathbb{F}_q$. Considering the polynomial x as
an element of $\mathbb{Z}_p[x]/(f(x))$, we therefore have: $x^q - x \equiv 0 \pmod{f(x)}$, which is the
desired statement.
(b) We first show that the RHS divides the LHS. Let f(x) be an irreducible polynomial
of degree d, with d | k, and let k = rd. Then by part (a), we have: $x^{p^d} \equiv x \pmod{f(x)}$.
Raising both sides to the $p^d$th power we obtain $x^{p^{2d}} \equiv x^{p^d} \equiv x \pmod{f(x)}$. Thus
inductively we obtain $x^{p^{id}} \equiv x \pmod{f(x)}$ for every i; setting i = r, we get the
desired result. ■

For example, in $\mathbb{Z}_3[x]$, we have: $x^9 - x = x(x - 1)(x - 2)(x^2 - 2)(x^2 + x + 2)(x^2 + 2x + 2)$.
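
This factorization can be verified by multiplying out the six factors over $\mathbb{Z}_3$. In the sketch below (names ours), a polynomial is a list of coefficients, lowest degree first.

```python
from functools import reduce

def polymul(f, g, p):
    """Product of two polynomials over Z_p (coefficient lists, lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

# x, x - 1, x - 2, x^2 - 2, x^2 + x + 2, x^2 + 2x + 2, with coefficients reduced mod 3
factors = [[0, 1], [2, 1], [1, 1], [1, 0, 1], [2, 1, 1], [2, 2, 1]]
product_poly = reduce(lambda f, g: polymul(f, g, 3), factors)
```

The result is the coefficient list of $x^9 + 2x$, which equals $x^9 - x$ in $\mathbb{Z}_3[x]$.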

5.6 Application: Secret sharing


There’s a secret S that has to be distributed to n persons; each person gets a share
and what we want is the following: if any k persons combine their share, then they
can recover the secret, but from any k − 1 shares, zero information is obtained about
the secret. Such a scheme is called a k-out-of-n threshold scheme (k being the
threshold). We can imagine that a majority threshold (say 3-out-of-5) is relevant for
a consensus decision to recover the secret.
The following scheme is due to Shamir ("How to Share a Secret", 1979).
1. Encode the secret as an element of Fq for suitable q; if the secret is a sequence
of bits, then q may be a power of 2.
2. Generate k − 1 random values of $\mathbb{F}_q$: $a_1, \ldots, a_{k-1}$.
3. Construct the polynomial $f(x) = a_0 + a_1 x + \ldots + a_{k-1} x^{k-1}$, where $a_0$ is the
secret.
4. Generate n distinct non-zero values $x_1, \ldots, x_n \in \mathbb{F}_q$ (non-zero because $f(0) = a_0$ is the secret itself) and compute $y_i = f(x_i)$ for each i.
5. The pair (xi , yi ) is the share of the ith person.
Explanation: Given k pairs (xi , f (xi )), it is possible to recover all the coefficients of
f (x), as there are k unknowns (the coefficients) and k linear equations. In particular,
the coefficient a0 , which is the secret, can be obtained.
With k − 1 shares/equations, the system of equations results in a solution space
which is a 1-dimensional affine subspace of $\mathbb{F}_q^k$ and each of the q values is equally
likely for $a_0$.
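
A minimal sketch of the scheme over a prime field is given below. The prime `Q`, the function names and the choice of evaluation points 1, . . . , n are our assumptions; recovery uses Lagrange interpolation at 0, which solves the same linear system. The standard `random` module is used only for illustration and is not suitable for real secrets.

```python
import random

Q = 2**31 - 1   # a prime; Z_Q plays the role of F_q in this sketch

def make_shares(secret, k, n, rng=random):
    """Return n shares (x, f(x)) of a degree-(k-1) polynomial with f(0) = secret."""
    coeffs = [secret % Q] + [rng.randrange(Q) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, i, Q) for i, c in enumerate(coeffs)) % Q
    return [(x, f(x)) for x in range(1, n + 1)]   # distinct non-zero points

def recover(shares):
    """Recover f(0) from k shares by Lagrange interpolation at 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % Q
                den = den * (xi - xj) % Q
        secret = (secret + yi * num * pow(den, -1, Q)) % Q
    return secret
```

For a 3-out-of-5 scheme, `recover` applied to any three of the five shares from `make_shares(secret, 3, 5)` returns the secret.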
6. Polynomial Factorization over Zp

In this chapter, we’ll see the Cantor-Zassenhaus algorithm for factoring univariate
polynomials in Zp [x].
We divide the algorithm into three components/phases and finally combine them
together into a single algorithm.
In the first phase, given a polynomial $f(x) \in \mathbb{Z}_p[x]$, we’ll obtain a polynomial g(x)
which is the square-free part of f(x). The square-free part of a polynomial is defined
as follows. Let $f(x) = h_1(x)^{e_1} \ldots h_k(x)^{e_k}$ with the $h_i(x)$ being distinct irreducibles and
the $e_i$ being natural numbers. Then the square-free part of f(x) is the polynomial
$g(x) = h_1(x) \ldots h_k(x)$.
In the second phase, we’ll partition the square-free polynomial g(x) into $f_1(x) f_2(x) \ldots f_r(x)$,
where $f_i(x)$ is the product of all monic irreducible factors of g(x) that have degree
equal to i.
In the third phase, we’ll factor each $f_i(x)$ into irreducible factors of degree i.
Finally, for each irreducible factor h(x) that divides f(x), we find the largest natural
number r such that $h(x)^r$ divides f(x).
We now describe algorithms for each of the three phases.

6.1 Phase 1: Finding the square-free part


The main idea behind finding the square-free part is the following claim.
Claim 11 Let $f(x) \in \mathbb{Z}_p[x]$ be such that f(x) is not divisible by $h(x)^p$ for any
polynomial h(x) of degree at least one. Then the square-free part of f(x) is given by

$$\frac{f(x)}{\gcd(f(x), f'(x))}.$$

Notice that if $f(x) = h(x)^p g(x)$, then $f'(x) = h(x)^p g'(x)$; thus we are unable to use
the above idea to find the square-free part of h(x) if $h(x)^p$ divides f(x).

To deal with this issue, we do the following: we first obtain the largest degree
polynomial h(x) such that $h(x)^p$ divides f(x), then find the square-free part of h(x)
recursively. We then combine this with the square-free part of $g(x) = f(x)/h(x)^p$ to obtain the
square-free part of f(x).
For the first step in the above idea, we use the following claim.
Claim 12 Let $f(x) \in \mathbb{Z}_p[x]$ and let h(x) be the largest degree polynomial such
that f(x) is divisible by $h(x)^p$. Then there exists k such that $f^{(k)}(x) = h(x)^p$ and
$f^{(k+1)}(x) = 0$. Here $f^{(j)}(x)$ denotes the jth derivative of f(x).
To check whether a given polynomial in $\mathbb{Z}_p[x]$ is a pth power and to find its pth
root, we will use the following observation.
Claim 13 Let $f(x) = \sum_{i=0}^{d} a_i x^i \in \mathbb{Z}_p[x]$. Then f(x) is equal to $g(x)^p$ for some
polynomial g(x) if and only if: for every i, $a_i \neq 0$ implies $p \mid i$. Further, in this case,
the polynomial g(x) equals $\sum_{p \mid i} a_i x^{i/p}$.

We now have all the ingredients to find the square-free part of a given polynomial
f (x) ∈ Zp [x].

Algorithm 6 Algorithm to find square-free part of f (x) ∈ Zp [x]


procedure Square-free Part(f(x), p)
    g(x) ← f(x), F(x) ← f(x)
    while g(x) ≠ 0 do
        h(x) ← g(x), g(x) ← g′(x)
    end while
    if h(x) ∈ $\mathbb{Z}_p$ then
        Return f(x)/gcd(f(x), f′(x))
    end if
    F(x) ← F(x)/h(x)
    if h(x) = H($x^p$) then
        h(x) ← H(x)
        h(x) ← Square-free Part(h(x), p)
    end if
    F(x) ← h(x)F(x)/gcd(F(x), F′(x))
    Return F(x)
end procedure
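
The simple case covered by Claim 11 (no pth power divides f) can be sketched as follows. This is not a full implementation of Algorithm 6: the helper names are ours, polynomials are coefficient lists (lowest degree first), and the sketch refuses inputs with f′ = 0.

```python
def trim(f):
    """Drop leading zero coefficients in place."""
    while f and f[-1] == 0:
        f.pop()
    return f

def polydivmod(f, g, p):
    """Quotient and remainder of f by a non-zero g over Z_p."""
    f = trim([c % p for c in f])
    q = [0] * max(len(f) - len(g) + 1, 1)
    inv = pow(g[-1], -1, p)
    while len(f) >= len(g):
        c, shift = f[-1] * inv % p, len(f) - len(g)
        q[shift] = c
        for i, b in enumerate(g):
            f[shift + i] = (f[shift + i] - c * b) % p
        trim(f)
    return trim(q), f

def polygcd(f, g, p):
    """Monic gcd of f and g over Z_p (f non-zero)."""
    while g:
        f, g = g, polydivmod(f, g, p)[1]
    inv = pow(f[-1], -1, p)
    return [c * inv % p for c in f]

def squarefree_part_simple(f, p):
    """f / gcd(f, f') as in Claim 11; assumes no p-th power divides f."""
    d = trim([i * c % p for i, c in enumerate(f)][1:])   # derivative f'
    if not d:
        raise ValueError("f' = 0, so Claim 11 does not apply")
    return polydivmod(f, polygcd(f, d, p), p)[0]
```

For example, over $\mathbb{Z}_5$ the polynomial $(x + 1)^2 (x + 2) = x^3 + 4x^2 + 2$ has square-free part $(x + 1)(x + 2) = x^2 + 3x + 2$.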

6.2 Phase 2: Distinct-degree factorization


In this phase, we now have a polynomial which is square-free. Our goal is to factorize
it as f1 (x) . . . fr (x), where fi (x) is the product of all irreducible factors of the given
polynomial of degree i.
The idea behind this algorithm is to use Theorem 10. Since the only irreducible
polynomials that divide $x^p - x$ are those of degree one, $\gcd(f(x), x^p - x)$ gives us
$f_1(x)$. Note: Here, f(x) is an input polynomial which is square-free, and corresponds
to the output g(x) of the first phase rather than to the original input polynomial.

Repeating this idea, suppose that we have found $f_1(x), \ldots, f_i(x)$. Then

$$f_{i+1}(x) = \gcd\left( \frac{f(x)}{f_1(x) \ldots f_i(x)},\ x^{p^{i+1}} - x \right).$$

We note that in finding the gcds, we shall use repeated squaring for the power of x
(reducing modulo the first argument at every step) and subsequently apply the Euclidean algorithm.

Algorithm 7 Distinct-degree factorization


procedure Distinct-degree Factors(f(x), p)
    ▷ Finds $f_1(x), f_2(x), \ldots$ where $f_i(x)$ is the product of monic irreducible factors of f(x) of degree i in $\mathbb{Z}_p[x]$.
    ▷ Assumes that f(x) is square-free.
    i ← 1, g(x) ← f(x).
    while g(x) ≠ 1 do
        $f_i(x) \leftarrow \gcd(g(x), x^{p^i} - x)$.
        $g(x) \leftarrow g(x)/f_i(x)$.
        i ← i + 1.
    end while
    Return $\{f_1(x), f_2(x), \ldots\}$.
end procedure
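
Below is a sketch of Algorithm 7 in Python, with polynomials as coefficient lists (lowest degree first). All helper names are ours; the input is assumed to be monic and square-free, and $x^{p^i}$ is obtained by raising the previous power $x^{p^{i-1}}$ to the pth power modulo the current g(x).

```python
def trim(f):
    while f and f[-1] == 0:
        f.pop()
    return f

def polydivmod(f, g, p):
    """Quotient and remainder of f by a non-zero g over Z_p."""
    f = trim([c % p for c in f])
    q = [0] * max(len(f) - len(g) + 1, 1)
    inv = pow(g[-1], -1, p)
    while len(f) >= len(g):
        c, shift = f[-1] * inv % p, len(f) - len(g)
        q[shift] = c
        for i, b in enumerate(g):
            f[shift + i] = (f[shift + i] - c * b) % p
        trim(f)
    return trim(q), f

def polygcd(f, g, p):
    """Monic gcd over Z_p (f non-zero); gcd(f, 0) is f made monic."""
    while g:
        f, g = g, polydivmod(f, g, p)[1]
    inv = pow(f[-1], -1, p)
    return [c * inv % p for c in f]

def polymulmod(f, g, m, p):
    out = [0] * (len(f) + len(g) - 1) if f and g else []
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return polydivmod(out, m, p)[1]

def polypowmod(f, e, m, p):
    """f^e modulo m by repeated squaring."""
    result, base = [1], polydivmod(f, m, p)[1]
    while e:
        if e & 1:
            result = polymulmod(result, base, m, p)
        base = polymulmod(base, base, m, p)
        e >>= 1
    return result

def distinct_degree_factors(f, p):
    """Return {i: f_i} for the non-trivial f_i of a monic square-free f."""
    out, g, i, xp = {}, list(f), 1, [0, 1]
    while len(g) > 1:
        xp = polypowmod(xp, p, g, p)                     # x^(p^i) mod g
        n = max(len(xp), 2)
        h = trim([((xp[j] if j < len(xp) else 0) - (1 if j == 1 else 0)) % p
                  for j in range(n)])                    # x^(p^i) - x
        fi = polygcd(g, h, p)
        if len(fi) > 1:
            out[i] = fi
            g = polydivmod(g, fi, p)[0]
            xp = polydivmod(xp, g, p)[1]
        i += 1
    return out
```

Over $\mathbb{Z}_3$, the input $x(x + 1)(x^2 + 1) = x^4 + x^3 + x^2 + x$ splits into $f_1 = x^2 + x$ and $f_2 = x^2 + 1$.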

6.3 Phase 3: Finding irreducible factors of degree i


In this stage, our goal is to solve the following problem. Given a polynomial fi (x),
all of whose irreducible factors have the same degree, i, find those factors. There are
two ideas that we will use for this phase, and they are the content of the following
two results.
Theorem 14 — Chinese Remainder Theorem for polynomials. If h(x) = h1 (x) . . . hk (x),
where the $h_i(x)$ are pairwise co-prime polynomials in $\mathbb{Z}_p[x]$, then:

$\mathbb{Z}_p[x]/(h(x)) \cong \mathbb{Z}_p[x]/(h_1(x)) \times \ldots \times \mathbb{Z}_p[x]/(h_k(x))$.

Claim 15 If h(x) is an irreducible polynomial of degree i and g(x) is a randomly
chosen non-zero polynomial of degree less than i in $\mathbb{Z}_p[x]$, then:

$$\mathrm{Prob}[g(x)^{(p^i - 1)/2} \equiv 1 \pmod{h(x)}] = \mathrm{Prob}[g(x)^{(p^i - 1)/2} \equiv -1 \pmod{h(x)}] = \frac{1}{2}.$$

Let $f_i(x)$ be a polynomial with an unknown factorization $f_i(x) = h_1(x) \ldots h_t(x)$,
where each $h_j(x)$ is irreducible of degree i. Consider a polynomial g(x) of degree less
than $\deg(f_i)$, chosen uniformly at random. By Theorem 14, g(x) modulo $h_j(x)$ will be
a uniformly-at-random element of $\mathbb{Z}_p[x]/(h_j(x))$, for every j. Applying Claim 15, for
each j, the probability that $h_j(x)$ divides $\gcd(f_i(x), g(x)^{(p^i - 1)/2} - 1)$ is about 1/2. Further these
events (various $h_j(x)$ dividing this gcd) are mutually independent. Thus, if t ≥ 2,
then with probability roughly 1/2 or more, the above gcd will contain as factors some and
not all of the irreducible factors of $f_i(x)$, which gives us a non-trivial factorization of
$f_i(x)$. We can then recursively factorize each of the two factors thus obtained. The
algorithm just described is presented below formally.

Algorithm 8 Uniform-degree irreducible factorization


procedure IrreducibleFactors(f(x), i, p)
    ▷ Returns list of irreducible factors of f(x) of degree i in $\mathbb{Z}_p[x]$
    ▷ Assumes that f(x) is square-free and has only irreducible factors of degree i
    if deg(f) = i then
        Return {f}.
    end if
    $h_1(x) \leftarrow 1$, $h_2(x) \leftarrow 1$.
    while ($h_1(x) = 1$) OR ($h_2(x) = 1$) do
        Pick random $g(x) \in \mathbb{Z}_p[x]$ of degree less than deg(f).
        $h_1(x) \leftarrow \gcd(f(x), g(x)^{(p^i - 1)/2} - 1)$.
        $h_2(x) \leftarrow f(x)/h_1(x)$.
    end while
    List1 ← IrreducibleFactors($h_1(x)$, i, p)
    List2 ← IrreducibleFactors($h_2(x)$, i, p)
    Return List1 ∪ List2.
end procedure
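
A Python sketch of this phase follows, drawing a fresh random g(x) on every attempt. The helper names are ours; the input is assumed to be monic and square-free with all irreducible factors of degree i, and p is assumed to be an odd prime.

```python
import random

def trim(f):
    while f and f[-1] == 0:
        f.pop()
    return f

def polydivmod(f, g, p):
    """Quotient and remainder of f by a non-zero g over Z_p."""
    f = trim([c % p for c in f])
    q = [0] * max(len(f) - len(g) + 1, 1)
    inv = pow(g[-1], -1, p)
    while len(f) >= len(g):
        c, shift = f[-1] * inv % p, len(f) - len(g)
        q[shift] = c
        for i, b in enumerate(g):
            f[shift + i] = (f[shift + i] - c * b) % p
        trim(f)
    return trim(q), f

def polygcd(f, g, p):
    """Monic gcd over Z_p (f non-zero); gcd(f, 0) is f made monic."""
    while g:
        f, g = g, polydivmod(f, g, p)[1]
    inv = pow(f[-1], -1, p)
    return [c * inv % p for c in f]

def polymulmod(f, g, m, p):
    out = [0] * (len(f) + len(g) - 1) if f and g else []
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return polydivmod(out, m, p)[1]

def polypowmod(f, e, m, p):
    result, base = [1], polydivmod(f, m, p)[1]
    while e:
        if e & 1:
            result = polymulmod(result, base, m, p)
        base = polymulmod(base, base, m, p)
        e >>= 1
    return result

def equal_degree_factors(f, i, p, rng=random):
    """List of the monic irreducible factors (each of degree i) of f."""
    if len(f) - 1 == i:
        return [list(f)]
    while True:
        g = trim([rng.randrange(p) for _ in range(len(f) - 1)])
        if not g:
            continue
        w = polypowmod(g, (p ** i - 1) // 2, f, p)
        w = trim([(w[0] - 1) % p] + w[1:]) if w else [p - 1]   # w - 1
        h1 = polygcd(f, w, p)
        if 1 < len(h1) < len(f):                               # proper factor found
            h2 = polydivmod(f, h1, p)[0]
            return (equal_degree_factors(h1, i, p, rng)
                    + equal_degree_factors(h2, i, p, rng))
```

For instance, over $\mathbb{Z}_3$ the polynomial $x^4 + x^3 + x + 2$ is split into $x^2 + 1$ and $x^2 + x + 2$ (in some order).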

Finally, for each h(x) in the list of irreducible factors obtained from Algorithm 8,
we find the largest exponent e such that $h(x)^e$ divides f(x). This completes the
description of the factorization algorithm for polynomials in $\mathbb{Z}_p[x]$.
Remarks: In the above descriptions, we assumed that p is an odd prime. If p = 2,
then a small change is needed in Phase 3; however we skip the details. This method
also:
1. works over any finite field (apart from $\mathbb{Z}_p$);
2. can be combined with Hensel lifting to factorize polynomials in $\mathbb{Z}_{p^k}[x]$;
3. can be extended to factorize polynomials in several variables over a finite field.
II
Quadratic Equations in Two Variables

7 Primality Testing: Before 2002
7.1 Fermat and Mersenne primes
7.2 Testing Fermat’s little theorem
7.3 Fibonacci and Lucas pseudoprimality tests
7.4 The Miller-Rabin Test

8 The Integer Factoring Problem
8.1 Trial Division and Fermat’s Method
8.2 Pollard rho Algorithm
8.3 Dixon’s Algorithm

9 Primality Testing: The AKS algorithm
9.1 A Polynomial Identity
9.2 The Algorithm
9.3 Correctness
7. Primality Testing: Before 2002

7.1 Fermat and Mersenne primes


Some of the largest primes known have been primes of the form $a^b \pm 1$. We consider
two such forms.

7.1.1 Primes of the form $2^n + 1$

The 17th century mathematician Fermat noticed that all of the numbers $2 + 1$, $2^2 + 1$,
$2^4 + 1$, $2^8 + 1$, $2^{16} + 1$ are prime and he conjectured that all the numbers $2^{2^n} + 1$ are
prime. But this turned out to be false. Euler showed in the 1700s that $2^{32} + 1$ is
divisible by 641, and modern computation has revealed only composite numbers
so far among the further numbers of the form $2^{2^n} + 1$. Indeed, it is possible that the sequence
contains no further primes at all, although this hasn’t been proved either. Numbers
of the form $2^{2^n} + 1$ are called Fermat numbers and if they are prime, they are called
Fermat primes.
Let us see why it is necessary that n itself should be a power of 2 for $2^n + 1$ to be a
prime.
Claim 16 If $2^n + 1$ is a prime, then n is a power of 2.

Proof. Suppose for contradiction that n = ab, where a > 1 is odd. Then we have:
$2^n + 1 = 2^{ab} + 1 = (2^b)^a + 1$, which is divisible by $2^b + 1$, because more generally if a is
odd, then $x^a + 1$ is divisible by (x + 1) (using the remainder theorem for polynomials).
This contradicts the assumption that $2^n + 1$ is prime. Therefore n cannot have any
odd divisors greater than 1 and must be a power of 2. ■

A factor of $2^{32} + 1$: How did Euler find that 641 was a factor of $2^{32} + 1$? Here’s an
observation: suppose that a prime p divides $2^{2^n} + 1$. Then by considering the order of 2 with respect
to p (which is $2^{n+1}$, since $2^{2^n} \equiv -1$), we note that $2^{n+1}$ must divide p − 1, i.e. $p \equiv 1 \pmod{2^{n+1}}$. Thus it is sufficient
to look for such factors. For n = 5, it is sufficient to look for numbers of the form
64k + 1. We can speculate that Euler tested divisibility for small values of k
and found a hit for k = 10.
Let’s prove that 641 does divide $2^{32} + 1$. We have: $10 \times 2^6 \equiv -1 \pmod{641}$, that
is: $5 \times 2^7 \equiv -1 \pmod{641}$. Raising both sides to the fourth power, we obtain:
$625 \times 2^{28} \equiv 1 \pmod{641}$. Since $625 \equiv -16 \pmod{641}$, we obtain: $-2^{32} \equiv 1 \pmod{641}$,
which is the desired claim.
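
The search restricted to candidates of the form $k \cdot 2^{n+1} + 1$ takes only a few lines. The function name and the `limit` parameter are ours; this is a sketch of the speculated search, not a claim about how Euler proceeded.

```python
def smallest_fermat_factor(n, limit=10**6):
    """Smallest divisor of 2^(2^n) + 1 of the form k*2^(n+1) + 1 up to limit, or None."""
    F = 2 ** (2 ** n) + 1
    step = 2 ** (n + 1)
    d = step + 1
    while d * d <= F and d <= limit:
        if F % d == 0:
            return d
        d += step
    return None
```

For n = 5 the candidates are 65, 129, 193, . . . and the search stops at the tenth one, 641.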

7.1.2 Primes of the form $2^n - 1$

Mersenne primes are primes of the form $2^p - 1$ where p is a prime. As in the case of
Fermat primes, let us see why the restriction on the exponent is necessary.
Claim 17 If $2^n - 1$ is prime, then n is prime.

Proof. Suppose for contradiction that n = ab where a, b > 1. Then $2^n - 1 = 2^{ab} - 1$
is divisible by $2^a - 1$ which is a proper divisor of $2^n - 1$. This follows as in the case
of Fermat primes, from the more general divisibility of $x^{ab} - 1$ by $x^a - 1$ using the
remainder theorem. Thus, we obtain a contradiction if n is composite and hence n
must be prime. ■

The largest explicit primes that we know are, and have long been, Mersenne primes; the
reason for this is that there is an algorithm running in time $\tilde{O}(p^2)$ to check
whether $2^p - 1$ is prime.
Some large values of p such that $2^p - 1$ is prime: 13466917 (record-holder in
2001), 32582657 (record-holder in 2006), 57885161 (record-holder in 2013), 82589933
(record-holder in 2018). These primes are also the record-holders for the largest
known prime for those years.
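
The notes do not name the algorithm; the standard test for Mersenne numbers is the Lucas-Lehmer test, and we assume that it is the one meant here. A minimal sketch for an odd prime exponent p: set $s_0 = 4$, iterate $s \leftarrow s^2 - 2$ modulo $2^p - 1$ exactly p − 2 times, and accept if the result is 0.

```python
def lucas_lehmer(p):
    """True iff 2^p - 1 is prime, for an odd prime p (Lucas-Lehmer test)."""
    M = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % M
    return s == 0
```

The p − 2 modular squarings of p-bit numbers account for the roughly quadratic running time when fast multiplication is used.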

7.2 Testing Fermat’s little theorem


A natural candidate for a primality test is Fermat’s little theorem, i.e. we can check
whether the relation $a^n \equiv a \pmod n$ holds for some values of a. If n is prime, then
the above congruence should hold for every a. What if n is composite? For "most"
composites n, there will be "many" values of a for which the above congruence will
fail to hold. Unfortunately though, there do exist composite numbers n for which
the above relation holds for every a. Such numbers are called Carmichael numbers
and the smallest of them is 561.
Thus, a direct testing of Fermat’s little theorem is insufficient as a primality test,
nevertheless it is still at the basis of the two well-known primality testing algorithms:
the Miller-Rabin algorithm (Section 7.4) and the AKS algorithm (Chapter 9).
Further, Fermat’s little theorem can still show up the compositeness of many com-
posite numbers, which we make precise in the following claim.
Claim 18 Suppose that n is a composite number and let $A = \{a \in \mathbb{Z}_n^* \mid a^{n-1} \equiv 1 \bmod n\}$.
Then $A = \mathbb{Z}_n^*$ or $|A| \le |\mathbb{Z}_n^*|/2$.
If n is such that $|A| \le |\mathbb{Z}_n^*|/2$, then picking random $a \in \mathbb{Z}_n \setminus \{0\}$ and applying the
Fermat’s little theorem test will, with probability at least 1/2, show that n is
composite.
To prove Claim 18, we observe that the set A is a subgroup of $\mathbb{Z}_n^*$; by Lagrange’s theorem,
|A| divides $|\mathbb{Z}_n^*|$, so a proper subgroup has size at most $|\mathbb{Z}_n^*|/2$, and the claim follows.
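
The dichotomy of Claim 18 is easy to observe for small n. In the sketch below (function name ours), the set A is computed by brute force.

```python
from math import gcd

def fermat_set(n):
    """Return (A, units): units is Z_n^*, and A its elements a with a^(n-1) = 1 (mod n)."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    A = [a for a in units if pow(a, n - 1, n) == 1]
    return A, units

A15, U15 = fermat_set(15)
A561, U561 = fermat_set(561)
```

For n = 15 we get A = {1, 4, 11, 14}, exactly half of the 8 units, while for the Carmichael number 561 the set A is all of $\mathbb{Z}_{561}^*$.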

7.3 Fibonacci and Lucas pseudoprimality tests


The second primality testing idea we look at is also a pseudoprimality test, i.e. every
prime will pass the test, but some composites may also pass. In practice though,
such tests can be fast and can be combined with other tests (like Fermat’s little
theorem).
The Lucas sequence $U_n(a, b)$ is defined as:

$U_0 = 0$, $U_1 = 1$, $U_n = aU_{n-1} + bU_{n-2}$.

For each fixed value of a, b ∈ N, we get a sequence of natural numbers. When
a = 1, b = 1, we get the Fibonacci sequence $F_n$.
The polynomial $f_{a,b}(x) = x^2 - ax - b$ is called the characteristic polynomial of
$U_n(a, b)$. We shall henceforth assume that a, b are fixed and write f(x), $U_n$ in place
of $f_{a,b}(x)$, $U_n(a, b)$.
Claim 19 Let α, β be the roots of f(x) and $d = a^2 + 4b$ (the discriminant of f). If d ≠ 0, then

$$U_n = \frac{\alpha^n - \beta^n}{\sqrt{d}}.$$

Claim 20 Let $M = \begin{pmatrix} a & b \\ 1 & 0 \end{pmatrix}$. Then, for every n ≥ 1,

$$M^n = \begin{pmatrix} U_{n+1} & bU_n \\ U_n & bU_{n-1} \end{pmatrix}.$$

In particular, we can compute $U_n$ modulo m in time poly(log n, log m) by using
repeated squaring on the matrix M.
We also note that the characteristic polynomial of M is also f(x), so that $M^2 - aM - bI_2 = 0$.
Let $r = \left(\frac{d}{p}\right)$ for an odd prime p not dividing d, i.e. r = 1 if d is a square modulo p and r = −1 otherwise.

Theorem 7.1 $U_{p-r} \equiv 0 \pmod p$.

In particular, for the Fibonacci sequence $F_n = U_n(1, 1)$ (for which d = 5), we have:

Corollary 7.1 If n is a prime, then $n \mid F_{n-1}$ if $n \equiv \pm 1 \pmod 5$ and $n \mid F_{n+1}$ if $n \equiv \pm 2 \pmod 5$.

The Lucas primality test: Fix a Lucas sequence U(a, b). Given an integer n, find
$r = \left(\frac{d}{n}\right)$; this is easier when d is a prime ≡ 1 mod 4. Then find $U_{n-r}$ mod n; if it is
zero, then n passes the test. When the sequence is $F_n$, the test is called the Fibonacci
pseudo-primality test.

Proof of Theorem 7.1: We consider the cases r = 1 and r = −1 separately.

Case 1: r = 1
We work in $\mathbb{Z}_p$. Let $c \in \mathbb{Z}_p$ be such that $c^2 = d$. Then α = (a + c)/2 and β = (a − c)/2 are
elements of $\mathbb{Z}_p$; assuming that p does not divide b, we have αβ = −b ≠ 0, so both are elements of $\mathbb{Z}_p^*$
and we have $\alpha^{p-1} = \beta^{p-1} = 1$; hence $U_{p-1} = \frac{\alpha^{p-1} - \beta^{p-1}}{c} = 0$.
Case 2: r = −1
Now we work in $\mathbb{Z}_p[x]$ modulo $(x^2 - d)$, where x plays the role of $\sqrt{d}$, so that α = (a + x)/2 and β = (a − x)/2.
We have $x^{p-1} = (x^2)^{(p-1)/2} = d^{(p-1)/2} = -1$. Now $(a + x)^p = a^p + x^p = a + x^p = a - x$;
hence $(a + x)^{p+1} = (a - x)(a + x) = a^2 - x^2 = a^2 - d = -4b$.

Similarly, $(a - x)^p = a + x$ and $(a - x)^{p+1} = a^2 - x^2 = -4b$.

Thus, $(a + x)^{p+1} = (a - x)^{p+1} \pmod{(x^2 - d)}$, i.e. $U_{p+1} = 0$.
This completes the proof of Theorem 7.1. ■
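
The Fibonacci test can be sketched in Python as follows, computing $U_n$ modulo m by repeated squaring of the matrix M. The function names are ours; the sketch assumes that n is not divisible by 5 and uses the fact that the symbol of 5 over n is 1 exactly when n ≡ ±1 (mod 5).

```python
def lucas_U(n, a, b, m):
    """U_n(a, b) mod m, read off as the lower-left entry of M^n with M = [[a, b], [1, 0]]."""
    def matmul(X, Y):
        return [[(X[0][0] * Y[0][0] + X[0][1] * Y[1][0]) % m,
                 (X[0][0] * Y[0][1] + X[0][1] * Y[1][1]) % m],
                [(X[1][0] * Y[0][0] + X[1][1] * Y[1][0]) % m,
                 (X[1][0] * Y[0][1] + X[1][1] * Y[1][1]) % m]]
    result, base = [[1, 0], [0, 1]], [[a % m, b % m], [1, 0]]
    while n:
        if n & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        n >>= 1
    return result[1][0]

def fibonacci_test(n):
    """True if n passes the Fibonacci pseudoprimality test (n not divisible by 5)."""
    assert n % 5 != 0
    r = 1 if n % 5 in (1, 4) else -1
    return lucas_U(n - r, 1, 1, n) == 0
```

Every prime other than 5 passes; composites such as 9 and 21 fail, while 323 = 17 · 19 is a composite that passes, which illustrates why this is only a pseudoprimality test.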

Remarks:
1. The calculations for $U_{p-r}$ (or for $V_{p-r}$) essentially come from the identity
$(x + a)^p = x^p + a$ in $\mathbb{Z}_p[x]$, where we further reduced the term $x^p$ modulo some
quadratic polynomial. Writing $F(x) = (x + a)^p$ and $G(x) = x^p + a$, the Lucas
test verifies that F(x) = G(x) modulo some quadratic polynomial. This identity,
which is true only for primes, is the starting point of the AKS algorithm, which
checks that F(x) = G(x) modulo h(x) for a bunch of polynomials h(x).
2. It appears that we do not know any composite $n \equiv \pm 2 \pmod 5$ which passes
both the Fibonacci test as well as the test $2^{n-1} \equiv 1 \pmod n$; thus, performing
just these two tests should detect composites effectively for such values.
If you find such a composite n which does pass both tests, you win $620 with
$500 from Selfridge, and $20 from Pomerance. If you prove that no such
composite exists, you still win the same amount, with $500 from Pomerance,
and $20 from Selfridge. The remaining $100 comes from Wagstaff in both
cases.

7.4 The Miller-Rabin Test


The Miller-Rabin test is a randomized primality test that runs in polynomial time and it was the first of this kind. It is based on the idea that if $n$ has at least two distinct prime factors $p, q$, then there are at least four distinct square-roots for 1 in $\mathbb{Z}_n$, whereas if $n$ is prime, then there are exactly two square-roots, namely $-1, 1$, in $\mathbb{Z}_n$.
We also know that if $n$ is prime, then for $0 < a < n$, we must have $a^{n-1} \equiv 1 \pmod n$. Thus, the idea is to consider the numbers $a^{(n-1)/2}, a^{(n-1)/4}, \ldots$, until we find a number $c$ which is not equal to 1. If $c \neq -1$, then we know that $n$ is not prime. However, it may also happen that $c = -1$ for composites $n$. The claim however is that $c \neq -1$ with sufficiently large probability if $n$ is composite and if $a$ is a random element of $\mathbb{Z}_n$.
We first describe the algorithm in detail.

Algorithm 9 Miller-Rabin Algorithm for primality testing


1: procedure Miller-Rabin Test(n)
2: If n > 2 is even, return COMPOSITE.
3: If n is a perfect power, return COMPOSITE.
4: Find $t, m$ such that $n - 1 = 2^t m$ with $t \ge 1$, $m$ odd.
5: Pick random $a \in \{2, \ldots, n - 1\}$.
6: If $\gcd(a, n) \neq 1$, return COMPOSITE.
7: $b \leftarrow a^m$, $i \leftarrow 0$, $c \leftarrow -1$.
8: while $i < t$ AND $b \neq 1$ do
9: $c \leftarrow b$, $b \leftarrow b^2$. ▷ Calculations in $\mathbb{Z}_n$
10: i ← i + 1.
11: end while
12: if b ̸= 1 then
13: return COMPOSITE.
14: end if
15: if c ̸= −1 then
16: return COMPOSITE.
17: end if
18: Return PRIME.
19: end procedure
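A minimal Python sketch of one round of Algorithm 9 is given below. It is an illustration, not a tuned implementation: the names `is_perfect_power`, `miller_rabin_round` and `is_probable_prime` are hypothetical, the optional argument `a` is added only so that a fixed base can be tried, and the loop squares at most $t$ times so that the final value of $b$ is $a^{n-1}$.

```python
import math
import random


def is_perfect_power(n):
    """Return True if n = x**k for some integers x >= 2 and k >= 2."""
    for k in range(2, n.bit_length() + 1):
        lo, hi = 2, 1 << (n.bit_length() // k + 1)
        while lo <= hi:                      # binary search for a k-th root
            mid = (lo + hi) // 2
            power = mid ** k
            if power == n:
                return True
            if power < n:
                lo = mid + 1
            else:
                hi = mid - 1
    return False


def miller_rabin_round(n, a=None):
    """One round of the test; returns "PRIME" or "COMPOSITE"."""
    if n < 2:
        return "COMPOSITE"
    if n in (2, 3):
        return "PRIME"
    if n % 2 == 0 or is_perfect_power(n):
        return "COMPOSITE"
    t, m = 0, n - 1
    while m % 2 == 0:                        # write n - 1 = 2^t * m, m odd
        t += 1
        m //= 2
    if a is None:
        a = random.randrange(2, n)           # random a in {2, ..., n-1}
    if math.gcd(a, n) != 1:
        return "COMPOSITE"
    b, i, c = pow(a, m, n), 0, n - 1         # c = -1 in Z_n
    while i < t and b != 1:
        c, b = b, b * b % n
        i += 1
    if b != 1:                               # a^(n-1) != 1: Fermat test fails
        return "COMPOSITE"
    if c != n - 1:                           # a square root of 1 other than +-1
        return "COMPOSITE"
    return "PRIME"


def is_probable_prime(n, rounds=20):
    """Repeat the round with independent random bases."""
    return all(miller_rabin_round(n) == "PRIME" for _ in range(rounds))
```

For the Carmichael number 561, the base $a = 2$ already exposes a non-trivial square root of 1 (namely 67), so `miller_rabin_round(561, 2)` returns COMPOSITE even though $2^{560} \equiv 1 \pmod{561}$.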

7.4.1 Analysis of time complexity


Step 3: To check if $n$ is a perfect power, it is sufficient to check whether $n$ is a $k$th power for each $2 \le k \le \log_2 n$. For every fixed $k$, this can be done by binary search. Thus the time complexity of this step is at most $O(\log^2 n)$ (excluding the complexity of the arithmetic operations involved).
Step 4: The values of $t, m$ can be found in $O(\log n)$ time.
Step 8: The number of iterations is at most $t \le \log_2 n$.
The other steps need at most $O(\log n)$ arithmetic operations (for instance, computing $a^m$ in step 7 by repeated squaring), thus the overall time complexity is $O(\log^2 n)$ times the complexity of arithmetic operations in $\mathbb{Z}_n$, i.e. $\tilde{O}(\log^3 n)$.

7.4.2 Analysis of correctness probability


The main claim is that the Miller-Rabin algorithm is correct with probability at least
1/2 on every input.
Claim 21 • If n is prime, then the algorithm will return PRIME.


• If n is composite, then the algorithm will return COMPOSITE with probability
at least 1/2.

Proof. If $n$ is prime, then consider the values of $c, b$ at the end of the while loop. If the while loop terminates because $b = 1$, then $c$ is a square-root of 1 in $\mathbb{Z}_n$ (or $c$ still has its initial value $-1$, if the loop body was never executed). Further $c \neq 1$ since the loop stops the first time that $b$ becomes 1. Thus $c = -1$ and the algorithm goes to step 18.
If the while loop terminates because $i = t$, then by Fermat's little theorem, the value of $b$ must still be 1 at the end of the while loop, because $a^{m 2^t} = a^{n-1} \equiv 1 \pmod n$. As before, the algorithm goes to step 18.
Now, suppose that $n$ is composite. If $n$ is a perfect power of a prime, then the algorithm outputs COMPOSITE in step 3. Thus, suppose that $n$ is composite and has at least two prime factors. Let $A = \{s \in \mathbb{Z}_n^* \mid s^{n-1} = 1\}$. If $|A| \le |\mathbb{Z}_n^*|/2$, then with probability at least 1/2, the algorithm will return COMPOSITE in line 13. Otherwise, using Claim 18, we deduce that $A = \mathbb{Z}_n^*$.
Let $d \ge 4$ be the number of square-roots of unity in $\mathbb{Z}_n$. For $u \in \mathbb{N}$, we define the set $S_u = \{s \in \mathbb{Z}_n^* : s^u = 1\}$. Note that $S_u$ is a subgroup of $\mathbb{Z}_n^*$ and also that $S_{n-1} = A = \mathbb{Z}_n^*$.
We also have: if $u = u_1 u_2$ such that $\gcd(u_1, u_2) = 1$, then $S_u$ is isomorphic to $S_{u_1} \times S_{u_2}$.
Thus, we get: $\mathbb{Z}_n^* = S_{n-1}$ is isomorphic to $S_{2^t} \times S_m$.
Consider the directed graph $G = (V, E)$ where $V = S_{2^t}$ and $E = \{(a, a^2) \mid a \in S_{2^t}\}$, with calculations modulo $n$. Then the underlying graph of $G$ is a tree (with a self-loop at 1); if we fix 1 as the root, then it is a rooted tree $T$ with the parent of vertex $a$ being $a^2$. Also by our deduction that $\mathbb{Z}_n^*$ is isomorphic to $S_{2^t} \times S_m$, we find that the set $\{a^m : a \in \mathbb{Z}_n^*\}$ is equal to $V$.
Thus, in terms of the tree T , the algorithm picks a random vertex in V and traverses
up the tree until it reaches the root. For an element a ∈ V , let T (a) denote the subtree
rooted at a. To show that the algorithm returns COMPOSITE with probability at
least 1/2, we must show that |T (−1)| ≤ |V |/2.
The root vertex has $d - 1$ children, where $d \ge 4$ is the number of square-roots of unity in $\mathbb{Z}_n$; let them be $a_1 = -1, a_2, \ldots, a_{d-1}$. Every other node in the tree has either zero children (if it is not a square in $\mathbb{Z}_n$) or $d$ children (if it is a square in $\mathbb{Z}_n$). Since $d \ge 4$, if we show that the height of $T(a_i)$ is greater than or equal to the height of $T(-1)$ for each $i \ge 2$, this would imply that $|T(-1)| \le |V|/3$.
Let $T(-1)$ have height $k$. That is, there exists an element $\alpha \in \mathbb{Z}_n$ such that $\alpha^{2^k} = -1$.
Let $n = p_1^{e_1} \cdots p_r^{e_r}$. Let $\phi$ be the natural isomorphism from $\mathbb{Z}_n$ to $\mathbb{Z}_{p_1^{e_1}} \times \cdots \times \mathbb{Z}_{p_r^{e_r}}$, i.e. $\phi(x) = (x_1, \ldots, x_r)$ where $x \equiv x_j \pmod{p_j^{e_j}}$. Notice that $\phi(-1) = (-1, \ldots, -1)$, whereas for $i \ge 2$, $\phi(a_i)$ is a tuple consisting of 1s and -1s, with at least one -1 and at least one 1. Suppose that $\phi(\alpha) = (\alpha_1, \ldots, \alpha_r)$. Let $\beta$ be such that $\phi(\beta)_j = \alpha_j$ whenever $\phi(a_i)_j = -1$ and let $\phi(\beta)_j = 1$ whenever $\phi(a_i)_j = 1$. Then we can observe that $\beta^{2^k} = a_i$. Thus, the height of $T(a_i)$ is greater than or equal to the height of $T(-1)$. This shows that $|T(-1)| \le |V|/3$ and completes the proof of Claim 21.

8. The Integer Factoring Problem

8.1 Trial Division and Fermat’s Method


The simplest method of factoring a given integer $n$ is trial division, which is to divide by each number $a \in \{2, \ldots, \lfloor\sqrt{n}\rfloor\}$. The time complexity of this method is $\tilde{O}(\sqrt{n})$.
In later sections, we will see factoring algorithms which are asymptotically faster, but we first see an alternative method to trial division, whose worst-case complexity is however $\Theta(n)$.
A simple observation that can help factor the number 323 is that $323 + 1 = 324 = 18^2$. Hence, $323 = 18^2 - 1 = 17 \times 19$. More generally, suppose that we consider the numbers $n + 1^2, n + 2^2, \ldots$ until we find a perfect square. Then we have $n + a^2 = b^2$ and we get the factorization $n = (b - a)(b + a)$. This idea is called Fermat's method, and it can be effective if $n = AB$ with $A, B$ very close to each other. For such a factorization, the corresponding $a, b$ are $a = (B - A)/2$, $b = (B + A)/2$, thus the number of steps is $(B - A)/2$. The worst-case complexity of this method is $\Theta(n)$, but it may be a useful test in combination with trial division and can be stopped in case we don't find a factor after some threshold number of steps.
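Both methods fit in a few lines of Python. The sketch below assumes that $n$ is odd for Fermat's method (otherwise $n + a^2$ need not ever be a square) and starts the search at $a = 0$ so that perfect squares are handled too; the function names and the `max_steps` threshold parameter are illustrative choices.

```python
import math


def trial_division(n):
    """Return the smallest non-trivial divisor of n, or None if n is prime."""
    for a in range(2, math.isqrt(n) + 1):
        if n % a == 0:
            return a
    return None


def fermat_factor(n, max_steps=None):
    """Search a = 0, 1, 2, ... for n + a^2 = b^2 (n odd).

    Returns (b - a, b + a), or None if max_steps is exceeded.
    """
    a = 0
    while max_steps is None or a <= max_steps:
        b = math.isqrt(n + a * a)
        if b * b == n + a * a:
            return b - a, b + a
        a += 1
    return None
```

Here `fermat_factor(323)` succeeds at $a = 1$, whereas for $8857 = 17 \times 521$ the search needs $a = 252$ steps, so a small threshold such as `max_steps=10` gives up and returns `None`.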

8.2 Pollard rho Algorithm


In 1975, J. Pollard came up with an interesting algorithm that takes $O(n^{1/4})$ time to find a factor of $n$ (if it exists) with high probability.
There are two ideas behind the Pollard rho algorithm, which we first explain.
Lemma 22 If we pick a random sequence $a_1, a_2, \ldots, a_k$ with each $a_i$ being independently chosen from $\{1, 2, \ldots, N\}$, and if $k = 4\sqrt{N}$, then the probability that there exist $i < j$ with $a_i = a_j$ is at least 0.6.
The above lemma is a generalization of the so-called birthday paradox, which is that if we pick 23 people randomly, then with probability more than 1/2, there will be two people with the same birthday (assuming independence of birthdays). We skip the proof of the lemma but we note that an exact expression for the desired probability is
$$1 - \frac{N(N - 1)(N - 2) \cdots (N - k + 1)}{N^k}.$$
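The collision probability is easy to evaluate numerically. The following small sketch (the function name `collision_probability` is an illustrative choice) multiplies out the probability that all $k$ draws are distinct and subtracts it from 1.

```python
def collision_probability(N, k):
    """Probability that k independent uniform draws from N values repeat."""
    p_distinct = 1.0
    for i in range(k):
        # The (i+1)-th draw must avoid the i values already seen.
        p_distinct *= (N - i) / N
    return 1.0 - p_distinct
```

With $N = 365$ the value first exceeds 1/2 at $k = 23$, and for $k = 4\sqrt{N}$ it is already very close to 1, comfortably above the bound 0.6 stated in Lemma 22.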
Lemma 23 Given a function f : {0, 1, . . . , N − 1} → {0, 1, . . . , N − 1} and a sequence
a1 , . . . , am where ai = f (ai−1 ) for every i ≥ 2, there’s an algorithm which can test in
O(m) time whether there exist distinct i, j such that ai = aj and further find such
i, j if they exist.

Proof. The problem described in the statement is known as the cycle detection problem and the solution that we will describe is known as Floyd's cycle detection method.
Let $j$ be the least index such that $a_j$ is equal to some previous element, say $a_i$, and let $L = j - i$. Then for every $k \ge i$, we have $a_k = a_{k+L}$, while the elements $a_1, \ldots, a_{i-1}$ appear only once in the sequence. Note that even though the sequence given to us is only the first $m$ elements, the elements $a_{m+1}, \ldots$ are also well-defined by the relation $a_i = f(a_{i-1})$.
We claim that there exists $t < j$ such that $a_t = a_{2t}$. For $t \ge i$, the elements $a_t$ and $a_{2t}$ both lie on the cycle, and they coincide exactly when $2t \equiv t \pmod L$, i.e. when $t$ is of the form $t = qL$. Among the $L$ consecutive indices $i, i + 1, \ldots, j - 1$, exactly one is a multiple of $L$, which proves the claim.
Now, we describe the algorithm. For $t = 1, 2, \ldots$, we consider the pair of elements $a_t, a_{2t}$ in the $t$th iteration. If $a_t \neq a_{2t}$ for all $t \le m$, then we conclude that all the elements are distinct.
Otherwise, the least $t$ for which $a_t = a_{2t}$ is the least multiple of $L$ that is at least $i$. We may now find the value of $i$ in any number of ways, for example, by considering the pairs $(a_{t-1}, a_{2t-1}), (a_{t-2}, a_{2t-2})$ etc. until the values in the pair are different; the value of $L$ is then found by stepping from $a_i$ until $a_i$ is reached again, and $j = i + L$.
The number of iterations in this algorithm is $O(m)$; this completes the proof of the lemma. ■
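A compact Python sketch of Floyd's method follows. It uses the common variant that restarts one pointer from the beginning to locate the start of the cycle (rather than stepping backwards), so that the sequence never has to be stored. Indices here are 0-based, i.e. the sequence is $x_0, x_1 = f(x_0), \ldots$; the name `floyd_cycle` and the return convention are assumptions of this sketch.

```python
def floyd_cycle(f, x0):
    """Return (mu, lam) for the sequence x0, f(x0), f(f(x0)), ...

    mu is the 0-based index of the first element that lies on the cycle
    and lam is the length of the cycle.
    """
    # Phase 1: advance until x_t == x_{2t}; such a t is a multiple of lam.
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))

    # Phase 2: restart the tortoise; both now move one step at a time and
    # first meet at the start of the cycle.
    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1

    # Phase 3: walk once around the cycle to measure its length.
    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1
    return mu, lam
```

In the 1-based notation of the lemma, the first repeated pair is $i = \mu + 1$ and $j = \mu + \lambda + 1$. For instance, with $f(x) = x^2 + 1 \bmod 255$ and $x_0 = 3$ the sequence is $3, 10, 101, 2, 5, 26, 167, 95, 101, \ldots$, so the function returns `(2, 6)`.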

Algorithm 10 Pollard rho algorithm


1: procedure Pollard rho(n)
2: $m \leftarrow \lceil 4n^{1/4} \rceil$
3: Pick $a_1, r \in \{1, 2, \ldots, n - 1\}$ uniformly at random.
4: for $i = 2$ to $m$ do
5: $a_i \leftarrow a_{i-1}^2 + r$. ▷ $f(x) = x^2 + r$ in $\mathbb{Z}_n$.
6: end for
7: if $\gcd(a_i - a_j, n) \notin \{1, n\}$ for some $i, j$ then
8: Return gcd(ai − aj , n) as a factor.
9: end if
10: end procedure

The time complexity of the algorithm is $O(m) = O(n^{1/4})$, where we use Floyd's algorithm for Step 7. Suppose that $n$ is composite and $p$ is the least prime factor of $n$. Then $p \le \sqrt{n}$, so that $m \ge 4\sqrt{p}$. Treating the values $a_i \bmod p$ as independent random elements (a heuristic assumption, since the sequence is only pseudo-random), Lemma 22 shows that with probability at least 0.6, there exist $i, j$ such that $a_i \equiv a_j \pmod p$, so that $\gcd(a_i - a_j, n) \neq 1$ with probability at least 0.6.
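The following Python sketch combines the iteration $f(x) = x^2 + r$ with Floyd's method, comparing $a_t$ and $a_{2t}$ through a gcd instead of testing equality. The function name, the optional arguments `x0` and `r` (fixed values make a run reproducible) and the step bound of roughly $4n^{1/4}$ are choices made for this illustration.

```python
import math
import random


def pollard_rho(n, x0=None, r=None, max_steps=None):
    """Try to find a non-trivial factor of n; return it, or None on failure."""
    if x0 is None:
        x0 = random.randrange(1, n)
    if r is None:
        r = random.randrange(1, n)
    if max_steps is None:
        max_steps = 4 * (math.isqrt(math.isqrt(n)) + 1)   # about 4 n^(1/4)

    def f(x):
        return (x * x + r) % n

    tortoise = hare = x0
    for _ in range(max_steps):
        tortoise = f(tortoise)           # a_t
        hare = f(f(hare))                # a_{2t}
        g = math.gcd(tortoise - hare, n)
        if g == n:                       # collision modulo n itself: give up
            return None
        if g > 1:                        # collision modulo a proper divisor
            return g
    return None
```

For example, `pollard_rho(8051, x0=2, r=1)` finds the factor 97 of $8051 = 83 \times 97$ in the third iteration. When a call returns `None`, one would simply retry with fresh random values of `x0` and `r`.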

8.3 Dixon’s Algorithm


The previous algorithms for factoring were exponential-time algorithms, as they had time complexity of the form $O(n^c) = O(e^{c \log n})$. We now look at a subexponential time algorithm, of complexity $e^{O(\sqrt{\log n \log\log n})}$.
The main idea behind Dixon’s algorithm is to find two numbers α, β such that
α2 ≡ β 2 (mod n). If α, β are also random, then we can expect that gcd(α − β, n) is a
non-trivial factor of n with good probability.
How can we find such α, β? Let us see an example. Let n = 8857. In the calculations
below are the values of some squares modulo n (the congruences written are modulo
n) and their factorizations.

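$95^2 \equiv 168 = 2^3 \times 3 \times 7$ (8.1)
$97^2 \equiv 552 = 2^3 \times 3 \times 23$ (8.2)
$107^2 \equiv 2592 = 2^5 \times 3^4$ (8.3)
$173^2 \equiv 3358 = 2 \times 23 \times 73$ (8.4)
$206^2 \equiv 7008 = 2^5 \times 3 \times 73$ (8.5)
$742^2 \equiv 1430 = 2 \times 5 \times 11 \times 13$. (8.6)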

Consider the equations (8.2), (8.3), (8.4), (8.5). Multiplying all of them, we obtain:

$97^2 \times 107^2 \times 173^2 \times 206^2 \equiv 2^{14} \times 3^6 \times 23^2 \times 73^2 \pmod n$. (8.7)

Thus, we have $\alpha^2 \equiv \beta^2 \pmod n$, where $\alpha = 97 \times 107 \times 173 \times 206 \equiv 768 \pmod n$ and $\beta = 2^7 \times 3^3 \times 23 \times 73 \equiv 1289 \pmod n$. We find $\gcd(\alpha - \beta, n) = 521$ which is a divisor of $n$.
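The arithmetic of this example can be checked mechanically. The short Python script below recomputes the four squares, combines the exponent vectors and takes the gcd; the variable names are illustrative.

```python
import math

n = 8857
# a -> prime factorisation of a^2 mod n, for relations (8.2) to (8.5)
relations = {
    97: {2: 3, 3: 1, 23: 1},
    107: {2: 5, 3: 4},
    173: {2: 1, 23: 1, 73: 1},
    206: {2: 5, 3: 1, 73: 1},
}

# Each stated factorisation really equals a^2 mod n.
for a, fac in relations.items():
    assert a * a % n == math.prod(p ** e for p, e in fac.items())

# Add the exponent vectors; every coordinate must come out even.
total = {}
for fac in relations.values():
    for p, e in fac.items():
        total[p] = total.get(p, 0) + e
assert all(e % 2 == 0 for e in total.values())

alpha = math.prod(relations) % n                       # product of the a's
beta = math.prod(p ** (e // 2) for p, e in total.items()) % n
factor = math.gcd(alpha - beta, n)
print(alpha, beta, factor)                             # 768 1289 521
```

The complementary factor is $8857 / 521 = 17$.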
Now we explain the idea. Firstly, we compute several random squares modulo n.
We then maintain a list of those which have prime factors only in {p1 , . . . , pk }, where
pi is the ith prime number and where the threshold k is fixed in advance. We also
maintain the list of their corresponding factorizations.
Let a number in the list have the factorization $p_1^{e_1} \cdots p_k^{e_k}$. Consider the exponent vector $(e_1, \ldots, e_k)$. A key idea is that if there are at least $k + 1$ such exponent vectors, then there must be a subset of them whose sum is an even number in every coordinate. This is because the vector space $\mathbb{F}_2^k$ can have at most $k$ linearly independent vectors.
We then multiply the corresponding relations; this gives us on the RHS a number of the form $p_1^{2f_1} p_2^{2f_2} \cdots p_k^{2f_k}$, which is a square. Thus, we can set $\beta = p_1^{f_1} \cdots p_k^{f_k}$. The linear relation among the exponent vectors (modulo 2) can be found by Gaussian elimination.
We now describe the algorithm in detail.
Algorithm 11 Dixon’s algorithm


1: procedure Dixon(n)
2: Fix $B, m$ ▷ $B = m = \lceil e^{4\sqrt{\log n \log\log n}} \rceil$.
3: Find $P_B = \{p_1, \ldots, p_k\}$, the set of primes in $\{1, 2, \ldots, B\}$.
4: Pick $a_1, \ldots, a_m \in \{1, 2, \ldots, n - 1\}$ uniformly at random.
5: $L \leftarrow \emptyset$, $I \leftarrow \emptyset$.
6: for $i = 1$ to $m$ do
7: $b_i \leftarrow a_i^2$. ▷ In $\mathbb{Z}_n$.
8: Divide $b_i$ by each prime in $P_B$.
9: if $b_i$ is $B$-smooth then
10: $v_i \leftarrow (e_{i,1}, \ldots, e_{i,k})$, where $b_i = \prod_{j=1}^{k} p_j^{e_{i,j}}$.
11: Add $(a_i, b_i, v_i)$ to $L$; add $i$ to $I$.
12: end if
13: end for
14: Find $T \subseteq I$ such that $\sum_{i \in T} v_i \equiv (0, 0, \ldots, 0) \pmod 2$. ▷ $|T| \le k + 1$.
15: $\alpha \leftarrow \prod_{i \in T} a_i$; $\beta \leftarrow \prod_{j=1}^{k} p_j^{\left(\sum_{i \in T} e_{i,j}\right)/2}$. ▷ $\alpha^2 \equiv \beta^2 \pmod n$
16: Find $n_1 = \gcd(\alpha - \beta, n)$, $n_2 = n/n_1$ as possible non-trivial factors of $n$.
17: end procedure
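A small Python sketch along the lines of Algorithm 11 is given below. It departs from the pseudocode in a few labelled ways: relations are collected one at a time and a dependency is searched for after each new relation (instead of first drawing $m$ numbers), the smoothness bound `B` is a small fixed default rather than the asymptotic choice, and exponent vectors modulo 2 are stored as integer bitmasks. All function names are hypothetical.

```python
import math
import random


def primes_up_to(B):
    """All primes in {2, ..., B}, by naive trial division (fine for small B)."""
    return [p for p in range(2, B + 1)
            if all(p % q for q in range(2, math.isqrt(p) + 1))]


def smooth_exponents(b, primes):
    """Exponent vector of b over primes, or None if b is not smooth."""
    if b == 0:
        return None
    exps = []
    for p in primes:
        e = 0
        while b % p == 0:
            b //= p
            e += 1
        exps.append(e)
    return exps if b == 1 else None


def find_dependency(vectors):
    """Gaussian elimination over F_2 on bitmask vectors.

    Returns indices of a subset whose XOR is zero, or None.
    """
    basis = {}                      # pivot bit -> (reduced vector, index set)
    for idx, v in enumerate(vectors):
        combo = 1 << idx            # which original vectors were combined
        while v:
            pivot = v.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (v, combo)
                break
            v ^= basis[pivot][0]
            combo ^= basis[pivot][1]
        else:                       # v was reduced to zero: dependency found
            return [i for i in range(len(vectors)) if combo >> i & 1]
    return None


def dixon(n, B=30, max_tries=10000):
    """Return a non-trivial factor of composite n, or None on failure."""
    primes = primes_up_to(B)
    rels = []                       # list of (a, exponent vector of a^2 mod n)
    for _ in range(max_tries):
        a = random.randrange(2, n)
        g = math.gcd(a, n)
        if g > 1:
            return g
        exps = smooth_exponents(a * a % n, primes)
        if exps is None:
            continue
        rels.append((a, exps))
        masks = [sum((e & 1) << j for j, e in enumerate(ex)) for _, ex in rels]
        T = find_dependency(masks)
        if T is None:
            continue
        alpha = math.prod(rels[i][0] for i in T) % n
        beta = 1
        for j, p in enumerate(primes):
            beta = beta * pow(p, sum(rels[i][1][j] for i in T) // 2, n) % n
        g = math.gcd(alpha - beta, n)
        if 1 < g < n:
            return g
        rels.pop(T[-1])             # unlucky congruence: drop newest relation
    return None
```

Calling `dixon(8857)` returns 17 or 521, depending on the random choices. Recomputing the elimination after every relation is wasteful; it keeps the sketch short and is not meant to reflect the running-time analysis.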

Analysis of running time and success probability:

We first analyze the running time of the algorithm in terms of $n, B, m$ without fixing the values of $B, m$. We do not explicitly include the cost of arithmetic operations in $\mathbb{Z}_n$ which we know to be $\tilde{O}(\log^2 n)$.

• Lines 1-5: $O(m + B)$
• Lines 6-13: $O(mB)$
• Line 14: $O(k^3) = O(B^3)$ via Gaussian elimination; it turns out that this can be improved to $O(B^2)$ by using the sparseness of the matrix.
• Lines 15,16: $O(k + \log n) = O(B + \log n)$

We thus find that the total time complexity is $O(mB + B^2)$ with some additional poly(log n) factors. Next, we need to fix $m, B$ as functions of $n$.

The probability that the algorithm succeeds is related to the probability of finding at
least k +1 numbers among a1 , . . . , am which are B-smooth, and further the probability
that the final congruence α2 ≡ β 2 (mod n) gives us a non-trivial factor of n. We’ll
only focus on the first part.

Let $S(n, B)$ denote the number of $B$-smooth numbers in $\{1, 2, \ldots, n\}$. Then the expected number of random elements we must pick in order to find $k + 1$ numbers that are $B$-smooth is $\frac{(k + 1)n}{S(n, B)}$. Thus, the choice of $m$ will be a constant multiple of the above expression. We also need an estimate on $S(n, B)$ which is given below.
Lemma 24 If $B = n^{1/u}$, then $S(n, B) \sim \frac{n}{u^{u + o(1)}}$.
Writing $B = n^{1/u}$ and applying the above expression, we get the running time in terms of $n, u$ to be:
$$T(n, u) = \tilde{O}\left(u^{u+1} n^{2/u}\right).$$
Now we can find $u$ that minimizes $T(n, u)$ by taking logarithms and then differentiating with respect to $u$. This gives us: $\log u - \frac{2 \log n}{u^2} \sim 0$, so that $u \sim 2\sqrt{\frac{\log n}{\log\log n}}$.
Thus, we get the running time to be $T(n) = \tilde{O}\left(e^{2\sqrt{\log n \log\log n}}\right)$. This completes our analysis of Dixon's algorithm.
9. Primality Testing: The AKS algorithm

In 2002, Manindra Agrawal, Neeraj Kayal and Nitin Saxena came up with the first (and only) known deterministic polynomial time algorithm for primality testing. It is now commonly referred to as the AKS algorithm.

9.1 A Polynomial Identity


The first key ingredient in the AKS algorithm is the following observation.

Lemma 25 Let $a \in \mathbb{Z}_n^*$. Then $n$ is prime if and only if

$(x + a)^n = x^n + a$ (9.1)

holds in $\mathbb{Z}_n[x]$.

Proof. ⇒: Let $n = p$ be prime. Then in $\mathbb{Z}_p[x]$, we have: $(x + a)^p = x^p + \sum_{i=1}^{p-1} \binom{p}{i} x^i a^{p-i} + a^p = x^p + a$. The last equality follows from the fact that $p$ divides $\binom{p}{i}$ for $1 \le i \le p - 1$ and from the fact that $a^p \equiv a \pmod p$.

⇐: Let $n$ be composite with a prime factor $p$ and let $p^k$ be the largest power of $p$ that divides $n$. Then the coefficient of $x^p$ in $(x + a)^n$ is equal to $\binom{n}{p} a^{n-p}$, where $\binom{n}{p} = \frac{n(n - 1) \cdots (n - p + 1)}{p!}$ is divisible by $p^{k-1}$ but not by $p^k$. Since $a$ is invertible in $\mathbb{Z}_n$, this coefficient is non-zero in $\mathbb{Z}_n$, and $(x + a)^n \neq x^n + a$. ■
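For small $n$, Lemma 25 can be checked directly by comparing all $n + 1$ coefficients. The sketch below does this with `math.comb`; the function name is an illustrative choice, and the default $a = 1$ is always a unit of $\mathbb{Z}_n$. Since it inspects $n + 1$ coefficients, this check takes time exponential in $\log n$ and is not a practical primality test.

```python
import math


def satisfies_identity(n, a=1):
    """Check (x + a)^n == x^n + a in Z_n[x] coefficient by coefficient.

    Assumes n >= 2 and gcd(a, n) = 1.
    """
    for i in range(n + 1):
        # Coefficient of x^i in (x + a)^n is C(n, i) * a^(n - i).
        coeff = math.comb(n, i) * pow(a, n - i, n) % n
        # Coefficient of x^i in x^n + a.
        expected = ((1 if i == n else 0) + (a if i == 0 else 0)) % n
        if coeff != expected:
            return False
    return True
```

Filtering `range(2, 60)` with this function returns exactly the primes below 60.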

9.2 The Algorithm


We now describe the AKS algorithm. The polynomial computation is in Zn [x].
Algorithm 12 AKS algorithm


1: procedure AKS(n)
2: Test whether $n$ is a perfect power. If yes, return COMPOSITE.
3: Find $r \le \lceil 16 \log^5 n \rceil$ such that $\mathrm{ord}_r(n) > 4 \log^2 n$.
4: for $i = 1$ to $r$ do
5: If $\gcd(i, n) > 1$, return COMPOSITE.
6: end for
7: $k \leftarrow \lfloor 2\sqrt{r} \log n \rfloor$.
8: for $a = 1$ to $k$ do
9: If $(x + a)^n \not\equiv x^n + a \bmod (x^r - 1)$ return COMPOSITE.
10: end for
11: Return PRIME.
12: end procedure
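The test in line 9 is the computational core of the algorithm. A minimal Python sketch of it is shown below, using schoolbook multiplication of coefficient lists of length $r$ and repeated squaring; the function names are hypothetical, and a serious implementation would use fast polynomial multiplication instead.

```python
def poly_mul_mod(f, g, r, n):
    """Multiply f and g (coefficient lists of length r) mod (x^r - 1, n)."""
    h = [0] * r
    for i, fi in enumerate(f):
        if fi:
            for j, gj in enumerate(g):
                # x^(i+j) reduces to x^((i+j) mod r) modulo x^r - 1.
                h[(i + j) % r] = (h[(i + j) % r] + fi * gj) % n
    return h


def poly_pow_mod(f, e, r, n):
    """Compute f^e mod (x^r - 1, n) by repeated squaring."""
    result = [0] * r
    result[0] = 1 % n
    while e > 0:
        if e & 1:
            result = poly_mul_mod(result, f, r, n)
        f = poly_mul_mod(f, f, r, n)
        e >>= 1
    return result


def passes_line_9(n, a, r):
    """Return True if (x + a)^n == x^n + a modulo (x^r - 1, n)."""
    base = [0] * r
    base[0] = a % n
    base[1 % r] = (base[1 % r] + 1) % n          # the polynomial x + a
    lhs = poly_pow_mod(base, n, r, n)
    rhs = [0] * r
    rhs[0] = a % n
    rhs[n % r] = (rhs[n % r] + 1) % n            # x^n reduces to x^(n mod r)
    return lhs == rhs
```

For a prime $n$ the check succeeds for every $a$ and $r$, by Lemma 25. For composite $n$ and $r > n$ no reduction modulo $x^r - 1$ takes place, so for instance `passes_line_9(15, 1, 16)` is `False`; the point of the algorithm is that a much smaller $r$, together with several values of $a$, already suffices.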

9.2.1 Running Time:


Step 3 can be done as follows: For each $r \le 16 \log^5 n$, do the following: for each $d \le 4 \log^2 n$, check whether $n^d - 1$ is not divisible by $r$. If it is not divisible for every $d$, the corresponding $r$ is chosen. The time complexity of this step is $\tilde{O}(\log^8 n)$.
The other main contribution to the running time is lines 8-10; the number of iterations is $O(\log^{3.5} n)$ and each computation can be done in $\tilde{O}(\log^6 n)$ time (since $\deg(x^r - 1) = O(\log^5 n)$). Thus the total time complexity is $\tilde{O}(\log^{9.5} n)$.

9.3 Correctness
We will show that the algorithm returns PRIME if and only if n is prime. We will
also prove why r exists as in step 3.
If n is prime, then the algorithm clearly does not return COMPOSITE in lines 2,5.
It also does not return COMPOSITE in line 9 because of Lemma 25.
Suppose now that n is composite. If the algorithm does not return COMPOSITE in
line 2, then n must have at least two distinct prime factors; let p denote the least
prime factor of n.
Definition 9.1 Let $r$ be a fixed positive integer. Let $m \in \mathbb{N}$, $f(x) \in \mathbb{Z}_p[x]$. We say that the pair $(m, f(x))$ is introspective (with respect to $p$) if

$f(x)^m \equiv f(x^m) \bmod (p, x^r - 1)$. (9.2)

We denote by IP the set of all introspective pairs (with respect to $p$).

Examples:
• (1, f (x)) for every f (x), p, r;
• (p, f (x)) for every f (x), p, r;
• (m, x) for every m, p, r;
• (561, x + 1) for p = 3, 7, 11 and r = 4.
The connection of this definition to the algorithm is that in line 9, we are checking
whether the pair (n, x + a) is introspective.
Introspective pairs satisfy the following two multiplicative properties.


Claim 26 (a) Let $(m_1, f(x))$ and $(m_2, f(x))$ be introspective. Then $(m_1 m_2, f(x))$ is also introspective.
(b) Let $(m, f(x))$ and $(m, g(x))$ be introspective. Then $(m, f(x)g(x))$ is also introspective.
We can now explain the broad proof strategy.

9.3.1 The Proof Strategy


The strategy behind the proof of correctness is the following: We will show that if
the algorithm does not return COMPOSITE in lines 8-10, then there are "many"
introspective pairs by using the multiplicative property.
On the other hand, we will argue that if there are too many of them, then for a
suitable polynomial of some degree m, there are more than m + 1 roots. These roots
will correspond to the polynomials in the introspective pairs, and the polynomial
will be in a finite field, which gives the desired contradiction.

9.3.2 The Proof


Let $I = \{n^i p^j \mid i, j \ge 0\}$ and let $P = \left\{\prod_{a=1}^{k} (x + a)^{e_a}\right\}$, where the exponents $e_a$ range over all values in $\mathbb{N} \cup \{0\}$.


The following observation comes from the multiplicative properties of introspective
pairs and also uses the fact that the algorithm has not returned COMPOSITE in
lines 8-10.
Observation 27 For every m ∈ I and every f (x) ∈ P , the pair (m, f (x)) is introspective.

The sets $I, P$ are infinite; we first define some finite subsets of them. We define $G$ to be the set obtained by considering the values of $I$ modulo $r$. Formally, $G = \{a \in \mathbb{Z}_r : a \equiv i \bmod r \text{ for some } i \in I\}$. Note that $G$ is a subgroup of $\mathbb{Z}_r^*$ under multiplication.
Let $h(x)$ be an irreducible factor of $x^r - 1$ over $\mathbb{Z}_p$ of largest degree. We define $R$ to be the set obtained by considering the values of $P$ modulo $h(x)$. Formally, $R = \{f(x) \in \mathbb{Z}_p[x]/(h(x)) : f(x) \equiv g(x) \bmod h(x) \text{ for some } g(x) \in P\}$. We note that $R$ is a subgroup of $(\mathbb{Z}_p[x]/(h(x)))^*$.
We now make three claims about the sizes of $G$ and $R$.
Claim 28 Let $t = |G|$. Then $t > 4 \log^2 n$.

Claim 29 $|R| \ge \binom{t + k - 2}{k - 1}$.

Claim 30 $|R| \le \left(\frac{n^2}{2}\right)^{\sqrt{t}}$.
Assuming the claims, the contradiction follows from Claims 29 and 30 after substitut-
ing the known bounds on t, r, k. We skip this calculation and instead give sketches
of the proofs of the claims.

Proof of Claim 28. Let $d = \mathrm{ord}_r(n)$. Then the elements $1, n, \ldots, n^{d-1}$ are all distinct modulo $r$ and all belong to $G$, so that $t \ge d$. From line 3, we have: $\mathrm{ord}_r(n) > 4 \log^2 n$. This proves the claim. ■

Proof of Claim 29. Consider the polynomials in $P$ of degree less than $t$. The number of such polynomials is at least $\binom{t + k - 2}{k - 1}$. All these polynomials are distinct modulo $h(x)$ if $\deg(h(x)) \ge t$. Thus it suffices to prove that $\deg(h(x)) \ge t$. This can be done by considering the minimal polynomial of an element in $\mathbb{Z}_r$ that generates $G$, but we skip the details. ■

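Proof of Claim 30. Let $\hat{I} = \{n^i p^j \mid 0 \le i, j \le \lfloor\sqrt{t}\rfloor\}$. We have $|\hat{I}| = (\lfloor\sqrt{t}\rfloor + 1)^2 > t$, and every element of $\hat{I}$ reduces modulo $r$ to an element of $G$, which has only $t$ elements. Therefore there exist distinct $m_1, m_2 \in \hat{I}$ such that $m_1 \equiv m_2 \pmod r$.
Let $f(x)$ be an arbitrary polynomial in $R$. Since $(m_1, f(x))$ and $(m_2, f(x))$ are introspective, we have:

$f(x)^{m_1} \equiv f(x^{m_1}) \pmod{h(x)}$ (9.3)
and $f(x)^{m_2} \equiv f(x^{m_2}) \pmod{h(x)}$. (9.4)

Also, since $m_1 \equiv m_2 \pmod r$, we have $x^{m_1} \equiv x^{m_2} \pmod{x^r - 1}$, and hence: $x^{m_1} \equiv x^{m_2} \pmod{h(x)}$.
Thus, we find that $f(x)^{m_1} - f(x)^{m_2} \equiv 0 \pmod{h(x)}$. Now consider $F = \mathbb{Z}_p[x]/(h(x))$ which is a field. Seeing $f(x)$ as an element of $F$, we have that $f(x)$ is a root of the polynomial $Y^{m_1} - Y^{m_2}$.
Since $F$ is a field, the polynomial $Y^{m_1} - Y^{m_2}$ has at most $\max(m_1, m_2)$ roots. Thus
$$|R| \le \max(m_1, m_2) \le (np)^{\lfloor\sqrt{t}\rfloor} \le \left(\frac{n^2}{2}\right)^{\sqrt{t}}.$$ ■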
2

With the proof of the claims, we have shown that if n is composite, then the algorithm
will return COMPOSITE.
We now consider the problem of showing that there exists $r < 16 \log^5 n$ such that $\mathrm{ord}_r(n) > 4 \log^2 n$. For this, the following lemma is useful.
Lemma 31 Let $n \ge 1$ be a natural number. Then $\mathrm{lcm}(1, 2, \ldots, n) \ge 2^{(n-1)/2}$.

Proof. Let $L = \mathrm{lcm}(1, 2, \ldots, 2n + 1)$, and let $I = \int_0^1 x^n (1 - x)^n \, dx$. Then $0 < I < \frac{1}{4^n}$ and $LI$ is a positive integer. Thus, $L > 4^n$, from which the statement in the lemma may be obtained. ■

Let $D = \lfloor 4 \log^2 n \rfloor$ and $R = \lceil 16 \log^5 n \rceil$. Suppose that every $r \in \{1, 2, \ldots, R\}$ divides some number $n^d - 1$ for $d \le D$. Then $\mathrm{LCM}(1, 2, \ldots, R)$ must divide $(n - 1)(n^2 - 1) \cdots (n^D - 1)$. Thus, $\mathrm{LCM}(1, 2, \ldots, R) \le n^{D^2/2}$. Applying Lemma 31 for the LHS and substituting for $D$, we may obtain the desired result.
This completes the proof sketch of the correctness of the AKS algorithm. ■
