
Graham D. Williams

Notes

Contents

1 Sets, Relations and Functions
1.1 Sets
1.2 Relations
1.3 Functions

2 Groups
2.1 Operations and Monoids
2.2 Groups
2.3 Subgroups, Cyclic Groups and Lagrange’s Theorem
2.4 Homomorphisms and Isomorphisms
2.5 Quotient Groups
2.6 More on Permutations
2.7 Examples: Symmetries and Matrices
2.7.1 Symmetries
2.7.2 Matrix Groups

Chapter 1

Sets, Relations and Functions

1.1 Sets

A set is any collection of objects: numbers, points, people, cars, galaxies.


We usually denote sets by capital letters A, B, X, Y, . . .. The objects in
the set are called its elements or members. If the set is small we can write
out all its elements in a list and surround them by brackets of the form
{ }. Thus A = {1, 2, 3, 4} is the set comprising the numbers 1, 2, 3 and
4. If an element x belongs to a set X, we write x ∈ X. If not, we put
x ∉ X. So ∈ means “belongs to” or “is an element of”. In the example
above, we have 2 ∈ A, but 5 ∉ A.
In specifying a set, all that matters is whether something is in it or not.
The order in which we name the elements is immaterial, so the set above
could also be written A = {4, 3, 2, 1}. There is also no notion of including
something several times over. Thus {1, 1, 2, 1, 3, 4, 4, 1, 2} is still the same
set, though it’s redundant to keep repeating elements. In summary, sets


A and B are counted as equal if they have precisely the same elements:

1.1.1. A = B means that, for all x, we have x ∈ A ⇔ x ∈ B .

We have used here the shorthand symbol ⇔, which is short for “if
and only if”. It means that the two statements on either side of it are
equivalent. If either one of them is true, then the other must follow as a
logical consequence. We shall have more to say about logical symbols like
this in the future.
The elements of a set don’t have to be all of the same “type”:

C = { 2, Albert Einstein, the planet Mercury} is a perfectly good set. If
a set is large, in particular infinite, we cannot list all the elements explic-
itly. We might write A = {1, 2, . . . , 1000} to denote the set of all integers
from 1 to 1000, although strictly this is imprecise, since the reader is be-
ing asked to guess the pattern involved. A few sets occur so frequently
throughout Mathematics that we reserve special symbols for them:

N = {1, 2, 3, . . .} is the set of all positive integers, or natural numbers.


Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . .} is the set of all integers.
Q is the set of all rational numbers, ie. fractions of the form m/n.
R is the set of all real numbers, rational or irrational.

Thus 3/4 ∈ Q, 4/3 ∉ Z, √2 ∈ R, √2 ∉ Q.

A good way to specify a set is by means of set-builder notation. Suppose


we have some property or statement P (x) which may or may not be true
of a variable x. Here P (x) is a proposition which is true for some instances

of x and false for others. It could be an equation: 2x² − 7x + 3 = 0, an


inequality: 4x − 12 ≤ 0, or some other statement such as “x is a prime
number factor of 696729600”. We may now form the set consisting of
precisely those values of x for which the statement is true. This is written
as {x : P (x)}. In words: the set of x such that P (x) (is true). Thus we
have the sets A = {x : 2x² − 7x + 3 = 0}, B = {x : 4x − 12 ≤ 0} and
C = {x : x is a prime number factor of 696729600}. One might object
that A and B are still a bit vague, since we have not specified the nature of
x. There is usually a context in any mathematical discussion, so it may well
have been agreed in advance that x is a real number (rather than an apple,
orange or banana), but if we need to be even more precise we can write
A = {x : x ∈ R and 2x² − 7x + 3 = 0} or A = {x ∈ R : 2x² − 7x + 3 = 0}.
In the three examples here we can easily solve the conditions explicitly
and write our sets as A = {3, 1/2}, B = {x : x ≤ 3} and C = {2, 3, 5, 7}.
(Incidentally, you will soon be learning what a group is, and 696729600 =
2¹⁴ · 3⁵ · 5² · 7 is the number of elements of a particularly interesting group,
the Weyl group of the exceptional group E8.) Note, however, that A′ = {x ∈ Z :
2x² − 7x + 3 = 0} = {3} is a different set from A.
It should be stressed that the variable x used above is a dummy vari-
able. It can be changed consistently to any other letter. Thus {x :
P (x)} = {y : P (y)} = {t : P (t)} ; in each case we have the same
instruction to form a set. Of course, expressions such as {x : P (y)} are
meaningless: the instructions don’t make sense.
An extremely useful set which we allow ourselves to form is the empty
set. This is the set with nothing in it at all; think of it as like an empty
suitcase. It could be written { }, but the standard notation is ∅. A popular

belief among non-experts is that there are lots of empty sets: one that
is empty of apples, one that is empty of bananas, etc. This is not so.
Suppose ∅a and ∅b were two potentially different empty sets. We claim
that they are in fact the same. If this were not so, then by the criterion
above for two sets to be equal, one of them, say ∅a , would contain an
element x not belonging to the other. But this is impossible, since ∅a
hasn’t got any elements. That’s called logic!
The next important idea is that of a subset. Suppose we have two
sets A and B and that B contains all the elements of A and maybe some
more. Then we say that A is a subset of B and write A ⊂ B. This can
also be written B ⊃ A. Formally:

1.1.2. A ⊂ B means that, for all x, we have x ∈ A ⇒ x ∈ B .

We have used here another logical symbol ⇒. This is short for “im-
plies”. Thus if P and Q are two statements or propositions (which might
or might not be true), then P ⇒ Q is the proposition “P implies Q” or
“if P then Q”. This is not to say that either statement necessarily is true,
merely that if P holds, then Q automatically follows. We can also write
this as Q ⇐ P . Note also that P ⇔ Q amounts to saying that P ⇒ Q
and Q ⇒ P simultaneously.
So to prove that A ⊂ B we have to demonstrate that whenever an
element belongs to A then it also belongs to B. From 1.1.1 and 1.1.2 we
have at once:

1.1.3. A = B ⇔ A ⊂ B and B ⊂ A .

Furthermore, amongst all the subsets of a given set A we have A itself


and also ∅:

1.1.4. A ⊂ A and ∅ ⊂ A .

The first of these is obvious, and the second is similar to an earlier


point: if it were not so, then there would be an element x ∈ ∅ with
x ∉ A. But ∅ has no elements.
If A is a finite set of n elements, then, including the two extremes
of 1.1.4, it has precisely 2ⁿ subsets. This is easily seen by writing A =
{a1, a2, . . . , an} and noting that we form a typical subset by including
or excluding a1, then including or excluding a2, and so on. This gives
2 × 2 × . . . × 2 = 2ⁿ choices as to how we form the subset. Alternatively,
you can prove it by induction (see the Appendix). For the standard sets
mentioned earlier we have N ⊂ Z ⊂ Q ⊂ R.
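(An informal aside, not part of the notes themselves: the 2ⁿ count is easy to check mechanically. The short Python sketch below, with an illustrative helper named subsets, builds every subset of a small set by exactly the include-or-exclude choice described above.)

    def subsets(xs):
        """Return all subsets of the list xs, built by the
        include-or-exclude choice described in the text."""
        result = [[]]
        for x in xs:
            # each existing subset splits into two: one without x and one with x
            result = result + [s + [x] for s in result]
        return result

    A = [1, 2, 3, 4]
    S = subsets(A)
    print(len(S))    # 16, i.e. 2**4
    print(S[:5])     # [[], [1], [2], [1, 2], [3]]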

Warning: Some (mostly older) books use the symbol ⊆ in place of ⊂, and
reserve A ⊂ B to mean that A is a proper subset of B, ie. A ⊆ B and
A ≠ B. We shall adopt the more modern convention.
There are various ways of making new sets from old, the two most
fundamental being union and intersection. The union of A and B, written
A ∪ B, is the set formed by lumping together all the elements of A and
B (and ignoring any repetitions), and the intersection, written A ∩ B,
consists just of the elements common to the sets. Formally:

1.1.5. A ∪ B = {x : x ∈ A or x ∈ B} and A ∩ B = {x : x ∈ A and x ∈ B}.

We remind the reader that if P, Q are two propositions, then the com-
pound proposition “P or Q” is counted as true as soon as at least one of
the component parts is true, whereas “P and Q” is true only if both parts

are true simultaneously. If two sets are described by set-builder notation,


we have the following consequence:

1.1.6 Lemma. Suppose A = {x : P (x)} and B = {x : Q(x)}. Then


A ∪ B = {x : P (x) or Q(x)} and A ∩ B = {x : P (x) and Q(x)} .

The moral is that ∪ between sets corresponds to “or” between state-


ments, and ∩ between sets corresponds to “and” between statements. If
A ∩ B = ∅ we say that A and B are disjoint.
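(To see the “∪ is or, ∩ is and” correspondence concretely, here is a small Python sketch of my own, not from the notes; Python’s set comprehensions play the role of set-builder notation, restricted to a finite window of integers.)

    window = range(-10, 11)
    A = {x for x in window if 2*x*x - 7*x + 3 == 0}   # integer solutions: {3}
    B = {x for x in window if 4*x - 12 <= 0}          # integers x with x <= 3

    # union corresponds to "or", intersection to "and"
    union_by_hand = {x for x in window
                     if (2*x*x - 7*x + 3 == 0) or (4*x - 12 <= 0)}
    inter_by_hand = {x for x in window
                     if (2*x*x - 7*x + 3 == 0) and (4*x - 12 <= 0)}

    print(union_by_hand == A | B)   # True
    print(inter_by_hand == A & B)   # True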
In the same way, we can form the union of several sets A1, . . . , An. This
is written A1 ∪ . . . ∪ An or ⋃ᵢ₌₁ⁿ Ai. Similarly their intersection is written
A1 ∩ . . . ∩ An or ⋂ᵢ₌₁ⁿ Ai. For an infinite sequence of sets, we can write
⋃ᵢ₌₁∞ Ai, etc.
Let us now record a few simple properties of the union and intersection.
We will prove some, to give you a model of how to set things out formally,
and leave the rest as exercises:

1.1.7 Proposition. Let A, B, C be sets. Then the following are true:


(i) A ∪ B = B ∪ A                      (ii) A ∩ B = B ∩ A
(iii) A ∪ (B ∪ C) = (A ∪ B) ∪ C        (iv) A ∩ (B ∩ C) = (A ∩ B) ∩ C
(v) A ∪ ∅ = A                          (vi) A ∩ ∅ = ∅
(vii) A ⊂ A ∪ B                        (viii) A ∩ B ⊂ A
(ix) A ∪ B = A ⇔ B ⊂ A                 (x) A ∩ B = A ⇔ A ⊂ B.

Proof. (i) We have to show that x ∈ A ∪ B if and only if x ∈ B ∪ A. We


can do this as follows: x ∈ A ∪ B ⇔ x ∈ A or x ∈ B ⇔ x ∈ B or x ∈
A ⇔ x ∈ B ∪ A. Of course we are using the obvious symmetry of the
word “or” in the above.

(iv) x ∈ A ∩ (B ∩ C) ⇔ x ∈ A and x ∈ B ∩ C ⇔ x ∈ A and x ∈


B and x ∈ C ⇔ x ∈ A ∩ B and x ∈ C ⇔ x ∈ (A ∩ B) ∩ C. Hence the
sets are equal.
(vii) We must show that if x ∈ A then it follows that x ∈ A ∪ B. So
let x ∈ A. Then x ∈ A or x ∈ B. This is a logical point: we might as
well have put “x ∈ A or the moon is made of green cheese”. As soon
as the first part of the statement is true then so is the whole compound
statement. Thus x ∈ A ∪ B.
(v) Using (vii) we already have that A ⊂ A ∪ ∅. In the other direction,
suppose x ∈ A ∪ ∅. Then x ∈ A or x ∈ ∅. But the latter definitely cannot
happen (∅ hasn’t got any elements at all!) So in fact x ∈ A, proving that
A ∪ ∅ ⊂ A.
(ix) There is a bit more to do here. We must show that if the left-hand
statement holds, then so does the right-hand one, and vice-versa. So
suppose A ∪ B = A. By (i) and (vii) we already know that B ⊂ A ∪ B
and it follows that B ⊂ A. Now suppose instead that B ⊂ A. We have to
prove that A ∪ B = A. Since we already have A ⊂ A ∪ B, we just have to
prove the reverse inclusion. So let x ∈ A ∪ B. Then x ∈ A or x ∈ B. But
the latter case still leads to the conclusion x ∈ A, since B ⊂ A. Either
way we get x ∈ A, as required. So A ∪ B = A and we are done. The
remaining parts are left as exercises.

The above may seem like a lot of tedious nit-picking to establish results
which are in most cases intuitively obvious, especially if you draw Venn
diagrams, but we have included it to give you a chance to start learning
how to go about proving things formally. (i) and (ii) tell us that ∪ and ∩
are commutative processes, and (iii) and (iv) say that they are associative.

Here are two more facts:

1.1.8 Proposition. (Distributive Laws) Let A, B, C be sets. Then:


(i) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
(ii) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

Proof. We prove (i) and leave (ii) as an exercise. Let x ∈ A ∩ (B ∪ C).


Then x ∈ A and x ∈ B ∪ C. So x ∈ B or x ∈ C. If the former, then
x ∈ A∩B; if the latter, then x ∈ A∩C. Either way x ∈ (A∩B)∪(A∩C).
We have shown that the left-hand set is contained in the right-hand one.
In the other direction, start by assuming that x ∈ (A∩B)∪(A∩C). Then
x ∈ A ∩ B or x ∈ A ∩ C, say the former, the other case being similar.
Then x ∈ A and x ∈ B. Using Proposition 1.1.7(vii) we have x ∈ B ∪ C,
and so x ∈ A ∩ (B ∪ C), as required.

We round off this section by introducing one more idea, that of the
complement of a set.

1.1.9 Definition. If A and B are sets then the complement of B in A is


the set {x ∈ A : x ∉ B}. We denote this set by A − B or A \ B.

Note that we are not requiring that B should be a subset of A. Observe


in particular that A − A = ∅ and A − ∅ = A. I leave you to prove the
following as an exercise:

1.1.10 Proposition. (De Morgan’s Laws) Let A, B, C be sets. Then:


(i) A − (B ∪ C) = (A − B) ∩ (A − C)
(ii) A − (B ∩ C) = (A − B) ∪ (A − C).
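(Before attempting the proof, you may find it reassuring to test De Morgan’s Laws on random small sets. The Python sketch below is an illustration of mine, not a proof; Python’s set difference plays the role of A − B.)

    import random

    def random_subset(universe):
        # each element is included with probability 1/2
        return {x for x in universe if random.random() < 0.5}

    universe = range(10)
    for _ in range(1000):
        A, B, C = (random_subset(universe) for _ in range(3))
        assert A - (B | C) == (A - B) & (A - C)   # De Morgan (i)
        assert A - (B & C) == (A - B) | (A - C)   # De Morgan (ii)
    print("no counterexample found in 1000 random trials")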

Exercises:

1. Write down the number of elements in each of the following sets.


If there are infinitely many, say so:

(a) {1, 2, 3} (b) {1, 3, 3} (c) {{1, 3}} (d) {1, 3, {3}} (e) N (f ) {N}
(g) {1, N} (h) {N, Z} (i) ∅ (j) {∅} (k) {{∅}} (l) {∅, {∅}}.

2. Write down all the subsets of each of the following sets:

(a) {1, 2} (b) {a, b, c} (c) ∅ (d) {∅} (e) {∅, ∅}.

3. Prove the following facts about sets A, B, C. Follow the model of


1.1.7 and 1.1.8:
(a) A ∩ B ⊂ A
(b) A ∩ B = B ∩ A
(c) A ∩ ∅ = ∅
(d) A ∪ (B ∪ C) = (A ∪ B) ∪ C
(e) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
(f) A ∩ B = A ⇔ A ⊂ B
(g) A − (A − B) = A ∩ B
(h) A − (B ∪ C) = (A − B) ∩ (A − C).

1.2 Relations
Another way to make a new set out of two given sets A and B is to form
their Cartesian product A × B (pronounced “A cross B”). The elements

of this are not elements of A or B, but new objects (a, b) called ordered
pairs, where a ∈ A and b ∈ B. We will take (a, b) to be a primitive
concept, although it is possible to give a formal definition of it in terms of
sets. This is, however, artificial and unilluminating, so we won’t do it. In
practice, you will be quite familiar with the idea from ordinary coordinates
of points in the plane, although this is now much more general, since A
and B need not be subsets of R. The point to stress is that the order
matters. For example, if A = B = R, then (1, 2) and (2, 1) are different
ordered pairs. So, the formal definition is this:

1.2.1 Definition. The Cartesian product of two sets A and B is A × B =


{(a, b) : a ∈ A, b ∈ B}.

As an example, suppose A = {1, 2, 3, 4} and B = {x, y, z}. Then A×


B = {(1, x), (2, x), (3, x), (4, x), (1, y), (2, y), (3, y), (4, y), (1, z), (2, z), (3, z), (4, z)}.
We call a the first coordinate of (a, b) and b the second coordinate. Ob-
serve that R × R is just the familiar coordinate plane, also denoted R2 .
The adjective Cartesian comes from the name of René Descartes, the
seventeenth-century mathematician and philosopher who introduced the
idea of coordinates. If A is a finite set, we denote the number of elements
by |A|. The following is now clear:

1.2.2 Lemma. If |A| = m and |B| = n, then |A × B| = mn.
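(In Python the Cartesian product is available as itertools.product, which makes Lemma 1.2.2 easy to check on small examples; this is an illustrative sketch of mine, not part of the notes. Note that product lists the pairs with the second coordinate varying fastest, which differs from the order used in the example above, but the set of pairs is the same.)

    from itertools import product

    A = [1, 2, 3, 4]
    B = ['x', 'y', 'z']

    AxB = list(product(A, B))            # all ordered pairs (a, b)
    print(AxB[:4])                       # [(1, 'x'), (1, 'y'), (1, 'z'), (2, 'x')]
    print(len(AxB) == len(A) * len(B))   # True: |A x B| = mn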

One of the central themes in Mathematics is the study of relationships


or relations between variable quantities. Two numbers x, y may be related
by virtue of satisfying some equation such as y² = x³ + 17, or some
inequality such as 3x + 4y ≥ 0. Two integers m, n may be related by the

requirement that m be a multiple of n. Two lines in the plane may be


related by parallelism, or perpendicularity. We need to develop a language
which will incorporate all such examples. To be more precise, given two
sets A and B, whether of numbers, points, lines or whatever, we want
to define what we mean by a relationship between elements of A and
elements of B. Let us approach this by means of some examples. Take
A = {1, 2, 3, 4, 5, 6} and B = {−2, −1, 0, 1, 2}. Here are a few examples
of relationships we might consider between elements a ∈ A and b ∈ B:

(i) a = b ; (ii) a = 2b + 3 ; (iii) a² + b² ≤ 4 ; (iv) a is a multiple of b.

Observe that each of these relationships gives rise to a subset of A × B,


consisting of just those pairs (a, b) which satisfy the condition. The sub-
sets we get here are:

(i) R = {(1, 1), (2, 2)} ; (ii) S = {(1, −1), (3, 0), (5, 1)} ;
(iii) T = {(1, −1), (1, 0), (1, 1), (2, 0)} ;
(iv) U = {(2, −2), (4, −2), (6, −2), (1, −1), (2, −1), (3, −1), (4, −1), (5, −1),
(6, −1), (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (2, 2), (4, 2), (6, 2)}.
Certainly all the above examples are of a “mathematical” type, but we
don’t want to limit our options in advance by specifying that only certain
types of relationship are mathematically “significant”. So we go for broke
and make the following:

1.2.3 Definition. A relation R from a set A to a set B is a subset of


A × B. If (a, b) ∈ R we will also write aRb.

Note that any subset R ⊂ A × B at all is allowed. If aRb we will say



that a is related to b (via R), or R-related to b. Two extreme cases are the
total relation R = A×B in which everything in A is related to everything in
B, and the empty relation ∅ ⊂ A×B where nothing is related to anything.
If A = B, then a relation R ⊂ A × A is often just called a relation on
A. For example, the set {(x, y) ∈ Z × Z : y 2 = x3 + 17} is a relation
on Z. Actually, by very advanced means(theory of elliptic curves and
algebraic number fields) this last set can be shown to consist of the pairs
(−2, ±3), (−1, ±4), (2, ±5), (4, ±9), (8, ±23), (43, ±282), (52, ±375),
(5234, ±378661), but we won’t go into that here!
If we concentrate for a moment on the familiar set R, then a relation
on R is just a subset of the plane R2 = R × R, which we can visualise.
For instance, the relation given by the equation x² + y² = 1 is the unit circle
around the origin, ie. the graph of the equation. Note that according to
the definition the relation is the graph.
An important (but dull) relation on a set A is that of equality. This is
given by the so-called diagonal subset ∆ of A × A, namely ∆ = {(a, a) :
a ∈ A}. Here a∆b ⇔ a = b. In the case A = R the corresponding
picture (the graph) is the straight line y = x.
For the rest of this section we will restrict our attention to relations on
a single set A (ie. to the case A = B). Many naturally arising relations on
A have one or more especially nice properties, and we want to concentrate
on a few of these now:

1.2.4 Definition. Let R be a relation on a set A. We say that R is:


(i) Reflexive if (a, a) ∈ R for every a ∈ A;
(ii) Symmetric if (a, b) ∈ R ⇒ (b, a) ∈ R;
(iii) Transitive if (a, b) ∈ R and (b, c) ∈ R ⇒ (a, c) ∈ R.

We emphasise at once that these are very special properties and most
subsets of A×A will not satisfy any of them. Let us spell out the meaning
of each of these a bit more. To say that R is reflexive means that at the
very least it must contain the diagonal set ∆ above as a subset. Put
another way, we require that every element of A is related to itself. If R
is symmetric, then whenever a is related to b we must also have that b is
related to a. In the case A = R, the corresponding picture (the graph)
of a reflexive relation must contain the diagonal line y = x, and that of a
symmetric relation must have mirror symmetry about this line. Transitivity
is harder to visualise directly, but the requirement for this sort of relation
is that whenever a is related to b and b is related to c, then we must also
have that a is related to c.

1.2.5 Examples. To get used to the ideas here are a few examples of
relations on the set A = {1, 2, 3, 4}:

Relation                                                             Reflexive  Symmetric  Transitive

R : (1, 1), (2, 2), (3, 3), (4, 4), (1, 2)                               ✓          ×          ✓
S : (1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (2, 1), (2, 3), (3, 2)       ✓          ✓          ×
T : (1, 1), (2, 2), (3, 3), (2, 3), (3, 2)                               ×          ✓          ✓
U : (1, 1), (2, 2), (3, 4), (4, 3)                                       ×          ✓          ×

The first is not symmetric, since although it contains (1, 2) it does not
contain (2, 1). The other three don’t have this asymmetry. The relation
S is not transitive, since although 1S2 and 2S3, we don’t have 1S3. Like-
wise 3U 4 and 4U 3, yet we don’t have 3U 3, so U is not transitive. The

fact that the other two relations are transitive is a rather tedious check
through all the possibilities, and we leave you to do this. You have to check
that whenever the relation contains two pairs such that the second coor-
dinate of the first equals the first coordinate of the second, then the pair
consisting of the first coordinate of the first pair followed by the second
coordinate of the second is also present! For example, since T contains
(2, 3) and (3, 2) it had better contain (2, 2) as well, and of course it does.
Note that these two pairs also fit together the other way round, so T needs
to contain (3, 3) too in order to have a chance of being transitive. It is
important to carry out all of the required checks for transitivity, not just
some of them. However, you can save some time by noticing that you
need never check a couple of pairs of the form (a, a) and (a, b). For (a, b)
is present in the relation already. Likewise in the case (a, b) and (b, b). So
R is transitive for rather trivial reasons.
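(The “tedious check” is easy to hand to a computer. Below is a Python sketch of my own, with illustrative function names, that tests a finite relation, given as a set of ordered pairs, for the three properties, applied to R, S, T, U of Examples 1.2.5.)

    def is_reflexive(R, A):
        return all((a, a) in R for a in A)

    def is_symmetric(R):
        return all((b, a) in R for (a, b) in R)

    def is_transitive(R):
        # whenever (a, b) and (b, c) are present, (a, c) must be present too
        return all((a, c) in R for (a, b) in R for (b2, c) in R if b == b2)

    A = {1, 2, 3, 4}
    R = {(1, 1), (2, 2), (3, 3), (4, 4), (1, 2)}
    S = {(1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (2, 1), (2, 3), (3, 2)}
    T = {(1, 1), (2, 2), (3, 3), (2, 3), (3, 2)}
    U = {(1, 1), (2, 2), (3, 4), (4, 3)}

    for name, rel in [("R", R), ("S", S), ("T", T), ("U", U)]:
        print(name, is_reflexive(rel, A), is_symmetric(rel), is_transitive(rel))
    # R True False True / S True True False / T False True True / U False True False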
Note that on any set the total relation A × A and the equality (di-
agonal) relation ∆ have all three properties above. Here are a few more
examples:

1.2.6 Examples. (i) The relation ≤ on R is reflexive and transitive, but


not symmetric;
(ii) The relation < on R is transitive, but neither reflexive nor symmetric;
(iii) The relation “a is a multiple of b” on N is reflexive and transitive,
but not symmetric;
(iv) The relation “a − b is even” on Z has all three properties;
(v) On the set of all human beings, the relation “is the brother of” is
transitive, but neither reflexive nor symmetric (why?);
(vi) On the set of all human males, the relation “is the brother of” is

transitive and symmetric, but not reflexive.

We will now concentrate on the important and very nice class of rela-
tions having all three properties above:

1.2.7 Definition. A relation R on a set A is an equivalence relation if it


is reflexive, symmetric and transitive.

The total and equality relations on any set are equivalence relations,
as is Example 1.2.6(iv). So also is the relation “is parallel to” on the set
of all straight lines in the plane. We now investigate the behaviour of such
relations:

1.2.8 Definition. Let R be an equivalence relation on a set A and consider


an element x ∈ A. The set {y ∈ A : xRy} is called the equivalence class
of x, and we denote it by [x].

Thus [x] is a subset of A. We note at once that x ∈ [x], since xRx.


We next observe that two equivalence classes [x] and [y] cannot partially
overlap: they are either identical or disjoint:

1.2.9 Proposition. Let R be an equivalence relation on A and let x, y ∈


A. Then:
[x] = [y] if xRy, and [x] ∩ [y] = ∅ if not.

Proof. Suppose xRy. To prove that [x] = [y] we must show that whenever
an element z belongs to [x] then it also belongs to [y], and vice-versa. So
let z ∈ [x]. By definition we have xRz, and we also have yRx since R is
assumed symmetric. But now transitivity of R gives yRz, so that z ∈ [y].
So far we have shown that z ∈ [x] ⇒ z ∈ [y], and so [x] ⊂ [y]. For

the reverse inclusion we simply interchange the roles of x and y. Hence


[x] = [y], as required.
For the other part of the Proposition we have to show that if it is not
the case that xRy, then [x] ∩ [y] = ∅. Turning this around, this amounts
to showing that if [x] ∩ [y] ≠ ∅, then xRy. Assuming, then, that [x] and
[y] are not disjoint, choose an element z in their intersection. Then xRz
and yRz. Since R is symmetric we have zRy, and then by transitivity
xRy, which is what we wanted.

Note that in proving the above, we have made use of all three proper-
ties in the definition of an equivalence relation. The upshot is that if we
have an equivalence relation R on a set A, then the various distinct equiv-
alence classes partition up A into a number of disjoint subsets. All the
elements in a given subset are equivalent to each other (under R), and el-
ements in different subsets are inequivalent. Here are two stupid examples:

(i) If R is the relation “=”, then each equivalence class has just one
element: [x] = {x}.
(ii) If R is the total relation (in which everything is related to everything
else) there is just one equivalence class, namely A.

More interestingly, consider the equivalence relation 1.2.6(iv) on Z.


There are precisely two equivalence classes: E = {. . . , −2, 0, 2, 4, . . .} =
{n ∈ Z : n is even} and F = {. . . , −1, 1, 3, 5, . . .} = {n ∈ Z : n is odd}.
Note that this does indeed partition Z into two disjoint subsets. Here is a
further example of the same kind, which we leave you to check in detail:

1.2.10 Example. “a − b is a multiple of 3” is an equivalence relation on


Z. There are three equivalence classes:
A = {3n : n ∈ Z} = {. . . , −3, 0, 3, 6, . . .},
B = {3n + 1 : n ∈ Z} = {. . . , −2, 1, 4, 7, . . .},
C = {3n + 2 : n ∈ Z} = {. . . , −1, 2, 5, 8, . . .}.
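(Equivalence classes of this kind can be computed directly from the defining condition. The Python sketch below is purely illustrative: it partitions a finite window of Z by the relation “a − b is a multiple of 3” and recovers the three classes above, restricted to that window. The algorithm is only correct when the given relation really is an equivalence relation.)

    def equivalence_classes(A, related):
        """Partition the finite collection A into the classes of the
        equivalence relation given by the boolean function related(a, b)."""
        classes = []
        for a in A:
            for cls in classes:
                if related(a, next(iter(cls))):   # compare with one representative
                    cls.add(a)
                    break
            else:
                classes.append({a})
        return classes

    window = range(-3, 9)
    for cls in equivalence_classes(window, lambda a, b: (a - b) % 3 == 0):
        print(sorted(cls))
    # [-3, 0, 3, 6]   [-2, 1, 4, 7]   [-1, 2, 5, 8]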

For the equivalence relation of parallelism on the set L of all straight


lines in the plane, there would be one equivalence class corresponding to
each “direction”, or more precisely to each angle θ in the range 0 ≤ θ < π,
where θ is the angle made by a class of parallel lines with the positive x-
axis.
We close this section by looking at one more property which a relation
might or might not have:

1.2.11 Definition. A relation R on a set A is antisymmetric if aRb and


bRa ⇒ a = b.

In other words, if two elements are related to each other both ways
round, they are forced to be equal. Only very boring relations are sym-
metric and antisymmetric at the same time. As an exercise, show that if
this is the case, then in fact R ⊂ ∆.
Observe that in Examples 1.2.5 R is antisymmetric, but the other three
are not. Consider also Examples 1.2.6. Of these, (i) is clearly antisymmet-
ric: it is a basic property of the real numbers that if a ≤ b and b ≤ a, then
a = b. So also is (ii), though this might at first seem a little confusing.
We have to show that any time we simultaneously have a < b and b < a,
then in fact a = b. But the first pair of conditions can never occur, so

the check can never go wrong! Put another way, if this implication were
false, there would have to exist two numbers simultaneously satisfying the
conditions a < b, b < a, a ≠ b, which is impossible. 1.2.6(iii) is also
antisymmetric, since if a and b are two positive integers and each is a
multiple of the other, then a = b. Examples (iv), (v) and (vi) are not
antisymmetric. Motivated by Example 1.2.6(i) we make the following:
1.2.12 Definition. A relation R on a set A is a partial order relation or
partial ordering if it is reflexive, antisymmetric and transitive.
1.2.13 Examples. (i) The relation ≤ on R is a partial order; this is Ex-
ample 1.2.6(i) again;
(ii) The relation “a is a multiple of b” on N is a partial order;
(iii) If A = {1, 2, 3, 4, 5}, then R = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (1, 2),
(1, 3), (1, 4), (1, 5), (3, 4), (3, 5)} is a partial order on A, as a little tedious
checking shows.
Motivated by the first of these examples, we often use the symbol ≤,
rather than R, for some partial ordering on a general set A, and may write
a ≤ b rather than aRb. Of course, if we refer to such a partial order on a
set A, we are not implying that a and b are real numbers or that ≤ means
an ordinary inequality between numbers, but simply that it is a relation
satisfying the following properties:

a≤a; a ≤ b and b ≤ a ⇒ a = b ; a ≤ b and b ≤ c ⇒ a ≤ c .

Exercises:

1. If A = {a, b, c, d} and B = {0, 1, 2}, list all the elements of A × B.

2. Write down the number of elements in each of the following sets.


If there are infinitely many, say so:

(a) {1, 2} × {3, 4} (b) {0} × {a, b, c, d, e, f } (c) R × {1, 2} (d) ∅ × Q


(e) {N} × {Z}.

3. If C ⊂ A and D ⊂ B, show that C × D ⊂ A × B. Give an ex-


ample of two sets A and B and a subset of A × B which is not of the
form C × D for any choice of C and D.

4. If A, B, C, D are four sets, prove that (A × B) ∩ (C × D) = (A ∩ C) ×


(B ∩ D).

5. If A, B, C, D are four sets, is it true that (A × B) ∪ (C × D) =


(A∪C)×(B ∪D)? If it is, prove it; if not, give a concrete counterexample.
(Hint: draw a picture!)

6. If |A| = m and |B| = n, how many relations are there from A to


B (ie. subsets of A × B)?

7. If |A| = n, how many reflexive relations are there on A? How many


symmetric relations?

8. True or false: If |A| = m and |B| = n, then the number of sub-



sets of A × B having the form C × D equals 2ᵐ · 2ⁿ = 2ᵐ⁺ⁿ? (This is


not as obvious as you might think: consider the empty set.)

9. The following are some relations on the set A = {0, 1, 2, 3, 4}. Fill
out the table with ticks and crosses to indicate whether each relation is
reflexive, symmetric or transitive:

Relation                                                                             Refl  Symm  Trans

R : (0, 0), (1, 1), (4, 4), (0, 3), (3, 0)
S : (0, 0), (1, 1), (1, 2), (2, 3)
T : (1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (1, 3), (2, 3)
U : (0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (0, 1), (1, 0), (1, 2), (2, 1)
V : (0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (1, 3), (3, 1), (1, 4), (4, 1), (3, 4), (4, 3)

10. Which of the previous relations are equivalence relations? For any
that is, write down the equivalence classes.

11. Here are some more relations on various sets. Fill out the table
with ticks and crosses to indicate which of the properties applies to each:

Set Relation R Refl Symm Anti Trans


R xRy ⇔ xy > 0
R xRy ⇔ xy ≥ 0
R∗ xRy ⇔ xy > 0
N xRy ⇔ HCF (x, y) > 1
R xRy ⇔ x² = y²
C (x + iy)R(z + it) ⇔ x ≤ z

Here R∗ = {x ∈ R : x ≠ 0}, and HCF (x, y) means the highest common


factor of the natural numbers x, y (see the discussion following 2.3.24).

12. Are any of the relations of the previous question equivalence rela-
tions? If so, what are the equivalence classes?

13. Define a relation ∼ on the set A = N × N by (a, b) ∼ (c, d) if


a + d = b + c. Show that this is an equivalence relation.

14. Fix a positive integer n and, for any integers x, y, define x ≡ y


(mod n) to mean that x − y is a multiple of n. (We pronounce this:
x is congruent to y modulo n). Show that this is an equivalence rela-
tion on Z. Show that there are precisely n equivalence classes, namely
[0], [1], . . . , [n − 1]. (In other words, show that every integer falls into
exactly one of these sets.)

15. On the set C of complex numbers define a relation ∼ by setting



z ∼ w if |z| = |w|. Show that this is an equivalence relation. What do


the equivalence classes look like as subsets of the complex plane?

16. Let X be a set and let P = P(X) be the power set of X. By


definition, this means that P is the set whose elements are the various
subsets of X. Thus P = {S : S ⊂ X}. For S, T ∈ P define S ≤ T to
mean that S ⊂ T . Prove that ≤ is a partial ordering on P.

17. Show that if R is a relation on a set A which is both symmet-


ric and antisymmetric, then in fact R is a subset of the diagonal set
∆ = {(a, a) : a ∈ A}.

1.3 Functions
In this section we return to the set-up of a relation F ⊂ A × B between
any two sets A and B, and we suppose now that F has a very special
property. Namely, suppose that whatever element a ∈ A we start with,
there is exactly one pair (no more and no less) in F beginning with a, ie.
whose first coordinate is a. Then we call F a function from A to B. Let
us repeat this definition formally and extend it:

1.3.1 Definition. A function from A to B is a relation F ⊂ A × B with


the property that for each a ∈ A there exists exactly one b ∈ B such that
aF b, ie. (a, b) ∈ F . This unique b, which depends on a, is written F (a)
and is called the value of F at a (pronounced “F of a”).

Actually, although we can use any letters we like, it is most common to



use small letters such as f, g, h for functions. To get used to the idea, here
are a few examples of relations from A = {1, 2, 3, 4, 5} to B = {0, 2, 4, 6}.
Some are functions, some are not:

1.3.2 Examples.

Relation                                                 Function?
f : (1, 0), (2, 4), (3, 6), (4, 0), (5, 2)                   ✓
g : (1, 6), (2, 4), (3, 0), (4, 0), (5, 4)                   ✓
h : (1, 2), (2, 2), (3, 2), (4, 0), (5, 0)                   ✓
R : (1, 4), (2, 6), (4, 0), (5, 2)                           ×
S : (1, 0), (2, 0), (2, 4), (3, 4), (4, 6), (5, 2)           ×

R fails to be a function because if we start with the element 3 ∈ A,


there is no element b ∈ B such that 3Rb. Put another way, there is no
value associated with 3. Relation S is not a function since there is more
than one b ∈ B with the property that 2Sb. In other words, there are too
many values associated with 2. The first three above are functions, and
we have the following table of values:

f (1) = 0, f (2) = 4, f (3) = 6, f (4) = 0, f (5) = 2
g(1) = 6, g(2) = 4, g(3) = 0, g(4) = 0, g(5) = 4
h(1) = 2, h(2) = 2, h(3) = 2, h(4) = 0, h(5) = 0.
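(For finite sets a function is conveniently modelled in Python as a dictionary of input–output pairs, and the “exactly one value per input” requirement of Definition 1.3.1 can be checked directly. The sketch below is mine, with illustrative names, encoding f, g, h of Examples 1.3.2 and testing the relations R and S.)

    A = {1, 2, 3, 4, 5}
    B = {0, 2, 4, 6}

    # functions as dictionaries: each input appears exactly once as a key
    f = {1: 0, 2: 4, 3: 6, 4: 0, 5: 2}
    g = {1: 6, 2: 4, 3: 0, 4: 0, 5: 4}
    h = {1: 2, 2: 2, 3: 2, 4: 0, 5: 0}

    def is_function(rel, A, B):
        """rel is a set of ordered pairs; check Definition 1.3.1."""
        return (all(sum(1 for (a, b) in rel if a == x) == 1 for x in A)
                and all(b in B for (a, b) in rel))

    R = {(1, 4), (2, 6), (4, 0), (5, 2)}                  # no pair starting with 3
    S = {(1, 0), (2, 0), (2, 4), (3, 4), (4, 6), (5, 2)}  # two pairs starting with 2
    print(is_function(R, A, B), is_function(S, A, B))     # False False
    print(f[3])                                           # 6, the value f(3)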
In the case of functions, we use the notation f : A → B, rather than
f ⊂ A × B, to refer to a function f from A to B. Occasionally we write the
f over the arrow, instead of at the front, thus: A —f→ B. Although logically,
according to our definition, f is a subset of A × B, psychologically we don’t

think of it like this. Rather, we think of f as a “rule” or “transformation”


or “process” which converts each “input” a ∈ A into an “output” or
“value” f (a) ∈ B. The whole point of Definition 1.3.1is to ensure that
for each and every possible input a ∈ A there is exactly one output in
B. In some areas of Mathematics, notably geometry and topology, the
words “mapping” and “map” are used as alternatives to “function”. The
first set A is called the domain of the function, and the second set B its
codomain.

When the set A is small, as in the examples above, we can specify a


given function from A to B by simply listing all the values at the inputs.
But for large sets, and in particular infinite ones such as Z or R, we
cannot do this. In such cases a function is usually specified by giving some
rule which enables one to calculate the output corresponding to any given
input. For example, we may say:

Let f : R → R be the function given by f (x) = x².

This means that whatever input real number we start with, the corre-
sponding output is x². Thus f (2) = 4, f (0) = 0 and f (−3) = 9. The
corresponding subset of R × R, ie. the graph of f (which, logically, is just
f ) is the parabola with equation y = x². It is important to realise that
the symbol x used for the variable in defining the function is a dummy
variable, and can be replaced by any other letter. We could just as well
define the function by specifying that f (y) = y² or f (s) = s², but not
f (p) = q²!

An alternative notation to describe the function above is to write it as

    f : R → R ,  x ↦ x²

or, with the rule displayed on its own line,

    f : R → R
        x ↦ x² .

Here, the second arrow tells us what to send the general element x to.
Note the special form of this arrow; we use this type of arrow between
elements and an ordinary arrow between sets. We can pronounce it as
“x goes to x²”. Here are a few more examples of functions defined by
various rules:

1.3.3 Examples.

g : R → R , x ↦ 3x³ − 1

h : R → R , h(x) = 1 (if x ∈ Q),  0 (if x ∉ Q)

α : R → Z , α(y) = the greatest integer ≤ y

β : N → N , β(n) = n² (if n is a prime number),  1 (if not).

Observe that sometimes we have to use words as well as symbols in


describing a function, and also that functions may be defined by different
rules on different parts of their domain, as in h and β. For instance,
these rules give the following values: g(0) = −1, g(2) = 23, h(7) = 1,
h(√2) = 0, h(3/4) = 1, h(π) = 0, α(6) = 6, α(−2) = −2, α(π) = 3,
α(−π) = −4 (think!), β(6) = 1, β(7) = 49.

1.3.4. Warning: It is essential that in defining a function f : A → B we assign
an unambiguous unique value f (a) to each and every input a ∈ A, and
that this value should belong to B. None of the following are functions
(can you see why?):

k : R → R , x ↦ x⁻¹

l : R → R , x ↦ √x

γ : R+ → R , x ↦ ±√x , where R+ = {x ∈ R : x ≥ 0}

δ : Z → N , x ↦ |x| .

We recall in this last example that |x| is the absolute value or modulus of
a real number x, defined by:
|x| = x (if x ≥ 0),  −x (if x < 0).

Thus |3| = 3 , | − 7| = 7 .
It is also crucial to realise that specifying the sets A and B is part and
parcel of defining a function f : A → B. Let us define formally what we
mean by two functions being equal:

1.3.5 Definition. Two functions f : A → B and g : C → D are equal if


and only if A = C, B = D and for all x ∈ A, f (x) = g(x).

So f : R → R , x ↦ x² and g : R → R+ , x ↦ x² are two


different functions because they have different codomains, even though
they are given by the same rule. This is why we have given them different
names. In elementary books and in the sciences, people often just talk
about “the function f (x) = x²” or even “the function x²”, presumably
meaning f : R → R , x ↦ x², but in pure mathematics this will not do,
since we have not specified what we want the domain and codomain to
be. Moreover, the x² and f (x) here are numbers (elements of R), not
functions: f is one thing (a rule), f (x) another (an element). This may
seem like nit-picking at present, but the importance of all this will become
clear as we progress. Next let us make the following:

1.3.6 Definition. The image (or range) of a function f : A → B is the


set {f (a) : a ∈ A}. It is denoted by f (A) or imf .

Thus imf is the subset of B consisting of all the values taken on by


the function. For instance, the image of f : R → R , x ↦ x² is the set
R+ = {x ∈ R : x ≥ 0}. The function h of Examples 1.3.3 has image {0, 1},
and the function α there has image Z.
We come now to some fundamentally important properties which may
or may not be satisfied by a given function:

1.3.7 Definition. A function f : A → B is called:


(i) Injective if f (x) = f (y) ⇒ x = y, for all x, y ∈ A;
(ii) Surjective if imf = B;
(iii) Bijective if it is both injective and surjective.

Injective functions are also known as injections. Likewise for surjections


and bijections.

Let us take our time in getting used to these ideas. Firstly injective
functions. In the older literature these are also known as one-one functions.
The requirement is that the only way two elements x and y can have the
same image (ie. map to the same value) is when, in fact, they were the
same element to start with. Put another way, distinct elements map to
distinct images: x ≠ y ⇒ f (x) ≠ f (y). Thus the function f : R →
R , x ↦ x² is not injective, since, for example, f (1) = f (−1). If we
return to Examples 1.3.3 we see that none of h, α, β are injective, since we
have, for instance, h(1) = h(2), α(1) = α(1.5) and β(1) = β(4). However
g is injective, since 3x³ − 1 = 3y³ − 1 ⇒ 3x³ = 3y³ ⇒ x³ = y³ ⇒ x = y.
In the last implication we have used an elementary fact about cube-roots of
real numbers, which you may care to prove. Here are some more examples:
1.3.8 Examples. Let us take A = {1, 2, 3, 4, 5} and B = {0, 2, 4, 6, 8}:
(i) f : A → B given by 1 ↦ 0, 2 ↦ 6, 3 ↦ 4, 4 ↦ 2, 5 ↦ 8
(ii) g : A → B given by 1 ↦ 0, 2 ↦ 6, 3 ↦ 0, 4 ↦ 2, 5 ↦ 8
(iii) h : A → B given by 1 ↦ 2, 2 ↦ 2, 3 ↦ 2, 4 ↦ 2, 5 ↦ 2 .
Clearly f is injective, but the other two are not, since, for example,
g(1) = g(3) and h(1) = h(2).
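(With functions stored as dictionaries, injectivity and surjectivity reduce to comparing sets of values. Here is an illustrative Python sketch of mine, not from the notes, applied to the three functions of Examples 1.3.8; it anticipates the discussion of surjectivity below.)

    def is_injective(f):
        # distinct inputs must give distinct values
        return len(set(f.values())) == len(f)

    def is_surjective(f, codomain):
        # every element of the codomain must be hit
        return set(f.values()) == set(codomain)

    def is_bijective(f, codomain):
        return is_injective(f) and is_surjective(f, codomain)

    B = {0, 2, 4, 6, 8}
    f = {1: 0, 2: 6, 3: 4, 4: 2, 5: 8}
    g = {1: 0, 2: 6, 3: 0, 4: 2, 5: 8}
    h = {1: 2, 2: 2, 3: 2, 4: 2, 5: 2}

    for name, fn in [("f", f), ("g", g), ("h", h)]:
        print(name, is_injective(fn), is_surjective(fn, B), is_bijective(fn, B))
    # f True True True / g False False False / h False False False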
Next let us consider surjective functions (also known as onto func-
tions). The requirement here is that every element of the codomain B
should be hit, ie. should appear as a value of the function. Putting it
formally:

f is surjective if, for each b ∈ B, there exists some a ∈ A such that


f (a) = b.

This is a good moment to introduce you to some standard mathemat-


ical shorthand. Two phrases occur so frequently in logic and mathematics
that it’s convenient to have symbols for them:
1.3.9 Definition. The symbol ∀ means “for all” or “for every” or “for
each”.
The symbol ∃ means “there exists” or “there is”.
So the definition of surjective can be written: ∀ b ∈ B, ∃a ∈ A such
that f (a) = b. If we are in a particularly concise mood, we might even
abbreviate “such that” to a colon: ∀ b ∈ B, ∃a ∈ A : f (a) = b.
In Examples 1.3.8 f is surjective, but the other two are not, since
img = {0, 2, 6, 8} and imh = {2}, neither of which is the whole of B.
Let us also consider the functions in Examples 1.3.3. Firstly, I claim
that g is surjective. To prove this we have to show that whatever y ∈ R
we start with, there then exists some x ∈ R satisfying 3x³ − 1 = y. This
equation is equivalent to 3x³ = y + 1, hence to x³ = (1/3)(y + 1). Since
every real number has a (unique) real cube-root, we may form the number
x = ∛((1/3)(y + 1)), and this then has the required property that g(x) = y.
The point is that given the y, we then have to find the x such that this
holds. Next, h is not surjective, since its image is only {0, 1}. On the
other hand α is surjective, because given any n ∈ Z, there does exist
a real number mapping to n, namely n itself: α(n) = n. Finally, β is
not surjective, since its image consists only of numbers of the form p² (p
prime), together with 1. Thus, for instance, the image does not contain
2.
Lastly, recall that a function f : A → B which is both injective and
surjective is called bijective. (In the older literature such a function is

known as a one-one correspondence.) For such a function, each element


b ∈ B comes from exactly one element a ∈ A. We can think of f as
pairing off in a one-to-one fashion all of the elements of A with all of the
elements of B.
Returning once more to Examples 1.3.8, we see that f is bijective and
the other two are not. As for Examples 1.3.3, g is the only one of the four
which is bijective.
The properties of being injective and surjective are in general quite
independent: a function may have either one without the other. But there
is one simple situation where they are not. We state this as Proposition
1.3.12. First a lemma:

1.3.10 Lemma. Let A and B be finite sets.


(i) If there exists an injective function f : A → B, then |A| ≤ |B|;
(ii) If there exists a surjective function f : A → B, then |A| ≥ |B|;
(iii) If there exists a bijective function f : A → B, then |A| = |B|.

Proof. (i) Write A = {a1 , a2 , . . . , am } and B = {b1 , b2 , . . . , bn }, so that


|A| = m and |B| = n. Since f is injective, the elements f (a1 ), . . . , f (am )
are all distinct. So B contains at least this many elements, ie. m ≤ n.
(ii) We will prove the equivalent contrapositive implication, ie. |A| <
|B| ⇒ f not surjective. So suppose that m < n. Then the elements
f (a1 ), . . . , f (am ) (which may or may not involve repetitions) cannot ac-
count for all of B, whence f is not surjective.
(iii) follows immediately from (i) and (ii).

We may restate (i) above by writing down the contrapositive implica-


tion:

1.3.11 Corollary. (Pigeon-hole Principle) Let A and B be finite sets such


that |A| > |B|. Then there is no injective function f : A → B.

In concrete terms, we cannot put more than n letters into n pigeon-


holes without putting at least two into one of them! The next principle is
also extremely useful:

1.3.12 Proposition. Let f : A → B be a function between finite sets


such that |A| = |B|. Then: f is injective ⇔ f is surjective ⇔ f is bijective.

Proof. Write A = {a1 , a2 , . . . , an } and B = {b1 , b2 , . . . , bn }. We only


have to prove the first equivalence. If f is injective, then f (a1 ), . . . , f (an )
are all distinct, hence account for all the elements of B, so that f is
surjective. If, on the other hand, f is not injective, there are fewer than n
distinct elements in this list, and so f is not surjective.

Warning: It follows from 1.3.10(iii) that if A is a finite set and B is a proper
subset (meaning that B ⊂ A and B ≠ A), then there cannot exist a
bijective function f : A → B. But if A is infinite the situation is very
different. For example, the set of even integers E = {2x : x ∈ Z} is a
proper subset of Z, yet we have a bijective function f : Z → E, n ↦ 2n.
We come now to the most fundamental way of manufacturing new
functions from old:

1.3.13 Definition. Given functions f : A → B and g : B → C, their


composite (or composition) is the function g ◦ f : A → C given by
x ↦ g(f (x)).

Thus g ◦ f is a new function determined by the formula (g ◦ f )(x) =


g(f (x)). We must understand clearly what this means. For a start it is

crucial that the codomain of f is the same as the domain of g (namely


the set B). Forming the composite is a two-step process. We start with
an input from the set A and feed it into f to get the output f (x) in B.
We then feed this into the function g to get the final output g(f (x)) in
C. Thus g ◦ f means “do f first and g afterwards”.

1.3.14 Example. Let A = {1, 2, 3, 4, 5}, B = {0, 2, 4, 6}, C = {1, 3, 5, 7, 9}


and consider the functions f : A → B and g : B → C given by f (1) =
0, f (2) = 6, f (3) = 0, f (4) = 2, f (5) = 4 and g(0) = 5, g(2) = 1, g(4) =
1, g(6) = 7. Then g ◦ f is given by 1 ↦ 5, 2 ↦ 7, 3 ↦ 5, 4 ↦ 1, 5 ↦ 1.

Note that in the general set-up of 1.3.13 it makes no sense at all to


try to form a composite f ◦ g the other way round, unless the sets A and
C happen to be equal. Even if A = B = C it will hardly ever be the case
that f ◦ g = g ◦ f . The order matters:

1.3.15 Example. Consider f : R → R , x ↦ x + 1 and g : R →
R , x ↦ x². Then (f ◦ g)(x) = f (g(x)) = f (x²) = x² + 1. The beginner
often experiences great confusion at the final step here, thinking that the
answer should be (x + 1)². But we must remember that the function f
is a process, not a number. The rule for f is that whatever the input is,
the output is the input plus 1. So if the input is x², then the output is
x² + 1. It may help at the final stage above to introduce a new letter
by putting, say, t = x². Then f (x²) = f (t) = t + 1 = x² + 1. In the
same way we calculate that (g ◦ f )(x) = g(f (x)) = g(x + 1) = (x + 1)².
Hence we have the two composites: f ◦ g : R → R , x ↦ x² + 1 and
g ◦ f : R → R , x ↦ (x + 1)². Observe that these are different functions.
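(The same calculation can be phrased in Python, where composition is just nested application; a small sketch of mine, assuming nothing beyond the two rules above.)

    def compose(g, f):
        """Return the composite g ∘ f, i.e. 'do f first, then g'."""
        return lambda x: g(f(x))

    f = lambda x: x + 1        # f : R -> R, x |-> x + 1
    g = lambda x: x * x        # g : R -> R, x |-> x^2

    f_after_g = compose(f, g)  # x |-> x^2 + 1
    g_after_f = compose(g, f)  # x |-> (x + 1)^2

    print(f_after_g(3), g_after_f(3))   # 10 16 -- the order matters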

Extending these ideas, we often wish to compose several functions in


a row:
A —f→ B —g→ C —h→ D

to obtain a function h ◦ g ◦ f : A → D. But what is this supposed to


mean? Do we mean (h ◦ g) ◦ f (ie. compose together h and g first, and
then the result with f ) or h ◦ (g ◦ f ) (ie. compose together g and f first,
and then h with the result)? Luckily it doesn’t matter:

1.3.16 Proposition. Composition of functions is associative. That is:


Given functions f : A → B, g : B → C and h : C → D, we have
(h ◦ g) ◦ f = h ◦ (g ◦ f ).

Proof. For all x ∈ A, ((h ◦ g) ◦ f )(x) = (h ◦ g)(f (x)) = h(g(f (x))) =


h((g ◦ f )(x)) = (h ◦ (g ◦ f ))(x).

This result allows us simply to write h ◦ g ◦ f , and likewise for longer


multiple compositions.
There is always one very trivial, but useful, function from any set to
itself:

1.3.17 Definition. The identity function on A is the function i : A →


A, x ↦ x. This is also variously denoted 1, id or iA, 1A, idA if one wishes
to specify the set A.

For any function f : A → B we clearly have f ◦ iA = f = iB ◦ f . We


come next to a fact of fundamental importance:

1.3.18 Theorem. A function f : A → B is bijective if and only if there


exists a function g : B → A such that f ◦ g = iB and g ◦ f = iA . In this

case, the function g is uniquely determined by f. It is called the inverse of


f and written f −1 (pronounced “f inverse”).

Proof. Suppose f is bijective. Then given any b ∈ B, there exists precisely


one a ∈ A such that f (a) = b, and we now define g(b) = a. This gives
us a function g : B → A. By the very construction, for each b ∈ B we
have b = f (a) = f (g(b)), so that f ◦ g = iB . Equally, if we start with any
a ∈ A, we may then put b = f (a), and it follows from the definition that
g(b) = a. Hence a = g(b) = g(f (a)), showing that g ◦ f = iA.
For the converse, assume that a function g with the required properties
exists. Suppose f (x) = f (y). Then g(f (x)) = g(f (y)) and hence x = y,
since g ◦ f = iA . This shows that f is injective. Now let b ∈ B. Since
f ◦ g = iB , we have b = (f ◦ g)(b) = f (g(b)). Thus b comes from the
element g(b) in A, proving that f is surjective and hence bijective.
As to the final statement about uniqueness, suppose h is some other
function such that f ◦ h = iB and h ◦ f = iA . Using Proposition 1.3.16
gives h = h ◦ iB = h ◦ f ◦ g = iA ◦ g = g.

In summary, a function has an inverse function if and only if it is bijec-


tive. In terms of Venn diagrams, we can think of reversing the direction
of all the arrows.

1.3.19 Example. (i) We have seen that the function f of Examples 1.3.8
is bijective. Its inverse f⁻¹ : B → A is given by 0 ↦ 1, 6 ↦ 2, 4 ↦ 3,
2 ↦ 4, 8 ↦ 5.
(ii) We have also seen that the function g : R → R , x ↦ 3x³ − 1 of
Examples 1.3.3 is bijective, and we effectively computed its inverse earlier.
It is given by g⁻¹ : R → R , x ↦ ∛((1/3)(x + 1)).
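(For a finite bijection stored as a dictionary, the inverse is obtained by swapping every pair, exactly the “reversing the arrows” picture mentioned above. The sketch below is illustrative only; it inverts the function f of Examples 1.3.8 and checks that both composites give the identity.)

    f = {1: 0, 2: 6, 3: 4, 4: 2, 5: 8}   # the bijection f of Examples 1.3.8

    def inverse(f):
        """Swap each pair (a, b) into (b, a); only meaningful when f is bijective."""
        assert len(set(f.values())) == len(f), "f must be injective"
        return {b: a for a, b in f.items()}

    f_inv = inverse(f)
    print(f_inv)                                   # {0: 1, 6: 2, 4: 3, 2: 4, 8: 5}
    print(all(f_inv[f[a]] == a for a in f))        # True: f_inv after f is the identity on A
    print(all(f[f_inv[b]] == b for b in f_inv))    # True: f after f_inv is the identity on B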

The whole point of the inverse of a bijective function is that it reverses


or “undoes” whatever the function did. For instance, the function g above
is built up in several stages:

R —cube→ R —times 3→ R —minus 1→ R
x ↦ x³ ↦ 3x³ ↦ 3x³ − 1.

To reverse all this we must undo each of these processes in the opposite
order:

R —plus 1→ R —divide by 3→ R —cube root→ R
x ↦ x + 1 ↦ (1/3)(x + 1) ↦ ∛((1/3)(x + 1)).
We conclude this section by looking at two important constructions of
subsets associated with a function:

1.3.20 Definition. Let f : A → B be a function and let X ⊂ A, Y ⊂ B.


Then:
(i) The image of X is the set f (X) = {f (x) : x ∈ X} ;
(ii) The inverse image of Y is the set f −1 (Y ) = {x ∈ A : f (x) ∈ Y } .

Thus f (X) ⊂ B and f −1 (Y ) ⊂ A. Note that we recover our earlier


definition of the image f (A) of f as a particular case. The inverse image
is also known as the preimage or counterimage of Y . It consists of all the
elements of A which happen to be sent into Y by f .
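(Both constructions are one-line set comprehensions in Python. A sketch of mine, purely to illustrate the definitions, using the function h of Examples 1.3.2 stored as a dictionary.)

    def image(f, X):
        """f(X) = {f(x) : x in X} for a function given as a dict."""
        return {f[x] for x in X}

    def inverse_image(f, Y):
        """The inverse image of Y: all inputs whose value lies in Y."""
        return {x for x in f if f[x] in Y}

    # the function h of Examples 1.3.2
    h = {1: 2, 2: 2, 3: 2, 4: 0, 5: 0}

    print(image(h, {1, 4}))          # {0, 2}
    print(inverse_image(h, {2}))     # {1, 2, 3}
    print(inverse_image(h, {6}))     # set(): nothing is sent to 6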
Warning: This makes sense for any function, and one should not be fooled by
the notation into thinking that f⁻¹ has to exist, ie. that f has to be
bijective. Of course, if f is bijective then there is a potential ambiguity in
the notation f⁻¹(Y ), since this might mean either the inverse image of Y
under f , or the image of Y under f⁻¹. Luckily these are the same, as we

leave you to check.

Note also that f −1 (B) = A, as results at once from the definition. If


Y = {y} has just one element, then it is conventional shorthand to allow
ourselves to write simply f −1 (y) rather than f −1 ({y}) for the inverse image
of the set {y}. Thus f −1 (y) = {x ∈ A : f (x) = y}.

1.3.21 Examples. Reconsidering some of Examples 1.3.3 we see that:

g(R+) = {x ∈ R : x ≥ −1},   g⁻¹(R+) = {x ∈ R : x ≥ ∛(1/3)}

h(Q) = {1},   h({1, √2}) = {0, 1},   h⁻¹({1, 2, 3}) = Q

α({√2, −√2, π, −π}) = {1, −2, 3, −4},   α(Q) = Z,
α⁻¹({1, 2}) = {x ∈ R : 1 ≤ x < 3}.

Exercises:

1. Let A = {1, 2, 3, 4, 5, 6} and B = {−1, 0, 2, π, e}. Indicate by a tick
or cross whether each of the following is a function from A to B:

Function from A to B?
f : (1, 0), (2, −1), (3, −1), (4, e), (5, π), (6, e)

g : (1, π), (2, e), (3, −1), (2, 2), (5, 0), (4, e), (6, 0)
h : (1, π), (2, π), (3, −1), (6, 0), (5, e)

k : (1, e), (2, e), (4, 0), (5, −1), (3, π), (6, 2)

l : (1, 0), (2, 2), (4, e), (3, π), (6, −1), (5, 2)

2. Which of the following rules give well-defined functions?



Function?
f : Z → Z , n ↦ n² − 1
g : N → N , n ↦ n² − 1
h : R → R , x ↦ (x² + 1)^(−1/2)
α : R → R , x ↦ (x² − 1)^(−1/2)
β : N → N , n ↦ 1/n
γ : R → Q , γ(x) = the least rational number > x

(Think hard about the last one!)

3. Indicate which of the properties injective, surjective or bijective ap-


plies to each of the following functions:

Function Inj Surj Bij


f : {a, b, c, d} → {1, 2, 3} , a ↦ 2, b ↦ 1, c ↦ 1, d ↦ 3
g : {a, b, c, d} → {1, 2, 3} , a ↦ 1, b ↦ 2, c ↦ 1, d ↦ 2
h : {a, b, c, d} → {1, 2, 3, 4} , a ↦ 1, b ↦ 2, c ↦ 1, d ↦ 4
j : {a, b, c, d} → {1, 2, 3, 4} , a ↦ 4, b ↦ 2, c ↦ 3, d ↦ 1
k : {a, b, c, d} → {1, 2, 3, 4, 5} , a ↦ 3, b ↦ 2, c ↦ 1, d ↦ 5
l : {a, b, c, d} → {1, 2, 3, 4, 5} , a ↦ 3, b ↦ 5, c ↦ 2, d ↦ 5

4. Indicate which of the properties injective, surjective or bijective ap-


plies to each of the following functions:

Function Inj Surj Bij


f : Z → Z , x ↦ −x
g : Z → Z , x ↦ 2x
h : R → R , x ↦ 2x
θ : R → R+ , x ↦ x⁴
ϕ : Z → Z , ϕ(x) = the greatest integer ≤ x/2

5. Let A = {a, b, c, d, e}, B = {1, 2, 3, 4}, C = {α, β, γ, δ} and consider


the functions f : A → B given by a ↦ 3, b ↦ 2, c ↦ 4, d ↦ 2, e ↦ 3
and g : B → C given by 1 ↦ δ, 2 ↦ β, 3 ↦ α, 4 ↦ δ . Express the
function g ◦ f : A → C in a similar way.

6. For the functions α : Z → Z , n ↦ n² + 2n and β : Z → Z , n ↦
3n − 1 work out the formulas for both composites (α ◦ β)(n) and (β ◦ α)(n).
Express your answers in the simplest form.

7. For the functions f : R → R , x ↦ x³ and g : R → R , x ↦ sin x


work out the formulas for both composites (f ◦ g)(x) and (g ◦ f )(x).

8. Let f : X → Y and g : Y → Z be functions. Prove the follow-


ing facts:
(a) If f and g are both injective, then so is g ◦ f .
(b) If f and g are both surjective, then so is g ◦ f .
(c) If f and g are both bijective, then so is g ◦ f .
(d) If g ◦ f is injective, then so is f .
(e) If g ◦ f is surjective, then so is g.

9. In the situation of Ex.8(d), give a concrete example to show that g


need not be injective.

10. In the situation of Ex.8(e), give a concrete example to show that


f need not be surjective.

11. Consider the function f : R → R , x ↦ 7x² + 3. Find each of


the following sets:

(a) f (R) (b) f ({0, 1, −1}) (c) f −1 (3) (d) f −1 (10)


(e) f −1 (1) (f ) f −1 ({3, 10}) (g) f −1 (R).

(Example: x ∈ f⁻¹(31) ⇔ f (x) = 31 ⇔ 7x² + 3 = 31 ⇔ x² = 4 ⇔ x = ±2.
Hence f⁻¹(31) = {2, −2}.)

12. Which of the following functions are bijective? For each one that
is, write down the inverse function:

(a) f : Z → Z , x ↦ −x          (b) g : Z → Z , x ↦ 2x
(c) h : R → R , x ↦ 2x          (d) α : R → R , x ↦ x²
(e) β : R+ → R+ , x ↦ x²        (f) γ : R∗ → R∗ , x ↦ x⁻¹
(g) p : R → R , x ↦ 2x⁵ + 7.
13. Consider a function f : A → B and subsets X, Y ⊂ A. Prove:
(a) f (X ∪ Y ) = f (X) ∪ f (Y )
(b) f (X ∩ Y ) ⊂ f (X) ∩ f (Y ).
Give a concrete example to show that in (b) the two sides may not be

equal.
(c) Can you think of a property of the function f which would be enough
to guarantee that we do get equality in (b)?

14. Consider a function f : A → B and subsets S, T ⊂ B. Prove:


(a) f −1 (S ∪ T ) = f −1 (S) ∪ f −1 (T )
(b) f −1 (S ∩ T ) = f −1 (S) ∩ f −1 (T ).

15. Consider a function f : A → B and a subset X ⊂ A. Prove:


(a) X ⊂ f −1 (f (X)), and give an example where the two sides are differ-
ent.
(b) If f is injective, then X = f −1 (f (X)).

16. Consider a function f : A → B and a subset Y ⊂ B. Prove:


(a) Y ⊃ f (f −1 (Y )), and give an example where the two sides are differ-
ent.
(b) If f is surjective, then Y = f (f −1 (Y )).
Chapter 2

Groups

2.1 Operations and Monoids


In the Introduction we explained that one of the most fundamental ideas in
Mathematics is that of an operation, whereby two objects of a given type
are combined in some way to produce another one. We saw numerous
examples: adding two real numbers, multiplying two complex numbers,
multiplying two matrices, composing two rotations in three-dimensional
space, composing two shuffles of a pack of cards and so on. At that
stage our discussion was necessarily a bit vague. However, now that we
have developed the language of sets and functions we can make this idea
precise:

2.1.1 Definition. An operation on a set S is a function f : S × S → S.

Another name for an operation is a law of composition. Thus an op-


eration assigns to each possible ordered pair of elements (a, b) another


element f (a, b) in S. Incidentally, when we apply a function f to an ele-


ment x we write the result as f (x). So when x = (a, b) we ought really to
write f ((a, b)), but as a notational shortcut it’s usual to drop the second
set of brackets. As an example of an operation we have:

s : R × R → R, s(a, b) = a + b. This is the operation of ordinary


addition on the set of real numbers.
As a matter of fact, when we start to apply some operation several
times over it becomes unbearably cumbersome to use the f (a, b) notation,
and so we usually rewrite this by putting some symbol such as ∗ or • or †
between a and b to denote the output. Thus we might write f (a, b) = a∗b.
This is exactly what we have done in the familiar concrete example above,
where the output has been denoted a+b. Indeed, once we start developing
the general theory, we shall tend not to write anything between a and b,
and just write f (a, b) = ab and call this the product of a and b. Of course,
this does not in general mean ordinary multiplication, since a and b may
not even be numbers. Here are some examples of operations, including
the one above:

2.1.2 Examples.
(i) R × R → R, (a, b) 7→ a + b (addition)
(ii) R × R → R, (a, b) 7→ a − b (subtraction)
(iii) R × R → R, (a, b) 7→ ab (multiplication).
We can get more examples from these three by replacing R with Z, Q or
C.
(iv) N × N → N, (m, n) 7→ mⁿ (exponentiation)
(v) M × M → M, (A, B) 7→ AB (matrix multiplication), where here

M stands for the set of all (2×2) matrices with real number entries. More
precisely, the standard symbol for this set is M2 (R). (If you don’t know
about matrices yet, then ignore this example for the moment.)

An operation on a (small) finite set may conveniently be described by


a so-called multiplication table:

2.1.3 Example. Consider the operation ∗ on the set S = {a, b, c} given


by the following list of values: a ∗ a = a, a ∗ b = a, a ∗ c = b, b ∗ a =
a, b ∗ b = c, b ∗ c = a, c ∗ a = b, c ∗ b = c, c ∗ c = c. By this we mean, of
course, f : S × S → S given by f (a, a) = a, f (a, b) = a, etc. We can
summarise the information above in the form of a table:

∗ a b c
a a a b
b a c a
c b c c
For any elements x, y, to work out x ∗ y we read off the element where
the row labelled x meets the column labelled y.
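If you like to experiment, such a table is easily stored on a computer. The following Python sketch is purely an informal illustration (the names table and star are my own choices): it records the operation of Example 2.1.3 as a dictionary and looks up products exactly as described above.

    # The operation * of Example 2.1.3: the key (x, y) holds the table entry x * y.
    table = {
        ('a', 'a'): 'a', ('a', 'b'): 'a', ('a', 'c'): 'b',
        ('b', 'a'): 'a', ('b', 'b'): 'c', ('b', 'c'): 'a',
        ('c', 'a'): 'b', ('c', 'b'): 'c', ('c', 'c'): 'c',
    }

    def star(x, y):
        # read off the entry where row x meets column y
        return table[(x, y)]

    print(star('b', 'c'))   # prints a
    print(star('c', 'b'))   # prints c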

We now investigate certain properties which an operation might or


might not have, and make the following:

2.1.4 Definition. Let ∗ be an operation on a set S. We say:


(i) ∗ is commutative if x ∗ y = y ∗ x (∀x, y ∈ S);
(ii) ∗ is associative if (x ∗ y) ∗ z = x ∗ (y ∗ z) (∀x, y, z ∈ S).

Reverting to Examples 2.1.2 it is elementary that ordinary addition and



multiplication (examples (i) and (iii)) have both properties:

x+y = y+x, xy = yx, (x+y)+z = x+(y+z), (xy)z = x(yz) for all x, y, z ∈ R.

This is just as well: aspects of everyday life such as adding up shopping bills
would be pretty impossible otherwise! Subtraction, however, has neither.
For instance 3 − 5 ≠ 5 − 3 and (3 − 4) − 5 ≠ 3 − (4 − 5).
Example (iv) also has neither property (Exercise: give concrete coun-
terexamples). As for (v), it is shown in basic methods courses that matrix
multiplication is associative: (AB)C = A(BC), but not commutative: in
general AB ≠ BA.
Example 2.1.3 also has neither property. For example b∗c = a, whereas
c ∗ b = c, so that ∗ is not commutative. It is also not associative, since
(a ∗ b) ∗ b = a ∗ b = a, whereas a ∗ (b ∗ b) = a ∗ c = b.
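Checks like these are mechanical, and a computer will happily do all of them. Continuing the informal Python sketch above (the dictionary table again encodes Example 2.1.3, and the variable names are mine), commutativity and associativity can be tested by brute force over every pair and triple:

    # Brute-force test of commutativity and associativity for Example 2.1.3.
    S = ['a', 'b', 'c']
    table = {('a','a'):'a', ('a','b'):'a', ('a','c'):'b',
             ('b','a'):'a', ('b','b'):'c', ('b','c'):'a',
             ('c','a'):'b', ('c','b'):'c', ('c','c'):'c'}
    star = lambda x, y: table[(x, y)]

    commutative = all(star(x, y) == star(y, x) for x in S for y in S)
    associative = all(star(star(x, y), z) == star(x, star(y, z))
                      for x in S for y in S for z in S)
    print(commutative, associative)   # False False, agreeing with the counterexamples above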

2.1.5 Notation. If ∗ is an associative operation on S, we allow ourselves


to drop brackets entirely and simply write x ∗ y ∗ z for the common value
(x ∗ y) ∗ z = x ∗ (y ∗ z). It is only permissible to do this in the associative
case, for otherwise the order in which the two operations are performed is
crucial.

In the case of an expression of the form x ∗ y ∗ z ∗ t a little thought will


show that there are potentially five different meanings this might have,
according as how we choose the order in which to perform the operations.
We can clarify the meanings with brackets:

((x∗y)∗z)∗t, (x∗y)∗(z ∗t), (x∗(y ∗z))∗t, x∗((y ∗z)∗t), x∗(y ∗(z ∗t)).

In general these could all be different. But if ∗ is associative, they are all
the same. For instance, if we temporarily put a = x∗y, the first expression
equals (a ∗ z) ∗ t = a ∗ (z ∗ t) and this is the second expression. I leave
you to check that the other three are also equal to this. For this reason,
we can extend the convention above and simply write x ∗ y ∗ z ∗ t, so long
as ∗ is associative. Although we will not stop to do this now, one can, in
fact, formulate and prove a generalized associativity law which allows us,
for an associative operation, unambiguously to write arbitrarily long ex-
pressions x1 ∗x2 ∗. . .∗xn , and we shall do this without further comment.

Another central idea is that of an identity :

2.1.6 Definition. Let ∗ be an operation on a set S. We say that an


element e ∈ S is an identity element if e ∗ x = x = x ∗ e (∀x ∈ S).

Many authors use the symbol 1 instead of e. Of course, 1 here is just a


symbol for a certain special element of S, and does not mean the number
1. To avoid the risk of confusion, we shall not do so ourselves in this first
course.
If we return once more to Examples 2.1.2 we see that the real number
0 is an identity in (i), and the real number 1 is an identity in (iii):

0 + x = x = x + 0 and 1x = x = x1 (∀x ∈ R).

There is no identity element for subtraction. For suppose there were, say
e. Then the requirement x − e = x implies that e = 0. But then we would
also have x = e − x = 0 − x = −x, which is not true if x ≠ 0.
Matrix multiplication has the well-known identity

I = ( 1 0
      0 1 ) .
As for Example 2.1.3, we see at once that there is no identity. Indeed,
if there were an identity e, then the column labelled e in the table would
have to be the same as the column down the left-hand border, and this
does not occur.
We note next that an operation cannot have more than one identity
element:

2.1.7 Proposition. Let ∗ be an operation on a set S, and suppose e and


e′ are two identity elements. Then e = e′ .

Proof. Since e is an identity, we have e ∗ e′ = e′ . But since e′ is an identity,


we also have e ∗ e′ = e. Thus e = e′ .

In view of this result, we see that if an operation has an identity at all,


then it is unique. We therefore refer to it as the identity element. We are
almost ready to introduce the idea of a group. But first it is convenient
to define an intermediate concept:

2.1.8 Definition. A monoid is a set M equipped with an operation ∗


which is associative and has an identity.

Strictly speaking, the specification of the law ∗ is part of the data of the
monoid, which ought therefore to be denoted as the ordered pair (M, ∗)
consisting of both the set M and the operation ∗ on it. If we wished to
make explicit reference to the identity element, we might also denote the
monoid as (M, ∗, e). In practice we usually just denote the monoid by the
underlying set M , provided the operation has been specified. If we need,

on occasion, to consider more than one monoid structure on a set M , we


can use the more precise notation (M, ∗) and (M, ∗′ ) to distinguish the
structures. Or we could denote the second one by M ′ , say.
We now give some examples of monoids. Some we have seen before
and some are new:

2.1.9 Examples. (i) R equipped with ordinary addition is a monoid. The


identity is 0. The fancy notation would be (R, +, 0).
(ii) R equipped with ordinary multiplication is a monoid. The identity is
1. The fancy notation would be (R, ×, 1).
(iii) The set M = M2 (R) of (2×2) real matrices is a monoid under matrix
multiplication. The identity is the identity matrix I.
We obtain several more examples by replacing R with Z, Q or C in
(i),(ii),(iii).
(iv) The set M = {e, a, b} becomes a monoid under the operation ∗ given
by the table:

∗ e a b
e e a b
a a b a
b b a b
The first row and column show that e is the identity. This leaves us with
checking associativity. If we now simply write x ∗ y as xy, this amounts to
checking the equation (xy)z = x(yz) for all possible choices of x, y and
z. Since there are three choices for each, there are in principle 3³ = 27
checks to be done. However, if any one of x, y or z equals e the equation

is easily seen to hold:

(ey)z = yz = e(yz) and (xe)z = xz = x(ez) and (xy)e = xy = x(ye).

So this immediately cuts things down to 2³ = 8 remaining checks. We can


now trawl through these one by one. We have to verify that for each of
the strings aaa, aab, aba, abb, baa, bab, bba and bbb we get the same result
whether we place the brackets round the first two terms or the last two.
For instance:

(aa)a = ba = a and a(aa) = ab = a so that (aa)a = a(aa);


(aa)b = bb = b and a(ab) = aa = b so that (aa)b = a(ab).

I leave you to do the other six similar checks.

(v) The set M = {e, a, b} also becomes a monoid under the operation •
given by the table:

• e a b
e e a b
a a b e
b b e a
The checks are similar, and I leave them to you. Of course, all this checking
is somewhat tedious. We shall see in due course that for many naturally
arising monoids the associativity checks can be cut down to a minimum,
or even avoided altogether.

We conclude this section with one more fundamental concept:

2.1.10 Definition. Let M = (M, ∗, e) be a monoid and let x ∈ M . If


there exists an element y ∈ M such that x ∗ y = e = y ∗ x we call y an
inverse of x.

Note then that x is necessarily an inverse of y too. An element x may


very well not have an inverse. But if it does then there is only one:

2.1.11 Proposition. Let M = (M, ∗, e) be a monoid and let x ∈ M . If


y and z are inverses of x, then y = z.

Proof. y = y ∗ e = y ∗ (x ∗ z) = (y ∗ x) ∗ z = e ∗ z = z.

Observe that we made use of the associativity of M in the proof. The


upshot is that if x has an inverse at all, then it is unique. We therefore refer
to it as the inverse of x and denote it by x−1 (pronounced “x-inverse”).
Thus:

x∗x−1 = x−1 ∗x = e or, if we are missing out the ∗, xx−1 = x−1 x = e.

Note also that e ∗ e = e, so that e−1 = e, ie. e is its own inverse.

2.1.12 Examples. Consider again the monoids of Examples 2.1.9:


(i) (R, +, 0): Every element has an inverse, namely −x : x + (−x) =
(−x) + x = 0. Note that when the operation is written as “+” it is more
natural to write the inverse as −x, rather than x−1 , which would suggest
1/x.
(ii) (R, ×, 1): Here every element x apart from 0 has an inverse, namely
1/x. But 0 does not, for there is no real number y such that 0y = 1.
(iii) M2 (R): It is well known that some (2 × 2) matrices

A = ( a b
      c d )

have inverses and some do not. In fact A has an inverse such that AA−1 =
A−1 A = I if and only if its determinant det A = ad − bc is non-zero.
(iv) Since the second and third rows of the table do not contain e, neither
a nor b has an inverse in this example.
(v) Here every element has an inverse: a−1 = b and b−1 = a.

Exercises:

1. For each of the following operations, state whether it is commuta-


tive, associative or has an identity. If there is an identity, state what it is:

Set Operation Commutative Associative Identity


N m ∗ n = mⁿ
R∗ x ∗ y = x/y
R∗ x ∗ y = x/y + y/x
M2 (R) A ∗ B = A + B

2. Let S = {a, b, c, d} and define an operation on S by the table:

a b c d
a d c a b
b c b b a
c a b c d
d b a d a

Is this commutative or associative, and is there an identity element? If so,


what is it?

3. The following tables define operations on the set M = {e, a, b}:

(a)              (b)              (c)
  e a b            e a b            e a b
e e a b          e e a b          e e a b
a a a a          a a a a          a a a a
b b b b          b b b a          b b a b

(d)              (e)
  e a b            e a b
e e a b          e e a b
a a a b          a a e b
b b b a          b b b e
In each case it is clear that e is an identity element. Fill out the fol-
lowing table with ticks and crosses to indicate whether each of the eight
strings aaa, aab, . . . , bbb satisfies the associativity condition, and hence
determine which of the above are monoids:

M aaa aab aba abb baa bab bba bbb Monoid?


(a)
(b)
(c)
(d)
(e)

4. Let S be a set and define an operation on it by setting xy = y (∀x, y ∈


S). Show that this operation is associative, but that if |S| ≥ 2 there is no
identity element.

5. Let M be a monoid (with multiplication denoted simply ab) and fix an


element m ∈ M . Define a new operation ∗ on M by setting a ∗ b = amb.
Show that this operation is associative. Under what conditions on m is
(M, ∗) a monoid?

2.2 Groups
We are at last ready to define and begin our study of one of the most
fundamental structures in Algebra, and indeed Mathematics as a whole,
namely a group. Groups are ubiquitous throughout Mathematics and are
also crucial to our understanding of many other branches of science such as
quantum physics, relativity, crystallography, chemistry and particle physics.

2.2.1 Definition. A group is a monoid in which every element has an


inverse.

2.2.2 Examples. The following are groups:


(i) (R, +, 0)
(ii) (R∗ , ×, 1). Note, crucially, that the set is R∗ = {x ∈ R : x ≠ 0}. We
saw above that (R, ×, 1) is not a group.
We get more examples by replacing R with Q or C in the above.
(iii) The set {1, −1} (ordinary real numbers) becomes a group under or-

dinary multiplication. The multiplication table is:

1 −1
1 1 −1
−1 −1 1

(iv) The set M = {e, a, b} becomes a group under the operation • given
by the table:

• e a b
e e a b
a a b e
b b e a

This was Example 2.1.12(v).


(v) The set G = {e, a, b, c} becomes a group under the operation given
by the table:

e a b c
e e a b c
a a b c e
b b c e a
c c e a b

You may care to do all the necessary associativity checks, as in 2.1.9(iv).


Even taking into account the remarks there, plus any other shortcuts,
there are still quite a few to do. Luckily we shall soon be able to see that

this operation is associative in a much quicker way. Observe that e is the


identity element and that each element has an inverse: e−1 = e, a−1 =
c, b−1 = b, c−1 = a.
(vi) A set G consisting of just one element has exactly one operation on
it, ie. one function G × G → G, and this rather obviously turns G into a
group. We call it the trivial group. If we denote its unique element by e,
then G = {e}. Its multiplication table is rather simple:

e
e e

Here are two more basic definitions:

2.2.3 Definition. If a group G is finite, ie. has a finite number of elements,


we call this number the order of G and denote it |G|. If G is infinite, we
say that G has infinite order.

In Examples 2.2.2 above, groups (i) and (ii) are infinite, (iii) has order
2, (iv) has order 3, (v) has order 4 and the trivial group (vi) is of order
1. In fact we have already used the notation |S| more generally for the
number of elements in any finite set.

2.2.4 Definition. If the operation defining a group G is commutative, we


call the group commutative or abelian.

The word “abelian” comes from the name of Niels Henrik Abel (1802-
1829), the great Norwegian mathematician whose ideas, among others, led
to the modern concept of a group. It is a mark of real fame in Mathematics
when your name becomes an adjective in small letters! All of Examples

2.2.2 are abelian groups, but we shall soon meet many that are not. From
the point of view of the multiplication table (at least for a finite group),
the commutative condition amounts to saying that the table has mirror
symmetry in the main diagonal, eg:

  e a b c
e e a b c
a a b c e
b b c e a
c c e a b

Of course, we can talk more generally about commutative monoids.


We come now to the construction of an extremely important class of
groups. Fix some set X, and consider all the various bijections from X to
X. Denote the set of all these bijections by SX . Suppose α : X → X and
β : X → X are two such bijections. Then the composite α ◦ β : X → X
is also a bijection, and in this way we obtain an operation on the set SX :

SX × SX → SX , (α, β) 7→ α ◦ β .

Thus the operation here is just composition of functions. We have seen


earlier in 1.3.16 that composition is associative, and in 1.3.17 that there
is the identity bijection i : X → X such that i ◦ α = α ◦ i = α, for any α.
Moreover, Theorem 1.3.18 shows that each bijection α : X → X has an
inverse bijection α−1 : X → X such that α ◦ α−1 = α−1 ◦ α = i. Putting
all this together, we have therefore proved:

2.2.5 Theorem. Let X be a set. Then the set SX of all bijections from
X to X is a group under the operation of composition of functions:

SX × SX → SX , (α, β) 7→ α ◦ β .

2.2.6 Definition. An element α ∈ SX (ie. a bijection α : X → X) is


called a permutation of the set X. The group SX of all permutations
of X is called the symmetric group on X. In the particular case X =
{1, 2, . . . , n} we denote it simply as Sn .

As a simple exercise, prove:

2.2.7 Lemma. The symmetric group Sn has order n!.

Let us concentrate for a bit on the group Sn , ie. the case X =


{1, 2, . . . , n}. There is a convenient way to represent its elements. Such an
element is a bijection α : X → X, and as such is determined by specifying
all the function values α(1), α(2), . . . , α(n). We write the permutation in
the form:

α = ( 1      2      · · ·   n
      α(1)   α(2)   · · ·   α(n) ) .

In this two-line notation we place the value α(k) directly beneath the input
k. Since α is bijective, the second line is just some rearrangement of the
top line. For example, the identity element of Sn is:

i = ( 1   2   · · ·   n
      1   2   · · ·   n ) .

Recall that if β is another permutation, then the product αβ in Sn is just



composition: αβ = α ◦ β. We shall tend to drop the ◦ when multiplying


elements of Sn . So to work out αβ we simply start with 1, read off the
number β(1) underneath 1 in β, locate that number in the top line of α,
and finally read off the number beneath this. That is the value αβ(1),
and we place it underneath 1 in the two-line symbol for αβ. Now repeat
this procedure for 2, . . . , n. As an example with n = 5, let
α = ( 1 2 3 4 5            β = ( 1 2 3 4 5
      2 4 3 5 1 )    and         3 2 5 1 4 ) .

Then under αβ we have 1 7→ 3 7→ 3, 2 7→ 2 7→ 4, 3 7→ 5 7→ 1, 4 7→ 1 7→ 2


and 5 7→ 4 7→ 5. Hence:
αβ = ( 1 2 3 4 5
       3 4 1 2 5 ) .

⚠ When multiplying permutations, remember to work from right to left,


since multiplication means composition.
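If you want to check such products by machine, here is an informal Python sketch (not part of the formal development; the names alpha, beta and compose are mine). A permutation of {1, . . . , 5} is stored as a dictionary of its values, and the product αβ is the composite, applied from right to left as above.

    # The permutations of the worked example, as dictionaries k -> image of k.
    alpha = {1: 2, 2: 4, 3: 3, 4: 5, 5: 1}
    beta  = {1: 3, 2: 2, 3: 5, 4: 1, 5: 4}

    def compose(a, b):
        # (a b)(k) = a(b(k)): apply b first, then a
        return {k: a[b[k]] for k in b}

    print(compose(alpha, beta))
    # {1: 3, 2: 4, 3: 1, 4: 2, 5: 5}, agreeing with the product αβ computed above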
Of course S1 is just the trivial group of order 1. Let us now look in
detail at the cases n = 2 and n = 3:

2.2.8. The Symmetric Group S2 :


There are two elements:

i = ( 1 2            α = ( 1 2
      1 2 )    and         2 1 ) .

The multiplication table is:


i α
i i α
α α i

2.2.9. The Symmetric Group S3 :

This group has order 3! = 6. We will label the elements as follows:


i = ( 1 2 3         α = ( 1 2 3         β = ( 1 2 3
      1 2 3 ) ,           2 3 1 ) ,           3 1 2 ) ,

ρ = ( 1 2 3         σ = ( 1 2 3         τ = ( 1 2 3
      2 1 3 ) ,           3 2 1 )   and       1 3 2 ) .

The multiplication table is:

i α β ρ σ τ
i i α β ρ σ τ
α α β i σ τ ρ
β β i α τ ρ σ
ρ ρ τ σ i β α
σ σ ρ τ α i β
τ τ σ ρ β α i

Observe that this group is non-abelian. For example αρ = σ but ρα = τ .


In fact, for n ≥ 3 the symmetric group Sn is never abelian.
As remarked earlier, from now on in our investigation of groups we
shall tend to drop symbols such as ∗ or •, and denote the product of two
elements x, y in a group simply as xy. The next result is very useful:

2.2.10 Proposition. (Cancellation Laws) Let G be a group and let a, b, c
be elements of it. Then:
(i) ac = bc ⇒ a = b ;
(ii) ca = cb ⇒ a = b .

Proof. (i) Suppose ac = bc. Multiplying on the right by c−1 gives acc−1 =
bcc−1 . But acc−1 = ae = a, and likewise on the right. Hence a = b. The
proof of (ii) is similar.

2.2.11 Corollary. Let G be a group and fix an element c ∈ G. Then:


(i) The function r : G → G , a 7→ ac is bijective ;
(ii) The function l : G → G , a 7→ ca is bijective .

Proof. (i) The previous proposition shows that r is injective. Now take
any b ∈ G. Then r(bc−1 ) = bc−1 c = be = b, proving that r is also
surjective. Part (ii) is similar.

This leads to the following very useful observation:

2.2.12 Remark. Consider the multiplication table of a finite group G and


let c ∈ G. Cor.2.2.11(i) shows that as a runs once through the elements
of the group, so does ac. In other words, the column labelled by c just
consists of the elements of G in some rearranged order. In any given
column each group element appears once and once only. The same is true
of each row, using (ii). This gives us a very useful accuracy check when
computing the multiplication table of a finite group. Furthermore, if we
know in advance that we are dealing with a group and we wish to calculate
its table, we can use this observation to avoid having to explicitly calculate
all the individual products xy. Once we have computed a certain amount
of the table, we can start to fill in the rest for free.

As an example, consider again the symmetric group S3 and suppose


we have so far explicitly calculated this much of the table:

i α β ρ σ τ
i i α β ρ σ τ
α α β i σ y x
β β
ρ ρ
σ σ
τ τ

where x, y and all the rest of the table are as yet undetermined. Then
x must be different from all the other elements in the second row and
last column, and this forces that x = ρ. Now that we know this, we can
further deduce that y = τ .
Returning to the discussion of 2.1.5, we remarked there that in a
monoid M (or indeed any set with an associative multiplicative opera-
tion) we may unambiguously write arbitrarily long expressions x1 x2 . . . xn ,
since no matter how sets of brackets are inserted to clarify the order of
operations the result will always be the same. We will not interrupt the
flow of our account to prove this rather tricky technical result known as
generalized associativity. One needs to use the method of mathematical
induction to establish it, and we will consign it to the Exercises. Granted
that, if x ∈ M and n ≥ 1 it is meaningful to consider the element xx . . . x
(n times). Just as in elementary algebra, we denote this by xn . We also
make the convention that x0 = e (the identity of M ). In the case of a
group we may even talk about negative powers:

2.2.13 Definition. Let G be a group, let x ∈ G and let n ≥ 1. We
define:

xn = xx . . . x (n times)
x−n = (x−1 )n = x−1 x−1 . . . x−1 (n times)
x0 = e

Powers of an element satisfy the following two familiar laws:

2.2.14 Proposition. Let G be a group, let x ∈ G and let m, n ∈ Z.
Then:
(i) xm xn = xm+n
(ii) (xm )n = xmn .

Proof. (i) This is clear by counting when m, n ≥ 0. Now suppose m, n ≤


0 and put k = −m, l = −n. Then xm xn = x−k x−l = (x−1 )k (x−1 )l =
(x−1 )k+l = x−(k+l) = xm+n , where we have several times used the defini-
tion above of x−k . This leaves us with checking the cases where m, n
have opposite signs. Suppose m > 0, n < 0 and put n = −l. If
m ≥ l, then xm xn = xx . . . xx−1 x−1 . . . x−1 where there are m copies
of x and l of x−1 . This expression collapses down to leave us with
xm xn = xm−l = xm+n , as required. If, on the other hand, m < l we
obtain instead xm xn = (x−1 )l−m = xm−l = xm+n . The case m < 0, n > 0
is entirely similar, and I leave you to check the details.
(ii) Once again this is clear when m, n ≥ 0. Note next that if n ≥ 0 then
xn (x−1 )n = xx . . . xx−1 x−1 . . . x−1 = e, after collapsing the expression
down from the middle. Hence: (xn )−1 = (x−1 )n = x−n , which is a
special case of what we are trying to prove.
Next let us deal with the case m ≥ 0, n ≤ 0 and put n = −l. Then
(xm )n = (xm )−l = ((xm )−1 )l = ((x−1 )m )l = (x−1 )ml = x−ml = xmn ,

where we have several times used the special case above. The remaining
cases m ≤ 0, n ≥ 0 and m ≤ 0, n ≤ 0 are similar and left to you.
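Definition 2.2.13 and these power laws can be tried out numerically. The following informal Python sketch (the table is that of the group G = {e, a, b, c} of Examples 2.2.2(v); the names mul, inv and power are mine) computes xn for any integer n, using x−n = (x−1 )n :

    # The group G = {e,a,b,c} of Examples 2.2.2(v), given by its table and inverses.
    mul = {('e','e'):'e', ('e','a'):'a', ('e','b'):'b', ('e','c'):'c',
           ('a','e'):'a', ('a','a'):'b', ('a','b'):'c', ('a','c'):'e',
           ('b','e'):'b', ('b','a'):'c', ('b','b'):'e', ('b','c'):'a',
           ('c','e'):'c', ('c','a'):'e', ('c','b'):'a', ('c','c'):'b'}
    inv = {'e': 'e', 'a': 'c', 'b': 'b', 'c': 'a'}

    def power(x, n):
        # x^n for any integer n, with x^0 = e and x^(-n) = (x^(-1))^n
        if n < 0:
            x, n = inv[x], -n
        result = 'e'
        for _ in range(n):
            result = mul[(result, x)]
        return result

    print(power('a', 3), power('a', -1), power('a', 6))            # c c b
    print(mul[(power('a', 2), power('a', 4))] == power('a', 6))    # True: a2 a4 = a6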

Here is another basic property of powers:

2.2.15 Proposition. Let G be a group, let n ∈ Z and let x, y be two


elements of G which commute in the sense that xy = yx. Then (xy)n =
xn y n .

Proof. If n ≥ 1 then (xy)n = xyxy . . . xy, with a total of n copies each


of x and y. As x and y commute, we can shuffle all the x’s to the front
and write the expression as xx . . . xyy . . . y = xn y n . The case n = 0 is
trivial, since both sides then just reduce to e. Finally let n < 0. Then
−n ≥ 1 and by the case already dealt with (xy)−n = (yx)−n = y −n x−n .
Now take the inverse of each side, bearing in mind Prop.2.2.14(ii) and
the fact that (ab)−1 = b−1 a−1 (see Exercises). So (xy)n = ((xy)−n )−1 =
(y −n x−n )−1 = (x−n )−1 (y −n )−1 = xn y n once more.

Those who consider the shuffling argument of the first three lines above
to be a little loose may give a more formal proof using the method of in-
duction.

⚠ Note that the last Proposition applies in particular to all pairs of el-
ements in an abelian group. But if x and y do not commute, the rule
above is in general quite false. As an exercise, find a counterexample in
the symmetric group S3 .

2.2.16 Remark. In certain groups it is natural to denote the group law


additively and write x+y rather than xy. (We only ever do this for abelian

groups). This is so, for example, for the group (R, +, 0). In such a case we
write nx rather than xn . Thus when n ≥ 1 we have nx = x + x + . . . + x
(n times), and as we saw earlier the inverse of x is written −x.

Exercises:

1. Let G be a group and x, y ∈ G. Prove that (xy)−1 = y −1 x−1 .

2. Let G be a group with the property that x2 = e (∀x ∈ G). Prove that
G is abelian.
3. For the elements

α = ( 1 2 3 4 5            β = ( 1 2 3 4 5
      2 3 1 5 4 )    and         1 3 4 5 2 )

of the symmetric group S5 , calculate α2 , β 2 , αβ, βα, α−1 and β −1 .

4. Find two elements x, y ∈ S3 such that (xy)2 ≠ x2 y 2 .

5. Do any of the tables of Section 2.1, Exercise 3 define a group?

6. Consider the following two (partial) multiplication tables on the set


S = {p, q, r, s, t}:

p q r s t p q r s t
p p q r
q r q r
(a) (b)
r r
s s s
t t t

(a) Explain why the first table cannot be completed in any way so as
to define a group.
(b) You are given that the second table can be completed (in exactly one
way) to define a group. Fill out the rest of the entries to give the complete
group table. (Hint: Think what the identity must be first. This enables
you to complete some of the table straight away. Then consider the penul-
timate entry down column 1, bearing in mind Remark 2.2.12, and carry
on like that.)

2.3 Subgroups, Cyclic Groups and


Lagrange’s Theorem
We now investigate the way in which certain subsets of a group may
themselves be groups:

2.3.1 Definition. Let H be a subset of a group G satisfying the conditions:


(i) e ∈ H
(ii) x, y ∈ H ⇒ xy ∈ H
(iii) x ∈ H ⇒ x−1 ∈ H.

Then H is called a subgroup of G.

Thus H is certainly non-empty, as it is required to contain the identity


e. The second condition says that whenever we multiply two elements of
H the result is still in H: we say that H is closed under multiplication.
The final condition requires that for each element x of H, the inverse x−1
should also lie in H: we say that H is closed under taking inverses.
Let H be a subgroup of G, and consider the multiplication operation
G × G → G, (x, y) 7→ xy. Suppose that we now only allow x and y to
range over the subset H. In view of condition (ii) we obtain a restricted
map H × H → H, (x, y) 7→ xy, or in other words an operation on the
set H. Of course, it is the same multiplication process as before, but
restricted down to the subset H × H ⊂ G × G. We now have:

2.3.2 Proposition. Let H be a subgroup of a group G. Then H becomes


a group in its own right with respect to the restricted operation H × H →
H, (x, y) 7→ xy.

Proof. H contains e, and the equations xe = ex = x hold for all x ∈ G,


and in particular therefore for all x ∈ H. Similarly, since the associativity
law holds for all elements of G, it certainly continues to hold for those in
H. Finally, for any x ∈ H, the inverse element x−1 in G actually lies in H
by (iii), and of course continues to serve as an inverse in H.

We see now that conditions (i), (ii) and (iii) above are precisely what
are needed in order that the subset H should be a group in its own right
with respect to the same multiplication operation.

2.3.3 Examples. (i) Z and Q are subgroups of (R, +, 0). For 0 ∈ Z,


and if x, y ∈ Z, then x + y ∈ Z and −x ∈ Z. Likewise for Q.
(ii) Q∗ = {x ∈ Q : x ≠ 0} is a subgroup of (R∗ , ×, 1). For 1 ∈ Q∗ , and
if x, y ∈ Q∗ , then xy ∈ Q∗ and x−1 ∈ Q∗ . Clearly {1, −1} is another
subgroup.
(iii) Returning to Example 2.2.2(v), ie. the group G = {e, a, b, c} with
table
e a b c
e e a b c
a a b c e
b b c e a
c c e a b

we see that H = {e, b} is a subgroup since all possible products of elements


of H remain in H, and b−1 = b belongs to H. But K = {e, a} is not a
subgroup: it is not closed under multiplication, since aa = b ∉ K. For
that matter, it also fails on the third count: a−1 = c ∉ K. For similar
reasons {e, a, b} is not a subgroup. And of course {a, b, c} has no chance
of being a subgroup, since it doesn’t even contain e.
(iv) Observe that any group G always has at least two subgroups, namely
G itself and the trivial group {e}. Occasionally these may be the only
subgroups, as is the case, for example, for the group M = {e, a, b} of
Examples 2.2.2(iv).

There is one easy way to create examples of subgroups of a given group


G:

2.3.4 Proposition. Let G be a group and let x ∈ G. Then the set



⟨x⟩ = {xn : n ∈ Z} is a subgroup of G. It is called the cyclic subgroup


generated by x.

Proof. Certainly ⟨x⟩ contains e = x0 . Now let xm and xn be two general


elements of ⟨x⟩. Then ⟨x⟩ also contains the product xm xn = xm+n and
the inverse (xm )−1 = x−m .

It is important to realise that the elements . . . , x−3 , x−2 , x−1 , e, x, x2 , x3 , . . .


comprising ⟨x⟩ may all be distinct or may involve many repetitions. Thus
the set ⟨x⟩ may be finite or infinite.

2.3.5 Examples. (i) In the group (R, +, 0), the cyclic subgroup ⟨1⟩ equals
Z. For that matter, Z is also generated by −1: Z = ⟨1⟩ = ⟨−1⟩.
⟨2⟩ is the set of even integers, and ⟨1/2⟩ is the set of all integers and half-
integers.
(ii) In the group (R∗ , ×, 1) we have ⟨2⟩ = {. . . , 1/8, 1/4, 1/2, 1, 2, 4, 8, . . .}. This
subgroup can also be generated by 1/2.
(iii) Consider again the group G = {e, a, b, c} of Example 2.2.2(v). We
compute the powers of a:

. . . a−6 a−5 a−4 a−3 a−2 a−1 a0 a1 a2 a3 a4 a5 a6 . . .


... b c e a b c e a b c e a b ...

Thus ⟨a⟩ = {e, a, b, c} = G, and also ⟨c⟩ = G. However, we calculate in


the same way that ⟨b⟩ = {e, b}.
(iv) The trivial subgroup of any group G is cyclic, since clearly ⟨e⟩ = {e}.
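In a finite group the cyclic subgroup ⟨x⟩ can be produced mechanically by multiplying by x until the identity reappears. A minimal Python sketch of this (informal; the table is again that of Examples 2.2.2(v) and the function name is mine):

    # Generate <x> in the group G = {e,a,b,c} of Examples 2.2.2(v).
    mul = {('e','e'):'e', ('e','a'):'a', ('e','b'):'b', ('e','c'):'c',
           ('a','e'):'a', ('a','a'):'b', ('a','b'):'c', ('a','c'):'e',
           ('b','e'):'b', ('b','a'):'c', ('b','b'):'e', ('b','c'):'a',
           ('c','e'):'c', ('c','a'):'e', ('c','b'):'a', ('c','c'):'b'}

    def cyclic_subgroup(x):
        # list e, x, x^2, ... ; in a finite group these positive powers
        # already give all of <x>, since x^(-1) is itself a positive power
        elements, y = ['e'], 'e'
        while True:
            y = mul[(y, x)]
            if y == 'e':
                return elements
            elements.append(y)

    print(cyclic_subgroup('a'))   # ['e', 'a', 'b', 'c'], i.e. <a> = G
    print(cyclic_subgroup('b'))   # ['e', 'b']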

2.3.6 Definition. We say that a group G is cyclic if G = ⟨x⟩, for some


x ∈ G.

In other words, G is cyclic if it can be generated by the powers of a


particular element, or again if it equals one of its own cyclic subgroups.

2.3.7 Examples. (i) (Z, +, 0) is a cyclic group, since Z = ⟨1⟩.


(ii) The group G = {e, a, b, c} of Examples 2.2.2(v) is cyclic, since ⟨a⟩ =
G.
(iii) The group M = {e, a, b} of Examples 2.2.2(iv) is cyclic, since ⟨a⟩ =
M.
(iv) (R, +, 0) is not cyclic: it is intuitively clear that whatever x ∈ R we
start with, the set of multiples nx (n ∈ Z) will not account for all of R.
We leave a formal proof of this to the Exercises.
(v) The symmetric group S3 of 2.2.9 is not cyclic (exercise).

The following simple fact is left to you to prove:

2.3.8 Proposition. A cyclic group is abelian.

We have earlier defined the order of a group to be the number of


elements. We now define a related concept:

2.3.9 Definition. Let G be a group and x an element. The order of x is


the order of the cyclic subgroup ⟨x⟩ generated by x. If ⟨x⟩ is infinite, we
just say that x has infinite order.

Note that the identity element is the only element of order one.

2.3.10 Proposition. Let x ∈ G. Then x has finite order ⇔ xn = e for


some positive integer n. In that case, the order of x is the least such
integer.

Proof. Suppose ⟨x⟩ is finite. Then the list e, x, x2 , x3 , . . . must contain


repetitions, and so there are two equal powers xi = xi+n , where n ≥ 1.
Multiplying by x−i now gives xn = e. Conversely, suppose some positive
power of x equals e, and choose the least such power n. So xn = e, but
no lower positive power equals e. Then e, x, x2 , . . . , xn−1 are all distinct.
For otherwise we have xi = xj with 0 ≤ i < j ≤ n − 1. But then xj−i = e
and 1 ≤ j − i < n, contradicting the choice of n as the least positive
power of x giving e. Moreover, whatever integer k we start with, it is a
basic fact of arithmetic (see Theorem 2.3.22) that we can divide n into
k to obtain a quotient q, and a remainder r which can be fixed up to lie
between 0 and n − 1:

k = qn + r, 0 ≤ r ≤ n − 1.

But xqn = (xn )q = eq = e and hence xk = xqn xr = xr . The upshot is


that ⟨x⟩ = {e, x, x2 , . . . , xn−1 } is a group of order n.

2.3.11 Examples. (i) All elements of (R, +, 0) apart from 0 have infinite
order.
(ii) The element −1 in (R∗ , ×, 1) has order 2. All other elements apart
from 1 are of infinite order.
(iii) For the group G = {e, a, b, c} of Examples 2.2.2(v) we see that a and
c have order 4, but b has order 2.
(iv) In the group M = {e, a, b} of Examples 2.2.2(iv), the elements a and
b both have order 3.
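As a further informal illustration of 2.3.9 and 2.3.10, the following Python sketch computes the order of each element of the symmetric group S3 as the least positive power giving the identity (the representation of a permutation as a tuple of images, and the function names, are my own conventions):

    # Orders of the elements of S3 (cf. 2.2.9): a permutation is the tuple (p(1), p(2), p(3)).
    from itertools import permutations

    identity = (1, 2, 3)

    def mult(a, b):
        # composition: apply b first, then a
        return tuple(a[b[k - 1] - 1] for k in (1, 2, 3))

    def order(p):
        # the least n >= 1 with p^n = identity (Proposition 2.3.10)
        q, n = p, 1
        while q != identity:
            q, n = mult(p, q), n + 1
        return n

    for p in permutations((1, 2, 3)):
        print(p, order(p))
    # the six orders are 1, 2 and 3, each dividing |S3| = 6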

We can add a bit more to 2.3.10:



2.3.12 Proposition. Let x ∈ G have finite order n. Then xk = e ⇔ k is


a multiple of n.

Proof. Suppose k = mn. Then xk = (xn )m = em = e. In the converse


direction, suppose xk = e, and write k = qn + r, 0 ≤ r ≤ n − 1, as in the
proof of 2.3.10. Then e = xqn+r = (xn )q xr = eq xr = xr . Since n is the
least positive power of x giving e, we must have r = 0, and so k = qn, a
multiple of n.

We are nearly ready to state our first big theorem. First a definition:

2.3.13 Definition. Let H be a subgroup of a group G and let x ∈ G.


The left coset of H by x is the set xH = {xh : h ∈ H}. Likewise the
right coset is the set Hx = {hx : h ∈ H}.

In general the cosets are just subsets of G, not subgroups, because


they won’t even contain e. However, in the case x = e, we clearly have
eH = He = H. Observe that if G is abelian, then xH = Hx. But in
general the cosets xH and Hx are different. Note also that if the group
law is denoted “+”, as in some of the groups above, then it is natural
to write x + H rather than xH. Indeed in that case we would write
x + H = {x + h : h ∈ H}. Here are some examples:

2.3.14 Examples. (i) For the subgroup Z = {. . . , −2, −1, 0, 1, 2, . . .}


of (R, +, 0), we have 1/2 + Z = {. . . , −3/2, −1/2, 1/2, 3/2, 5/2, . . .} and 1 + Z =
{. . . , −1, 0, 1, 2, 3, . . .} = Z.
(ii) Consider the subgroup H = {1, −1} of (R∗ , ×, 1). Then 2H =
{2, −2}, πH = {π, −π} and in general xH = {x, −x}, for any x ∈ R∗ .
(iii) Let G = {e, a, b, c} be the group of Examples 2.2.2(v) and let

H = {e, b}. Then aH = {a, c}, bH = {e, b} and cH = {c, a}. Note that
aH = cH and eH = bH. Since G is abelian, the right cosets are just the
same.
(iv) Consider the subgroup H = {i, ρ} of S3 = {i, α, β, ρ, σ, τ } (see
2.2.9). Then iH = H, αH = {α, σ}, βH = {β, τ }, ρH = {ρ, i}, σH =
{σ, α} and τ H = {τ, β}. Thus there are precisely three distinct left
cosets: H = {i, ρ}, {α, σ} and {β, τ }. As for the right cosets, we calcu-
late in the same way that there are three of these also: H = {i, ρ}, {α, τ }
and {β, σ}. But they are not the same as the left cosets.
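The cosets in (iv) can be recomputed mechanically. Here is an informal Python sketch (permutations stored as tuples of images, names chosen by me) which lists the distinct left and right cosets of H = {i, ρ} in S3 :

    # Left and right cosets of H = {i, ρ} in S3 (Example 2.3.14(iv)).
    perms = {'i': (1, 2, 3), 'α': (2, 3, 1), 'β': (3, 1, 2),
             'ρ': (2, 1, 3), 'σ': (3, 2, 1), 'τ': (1, 3, 2)}
    names = {v: k for k, v in perms.items()}

    def mult(a, b):
        # composition: apply b first, then a
        return tuple(a[b[k - 1] - 1] for k in (1, 2, 3))

    H = [perms['i'], perms['ρ']]
    left  = {frozenset(names[mult(x, h)] for h in H) for x in perms.values()}
    right = {frozenset(names[mult(h, x)] for h in H) for x in perms.values()}
    print(left)    # three left cosets:  {i, ρ}, {α, σ}, {β, τ}
    print(right)   # three right cosets: {i, ρ}, {α, τ}, {β, σ}

Notice how the three left cosets split the six elements of S3 into disjoint sets of size |H| = 2; this is exactly the phenomenon exploited in Lagrange's Theorem below.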

Next we note that the left and right cosets of H all have the “same
size”:

2.3.15 Proposition. Let H be a subgroup of a group G and let x ∈ G.


Then the maps θ : H → xH , h 7→ xh and ϕ : H → Hx , h 7→ hx are
bijective. In particular if H is finite, then |xH| = |Hx| = |H|.

Proof. By definition, each element of xH is of the form xh for some


h ∈ H, so θ is surjective. If xh = xk then h = k, by 2.2.10, so that
θ is also injective. The case of ϕ is similar, and the final statement is
immediate.

If we look back at Examples 2.3.14. especially (iii) and (iv), we see


that in those cases two left cosets turn out to be either identical or disjoint.
This is always so, as we shall shortly prove. We could in fact give a direct
proof right now, but it is perhaps instructive to show how this links in with
what we did on equivalence relations in the previous chapter. So we begin
by reinterpreting cosets. We will deal with left cosets: the discussion for
right cosets is entirely similar.

2.3.16 Proposition. Let H be a subgroup of a group G. Define a relation


∼ on G by setting x ∼ y ⇔ y = xh for some h ∈ H, or in other words
x−1 y ∈ H. Then ∼ is an equivalence relation. The equivalence class [x]
equals xH.

Proof. x ∼ x since x = xe and e ∈ H. So ∼ is reflexive. Suppose


next that x ∼ y, so that y = xh for some h ∈ H. Multiplying on the
right by h−1 gives x = yh−1 , and we note that h−1 ∈ H since H is a
subgroup. Thus y ∼ x and ∼ is symmetric. Finally, suppose x ∼ y and
y ∼ z. Then ∃h, k ∈ H such that y = xh and z = yk. Hence z = xhk,
and hk ∈ H since H is a subgroup. This proves transitivity, and hence
that ∼ is an equivalence relation. Lastly y ∈ [x] ⇔ x ∼ y ⇔ y =
xh (for some h ∈ H) ⇔ y ∈ xH. So [x] = xH.

If we now apply Proposition 1.2.9 we obtain at once:

2.3.17 Corollary. Two left cosets of H are either identical or disjoint.


More precisely:

xH = yH (if y = xh for some h ∈ H)


xH ∩ yH = ∅ (if not).

2.3.18 Remark. All the above works just as well for right cosets. This
time we would define an equivalence relation by x ∼ y ⇔ y = hx (for
some h ∈ H), or equivalently yx−1 ∈ H.

We can now prove one of the most important results in group theory:

2.3.19 Lagrange’s Theorem. Let G be a finite group and H a subgroup.


Then the order of H divides the order of G.

Proof. Let |G| = n and |H| = m. Let H = x1 H, x2 H, . . . , xk H be the


various distinct left cosets of H, so that G is the union of these sets,
which are mutually disjoint. By 2.3.15 each coset has m elements. Hence
the total number of elements in G equals km, ie. km = n.

This theorem of Joseph Louis Lagrange (1736-1813, one of the founders


of modern group theory) tells us that it’s impossible to have a subgroup
of order 4 inside a group of order 6, or one of order 5 inside a group of
order 24.

⚠ Lagrange’s Theorem does not say that there necessarily exists a sub-
group of G of every possible order dividing |G|. It simply limits the possi-
bilities. For example, as we shall later see, there is a group of order 12 (the
alternating group A4 , a certain subgroup of S4 ) which has no subgroup of
order 6. Moreover, a group may have many different subgroups of a given
order. For instance, the symmetric group S3 has three subgroups of order
2.

2.3.20 Corollary. If G is a finite group of order n, then the order of each


element x ∈ G divides n, and hence xn = e.

Proof. The first statement follows upon applying Lagrange’s Theorem to


the subgroup ⟨x⟩, and the second results from 2.3.12.

The next result shows that groups of prime order are very simple:

2.3.21 Proposition. Every group of prime order is cyclic.

Proof. Let |G| = p, prime. Since p ≥ 2, ∃x ∈ G with x ≠ e. By 2.3.20


the order of x divides p; as it is greater than 1, x has order p, whence ⟨x⟩ = G and G is cyclic.

In order to find out more about cyclic groups, we pause to prove a


basic fact of arithmetic to which we have alluded before:

2.3.22 Theorem. (Euclidean Division Algorithm) Let m, d be integers


with d ≥ 1. Then there exist integers q, r satisfying m = qd + r and
0 ≤ r ≤ d − 1. Moreover q and r are unique.

Proof. Consider the set S = {m − qd : q ∈ Z and m − qd ≥ 0}. I claim


that S ≠ ∅. If not, then m < qd (∀q ∈ Z). If m = 0, this is clearly
impossible (take q = 0); if m > 0, take q = −m to obtain m < −md
and hence 1 < −d, contradicting d ≥ 1; and if m < 0, take q = m
to obtain m < md and hence 1 > d, again contradicting d ≥ 1. Thus
S is a non-empty set of non-negative integers, and as such contains a
least element, say r = m − qd (by the Well-Ordering Principle, a version
of induction). Then m = qd + r and r ≥ 0. Suppose r ≥ d. Then
m − (q + 1)d = r − d ≥ 0 and so r − d ∈ S, contradicting the choice of
r. Thus 0 ≤ r ≤ d − 1. Uniqueness is left as an exercise.
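The proof is an existence argument, but q and r are easy to compute in practice. Here is a short informal Python sketch (Python's built-in divmod would do the same job):

    def euclidean_division(m, d):
        # return (q, r) with m = q*d + r and 0 <= r <= d - 1, assuming d >= 1
        q, r = 0, m
        while r < 0:          # push the remainder up into range
            q, r = q - 1, r + d
        while r >= d:         # or bring it down into range
            q, r = q + 1, r - d
        return q, r

    print(euclidean_division(23, 7))    # (3, 2)   since 23 = 3*7 + 2
    print(euclidean_division(-23, 7))   # (-4, 5)  since -23 = (-4)*7 + 5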

We are now able to establish an important property of cyclic groups:

2.3.23 Proposition. Let G be a cyclic group. Then all its subgroups are
cyclic.

Proof. Let G = ⟨x⟩ and consider a subgroup H. Certainly {e} = ⟨e⟩ is


cyclic, so we may assume that H ≠ {e}. Then H contains some positive
power of x, and we take xd ∈ H with the least d ≥ 1. I claim that in fact
H = ⟨xd ⟩. Certainly ⟨xd ⟩ ⊂ H, so we now prove the reverse inclusion.
Let xn ∈ H and use 2.3.22 to write n = qd + r with 0 ≤ r ≤ d − 1. Then
xr = xn (xd )−q ∈ H. If r were positive this would contradict the choice

of d. Hence r = 0 and so xn = (xd )q ∈ ⟨xd ⟩. Thus H = ⟨xd ⟩ and is


therefore cyclic.

In particular, we may apply this to the cyclic group Z = (Z, +, 0). We


have remarked before that when the group law is written additively we
write nx rather than xn . By the above, the subgroups of Z are all cyclic
of the form ⟨m⟩, and since ⟨m⟩ = ⟨−m⟩ we may as well assume that
m ≥ 0. Thus ⟨m⟩ = {nm : n ∈ Z} and it is standard to write this set as
Zm or mZ. To recap:

2.3.24 Proposition. The subgroups of Z are the sets Zm = {nm : n ∈


Z}, for m ≥ 0. □

If m = 0 we get the trivial subgroup 0Z = {0}, if m = 1 we get the


whole group 1Z = Z, and for m ≥ 2 we obtain all the other subgroups:

2Z = {even numbers}, 3Z = {multiples of 3} and so on.

We can now apply our knowledge of the subgroups of Z to give a neat


treatment of the concept of the highest common factor (HCF) of two
integers m, n. Recall that an integer d is a common factor of m, n if it
divides both of them. As a matter of notation, if d divides m we write d|m.
Certainly 1 is a common factor of m, n and it is clear that there must, in
fact, be a greatest positive common factor, and it is this that we shall call
HCF (m, n). In Proposition 2.3.26 we shall establish the existence of the
HCF another way, and at the same time prove several more fundamental
facts about it. First we need a Lemma:

2.3.25 Lemma. Let m, n ∈ Z. Then the set Zm + Zn = {am + bn :


a, b ∈ Z} is a subgroup of Z. It is called the sum of Zm and Zn.

Proof. Call the set M . Then 0 = 0m + 0n ∈ M . Now let x, y ∈ M ,


so that x = am + bn, y = cm + dn, with a, b, c, d ∈ Z. Then x + y =
(a + c)m + (b + d)n and −x = (−a)m + (−b)n, both of which belong to
M.

2.3.26 Proposition. Let m, n ∈ Z. Then:


(i) Zm + Zn = Zd, for a unique integer d ≥ 0;
(ii) d|m and d|n;
(iii) If e|m and e|n, then e|d;
(iv) d = am + bn, for some integers a, b;
(v) d = HCF (m, n).

Proof. (i) follows at once from 2.3.24. Now Zm+Zn contains 1m+0n =
m, and so m ∈ Zd. But this just means d|m, and similarly d|n. Since
d ∈ Zd = Zm + Zn, we have d = am + bn, for some integers a, b, and
we have proved (ii) and (iv). Part (iii) follows from (iv), since e|am + bn.
Finally (ii) and (iii) say that d is the greatest common divisor of m and
n.

We have shown that Zm + Zn = Zd, where d is the highest common


factor of m, n. For example 4Z + 6Z = 2Z and 9Z + 15Z = 3Z.

2.3.27 Definition. Two integers m, n are coprime (or relatively prime) if


HCF (m, n) = 1.

Thus 4 and 15 are coprime, but 4 and 6 are not. The case d = 1 of
the above gives at once:

2.3.28 Proposition. Let m, n ∈ Z. Then:


m, n are coprime ⇔ Zm + Zn = Z ⇔ am + bn = 1, for some integers
a, b.
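Concrete integers a, b with am + bn = HCF (m, n) can be found by repeated division and back-substitution, the so-called extended Euclidean algorithm (a standard procedure, not developed formally in these notes). An informal Python sketch for non-negative m, n:

    def extended_hcf(m, n):
        # return (d, a, b) with d = HCF(m, n) and d = a*m + b*n, for m, n >= 0
        if n == 0:
            return (m, 1, 0)
        q, r = divmod(m, n)            # m = q*n + r as in Theorem 2.3.22
        d, a, b = extended_hcf(n, r)   # d = a*n + b*r = a*n + b*(m - q*n)
        return (d, b, a - q * b)

    print(extended_hcf(4, 6))     # (2, -1, 1):  2 = (-1)*4 + 1*6
    print(extended_hcf(9, 15))    # (3, 2, -1):  3 = 2*9 + (-1)*15
    print(extended_hcf(4, 15))    # (1, 4, -1):  1 = 4*4 + (-1)*15, so 4 and 15 are coprime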

With all of the above, which is of independent interest, we can now


complete our study of cyclic groups. If G = ⟨x⟩, then G is also generated
by x−1 , since xn = (x−1 )−n , and G may well be generated by other
elements too. More precisely:

2.3.29 Proposition. Let G = ⟨x⟩ be cyclic. Then:


(i) If G is infinite, the generators are precisely x and x−1 ;
(ii) If |G| = n, the generators are the xk with 1 ≤ k ≤ n and k, n coprime.

Proof : (i) As x has infinite order, the elements xn (n ∈ Z) are all


distinct, as we have seen before. The powers of xk account only for the
elements xn in which n is a multiple of k, and these do not exhaust G
unless k = ±1.
(ii) Suppose k is coprime to n. By 2.3.28 we have 1 = ak + bn (a, b ∈ Z).
Then x = (xk )a (xn )b = (xk )a , since xn = e. Thus x ∈ ⟨xk ⟩ and it follows
that G = ⟨xk ⟩. Conversely, let G = ⟨xk ⟩, for some integer k. Then
x ∈ ⟨xk ⟩, and so x = (xk )a = xak , for some integer a. Since x1−ak = e,
it follows from 2.3.12 that 1 − ak is a multiple of n, say 1 − ak = bn, and
k, n are coprime by 2.3.28. □

2.3.30 Examples. Let G = ⟨x⟩ be cyclic of order n. The generators of



G are set out below for low values of n:

Order n Generators
2 x
3 x, x2
4 x, x3
5 x, x2 , x3 , x4
6 x, x5
7 x, x2 , x3 , x4 , x5 , x6
8 x, x3 , x5 , x7

The number of these generators defines an interesting function:

2.3.31 Definition. The Euler phi-function ϕ : N → N is defined by


setting ϕ(n) equal to the number of integers k in the range 1 ≤ k ≤ n
which are coprime to n.

Thus ϕ(5) = 4 and ϕ(6) = 2. The Swiss-born Leonhard Euler (1707-


1783) was perhaps the preeminent mathematician of the eighteenth cen-
tury, making fundamental discoveries in many areas of Mathematics, de-
spite being totally blind in the latter part of his life. I leave you to inves-
tigate the properties of ϕ(n) in the Exercises.
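A direct, if naive, way to compute ϕ(n), and at the same time to list the exponents k for which xk generates a cyclic group of order n (Proposition 2.3.29(ii)), is simply to test each k for coprimality. An informal Python sketch:

    from math import gcd

    def phi(n):
        # Euler phi-function: count the k with 1 <= k <= n and HCF(k, n) = 1
        return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

    def generator_exponents(n):
        # the k for which x^k generates a cyclic group <x> of order n
        return [k for k in range(1, n + 1) if gcd(k, n) == 1]

    print(phi(5), phi(6), phi(8))     # 4 2 4
    print(generator_exponents(8))     # [1, 3, 5, 7], matching the table above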
We have seen that all subgroups of a cyclic group are cyclic. We can
now be more precise:

2.3.32 Theorem. Let G = ⟨x⟩ be cyclic. Then:


(i) If G is infinite, the distinct subgroups are precisely the ⟨xm ⟩, m ≥ 0;
(ii) If |G| = n, then for each positive divisor d of n, there is a unique
subgroup of order d, namely ⟨xn/d ⟩.

Proof :(i) We already know that the subgroups all have the stated form.
Now suppose ⟨xm ⟩ = ⟨xn ⟩, for some m, n ≥ 0. Then xm = xan and
xn = xbm , with a, b ∈ Z. As we are in the infinite case, m = an and
n = bm. Either m = n = 0 or both are positive, and then again m = n,
each being a divisor of the other.
(ii) Let d be a positive divisor of n and consider the element y = xn/d . Cer-
tainly y d = xn = e. On the other hand, suppose y c = e. Then xnc/d = e,
so by 2.3.12 nc/d is a multiple of n, which in turn means that c/d ∈ Z,
or again that d|c. This establishes that y has order d. Conversely, let H
be any subgroup of order d. As H is cyclic, we may write H = ⟨xm ⟩.
Then xmd = e, and 2.3.12 again tells us that md = an, for some integer
a. Thus m = a(n/d) and so xm ∈ ⟨xn/d ⟩. As xm generates H, it follows
that H ⊂ ⟨xn/d ⟩ and finally that the two sides are equal, as they have the
same size. □

Exercises:

1. This question is all about the symmetric group S3 = {i, α, β, ρ, σ, τ }


of 2.2.9.
(a) Which of the following subsets are subgroups?

(i) {i, α, ρ} (ii) {i, α, β} (iii) {α, β, ρ} (iv) {i, α}


(v) {i, σ} (vi) {i, α, β, τ } (vii) {i, ρ, σ, τ }.

(b) Give a complete list of all of the subgroups of S3 (Hint: there are
six).

(c) For each x ∈ S3 work out the cyclic subgroup ⟨x⟩ and state the order
of x. Hence show that S3 is not cyclic.

2. Prove that a cyclic group is abelian. Is the converse true?

3. Prove that (R, +, 0) is not cyclic. (Hint: If it were, we would have


R = ⟨x⟩ for some suitable x. Now think about (1/2)x.)

4. Consider the following elements of the symmetric group S4 :


α = ( 1 2 3 4         β = ( 1 2 3 4             γ = ( 1 2 3 4
      2 1 4 3 ) ,           3 4 1 2 )   and           4 3 2 1 ) .

Show that V = {i, α, β, γ} is a subgroup of S4 .

5. Check that the set G = {e, a, b, c} becomes a group under the op-
eration given by the table

e a b c
e e a b c
a a e c b
b b c e a
c c b a e

Find all the subgroups. Is G abelian? Is G cyclic?

6. Prove that a finite group of order n is cyclic ⇔ it contains an ele-



ment of order n.

7. Show that an element x in a group G has order 2 ⇔ x ≠ e and


x is self-inverse (x = x−1 ).

8. If G is a group of even order 2n, show that the number of elements


of order two is odd. In particular, G must contain at least one element of
order two. (Hint: Pair each element with its inverse.)

9. If H and K are two subgroups of a group G, show that H ∩ K is


a subgroup.

10. Let H and K be subgroups of a group G. Prove that H ∪ K is


a subgroup ⇔ H ⊂ K or K ⊂ H. (Hint for ⇒: Suppose H ∪ K
is a subgroup, yet neither of H, K is a subset of the other. Choose
x ∈ H, x ∉ K, y ∈ K, y ∉ H and consider xy).

11. Let H and K be finite subgroups of a group G, and suppose their


orders m = |H| and n = |K| are coprime. (This means that the highest
common factor of m and n equals 1.) Use Lagrange’s Theorem to prove
that H ∩ K = {e}.

12. Let H be a subgroup of a group G, and define a relation ∼ on


G by setting x ∼ y ⇔ y = hx (for some h ∈ H). Prove that ∼ is an
equivalence relation and that the equivalence class [x] = Hx, the right
coset of H by x.

13. Let G be a group and let C = {x ∈ G : xy = yx (∀y ∈ G)}.


This is called the centre of G and consists of those elements which com-
mute with everything.
(a) Prove that C is a subgroup of G and is abelian.
(b) Show that C = G ⇔ G is abelian.
(c) Find the centre of S3 .

14. You are given that there is a group Q = {1, −1, i, −i, j, −j, k, −k} of
order 8 (the quaternion group) in which 1 is the identity, −1 behaves ac-
cording to the ordinary laws of algebra, and in addition i2 = j 2 = −1, ij =
k and ji = −k. In other words, you are given this much of the group
table:
1 −1 i −i j −j k −k
1 1 −1 i −i j −j k −k
−1 −1 1 −i i −j j −k k
i i −i −1 k
−i −i i
j j −j −k −1
−j −j j
k k −k
−k −k k
(a) Work your way round systematically, using associativity and so on,
to complete the table. (This can be done in many ways. For example:
i(−i) = i(−1)i = iiii = (−1)(−1) = 1, and ik = iij = (−1)j = −j.
Now do the rest.)

(b) Write down the order of each element.


(c) Find the centre of Q (see Ex.13).

2.4 Homomorphisms and Isomorphisms


We turn our attention now to functions f : G → H between groups.
We could, of course, consider any function, but since G and H are much
more than just sets, it seems natural to look at those functions f which
interact in some sensible way with the group structures present. A couple
of examples will help to make this clear:

2.4.1 Examples. (i) Consider the group R = (R, +, 0) and the doubling
function f : R → R , x 7→ 2x. Given two real numbers x and y, we can
combine them together first within the group (add) and then apply f to
the result, getting 2(x + y). Or we can apply f to each one at the start,
and afterwards combine the results to obtain 2x + 2y. We end up with
the same thing.
(ii) For the same group R consider instead the squaring function g : R →
R , x 7→ x2 . If we first add and then apply g, we obtain (x + y)2 . But if
we apply these processes the other way round, this time we get x2 + y 2 , a
different result.

We see that the function f here behaves in a nice way with respect
to the group structure (addition), whereas g does not. The difference is
expressed by saying that f is a homomorphism, whilst g is not. The formal
definition follows:

2.4.2 Definition. A function f : G → H between two groups is a homo-


morphism if it satisfies the condition f (xy) = f (x)f (y) for all x, y ∈ G.

For a homomorphism we get the same result by multiplying x and


y first (in G) and then applying f , as by first of all forming the image
elements f (x) and f (y) and afterwards multiplying them (in H). We say
that f respects the group structures. The doubling function of 2.4.1 is
one example. Before giving further examples let us make the following:

2.4.3 Convention. Henceforth the group (R, +, 0) will simply be written


as R, so that the group law on this set will be understood to be addition.
Likewise for Z, Q and C. If we want to consider some other group law
on any of these sets, we will explicitly say so. Similarly, R∗ will always
mean (R∗ , ×, 1), so that the group law is multiplication unless otherwise
specified. Likewise for Q∗ and C∗ .

2.4.4 Examples. (i) f : R → R , x 7→ −x is a homomorphism, since


−(x + y) = (−x) + (−y);
(ii) g : R∗ → R∗ , x 7→ x2 is a homomorphism, since (xy)2 = x2 y 2 ;
(iii) Let G = {e, a, b, c} and H = {1, −1} be the groups of Examples
2.2.2(v) and (iii) respectively. Then the function θ : G → H given by
e 7→ 1, a 7→ −1, b 7→ 1, c 7→ −1 is a homomorphism. This requires some
checking. We have to verify that the equation θ(xy) = θ(x)θ(y) holds for
all possible choices of x, y ∈ G, in principle a total of 4² = 16 separate
checks. But note at once that if either of x, y equals e, the equation
certainly holds. For example θ(ey) = θ(y) = 1θ(y) = θ(e)θ(y). So we are
now down to 9 checks. Moreover, both groups are abelian, so we don’t
need to check the pair (b, a) as well as (a, b). It remains to deal with

the pairs (a, a),(a, b),(a, c),(b, b),(b, c) and (c, c). For the first of these we
have θ(aa) = θ(b) = 1 and θ(a)θ(a) = (−1)(−1) = 1, and these are
equal. For the second θ(ab) = θ(c) = −1 and θ(a)θ(b) = (−1)1 = −1,
once again equal. I leave you to check the other cases.

All this checking may seem rather painful. Once we have developed
some machinery, we shall be able to avoid a great deal of it, and construct
homomorphisms in a more streamlined manner. Now for a few more:

2.4.5 Non-Examples. (i) h : R → R , x 7→ x + 1. This is because


(x + y) + 1 ≠ (x + 1) + (y + 1);
(ii) k : R∗ → R∗ , x 7→ (1/2)(x2 + 1). For k(xy) = (1/2)(x2 y 2 + 1), whereas
k(x)k(y) = (1/4)(x2 + 1)(y 2 + 1), a different result in general;
(iii) Consider once more the groups G and H of 2.4.4(iii). The function ϕ :
G → H given by e 7→ 1, a 7→ −1, b 7→ 1, c 7→ 1 is not a homomorphism.
For ϕ(ab) = ϕ(c) = 1, whereas ϕ(a)ϕ(b) = (−1)1 = −1.
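Checks of the homomorphism condition over a finite group are easily delegated to a machine. The following informal Python sketch (the table is that of G = {e, a, b, c}; the function name is mine) tests θ(xy) = θ(x)θ(y) for every pair, confirming that θ of 2.4.4(iii) passes and ϕ of 2.4.5(iii) fails:

    # Test the homomorphism condition for maps G -> H, with G of 2.2.2(v) and H = {1, -1}.
    G = ['e', 'a', 'b', 'c']
    mul = {('e','e'):'e', ('e','a'):'a', ('e','b'):'b', ('e','c'):'c',
           ('a','e'):'a', ('a','a'):'b', ('a','b'):'c', ('a','c'):'e',
           ('b','e'):'b', ('b','a'):'c', ('b','b'):'e', ('b','c'):'a',
           ('c','e'):'c', ('c','a'):'e', ('c','b'):'a', ('c','c'):'b'}
    theta = {'e': 1, 'a': -1, 'b': 1, 'c': -1}   # 2.4.4(iii)
    phi   = {'e': 1, 'a': -1, 'b': 1, 'c': 1}    # 2.4.5(iii)

    def is_homomorphism(f):
        return all(f[mul[(x, y)]] == f[x] * f[y] for x in G for y in G)

    print(is_homomorphism(theta))   # True
    print(is_homomorphism(phi))     # False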

2.4.6 ⚠. We have shown that ϕ here is not a homomorphism by pro-


ducing one specific counterexample. Any of several others would have
done instead. Returning to the previous example(ii), we were rather
sloppy, claiming that the results for k(xy) and k(x)k(y) are visibly dif-
ferent. But can we be so sure? Two expressions which at first sight
appear very different may in fact be equal, due to some algebraic iden-
tity. Another example may make this clear. Define p : R → R by
p(x) = (x + 1)2 − (x − 1)2 . Then p(x + y) = (x + y + 1)2 − (x + y − 1)2
and p(x) + p(y) = (x + 1)2 − (x − 1)2 + (y + 1)2 − (y − 1)2 . At first
sight, these results seem different. But expanding out gives p(x + y) =
(x2 +y 2 +1+2xy +2x+2y)−(x2 +y 2 +1+2xy −2x−2y) = 4x+4y and

p(x)+p(y) = (x2 +2x+1)−(x2 −2x+1)+(y 2 +2y +1)−(y 2 −2y +1) =


4x + 4y. So in fact p(x + y) = p(x) + p(y), and p is a homomorphism.
(Of course, the observant reader will spot that we could have simplified
the formula for p(x) to p(x) = 4x in the first place, from which it is much
more visible that p is a homomorphism.) The moral of all this is:

• To establish the truth of some general proposition, give a formal


proof.

• To establish the falsity of a general proposition, give a concrete


counterexample.

In the light of all this, the only proper way to deal with the function k of (ii)
above is to produce a concrete numerical counterexample. For instance:
taking x = y = 2 gives k(xy) = 17/2, which is not equal to k(x)k(y) = 25/4.

The homomorphism property extends at once to products of three or


more terms:

2.4.7 Lemma. If f : G → H is a homomorphism, then f (xyz) =


f (x)f (y)f (z) and more generally f (x1 x2 · · · xn ) = f (x1 )f (x2 ) · · · f (xn ),
for x1 , . . . , xn ∈ G.

Proof. Using the homomorphism property twice gives f (xyz) = f ((xy)z)


= f (xy)f (z) = f (x)f (y)f (z). In the same way f (x1 x2 · · · xn )
= f (x1 x2 · · · xn−1 )f (xn ) = f (x1 x2 · · · xn−2 )f (xn−1 )f (xn ) = . . .
= f (x1 )f (x2 ) · · · f (xn ). The reader may supply a more formal proof by
induction.

The first three things to observe about homomorphisms are given in:

2.4.8 Proposition. Let G and H be groups, with identities e and e0 , and


let f : G → H be a homomorphism. Then:
(i) f (e) = e0
(ii) f (x−1 ) = (f (x))−1
(iii) f (xn ) = (f (x))n , for all n ∈ Z.

Proof. (i) Since f is a homomorphism, we have f (e)f (e) = f (e2 ) = f (e).


Multiplying (on the left say) by the inverse of f (e) gives the first result.
Now f (x)f (x−1 ) = f (xx−1 ) = f (e) = e0 , and so f (x−1 ) is the inverse
of f (x), proving (ii). When n > 0 (iii) is just a special case of 2.4.7.
For n < 0 write m = −n, so that m > 0. Then f (xn ) = f ((x−1 )m ) =
(f (x−1 ))m = ((f (x))−1 )m = (f (x))n , using (ii) and the case just dealt
with. Finally, when n = 0 (iii) becomes trivial, both sides collapsing to
e0 .

When checking to see whether a given map is a homomorphism, you


should always look first to see whether it sends the identity to the identity.
If not, you can stop right there. This is the case, for instance, with the
function h : R → R of 2.4.5(i), which sends 0 to 1. Of course the property
f (e) = e0 is by no means enough on its own, as Examples 2.4.5(ii) and
(iii) show.

2.4.9 Example. Let G and H be any two groups, with identities e and
e0 . There is always at least one homomorphism from G to H, namely the
trivial homomorphism f : G → H , x 7→ e0 for all x ∈ G. This just sends
every element of G to the identity.

Here are two more basic facts about homomorphisms:



2.4.10 Proposition. Let G,H and K be groups. Then:


(i) The identity function i : G → G is a homomorphism ;
(ii) If f : G → H and g : H → K are homomorphisms, then so is their
composite g ◦ f : G → K.

Proof. The first statement is evident, and the second follows from
(g ◦ f )(xy) = g(f (xy)) = g(f (x)f (y)) = g(f (x))g(f (y))
= (g ◦ f )(x)(g ◦ f )(y).

We have already defined in 1.3.6 the image imf of a function f . For


homomorphisms we now make the following:

2.4.11 Definition. The kernel of a homomorphism f : G → H is the set


{x ∈ G : f (x) = e0 }. It is written ker f .

2.4.12 Proposition. Let f : G → H be a homomorphism. Then ker f is


a subgroup of G and imf = f (G) is a subgroup of H.

Proof. According to Definition 2.3.1 we must check three things. Firstly


we know that f (e) = e0 and so e ∈ ker f . Now suppose x, y ∈ ker f , or
in other words f (x) = f (y) = e0 . Then f (xy) = f (x)f (y) = e0 e0 = e0 ,
and so xy ∈ ker f . Also f (x−1 ) = (f (x))−1 = (e0 )−1 = e0 , so that
x−1 ∈ ker f . This shows that ker f is a subgroup of G.
Now e0 = f (e) visibly belongs to imf . Let x0 , y 0 ∈ imf . Then we
may write x0 = f (x) and y 0 = f (y) for suitable elements x, y ∈ G. Now
x0 y 0 = f (x)f (y) = f (xy), and hence x0 y 0 ∈ imf . Finally, 2.4.8(ii) shows
that x0−1 = f (x−1 ) ∈ imf , proving that imf is a subgroup of H.

Recall from 1.3.20 the concepts of the image and inverse image of
subsets under a function. The next result is a generalization of the previous
Proposition, and is left for the reader to prove along the same lines:

2.4.13 Proposition. Let f : G → H be a homomorphism.


(i) If G0 is a subgroup of G, then f (G0 ) is a subgroup of H;
(ii) If H 0 is a subgroup of H, then f −1 (H 0 ) is a subgroup of G.

For homomorphisms we have the following extremely useful little test


for injectivity:

2.4.14 Lemma. A homomorphism f : G → H is injective if and only if


ker f = {e}.

Proof. Let f be injective and let x ∈ ker f . Then f (x) = e0 = f (e) and
so x = e, proving that ker f = {e}. Conversely, assume that ker f = {e}
and let f (x) = f (y). Then f (xy −1 ) = f (x)f (y −1 ) = f (x)(f (y))−1 = e0 .
Thus xy −1 ∈ ker f and it follows from our assumption that xy −1 = e, so
that x = y, proving injectivity.

We have stressed in the Introduction the idea that Algebra is concerned


with the concept of structure and with the study of features which are
common to all systems which are structurally similar. The technical term
for structural similarity is isomorphism. In the context of group theory this
term is defined as follows:

2.4.15 Definition. A bijective homomorphism f : G → H between two


groups is called an isomorphism. If such an isomorphism exists, we say
that G is isomorphic to H and write G ≅ H.

Here are some examples:

2.4.16 Examples. (i) The function f : R → R , x 7→ 2x of Examples


2.4.1(i) is an isomorphism. More generally, if we fix some a ∈ R∗ , then
so also is f : R → R , x 7→ ax;
(ii) Let R∗+ = {x ∈ R : x > 0}, which is a subgroup of R∗ (see Exercise
6). Then the function g : R∗+ → R∗+ , x 7→ x2 is an isomorphism;
(iii) Consider the group M = {e, a, b} of Examples 2.2.2(iv) and also the
subgroup H = {i, α, β} of the symmetric group S3 (see 2.2.9). These are
isomorphic, an explicit isomorphism being given by f : M → H, where
f (e) = i, f (a) = α, f (b) = β, as is easily checked.

The last example illustrates the idea of structural similarity. Clearly M


and H are not the same set but if we were to rename the elements e, a, b
of M as i, α, β, respectively, then the multiplication table of M would
turn into that of H:

• e a b • i α β
e e a b i i α β
a a b e α α β i
b b e a β β i α

We do not wish to distinguish in any essential way between two isomorphic


groups: they both exhibit the same structure, but under different names.
In somewhat the same way we may call a bird a nightingale, rossignol,
Nachtigall or usignolo, but it is still the same bird! The only interesting
properties of a group are those which continue to hold for all other groups
isomorphic to it. Such properties might be called structural properties or

isomorphism invariants. Three simple examples of such invariants are the


order of a group, the property of being abelian and that of being cyclic:

2.4.17 Proposition. Let G and H be isomorphic groups. Then:


(i) G and H have the same order;
(ii) If G is abelian, so is H;
(iii) If G is cyclic, so is H.

Proof. (i) follows at once from 1.3.10. Let f : G → H be an isomorphism


and suppose that G is abelian. Given elements h, h0 ∈ H, we may write
h = f (g) and h0 = f (g 0 ) for suitable g, g 0 ∈ G. Then hh0 = f (g)f (g 0 ) =
f (gg 0 ) = f (g 0 g) = f (g 0 )f (g) = h0 h, proving (ii). As for (iii), let G be
cyclic, generated by an element x, and take any h ∈ H. Write h = f (g).
Then g = xn , for some n ∈ Z. Hence h = f (g) = f (xn ) = (f (x))n ,
showing that H is indeed cyclic, generated by the element f (x) of H.

We have spelt all this out in detail. In future we will not labour the
point, as it will usually be clear immediately that the properties we study
pass over from a given group to one isomorphic to it.
The next result shows that isomorphism behaves just like an equiva-
lence relation:

2.4.18 Proposition. Let G,H and K be groups. Then:


(i) G ≅ G ;
(ii) If G ≅ H, then H ≅ G ;
(iii) If G ≅ H and H ≅ K, then G ≅ K .

Proof. (i) The identity function i : G → G is evidently an isomorphism.


Now suppose that f : G → H is an isomorphism. Since f is bijective, it

has an inverse f −1 : H → G and to prove (ii) we just have to check that


this too is a homomorphism. So let h, h0 ∈ H and write g = f −1 (h), g 0 =
f −1 (h0 ). Then h = f (g), h0 = f (g 0 ) and we have hh0 = f (g)f (g 0 ) =
f (gg 0 ), whence f −1 (hh0 ) = gg 0 = f −1 (h)f −1 (h0 ), as required. Finally,
(iii) follows at once from 2.4.10 together with the obvious fact that the
composition of two bijections is again bijective.

2.4.19 Remark. Why did we say above that isomorphism behaves like an
equivalence relation, rather than is an equivalence relation? The answer is
that an equivalence relation takes place on a set, and for technical reasons
to do with the foundations of set theory the collection of all groups cannot
be regarded as a set. We should call it a class. It turns out that some
classes or collections are so big that the unrestrained use of all the usual
set-theory operations on them leads to logical paradoxes. The problems
only arise with unimaginably vast collections such as the class of all sets,
or (here) the class of all groups. We need not worry any further about
this, as all the classes which arise in practical everyday Mathematics fall
within the scope of set-theory. If two groups are isomorphic, we say that
they lie in the same isomorphism class.

We have seen above that isomorphic groups have the same order. The
converse is very definitely false. The simplest example is this:

2.4.20 Example. Let G = {e, a, b, c} and H = {e0 , p, q, r} be the groups



given by the tables

e a b c e0 p q r
e e a b c e0 e0 p q r
a a b c e p p e0 r q
b b c e a q q r e0 p
c c e a b r r q p e0

We have seen already in Examples 2.2.2(v) that G is a group. The


check that H is a group is similar, and left to you. Both groups have order
four, but they are not isomorphic. For G = hai is cyclic, whereas H is
not, since none of the elements generates it.
In studying the structure of a group G, we can ask what isomorphisms
there are from G to itself:

2.4.21 Definition. An isomorphism f : G → G is called an automorphism


of G.

Every group G has at least one automorphism: the identity i : G → G.


The proof of Proposition 2.4.18 shows that the composite of two automor-
phisms is again an automorphism, as is the inverse of an automorphism.
Thus the set of all automorphisms of G is a subgroup of the full symmetric
group SG of all permutations of the set G. We record this as:

2.4.22 Proposition. The set Aut(G) of all automorphisms of a group


G is itself a group under the operation of composition. It is called the
automorphism group of G.

2.4.23 Example. Consider the cyclic group G = {e, a, b, c} = hai in


Example 2.4.20 above. Any automorphism f : G → G must send a to
a generator, so that f (a) = a or c, and since f (an ) = (f (a))n , f is
entirely determined by the value of f (a). Thus there are only two possible
automorphisms, the identity i and another one ϕ determined by ϕ(a) = c.
In full, ϕ is given by e 7→ e, a 7→ c, b 7→ b, c 7→ a. This is easily checked
to be an automorphism, and thus Aut(G) = {i, ϕ}.
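One can also let a machine carry out this search. The sketch below (my own illustration; it encodes G as the powers of a, so that b = a² and c = a³) tests every bijection of the set {e, a, b, c} for the homomorphism property and confirms that exactly two survive:

    from itertools import permutations

    elems = ['e', 'a', 'b', 'c']                     # e = a^0, a, b = a^2, c = a^3
    def mult(x, y):                                  # the cyclic group law on these symbols
        return elems[(elems.index(x) + elems.index(y)) % 4]

    autos = []
    for img in permutations(elems):
        f = dict(zip(elems, img))                    # a bijection of the underlying set
        if all(f[mult(x, y)] == mult(f[x], f[y]) for x in elems for y in elems):
            autos.append(f)

    print(len(autos))                                # expected: 2
    print(sorted(f['a'] for f in autos))             # expected: ['a', 'c']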

An important source of automorphisms of a group G are the inner


automorphisms:

2.4.24 Proposition. (i) For g ∈ G the map γg : G → G , x 7→ gxg −1 is


an automorphism of G. It is called an inner automorphism and the element
gxg −1 is called the conjugate of x by g.
(ii) We have γe = i, γgh = γg γh and γg−1 = (γg )−1 , and the set Inn(G)
of all inner automorphisms of G is a subgroup of Aut(G).

Proof. Firstly γg (x)γg (y) = (gxg −1 )(gyg −1 ) = gxyg −1 = γg (xy), and so


γg is a homomorphism. Now clearly γe = i and γgh (x) = (gh)x(gh)−1 =
ghxh−1 g −1 = γg (hxh−1 ) = γg (γh (x)), showing that γgh = γg ◦ γh . Hence
γg γg−1 = γg−1 γg = γe = i, proving the last of the formulas in (ii). In
particular we have shown that γg has an inverse and so is an automorphism.
Finally, these same formulas show at once that Inn(G) is a subgroup of
Aut(G).

Note next that an injective homomorphism f : G → H (also known as


a monomorphism) gives rise to an isomorphism between G and a subgroup
of H:

2.4.25 Proposition. An injective homomorphism f : G → H gives rise


to an isomorphism f˜ : G → f (G) , x 7→ f (x) between G and the image
f (G).

Proof. All we have done in passing from f to f˜ is to cut the codomain


down to f (G). So f˜ is clearly still an injective homomorphism. But we
have now made it surjective too.

This has one interesting consequence. For any group G, define the
left multiplication map by an element g ∈ G to be the function lg : G →
G , x 7→ gx. Rather as in the proof of 2.4.24 we note at once that
le = i, lg lh = lgh , (lg )−1 = lg−1 , and in particular that lg is a permutation
of the set G. (It is not in general a homomorphism.) We now have:

2.4.26 Cayley’s Theorem. The function l : G → SG given by g 7→ lg is


an injective homomorphism of groups. Thus G is isomorphic to a subgroup
of the symmetric group SG . In particular, a finite group of order n is
isomorphic to a subgroup of Sn .

Proof. The formula lg lh = lgh shows that l is a homomorphism. Suppose


now that g ∈ ker l, so that lg = i. Then g = lg (e) = i(e) = e. So
ker l = {e} and injectivity follows from 2.4.14. For the final point, note
that if G = {g1 , . . . , gn }, then SG ≅ Sn .

Thus, up to isomorphism, all finite groups can be concretely realized


as groups of permutations, that is as subgroups of a symmetric group Sn .
This is more of theoretical interest than practical use.

2.4.27 Example. Consider once more the group G = {e, a, b, c} of Exam-


ple 2.4.20. Using a two-line notation analogous to that following Lemma
2.2.7 we may write:
    la = ( e  a  b  c ),    lb = ( e  a  b  c )    and    lc = ( e  a  b  c ).
         ( a  b  c  e )          ( b  c  e  a )                ( c  e  a  b )

If we now rename e, a, b, c as 1, 2, 3, 4, it follows that G is isomorphic to


the subgroup of S4 consisting of i together with the permutations:
    α = ( 1  2  3  4 ),    β = ( 1  2  3  4 )    and    γ = ( 1  2  3  4 ).
        ( 2  3  4  1 )         ( 3  4  1  2 )               ( 4  1  2  3 )

The reader who objects to this renaming procedure is referred to the


exercises for a more formal discussion.
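The renaming can also be carried out quite literally on a computer. In the short sketch below (my own, not part of the notes; it identifies e, a, b, c with 0, 1, 2, 3, so that G becomes Z4 under addition mod 4) each l_g is listed by its images, and the homomorphism property of l is verified:

    n = 4                                            # G renamed as {0, 1, 2, 3}, addition mod 4

    def l(g):                                        # the left multiplication map l_g
        return tuple((g + x) % n for x in range(n))

    def compose(p, q):                               # (p o q)(x) = p(q(x))
        return tuple(p[q[x]] for x in range(n))

    for g in range(n):
        print(g, '->', l(g))                         # l(1) = (1, 2, 3, 0): alpha above, in 0-based labels

    print(all(l((g + h) % n) == compose(l(g), l(h))
              for g in range(n) for h in range(n)))  # expected: True, i.e. l is a homomorphism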

Exercises:

1. Which of the following are homomorphisms? For those which are,


give a proof. For those which are not, give a concrete counterexample:

(a) f : R → R , x 7→ ax (for some fixed a ∈ R)


(b) g : R → R , x 7→ x3
(c) h : R → R , x 7→ |x|
(d) k : R → R , x 7→ 1 for all x ∈ R
(e) l : R → R , x 7→ x (if x ∈ Q), 0 (if not)

(f) p : R∗ → R∗ , x 7→ xn (for some fixed n ∈ Z)


(g) q : R∗ → R∗ , x 7→ 2x
(h) r : R → R∗ , x 7→ ex
(i) s : R → C∗ , x 7→ eix = cos x + i sin x
(j) t : C∗ → R∗ , z 7→ |z|

2. For each of the previous functions which is a homomorphism, de-


termine the kernel and the image.

3. Let G = {e, a, b, c} be the group of Examples 2.2.2(v). Which of


the following are homomorphisms?

(a) f : G → G given by e 7→ e, a 7→ b, b 7→ c, c 7→ a
(b) g : G → G given by e 7→ e, a 7→ c, b 7→ e, c 7→ a
(c) h : G → G given by e 7→ a, a 7→ b, b 7→ e, c 7→ c
(d) k : G → G given by e 7→ e, a 7→ b, b 7→ e, c 7→ b .

4. For each of the above that is a homomorphism, find the kernel and
image.

5. For the group G of Ex.3 show that there are precisely four homo-
morphisms from G to G.

6. Show that R∗+ = {x ∈ R : x > 0} is a subgroup of R∗ and that


f : R → R∗+ , x 7→ ex is an isomorphism.

7. Let G and H be groups, with identities e, e0 , and define an opera-


tion on the Cartesian product G × H by setting (x, y)(z, t) = (xz, yt).
(a) Prove that this turns G × H into a group (known as the direct product
of G and H).
(b) Prove that G × {e0 } is a subgroup of G × H, and find an isomorphism

G → G × {e0 }.
(c) Likewise, show that {e} × H is a subgroup isomorphic to H.

8. If G0 , H 0 are subgroups of G, H, show that G0 × H 0 is a subgroup


of G × H. Is every subgroup of G × H of the form G0 × H 0 , for suitable
G0 , H 0 (proof or counterexample)?

9. Consider the symmetric group SR on the set R, as in Theorem


2.2.5. Let a, b ∈ R with a 6= 0, and define a function fa,b : R → R
by f (x) = ax + b.
(a) Show that fa,b ◦ fc,d = fr,s , where r, s are to be found in terms of
a, b, c, d.
(b) Show that fa,b is bijective, and that (fa,b )−1 = fp,q , where p, q are to be
found in terms of a, b.
(c) Deduce that G = {fa,b : a, b ∈ R, a 6= 0} is a subgroup of SR . What
values of a, b give the identity?
(d) Show that H = {fa,0 : a ∈ R∗ } and K = {f1,b : b ∈ R} are sub-
groups of G, and that H ∼ = R∗ and K ∼ = R.

10. Prove Proposition 2.4.13.



11. Use the procedure of Example 2.4.27 to find subgroups of Sn , for


appropriate n, isomorphic to each of the following:
(a) The cyclic group M = {e, a, b} of Examples 2.2.2(iv)
(b) The group G = {e, a, b, c} of Section 2.3, Exercise 5.
(c) The quaternion group Q of Section 2.3, Exercise 14.

12. Let f : X → Y be a bijection between two sets. Define a func-


tion f˜ : SX → SY by setting f˜(α) = f ◦ α ◦ f −1 , for each α ∈ SX .
Prove that f˜ is an isomorphism, and use this result to give a more formal
treatment of Example 2.4.27.

13. Continuing with the previous exercise, let g : Y → Z be another function
and let i : X → X be the identity. Show that ĩ = i and (g ◦ f )˜ = g̃ ◦ f˜.

14. Let f : G → H be a nontrivial group homomorphism (ie. not


equal to the trivial homomorphism of Example 2.4.9). If G is finite of
prime order, show that f is injective.

15. Let G be a group and x any element. Show that f : Z → G , n 7→ xn


is a homomorphism, and use Proposition 2.3.12 to determine the kernel.

16. Suppose G is a group and X a set in one-one correspondence with



G, via a given bijection f : G → X. We define an operation ∗ on X by
setting x ∗ y = f (f −1 (x)f −1 (y)). Show that this turns X into a group,
and that f is then an isomorphism. We say that we have transported the
group structure from G to X.

2.5 Quotient Groups


We come now to one of the most useful and fundamental constructions in
Mathematics, the process of forming quotients. In the context of group
theory the idea is this. Given a group G and a subgroup H, recall from
Section 2.3 that G may be partitioned up as the union of the various
disjoint left cosets xH. Let us now think of these cosets as objects in
their own right, and form the set consisting of all of them. We write this
set as G/H. Thus G/H = {xH : x ∈ G}. It is crucial to realise that the
elements of G/H are not elements of G, but subsets of G, and so G/H
is not a subset of G.

2.5.1 Example. Let G = S3 = {i, α, β, ρ, σ, τ } be the symmetric group


of 2.2.9 and consider the subgroups H = {i, α, β} and K = {i, ρ}. Then:
G/H = {H, ρH} = {{i, α, β}, {ρ, σ, τ }}
and G/K = {K, αK, βK} = {{i, ρ}, {α, σ}, {β, τ }} .
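These cosets are easy to recompute mechanically. In the sketch below (my own illustration; permutations of {1, 2, 3} are stored as tuples p with 0-based entries, p[x] being the image of x) the two partitions above are recovered by counting the distinct left cosets:

    from itertools import permutations

    def compose(p, q):                               # (p o q)(x) = p(q(x))
        return tuple(p[q[x]] for x in range(3))

    S3    = list(permutations(range(3)))
    i     = (0, 1, 2)
    alpha = (1, 2, 0)                                # the 3-cycle sending 1 -> 2 -> 3 -> 1
    beta  = compose(alpha, alpha)
    rho   = (1, 0, 2)                                # the transposition interchanging 1 and 2
    H, K  = {i, alpha, beta}, {i, rho}

    def left_cosets(sub):
        return {frozenset(compose(g, h) for h in sub) for g in S3}

    print(len(left_cosets(H)), len(left_cosets(K)))  # expected: 2 3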

Observe that there is a very natural function ϕ from G to G/H, given


by sending each element to its own coset:

ϕ : G → G/H , x 7→ xH.

We now try to turn the set G/H into a group in such a way that ϕ becomes
a homomorphism. If this is to be so, we will have to have ϕ(x)ϕ(y) =
ϕ(xy), or in other words (xH)(yH) = xyH. This last equation tells us
how we are going to have to define the product of two cosets if we are to
have any chance of things working.
But there is a problem here. For a given coset may be represented in

the form xH in many different ways. As an example, consider the subgroup


K of S3 above, and consider the left cosets C = {α, σ} and D = {β, τ }.
If we represent C and D in the forms C = αK, D = βK then we are led
to define the product of C and D by C · D = αβK = iK = K. But
if we write C = σK, D = βK, we are led instead to C · D = σβK =
τ K = {β, τ }, a different coset. Thus we have no unambiguous definition
of what C · D is supposed to be.
Clearly, then, the programme we wish to carry out is not going to work
for all subgroups H of G. For the proposed definition of the product of
two cosets to have any chance of working, the very least we will need is
that
xH = x0 H, yH = y 0 H ⇒ xyH = x0 y 0 H

and in particular, taking x0 = e and y 0 = y, that for all x, y ∈ G

xH = H ⇒ xyH = yH.

Recall from Corollary 2.3.17 that zH = tH ⇔ t−1 z ∈ H. So the last


condition amounts to saying that x ∈ H ⇒ y −1 xy ∈ H for all y ∈ G, or
again, replacing y by y −1 , that x ∈ H ⇒ yxy −1 ∈ H for all y ∈ G. It
turns out that this necessary condition for our programme to work is also
sufficient and leads us to the following

2.5.2 Definition. A subgroup H of a group G is called normal if it satisfies


the condition that for all g ∈ G, h ∈ H we have ghg −1 ∈ H. We write
H C G to denote that H is a normal subgroup of G.

Recall from 2.4.24 that an element of the form ghg −1 is called a con-

jugate of h. Thus H is normal provided all conjugates of its elements


remain within H. We say that H is stable under conjugation. Whether
H is normal or not, let us write gHg −1 = {ghg −1 : h ∈ H}. One may
check at once that this is a subgroup of G, or better still note that this is
the image of H under the inner automorphism γg of Proposition 2.4.24,
hence automatically a subgroup, isomorphic to H. We call it a conjugate
subgroup to H. Then the definition of normality may be rephrased as
gHg −1 ⊂ H, for all g ∈ G.
We have also noted in Examples 2.3.14(iv) that in general left and
right cosets are different. But for normal subgroups they are the same.
We draw all these observations together:

2.5.3 Proposition. If H is a subgroup of G, then the following are equiv-


alent:
(i) H is normal in G
(ii) gHg −1 ⊂ H (∀g ∈ G)
(iii) gHg −1 = H (∀g ∈ G)
(iv) gH = Hg (∀g ∈ G).

Proof. We have seen already that (i) and (ii) are equivalent. Suppose
now that (ii) holds. Then replacing g by g −1 gives g −1 Hg ⊂ H too.
Multiplying on the left by g and on the right by g −1 leads to H ⊂ gHg −1 .
Together with gHg −1 ⊂ H this establishes (iii), and clearly (iii) implies
(ii). Finally, multiplying (iii) on the right by g leads to (iv), and similarly
in the reverse direction.

2.5.4 Examples. (i) If G is abelian then all subgroups are normal. This
is clear from any of the criteria above.

(ii) The subgroup H of Example 2.5.1 is normal in S3 . Using the third


criterion above, we have to check that gHg −1 = H for each of the six
elements of G = S3 , and this is easily done. Actually, with a little thought
we can considerably reduce the work. We clearly don’t have to bother
with the case g ∈ H, as the equation holds trivially then. Now ρHρ−1 =
{ρiρ−1 , ραρ−1 , ρβρ−1 } = {i, β, α} = H, using the table of 2.2.9. Since
σ = ρβ we obtain at once without further calculation that σHσ −1 =
ρβHβ −1 ρ−1 = ρHρ−1 = H, bearing in mind that β ∈ H. Similarly
τ Hτ −1 = H.
(iii) The subgroup K of Example 2.5.1 is not normal in S3 . For example
ρ ∈ K and yet αρα−1 = τ ∈ / K.
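Checks like these are also easily mechanised. The following self-contained sketch (mine, with the same 0-based tuple encoding of S3 as before) conjugates every element of H and of K by every element of S3 and reports whether each subgroup is stable:

    from itertools import permutations

    def compose(p, q):                               # (p o q)(x) = p(q(x))
        return tuple(p[q[x]] for x in range(3))

    def inverse(p):
        inv = [0] * 3
        for x, px in enumerate(p):
            inv[px] = x
        return tuple(inv)

    S3 = list(permutations(range(3)))
    i  = (0, 1, 2)
    alpha, beta, rho = (1, 2, 0), (2, 0, 1), (1, 0, 2)
    H, K = {i, alpha, beta}, {i, rho}

    def is_normal(sub):
        return all(compose(compose(g, h), inverse(g)) in sub for g in S3 for h in sub)

    print(is_normal(H), is_normal(K))                # expected: True False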

Actually, there is a slick reason why the subgroup H of Example (ii) is


normal, which avoids doing any calculation at all. First a definition:

2.5.5 Definition. The number of left cosets of H in G is the index of H


in G, written (G : H).

2.5.6 Proposition. If (G : H) = 2 then H C G.

Proof. There are only two distinct left cosets, namely H and one other,
say xH, which has, of course, to be the complementary set G − H. We
have to check that gH = Hg, for all g. This is trivial if g ∈ H, so assume
now that g ∉ H. Then gH = xH = G − H. Similarly Hg = G − H, and
we are done.

An important source of normal subgroups is given by the next result:

2.5.7 Proposition. Let f : G → H be a group homomorphism. Then


ker f C G.

Proof. We know already that ker f is a subgroup, so we just have to


check normality. Let x ∈ ker f , so that f (x) = e0 and let g ∈ G. Then
f (gxg −1 ) = f (g)f (x)f (g)−1 = f (g)f (g)−1 = e0 , and hence gxg −1 ∈
ker f , as required.

We can now return to the theme which began this section and prove:

2.5.8 Theorem. Let G be a group and H a normal subgroup. Then:


(i) The prescription (xH)(yH) = xyH gives a well-defined operation on
the set G/H of all left cosets of H in G, and turns G/H into a group,
with identity element H (the quotient group of G by H).
(ii) The function ϕ : G → G/H , x 7→ xH is a surjective homomorphism,
with kernel H. It is called the canonical homomorphism.

Proof. (i) We must first check that if xH = x0 H and yH = y 0 H, then


xyH = x0 y 0 H, or in other words that (xy)−1 x0 y 0 = y −1 x−1 x0 y 0 belongs
to H. Now x−1 x0 belongs to H, and hence so does y −1 (x−1 x0 )y, as H
is normal. But we also have y −1 y 0 ∈ H, and so H contains the product
(y −1 x−1 x0 y)(y −1 y 0 ) = y −1 x−1 x0 y 0 , as required. Thus we have a well-
defined operation on G/H.
We now check that this is a group. Clearly H = eH serves as the iden-
tity. Next ((xH)(yH))(zH) = (xyH)(zH) = xyzH = (xH)(yzH) =
(xH)((yH)(zH)), proving associativity. Finally (xH)(x−1 H) = xx−1 H =
eH = H and likewise (x−1 H)(xH) = H. This shows that each element
xH has an inverse, namely (xH)−1 = x−1 H and we are done.
(ii) ϕ(xy) = xyH = (xH)(yH) = ϕ(x)ϕ(y), so that ϕ is a homomor-
phism. By definition every element of G/H is of the form xH, for some

x, whence ϕ is surjective. Lastly ϕ(x) = H ⇔ xH = H ⇔ x ∈ H, and


so ker ϕ = H.

2.5.9 Example. Returning to the normal subgroup H = {i, α, β} of G =


S3 (Example 2.5.1), let us compute the group table of G/H = {H, ρH}.
We really only have to calculate that (ρH)(ρH) = ρ2 H = iH = H. Thus
the group table is:
H ρH
H H ρH
ρH ρH H
and it is clear that G/H is a cyclic group of order 2, isomorphic to the
group {1, −1} of Examples 2.2.2.

For a more substantial example, consider:

2.5.10 Example. Let Q = {1, −1, i, −i, j, −j, k, −k} be the quaternion
group of Section 2.3, Exercise 14. The full table of this is:

1 −1 i −i j −j k −k
1 1 −1 i −i j −j k −k
−1 −1 1 −i i −j j −k k
i i −i −1 1 k −k −j j
−i −i i 1 −1 −k k j −j
j j −j −k k −1 1 i −i
−j −j j k −k 1 −1 −i i
k k −k j −j −i i −1 1
−k −k k −j j i −i 1 −1

One quickly checks that C = {1, −1} is a normal subgroup, and then
clearly Q/C = {C, iC, jC, kC}. We now compute the group structure
on this. For example (iC)(iC) = i2 C = (−1)C = C and (jC)(kC) =
jkC = iC, and I leave you to do the rest. The table for Q/C is:

C iC jC kC
C C iC jC kC
iC iC C kC jC
jC jC kC C iC
kC kC jC iC C

We can now see that this is isomorphic to the group G = {e, a, b, c} of



Section 2.3, Exercise 5, an explicit isomorphism f : Q/C → G being given
by C 7→ e, iC 7→ a, jC 7→ b, kC 7→ c.
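The whole of this computation can be delegated to a few lines of code. In the sketch below (my own illustration, with ad hoc names) a quaternion is stored as a pair (sign, letter), the dictionary base records the products of the letters 1, i, j, k as in the table above, and the table of Q/C is printed row by row:

    base = {('1','1'):(1,'1'), ('1','i'):(1,'i'), ('1','j'):(1,'j'), ('1','k'):(1,'k'),
            ('i','1'):(1,'i'), ('i','i'):(-1,'1'), ('i','j'):(1,'k'), ('i','k'):(-1,'j'),
            ('j','1'):(1,'j'), ('j','i'):(-1,'k'), ('j','j'):(-1,'1'), ('j','k'):(1,'i'),
            ('k','1'):(1,'k'), ('k','i'):(1,'j'), ('k','j'):(-1,'i'), ('k','k'):(-1,'1')}

    def mult(x, y):                                   # multiplication in Q
        s, letter = base[(x[1], y[1])]
        return (x[0] * y[0] * s, letter)

    C = {(1, '1'), (-1, '1')}                         # the normal subgroup {1, -1}

    def coset(x):                                     # the coset xC, as a subset of Q
        return frozenset(mult(x, c) for c in C)

    names = {coset((1, x)): x + 'C' if x != '1' else 'C' for x in '1ijk'}
    for x in '1ijk':
        row = [names[coset(mult((1, x), (1, y)))] for y in '1ijk']
        print(names[coset((1, x))], ':', row)
    # expected rows: C iC jC kC / iC C kC jC / jC kC C iC / kC jC iC C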

2.5.11 Remarks. (i) One can view the quotient group construction G/H
as being in some sense a counterpoint to the study of the subgroup H.
In the latter, we narrow our attention down to what is going on inside H,
blinkering ourselves to the rest of the ambient group G. But when we
look at the quotient group, it’s as if we have “collapsed” H right out of
the picture, and are now concentrating on what residual structure is left.
Indeed, each coset has effectively been shrunk down to a single point.
(ii) As noted in Section 1.3, the group law in some abelian groups is written
additively, in which case we write the cosets as x + H. In that case we
will continue to use an additive notation in the quotient group and write
(x + H) + (y + H) = x + y + H.
(iii) When working in a quotient group G/H, it rapidly becomes tedious

to keep writing the elements in coset notation, so we adopt the shorthand


symbol x̄ = xH (or x + H). Thus the group law may be written as
x̄ȳ = xy (or x̄ + ȳ = x + y). Alternative notations include [x], x̃, ẋ, . . .

The order of G/H is by definition equal to the index (G : H), defined


in 2.5.5. If G is finite we can go further:

2.5.12 Proposition. If G is a finite group and H a normal subgroup, then


|G/H| = |G|/|H|.

Proof. In the notation of the proof of Lagrange’s Theorem 2.3.19 we have


|G| = n, |H| = m and (G : H) = k. We saw there that n = km.

We have talked a lot about cyclic groups and their properties in Section
2.3, but how do we know that they actually exist? We can now fill in this
gap:

2.5.13 Example. Fix an integer n ≥ 0 and, for any integers x, y, define


x ≡ y (mod n) to mean that x − y is a multiple of n. We say that x is
congruent to y modulo n and it is easily checked (Section 1.2, Exercise
14) that this is an equivalence relation on Z. From the point of view of
cosets, note that x ≡ y (mod n) ⇔ x − y ∈ nZ ⇔ x + nZ = y + nZ.
Thus the equivalence classes are precisely the cosets of the subgroup nZ
in Z : [x] = x + nZ.
If n = 0 the relation of congruence just boils down to equality and there
are infinitely many classes, namely the [x] = {x}, for all x ∈ Z. Evidently

in this case there is an isomorphism Z → Z/0Z given by x 7→ [x].
Now let n ≥ 1. It follows from the Division Algorithm 2.3.22 that there
are precisely n cosets [0], [1], . . . , [n − 1]. Thus the quotient group Z/nZ

has order n. This is called the group of integers mod n and will also be
denoted by Zn . It is evidently cyclic, generated by [1].

For convenience of notation we shall often drop the square brackets


and simply write the elements of Zn as 1, 2, . . . , n − 1. But one should
note carefully that Zn is not a subset of Z. As an example, the addition
table for Z5 is:
0 1 2 3 4
0 0 1 2 3 4
1 1 2 3 4 0
2 2 3 4 0 1
3 3 4 0 1 2
4 4 0 1 2 3
We can either say that 3 + 4 ≡ 2 (mod 5) or that 3 + 4 = 2 in Z5 .
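Since the elements of Zn behave exactly like the remainders 0, 1, . . . , n − 1 under addition mod n, the table above is reproduced by a couple of lines of code (an illustrative sketch of mine, not part of the development):

    n = 5
    print((3 + 4) % n)                               # expected: 2, i.e. 3 + 4 = 2 in Z_5
    for x in range(n):
        print([(x + y) % n for y in range(n)])       # reprints the addition table of Z_5 above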
We draw our conclusions together in the following:

2.5.14 Theorem. (i) Any two cyclic groups of the same order are iso-
morphic;
(ii) For each n ≥ 1 there is, up to isomorphism, precisely one cyclic group
of order n. A model for this is Zn ;
(iii) There is, up to isomorphism, precisely one cyclic group of infinite
order. A model for this is Z.

Proof. We have now seen in Example 2.5.13 that cyclic groups of all
possible orders exist. It remains to prove uniqueness. Let G be a cyclic
group, generated by an element x. Assume first that G is infinite. Clearly
the map f : Z → G , k 7→ xk is a homomorphism. Since G = hxi it
is surjective. Moreover, as x has infinite order xk = e ⇒ k = 0. Hence

ker f = {0} and it follows from Lemma 2.4.14 that f is an isomorphism,


proving (iii). Assume instead that G has finite order n, and define a map
g : Zn → G by [k] 7→ xk . We must first check that this is well-defined,
since a given congruence class may be represented in many different ways.
So suppose that [k] = [l]. Then k ≡ l (mod n), and we may write k =
l + an for some integer a. Since xn = e it follows that xk = xl (xn )a = xl ,
and we have therefore shown that g is well-defined. It is now clearly a
surjective homomorphism. Lastly, suppose that g([k]) = xk = e. Then by
Proposition 2.3.12 k is a multiple of n, and hence [k] = [0], proving that
g is an isomorphism.

We can now see how to avoid doing all the tedious associativity checks
to verify that the group G = {e, a, b, c} of Examples 2.2.2 really is a group.
Indeed, we can see at once that it is the cyclic group of order four, and as
such exists.
The method whereby we showed above that Zn is isomorphic to G may
be greatly extended to prove the next Theorem, which is one of the central
planks of group theory. Recall that we already observed in Proposition
2.5.7 that the kernel of a homomorphism is a normal subgroup.

2.5.15 First Isomorphism Theorem. Let f : G → H be a homomor-



phism of groups. Then f gives rise to an isomorphism f¯ : G/ ker f →
f (G), defined by f¯(x̄) = f (x). We call f¯ the isomorphism induced by f .

Proof. First we check that f¯ is well-defined, so suppose that x̄ = ȳ. Thus


x = yk for some k ∈ K = ker f . Then f (x) = f (y)f (k) = f (y)e0 =
f (y), as required. Next, for any x, y ∈ G we have f¯(x̄ȳ) = f¯(xy) =
f (xy) = f (x)f (y) = f¯(x̄)f¯(ȳ) and f¯ is a homomorphism. Every element

of the image f (G) is of the form f (x), so that f¯ is visibly surjective.


Finally, suppose that f¯(x̄) = e0 . Then f (x) = e0 and x ∈ K, whence
x̄ = ē. Lemma 2.4.14 now shows that f¯ is injective, and we are done.

Note that we have f = f¯ ◦ ϕ, where ϕ : G → G/H is the canonical


homomorphism. This Theorem is one of the most important tools in
the whole of group theory for obtaining isomorphisms and helping us to
understand the structure of groups. Its use enables us to dispense with
a lot of ad hoc arguments, and, for instance, streamlines some of our
earlier examples of quotients. The way we use it is encapsulated in the
following:

2.5.16 Strategy. Suppose we are trying to understand the structure of a


certain quotient group G/H and have a hunch that it might be isomorphic
to some other group K which we already understand well. We then try
to find some naturally defined surjective homomorphism f from the top
group G onto K. We keep our fingers crossed and hope that the kernel
turns out to be H. If so, then the Isomorphism Theorem straight away

gives us the desired isomorphism G/H → K.

We look at some examples, beginning with a couple of trivial ones:

2.5.17 Example. The identity automorphism i : G → G has kernel {e}



and leads to the isomorphism G/{e} → G. We are essentially doing no
“collapsing” at all.

2.5.18 Example. The trivial homomorphism G → G , x 7→ e has kernel



G and leads to the isomorphism G/G → {e}. This time we have collapsed
everything down to one point.

2.5.19 Example. Let G = hxi be a cyclic group of order n. Clearly


f : Z → G , m 7→ xm is a homomorphism onto G, and Proposition 2.3.12

shows that its kernel is nZ. Thus f induces an isomorphism Z/nZ → G,
giving us a slicker proof of Theorem 2.5.14(ii).

2.5.20 Example. Consider the symmetric group G = {i, α, β, ρ, σ, τ }


of 2.2.9. If we imagine the numbers 1,2,3 evenly placed in clockwise
order round a circle, then clearly three of the permutations preserve the
cyclic ordering. They are i, α, β and we call these the even permutations.
The other three ρ, σ, τ reverse the cyclic order: we call these the odd
permutations. It is clear that the composition of two even or two odd
permutations produces an even permutation, but that the composition of
one of each type gives an odd permutation. In short:

◦ even odd
even even odd
odd odd even

Now let C = {1, −1} be the cyclic group of order two under multiplication,
and define a function f : S3 → C by sending the even permutations to 1,
and the odd ones to −1. This is seen at once to be a surjective homo-
morphism with kernel H = {i, α, β}, and thus it induces an isomorphism

S3 /H → C, as in Example 2.5.9.

We shall see in a later section how the idea of even and odd permuta-
tions extends to all symmetric groups Sn .

2.5.21 Example. In Section 2.4, Exercise 9 you were asked to show that
the set of all functions fa,b : R → R , x 7→ ax + b, with a, b ∈ R , a 6= 0

forms a group under composition, a subgroup of the infinite symmetric


group SR . We also compute that fa,b ◦ fc,d = fac,ad+b and it follows at
once that the function ϕ : G → R∗ , fa,b 7→ a is a homomorphism, clearly
surjective. By definition, the kernel is the subgroup K = {f1,b : b ∈ R}
and the Isomorphism Theorem shows that we have an isomorphism from
G/K onto R∗ . Note that we do not first have to check that K is a
subgroup. That comes for free, since it is a kernel, together with the extra
fact that it is normal.

It is not immediately clear what the subgroups of a quotient G/H


look like. In fact they turn out to be in one-one correspondence with
those subgroups of G which contain H, as we shall now see. Suppose first
that K is a subgroup of G which contains H. Then H is still normal in
K and the quotient group K/H makes sense. The elements kH of this
are just particular elements of G/H and so K/H is a subset of G/H.
A moment’s reflection shows it to be a subgroup of G/H. It is just the
image ϕ(K) of K under the canonical homomorphism ϕ : G → G/H.
We now have:

2.5.22 Proposition. Let G be a group, H a normal subgroup and ϕ :


G → G/H the canonical homomorphism. Then there is a bijection:

{All subgroups of G containing H} −→ {All subgroups of G/H}

given by K 7→ ϕ(K) = K/H, and in the reverse direction by K 7→ ϕ−1 (K).

Proof. We have seen that this is a well-defined function. If K is a subgroup


of G/H, then by 2.4.13 ϕ−1 (K) is a subgroup of G, and it contains H since

ϕ(H) = {ē}. Thus the reverse map is also well-defined and we just have
to check that the two maps are inverses of each other, or equivalently that
the composition each way round is the identity. This is left as an exercise
for the reader.
2.5.23 Example. Recall the group Zn = Z/nZ, where n ≥ 1. By Propo-
sition 2.3.24 the subgroups of Z are all of the form K = mZ, and one
checks at once that K contains nZ if and only if m|n (m divides n).
Thus the subgroups of Z/nZ are the various quotients mZ/nZ, where
m|n and we may as well take m ≥ 1. For instance Z/12Z has subgroups
Z/12Z, 2Z/12Z, 3Z/12Z, 4Z/12Z, 6Z/12Z and 12Z/12Z = {0̄}.
If we reexamine the first two lines in the proof of Theorem 2.5.15,
we see that the argument there also establishes the following very useful
principle:
2.5.24 Proposition. If f : G → H is a homomorphism and K is a
normal subgroup of G contained in ker f , then f induces a homomorphism
f˜ : G/K → H, defined by f˜(ḡ) = f (g). We have f = f˜ ◦ ϕ, where
ϕ : G → G/K is the canonical homomorphism, and we say that f factors
through G/K.
Of course f˜ will not be an isomorphism unless K = ker f and f
happens to be surjective.
There are two further isomorphism theorems, which are actually just
particular cases of 2.5.15, but nevertheless useful to record. First a defi-
nition:
2.5.25 Definition. If X and Y are two subsets of a group (or monoid) G,
define their product to be the set XY = {xy : x ∈ X, y ∈ Y }.

If X = {x} we shall simply write xY . Likewise in the case where Y


has just one element. Note that this accords with the notation which we
have already used for left and right cosets. Of course, if G is abelian and
the group law is written additively, then we must write X + Y and x + Y .
Now let H and K be subgroups of G. In general HK is not a subgroup
(exercise: find some examples). However:

2.5.26 Proposition. (i) If H and K are subgroups of G and one of them


is normal, then HK is a subgroup and HK = KH.
(ii) If both are normal, then so is HK.

Proof. (i) Suppose, for example, that H C G. Then Hk = kH for all


k ∈ K, whence HK = ⋃k∈K Hk = ⋃k∈K kH = KH. The identity
e = ee ∈ HK. Now let g = hk and g 0 = h0 k 0 be elements of HK
(h, h0 ∈ H, k, k 0 ∈ K). Then gg 0 = (hkh0 k −1 )(kk 0 ). Since H is normal,
kh0 k −1 ∈ H and thus hkh0 k −1 ∈ H and kk 0 ∈ K, proving that gg 0 ∈ HK.
Furthermore g −1 = k −1 h−1 ∈ KH = HK. Thus HK is a subgroup.
(ii) If both are normal and g ∈ G, then gHK = HgK = HKg, and so
HK C G. We used here the obvious fact that the product of subsets is
an associative process.

We can now state:

2.5.27 Second Isomorphism Theorem. Let H and K be subgroups of


G, with K normal. Then H ∩ K C H and we have an isomorphism


H/H ∩ K → HK/K , given by h(H ∩ K) 7→ hK.

(We could just write h̄ 7→ h̄, provided we remember that the two h̄’s mean
different things).

Proof. Consider the homomorphism f : H → HK/K , h 7→ h̄ given


by composing the inclusion map H → HK , h 7→ h with the canonical
homomorphism HK → HK/K. Take any element xK of HK/K and
write x = hk, with h ∈ H, k ∈ K. Then xK = hkK = hK = f (h),
showing that f is surjective. Furthermore, f (h) = hK = K ⇔ h ∈
K, and hence ker f = H ∩ K. Our result now follows from the First
Isomorphism Theorem.

As ever, if the group law is written additively, then the isomorphism
becomes H/H ∩ K ≅ (H + K)/K.

2.5.28 Example. Take G = Z, H = mZ, K = nZ, where m, n ≥ 1.


Since Z is abelian, all subgroups are normal. We saw in Proposition 2.3.26
that H + K = dZ, with d = HCF (m, n), and it is easily shown (exercise)
that H ∩ K = eZ, where e = LCM (m, n) is the least common multiple
of m and n. So the Theorem above gives an isomorphism:

mZ/eZ ≅ dZ/nZ .

For example 12Z/60Z ≅ 3Z/15Z.
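As a quick numerical sanity check (a sketch of mine, not part of the development): both quotients have order e/m = n/d, which one can confirm for the example above and, indeed, for many pairs m, n at once (compare Exercise 11):

    from math import gcd

    def orders_agree(m, n):
        d = gcd(m, n)                                # HCF(m, n)
        e = m * n // d                               # LCM(m, n)
        return e // m == n // d                      # |mZ/eZ| = e/m and |dZ/nZ| = n/d

    print(orders_agree(12, 15))                      # the example above: both quotients have order 5
    print(all(orders_agree(m, n)
              for m in range(1, 40) for n in range(1, 40)))   # expected: True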

For our final isomorphism theorem, consider the situation K ⊂ H ⊂ G in


which both H and K are normal subgroups of G. Then all three quotients
G/H, G/K and H/K make sense. We now have:

2.5.29 Third Isomorphism Theorem. Let H and K be normal subgroups


of G with K ⊂ H. Then H/K C G/K and we have an isomorphism


(G/K)/(H/K) → G/H given by ḡ(H/K) 7→ gH .

Proof. By Proposition 2.5.24 the canonical homomorphism G → G/H


factors through G/K to give a well-defined homomorphism ψ : G/K →
G/H , gK 7→ gH, and this is clearly surjective. Now gK ∈ ker ψ ⇔
gH = H ⇔ g ∈ H. So ker ψ consists of the various cosets gK with
g ∈ H, or in other words ker ψ = H/K. The Theorem follows from
2.5.15.

The intuitive idea behind this Theorem is that collapsing K down to


a point first and then further collapsing H, or more accurately H/K,
amounts to much the same as collapsing H down to a point all in one go.

Exercises:

1. Let G be a group and H a subgroup. Show that the subgroups of H are


precisely those subgroups of G which lie in H. (This is almost obvious, but
worth checking). If K is a subgroup of H, show that K C G ⇒ K C H.
Give an example to show that the converse is false.
2. Consider the permutations

    ρ = ( 1  2  3  4 )    and    σ = ( 1  2  3  4 )
        ( 2  3  4  1 )               ( 3  2  1  4 )

in S4 .

(a) Show that G = {i, ρ, ρ2 , ρ3 , σ, σρ, σρ2 , σρ3 } is a subgroup of S4 .


(b) Find subgroups H and K such that K C H and H C G, and yet K is not normal in G.

3. (a) Prove that the centre C of a group G (Section 2.3, Ex. 13) is a


normal subgroup.
(b) Part (a) shows that ϕ(C) = C for every inner automorphism ϕ of G
(2.4.24). Show that in fact this remains true for all automorphisms of G.

4. Fill in the details in Example 2.5.20.

5. Show that mZ ⊃ nZ ⇔ m|n.

6. Give an example of two subgroups H, K of a group G such that HK


is not a subgroup.

7. Show that the product operation 2.5.25 on subsets of a group is asso-


ciative, ie. that (XY )Z = X(Y Z).

8. For m, n ≥ 1, prove that mZ ∩ nZ = eZ, where e is the least common


multiple of m and n.

9. Prove that a quotient of a cyclic group is again cyclic.

10. Let f : G → G0 be an isomorphism and H a normal subgroup of


G. Show that f induces an isomorphism from G/H onto G0 /f (H).

11. (i) Let m, n ≥ 1. Use the previous exercise to obtain an isomor-



phism Z/nZ → mZ/mnZ , x̄ 7→ mx.
(ii) Let d = HCF (m, n) and e = LCM (m, n). Use part (i), together
with 2.5.12 and 2.5.28, to show that de = mn.
(iii) Give a direct proof of this last fact without using group theory.

12. Show that the homomorphism f : R → C∗ , x 7→ eix induces


an isomorphism from the quotient R/2πZ onto the subgroup T = {z ∈ C :
|z| = 1} of C∗ . This last is called the circle group and is also denoted S 1 .

13. Consider the direct product G × H of two groups (Section 2.4, ex.7)
and let G0 , H 0 be normal subgroups of G, H respectively. Use Theorem
2.5.15 to show that G0 × H 0 C G × H and that (G × H)/(G0 × H 0 ) ≅ G/G0 × H/H 0 .

14. Let G be an abelian group of order pq, where p, q are distinct primes.
We aim to prove that G is cyclic.
(i) If all elements 6= e have order p, choose one such element x and derive
a contradiction by considering the quotient G/ < x > .
(ii) Deduce that either G is cyclic or that it contains an element a of order
q, and likewise an element b of order p.
(iii) Show that in the latter case ab has order pq, and G is again cyclic.

2.6 More on Permutations


The symmetric groups Sn form an important class of concrete examples
of groups and in some sense, according to Cayley’s Theorem 2.4.26, all
groups may be treated as groups of permutations. We now investigate
these groups more deeply, and we begin by developing a notation for per-
mutations which is a lot less cumbersome and easier to calculate with than
the two-line notation encountered so far. This depends on breaking up a
general permutation into parts of a very simple type:

2.6.1 Definition. Let a1 , . . . , ak be a sequence of distinct integers in


I = {1, 2, . . . , n}. Then the permutation σ ∈ Sn which sends ai to ai+1
(1 ≤ i ≤ k − 1), ak to a1 and fixes all other elements of I is called a
k-cycle. We denote it by σ = (a1 a2 . . . ak ).
Thus in S5 :   (241) = ( 1  2  3  4  5 )    and    (13254) = ( 1  2  3  4  5 ) .
                       ( 2  4  3  1  5 )                     ( 3  5  2  1  4 )

Note that in cycle notation we can start anywhere we like and proceed
round in cyclic order. Thus (13254) = (32541) = (25413) = (54132) =
(41325). We will adopt the convention that ak+1 = a1 . We can then say
that σ(ai ) = ai+1 for all i such that 1 ≤ i ≤ k. Of course, in general two
permutations do not commute with each other. But there is one situation
in which they do:

2.6.2 Proposition. Let σ = (a1 a2 . . . ak ) and τ = (b1 b2 . . . bl ) be two


cycles in Sn , and suppose that the sets A = {a1 , a2 , . . . , ak } and B =
{b1 , b2 , . . . , bl } are disjoint. Then στ = τ σ.

Proof. We have to show that στ (m) = τ σ(m) for all m ∈ I. Now


στ (ai ) = σ(ai ) = ai+1 = τ (ai+1 ) = τ σ(ai ) and likewise στ (bj ) = τ σ(bj ).
If m ∈ / A ∪ B, then both σ and τ fix m, so once more στ (m) = m =
τ σ(m).

We refer to cycles as above as disjoint cycles. Consider now a gen-


eral permutation σ and define a relation ∼ on I by setting k ∼ l if
l = σ i (k) for some integer i. An immediate check shows that this is an
equivalence relation. The equivalence classes are also called the orbits.
(This is actually a special case of a much more general situation which
we shall investigate in Section 2.8.) For example, in S10 the orbits of

    σ = ( 1  2  3  4  5   6  7  8  9  10 )
        ( 4  9  6  7  10  5  1  8  2  3  )

are {1, 4, 7}, {2, 9}, {3, 6, 5, 10} and {8} .
Now take any a ∈ I. The orbit Orb(a) of a consists of the integers
a, σ(a), σ 2 (a), . . . . We now have:

2.6.3 Lemma. The size of Orb(a) is the least positive integer k such that
σ k (a) = a. We have Orb(a) = {a, σ(a), σ 2 (a), . . . σ k−1 (a)}.

Proof. As I is finite, the sequence a, σ(a), σ 2 (a), . . . must contain repe-


titions, say σ i (a) = σ i+k (a) where k ≥ 1. Acting σ −i on each side gives
σ k (a) = a. If we now take k to be the least positive integer with this last
property, then a, σ(a), . . . , σ k−1 (a) are all distinct and we are done.

Note the similarity of this argument to that of Proposition 2.3.10.


Now let Orb(a1 ), . . . , Orb(ar ) be the distinct orbits corresponding to the
permutation σ under discussion, and let ki = |Orb(ai )|. Then Orb(ai ) =
{ai , σ(ai ), . . . , σ ki −1 (ai )}. Form the ki -cycles σi = (ai , σ(ai ), . . . , σ ki −1 (ai ))

and consider the permutation τ = σ1 σ2 · · · σr . We claim that τ = σ, or


in other words that τ (m) = σ(m) for all m ∈ I. Suppose m ∈ Orb(ai ).
Then we can write m = σ j (ai ) for some j. The cycles σ1 , . . . , σr are
disjoint, and all of them leave m alone, apart from σi , which sends m to
σ j+1 (ai ). But this is also the effect of σ on m. We have thus proved most
of the following fundamental result:

2.6.4 Theorem. Every permutation σ ∈ Sn may be decomposed as a


product σ = σ1 σ2 · · · σr of disjoint cycles σi . The σi commute with each
other. Apart from rearranging the order of the factors, such a decompo-
sition is unique.

Proof. We are just left with proving uniqueness. So suppose σ = σ1 σ2 · · · σr =


τ1 τ2 · · · τs are two decompositions of σ into disjoint cycles. From the first
decomposition it is clear that there are r orbits I1 , . . . , Ir corresponding
to σ, where Ii is the set of entries in the cycle σi . Similar considerations
applied to the second decomposition show that in fact r = s and that,
after reordering the τ ’s, we may arrange that σi and τi involve precisely the
same set of entries Ii , for each i. It is then evident, furthermore, that
these entries must appear in the same cyclic order within σi and τi , so
that σi = τi .

The permutation σ ∈ S10 above is given by σ = (1 4 7)(2 9)(3 6 5 10)(8).


It is conventional to omit all 1-cycles (corresponding to points fixed by σ)
and we can arrange the cycles in any order. Thus σ = (2 9)(1 4 7)(3 6 5 10).
We can now write the elements of S3 in a much more succinct way than in
Example 2.2.9. They are: i, α = (1 2 3), β = (1 3 2), ρ = (1 2), σ = (1 3)
and τ = (2 3).
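The procedure used above — follow each point round until it returns, then start again from a point not yet visited — is easily expressed in code. Here is a sketch of mine (not part of the notes), with a permutation given as a dictionary of images:

    def cycles(sigma):                               # disjoint cycle decomposition, 1-cycles omitted
        seen, result = set(), []
        for a in sorted(sigma):
            if a in seen:
                continue
            orbit, x = [a], sigma[a]
            while x != a:                            # follow a, sigma(a), sigma^2(a), ...
                orbit.append(x)
                x = sigma[x]
            seen.update(orbit)
            if len(orbit) > 1:
                result.append(tuple(orbit))
        return result

    # the element of S_10 considered earlier in this section
    sigma = {1: 4, 2: 9, 3: 6, 4: 7, 5: 10, 6: 5, 7: 1, 8: 8, 9: 2, 10: 3}
    print(cycles(sigma))                             # expected: [(1, 4, 7), (2, 9), (3, 6, 5, 10)]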

Warning. It is important to realise that the gaps between adjacent cycles just rep-
resent composition of functions and that when we multiply two permuta-
tions we must always work from right to left. For example, if σ is as above
and τ = (1 3 8)(5 9 7)(6 4 10), then στ = (2 9)(1 4 7)(3 6 5 10)(1 3 8)(5 9 7)(6 4 10).
To work this out we see what the effect is on each of the integers 1, . . . , 10
in turn, working from right to left. The effect of the successive cycles on
1 is as follows: 1 7→ 1 7→ 1 7→ 3 7→ 6 7→ 6 7→ 6, so that 1 7→ 6. We now
see what happens to 6: 6 7→ 4 7→ 4 7→ 4 7→ 4 7→ 7 7→ 7. Continuing in
this way, we find that στ = (1 6 7 10 5 2 9)(3 8 4).
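The right-to-left bookkeeping can be automated in the same spirit. In this sketch (mine, purely illustrative) a permutation is a list of cycles, each cycle a tuple, and the rightmost cycle acts first:

    def apply(perm, x):                              # apply a list of cycles to the point x
        for cyc in reversed(perm):                   # rightmost cycle acts first
            if x in cyc:
                x = cyc[(cyc.index(x) + 1) % len(cyc)]
        return x

    def product(p, q, n):                            # the product pq: first q, then p
        return [apply(p, apply(q, x)) for x in range(1, n + 1)]

    sigma = [(2, 9), (1, 4, 7), (3, 6, 5, 10)]
    tau   = [(1, 3, 8), (5, 9, 7), (6, 4, 10)]
    print(product(sigma, tau, 10))
    # expected: [6, 9, 8, 3, 2, 7, 10, 4, 1, 5], the one-line form of (1 6 7 10 5 2 9)(3 8 4)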

2.6.5 Proposition. (i) The order of a k-cycle equals the length k of the
cycle;
(ii) If σ = σ1 σ2 · · · σr is a permutation, expressed as a product of disjoint
cycles, then the order of σ is the least common multiple of the lengths of
the individual cycles σi .

Proof. (i) For a cycle σ = (a1 a2 . . . ak ) we have σ i (aj ) = aj+i , where we


have made the convention that ak+1 = a1 , ak+2 = a2 etc. Clearly σ k fixes
all the aj , whereas no lower power does, and of course it certainly fixes all
the other elements of I. Thus k is the lowest power such that σ k = i.
(ii) Let σj have length kj . Since the σj commute with each other, we
have σ m = σ1m σ2m · · · σrm for all m. If l = LCM (k1 , . . . , kr ), then σj^kj = i
and so σjl = i (1 ≤ j ≤ r). Hence σ l = i. This shows that the order of σ
is at most l. Now suppose that σ m = i. Then σ1m = σ2−m · · · σr−m . Write
σ1 = (a1 a2 . . . ak1 ). Since the aj lie outside the orbits of σ2 , . . . , σr the
right-hand side fixes each aj . Hence so does the left-hand side, which just

means that σ1m = i. Thus m is a multiple of k1 . Similarly it is a multiple


of k2 , . . . , kr and so l|m.

Particularly important are the 2-cycles in Sn :

2.6.6 Definition. A 2-cycle σ = (i j) is called a transposition.

Thus (i j) just interchanges i and j and leaves all the other integers
alone. We shall now see that all permutations can be generated as products
of transpositions:

2.6.7 Theorem. Every element of Sn is a product of transpositions. For


a k-cycle we have the explicit formula

(a1 a2 . . . ak ) = (a1 ak )(a1 ak−1 ) · · · (a1 a3 )(a1 a2 ) .

Proof. The formula is easily checked, by working out the effect of the right-
hand side on the ai and on the other elements of I. By combining this
with Theorem 2.6.4 it follows that all permutations can be decomposed
into transpositions.

For example (4 6 7) = (4 7)(4 6) and (1 2 3 4 5) = (1 5)(1 4)(1 3)(1 2).


Unlike the disjoint cycle decomposition of Theorem 2.6.4, there is nothing
unique about the decomposition of a permutation into transpositions. In-
deed we can also write (4 6 7) = (4 7)(6 7)(4 7)(4 6)(6 7)(4 7). However,
whilst we may be able to write (4 6 7) as a product of two or six or eighty-
eight transpositions, we shall never succeed in expressing it as the product
of thirty-seven. This is the content of the next beautiful but by no means
obvious result:

2.6.8 Theorem. The product of an even number of transpositions cannot


equal the product of an odd number of transpositions.

There are a number of different ways of proving this, some of which


are dealt with in the Exercises. The proof we shall present now depends on
the following idea. Introduce n formal variables x1 , . . . , xn and consider
the expression ∆ = ∆(x1 , . . . , xn ) = ∏i>j (xi − xj ). This means the
product of a whole lot of factors (xi − xj ), one for each pair of integers
(i, j) satisfying 1 ≤ j < i ≤ n. Those who know about such things will
recognize this as the Vandermonde determinant

                        | 1   x1   x1^2   . . .   x1^(n−1) |
                        | 1   x2   x2^2   . . .   x2^(n−1) |
∆(x1 , . . . , xn ) =   | .    .     .    . . .       .    |  .
                        | .    .     .    . . .       .    |
                        | 1   xn   xn^2   . . .   xn^(n−1) |

For example ∆(x1 , x2 ) = x2 − x1 and ∆(x1 , x2 , x3 ) = (x2 − x1 )(x3 −


x1 )(x3 − x2 ). The expression ∆(x1 , . . . , xn ) is an example of a (homoge-
neous) polynomial in n variables. We shall be discussing polynomials in a
later Chapter. For the moment, just treat it as a formal expression.
For a given permutation σ ∈ Sn , let us now define

σ · ∆ = σ · ∆(x1 , . . . , xn ) = ∆(xσ(1) , . . . , xσ(n) ).

Thus σ · ∆ is a new polynomial obtained from ∆ by permuting round the


variables. For example, when n = 3 consider the elements α = (1 2 3) and
σ = (1 3). Then

α · ∆ = ∆(xα(1) , xα(2) , xα(3) ) = (xα(2) − xα(1) )(xα(3) − xα(1) )(xα(3) − xα(2) )


= (x3 − x2 )(x1 − x2 )(x1 − x3 ) = ∆ , and
σ · ∆ = ∆(xσ(1) , xσ(2) , xσ(3) ) = (xσ(2) − xσ(1) )(xσ(3) − xσ(1) )(xσ(3) − xσ(2) )
= (x2 − x3 )(x1 − x3 )(x1 − x2 ) = −∆.

It is clear that in general σ will shuffle round the brackets in the expression
for ∆(x1 , . . . , xn ), possibly changing some of the signs, so that therefore
σ · ∆ = ±∆. This leads to the:

2.6.9 Definition. For each σ ∈ Sn the sign of σ is the integer ε(σ) = ±1


given by σ · ∆(x1 , . . . , xn ) = ε(σ)∆(x1 , . . . , xn ) . If ε(σ) = 1 we call σ
an even permutation, and if ε(σ) = −1 we call it an odd permutation.

Continuing with the calculations above, we easily check that for S3 we


have the following table:

2.6.10. The Symmetric Group S3 :

permutation sign even/odd


i 1 even
(1 2 3) 1 even
(1 3 2) 1 even
(1 2) -1 odd
(1 3) -1 odd
(2 3) -1 odd

The identity permutation is always even. Moreover:

2.6.11 Lemma. All transpositions are odd.



Proof. Relabelling the variables, it is enough to consider σ = (12). Sup-


pose we apply σ to the expression ∆ = (x2 − x1 )(x3 − x1 )(x3 − x2 )(x4 −
x1 )(x4 − x2 )(x4 − x3 ) · · · (xn − x1 )(xn − x2 ) · · · (xn − xn−1 ). Then σ re-
verses the first bracket and the remaining brackets just get shuffled around,
without any more sign changes. Hence σ · ∆ = −∆ .

The fundamental fact about the sign ε(σ) is that it is multiplicative:

2.6.12 Theorem. If σ, τ ∈ Sn , then ε(στ ) = ε(σ)ε(τ ) . Thus we have a


group homomorphism ε : Sn → {±1} , σ 7→ ε(σ) .

Proof. This can seem a bit confusing. We have σ·(τ ·∆(x1 , . . . , xn )) = σ·


∆(xτ (1) , . . . , xτ (n) ) = ∆(xσ(τ (1)) , . . . , xσ(τ (n)) ) = ∆(x(στ )(1) , . . . , x(στ )(n) ) =
(στ ) · ∆(x1 , . . . , xn ) . The first equality in the previous line is valid be-
cause when we act σ on the expression, we have to replace each variable
xi by xσ(i) . So each xτ (j) gets replaced by xσ(τ (j)) . Our equation now gives
ε(σ)ε(τ )∆(x1 , . . . , xn ) = ε(στ )∆(x1 , . . . , xn ), from which it follows that
ε(σ)ε(τ ) = ε(στ ).

We can summarize this theorem by the following multiplication table:

even odd
even even odd
odd odd even
We have seen in Theorem 2.6.7 that every permutation can be ex-
pressed as a product of transpositions. In view of Lemma 2.6.11, this
gives us an effective way of finding the sign of a permutation:

2.6.13 Proposition. The sign of a permutation σ ∈ Sn equals (−1)j ,


where σ is a product of j transpositions. Equivalently, the product of an
even (resp. odd) number of transpositions is even (resp. odd).
We can now at last supply the proof of Theorem 2.6.8 :

Proof. By the previous proposition, the product of an even number of


transpositions is even, and that of an odd number is odd. Since a per-
mutation cannot be both even and odd at the same time, the Theorem
follows.
2.6.14 Remark. In view of Theorem 2.6.7 we have the slightly an-
noying fact that a cycle of even length is odd, and a cycle of odd length
is even.
As an example, the permutation σ = (2 9)(1 4 7)(3 6 5 10) has sign
(−1)(+1)(−1) = 1 and is thus even.
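In code, the sign can be read off directly from the disjoint cycle decomposition, since a cycle of length k contributes k − 1 transpositions. A small sketch of mine (the input is assumed to be a list of disjoint cycles):

    def sign(disjoint_cycles):
        s = 1
        for cyc in disjoint_cycles:
            s *= (-1) ** (len(cyc) - 1)              # a k-cycle is a product of k - 1 transpositions
        return s

    print(sign([(2, 9), (1, 4, 7), (3, 6, 5, 10)]))  # expected: 1  (the even permutation above)
    print(sign([(1, 2)]))                            # expected: -1 (a single transposition is odd)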

The kernel of the sign homomorphism ε is especially interesting:


2.6.15 Proposition. The set An of even permutations is a normal sub-
group of index two in Sn . It is called the alternating group.
Proof. By definition An = ker ε. Since transpositions have sign −1, we
have that ε is surjective. So the First Isomorphism Theorem 2.5.15 induces

an isomorphism Sn /An → {±1}, and we are done.
2.6.16 Examples. (i) A3 = {i, (1 2 3), (1 3 2)} ;
(ii) A4 = {i, (1 2 3), (1 3 2), (1 2 4), (1 4 2), (1 3 4), (1 4 3), (2 3 4), (2 4 3),
(1 2)(3 4), (1 3)(2 4), (1 4)(2 3)} .

Recall that a subgroup H of G is normal if it is closed under conjuga-


tion. In other words, whenever h ∈ H then H also contains all conjugates
ghg −1 of h. If we want to discover normal subgroups of Sn , it is therefore
important to be able to recognize when two elements are conjugate. This
is actually very easy in Sn , and hinges on the following observation:

2.6.17 Proposition. If σ ∈ Sn and τ = (a1 a2 . . . ak ) is a k-cycle, then


στ σ −1 = (σ(a1 )σ(a2 ) . . . σ(ak )) . In other words, στ σ −1 is obtained from
τ by replacing each entry by its image under σ.

Proof. Write ρ = (σ(a1 )σ(a2 ) . . . σ(ak )). Let us calculate the effect of
each side on the elements m of I = {1, . . . , n}. If m = σ(ai ), for some
i, then στ σ −1 (m) = στ (ai ) = σ(ai+1 ) = ρ(m) (where ak+1 is to be read as a1 ). On the other hand, if m
is not equal to any σ(ai ), then σ −1 (m) is not equal to any ai , and hence
τ leaves it alone. So στ σ −1 (m) = σσ −1 (m) = m. But also ρ(m) = m,
and we have shown that στ σ −1 = ρ.

As an example, if σ = (1 3)(2 5 9)(4 7 8 6) and τ = (1 5 2 7 6 8), then


στ σ −1 = (σ(1) σ(5) σ(2) σ(7) σ(6) σ(8)) = (3 9 5 8 4 6). It is clearly a lot
quicker to work conjugates out this way, rather than directly multiplying
σ, τ and σ −1 . We can actually use this method to compute conjugates of
any permutation τ . For suppose τ = τ1 τ2 · · · τr is a product of several
cycles, then στ σ −1 = (στ1 σ −1 )(στ2 σ −1 ) · · · (στr σ −1 ), and we can work
out the separate parts as above . For instance, with σ as before, take
instead τ = (1 6)(2 4 9)(3 8 7 5). Then we have στ σ −1
= (σ(1) σ(6))(σ(2) σ(4) σ(9))(σ(3) σ(8) σ(7) σ(5)) = (3 4)(5 7 2)(1 6 8 9).
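The relabelling rule of Proposition 2.6.17 is easy to mechanize. The sketch below (the helper names are ours) applies σ to each entry of the cycles of τ and, as a check, compares the result with a direct computation of στ σ−1 as a map on {1, . . . , 9}.

    def cycles_to_map(cycles, n):
        # Convert a product of disjoint cycles into a dict {k: image of k} on {1, ..., n}.
        f = {k: k for k in range(1, n + 1)}
        for c in cycles:
            for i, a in enumerate(c):
                f[a] = c[(i + 1) % len(c)]
        return f

    n = 9
    sigma = cycles_to_map([(1, 3), (2, 5, 9), (4, 7, 8, 6)], n)
    tau   = cycles_to_map([(1, 6), (2, 4, 9), (3, 8, 7, 5)], n)

    # Proposition 2.6.17: replace each entry of tau's cycles by its image under sigma.
    relabelled = [tuple(sigma[a] for a in c) for c in [(1, 6), (2, 4, 9), (3, 8, 7, 5)]]
    print(relabelled)                                  # [(3, 4), (5, 7, 2), (1, 6, 8, 9)]

    # Direct check: sigma tau sigma^{-1} agrees with the relabelled cycles.
    sigma_inv = {v: k for k, v in sigma.items()}
    conj = {m: sigma[tau[sigma_inv[m]]] for m in range(1, n + 1)}
    print(conj == cycles_to_map(relabelled, n))        # True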
We have seen in Theorem 2.6.4 that any permutation σ may be broken
down in an essentially unique way into disjoint cycles. This leads to the

idea of the cycle-type of σ:

2.6.18 Definition. If the disjoint cycle decompositions of σ and τ involve


the same number of 2-cycles, the same number of 3-cycles and so on, we
say that they have the same cycle-type.

It can be convenient to think of the cycle-type pictorially, for example:

(• •)(• •)(• •)(• • •)(• • •)(• • • • •)

From the remarks following Proposition 2.6.17 it is clear that conjugate


permutations have the same cycle-type. The remarkable thing is that the
converse is also true:

2.6.19 Theorem. Two elements σ, τ ∈ Sn are conjugate if and only if


they have the same cycle-structure.

Proof. Let σ = σ1 σ2 · · · σr and τ = τ1 τ2 · · · τr be the disjoint cycle decom-


positions. For completeness we will include, in both cases, any 1-cycles,
and place all the 1-cycles first, then the 2-cycles, and so on. Then, moving
from left to right, the numbers appearing within the various brackets of
σ form some arrangement of 1, . . . , n. Place the expression for τ directly
beneath that for σ. For each integer k in the top row denote the en-
try beneath it by ρ(k). This defines a permutation ρ ∈ Sn , and by very
construction we have ρσρ−1 = τ .

2.6.20 Example. In S12 , consider σ = (1 5)(2 11)(3 7 12)(4 10 9 8 6)


and τ = (7 9)(3 8)(1 11 10)(2 12 5 6 4) .
Then the permutation ρ constructed in the previous proof is here given by
1 ↦ 7, 5 ↦ 9, 2 ↦ 3 etc. Following this through gives:
ρ = (1 7 11 8 6 4 2 3)(5 9)(10 12). One may verify directly that indeed
ρσρ−1 = τ .
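The construction used in this example can be automated: list the cycles of σ and of τ in matching order and send each entry of σ to the entry of τ directly beneath it. A brief Python sketch (the helper name is ours):

    def conjugator(sigma_cycles, tau_cycles):
        # The two lists must give cycles of matching lengths in the same order.
        rho = {}
        for c, d in zip(sigma_cycles, tau_cycles):
            for a, b in zip(c, d):
                rho[a] = b
        return rho

    sigma = [(1, 5), (2, 11), (3, 7, 12), (4, 10, 9, 8, 6)]
    tau   = [(7, 9), (3, 8), (1, 11, 10), (2, 12, 5, 6, 4)]
    rho = conjugator(sigma, tau)
    print(rho)    # 1 -> 7, 5 -> 9, 2 -> 3, 11 -> 8, ...  i.e. rho = (1 7 11 8 6 4 2 3)(5 9)(10 12)

One may then confirm ρσρ−1 = τ exactly as in the previous sketch.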

As our example shows, the Theorem does more than just tell us when
two permutations are conjugate. It actually provides a method for finding
an element ρ which conjugates the one into the other. By now, you may
well have noticed the following simple fact, the proof of which is left as
an exercise:

2.6.21 Proposition. The relation of conjugacy on a group G is an equiv-


alence relation. The equivalence classes are called the conjugacy classes.

Observe at once that since geg −1 = e for all g, the conjugacy class
of e is just {e}. Using Theorem 2.6.19 it is now easy to determine the
conjugacy classes in S3 and S4 . The results are as follows:

2.6.22 Example. Conjugacy Classes of S3 :

cycle-structure      conjugacy class              |class|
i                    i                            1
(• •)                (1 2), (1 3), (2 3)          3
(• • •)              (1 2 3), (1 3 2)             2

2.6.23 Example. Conjugacy Classes of S4 :

cycle-structure      conjugacy class                                        |class|
i                    i                                                      1
(• •)                (1 2), (1 3), (1 4), (2 3), (2 4), (3 4)               6
(• • •)              (1 2 3), (1 3 2), (1 2 4), (1 4 2),
                     (1 3 4), (1 4 3), (2 3 4), (2 4 3)                     8
(• • • •)            (1 2 3 4), (1 2 4 3), (1 3 2 4),
                     (1 3 4 2), (1 4 2 3), (1 4 3 2)                        6
(• •)(• •)           (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)                     3

Note that if a subgroup of G is normal, it has to be a union of complete


conjugacy classes. It cannot contain only part of a conjugacy class. Of
course, that does not mean that any union of conjugacy classes is a normal
subgroup, even if we are careful to include the class {e}. For there is no
reason why such a union should even be a subgroup: we must certainly
check for this. As an example, let us use the information in the table
above to determine all the normal subgroups of S4 :

2.6.24. Normal Subgroups of S4 :


We have to take a union of certain of the conjugacy classes, making
sure to include {i}, in such a way that the union is actually a sub-
group. Of course there are always the two extremes {i} and S4 . We
can make life easy for ourselves by using Lagrange. This tells us that the
only possible orders for any other subgroups are 2, 3, 4, 6, 8 or 12.
Bearing in mind the sizes of the classes, and the fact that we have
to include i, we very quickly find that the only possibilities are H =

{i} ∪ {(• •)(• •)} = {i, (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)} and K = {i} ∪
{(• • •)} ∪ {(• •)(• •)} = {i, (1 2 3), (1 3 2), (1 2 4), (1 4 2), (1 3 4),
(1 4 3), (2 3 4), (2 4 3), (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)} .
Of course, we still have to check that these really are subgroups: all we
know so far is that if they are subgroups, then they will be normal as well.
But we have seen K before: it is the alternating group A4 of Examples
2.6.16. As for H, we check at once that each element is self-inverse,
and that the product of any two non-identity elements is the third. So H
really is a subgroup. It is often denoted V and called Klein’s Vierergruppe
(“four-group”).
In summary, S4 has precisely four normal subgroups: {i}, V, A4 and S4 .
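If desired, the closure checks can be done by brute force. The following Python sketch (all helper names ours; products read as composition of maps) verifies that V is closed under multiplication and under conjugation by every element of S4; running the same loops over the twelve even permutations confirms the corresponding facts for A4.

    from itertools import permutations

    S4 = [dict(zip(range(1, 5), p)) for p in permutations(range(1, 5))]

    def compose(g, h):            # (g o h)(x) = g(h(x))
        return {x: g[h[x]] for x in g}

    def inverse(g):
        return {v: k for k, v in g.items()}

    def perm(*cycles):            # build a permutation of {1, 2, 3, 4} from cycles
        f = {k: k for k in range(1, 5)}
        for c in cycles:
            for i, a in enumerate(c):
                f[a] = c[(i + 1) % len(c)]
        return f

    V = [perm(), perm((1, 2), (3, 4)), perm((1, 3), (2, 4)), perm((1, 4), (2, 3))]

    closed = all(compose(a, b) in V for a in V for b in V)
    normal = all(compose(compose(g, v), inverse(g)) in V for g in S4 for v in V)
    print(closed, normal)         # True True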

2.6.25. Let H be a subgroup of a group G and let x, y ∈ H. Suppose


that x and y are conjugate elements of the group H. This means that
there is an element h ∈ H such that hxh−1 = y. Since h is, of course,
also in G, this very equation shows that x and y remain conjugate within
the group G. But the converse is false. Suppose to start with that x and
y are conjugate in the group G. Thus there is an element g ∈ G such
that gxg −1 = y. There is more elbow-room in G for such an element to
exist than in H, and it may very well not be possible to find some h in H
satisfying hxh−1 = y. In summary: x and y conjugate in G does not imply
x and y conjugate in H.

2.6.26 Example. Consider the subgroup A4 of S4 and the elements σ =


(1 2 3) and τ = (1 3 2) of A4 . These have the same cycle-structure, hence
are conjugate in S4 . Indeed, the element ρ = (2 3) does the job, namely

τ = ρσρ−1 . But a direct check of all the possibilities shows that no


element ρ of A4 works, and so σ and τ are not conjugate in A4 .

To round off this section let us think some more about the ways in
which a permutation can be written as a product of disjoint cycles. Let
us begin with the example of σ = (1 2)(3 4 5) in S5 . Even if we insist on
keeping the 2-cycle before the 3-cycle, there are still 2×3 = 6 ways of writ-
ing σ, namely (1 2)(3 4 5), (1 2)(4 5 3), (1 2)(5 3 4), (2 1)(3 4 5), (2 1)(4 5 3)
and (2 1)(5 3 4) . This is because a k-cycle can be denoted in k different
ways, by cyclically permuting round the entries. In just the same way, every
permutation of the form (• •)(• • •) can be written in six ways. Suppose
we wish to determine the size of the conjugacy class (• •)(• • •). We can
fill in the five slots with the numbers 1, . . . , 5 in 5! = 120 ways. But in so
doing, we have actually counted each permutation six times over. So the
true size of this class is 120/6 = 20.
Consider now the more substantial example of the permutation τ =
(1)(2)(3 4)(5 6)(7 8)(9 10 11)(12 13 14) in S14 . This has the cycle-structure
(•)(•)(• •)(• •)(• •)(• • •)(• • •), where we have made sure to put in
the 1-cycles. Not only can each cycle in τ be cyclically reordered, but also
the 1-cycles can be placed in either of 2! = 2 orders, and likewise for the
3-cycles. And the 2-cycles may be rearranged in 3! = 6 ways. For example
we can also write τ = (2)(1)(5 6)(7 8)(3 4)(12 13 14)(9 10 11). All in all,
including 1-cycles, τ can be written in 2!×3!×2!×23 ×32 = 1728 different
ways. Now there are 14! ways to fill in the slots of (•)(•)(• •)(• •)(• •)(• •
•)(• • •) with the numbers 1, . . . , 14. So the size of this conjugacy class
is 14!/1728 = 50450400. This method clearly extends to prove the next
result. The reader may supply the formal details.

2.6.27 Theorem. Let σ ∈ Sn involve ki i-cycles (1 ≤ i ≤ r) in its disjoint
cycle form. Then the conjugacy class of σ has

    n! / (k1 ! k2 ! · · · kr ! · 1^k1 2^k2 · · · r^kr )

elements.
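The formula is straightforward to evaluate; the short Python sketch below (the function name is ours) recovers both the count 20 found above for the class (• •)(• • •) in S5 and the count 50450400 for the S14 example.

    from math import factorial

    def class_size(n, cycle_counts):
        # cycle_counts: dict {i: k_i} giving the number k_i of i-cycles, including 1-cycles;
        # the lengths must add up to n.
        assert sum(i * k for i, k in cycle_counts.items()) == n
        denom = 1
        for i, k in cycle_counts.items():
            denom *= factorial(k) * i ** k
        return factorial(n) // denom

    print(class_size(5, {2: 1, 3: 1}))           # 20
    print(class_size(14, {1: 2, 2: 3, 3: 2}))    # 50450400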

Suppose σ, τ ∈ Sn have the same cycle structure, so that we there-


fore know that they are conjugate. Let us finally think a bit more about
how to actually find an element ρ ∈ Sn which conjugates the one into the
other. This is easy. To take an example, consider σ = (1 4)(2 3 5) and τ =
(2 3)(1 4 5) in S5 . For any ρ we know that ρσρ−1 = (ρ(1) ρ(4))(ρ(2) ρ(3) ρ(5)).
So this must match up with τ . But we have to remember that by moving
round the entries within each cycle τ can be written in six different ways
according to the scheme (• •)(• • •). Let us write the possibilities down
as a table:

σ :   (1 4)(2 3 5)

τ :   (2 3)(1 4 5)
      (2 3)(4 5 1)
      (2 3)(5 1 4)
      (3 2)(1 4 5)
      (3 2)(4 5 1)
      (3 2)(5 1 4)

Sending each entry of σ to the one directly beneath it in the first ex-
pression for τ defines a permutation

      ρ = ( 1 4 2 3 5
            2 3 1 4 5 )  = (1 2)(3 4)
with the property that ρσρ−1 = τ . Using instead the other representa-
tions of τ leads to five more elements ρ with the desired property namely
ρ = (1 2 4 3 5), (1 2 5 4 3), (1 3 4 2), (1 3 5)(2 4) and (1 3)(2 5 4) . These six

elements are all of the permutations ρ which conjugate σ into τ .
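A brute-force search over all 120 elements of S5 confirms that these six are the only conjugators. A short Python sketch (helper names ours; the condition ρσρ−1 = τ is tested in the equivalent form ρσ = τρ):

    from itertools import permutations

    def as_map(p):                       # p is a tuple of images of 1, ..., 5
        return dict(zip(range(1, 6), p))

    def perm(*cycles):
        f = {k: k for k in range(1, 6)}
        for c in cycles:
            for i, a in enumerate(c):
                f[a] = c[(i + 1) % len(c)]
        return f

    sigma = perm((1, 4), (2, 3, 5))
    tau   = perm((2, 3), (1, 4, 5))

    conjugators = [rho for rho in map(as_map, permutations(range(1, 6)))
                   if all(rho[sigma[k]] == tau[rho[k]] for k in range(1, 6))]
    print(len(conjugators))              # 6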

As a final example, we will compute the centralizer of an element of


Sn . First the definition, and a result which is left to you to prove:

2.6.28 Proposition. If G is a group and x ∈ G, then the set C(x) =


{g ∈ G : gxg −1 = x} = {g ∈ G : gx = xg} of all elements which
commute with x is a subgroup of G. It is called the centralizer of x .

2.6.29 Example. We compute the centralizer in S7 of the permutation


σ = (1 3)(2 7)(4 6 5), using the method above. Here we must be very
careful to remember that the two transpositions can be placed in either
order. The table is:

rewriting of σ              ρ

(1 3)(2 7)(4 6 5)           i
(1 3)(2 7)(6 5 4)           (4 6 5)
(1 3)(2 7)(5 4 6)           (4 5 6)
(1 3)(7 2)(4 6 5)           (2 7)
(1 3)(7 2)(6 5 4)           (2 7)(4 6 5)
(1 3)(7 2)(5 4 6)           (2 7)(4 5 6)
(3 1)(2 7)(4 6 5)           (1 3)
(3 1)(2 7)(6 5 4)           (1 3)(4 6 5)
(3 1)(2 7)(5 4 6)           (1 3)(4 5 6)
(3 1)(7 2)(4 6 5)           (1 3)(2 7)
(3 1)(7 2)(6 5 4)           (1 3)(2 7)(4 6 5)
(3 1)(7 2)(5 4 6)           (1 3)(2 7)(4 5 6)
(2 7)(1 3)(4 6 5)           (1 2)(3 7)
(2 7)(1 3)(6 5 4)           (1 2)(3 7)(4 6 5)
(2 7)(1 3)(5 4 6)           (1 2)(3 7)(4 5 6)
(2 7)(3 1)(4 6 5)           (1 2 3 7)
(2 7)(3 1)(6 5 4)           (1 2 3 7)(4 6 5)
(2 7)(3 1)(5 4 6)           (1 2 3 7)(4 5 6)
(7 2)(1 3)(4 6 5)           (1 7 3 2)
(7 2)(1 3)(6 5 4)           (1 7 3 2)(4 6 5)
(7 2)(1 3)(5 4 6)           (1 7 3 2)(4 5 6)
(7 2)(3 1)(4 6 5)           (1 7)(2 3)
(7 2)(3 1)(6 5 4)           (1 7)(2 3)(4 6 5)
(7 2)(3 1)(5 4 6)           (1 7)(2 3)(4 5 6)

The elements in the right-hand column comprise the subgroup C(σ)


of order 24 inside S7 .
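This, too, can be confirmed by brute force: the sketch below (helpers ours) runs through all 5040 elements of S7 and counts those commuting with σ.

    from itertools import permutations

    def perm(*cycles):                   # permutation of {1, ..., 7} from cycles
        f = {k: k for k in range(1, 8)}
        for c in cycles:
            for i, a in enumerate(c):
                f[a] = c[(i + 1) % len(c)]
        return f

    sigma = perm((1, 3), (2, 7), (4, 6, 5))

    centralizer = [g for g in (dict(zip(range(1, 8), p)) for p in permutations(range(1, 8)))
                   if all(g[sigma[k]] == sigma[g[k]] for k in range(1, 8))]
    print(len(centralizer))              # 24, in agreement with the table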

Exercises:

1. Write the following permutations in disjoint cycle form:


( 1 2 3 4 5 6        ( 1 2 3 4 5 6 7 8
  3 6 5 2 1 4 ) ,      6 5 7 1 2 8 3 4 ) .

2. Check that the relation ∼ defined after Proposition ?? is an equivalence


relation.

3. Prove Proposition 2.6.21.

4. Show that a group is abelian if and only if all its conjugacy classes
are of size one.

5. Consider the set of permutations S = {i, (1 2 3), (1 3 2), (1 2), (1 3), (2 3)}
in S4 . Show that for distinct g, h ∈ S we cannot have g −1 h ∈ V , the Klein
group, and hence that the six elements of S represent different cosets of
V . Deduce that S4 /V ≅ S3 .

6. Use Theorem 2.6.27 to check the sizes of the conjugacy classes in
Examples 2.6.22 and 2.6.23.

7. Prove Proposition 2.6.28.

8. Calculate the centralizers of


(a) σ = (1 2 3 4 5) in S5
(b) σ = (1 2 3)(4 5 6) in S6 .

9. Let σ ∈ Sn involve ki i-cycles (1 ≤ i ≤ r) in its disjoint cycle form. By
analyzing the method of Example 2.6.29, show that the centralizer C(σ)
has order k1 ! k2 ! · · · kr ! · 1^k1 2^k2 · · · r^kr .

2.7 Examples: Symmetries and Matrices


By this point we have developed a fair amount of machinery and it is time
to illustrate this with some more examples. Apart from the groups R and
C of real and complex numbers, the cyclic groups, such as Z and Zn ,
and the permutation groups Sn , together with all the various subgroups,
quotients and products we can form from these, there are very many other
ways in which groups arise naturally in Mathematics. In this section we
will look at two such ways.

2.7.1 Symmetries
Consider the set R2 = R × R, which we may visualise as the ordinary
Euclidean plane with its system of x- and y-axes. If P = (x, y) and
P ′ = (x′ , y ′ ) are two points of R2 , we have the usual notion of the
distance in the plane d(P, P ′ ) = √((x − x′ )2 + (y − y ′ )2 ) between them.
We will be interested in those functions from R2 to itself that preserve
distances in the following sense:

2.7.1 Definition. An isometry of the plane is a bijection f : R2 → R2


such that d(f (P ), f (P ′ )) = d(P, P ′ ) for all P, P ′ ∈ R2 .

It is shown in geometry courses that the isometries of the plane fall


into four types, and we shall accept this without proof here, so as not to
interrupt the flow. The types are:

(i) translations: we slide or translate each point of the plane the same
amount and in the same direction. Thus f (x, y) = (x + a, y + b), for some
fixed a, b ∈ R.
(ii) rotations: all points are rotated through a fixed angle about a given
point of the plane.
(iii) reflections: we fix a particular line in the plane and then send each
point to its reflection, or mirror-image, in this line.
(iv) glide-reflections: these are by definition the compositions of a reflec-
tion with a translation.
Under composition, the isometries form a group:

2.7.2 Proposition. The set Isom(R2 ) of all isometries of the plane is a


subgroup of the symmetric group SR2 .

Proof. Clearly the identity i is an isometry. Now let f and g be two


isometries. Then d((g ◦ f )(P ), (g ◦ f )(P ′ )) = d(g(f (P )), g(f (P ′ ))) =
d(f (P ), f (P ′ )) = d(P, P ′ ), so that g ◦ f is an isometry. Finally we have
d(P, P ′ ) = d(f (f −1 (P )), f (f −1 (P ′ ))) = d(f −1 (P ), f −1 (P ′ )), proving that
f −1 is also an isometry.

Now for any subset or “figure” F in the plane, we may consider just
those isometries f of the plane which preserve F , in the sense that f (F ) =
F:

2.7.3 Definition. If F is a subset of R2 , then a symmetry of F is an


isometry f of the plane such that f (F ) = F .

Observe that any figure at all, no matter how “unsymmetrical”, has at


least one symmetry, namely the identity map i. You are left to prove the
next simple result:

2.7.4 Proposition. The set S(F ) of all symmetries of F is a


subgroup of Isom(R2 ).

If F = R2 or ∅, then being a symmetry clearly places no restriction on


an isometry, and we have S(R2 ) = S(∅) = Isom(R2 ). A more interesting
example is:

2.7.5 Example. The Disk Group: Let D = {(x, y) ∈ R2 : x2 + y 2 ≤ 1}


be the set of points of the plane within distance 1 of the origin. We call
this the unit disk. If we identify R2 with C, this amounts to D = {z ∈ C :
|z| ≤ 1}. It is intuitively clear, and can be proved formally, that the group
S(D) cannot contain any non-trivial translations or glide-reflections, and
that it consists precisely of all rotations about the origin, together with all
reflections in lines passing through the origin.

Let us introduce some notation for these symmetries. For any θ ∈ R let
rθ be the symmetry of anticlockwise rotation around the origin through
an angle of θ radians, and let s be reflection in the x-axis. We leave you
to check that the formulas for these symmetries are

rθ (x, y) = (x cos θ − y sin θ, x sin θ + y cos θ) and s(x, y) = (x, −y).

You are further left to check that the product srθ (in other words, the
composite s ◦ rθ ), which is given by the formula (srθ )(x, y) = (x cos θ −
y sin θ, −x sin θ − y cos θ) represents reflection in the line x sin(θ/2) +
y cos(θ/2)
= 0 obtained by rotating the x-axis clockwise about the origin through
an angle of θ/2. As θ varies we obtain in this way all reflections in lines
through the origin. Hence S(D) = {rθ : θ ∈ R} ∪ {srθ : θ ∈ R}. Note
that there are repetitions here: if θ ≡ ϕ (mod 2π), i.e. θ − ϕ is a multiple
of 2π, then rθ = rϕ . So if we wanted we could restrict θ to lie in the range
0 ≤ θ < 2π.

The element s has order 2, and so s = s−1 . Moreover (rθ s)(x, y) =


(x cos θ + y sin θ, x sin θ − y cos θ) = (sr−θ )(x, y). Hence:

rθ s = sr−θ , or equivalently srθ s−1 = r−θ .

The group structure is determined by the following table, which shows how

to multiply any two rotations or reflections:

          rϕ         srϕ
rθ        rθ+ϕ       srϕ−θ
srθ       srθ+ϕ      rϕ−θ

This is easily determined. For example rθ (srϕ ) = (rθ s)rϕ = (sr−θ rϕ ) =


srϕ−θ . Of course the identity is r0 = i.
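These relations are easy to test numerically from the explicit formulas for rθ and s. A quick Python sketch (the functions are ours, checked at a few sample points and one sample angle; floating-point equality is tested with isclose):

    from math import cos, sin, isclose

    def r(theta):
        return lambda x, y: (x * cos(theta) - y * sin(theta), x * sin(theta) + y * cos(theta))

    def s(x, y):
        return (x, -y)

    def compose(f, g):
        return lambda x, y: f(*g(x, y))

    theta = 0.73
    lhs = compose(r(theta), s)           # r_theta s
    rhs = compose(s, r(-theta))          # s r_{-theta}
    for (x, y) in [(1.0, 0.0), (0.3, -0.8), (-0.5, 0.2)]:
        a, b = lhs(x, y)
        c, d = rhs(x, y)
        print(isclose(a, c) and isclose(b, d))    # True each time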

Note that this symmetry group can be described in other ways: S(D) =
S(B) = S(T) = S({(0, 0)}), where B = {(x, y) ∈ R2 : x2 + y 2 < 1}
and T = {(x, y) ∈ R2 : x2 + y 2 = 1}.

2.7.6 Example. The Dihedral Group Dn : Consider a regular n-sided poly-


gon or n-gon Pn in the plane, where n ≥ 3. For definiteness let us take the
vertices to be equally spaced around the unit circle T with one of them at
the point (1, 0). In complex number notation this means that the vertices
are the points e2πki/n (0 ≤ k < n). The symmetry group S(Pn ) is written
Dn and is called a dihedral group. Once more it can only contain rotations
and reflections. It contains the reflection s in the x-axis, and just those
rotations rθ for which θ is a multiple of 2π/n. If we write r = r2π/n , then
we have:
Dn = {i, r, r2 , . . . , rn−1 , s, sr, sr2 , . . . , srn−1 }

and Dn is a group of order 2n. We have srs−1 = r−1 and the set of
rotations H = {i, r, r2 , . . . , rn−1 } is a normal subgroup of index 2. The
remaining elements are reflections in the various axes of symmetry making
angles of kπ/n with the x-axis. Since sr ≠ rs, the dihedral group Dn is
nonabelian.

Let us now label the vertices of Pn as v1 , . . . , vn in anticlockwise


order starting with v1 = (1, 0). Thus, in complex number notation,
vk = e2π(k−1)i/n (1 ≤ k ≤ n). Then V = {v1 , . . . , vn } ⊂ R2 and each
symmetry f ∈ Dn clearly restricts to give a permutation of the set V . If
we now simplify the notation and just label the vertices as 1, . . . , n, this
means that f gives rise to a permutation σf in the symmetric group Sn .
Moreover σf evidently determines f . Since the operation in both groups
is just composition of functions, we have therefore proved:

2.7.7 Proposition. The mapping f ↦ σf gives an injective homomor-
phism Dn ↪ Sn .

The hook on the arrow is a useful piece of notation, signifying that the
mapping is injective. To all intents and purposes we may identify Dn with
its image under this mapping, and regard Dn as a subgroup of Sn . Let us
see how this works out in a couple of concrete cases:

2.7.8 Example. The Triangle Group D3 : Following the procedure above


we find that
i  = ( 1 2 3      r  = ( 1 2 3      r2 = ( 1 2 3      s  = ( 1 2 3
       1 2 3 ) ,         2 3 1 ) ,         3 1 2 ) ,         1 3 2 ) ,

sr = ( 1 2 3      sr2 = ( 1 2 3
       3 2 1 )            2 1 3 ) .

Thus D3 = S3 and in terms of the notation used in Example 2.2.9 we


have r = α, r2 = β, s = τ, sr = σ and sr2 = ρ.

2.7.9 Example. The Square (or Octic) Group D4 : We regard D4 as the


subgroup of S4 consisting of the permutations
i  = ( 1 2 3 4      r  = ( 1 2 3 4      r2 = ( 1 2 3 4
       1 2 3 4 ) ,         2 3 4 1 ) ,         3 4 1 2 ) ,

r3 = ( 1 2 3 4      s  = ( 1 2 3 4      sr = ( 1 2 3 4
       4 1 2 3 ) ,         1 4 3 2 ) ,         4 3 2 1 ) ,

sr2 = ( 1 2 3 4     sr3 = ( 1 2 3 4
        3 2 1 4 )           2 1 4 3 ) .
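As a check on such tables, one can generate by machine the subgroup of S4 produced by the two permutations r and s. A small Python sketch (helpers ours; a permutation is stored as the tuple of images of 1, 2, 3, 4):

    def compose(g, h):                   # (g o h)(k) = g(h(k))
        return tuple(g[h[k - 1] - 1] for k in range(1, 5))

    r = (2, 3, 4, 1)                     # rotation: vertex k goes to vertex k + 1 (mod 4)
    s = (1, 4, 3, 2)                     # reflection in the x-axis

    group = {(1, 2, 3, 4)}
    frontier = [(1, 2, 3, 4)]
    while frontier:                      # close up under multiplication by r and s
        g = frontier.pop()
        for gen in (r, s):
            h = compose(gen, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    print(len(group))                    # 8, the eight elements listed above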

2.7.2 Matrix Groups

Many of the most interesting groups to arise in Mathematics do so as


groups of matrices. We assume here that the reader is familiar with the
meaning and basic properties of matrices. Those who are not can refer
to the Appendix on matrices at the end of the book. We are here only
interested in square matrices. Recall that a real (n × n) square matrix

    A = (aij ) = ( a11 · · · a1n )
                 (  ⋮    ⋱    ⋮  )
                 ( an1 · · · ann )

is an array of real numbers aij . The set of all such matrices is denoted
Mn (R). We can similarly talk about the sets Mn (C), Mn (Q), Mn (Z) of
complex, rational or integral matrices. Two (n×n) matrices A = (aij ) and
B = (bij ) have a product AB = C = (cij ), defined by cij = Σk aik bkj ,
and this defines an operation on Mn (R). There is an identity element

I = (δij ), where δij = 1 (i = j), 0 (i ≠ j) (the Kronecker delta). It is


a basic fact that matrix multiplication is associative (see the Appendix),
and thus we have:

2.7.10 Proposition. The set Mn (R) is a monoid under matrix multipli-


cation. Likewise for Mn (C), Mn (Q) and Mn (Z).

If a matrix A ∈ Mn (R) has an inverse, we call it invertible or nonsin-


gular. We now have:

2.7.11 Proposition. The set of all invertible matrices in Mn (R) is a


group under matrix multiplication. It is denoted GLn (R) and called the
real general linear group. Likewise for the complex, rational and integral
general linear groups GLn (C), GLn (Q) and GLn (Z).

Proof. Write G = GLn (R). If A and B are invertible, then so is AB, and
(AB)−1 = B −1 A−1 . Thus matrix multiplication does indeed restrict to
give an operation on the subset G. We already know that this operation
is associative, and there is an identity I ∈ G. Finally, by very definition,
each element A ∈ G has an inverse A−1 ∈ G.

Associated to each matrix A = (aij ) there is an important number, its


determinant, written

    det A = | a11 · · · a1n |
            |  ⋮    ⋱    ⋮  |
            | an1 · · · ann |

It is not our purpose now to set up and develop the theory of determinants,
interesting though it is. It will suffice for now to know that determinants

exist and have certain nice properties. For completeness, we give the
definition here, and refer you to the Appendix for more information.

2.7.12 Definition. The determinant of an n × n matrix A = (aij ) is


defined by

    det A = Σσ∈Sn ε(σ) a1σ(1) a2σ(2) · · · anσ(n) .

In the (2 × 2) and (3 × 3) cases the formulas are:

    | a11 a12 |
    | a21 a22 | = a11 a22 − a12 a21

and

    | a11 a12 a13 |
    | a21 a22 a23 | = a11 a22 a33 + a12 a23 a31 + a13 a21 a32
    | a31 a32 a33 |   − a11 a23 a32 − a13 a22 a31 − a12 a21 a33
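The definition translates directly into (very inefficient, but instructive) code. The sketch below (the functions are ours) sums over all permutations, with ε(σ) computed by counting inversions as in Section 2.6, and agrees with the 3 × 3 expansion above on a sample matrix.

    from itertools import permutations

    def sign(perm):
        # perm[i] plays the role of sigma(i + 1); count inversions as in Section 2.6.
        n = len(perm)
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        return (-1) ** inv

    def det(A):
        n = len(A)
        total = 0
        for p in permutations(range(n)):         # p[i] corresponds to sigma(i + 1) - 1
            term = sign(p)
            for i in range(n):
                term *= A[i][p[i]]
            total += term
        return total

    A = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 10]]
    print(det(A))        # -3, the same value the 3 x 3 formula above gives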

For the moment let us accept the following

2.7.13 Facts. 1) det(AB) = (det A)(det B)


2) det I = 1
3) det(A−1 ) = (det A)−1
4) A matrix A in Mn (R), Mn (C) or Mn (Q) is invertible in that monoid
if and only if det A ≠ 0. For Mn (Z) the condition is that det A = ±1.

These facts show that the determinant gives a group homomorphism


det : GLn (R) → R∗ , A ↦ det A. The kernel is an important subgroup:

2.7.14 Definition. The real special linear group is the group SLn (R) =
{A ∈ GLn (R) : det A = 1}. Likewise for the complex, rational and
integral special linear groups SLn (C), SLn (Q) and SLn (Z).

We have:

2.7.15 Proposition. For K = R, C and Q, the determinant homomor-
phism induces an isomorphism GLn (K)/SLn (K) → K ∗ . It likewise in-
duces an isomorphism GLn (Z)/SLn (Z) → {±1}.

Proof. In the first three cases the homomorphism det : GLn (K) → K ∗ is
surjective. For, given any a ∈ K ∗ , the diagonal matrix A = diag(a, 1, . . . , 1),
with a in the (1, 1) entry and 1’s in the remaining diagonal entries, has
determinant a. By definition, the kernel is SLn (K), and the result follows
from Theorem 2.5.15. The argument is similar in the last case.

Exercises:

1. Check carefully all the calculations in Example 2.7.6.

2. Show that every reflection t = srθ in the disk group S(D) has or-
der two, and that trϕ t−1 = r−ϕ and t(srϕ )t−1 = sr2θ−ϕ .

3. Show that the set of rotations H = {rθ : θ ∈ R} is a normal subgroup


of S(D) and that S(D)/H is cyclic of order two.

4. (i) Find the orders of all the elements of D4 ;


(ii) Find the centre of D4 ;
(iii) Show that D4 is not isomorphic to the quaternion group Q of Example
2.5.10.

5. Let G be a nonabelian group of order 2p , where p is a prime ≥ 3.


(i) Show that there exists an element a in G of order p.
(ii) Suppose xp = e. By considering the image x̄ in the quotient group
G/< a > (see Proposition 2.5.6), show that x ∈ < a >.
(iii) Deduce that x2 = e for all x ∉ < a >.
(iv) Choose any element b ∉ < a >. Show that G = {e, a, a2 , . . . , ap−1 ,
b, ba, ba2 , . . . , bap−1 }, where ap = b2 = e and bab = a−1 .
(v) Deduce that G is isomorphic to the dihedral group Dp .
This shows that, up to isomorphism, there is just one nonabelian group
of order 2p. By Exercise there is also precisely one abelian group of this
order, the cyclic group.