Notes Functional Analysis (2)

The document is a comprehensive outline of a course on Real and Functional Analysis, authored by Marco Di Francesco from the University of L'Aquila. It covers various topics including function spaces, measure and integration, bounded linear operators, and Hilbert spaces, emphasizing the transition from finite-dimensional linear algebra to infinite-dimensional functional analysis. The introduction highlights the significance of functional analysis in solving problems involving functions defined on infinite sets, contrasting it with traditional linear algebra approaches.

Real and Functional Analysis

Marco Di Francesco

University of L’Aquila (Italy)

Contents

0 Why functional analysis?

I Function spaces, measure and integration

1 Metrics, norms, topologies
1.1 Metrics and norms
1.2 Convergence
1.3 Continuity
1.4 Topological spaces
1.5 The topology of metric spaces
1.6 Compactness
1.7 Finite-dimensional Banach spaces
1.8 $\ell^p$ spaces
1.9 Exercises

2 Spaces of continuous functions
2.1 Convergence of function sequences
2.2 Spaces of continuous functions
2.3 Compact subsets of $C(K)$
2.4 The contraction mapping theorem
2.5 Exercises

3 Measure and integration. $L^p$ spaces
3.1 Integrals and measures
3.2 An overview of Lebesgue measure theory
3.3 Lebesgue integral
3.4 $L^p$ spaces
3.5 Convolution, regularisation and $L^p_{\mathrm{loc}}$ spaces
3.6 A criterion for strong compactness in $L^p$
3.7 Exercises
II Bounded linear operators and Hilbert spaces

4 Introduction to linear operators on Banach spaces
4.1 Bounded linear maps
4.2 The kernel and range of a linear map
4.3 Compact operators
4.4 Dual spaces
4.5 An overview of fundamental principles of functional analysis
4.6 Weak topologies and weak convergences
4.7 Weak convergences in $\ell^p$ and $L^p$ spaces
4.8 Exercises

5 Hilbert spaces
5.1 Inner products
5.2 Orthogonality
5.3 Orthonormal bases
5.4 The dual of a Hilbert space
5.5 Exercises

6 Bounded operators on Hilbert spaces and spectral theory
6.1 The adjoint of an operator
6.2 Weak convergence in a Hilbert space
6.3 The spectrum
6.4 The spectral theorem for compact, self-adjoint operators
6.5 More on compact operators
6.6 Exercises

The Exam

References

Why functional analysis?¹

Recall a typical problem from linear algebra.

0.1 problem. Let $A$ be an $n \times n$ matrix with real entries, and let $y \in \mathbb{R}^n$ be a given vector. Find all the vectors $x \in \mathbb{R}^n$ such that

$$Ax = y. \tag{1}$$

The vector $x$ has to be determined. Borrowing geometrical terminology, we call the vectors $x, y$ points. The typical approach to this problem uses vector spaces, or linear spaces ($\mathbb{R}^n$), and linear mappings, or linear operators ($A$), all of which come from linear algebra.
Now note that a point $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ can be viewed as a function $x : \{1, \ldots, n\} \to \mathbb{R}$, i.e.

$$\mathbb{R}^n = \{ x : \{1, \ldots, n\} \to \mathbb{R} \}.$$

Thus, points are actually functions defined on a finite set.

Unlike linear algebra, functional analysis deals with problems (equations) where the sought-after object (unknown) is a function defined on an infinite set.

An example of such a problem:

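0.2 problem. Let $\alpha, y_0 \in \mathbb{R}$ and let $g : [0, +\infty) \to \mathbb{R}$ be a continuous function. Find all the differentiable functions $y : [0, +\infty) \to \mathbb{R}$ such that

$$\begin{cases} y'(t) = \alpha y(t) + g(t) & \text{for all } t \geq 0, \\ y(0) = y_0. \end{cases}$$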

This is a Cauchy problem for a linear differential equation. Integrating the differential equation on the interval $[0, t]$ for an arbitrary $t > 0$ we get

$$y(t) - \alpha \int_0^t y(s)\,ds = y_0 + \int_0^t g(s)\,ds. \tag{2}$$

The unknown is now a differentiable function $y$ on the half line $[0, +\infty)$. Can we view the set of differentiable functions on $[0, +\infty)$ as a linear space and regard its elements $y$ as points in the space? Can we consider the mapping

$$y = y(t) \mapsto A[y] = A[y](t), \qquad A[y](t) = y(t) - \alpha \int_0^t y(s)\,ds$$

as a linear operator acting on the ‘point’ $y$ of such a linear space? If so, then the expression (2) could be written as

$$A[y](t) = h(t), \qquad h(t) := y_0 + \int_0^t g(s)\,ds,$$

and the latter expression is reminiscent of (1). The main difference from (1) is that here the unknown $y$ is a function on an infinite set, namely the half-line $[0, +\infty)$.

¹ This chapter is just introductory; it is NOT included in the exam.
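
As a rough numerical illustration, which is not part of the original notes, the operator $A$ from problem 0.2 can be sampled on a grid. The helper name `apply_A`, the grid on $[0, 1]$, the trapezoidal rule and the test functions are all assumptions made for this sketch. In the special case $g \equiv 0$ the function $y(t) = y_0 e^{\alpha t}$ solves the Cauchy problem, so $A[y]$ should agree with the constant $h \equiv y_0$ up to quadrature error. The second check probes the linearity of $A$ on one pair of functions.

```python
import math

def apply_A(y, alpha, ts):
    """Sample A[y](t) = y(t) - alpha * int_0^t y(s) ds on the increasing grid ts.

    The integral is approximated with the cumulative trapezoidal rule,
    so the result is only an approximation of the true operator.
    """
    values = []
    integral = 0.0
    for k, t in enumerate(ts):
        if k > 0:
            dt = ts[k] - ts[k - 1]
            integral += 0.5 * dt * (y(ts[k - 1]) + y(ts[k]))
        values.append(y(t) - alpha * integral)
    return values

alpha, y0 = 0.5, 2.0                      # illustrative parameters (assumed)
ts = [k / 1000 for k in range(1001)]      # grid on [0, 1]

# Case g = 0: y(t) = y0 * exp(alpha * t) solves y' = alpha * y, y(0) = y0,
# hence A[y](t) should be close to h(t) = y0 for every t.
y = lambda t: y0 * math.exp(alpha * t)
residual = max(abs(a - y0) for a in apply_A(y, alpha, ts))

# Linearity probe: A[2u + 3v] versus 2 A[u] + 3 A[v].
u = lambda t: math.sin(t)
v = lambda t: t * t
lhs = apply_A(lambda t: 2.0 * u(t) + 3.0 * v(t), alpha, ts)
rhs = [2.0 * a + 3.0 * b
       for a, b in zip(apply_A(u, alpha, ts), apply_A(v, alpha, ts))]
linearity_gap = max(abs(p - q) for p, q in zip(lhs, rhs))

print(residual, linearity_gap)
```

Both printed numbers should be small: the first is limited by the quadrature error, the second only by floating point rounding.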

Functional analysis provides the best setting to adapt the notions of linear space and linear operator to functions on infinite sets.

In the example of problem 0.2, the candidate linear space to work with is
the space of differentiable functions on the half line [0, +∞). This is actually
a subset of a wider set, the set of continuous functions on [0, +∞). Let us focus
on this set for a moment. For simplicity, we replace the half line [0, +∞) with
the closed interval [0, 1].

0.3 fact. The set of continuous functions on the closed interval $[0, 1]$ is a linear space. The concept of linear space should be well known to the student. We may think of a linear space as a set, the elements of which are called vectors. On such a set we are allowed to take sums of two (or more) vectors and to multiply a vector by a real number. In both cases, the operation should produce an element of the same set as an outcome. Two such operations are trivially defined on this space (exercise!).
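
One natural candidate for these two operations is the pointwise definition below. It is stated here as an assumption, since the notes leave the choice to the reader:

```latex
(f + g)(t) := f(t) + g(t), \qquad (\lambda f)(t) := \lambda\, f(t), \qquad t \in [0, 1],
```

for continuous $f, g$ on $[0, 1]$ and $\lambda \in \mathbb{R}$. What remains of the exercise is to check that $f + g$ and $\lambda f$ are again continuous on $[0, 1]$.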

The concepts of linear dependence, basis of a linear space, span of a set of vectors,
dimension of a linear space are well-known from linear algebra.

0.4 fact. The linear space $C([0, 1])$ defined above has dimension $+\infty$, or more precisely there exists no finite integer $n$ such that the dimension of $C([0, 1])$ equals $n$. Let us first recall from linear algebra what the dimension of a linear space is: it is the largest possible integer number $N$ of linearly independent vectors, i.e. the largest possible number of vectors $v_1, \ldots, v_N$ for which $\alpha_1 v_1 + \ldots + \alpha_N v_N = 0$ implies $\alpha_1 = \ldots = \alpha_N = 0$, i.e. the largest possible number of vectors such that none of them can be written as a linear combination of the others. Now, how to prove that the dimension of $C([0, 1])$ is infinite? The best option is to assume that it is finite and reach a contradiction, which would prove the assertion. So, let us assume that the dimension of our linear space $C([0, 1])$ is given by some (possibly very large) integer $N \in \mathbb{N}$. Consider the set of vectors

$$v_0(t) := 1, \quad v_1(t) := t, \quad v_2(t) := t^2, \quad \ldots, \quad v_N(t) := t^N.$$


We claim that the above $N + 1$ vectors are linearly independent. Assume

$$\sum_{i=0}^{N} \alpha_i t^i = 0 \tag{3}$$

for some $\alpha_0, \alpha_1, \ldots, \alpha_N \in \mathbb{R}$. Note that (3) must hold for all $t \in [0, 1]$, so in particular for $t = 0$. This implies $\alpha_0 = 0$ upon substituting $t = 0$ in (3). Now divide (3) by $t > 0$, which gives

$$\sum_{i=1}^{N} \alpha_i t^{i-1} = 0$$

for all $t \in (0, 1]$. Now, although we cannot substitute $t = 0$ above, we can pass to the limit $t \to 0^+$, which gives $\alpha_1 = 0$ since every term with $i \geq 2$ vanishes in the limit. Iterating this procedure we prove that $\alpha_2 = \ldots = \alpha_N = 0$. Hence, the $N + 1$ monomials $v_0, \ldots, v_N$ are linearly independent. But that contradicts the assumption that the dimension of the space is $N$, because we have found $N + 1$ linearly independent vectors. Since $N$ was an arbitrary integer, this proves the assertion.
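
The independence of the monomials can also be probed by a finite computation. It follows a different route from the limit argument above and is offered only as a sketch. If a combination $\sum_i \alpha_i t^i$ vanishes on all of $[0, 1]$, it vanishes in particular at the $N + 1$ sample points $t_j = j/N$, so the coefficient vector lies in the kernel of the matrix with entries $t_j^i$. When that matrix has rank $N + 1$, the only possibility is $\alpha = 0$. The sample points and the helper names `rank` and `monomial_samples` are assumptions of this illustration. Exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix with Fraction entries, via Gaussian elimination."""
    m = [list(r) for r in rows]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a nonzero pivot in this column, at or below row rk.
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def monomial_samples(N):
    """Rows: sample points t_j = j/N in [0, 1]; columns: 1, t, ..., t^N."""
    points = [Fraction(j, N) for j in range(N + 1)]
    return [[t ** i for i in range(N + 1)] for t in points]

# Rank N + 1 means that only alpha = 0 makes the combination vanish
# at all the sample points, hence on the whole interval.
ranks = {N: rank(monomial_samples(N)) for N in (1, 2, 5, 8)}
print(ranks)
```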

We just “touched the main point about functional analysis with bare hands”. Linear spaces of functions (such as the space $C([0, 1])$ in the above example) do not fit the classical background of linear algebra that we learned in our bachelor studies, in which linear spaces were always assumed to be finite dimensional. This had many implications; most importantly, linear mappings could be expressed via matrices with a finite number of entries.

Functional analysis is the extension of linear algebra to linear spaces with infinite dimension. Such spaces are typically spaces of functions on infinite sets, often functions on subsets of $\mathbb{R}$ or subsets of the Euclidean space $\mathbb{R}^n$.

Let us now introduce another important issue regarding functional analysis, which is that of length measuring. It is well known that one can measure lengths in a finite dimensional linear space. In the most intuitive case, namely the Euclidean space $\mathbb{R}^n$, the length of a vector $x = (x_1, \ldots, x_n)$ is given by the canonical formula

$$\|x\| = \sqrt{(x_1)^2 + \ldots + (x_n)^2}.$$

Other non-canonical ways could be

$$\|x\|_1 = |x_1| + \ldots + |x_n|,$$

or

$$\|x\|_\infty = \max_{i=1,\ldots,n} |x_i|.$$
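
The three ways of measuring length can be compared on a concrete vector. The snippet below is a minimal sketch. The function names and the sample vector $(3, -4)$ are choices made for this illustration only.

```python
import math

def norm_2(x):
    """Euclidean length: square root of the sum of squared entries."""
    return math.sqrt(sum(xi * xi for xi in x))

def norm_1(x):
    """Sum of the absolute values of the entries."""
    return sum(abs(xi) for xi in x)

def norm_inf(x):
    """Largest absolute value among the entries."""
    return max(abs(xi) for xi in x)

x = (3.0, -4.0)
lengths = (norm_2(x), norm_1(x), norm_inf(x))
print(lengths)  # (5.0, 7.0, 4.0)
```

The same vector has three different lengths, which is why the choice of length has to be stated explicitly.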

A concept of length easily defines a concept of closeness between points: very intuitively, two points are close if their distance is small, where the distance between two vectors is the length of their difference. A rigorous way to introduce a notion of closeness is via the concept of limit, which is the founding concept of mathematical analysis. This concept should be well known to the student. Roughly speaking, a sequence (i.e. an infinite family $v_1, v_2, v_3, \ldots, v_n, \ldots$ indexed by the positive integers) of points in a linear space converges to $v$ if the distance between $v_n$ and $v$ tends to zero as $n \to +\infty$. Now, clearly two different notions of length may give rise to two different notions of closeness, or convergence. However, at some point we will prove this very important fact: in finite dimensional spaces, the concept of closeness is independent of the length we are using.

In finite dimensional spaces (more precisely, in finite dimensional normed spaces, a concept that will be introduced later on), different ways of measuring lengths generate the same notion of closeness. If two points are close to each other in a finite dimensional space, this does not depend on the type of length we are using. This is, in general, not true in infinite dimensional spaces. Therefore, functional analysis explores many possible ways to measure distances between functions.

It is instructive to produce an example of two separate notions of closeness in infinite dimensions already at this stage.

0.5 example. Consider the space of continuous functions on $[0, 1]$ denoted by $C([0, 1])$. For each $n \in \mathbb{N}$ we define the function

$$f_n(x) = x^n, \qquad x \in [0, 1].$$

We can actually see $f_n$ as a sequence of functions, i.e. a family of functions indexed by a positive integer $n$. We may ask ourselves whether or not this sequence converges to a limit, i.e. to a function $f \in C([0, 1])$. As explained above, this notion has a lot to do with the concept of distance we are using. The goal of this example is to show that $f_n$ has a limit with respect to one distance, and has none with respect to another. For a given $f \in C([0, 1])$ let us define

$$\|f\|_\infty := \max_{x \in [0,1]} |f(x)|, \qquad \|f\|_1 := \int_0^1 |f(x)|\,dx.$$

Since $x^n \to 0$ for all $x \in [0, 1)$, the most reasonable candidate limit for this sequence (no matter what distance we are using) is

$$f \equiv 0.$$

Let us compute

$$\|f_n - 0\|_\infty = \max_{x \in [0,1]} |f_n(x)| = \max_{x \in [0,1]} |x^n| = 1.$$


Therefore, $f_n$ does not converge to zero in the $\infty$ distance. On the other hand,

$$\|f_n - 0\|_1 = \int_0^1 x^n\,dx = \frac{1}{n+1} \to 0$$

as $n \to +\infty$.
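
The two behaviours in example 0.5 can be observed numerically. The sketch below approximates both distances of $f_n(x) = x^n$ from the zero function. The sampling resolution `m`, the midpoint rule and the function names are assumptions of this illustration, so the printed values are approximations rather than exact norms.

```python
def sup_norm(f, m=10_000):
    """Approximate max |f| on [0, 1] by sampling m + 1 equally spaced points."""
    return max(abs(f(k / m)) for k in range(m + 1))

def l1_norm(f, m=10_000):
    """Approximate int_0^1 |f(x)| dx with the midpoint rule on m subintervals."""
    return sum(abs(f((k + 0.5) / m)) for k in range(m)) / m

results = {}
for n in (1, 10, 100):
    f_n = lambda x, n=n: x ** n
    results[n] = (sup_norm(f_n), l1_norm(f_n))

for n, (sup_value, l1_value) in results.items():
    # The sup distance stays at 1, the integral distance tracks 1 / (n + 1).
    print(n, sup_value, l1_value, 1 / (n + 1))
```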

The student may, at this stage, wonder why distance measuring between
functions is so important. An easy answer comes, for instance, from numerical
methods for differential equations, a matter of massive impact in the applica-
tions. While working on a numerical scheme for a differential equation, one
needs to know whether or not the method is a good approximation of the so-
lution to the equation. How do we measure the approximation? We need to
be able to establish whether or not the solution to our numerical scheme is
close to the actual solution to the equation. Therefore, first of all we should
define what we mean by ‘close’. We shall get back to this point later on in this
section.
For a while we may have thought that functional analysis is just linear
algebra in infinite dimensions. All of a sudden, we started dealing with se-
quences, limits, etc, things we saw for the first time in calculus and analysis.
Functional analysis is indeed a lot about analysis, not just linear algebra and
linear mappings. In fact,

Functional analysis is a subject in which analysis and linear algebra merge together.

So far we provided a partial answer to the question ‘Why functional analysis?’. We tried somehow to justify functional analysis as a natural continuation of a pedagogical path based on linear algebra and analysis. But is all that of any use? Problem 0.2 is a good start, as it involves differential equations, an undoubtedly useful tool in science and engineering. However, in many practical situations we deal with optimisation problems, that is, given a variable quantity, we want to compute its maximum or minimum value subject to a set of constraints. Optimisation is probably the most important subject in industrial applied mathematics. Optimisation needs a sound theoretical background, in which we should be able to establish the existence of extremal points (maxima and minima). Such a theoretical background uses concepts like continuity and compactness.

0.6 fact. In finite dimensions, a continuous function $f : K \to \mathbb{R}$ defined on a bounded and closed set $K \subset \mathbb{R}^n$ has a maximum and a minimum. This is a famous theorem in analysis due to Weierstrass. It is quite useful when looking for the solution to an optimisation problem depending on finitely many variables, since it guarantees under quite general assumptions that, no matter how good we are at finding an explicit solution to the problem, there actually is a solution. The key issue behind the Weierstrass theorem is the fact that a bounded and closed set in finite dimension is always compact, i.e. every sequence in $K$ has a subsequence converging to a point of $K$.

0.7 example. In classical mechanics, the trajectory of a material body subject to conservative forces can be found by minimising the quantity

$$L = \int_0^T \left[ m|\dot{x}(t)|^2 + U(x(t)) \right] dt,$$

in which the first term is the kinetic energy and the second one the potential energy. The unknown of the problem is a curve $[0, T] \ni t \mapsto x(t) \in \mathbb{R}^3$ standing for the optimal trajectory, i.e. the unknown is a vector in which each component is a differentiable function. Hence, the minimiser of the quantity $L$ above lives in an infinite dimensional space. The Weierstrass theorem no longer holds in infinite dimensions, and ensuring that a quantity of the form of $L$ attains its minimum on some curve $t \mapsto x(t)$ is far from trivial. This is due to the fact that in infinite dimensions it is much more difficult to prove that a given set is compact.

One of the most important concepts developed in a functional analysis course is that of compactness.

As briefly mentioned above, mathematical modelling in applied sciences often relies on numerical calculus, in which (for example) the solution to a differential equation is approximated via some finite element method.

0.8 example. A differential equation

$$\dot{y}(t) = f(y(t), t), \qquad t \in [0, T],$$

can be approximated via finite difference methods, e.g. by the set of equations

$$y_{i+1} = y_i + (\Delta t) f(y_{i+1}, t_{i+1}), \tag{4}$$

with $i \in \{0, \ldots, n-1\}$, $t_{i+1} = (i+1)\Delta t$ and $(\Delta t) n = T$. The unknown of the system of equations in (4) is the $(n + 1)$-dimensional vector $(y_0, y_1, \ldots, y_n)$, whereas the unknown of the differential equation above is a function on $[0, T]$, which lives in an infinite dimensional space. In order to make sure that the numerical scheme (4) works, one has to prove that a suitable interpolation (piecewise constant, or piecewise linear) of the solution $(y_0, y_1, \ldots, y_n)$ to (4) gets closer and closer to the solution $y(t)$ of the above differential equation as $\Delta t$ goes to zero. Once again, it is a matter of measuring a distance between functions. What is the correct distance to use? How can one prove that our numerical scheme converges? Functional analysis provides tools to answer these questions.
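
To make the convergence question concrete, here is a minimal sketch of scheme (4) for the linear test equation $\dot{y} = \alpha y$. This choice of $f$ is an assumption made so that the implicit relation can be solved by hand, and it is not an example taken from the notes. The distance between the numerical and the exact solution is measured as the largest error over the grid points, one possible choice among many. The function names and parameter values are illustrative.

```python
import math

def implicit_euler_linear(alpha, y0, T, n):
    """Scheme (4) for f(y, t) = alpha * y with n steps of size dt = T / n.

    The implicit relation y_{i+1} = y_i + dt * alpha * y_{i+1} is solved
    in closed form: y_{i+1} = y_i / (1 - alpha * dt). Requires alpha * dt != 1.
    """
    dt = T / n
    ys = [y0]
    for _ in range(n):
        ys.append(ys[-1] / (1.0 - alpha * dt))
    return ys

def max_grid_error(alpha, y0, T, n):
    """Largest gap on the grid between the scheme and y(t) = y0 * exp(alpha * t)."""
    dt = T / n
    ys = implicit_euler_linear(alpha, y0, T, n)
    return max(abs(y - y0 * math.exp(alpha * i * dt)) for i, y in enumerate(ys))

# Refining the time step should shrink the error.
errors = [max_grid_error(-1.0, 1.0, 1.0, n) for n in (10, 100, 1000)]
print(errors)
```

For this test problem the error decreases as the time step is refined. Proving such a statement for a general $f$ is exactly where a precise notion of distance between functions is needed.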

The previous example provides another interpretation of the expression infinite dimensional. The vector $(y_0, y_1, \ldots, y_n)$ in the example is $(n + 1)$-dimensional. The dimension depends on the size of the time-step $\Delta t$ through the formula $(\Delta t) n = T$: the smaller $\Delta t$, the larger $n$. Such variability of the dimension is important in order to consider smaller and smaller time-steps while approaching the solution to our numerical problem. A vector with finitely many entries but with no constraint on the number of entries can be considered as an infinite dimensional vector as well. More precisely, we can think of $y$ as a vector with infinitely many entries $(y_0, y_1, \ldots, y_{n-1}, y_n, 0, 0, \ldots, 0, \ldots)$.

Why has functional analysis become so fashionable? Despite the above ex-
amples, it is often the case that functional analysis leads to a result in cases
in which other traditional techniques do as well. The typical example is op-
timisation, a very important field of applied mathematics, with many useful
applications in physics and engineering. A very abstract functional analyt-
ical framework may provide a non-constructive existence of a minimiser for
a given optimisation problem, but one may directly recover a set of Euler-
Lagrange equations as optimality conditions, thus having to solve just a set of
differential equations.
However, the strength and appeal of functional analysis is that it is a con-
venient way of examining the mathematical behavior of various structures.
More precisely, functional analysis clarifies, rigorises, and unifies the underly-
ing concepts.
It clarifies because, as already said, functional analysis is a generalisation and combination of linear algebra, analysis, and geometry (yes, there is a bit of geometry too when you measure distances: orthogonal projections, hyperplanes, etc.), expressed in a simple mathematical notation which allows these three aspects of the problem to be easily seen. It rigorises, because it has the backing of a vast mathematical machinery which subsumes many of the classical results on differential equations, numerical methods, calculus of variations, and applied mathematical techniques. It unifies, because often the simple notation does away with many of the complicating details, leaving the essentials standing out clearly, so that problems from many different fields have the same functional analytical symbolism.

This course has several prerequisites. We try to list them here: set theory, relations and functions, partially ordered sets, the set of real numbers and its properties, supremum and infimum, topology of the real line, matrices, vectors, linear spaces, linear independence, linear systems and their solution, Euclidean geometry, diagonalisation of matrices, eigenvectors, eigenvalues, topology of Euclidean spaces, real functions of one and more variables, derivatives, partial derivatives, real sequences and their limits, lim sup and lim inf, infinite sums and their convergence, basics of ordinary differential equations, Riemann’s integration theory.
The present lecture notes are adapted from many references which include
[3] as the main reference plus some hand-written notes by the lecturer.

Part I
Function spaces, measure and integration

1 Metrics, norms, topologies

We are all familiar with the geometrical properties of ordinary, three dimen-
sional Euclidean spaces. A persistent theme in mathematics is the grouping
of various kinds of objects into abstract spaces. This grouping enables us to
extend our intuition of the relationship between points in Euclidean space to
the relationship between more general kinds of objects, leading to a clearer
and deeper understanding of those objects.
The simplest setting for the study of many problems in analysis is that
of a metric space. A metric space is a set of points with a suitable notion
of the distance between points. We can use the distance metric, or distance
function, to define the fundamental concepts of analysis, such as convergence,
continuity, and compactness.
In general, a metric space does not have any kind of algebraic structure de-
fined on it. In many applications, however, the metric space is a linear space,
with a metric derived from a norm that gives the ‘length’ of a vector. Such
spaces are called normed linear spaces. For example, the n-dimensional Eu-
clidean space is a normed linear space (after the choice of an arbitrary point
as the origin). A central topic of this course is to study infinite-dimensional
normed linear spaces, including function spaces in which a single point rep-
resents a function. As we will see, the geometrical intuition derived from
finite-dimensional Euclidean spaces remains essential, although completely
new features arise in the case of infinite-dimensional spaces.

1.1 Metrics and norms

Let X be an arbitrary nonempty set.

1.1 definition. A metric, or distance (or distance function), on $X$ is a function

$$d : X \times X \to \mathbb{R}$$

with the following properties:

(a) d( x, y) ≥ 0 for all x, y ∈ X, and d( x, y) = 0 if and only if x = y;

(b) d( x, y) = d(y, x ) for all x, y ∈ X;

(c) d( x, y) ≤ d( x, z) + d(z, y), for all x, y, z ∈ X.

A metric space is a pair $(X, d)$ where $X$ is a set and $d$ is a metric on $X$. The elements of $X$ are called points.

When the metric $d$ is understood from the context, we denote a metric space simply by the set $X$. In words, the definition states that:

(a) distances are nonnegative, and the only point at zero distance from x is
x itself;

(b) the distance is a symmetric function;

(c) distances satisfy the triangle inequality.

For points in the Euclidean space, the triangle inequality states that the length
of one side of a triangle is less than the sum of the lengths of the other two
sides.
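
The three axioms of Definition 1.1 can be tested mechanically on a finite sample of points. The checker below is a sketch with assumed names (`is_metric_on`, `sample`). Passing it on a sample does not prove that a function is a metric, whereas failing it does show that some axiom is violated.

```python
from itertools import product

def is_metric_on(points, d, tol=1e-12):
    """Check axioms (a), (b), (c) of a metric for all points of a finite sample."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                      # (a) nonnegativity
            return False
        if (d(x, y) == 0) != (x == y):       # (a) zero distance only from itself
            return False
        if abs(d(x, y) - d(y, x)) > tol:     # (b) symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:  # (c) triangle inequality
            return False
    return True

sample = [-2.0, -0.5, 0.0, 1.0, 3.5]
ok_abs = is_metric_on(sample, lambda x, y: abs(x - y))        # usual distance on R
ok_square = is_metric_on(sample, lambda x, y: (x - y) ** 2)   # squared distance
print(ok_abs, ok_square)  # True False
```

The squared distance fails the triangle inequality, for instance between $-2$ and $3.5$ with intermediate point $0$, since $30.25 > 4 + 12.25$.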

1.2 example. The set of real numbers R with the distance function d( x, y) =
| x − y| is a metric space. The set of complex numbers C with the distance
function d(z, w) = |z − w| is also a metric space.

1.3 example (Metric subspaces). Suppose $(X, d)$ is any metric space and $Y$ is a subset of $X$. We define the distance between points of $Y$ by restricting the metric $d$ to $Y$.² The resulting metric space $(Y, d|_Y)$, or $(Y, d)$ for short, is called a metric subspace of $(X, d)$. For example, $(\mathbb{R}, |\cdot|)$ is a metric subspace of $(\mathbb{C}, |\cdot|)$, and the space of rational numbers $(\mathbb{Q}, |\cdot|)$ is a metric subspace of $(\mathbb{R}, |\cdot|)$.

1.4 example (Cartesian products). If $X$ and $Y$ are sets, then the Cartesian product $X \times Y$ is the set of ordered pairs $(x, y)$ with $x \in X$ and $y \in Y$. If $d_X$ and $d_Y$ are metrics on $X$ and $Y$ respectively, then we may define a metric $d_{X \times Y}$ on $X \times Y$ by

$$d_{X \times Y}((x_1, y_1), (x_2, y_2)) = d_X(x_1, x_2) + d_Y(y_1, y_2)$$

for all $x_1, x_2 \in X$ and $y_1, y_2 \in Y$.

1.5 exercise. Let ( X, d) be a metric space. Prove that, for all x, y, z ∈ X, one
has

|d( x, y) − d( x, z)| ≤ d(y, z)

(Hint: use the triangle inequality).

As mentioned above, metric spaces do not need an underlying algebraic structure to be well defined. By ‘algebraic structure’ we essentially mean a structure of linear space. In the next definition, we shall refer to $\mathbb{R}$ and $\mathbb{C}$ as ‘scalar fields’, i.e. sets equipped with a sum operation $+$ and a product operation $\cdot$ satisfying certain elementary properties³ such as associativity, commutativity, existence of additive and multiplicative identity elements (zero and one respectively), existence of additive inverses and multiplicative inverses, and distributivity of multiplication over addition. When referring to $\mathbb{R}$ or $\mathbb{C}$ as a scalar field, we shall often refer to their elements as scalars.

² The restriction of a function $f : A \to B$ to a subset $C \subset A$ is the function $f|_C : C \to B$ defined by $f|_C(x) = f(x)$ for all $x \in C$. In this case, the distance function is restricted to $Y \times Y \subset X \times X$.
³ https://en.wikipedia.org/wiki/Field_(mathematics)

1.6 definition. A linear space (or vector space) X over the scalar field R (or C)
is a set, the elements of which are called vectors, on which the following two
operations are defined

• Sum between vectors: $X \times X \ni (x, y) \mapsto x + y \in X$,⁴

• Scalar multiplication: $\mathbb{R} \times X$ (or $\mathbb{C} \times X$) $\ni (\lambda, x) \mapsto \lambda x \in X$,

with the following properties:

1. For all x, y, z ∈ X,

• x+y = y+x
• x + (y + z) = ( x + y) + z,

2. there exists 0 ∈ X such that 0 + x = x for all x ∈ X,

3. for all x ∈ X there is a unique − x ∈ X such that x + (− x ) = 0,

4. for all x, y ∈ X and λ, µ ∈ R (or C),

• 1x = x
• (λ + µ) x = λx + µx,
• λ(µx ) = (λµ) x,
• λ( x + y) = λx + λy.

A norm on a linear space is a function that provides a “length” to a vector.

1.7 definition. A norm on a linear space $X$ is a function $\|\cdot\| : X \to \mathbb{R}$ with the following properties:

(a) $\|x\| \geq 0$ for all $x \in X$ (nonnegativity);

(b) $\|\lambda x\| = |\lambda| \|x\|$ for all $x \in X$ and $\lambda \in \mathbb{R}$ (or $\mathbb{C}$) (homogeneity);

(c) $\|x + y\| \leq \|x\| + \|y\|$, for all $x, y \in X$ (triangle inequality);

(d) $\|x\| = 0$ implies that $x = 0$ (strict positivity).

A normed linear space $(X, \|\cdot\|)$ is a linear space $X$ equipped with a norm $\|\cdot\|$.

Depending on whether $X$ in the above definition is a linear space over the scalar field $\mathbb{R}$ or $\mathbb{C}$, we shall call $X$ a real normed linear space or a complex normed linear space respectively.

1.8 exercise. Some textbooks state the above property (d) as follows:

(d’) $\|x\| = 0$ if and only if $x = 0$.

Clearly, (d’) implies (d). In fact, (d) and (d’) are equivalent once the previous properties (a)-(b)-(c) are assumed. Why?

⁴ By abuse of notation, the same symbol $+$ is used to denote both the sum in the scalar field (sum of two real or complex numbers) and the sum in the linear space (sum of two vectors).

A normed space can be made into a metric space in a standard way.

1.9 exercise (Normed spaces are metric spaces). Prove that a normed linear space $(X, \|\cdot\|)$ is a metric space with the metric

$$d(x, y) = \|x - y\|. \tag{5}$$

The distance $d$ in Exercise 1.9 is called the induced distance.


For future use, we recall the concept of convex subset. A subset C of a linear
space X is said to be convex if

tx + (1 − t)y ∈ C

for all x, y ∈ C and for all t ∈ [0, 1], meaning that the line segment joining two
vectors in C lies entirely in C.

1.10 exercise. Prove that the closed unit ball

$$\{ x \in X : \|x\| \leq 1 \}$$

in a normed linear space $X$ is a convex set.

1.11 example. The set of real numbers $\mathbb{R}$ with the absolute value norm $\|x\| = |x|$ is a one-dimensional real normed linear space. More generally, $\mathbb{R}^n$, where $n = 1, 2, 3, \ldots$, is an $n$-dimensional linear space. We define the Euclidean norm of a point $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ by

$$\|x\| = \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2},$$

and call $\mathbb{R}^n$ equipped with the Euclidean norm $n$-dimensional Euclidean space. As seen in the introductory Section, we can also define other norms on $\mathbb{R}^n$. For example, the 1-norm is given by

$$\|x\|_1 = |x_1| + |x_2| + \ldots + |x_n|.$$

The maximum norm, or $\infty$-norm, is given by

$$\|x\|_\infty = \max\{ |x_1|, |x_2|, \ldots, |x_n| \}.$$

The student is invited to prove that the three functions above (Euclidean norm, 1-norm, and $\infty$-norm) are actual norms according to Definition 1.7.

1.12 exercise. In the case $n = 2$, draw the sets

• $\{ x \in \mathbb{R}^2 : \|x\| \leq 1 \}$,

• $\{ x \in \mathbb{R}^2 : \|x\|_1 \leq 1 \}$,

• $\{ x \in \mathbb{R}^2 : \|x\|_\infty \leq 1 \}$,

on the Cartesian coordinate system.

1.13 example. A linear subspace of a linear space, or simply a subspace when it is clear we are talking about linear spaces, is a subset that is itself a linear space. More precisely, a subset $M$ of a linear space $X$ is a subspace if and only if $\lambda x + \mu y \in M$ for all $\lambda, \mu \in \mathbb{R}$ (or $\mathbb{C}$) and all $x, y \in M$. A subspace of a normed linear space is a normed linear space with norm given by the restriction of the norm on $X$ to $M$. Note that every subset of a normed linear space $X$ can therefore be seen as a metric subspace of $X$ (in the induced distance), but not all subsets of $X$ are linear subspaces of $X$. For example, a bounded line segment of the 2-dimensional Euclidean plane is a metric subspace of $\mathbb{R}^2$, but not a linear subspace.

We will see later on that all norms on a finite-dimensional linear space lead to the same notion of convergence, so often it is not important which norm we use. Different norms on an infinite-dimensional linear space, such as a function space, may lead to completely different notions of convergence, so the specification of a norm is crucial in this case. We will always regard a normed linear space as a metric space with the metric defined in (5), unless we explicitly state otherwise. Nevertheless, this equation is not the only way to define a metric on a normed linear space, see the Exercises at the end of this Section.
Let us get back to the question raised before Exercise 1.9. What is the relation between metric spaces and normed linear spaces? So far we have seen that every normed linear space can be made into a metric space via a standard construction. In some sense, then, every normed space is also a metric space. The converse question arises naturally: suppose $X$ is a linear space which is also a metric space, with metric $d$; is the distance $d$ induced by a norm? If so, every linear space which is also a metric space would be a normed space. The answer is negative, as shown in the next example.

1.14 example. On a real vector space X ≠ {0}, consider the metric

$$
d(x, y) = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \neq y. \end{cases}
$$

We claim that d is not induced by any norm. By contradiction, assume there
exists a norm ‖·‖ on X such that ‖x − y‖ = d(x, y) for all x, y ∈ X. Let
x ∈ X \ {0} (0 being the zero vector of X) and let λ ∈ R \ {0}. Then λx ≠ 0,
and the homogeneity property of norms implies
1 = d(λx, 0) = ‖λx‖ = |λ|‖x‖ = |λ|d(x, 0) = |λ|, which is a contradiction
if |λ| ≠ 1.
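
The failure of homogeneity can also be observed on concrete numbers. The sketch below takes X = R, with the sample values x = 3 and λ = 2 chosen purely for illustration, and compares the two sides of the identity that a norm would have to satisfy.

```python
def discrete_metric(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

x = 3.0
lam = 2.0

# If d were induced by a norm, these two quantities would coincide:
lhs = discrete_metric(lam * x, 0.0)        # plays the role of ||lam * x||
rhs = abs(lam) * discrete_metric(x, 0.0)   # plays the role of |lam| * ||x||

homogeneous = (lhs == rhs)
print(lhs, rhs, homogeneous)
```

Here the left-hand side equals 1 while the right-hand side equals 2, in agreement with the argument above.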

1.2 Convergence

The goal of this subsection is to introduce the concept of convergent sequence
in a metric space.
A sequence (xn) in a metric space (X, d) is a map N ∋ n ↦ xn ∈ X which
associates a point xn ∈ X with each natural number n ∈ N.

1.15 definition. A sequence (xn) in a metric space (X, d) converges to x ∈ X
if for every ε > 0 there is an N ∈ N such that d(xn, x) < ε for all n ≥ N.
The point x is called the limit of the sequence. The sequence is Cauchy if for
every ε > 0 there is an N ∈ N such that d(xm, xn) < ε for all m, n ≥ N.

We use the notations

$$
\lim_{n \to +\infty} x_n = x, \qquad x_n \to x
$$

to denote that xn converges to x.

1.16 remark. The limit of a convergent sequence in a metric space is unique.
To see this, assume xn → x and xn → y as n → +∞, and assume by contradiction
that x ≠ y. Then point (a) in Definition 1.1 implies d(x, y) > 0. Let
δ = d(x, y) > 0. Since xn → x, there is an N ∈ N such that d(x, xn) < δ/3 for
all n ≥ N. Hence, due to the triangle inequality, for n ≥ N one has

$$
\delta = d(x, y) \le d(x, x_n) + d(x_n, y) < \frac{\delta}{3} + d(x_n, y),
$$

which implies d(xn, y) > 2δ/3 for all n ≥ N, and that contradicts xn → y.

In the introductory section we emphasized the fact that two separate
concepts of distance give rise to two separate notions of convergence. We
touched in particular on the issue of convergence in a finite-dimensional
normed space, and outlined that no matter what norm we use in a
finite-dimensional space, this does not affect the set of convergent sequences.
In a generic metric space (not necessarily normed) the fact that the notion of
convergence depends quite heavily on the distance is easily seen even in very
simple examples.

1.17 example. Let X = R and consider the following two distances on X:

$$
d_1(x, y) = |x - y|, \qquad
d_2(x, y) = \begin{cases} 1 & \text{if } x \neq y, \\ 0 & \text{if } x = y. \end{cases}
$$

There are sequences which converge in the metric space (X, d1) but not in
(X, d2), see also the Exercises. As an example, take xn = 1/n. Clearly xn → 0
in d1, but this is not true in the distance d2. Indeed, let ε = 1/2. To have
convergence we would need to find an N ∈ N such that d2(1/n, 0) < 1/2 for
all n ≥ N. But the only possibility to have d2(1/n, 0) < 1/2 is that 1/n = 0,
which never happens, no matter how large n is.
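
The example can be explored numerically with the following Python sketch. A computer can only inspect finitely many terms, so the helper `tail_within` (a hypothetical name) checks the inequality d(xn, 0) < ε on a finite range of indices only; it illustrates the definition and does not prove convergence.

```python
def d1(x, y):
    """Usual distance on R."""
    return abs(x - y)

def d2(x, y):
    """Discrete distance on R."""
    return 0 if x == y else 1

def tail_within(seq, limit, dist, eps, start, stop):
    """Check dist(seq(n), limit) < eps for all start <= n < stop (finite check)."""
    return all(dist(seq(n), limit) < eps for n in range(start, stop))

seq = lambda n: 1.0 / n
eps = 0.5

ok_d1 = tail_within(seq, 0.0, d1, eps, 3, 10_000)  # 1/n < 1/2 as soon as n >= 3
ok_d2 = tail_within(seq, 0.0, d2, eps, 3, 10_000)  # d2(1/n, 0) = 1 for every n
print(ok_d1, ok_d2)
```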

1.18 exercise. Prove that every convergent sequence in a metric space is a
Cauchy sequence.

The reverse property, namely that every Cauchy sequence converges, sin-
gles out a particularly useful class of metric spaces, called complete metric
spaces.

1.19 definition. A metric space (X, d) is complete if every Cauchy sequence in
X converges to a limit in X. A subset Y of X is complete if the metric subspace
(Y, d|Y) is complete. A normed linear space that is complete with respect to
the induced metric is called a Banach space.

Recall the set of rational numbers Q, which is a subset of R and hence a
metric subspace of R with the usual distance d(x, y) = |x − y|. (Note that Q
is a linear space over the field Q, but it is not a linear subspace of the real
linear space R, since it is not closed under multiplication by irrational
scalars.) As a metric space, Q is not complete, since a sequence of rational
numbers which converges in R to an irrational number (such as √2 or π) is a
Cauchy sequence in Q, but does not have a limit in Q; see the Exercises at the
end of this Chapter for a specific example.
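
One possible sequence of this kind, chosen here only for illustration, is given by the Newton iteration q ↦ (q + 2/q)/2 started at q = 1. The sketch below computes it in exact rational arithmetic with Python's `fractions.Fraction`: every term is rational, consecutive terms get closer and closer, and no term has square exactly equal to 2.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """First terms of the Newton iteration for sqrt(2), as exact rationals."""
    q = Fraction(1)
    terms = [q]
    for _ in range(steps):
        q = (q + 2 / q) / 2
        terms.append(q)
    return terms

qs = newton_sqrt2(5)
# Distances between consecutive terms, and the defects |q^2 - 2|.
gaps = [abs(qs[k + 1] - qs[k]) for k in range(len(qs) - 1)]
defects = [abs(q * q - 2) for q in qs]

print(qs[:4])      # 1, 3/2, 17/12, 577/408
print(gaps[:3])    # 1/2, 1/12, 1/408
```

The shrinking gaps are consistent with the Cauchy property, while the nonzero defects reflect the fact that no rational number squares to 2.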

1.20 exercise. Prove that the normed linear space Rn is a Banach space when
equipped with the norms ‖·‖, ‖·‖₁, and ‖·‖∞ considered in Example 1.11.

Infinite series (or infinite sums) do not make sense in a general metric
space, because we cannot add points together in a general metric space. We
can, however, consider series in a normed linear space X. Just as for real or
complex numbers, if (xn) is a sequence in X, then the series
$\sum_{n=1}^{+\infty} x_n$ converges to s ∈ X if the sequence of partial sums
(sn), $s_n = \sum_{k=1}^{n} x_k$, converges to s.
The concepts of lim sup and lim inf of a sequence of real numbers are
required as prerequisites of this course. We invite the students to review them
in a proper basic real analysis textbook.
Having a concept of distance at hand, we can introduce the concept
of bounded set even though a metric space is not necessarily equipped with
the structure of a totally ordered set. Suppose that A is a nonempty subset of
a metric space (X, d). We define the diameter of A as

diam A = sup{d( x, y) : x, y ∈ A}.

A subset A of X is bounded if diam A is finite. It follows that A is bounded if
and only if there is an M ∈ R and an x0 ∈ X such that d(x0, x) ≤ M for all
x ∈ A (see the Exercises at the end of this Chapter for the proof). The distance
d(x, A) of a point x from the set A is defined by

d( x, A) = inf{d( x, y) : y ∈ A}.

The statement d(x, A) = 0 does not necessarily imply that x ∈ A: for example,
in R with the usual distance, d(0, A) = 0 for A = (0, 1], but 0 ∉ A.
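
For finite sets the supremum and the infimum in these definitions are a maximum and a minimum, so both quantities can be computed directly. The sketch below does so for the finite sets {1, 1/2, . . . , 1/n} in R (an illustrative choice): the distance from 0 equals 1/n, which suggests why the infinite set {1/n : n ∈ N} has distance 0 from the point 0 without containing it.

```python
from fractions import Fraction
from itertools import product

def diam(points, d):
    """Diameter of a finite nonempty set: the largest pairwise distance."""
    return max(d(x, y) for x, y in product(points, repeat=2))

def dist_to_set(x, points, d):
    """Distance of x from a finite nonempty set: the smallest distance."""
    return min(d(x, y) for y in points)

d = lambda x, y: abs(x - y)

def A(n):
    """The finite set {1, 1/2, ..., 1/n}, with exact rational entries."""
    return [Fraction(1, k) for k in range(1, n + 1)]

print(diam(A(10), d))                       # 9/10
print(dist_to_set(Fraction(0), A(10), d))   # 1/10
print(dist_to_set(Fraction(0), A(1000), d)) # 1/1000
```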


Given two metric spaces (X, dX) and (Y, dY), we say that a function
f : X → Y is bounded if its range f(X) is bounded. For example, a real-valued
function f : X → R is bounded if there is a finite number M such that
|f(x)| ≤ M for all x ∈ X. We say that f : X → R is bounded from above if there
is an M ∈ R such that f(x) ≤ M for all x ∈ X, and bounded from below if there
is an M ∈ R such that f(x) ≥ M for all x ∈ X.

1.3 Continuity
Everyone is familiar with the concept of continuous function f : R → R.
The definition of continuity for functions between metric spaces is an obvious
generalisation of that. Let ( X, d X ) and (Y, dY ) be two metric spaces.

1.21 definition. A function f : X → Y is continuous at x0 ∈ X if for every
ε > 0 there is a δ > 0 such that dX(x, x0) < δ implies dY(f(x), f(x0)) < ε. The
function f is continuous on X if it is continuous at every point x ∈ X.

If f is not continuous at x, then we say that f is discontinuous at x. There
are continuous functions on any metric space. For example, every constant
function is continuous.

1.22 example (Distance function). Let a ∈ X, and define f : X → R by
f(x) = d(x, a). Then f is continuous on X. Indeed, let x0 ∈ X and ε > 0.
As a consequence of the triangle inequality (see Exercise 1.5), we have

| f(x) − f(x0)| = |d(x, a) − d(x0, a)| ≤ d(x, x0).

Therefore, choosing δ = ε gives | f(x) − f(x0)| < ε provided d(x, x0) < δ.

We can also define continuity in terms of limits. If f : X → Y, we say that
f(x) → y0 as x → x0, or

$$
\lim_{x \to x_0} f(x) = y_0,
$$

if for every ε > 0 there is a δ > 0 such that 0 < dX(x, x0) < δ implies that
dY(f(x), y0) < ε. Similarly to the concept of limit for real functions studied in
first-year calculus, the above definition does not impose any requirement
on the value of f at the point x0. In fact, such a concept can be extended to
a point x0 which is the limit of a sequence on which the function f is well
defined. More precisely, let f : D → Y, with D ⊂ X. Let x0 ∈ X be such that x0
is the limit of a sequence (yn) ⊂ D. We say that f has limit y0 at the point x0
if for every ε > 0 there is a δ > 0 such that 0 < dX(x, x0) < δ and x ∈ D imply
that dY(f(x), y0) < ε. A function f : X → Y is continuous at x0 ∈ X if

$$
\lim_{x \to x_0} f(x) = f(x_0),
$$

meaning that the limit of f as x → x0 exists and is equal to f(x0).


If f : X → Y and E is a subset of X, then we say that f is continuous on
E if it is continuous at every point x ∈ E. This property is, in general, not
equivalent to the continuity of the restriction f|E of f to E, as shown in the
next example.

1.23 example. Let f : R → R be defined by

$$
f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \notin \mathbb{Q}. \end{cases}
$$

The function f is discontinuous at every point of R, but f|Q : Q → R is the
constant function f|Q(x) = 1, so f|Q is continuous on Q.

A subtle, but important, strengthening of continuity is uniform continuity.

1.24 definition. A function f : X → Y is uniformly continuous on X if for every
ε > 0 there is a δ > 0 such that dX(x, y) < δ implies dY(f(x), f(y)) < ε for all
x, y ∈ X.

The crucial difference between Definition 1.21 and Definition 1.24 is that
the value of δ does not depend on the point x ∈ X in the latter, so that
f(y) gets closer to f(x) at a uniform rate as y gets closer to x. For example,
r : (0, 1) → R defined by r(x) = 1/x is continuous on (0, 1) but not uniformly
continuous.
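
A possible way to see why r fails to be uniformly continuous is to look at the pairs x = 1/n, y = 1/(2n) for n ≥ 2 (an illustrative choice of points in (0, 1)): their distance tends to zero, while the distance between the images grows. The sketch below verifies this in exact arithmetic.

```python
from fractions import Fraction

def r(x):
    """The function r(x) = 1/x on (0, 1)."""
    return 1 / x

def witness(n):
    """Return (|x - y|, |r(x) - r(y)|) for x = 1/n and y = 1/(2n), n >= 2."""
    x = Fraction(1, n)
    y = Fraction(1, 2 * n)
    return abs(x - y), abs(r(x) - r(y))

for n in (2, 10, 100):
    gap, jump = witness(n)
    print(n, gap, jump)   # the gap is 1/(2n), the jump is n
```

So, for ε = 1, no single δ > 0 can work for all pairs of points in (0, 1).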
In the following, we will denote all metrics by d when it is clear from the
context which metric is meant.
There is a useful equivalent way to characterise continuous functions on
metric spaces in terms of sequences.

1.25 definition. A function f : X → Y is sequentially continuous at x ∈ X if
for every sequence (xn)n in X that converges to x, the sequence (f(xn))n in Y
converges to f(x) ∈ Y.

1.26 proposition. Let X, Y be metric spaces. A function f : X → Y is continuous
at x ∈ X if and only if it is sequentially continuous at x.

Proof. First, we show that if f is continuous at x, then it is sequentially
continuous at x. Suppose that f is continuous at x, and let xn → x. Let ε > 0
be given. By the continuity of f, we can choose δ > 0 such that d(y, x) < δ
implies d(f(y), f(x)) < ε. By the convergence of (xn)n, we can choose N so that
n ≥ N implies d(xn, x) < δ. Therefore, n ≥ N implies d(f(xn), f(x)) < ε, and
this means that f(xn) → f(x).
To prove the converse, we prove that if f is discontinuous at x, then it is not
sequentially continuous at x. If f is discontinuous at x, then there is an ε > 0
such that for every n ∈ N there exists xn ∈ X with d(xn, x) < 1/n and
d(f(xn), f(x)) ≥ ε. The sequence (xn) constructed in this way converges to x,
but (f(xn)) does not converge to f(x). Hence, f is not sequentially continuous
at x.

Similarly to what we showed for convergence of sequences, the notion of
continuity for a function depends strongly on the distances one is considering
on the metric spaces involved.

1.27 example. Let X = R, and let d1 and d2 be as in Example 1.17. Consider
the function f : (X, d1) → (X, d1) given by f(x) = x. This function
is continuous, as we all know (its graph is a straight line, and the real
numbers are equipped with the classical Euclidean distance). Let us now
consider the same function from (X, d1) to (X, d2). By Proposition 1.26, in
order for f to be continuous, every sequence converging in (X, d1) should be
mapped by f to a sequence converging in (X, d2).
But the only converging sequences with respect to the distance d2 are those
which are eventually constant. So, take the sequence xn = 1/n, converging
to 0 in d1. Its image is f(xn) = xn, which does not converge to 0 in d2. So,
f : (X, d1) → (X, d2) is not continuous.

There are two kinds of ‘half-continuous’ real-valued functions, defined as
follows.

1.28 definition. A function f : X → R is upper semicontinuous on X if for all
x ∈ X and every sequence xn → x, we have

$$
\limsup_{n \to +\infty} f(x_n) \le f(x).
$$

A function f : X → R is lower semicontinuous on X if for all x ∈ X and every
sequence xn → x, we have

$$
\liminf_{n \to +\infty} f(x_n) \ge f(x).
$$

1.29 example. Let X = R be equipped with the usual distance d(x, y) = |x − y|.
Consider the function

$$
f(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x \ge 0. \end{cases}
$$

Prove that f is upper semicontinuous at x = 0 but not lower semicontinuous
at x = 0.

1.30 exercise. Prove that a function f : X → R is continuous if and only if it
is upper and lower semicontinuous.

1.4 Topological spaces

The notion of topological space is defined by means of rather simple and
abstract axioms. It is very useful as an ‘umbrella’ concept which allows one to
use the geometric language and the geometric way of thinking in a broad variety
of vastly different situations, which include metric spaces as a special case.

1.31 definition. A topological space is a pair (X, τ), where X is a set and
τ ⊂ P(X) is a family of subsets of X called the topology of X, whose elements
are called open sets, such that

(i) ∅, X ∈ τ (the empty set and the whole set are open sets).

(ii) If {Oα}α∈A ⊂ τ is an arbitrary family of open sets, then
$\bigcup_{\alpha \in A} O_\alpha \in \tau$
(the union of an arbitrary family of open sets is open).

(iii) If $\{O_j\}_{j=1}^{N} \subset \tau$, then O1 ∩ . . . ∩ ON ∈ τ (the
intersection of a finite number of open sets is open).

If x ∈ X, then an open set containing x is called an (open) neighborhood of x.
The complements of open sets are called closed sets.

We will often omit the topology τ, and refer to X as a topological space
assuming that the topology has been described.

1.32 example. Let X = R and let us define as open sets O ⊂ R all subsets of
R with the property that, for all x ∈ O, there exists ε > 0 such that the interval
( x − ε, x + ε) ⊂ O. Then, the above family of open sets defines a topology on
R (exercise!).

1.33 example. If in the set of real numbers R we declare open (besides the
empty set and R) all the closed half-lines {x ∈ R : x ≥ a}, a ∈ R, then we do
not obtain a topological space: the first and third axioms of topological
spaces hold, but the second one does not (e.g. the union of all half-lines with
positive endpoints is {x ∈ R : x > 0}, which is not one of the declared open
sets).

1.34 example. Let X be a set. The discrete topology on X is τ = P(X). Check
that τ is a topology. The indiscrete topology, or trivial topology, on X is
τ = {∅, X}. Check that the trivial topology is also a topology.
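
On a finite set the axioms of Definition 1.31 can be checked mechanically. The following Python sketch does this under the assumption that X is finite, in which case closure under pairwise unions and pairwise intersections is enough (arbitrary unions reduce to finite ones, and finite ones follow by induction). The function names `is_topology` and `powerset` are illustrative.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check the axioms of a topology for a family tau of subsets of a finite set X."""
    X = frozenset(X)
    tau = {frozenset(O) for O in tau}
    # (i) the empty set and the whole set must be open
    if frozenset() not in tau or X not in tau:
        return False
    # (ii), (iii) on a finite family it suffices to test pairs of open sets
    for U, V in combinations(tau, 2):
        if (U | V) not in tau or (U & V) not in tau:
            return False
    return True

def powerset(X):
    """All subsets of the finite set X."""
    X = list(X)
    return [frozenset(c) for k in range(len(X) + 1) for c in combinations(X, k)]

X = {1, 2, 3}
discrete = powerset(X)               # tau = P(X)
trivial = [set(), X]                 # tau = {empty set, X}
not_topology = [set(), {1}, {2}, X]  # the union {1, 2} is missing

print(is_topology(X, discrete), is_topology(X, trivial), is_topology(X, not_topology))
```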

1.35 exercise. Let {Cα}α∈A be an arbitrary family of closed sets in a
topological space X. Prove that the intersection
$\bigcap_{\alpha \in A} C_\alpha$ is still a closed set.

1.36 definition. The closure Ā of a set A ⊂ X is the smallest closed set
containing A, that is

$$
\overline{A} = \bigcap \{ C : A \subset C, \ C \text{ closed set} \}.
$$

A set A ⊂ X is called dense in X if Ā = X. A set A ⊂ X is called nowhere dense
if X \ Ā is dense. A point x ∈ X is called an accumulation point (or a limit
point) of a set A ⊂ X if every neighborhood of x contains infinitely many points
of A. A point x ∈ A is called an interior point of A if there exists a
neighborhood of x entirely contained in A. The set of all interior points of A
is called the interior of A, and is denoted by A◦.

1.37 exercise. Prove that a set A is open if and only if A = A◦. Prove that a
set A is closed if and only if A = Ā.

1.38 definition. A topological space X is said to be separable if there exists a
countable, dense subset S ⊂ X.

1.39 exercise. Show that R with the usual Euclidean topology is a separable
space. Show that R endowed with the discrete topology (every set is open) is
not separable.

1.40 definition. A point x ∈ X is called a boundary point of a set A ⊂ X if it
is neither an interior point of A nor an interior point of Ac = X \ A. The
set of boundary points of A is called the boundary of A and is denoted by ∂A.

1.41 exercise. For every set A ⊂ X, prove that Ā = A ∪ ∂A. Consequently,
prove that a set C ⊂ X is closed if and only if C contains its boundary.

We now introduce the concept of convergent sequence in a topological
space.

1.42 definition. A sequence {xn}n∈N ⊂ X is said to converge to x ∈ X if for
every open set O ⊂ X containing x there exists N ∈ N such that {xn}n≥N ⊂ O.
Any such point x is called a limit of the sequence {xn}n.

1.43 exercise.

• Let X = R with the discrete topology (all sets are open). Prove that no
subset S ⊂ R has accumulation points or boundary points, that the closure
(as well as the interior) of every set S is S itself, and that the sequence
1/n does not converge to 0.

• Let X = R with the trivial topology (only the empty set and R are
open). Prove that every sequence in R converges to every point x ∈ R.

The second item above shows in particular that limits may fail to be unique
in a general topological space.
We now introduce the concept of continuity for a function between two
topological spaces.
The topological definition of continuity is simpler and more natural than
the ε, δ definition for metric spaces.

1.44 definition. Let (X, τ), (Y, σ) be topological spaces. A map f : X → Y
is said to be continuous if O ∈ σ implies f⁻¹(O) ∈ τ (pre-images of open sets
are open). f is an open map if O ∈ τ implies f(O) ∈ σ (images of open sets are
open). f is continuous at a point x ∈ X if for any neighborhood A of f(x) in
Y, the pre-image f⁻¹(A) contains a neighborhood of x in X.

1.45 exercise. Prove that a function f is continuous if and only if it is
continuous at every point.

1.5 The topology of metric spaces

The concepts of convergence of a sequence in a metric space (introduced in
Definition 1.15) and of continuous function between two metric spaces
(introduced in Definition 1.21) can be formulated without the use of topologies
and
open sets. On the other hand, once the open sets are known, these two con-
cepts are very naturally defined in a topological space, as seen in Definitions
1.42 and 1.44. Hence, in order to unify those concepts, we need to provide a
standard way to equip all metric spaces with a topology.

Let (X, d) be a metric space. The open ball Br(a), with radius r > 0 and
center a ∈ X, is the set

$$
B_r(a) := \{ x \in X \mid d(x, a) < r \}.
$$

The closed ball B̄r(a) is the set

$$
\overline{B}_r(a) := \{ x \in X \mid d(x, a) \le r \}.
$$

1.46 exercise. Let X = R and let d1 and d2 be the two distances on R defined in
Example 1.17. Find B1/2(0) with respect to each of the two distances.

1.47 definition. A subset G of a metric space X is open if for every x ∈ G
there is an r > 0 such that Br(x) is contained in G.

1.48 exercise. Let X be a metric space. Prove that

• the empty set ∅ and the whole space X are open,

• a finite intersection of open sets is open,

• an arbitrary union of open sets is open.

We prove here the second statement. Let A1, . . . , An be open sets. Let
A = A1 ∩ . . . ∩ An. In order to prove that A is open we must provide, for a
given x ∈ A, a positive ε such that Bε(x) ⊂ A. Now, x ∈ A means x ∈ Ai for all
i = 1, . . . , n, and since each of those sets is an open set, there is an
εi > 0 such that Bεi(x) ⊂ Ai. Let

ε = min {ε1, . . . , εn}.

Clearly, Bε(x) ⊂ Bεi(x) ⊂ Ai for all i, and therefore Bε(x) ⊂ A.

The example below clarifies that the above reasoning may fail in case we
are dealing with infinitely many open sets.

1.49 example. The interval In = (−1/n, 1) is open in R for every n ∈ N, but
the intersection

$$
\bigcap_{n=1}^{+\infty} I_n = [0, 1)
$$

is not open (please spend some time proving the above identity as an
exercise in case you are not convinced about it). Thus, an infinite intersection
of open sets need not be open.
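
The mechanism behind this identity can be probed with a short Python sketch (the helper names are illustrative). The point 0 belongs to every interval (−1/n, 1) that is tested, while a negative point such as −1/1000 is excluded as soon as n reaches 1000. Only finitely many intervals are inspected, so this is an illustration and not a proof.

```python
from fractions import Fraction

def in_I(x, n):
    """True if x belongs to the open interval (-1/n, 1)."""
    return Fraction(-1, n) < x < 1

def in_all_up_to(x, N):
    """True if x belongs to (-1/n, 1) for every n = 1, ..., N."""
    return all(in_I(x, n) for n in range(1, N + 1))

x = Fraction(-1, 1000)
print(in_all_up_to(0, 10_000))   # 0 lies in every tested interval
print(in_all_up_to(x, 999))      # x survives the first 999 intervals
print(in_all_up_to(x, 1000))     # but x is not in (-1/1000, 1)
```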

As a consequence of Exercise 1.48, the family of open sets in a
metric space X according to Definition 1.47 is a topology. With such an
identification, all concepts we defined for topological spaces can be
formulated for metric spaces. More precisely, every metric space can be
considered as a topological space, the topology being the one defined in
Definition 1.47. As a consequence, we can define closed sets in a metric space
as all sets of the form X \ G with G open.
First of all, we need to check that the concepts of convergence and conti-
nuity we provided for metric and topological spaces independently coincide
on metric spaces.

1.50 exercise. Let (xn) be a sequence in a metric space X and let x ∈ X. Prove
that xn converges to x in the sense of Definition 1.15 if and only if xn
converges to x in the sense of Definition 1.42, with X regarded as a
topological space with the topology of Definition 1.47.

1.51 exercise. Let X and Y be two metric spaces, regarded as topological
spaces with the topology of Definition 1.47, and let f : X → Y be a
function. Prove that f is continuous in the sense of Definition 1.44 if and only
if it is continuous on X in the sense of Definition 1.21.

Closed sets in a metric space can be given an alternative, sequential
characterisation as sets that contain their limit points.

1.52 proposition. A subset F of a metric space X is closed if and only if every
convergent sequence in X with elements in F converges to a limit in F. That is,
if xn → x and xn ∈ F for all n, then x ∈ F.

Proof. Assume first that F is closed, and let xn ∈ F with xn → x. Assume by
contradiction that x ∈ F^c. Since F is closed, F^c is open. Hence, there is an
open ball Br(x) contained in F^c. This implies that no elements of the sequence
xn are contained in Br(x), and this contradicts the fact that xn converges to x.
Indeed, for ε = r there is no n ∈ N such that d(xn, x) < ε.
Assume now that every convergent sequence in X with elements in F converges
to a limit in F. We want to prove that F is closed. Assume by contradiction
that F is not closed. This means that F^c is not open. Therefore, there
exists x ∈ F^c such that every open ball Bε(x) intersects F. In particular, for
all n ∈ N there is xn ∈ F such that d(x, xn) < 1/n. Clearly xn converges to x,
but x does not belong to F, which contradicts the starting assumption.

1.53 exercise. Prove that a subset of a complete metric space is complete if
and only if it is closed.

The closure of a set A can also be obtained by adding to A all limits of
convergent sequences of elements of A. That is,

$$
\overline{A} = \{ x \in X : \text{there is a sequence } (a_n) \subset A \text{ such that } a_n \to x \}. \tag{6}
$$

It follows from (6) that A is a dense subset of the metric space X if and only
if for every x ∈ X there is a sequence ( an ) in A such that an → x. Thus, every
point in X can be approximated arbitrarily closely by points in the dense set
A. We will encounter many dense sets later on.
This property has repercussions also on the concept of separable topological
space that we introduced in Definition 1.38. A metric space is said to be
separable if it is separable as a topological space, with the topology
introduced at the beginning of this subsection.
For example, R with the usual metric is separable because Q is a
countable dense subset. To see this, it suffices to write an arbitrary real
number in decimal form and set to zero all the decimal digits from the
(n + 1)-th onward. As n varies, this defines a sequence in Q converging to the
given real number.
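
The truncation procedure can be carried out exactly for the real number √2, taken here as an illustrative example. The sketch below relies on the identity ⌊√2 · 10ⁿ⌋ = ⌊√(2 · 10²ⁿ)⌋ and on Python's integer square root `math.isqrt`, so that no floating-point rounding is involved; the function name `truncated_sqrt2` is a hypothetical choice.

```python
from fractions import Fraction
from math import isqrt

def truncated_sqrt2(n):
    """Rational number obtained from sqrt(2) by keeping n decimal digits."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

for n in (1, 3, 6):
    q = truncated_sqrt2(n)
    # q is rational and sqrt(2) lies strictly between q and q + 10^(-n)
    assert q * q < 2 < (q + Fraction(1, 10 ** n)) ** 2
    print(n, q)
```

The assertion inside the loop expresses that the n-th truncation is within 10⁻ⁿ of √2, which is exactly what makes the truncations converge.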

1.54 example. The metric considered on a given set is crucial to determine
whether or not a subset is dense. For example, consider R with the discrete
distance

$$
d(x, y) = \begin{cases} 1 & \text{if } x \neq y, \\ 0 & \text{if } x = y. \end{cases}
$$

Now,

a sequence (xn) in R converges to x in the discrete distance
if and only if there exists N ∈ N such that xn = x for all n ≥ N. (7)

To see this, we apply the definition of convergence with ε = 1/2: if xn
converges to x, there is N ∈ N such that d(x, xn) < 1/2 for all n ≥ N. But the
definition of discrete distance implies that d(x, xn) can only be less than 1/2
if it is zero, that is if xn = x. The converse implication is immediate. This
proves (7). Now, it is quite clear that Q cannot be dense in R with such a
distance. Indeed, the only subset of R which is dense in R equipped with the
discrete distance is R itself. To see this, let A ⊂ R with A ≠ R, and let
x ∈ R \ A. Assume by contradiction that (xn) is a sequence in A converging to x.
Then (7) implies xn = x for all n larger than some N ∈ N, but this is impossible
since x is not in A and xn is in A. As a consequence, R is not separable when
equipped with the discrete distance, because its only dense subset, R itself,
is uncountable.

According to Definition 1.47, a set U in a metric space X is a neighborhood
of x if U contains a ball Br(x) centered at x for some r > 0. Definition 1.15
for the convergence of a sequence can therefore be rephrased in the following
way. A sequence (xn) converges to x if for every neighborhood U of x there is
an N ∈ N such that xn ∈ U for all n ≥ N.

1.6 Compactness

Compactness is one of the most important concepts in analysis. A simple and
useful way to define compact sets in a metric space is by means of sequences.
We first recall the concept of subsequence of a sequence in a metric space.

1.55 definition. Let {xn}n∈N be a sequence in the metric space (X, d). A
subsequence of {xn} is a sequence

$$
\mathbb{N} \ni k \mapsto x_{n_k}
$$

such that the map

$$
\mathbb{N} \ni k \mapsto n_k \in \mathbb{N}
$$

is strictly increasing.

1.56 definition. A subset K of a metric space X is sequentially compact if every
sequence in K has a convergent subsequence whose limit belongs to K.

We can take K = X in this definition, so that a metric space X is
sequentially compact if every sequence in X has a convergent subsequence. A
subset K of (X, d) is sequentially compact if and only if the metric subspace
(K, d|K) is sequentially compact.

1.57 example. The space of real numbers R is not sequentially compact. For
example, the sequence (xn) with xn = n has no convergent subsequence
because |xn − xm| ≥ 1 for all m ≠ n. The closed, bounded interval [0, 1] is
sequentially compact, as we prove below. The half-open interval (0, 1] is not
a sequentially compact subset of R, because the sequence (1/n) converges
to 0, and therefore has no subsequence with limit in (0, 1]. The limit does,
however, belong to [0, 1].

The full importance of compact sets will become clear only in the setting of
infinite-dimensional normed spaces. It is nevertheless interesting to start with
the finite-dimensional case. Compact subsets of Rn have a simple, explicit
characterisation.

1.58 theorem (Heine-Borel). A subset of Rn is sequentially compact if and only
if it is closed and bounded.

The fact that closed, bounded sets of Rn are sequentially compact is a
consequence of the following well-known theorem, called the Bolzano-Weierstrass
theorem.

1.59 theorem (Bolzano-Weierstrass). Every bounded sequence in Rn has a
convergent subsequence.

Compactness may be rephrased in ways that do not involve sequences. In
fact, compactness is a topological property. We explain that in what follows.
Let A be a subset of a topological space X. We say that a collection
{Gα : α ∈ 𝒜} of subsets of X, indexed by a set 𝒜, is a cover of A if its union
contains A, meaning that

$$
A \subset \bigcup_{\alpha \in \mathcal{A}} G_\alpha.
$$

We stress that the number of sets in the cover is not required to be countable.
Indeed, the set of indices 𝒜 may have arbitrary cardinality. If every Gα in
the cover is open, then we say that {Gα} is an open cover of A.
Let X be a metric space and let ε > 0. A subset {xα : α ∈ 𝒜} of X is called
an ε-net of the subset A if the family of open balls {Bε(xα) : α ∈ 𝒜} is an
open cover of A. If the set {xα} is finite, then we say that {xα} is a finite
ε-net of A.

1.60 definition (Total boundedness). A subset of a metric space is totally
bounded if it has a finite ε-net for every ε > 0.

That is, a subset A of a metric space X is totally bounded if for every ε > 0
there is a finite set of points {x1, x2, . . . , xn} in X such that
$A \subset \bigcup_{i=1}^{n} B_\varepsilon(x_i)$.
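
As an illustration of total boundedness, the following Python sketch builds a finite ε-net of the interval [0, 1] in R with the usual distance. The construction (an equally spaced grid of step ε, with the last point clipped to 1) and the names `grid_net` and `is_covered` are illustrative choices; any point of [0, 1] then lies at distance at most ε/2 from the grid.

```python
from fractions import Fraction
import math

def grid_net(eps):
    """Finite eps-net of [0, 1]: the points 0, eps, 2*eps, ..., clipped at 1."""
    eps = Fraction(eps)
    m = math.ceil(1 / eps)
    return [min(k * eps, Fraction(1)) for k in range(m + 1)]

def is_covered(x, net, eps):
    """True if x lies in the open ball of radius eps around some point of the net."""
    return any(abs(x - p) < eps for p in net)

eps = Fraction(1, 10)
net = grid_net(eps)
print(len(net))   # 11 points: 0, 1/10, ..., 1

# Sample check on the points k/1000 of [0, 1] (a finite test, not a proof).
sample_ok = all(is_covered(Fraction(k, 1000), net, eps) for k in range(1001))
print(sample_ok)
```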

We say that a cover {Gα} of A has a finite subcover if there is a finite
subcollection of sets {Gα1, . . . , Gαn} such that
$A \subset \bigcup_{i=1}^{n} G_{\alpha_i}$.

1.61 definition (Compactness). A subset K of a metric space X is compact if
every open cover of K has a finite subcover.

1.62 example. The space of real numbers R is not compact, since the open
cover {(n − 1, n + 1) : n ∈ Z} of R has no finite subcover. The half-open
interval (0, 1] is not compact, since the open cover {(1/2n, 2/n) : n ∈ N}
has no finite subcover. If this open cover is extended to an open cover of
[0, 1], then the extension must contain an open neighborhood of 0. This open
neighborhood, together with a finite number of sets from the previous cover
of (0, 1], is a finite subcover of [0, 1], which is not surprising, since [0, 1] is
indeed compact.
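
The claim about (0, 1] can be made concrete as follows. Given any finite subfamily of the cover {(1/2n, 2/n) : n ∈ N}, the smallest left endpoint 1/(2N) is a point of (0, 1] that none of the chosen intervals contains. The Python sketch below computes this point for a sample subfamily; the function names are illustrative.

```python
from fractions import Fraction

def G(n):
    """Endpoints of the open interval (1/(2n), 2/n)."""
    return Fraction(1, 2 * n), Fraction(2, n)

def contains(interval, x):
    """True if x lies in the given open interval."""
    a, b = interval
    return a < x < b

def uncovered_point(indices):
    """A point of (0, 1] missed by the finite subfamily {G(n) : n in indices}."""
    return min(G(n)[0] for n in indices)

indices = range(1, 50)
x = uncovered_point(indices)
missed = not any(contains(G(n), x) for n in indices)
print(x, missed)                  # 1/98 is missed by G(1), ..., G(49)
print(contains(G(2 * 49), x))     # but it is covered by G(98)
```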

1.63 exercise. Prove that every totally bounded subset of a metric space is
bounded.

The next theorem is of paramount importance in that it establishes three
equivalent formulations for compactness.

1.64 theorem. Let ( X, d) be a metric space, let K ⊂ X. Then, the following three
conditions are equivalent:

• K is compact

• K is sequentially compact

• K is totally bounded and complete

Proof. Omitted

1.65 lemma. Let (X, d) be a metric space and let A ⊂ X. Then A is dense in X if
and only if for all x ∈ X and for all ε > 0 there exists a ∈ A such that
d(x, a) < ε.

Proof. Assume A is dense, and let x ∈ X. Assume by contradiction that there
exists ε > 0 such that no a ∈ X with d(x, a) < ε belongs to A. As a
consequence Bε(x) ⊂ X \ A, which implies that no sequence in A can converge to
x (otherwise all but finitely many elements of the sequence would be in Bε(x),
a contradiction). By (6), x ∉ Ā, which contradicts Ā = X. Assume now that for
all x ∈ X and for all ε > 0 there exists a ∈ A such that d(x, a) < ε. We aim at
proving that A is dense, that is Ā = X, which is equivalent to X ⊂ Ā. Let
x ∈ X. For each n ∈ N, our hypothesis with ε = 1/n gives a point an ∈ A with
d(x, an) < 1/n. Hence, (an) is a sequence in A that converges to x, which means
that x ∈ Ā by (6).

The above Lemma may be used to prove the following Lemma.

1.66 lemma. A sequentially compact metric space is separable.

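Proof. By Theorem 1.64, a sequentially compact space K is totally bounded, so
for every n ∈ N there is a finite (1/n)-net An of K. Let
$A = \bigcup_{n=1}^{+\infty} A_n$. Then A is countable⁵. Moreover, A is dense
in K by Lemma 1.65: given x ∈ K and ε > 0, choose n with 1/n < ε; since An is
a (1/n)-net, there is a ∈ An ⊂ A with d(x, a) < 1/n < ε.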

1.67 lemma. Every compact subset K of a metric space X is closed and bounded.

Proof. We first prove that K is closed, using Proposition 1.52. Let (xn) be a
sequence in K with limit x ∈ X. K is compact, and hence sequentially compact
by Theorem 1.64, so (xn) has a subsequence converging to some point of K. Since
every subsequence of (xn) converges to x, uniqueness of the limit gives x ∈ K.
Let us now assume by contradiction that K is not bounded, and fix x1 ∈ K. Since
K is not bounded, for every n ∈ N there exists yn ∈ K such that d(x1, yn) > n.
The family of open balls {Bn(x1) : n ∈ N} covers K, because every point of K
has finite distance from x1. However, any finite subfamily is contained in a
single ball Bm(x1), which does not contain ym. Hence the cover has no finite
subcover, which contradicts compactness.

In the future, we will abbreviate ‘sequentially compact’ to ‘compact’ when
referring to metric spaces. The following terminology is often convenient.

1.68 definition. A subset A of a metric space X is precompact if its closure in
X is compact.

The term relatively compact is frequently used instead of ‘precompact’. This
definition means that A is precompact if every sequence in A has a convergent
subsequence. The limit of the subsequence can be any point in X, and is not
required to belong to A. Since compact sets are closed, a set is compact if and
only if it is closed and precompact. A subset of a complete metric space is
precompact if and only if it is totally bounded.

1.69 example. A subset of Rn is precompact if and only if it is bounded.

Continuous functions on compact sets have several nice properties. From
Proposition 1.26, continuous functions preserve the convergence of sequences.
It follows immediately from Definition 1.56 that continuous functions preserve
compactness.

1.70 theorem. Let f : K → Y be continuous on K, where K is a compact metric
space and Y is any metric space. Then f(K) is compact in Y.

Since compact sets are bounded, continuous functions on a compact set
are bounded. Moreover, continuous functions on compact sets are uniformly
continuous.

1.71 theorem. Let f : K → Y be a continuous function on a compact set K. Then f
is uniformly continuous.

Proof. Suppose that f is not uniformly continuous. Then there is an ε > 0 such
that for all δ > 0 there are x, y ∈ K with d(x, y) < δ and d(f(x), f(y)) ≥ ε.
Taking δ = 1/n for n ∈ N, we find that there are sequences (xn) and (yn) in
K such that

$$
d(x_n, y_n) < \frac{1}{n}, \qquad d(f(x_n), f(y_n)) \ge \varepsilon. \tag{8}
$$

Since K is compact, (xn) has a convergent subsequence with limit x ∈ K; for
simplicity, we again denote it by (xn), and we denote by (yn) the corresponding
subsequence of (yn). From (8), (yn) converges to the same limit x. By the
continuity of f at x, both (f(xn)) and (f(yn)) converge to f(x), so that
d(f(xn), f(yn)) → 0, which contradicts (8).

⁵ Exercise: prove that the countable union of finite sets is countable.

As highlighted in fact 0.6 and example 0.7, maximum and minimum prob-
lems are of central importance in applications. The mathematical formulation
of these problems is the maximisation or minimisation of a real-valued func-
tion f on a state space X. Each point of the state space, which is often a metric
space, represents a possible state of the system. The existence of a maximis-
ing, or minimising, point of f in X may not be at all clear; indeed, such a
point may not exist. The following theorem gives sufficient conditions for the
existence of maximising or minimising points - namely, that the function f is
continuous and the state space X is compact. Although these conditions are
fundamental, they are too strong to be useful in many applications. We will
return to these issues later on.

1.72 theorem. Let K be a compact metric space and f : K → R a continuous
real-valued function. Then, f is bounded on K and attains its maximum and
minimum. That is, there are points x, y ∈ K such that

\[ f(x) = \inf_{z \in K} f(z), \qquad f(y) = \sup_{z \in K} f(z). \]

Proof. From theorem 1.70, the image f (K ) is a compact subset of R, and there-
fore f is bounded by the Heine-Borel theorem 1.58. It is enough to prove that
f attains its minimum, because the application of the result to − f implies that
f attains its maximum. Since f is bounded, it is bounded from below, and the
infimum m of f on K is finite. By the definition of the infimum, for each n ∈ N
there is an xn ∈ K such that

\[ m \le f(x_n) < m + \frac{1}{n}. \]

This inequality implies that

\[ \lim_{n \to +\infty} f(x_n) = m. \tag{9} \]

The sequence $(x_n)$ need not converge, but since K is compact the sequence
has a convergent subsequence, which we denote by $(x_{n_k})$. We denote the limit
of the subsequence by x. Then, since f is continuous, we have from (9) that

\[ f(x) = \lim_{k \to +\infty} f(x_{n_k}) = m. \]

Therefore, f attains its infimum m at x.


The strategy of this proof is typical of many compactness arguments. We


construct a sequence of approximate solutions of our problem, in this case
a minimising sequence ( xn ) that satisfies (9). We use compactness to extract
a convergent subsequence, and show that the limit of the convergent subse-
quence is a solution of our problem, in this case a point where f attains its
infimum.
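The strategy can be illustrated numerically. The following is only a sketch under assumptions that are not part of the notes: the function f(x) = (x² − 1)² on K = [−2, 2] and the explicit minimising sequence are hypothetical choices made for illustration. The sequence oscillates and does not converge, but a subsequence converges to a point where the infimum m = 0 is attained.

```python
def f(x):
    # Continuous on the compact interval K = [-2, 2]; the infimum m = 0
    # is attained at both x = -1 and x = 1 (illustrative choice).
    return (x * x - 1.0) ** 2


def minimising_sequence(n_terms):
    # x_n = (-1)^n * (1 + 1/(4n)) lies in K and satisfies f(x_n) < 1/n,
    # but it oscillates between neighbourhoods of -1 and of 1.
    return [(-1) ** n * (1.0 + 1.0 / (4 * n)) for n in range(1, n_terms + 1)]


xs = minimising_sequence(2000)
values = [f(x) for x in xs]

# The even-indexed terms x_2, x_4, ... form a convergent subsequence.
even_subsequence = xs[1::2]
approx_limit = even_subsequence[-1]

print("spread of the full sequence:", max(xs) - min(xs))
print("last term of the subsequence:", approx_limit)
print("value of f at that term:", f(approx_limit))
```

The full sequence has two accumulation points, so compactness is used exactly where the proof uses it: to pass to a subsequence before taking the limit.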

1.7 Finite-dimensional Banach spaces


In this subsection, we prove that in every finite-dimensional (real or complex)
normed linear space all norms are equivalent. This statement is not true for
infinite-dimensional linear spaces. As a result, topological considerations can
often be neglected when dealing with finite-dimensional spaces but are of
crucial importance on infinite-dimensional spaces.

1.73 definition. Let X be a linear space. Two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on X are
equivalent if there are constants c > 0 and C > 0 such that

\[ c\,\|x\|_1 \le \|x\|_2 \le C\,\|x\|_1 \quad \text{for all } x \in X. \]

It is clear that if two norms are equivalent, then the two normed spaces
$(X, \|\cdot\|_1)$ and $(X, \|\cdot\|_2)$ have the same topology, i.e. they have the same
convergent sequences (exercise).
Geometrically, two norms are equivalent if the unit ball of either one of the
norms is contained in a ball of finite radius of the other norm.
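As a concrete instance of the definition, the three standard norms on R^n satisfy the chain ‖x‖∞ ≤ ‖x‖₂ ≤ ‖x‖₁ ≤ n‖x‖∞. The sketch below checks this chain on random vectors; the helper names and the tolerance are illustrative assumptions, and a numerical check of course proves nothing by itself.

```python
import math
import random


def norm_1(x):
    return sum(abs(t) for t in x)


def norm_2(x):
    return math.sqrt(sum(t * t for t in x))


def norm_inf(x):
    return max(abs(t) for t in x)


def check_equivalence(x, tol=1e-12):
    # Verifies |x|_inf <= |x|_2 <= |x|_1 <= n |x|_inf up to rounding.
    n = len(x)
    a, b, c = norm_inf(x), norm_2(x), norm_1(x)
    return a <= b + tol and b <= c + tol and c <= n * a + tol


rng = random.Random(0)
samples = [[rng.uniform(-10.0, 10.0) for _ in range(5)] for _ in range(1000)]
print(all(check_equivalence(x) for x in samples))
```

Here the constants in definition 1.73 can be read off directly: for the pair (‖·‖∞, ‖·‖₁) one may take c = 1 and C = n, and C necessarily grows with the dimension.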

1.74 lemma. Let X be a finite-dimensional normed linear space with norm $\|\cdot\|$, and
$\{e_1, \dots, e_n\}$ any basis of X. There are constants m > 0 and M > 0 such that if
$x = x_1 e_1 + x_2 e_2 + \dots + x_n e_n$, then

\[ m \sum_{i=1}^{n} |x_i| \le \|x\| \le M \sum_{i=1}^{n} |x_i|. \tag{10} \]
Proof. It suffices to prove the assertion for x ∈ X such that
$\|x\|_1 := \sum_{i=1}^{n} |x_i| = 1$. Indeed, for a general x ∈ X with x ≠ 0, let
$\tilde{x} = x / \|x\|_1$; we would then have $m \le \|\tilde{x}\| \le M$,
i.e. $m\|x\|_1 \le \|x\| \le M\|x\|_1$. Now, the set

\[ C = \Big\{ (x_1, \dots, x_n) \in \mathbb{R}^n : \sum_{i=1}^{n} |x_i| = 1 \Big\} \]

is a closed, bounded subset of $\mathbb{R}^n$, and is therefore compact by the Heine-Borel
theorem. We define a function f : C → X by

\[ f((x_1, \dots, x_n)) = \sum_{i=1}^{n} x_i e_i. \]

For $(x_1, \dots, x_n), (y_1, \dots, y_n) \in \mathbb{R}^n$, we have

\[ \| f((x_1, \dots, x_n)) - f((y_1, \dots, y_n)) \| \le \sum_{i=1}^{n} |x_i - y_i| \, \|e_i\| \le B \sum_{i=1}^{n} |x_i - y_i|, \]

where $B = \max_i \|e_i\|$, therefore f is continuous. Since $\|\cdot\| : X \to \mathbb{R}$ is
continuous, the map

\[ C \ni (x_1, \dots, x_n) \mapsto \| f((x_1, \dots, x_n)) \| \in \mathbb{R} \]

is continuous. Theorem 1.72 implies that $\|f\|$ is bounded on C and attains its
infimum and supremum. Denoting the minimum by m ≥ 0 and the maximum
by M ≥ m, we obtain the assertion except that we still have to prove that
m > 0. Assume by contradiction that m = 0. This means that there exists
$x_{\min} \in C$ such that $f(x_{\min}) = 0$. Since $e_1, \dots, e_n$ are linearly independent,
this implies $x_{\min} = 0$, a contradiction because 0 ∉ C.

1.75 theorem. Every finite-dimensional normed linear space is a Banach space.

Proof. Suppose $(x_k)$ is a Cauchy sequence in a finite-dimensional normed
linear space X. Let $\{e_1, \dots, e_n\}$ be a basis of X. We expand $x_k$ as

\[ x_k = \sum_{i=1}^{n} x_{k,i} \, e_i, \]

where $x_{k,i} \in \mathbb{R}$. For 1 ≤ i ≤ n, we consider the real sequence of i-th
components, $(x_{k,i})_{k=1}^{+\infty}$. Equation (10) implies that

\[ |x_{k,i} - x_{h,i}| \le \frac{1}{m} \|x_k - x_h\|, \]

so $(x_{k,i})_{k=1}^{+\infty}$ is a Cauchy sequence in R. Since R is complete, there is a
$y_i \in \mathbb{R}$ such that

\[ \lim_{k \to +\infty} x_{k,i} = y_i. \]

We define y ∈ X by

\[ y = \sum_{i=1}^{n} y_i e_i. \]

Then, from (10),

\[ \|x_k - y\| \le M \sum_{i=1}^{n} |x_{k,i} - y_i|, \]

and hence $x_k \to y$ as k → +∞. Thus, every Cauchy sequence in X converges,
and X is complete.

As a consequence, we have the following corollary.

1.76 corollary. Every finite-dimensional linear subspace of a normed linear space


is closed.

Finally, we show that although there are many different norms on a
finite-dimensional linear space, they all lead to the same topology and the same
notion of convergence.

1.77 theorem. Any two norms on a finite-dimensional space are equivalent.

Proof. Let $\|\cdot\|_1$ and $\|\cdot\|_2$ be two norms on a finite-dimensional space X. We
choose a basis $\{e_1, \dots, e_n\}$ of X. Then lemma 1.74 implies that there are strictly
positive constants $m_1, m_2, M_1, M_2$ such that if $x = \sum_{i=1}^{n} x_i e_i$ then

\[ m_1 \sum_{i=1}^{n} |x_i| \le \|x\|_1 \le M_1 \sum_{i=1}^{n} |x_i|, \]
\[ m_2 \sum_{i=1}^{n} |x_i| \le \|x\|_2 \le M_2 \sum_{i=1}^{n} |x_i|. \]

Hence, we have

\[ \frac{m_2}{M_1} \|x\|_1 \le \|x\|_2 \le \frac{M_2}{m_1} \|x\|_1. \]

1.8 ℓ^p spaces
In this subsection we introduce the simplest examples of infinite-dimensional
normed spaces. Intuitively, what we will introduce here is an infinite-dimensional
version of the linear space $\mathbb{R}^d$, namely a set of vectors with infinitely
many components.

1.78 definition ($\ell^p$ spaces). Let p ∈ [1, +∞) be a real number. We say that a
sequence $x = \{x_k\}_{k \in \mathbb{N}}$ of real numbers is in $\ell^p$ if

\[ \sum_{k=1}^{+\infty} |x_k|^p < +\infty. \]

The space of real sequences with the above property is called $\ell^p$. For all $x \in \ell^p$,
the quantity

\[ \|x\|_{\ell^p} := \left[ \sum_{k=1}^{+\infty} |x_k|^p \right]^{1/p} \]

is called the $\ell^p$-norm of x. The space $\ell^\infty$ is the space of bounded sequences,
i.e. of sequences such that

\[ \sup_{k \in \mathbb{N}} |x_k| < +\infty. \]

For all $x \in \ell^\infty$, the quantity

\[ \|x\|_{\ell^\infty} := \sup_{k \in \mathbb{N}} |x_k| \]

is called the $\ell^\infty$-norm of x.

For an exponent p ∈ [1, +∞] we define the conjugate exponent of p as the
exponent p′ ∈ [1, +∞] such that

\[ \frac{1}{p} + \frac{1}{p'} = 1, \]

or equivalently

\[ p' = \frac{p}{p-1}, \]

with the convention that the conjugate of 1 is +∞ and vice versa.
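The definition, including the convention for the endpoints, can be written as a small function. This is only an illustrative sketch; the name `conjugate_exponent` is hypothetical and the endpoint +∞ is represented by `math.inf`.

```python
import math


def conjugate_exponent(p):
    # Returns p' with 1/p + 1/p' = 1, using the convention that the
    # conjugate of 1 is +infinity and the conjugate of +infinity is 1.
    if p < 1:
        raise ValueError("the exponent must satisfy p >= 1")
    if p == 1:
        return math.inf
    if math.isinf(p):
        return 1.0
    return p / (p - 1.0)


for p in (1, 1.5, 2, 3, math.inf):
    print(p, conjugate_exponent(p))
```

Note that p = 2 is the only exponent equal to its own conjugate, and that conjugation is an involution: applying it twice returns the original exponent.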


The space $\ell^p$ can be seen as a subset of the vector space of all sequences of
real numbers, with the obvious operations

• $x = \{x_k\}_k \in \ell^p$, $y = \{y_k\}_k \in \ell^p$, $x + y = \{x_k + y_k\}_k$

• $x = \{x_k\}_k \in \ell^p$, λ ∈ R, $\lambda x = \{\lambda x_k\}_k$.

Proving that the sum of two vectors is well defined, as well as proving
that $\|x\|_{\ell^p}$ is an actual norm, is not immediate. The first two properties of
a norm (i.e. $\|x\|_{\ell^p} = 0$ implies x = 0, and $\|\lambda x\|_{\ell^p} = |\lambda| \|x\|_{\ell^p}$) are trivial. For
the third one, i.e. the triangle inequality, we have to struggle a bit more.
Once we have that, the sum of vectors will also be well defined, and we
shall have a nice family of normed spaces to work with.

1.79 exercise (Young’s inequality). Let p ∈ (1, +∞) and let p′ be its conjugate.
Let a, b ≥ 0 be two nonnegative numbers. Then,

\[ ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'}. \]

Solution. If ab = 0 there is nothing to prove. Assume a, b > 0. Set $A = a^p$
and $B = b^{p'}$. We need to prove that

\[ A^{1/p} B^{1/p'} \le \frac{A}{p} + \frac{B}{p'}. \]

Multiplication by 1/B makes the above inequality equivalent to

\[ \frac{1}{p} \frac{A}{B} + \frac{1}{p'} \ge \left( \frac{A}{B} \right)^{1/p}, \]

where we have used $\frac{1}{p'} - 1 = -\frac{1}{p}$. Now, set t = A/B > 0. The above becomes
equivalent to proving that

\[ \frac{1}{p} t + \frac{1}{p'} \ge t^{1/p} \quad \text{for all } t > 0. \]

But the function $\varphi(t) = \frac{1}{p} t + \frac{1}{p'} - t^{1/p}$ satisfies φ(1) = 0 and

\[ \varphi'(t) = \frac{1}{p} - \frac{1}{p} t^{\frac{1}{p} - 1}, \]

which is ≤ 0 for 0 < t ≤ 1 and ≥ 0 for t ≥ 1. Therefore φ attains its minimum
at t = 1, so φ(t) ≥ 0 for all t > 0, which proves the assertion.

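1.80 exercise (Discrete Hölder’s inequality). Let $x = \{x_k\}_k \in \ell^p$ and
$y = \{y_k\}_k \in \ell^{p'}$ with p, p′ ∈ (1, +∞) conjugate numbers. Prove that

\[ \sum_{k=1}^{+\infty} |x_k| |y_k| \le \|x\|_{\ell^p} \|y\|_{\ell^{p'}}. \]

Solution. If x = 0 or y = 0 there is nothing to prove. Otherwise, set

\[ X_k = \frac{x_k}{\|x\|_{\ell^p}}, \qquad Y_k = \frac{y_k}{\|y\|_{\ell^{p'}}}, \qquad \text{for all } k \ge 1. \]

We need to prove that $\sum_{k=1}^{+\infty} |X_k| |Y_k| \le 1$. From Young’s inequality,
$|X_k| |Y_k| \le \frac{|X_k|^p}{p} + \frac{|Y_k|^{p'}}{p'}$ for all k ≥ 1, and taking the sum over k we get

\[ \sum_{k=1}^{+\infty} |X_k| |Y_k| \le \frac{1}{p} \sum_{k=1}^{+\infty} |X_k|^p + \frac{1}{p'} \sum_{k=1}^{+\infty} |Y_k|^{p'} = \frac{1}{p} + \frac{1}{p'} = 1. \]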

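We are now ready to prove the triangle inequality on $\ell^p$.

1.81 exercise (Discrete Minkowski’s inequality). Let p ∈ [1, +∞] and $x, y \in \ell^p$.
Prove that $x + y \in \ell^p$ and $\|x + y\|_{\ell^p} \le \|x\|_{\ell^p} + \|y\|_{\ell^p}$.

Solution. For 1 < p < +∞, compute

\[ \sum_{k \ge 1} |x_k + y_k|^p = \sum_{k \ge 1} |x_k + y_k| \, |x_k + y_k|^{p-1} \le \sum_{k \ge 1} |x_k| \, |x_k + y_k|^{p-1} + \sum_{k \ge 1} |y_k| \, |x_k + y_k|^{p-1}, \]

where we have used the obvious inequality $|x_k + y_k| \le |x_k| + |y_k|$. Now, since
$p' = \frac{p}{p-1}$ is conjugate of p, so that (p − 1)p′ = p, the above discrete
Hölder’s inequality implies

\[ \sum_{k \ge 1} |x_k + y_k|^p \le \left( \sum_{k \ge 1} |x_k|^p \right)^{1/p} \left( \sum_{k \ge 1} |x_k + y_k|^p \right)^{\frac{p-1}{p}} + \left( \sum_{k \ge 1} |y_k|^p \right)^{1/p} \left( \sum_{k \ge 1} |x_k + y_k|^p \right)^{\frac{p-1}{p}}. \]

The sum $\sum_{k \ge 1} |x_k + y_k|^p$ is finite because $|x_k + y_k|^p \le 2^{p-1} (|x_k|^p + |y_k|^p)$,
and we may assume that it is nonzero. Dividing by the common factor yields

\[ \left( \sum_{k \ge 1} |x_k + y_k|^p \right)^{1 - \frac{p-1}{p}} \le \left( \sum_{k \ge 1} |x_k|^p \right)^{1/p} + \left( \sum_{k \ge 1} |y_k|^p \right)^{1/p}, \]

which proves the assertion, since $1 - \frac{p-1}{p} = \frac{1}{p}$. The case p = 1 follows
directly from $|x_k + y_k| \le |x_k| + |y_k|$, and the case p = +∞ is a trivial exercise.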
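Both inequalities can be sanity-checked on finitely supported sequences, which belong to every ℓ^p. The sketch below is an illustration only: the helper names, the random test vectors and the tolerance are assumptions, not part of the notes. It also checks the equality case of Hölder's inequality, which occurs for instance when |y_k| = |x_k|^{p−1}.

```python
import random


def lp_norm(x, p):
    # l^p norm of a finitely supported sequence, for finite p >= 1.
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def holder_gap(x, y, p):
    # Right-hand side minus left-hand side of Hoelder's inequality (p > 1).
    q = p / (p - 1.0)
    lhs = sum(abs(a) * abs(b) for a, b in zip(x, y))
    return lp_norm(x, p) * lp_norm(y, q) - lhs


def minkowski_gap(x, y, p):
    # Right-hand side minus left-hand side of Minkowski's inequality.
    s = [a + b for a, b in zip(x, y)]
    return lp_norm(x, p) + lp_norm(y, p) - lp_norm(s, p)


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(50)]
y = [rng.uniform(-1.0, 1.0) for _ in range(50)]

for p in (1.5, 2.0, 3.0):
    extremal = [abs(t) ** (p - 1.0) for t in x]
    print(p, holder_gap(x, y, p), minkowski_gap(x, y, p), holder_gap(x, extremal, p))
```

Both gaps should be nonnegative, and the last printed value should vanish up to rounding, which shows that the constant 1 in Hölder's inequality cannot be improved.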

1.82 exercise (Completeness of the $\ell^p$ spaces). Let p ∈ [1, +∞]. Let
$x_n = \{x_{n,k}\}_k$ be a Cauchy sequence in $\ell^p$. Then, $x_n \to x$ as n → +∞ in $\|\cdot\|_{\ell^p}$ for
some $x \in \ell^p$.

Solution.
Let us first consider the case p = 1. Since

\[ \sum_{k \ge 1} |x_{n,k} - x_{m,k}| \to 0 \quad \text{as } n, m \to +\infty, \]

for all k ≥ 1 we have that $\{x_{n,k}\}_n$ is a Cauchy sequence in R, and hence there
exists some $x_k \in \mathbb{R}$ such that $x_{n,k} \to x_k$ as n → +∞. We need to prove that
$x = \{x_k\} \in \ell^1$ and that $\|x_n - x\|_{\ell^1} \to 0$ as n → +∞. Let ε > 0. The Cauchy
condition on the sequence $x_n$ reads

\[ \sum_{k=1}^{+\infty} |x_{n,k} - x_{m,k}| < \varepsilon \quad \text{for } m \ge n \ge N_\varepsilon, \]

for some $N_\varepsilon \in \mathbb{N}$. Now let K ∈ N; from above we have

\[ \sum_{k=1}^{K} |x_{n,k} - x_{m,k}| < \varepsilon \quad \text{for } m \ge n \ge N_\varepsilon, \]

and since $x_{m,k} \to x_k$ as m → +∞ (and the above sum has finitely many terms),
we have

\[ \sum_{k=1}^{K} |x_{n,k} - x_k| \le \varepsilon \quad \text{for } n \ge N_\varepsilon. \]

Since $N_\varepsilon$ does not depend on K, we can take the supremum with respect to K
above and get

\[ \sum_{k \ge 1} |x_{n,k} - x_k| = \sup_{K \in \mathbb{N}} \sum_{k=1}^{K} |x_{n,k} - x_k| \le \varepsilon \quad \text{for } n \ge N_\varepsilon, \]

which shows that $\|x_n - x\|_{\ell^1} \to 0$ as n → +∞. By the triangle inequality, we
then get

\[ \|x\|_{\ell^1} \le \|x - x_{N_\varepsilon}\|_{\ell^1} + \|x_{N_\varepsilon}\|_{\ell^1} \le \varepsilon + \|x_{N_\varepsilon}\|_{\ell^1}, \]

and the last term above is finite.

Now, let p ∈ (1, +∞). Assume $\{x_n\}_n \subset \ell^p$ is a Cauchy sequence. As above,
we can easily show that there exists a sequence $\{x_k\}_k$ of real numbers
such that $x_{n,k} \to x_k$ as n → +∞ for all k ≥ 1. To prove that $x = \{x_k\}_k \in \ell^p$, for
a given ε > 0 let $N_\varepsilon \in \mathbb{N}$ be such that

\[ \sum_{k \ge 1} |x_{n,k} - x_{m,k}|^p < \varepsilon \quad \text{for } m \ge n \ge N_\varepsilon. \tag{11} \]

As before, for all K ∈ N we have

\[ \sum_{k=1}^{K} |x_{n,k} - x_{m,k}|^p < \varepsilon \quad \text{for } m \ge n \ge N_\varepsilon, \]

and the assertion $\|x_n - x\|_{\ell^p} \to 0$ follows similarly as in the case p = 1. The
triangle inequality then proves once again that $x \in \ell^p$.
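Finally, let us consider the case p = +∞. Assume $\{x_n\}_n \subset \ell^\infty$ is a Cauchy
sequence. For a given ε > 0, there exists $N_\varepsilon \in \mathbb{N}$ such that, for all $n, m \ge N_\varepsilon$
one has

\[ \sup_{k \in \mathbb{N}} |x_{n,k} - x_{m,k}| < \varepsilon. \]

This implies that, for every k ≥ 1, the sequence $\{x_{n,k}\}_n$ is a Cauchy sequence
in R and hence converges to some $x_k \in \mathbb{R}$. Let $x = \{x_k\}_k$. We can send
m → +∞ above and get $|x_{n,k} - x_k| \le \varepsilon$ for all $n \ge N_\varepsilon$. This holds for all
k ≥ 1, hence $\|x_n - x\|_{\ell^\infty} \le \varepsilon$ for all $n \ge N_\varepsilon$. Moreover,
$\|x\|_{\ell^\infty} \le \|x_{N_\varepsilon} - x\|_{\ell^\infty} + \|x_{N_\varepsilon}\|_{\ell^\infty}$, which proves that $x \in \ell^\infty$.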

1.9 Exercises

1. Let $(X, \|\cdot\|)$ be a normed linear space on R (or C), and let d be the
induced metric on X, i.e. $d(x, y) = \|x - y\|$. Prove that

• d is translation invariant, i.e. d(x + z, y + z) = d(x, y) for all x, y, z ∈ X,

• d is 1-homogeneous, i.e. d(λx, λy) = |λ| d(x, y) for all x, y ∈ X and
λ ∈ R (or C).

2. Let $(X, \|\cdot\|)$ be a normed linear space. For x, y ∈ X define

\[ d(x, y) := \frac{\|x - y\|}{1 + \|x - y\|}. \]

• Prove that (X, d) is a metric space.

• Prove that, for all x, y, z ∈ X, one has the translation invariance
property

\[ d(x + z, y + z) = d(x, y). \]


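3. Prove that the map $d : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined by

\[ d((x_1, y_1), (x_2, y_2)) = \begin{cases} 1 & \text{if } \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \ge 1 \\ \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} & \text{if } \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} < 1 \end{cases} \]

is a metric on $\mathbb{R}^2$.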

4. Let the sequence of rational numbers $(x_n)$ be defined recursively via the
formula

\[ x_{n+1} = \frac{x_n^2 + 2}{2 x_n}, \quad n = 1, 2, 3, \dots, \qquad x_1 = 2. \]

(a) Prove that $x_n \ge \sqrt{2}$ for all n ≥ 1.
(b) Prove that $x_{n+1} \le x_n$ for all n ≥ 1, i.e. the sequence is non-increasing in
n (Hint: use (a)).
(c) Prove that $(x_n)$ is convergent in R (Hint: use some basic theory of
sequences: monotone sequences, boundedness, and so on).
(d) Prove that $(x_n)$ is a Cauchy sequence (Hint: estimate directly the
difference $x_n - x_{n+1}$).
(e) Prove that the limit of $(x_n)$ is irrational.
(f) Use the above to prove that not all Cauchy sequences of rational
numbers are convergent in Q.

5. Let X be a set, and let d be the distance

\[ d(x, y) = \begin{cases} 1 & \text{if } x \ne y \\ 0 & \text{if } x = y. \end{cases} \]

Let $(x_n)$ be a convergent sequence in the metric space (X, d). Prove that
there exists N ∈ N such that $x_n = x_m$ for all n, m ≥ N, i.e. prove that the
sequence is eventually constant.

6. Let A be a subset of a metric space ( X, d). Prove that A is bounded (i.e.


diam A is finite) if and only if there exists x0 ∈ X and M ∈ R such that
d( x0 , x ) ≤ M for all x ∈ A.

7. Let s : R → R be defined by $s(x) = x^2$. Prove that s is continuous on R but
not uniformly continuous. Prove that $s|_{[a,b]}$ is uniformly continuous for all
a, b ∈ R, a < b.

8. Prove that an affine function f : Rn → Rm can be written as f ( x ) =


Ax + b, where A is a constant m × n matrix and b is a constant m-vector.

9. Prove that an affine function f : Rn → Rm is uniformly continuous.


10. Suppose that $(X, d_X)$ and $(Y, d_Y)$ are metric spaces. Prove that the Cartesian
product Z = X × Y is a metric space with the metric d defined by

\[ d(z_1, z_2) = d_X(x_1, x_2) + d_Y(y_1, y_2), \]

where $z_1 = (x_1, y_1)$ and $z_2 = (x_2, y_2)$.

11. Let X be a normed linear space. A series $\sum_{n=1}^{+\infty} x_n$ in X is absolutely
convergent if $\sum_{n=1}^{+\infty} \|x_n\|$ converges to a finite value in R. Prove that X is a
Banach space if and only if every absolutely convergent series converges.

12. Let f : R → R, with R equipped with the usual Euclidean distance. Let

\[ f(x) = \begin{cases} x & \text{if } x \le 0 \\ x + 1 & \text{if } x > 0. \end{cases} \]

Prove that f is lower semi-continuous.

13. Let ( X, d X ), (Y, dY ), and ( Z, d Z ) be metric spaces and let f : X → Y


and g : Y → Z be continuous functions. Show that the composition
h = g ◦ f : X → Z defined by h( x ) = g( f ( x )) is also continuous.

14. Suppose that F and G are closed and open subsets of $\mathbb{R}^n$, respectively,
such that F ⊂ G. Show that there is a continuous function $f : \mathbb{R}^n \to \mathbb{R}$
such that

• 0 ≤ f ≤ 1,
• f(x) = 1 for all x ∈ F,
• f(x) = 0 for all $x \in G^c$.

Hint: consider the function

\[ f(x) = \frac{d(x, G^c)}{d(x, G^c) + d(x, F)}. \]

This result is called (a special case of) Urysohn’s lemma.

15. Prove that a closed subset of a compact space is compact.

16. Let (X, d) be a complete metric space and let Y ⊂ X. Prove that (Y, d) is
complete if and only if Y is a closed subset of X.
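The recursion of exercise 4 can be explored in exact rational arithmetic before attempting the proofs. The following sketch uses Python's standard `fractions` module; the function name `iterate` is a hypothetical choice, and the output only illustrates (it does not prove) the monotonicity and the lower bound asked for in parts (a) and (b).

```python
from fractions import Fraction


def iterate(n_terms):
    # x_1 = 2, x_{n+1} = (x_n^2 + 2) / (2 x_n), computed exactly in Q.
    xs = [Fraction(2)]
    while len(xs) < n_terms:
        x = xs[-1]
        xs.append((x * x + 2) / (2 * x))
    return xs


terms = iterate(6)
for n, x in enumerate(terms, start=1):
    # x_n^2 - 2 stays nonnegative and shrinks very quickly.
    print(n, x, float(x * x - 2))
```

Every term is a rational number, yet the squares approach 2, which is the phenomenon behind part (f).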


2 Spaces of continuous functions


In section 1, we introduced the notion of normed linear space, with finite
dimensional Euclidean space Rn as the main example. In this section, we
study linear spaces of continuous functions on a compact set equipped with
the uniform norm. These function spaces are our first examples of infinite-
dimensional normed linear spaces, and we explore the concepts of conver-
gence, completeness, density, and compactness in this context. More practi-
cally, we learn for the first time how to compute distances between functions.
Functions will be treated as points in a linear space equipped with a norm.
We will focus in particular on the problem of compactness of sets of functions.
As an application, we prove an existence result for initial value problems for
ordinary differential equations.

2.1 Convergence of function sequences


Suppose that ( f n ) is a sequence of real-valued functions f n : X → R defined
on a metric space X. What should we mean by f n → f ? Two natural ways to
answer this question are the following:
• The functions $f_n$ are defined by their real values $f_n(x) \in \mathbb{R}$ with x varying
in X. So the sequence of functions converges if the values $f_n(x)$
converge. That is, we say $f_n \to f$ if $f_n(x) \to f(x)$ for all x ∈ X. This
definition reduces the convergence of a function sequence to the convergence
of real numbers, with which we are already familiar. This type
of convergence is called pointwise convergence.

• We define a suitable notion of the distance between functions, and say
that $f_n \to f$ if the distance between $f_n$ and f tends to zero. In this
approach, we regard the functions as points in a metric space, and use
metric convergence.
Both of these ideas are useful. It turns out, however, that they are not
compatible. For most domains X - for example any uncountable domain -
pointwise convergence cannot be expressed as convergence with respect to
a metric. The next example shows that pointwise convergence is not a good
notion of convergence to use for continuous functions because it does not
preserve continuity.

2.1 example. We define $f_n : [0, 1] \to \mathbb{R}$ by

\[ f_n(x) = x^n. \]

It is easily checked that the sequence $(f_n)$ converges pointwise to the function
f given by

\[ f(x) = \begin{cases} 0 & \text{if } 0 \le x < 1 \\ 1 & \text{if } x = 1. \end{cases} \]

The pointwise limit f is discontinuous at x = 1.


In view of these somewhat pathological features of pointwise convergence,
we consider metric convergence. As we will see, there are many different ways
to define a distance between functions, and different metrics or norms usually
lead to different types of convergence. A natural norm on spaces of continuous
functions is the uniform or sup norm, which is defined by

\[ \|f\|_\infty := \sup_{x \in X} |f(x)|. \tag{12} \]

The norm $\|f\|_\infty$ is finite if and only if f is bounded. The reason for the
notation will become clear when we study $L^p$ spaces later on. Two functions are
close in the metric associated with the uniform norm if their pointwise values
are uniformly close. Metric convergence with respect to the uniform norm is
called uniform convergence.

2.2 definition. A sequence of bounded, real-valued functions $(f_n)$ on a metric
space X converges uniformly to a function f if

\[ \lim_{n \to +\infty} \|f_n - f\|_\infty = 0. \]

Uniform convergence implies pointwise convergence. This fact is an easy
exercise that we leave to the student. The sequence defined in example 2.1
shows that the opposite implication does not hold, since $f_n \to f$ pointwise,
but $\|f_n - f\|_\infty = 1$ for every n, and hence $\|f_n - f\|_\infty$ does not converge to zero.
Unlike pointwise convergence, uniform convergence preserves continuity.

2.3 theorem. Let ( f n ) be a sequence of continuous, real-valued functions on a metric


space ( X, d). If f n → f uniformly, then f is continuous.

Proof. In order to show that f is continuous at x ∈ X, we need to prove that
for every ε > 0 there is a δ > 0 such that d(x, y) < δ implies |f(x) − f(y)| < ε.
By the triangle inequality, we have

\[ |f(x) - f(y)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(y)| + |f_n(y) - f(y)|. \]

Since $f_n \to f$ uniformly, there is an n ∈ N such that

\[ |f(x) - f_n(x)| < \frac{\varepsilon}{3}, \qquad |f(y) - f_n(y)| < \frac{\varepsilon}{3} \quad \text{for all } x, y \in X. \]

Since $f_n$ is continuous at x, there is a δ > 0 such that d(x, y) < δ implies that

\[ |f_n(x) - f_n(y)| < \frac{\varepsilon}{3}. \]

It follows that d(x, y) < δ implies |f(x) − f(y)| < ε, so f is continuous at
x.

The ‘ε/3-trick’ used in this proof has many other applications. The proof
fails if $f_n \to f$ pointwise but not uniformly.
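The failure of uniform convergence for the sequence of example 2.1 can also be observed numerically. The sketch below is an illustration under assumptions of our own: the helper names are hypothetical, and instead of estimating the supremum on a grid (which would miss it) it evaluates $f_n - f$ at the point $x = 2^{-1/n} \in [0, 1)$, where $f_n(x) = 1/2$ and f(x) = 0. This shows $\|f_n - f\|_\infty \ge 1/2$ for every n, which is already enough to rule out uniform convergence.

```python
def f_n(n, x):
    return x ** n


def f_limit(x):
    # Pointwise limit of x^n on [0, 1].
    return 1.0 if x == 1.0 else 0.0


def witness(n):
    # A point of [0, 1) at which f_n takes the value 1/2.
    return 0.5 ** (1.0 / n)


for n in (1, 10, 100, 1000):
    x = witness(n)
    print(n, x, abs(f_n(n, x) - f_limit(x)))
```

The witness points move towards 1 as n grows, which is exactly where the limit function jumps.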
The notion of uniform convergence of a sequence of functions can be easily
extended to series of functions. Given a sequence of functions $(f_n)$ on a metric
space X, consider the sequence of partial sums

\[ S_n(x) = \sum_{k=1}^{n} f_k(x). \]

We say that the series $\sum f_n$ converges pointwise at x ∈ X if the series of
real numbers $\sum f_n(x)$ converges. We say that $\sum f_n$ converges uniformly if the
sequence of functions $(S_n)$ converges uniformly.

2.2 Spaces of continuous functions

Let X be a metric space. We denote the set of continuous, real-valued
functions f : X → R by C(X). The set C(X) is a real linear space under the
pointwise addition of functions and the scalar multiplication of functions
by real numbers. That is, for f, g ∈ C(X) and λ ∈ R, we define

\[ (f + g)(x) = f(x) + g(x), \qquad (\lambda f)(x) = \lambda f(x). \]

From theorem 1.72, a continuous function f on a compact metric space K is
bounded, so the uniform norm $\|f\|_\infty$ is finite for f ∈ C(K). It is straightforward
to check that C(K) equipped with the uniform norm is a normed linear
space. For example, the triangle inequality holds because

\[ \|f + g\|_\infty = \sup_{x \in K} |f(x) + g(x)| \le \sup_{x \in K} |f(x)| + \sup_{x \in K} |g(x)| = \|f\|_\infty + \|g\|_\infty. \]

We will always use the uniform norm on C(K), unless we state explicitly
otherwise. A basic property of C(K) is that it is complete, and therefore a
Banach space.

2.4 theorem. Let K be a compact metric space. The space C (K ) is complete.

Proof. Let $(f_n)$ be a Cauchy sequence in C(K) with respect to the uniform
norm. We have to show that $(f_n)$ converges uniformly. We do this in two
steps. First we construct a candidate function f for the limit of the sequence,
second we show that the sequence converges uniformly to f.
The fact that $(f_n)$ is Cauchy implies that the sequence of real numbers
$(f_n(x))$ is Cauchy in R for each x ∈ K. Indeed,

\[ |f_n(x) - f_m(x)| \le \|f_n - f_m\|_\infty, \]

and the latter term is less than ε > 0 for n, m larger than some N = N(ε).
Since R is complete, the sequence of pointwise values converges, and we can
define a function f : K → R by

\[ f(x) = \lim_{n \to +\infty} f_n(x). \]

For the second step, we use the fact that $(f_n)$ is Cauchy in C(K) to prove that
it converges uniformly to f. Since $f_m(x) \to f(x)$ as m → +∞, we have

\[ \begin{aligned} \|f_n - f\|_\infty = \sup_{x \in K} |f_n(x) - f(x)| &= \sup_{x \in K} \lim_{m \to +\infty} |f_n(x) - f_m(x)| \\ &\le \liminf_{m \to +\infty} \sup_{x \in K} |f_n(x) - f_m(x)|. \end{aligned} \tag{13} \]

The last inequality is an elementary property of lim inf and sup (exercise).
The fact that $(f_n)$ is Cauchy in the uniform norm means that for all ε > 0
there is an N such that

\[ \sup_{x \in K} |f_n(x) - f_m(x)| < \varepsilon \quad \text{for all } n, m \ge N. \]

It follows from (13) that $\|f_n - f\|_\infty \le \varepsilon$ for all n ≥ N, which proves that
$\|f_n - f\|_\infty \to 0$ as n → +∞. By theorem 2.3, the limit function f is continuous,
and therefore belongs to C(K). Hence, C(K) is complete.

2.5 example. Suppose $K = \{x_1, \dots, x_n\}$ is a finite space, with metric d defined
by $d(x_i, x_j) = 1$ for i ≠ j. A function f : K → R can be identified with a point
$y = (y_1, \dots, y_n) \in \mathbb{R}^n$, where $f(x_j) = y_j$, and

\[ \|f\|_\infty = \max_{1 \le i \le n} |y_i|. \]

The space C(K) is linearly isomorphic to the finite-dimensional space $\mathbb{R}^n$ with
the maximum norm, which we have already observed is a Banach space. If
K contains infinitely many points, for example if K = [0, 1], then C(K) is an
infinite-dimensional Banach space.

The same proof applies to complex-valued functions f : K → C, and the


space of complex-valued continuous functions on a compact metric space is
also a Banach space with the uniform norm.
Equation (12) does not define a norm on C(X) when X is not compact,
since continuous functions may be unbounded. The space $C_b(X)$ of bounded
continuous functions on X is a Banach space with respect to the uniform
norm.

2.6 definition. The support, supp f, of a function f : X → R (or f : X → C)
on a metric space X is the closure of the set on which f is nonzero,

\[ \operatorname{supp} f := \overline{\{ x \in X : f(x) \ne 0 \}}. \]

We say that f has compact support if supp f is a compact subset of X, and
denote the space of continuous functions on X with compact support by $C_c(X)$.

The space $C_c(X)$ is a linear subspace of $C_b(X)$, but it need not be closed,
in which case it is not a Banach space. We denote the closure of $C_c(X)$ in
$C_b(X)$ by $C_0(X)$. Since $C_0(X)$ is a closed linear subspace of a Banach space, it
is also a Banach space. We have the following inclusions between these spaces
of continuous functions:

\[ C(X) \supset C_b(X) \supset C_0(X) \supset C_c(X). \]

If X is compact, then these spaces are equal.

2.7 example. A function $f : \mathbb{R}^d \to \mathbb{R}$ has compact support if there is an
R > 0 such that f(x) = 0 for all x with $\|x\| > R$. The space $C_0(\mathbb{R}^d)$ consists
of continuous functions that vanish at infinity, meaning that for every ε > 0
there is an R > 0 such that $\|x\| > R$ implies that |f(x)| < ε. We write this
condition as $\lim_{\|x\| \to +\infty} f(x) = 0$.

2.8 example. Consider real functions f : R → R. Then $f(x) = x^2$ is in C(R)
but not in $C_b(\mathbb{R})$. The constant function f(x) = 1 is in $C_b(\mathbb{R})$ but not in $C_0(\mathbb{R})$.
The function $f(x) = e^{-x^2}$ is in $C_0(\mathbb{R})$ but not in $C_c(\mathbb{R})$. The function

\[ f(x) = \begin{cases} 1 - x^2 & \text{if } |x| \le 1 \\ 0 & \text{if } |x| > 1 \end{cases} \]

is in $C_c(\mathbb{R})$.

A polynomial p : [a, b] → R on a closed, bounded interval [a, b] is a function
of the form

\[ p(x) = \sum_{k=0}^{n} c_k x^k, \]

where the coefficients $c_k$ are real numbers. If $c_n \ne 0$ and $c_m = 0$ for all m > n,
then the integer n ≥ 0 is called the degree of p. Clearly, polynomials are a
special case of continuous functions. The Weierstrass Approximation Theorem
states that every continuous function f : [a, b] → R can be approximated by a
polynomial with arbitrary accuracy in the uniform norm.

2.9 theorem (Weierstrass approximation). The set of polynomials on [ a, b] is


dense in C ([ a, b]).

The Weierstrass approximation theorem differs from Taylor’s theorem,
which states that a function with sufficiently many derivatives can be
approximated locally by its Taylor polynomial. The Weierstrass approximation
theorem applies to a continuous function, which need not be differentiable,
and states that there is a global polynomial approximation of the function on
the whole interval [a, b].
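A global polynomial approximation can be made concrete with Bernstein polynomials. This is a technique swapped in here purely for illustration: the notes state the theorem without proof and do not mention Bernstein polynomials. The sketch below approximates the non-differentiable function f(x) = |x − 1/2| on [0, 1]; the function names and the grid size are assumptions, and the maximum over a finite grid is only a lower estimate of the true uniform error.

```python
from math import comb


def bernstein(f, n, x):
    # n-th Bernstein polynomial of f on [0, 1], evaluated at x.
    return sum(
        f(k / n) * comb(n, k) * x ** k * (1.0 - x) ** (n - k)
        for k in range(n + 1)
    )


def sup_error(f, n, grid_size=200):
    # Maximum of |B_n f - f| over a uniform grid of [0, 1].
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)


def kink(x):
    # Continuous on [0, 1] but not differentiable at x = 1/2.
    return abs(x - 0.5)


for n in (10, 50, 200):
    print(n, sup_error(kink, n))
```

The error decreases as the degree grows, but slowly, which is consistent with the fact that the theorem gives no rate of convergence for a merely continuous function.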

2.3 Compact subsets of C (K )


The proof of the Heine-Borel theorem, that a subset of $\mathbb{R}^n$ is compact if and
only if it is closed and bounded, uses the finite-dimensionality of $\mathbb{R}^n$ in an
essential way. Compact subsets of infinite-dimensional normed spaces are also
closed and bounded, but these properties are no longer sufficient for
compactness. In this subsection, we prove the Arzelà-Ascoli theorem, which
characterises the compact subsets of C(K).

2.10 definition. Let $\mathcal{F}$ be a family of functions from a metric space $(X, d_X)$ to
a metric space $(Y, d_Y)$. The family $\mathcal{F}$ is equicontinuous if for every x ∈ X and
ε > 0 there is a δ > 0 such that $d_X(x, y) < \delta$ implies $d_Y(f(x), f(y)) < \varepsilon$ for all
$f \in \mathcal{F}$.

The crucial point in this definition is that δ does not depend on f , al-
though it may depend on x. If δ can be chosen independent of x as well,
then the family is said to be uniformly equicontinuous. The following theorem
is a generalisation of theorem 1.71. The proof is left as an exercise (see the
Exercises).

2.11 theorem. An equicontinuous family of functions from a compact metric space


to a metric space is uniformly equicontinuous.

Next, we give necessary and sufficient conditions for compactness in C (K ).

2.12 theorem (Arzelà-Ascoli). Let K be a compact metric space. A subset of C(K)
is compact if and only if it is closed, bounded, and equicontinuous.

Proof. Recall that a set is precompact if its closure is compact, and that a set is
compact if and only if it is closed and precompact.
We only prove one direction of the statement, namely that if $\mathcal{F}$ is bounded and
equicontinuous then $\mathcal{F}$ is precompact. Let $\{f_n\}_{n \in \mathbb{N}} \subset \mathcal{F}$ be a sequence in
$\mathcal{F}$. It suffices to prove that $(f_n)$ has a subsequence which is a Cauchy sequence.
This will then imply that the subsequence converges to some f ∈ C(K),
as C(K) is a complete metric space. Let ε > 0, and let δ > 0 be given by the
equicontinuity assumption; by theorem 2.11, δ can be chosen independently
of the point of K. Since K is compact, we can cover K with finitely
many balls $B_\delta(x_1), \dots, B_\delta(x_N)$ with radius δ (K is totally bounded). Hence, for
a given x ∈ K, $x \in B_\delta(x_j)$ for some j ∈ {1, . . . , N}. Hence, we can use the
equicontinuity and get

\[ \begin{aligned} |f_n(x) - f_m(x)| &\le |f_n(x) - f_n(x_j)| + |f_n(x_j) - f_m(x_j)| + |f_m(x_j) - f_m(x)| \\ &\le 2\varepsilon + |f_n(x_j) - f_m(x_j)|. \end{aligned} \]

Now, the sequence $v_n := (f_n(x_1), \dots, f_n(x_N)) \in \mathbb{R}^N$ is bounded in view of
$\mathcal{F}$ being bounded. Therefore, in view of the Heine-Borel theorem there exists a
convergent subsequence $(v_{n_k})$. This means that there exists M ∈ N such that
for all h, k ≥ M we have

\[ |f_{n_h}(x_j) - f_{n_k}(x_j)| < \varepsilon \quad \text{for all } j \in \{1, \dots, N\}. \]

Therefore, $|f_{n_h}(x) - f_{n_k}(x)| < 3\varepsilon$ for all x ∈ K. We have therefore shown that,
for a given ε > 0 we can extract a subsequence $\{f_n^{(1)}\}$ such that

\[ \|f_n^{(1)} - f_m^{(1)}\|_\infty \le 3\varepsilon \quad \text{for all } n, m \ge M_\varepsilon, \]

for a suitable $M_\varepsilon \in \mathbb{N}$. Now, fix ε = 1 and extract the subsequence $\{f_n^{(1)}\}$ as
above. Then fix ε = 1/2 and extract from $\{f_n^{(1)}\}$ another subsequence $\{f_n^{(2)}\}$,
and so on, taking ε = 1/k at the k-th step. The diagonal sequence $\{f_n^{(n)}\}$ will
satisfy

\[ \|f_n^{(n)} - f_m^{(m)}\|_\infty \le \frac{3}{k} \]

for $n, m \ge M_k$ for some suitable $M_k$ depending on k. Hence, $(f_n^{(n)})$ is a Cauchy
sequence, and the assertion follows.
2.13 example. For each n ∈ N, we define a function $f_n : [0, 1] \to \mathbb{R}$ by

\[ f_n(x) = \begin{cases} 0 & \text{if } 0 \le x \le 2^{-n} \\ 2^{n+1} (x - 2^{-n}) & \text{if } 2^{-n} \le x \le 3 \cdot 2^{-(n+1)} \\ 2^{n+1} (2^{-(n-1)} - x) & \text{if } 3 \cdot 2^{-(n+1)} \le x \le 2^{-(n-1)} \\ 0 & \text{if } 2^{-(n-1)} \le x \le 1. \end{cases} \]

These functions consist of ‘tent’ functions of height one that move from right
to left across the interval [0, 1], becoming narrower and steeper as they do so.
Let $\mathcal{F} = \{f_n : n \in \mathbb{N}\}$. Then $\|f_n\|_\infty = 1$ for all n ≥ 1, so $\mathcal{F}$ is bounded,
but $\|f_m - f_n\|_\infty = 1$ for all m ≠ n, so the sequence $(f_n)$ does not have any
convergent subsequence. Hence, the set $\mathcal{F}$ is a closed, bounded subset of C([0, 1])
which is not compact. Note that $\mathcal{F}$ is not equicontinuous either, because the
graphs of $f_n$ become steeper as n gets larger.
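The claims about the tent functions can be checked directly. The following sketch implements the piecewise formula of the example; the function names are illustrative choices. At the peak of the n-th tent every other tent vanishes, because the supports overlap only at endpoints, and this is what gives sup-distance exactly 1 between distinct members of the family.

```python
def tent(n, x):
    # Piecewise-linear tent supported on [2^-n, 2^-(n-1)] with height one.
    left = 2.0 ** (-n)
    peak = 3.0 * 2.0 ** (-(n + 1))
    right = 2.0 ** (-(n - 1))
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return 2.0 ** (n + 1) * (x - left)
    return 2.0 ** (n + 1) * (right - x)


def peak_location(n):
    return 3.0 * 2.0 ** (-(n + 1))


for n in range(1, 6):
    x = peak_location(n)
    print(n, x, [tent(m, x) for m in range(1, 6)])
```

Since all quantities involved are dyadic rationals, the floating-point evaluation at the peaks is exact.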

2.14 definition. A function f : X → R on a metric space X is Lipschitz con-


tinuous on X if there is a constant C ≥ 0 such that

| f ( x ) − f (y)| ≤ Cd( x, y) for all x, y ∈ X. (14)

We will often abbreviate the term ‘Lipschitz continuous’ to ‘Lipschitz’.

2.15 exercise. Prove that every Lipschitz continuous function is uniformly
continuous.

2.16 example. The function $f : [0, 1] \to \mathbb{R}$ defined by $f(x) = \sqrt{x}$ is uniformly
continuous, but it is not Lipschitz, because

$$\lim_{x \to 0^+} \frac{|f(x) - f(0)|}{|x - 0|} = \lim_{x \to 0^+} \frac{1}{\sqrt{x}} = +\infty.$$

If $f : X \to \mathbb{R}$ is a Lipschitz function, then we define the Lipschitz constant
$\mathrm{Lip}(f)$ of $f$ by

$$\mathrm{Lip}(f) := \sup_{x \neq y} \frac{|f(x) - f(y)|}{d(x, y)}.$$

Equivalently, Lip( f ) is the smallest constant C that works in the Lipschitz


condition (14), i.e.

Lip( f ) = inf{C : | f ( x ) − f (y)| ≤ Cd( x, y) for all x, y ∈ X }.

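Suppose that $K$ is a compact metric space and $M > 0$. We define a subset $\mathcal{F}_M$
of $C(K)$ by

$$\mathcal{F}_M = \{ f : f \text{ is Lipschitz on } K \text{ and } \mathrm{Lip}(f) \leq M \}.$$

The set $\mathcal{F}_M$ is equicontinuous, since if $\varepsilon > 0$ and $\delta = \varepsilon/M$, then

$$d(x, y) < \delta \text{ implies } |f(x) - f(y)| < \varepsilon \text{ for all } f \in \mathcal{F}_M.$$

The set $\mathcal{F}_M$ is closed, since if $(f_n)$ is a sequence in $\mathcal{F}_M$ that converges
uniformly to $f$ in $C(K)$, then

$$\begin{aligned}
\mathrm{Lip}(f) &= \sup_{x \neq y} \frac{|f(x) - f(y)|}{d(x, y)} \\
&= \sup_{x \neq y} \left[ \lim_{n \to +\infty} \frac{|f_n(x) - f_n(y)|}{d(x, y)} \right] \\
&\leq \liminf_{n \to +\infty} \left[ \sup_{x \neq y} \frac{|f_n(x) - f_n(y)|}{d(x, y)} \right] \\
&\leq M,
\end{aligned}$$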

where we have used a simple property of lim inf and sup as in a previ-
ous proof. Thus, the limit f belongs to F M . The set F M is not bounded,
since the constant functions belong to F M and their sup-norms are arbitrarily
large. Consequently, although F M itself is not compact, the Arzelà-Ascoli the-
orem implies that every closed, bounded subset of F M is compact, and every
bounded subset of F M is precompact.

2.17 example. Suppose that x0 is a point in a compact metric space K. Let

B M = { f ∈ F M : f ( x0 ) = 0}.

Then, $\mathcal{B}_M$ is bounded because for every $f \in \mathcal{B}_M$ we have

$$\|f\|_\infty = \sup_{x \in K} |f(x) - f(x_0)| \leq M \sup_{x \in K} d(x, x_0) \leq M\, \mathrm{diam}(K),$$

where $\mathrm{diam}(K)$ is finite since $K$ is compact, and hence bounded. The set $\mathcal{B}_M$
is closed, since if $f_n(x_0) = 0$ and $f_n \to f$ in $C(K)$, then

$$f(x_0) = \lim_{n \to +\infty} f_n(x_0) = 0.$$

Therefore, the set B M is a compact subset of C (K ).

A continuously differentiable function with bounded partial derivatives
on a convex, open subset of $\mathbb{R}^n$ is Lipschitz, see the Exercises. A Lipschitz
continuous function may be non-differentiable.

2.18 example. The absolute value function $f(x) = |x|$ is Lipschitz continuous
with Lipschitz constant one, because

$$|f(x) - f(y)| = \big| |x| - |y| \big| \leq |x - y|.$$

However, f is not differentiable at x = 0.

2.19 example. Let $C^1([0, 1])$ denote the space of all continuous functions $f$ on
$[0, 1]$ with continuous derivative $f'$. For constants $M > 0$ and $N > 0$, we define
the subset $\mathcal{F}$ of $C([0, 1])$ by

$$\mathcal{F} = \{ f \in C^1([0, 1]) : \|f\|_\infty \leq M,\ \|f'\|_\infty \leq N \}.$$

For all $f \in \mathcal{F}$ and $x, y \in [0, 1]$ we have

$$|f(x) - f(y)| = \left| \int_x^y f'(z)\,dz \right| \leq |x - y|\, \|f'\|_\infty \leq N |x - y|.$$

Hence $\mathcal{F}$ is equicontinuous. Since $\mathcal{F}$ is also bounded in $C([0, 1])$, the Arzelà-Ascoli
theorem implies that $\mathcal{F}$ is precompact in $C([0, 1])$. However, $\mathcal{F}$ is not closed,
because the uniform limit of continuously differentiable functions need not be
differentiable. Thus, $\mathcal{F}$ is not compact. Its closure in $C([0, 1])$ is the compact set

$$\overline{\mathcal{F}} = \{ f \in C([0, 1]) : \|f\|_\infty \leq M,\ \mathrm{Lip}(f) \leq N \}.$$

A family of continuously differentiable functions with uniformly bounded


derivatives is equicontinuous, see also the Exercises. If the family is also
bounded, then it is precompact. The idea that a uniform bound on suitable
norms of the derivatives of a family of functions implies that the family is
precompact will reappear in the study of Sobolev spaces.

2.20 example. In many applications one may need to work with functions
that are continuously differentiable, i.e. functions f such that the first derivative
f 0 is continuous. Such a set is typically referred to as C1 ( A) where A is the
domain of the functions. Let us consider the case A = [ a, b]. A well known
calculus theorem states that C1 ([ a, b]) ⊂ C ([ a, b]), that is, every continuously
differentiable function is also continuous. Clearly, C1 ([ a, b]) is a linear space.
This is a simple exercise (every linear combination of continuously differen-
tiable functions is continuously differentiable). Hence, C1 ([ a, b]) may be seen
as a linear subspace of C ([ a, b]). Then, a natural question arises: is C1 ([ a, b])
closed in (C ([ a, b]), k · k∞ )?
The answer to said question is no, it is not. Indeed, it is not difficult to
construct examples of sequences of C1 functions converging uniformly to a


function that is not $C^1$. For instance, consider $[a, b] = [-1, 1]$ and

$$f_n(x) = \frac{n x^2}{1 + n|x|}, \quad x \in [-1, 1].$$

We claim that $\|f_n - f\|_\infty \to 0$ as $n \to +\infty$ with $f(x) = |x|$. Indeed, computing

$$|f_n(x) - f(x)| = \frac{\big| n x^2 - |x|(1 + n|x|) \big|}{1 + n|x|} = \frac{|x|}{1 + n|x|},$$

and observing that $|f_n - f|$ is therefore an even function, we have

$$\|f_n - f\|_\infty = \max_{x \in [0, 1]} \frac{x}{1 + nx}.$$

By computing the derivative

$$\frac{d}{dx}\, \frac{x}{1 + nx} = \frac{1 + nx - nx}{(1 + nx)^2} = \frac{1}{(1 + nx)^2} \geq 0$$

we deduce that the maximum above is achieved at $x = 1$, therefore
$\|f_n - f\|_\infty = \frac{1}{1 + n} \to 0$, which proves the uniform convergence. On the other
hand, it is well known that the function $f(x) = |x|$ is not differentiable at $x = 0$.
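
The value $\|f_n - f\|_\infty = 1/(1+n)$ can be cross-checked numerically. The sketch below is an illustration only: the function names and the uniform grid on $[-1, 1]$ are our own choices, and a grid maximum is in general just a lower bound for the sup-norm (here the grid contains the maximizers $x = \pm 1$).

```python
def f_n(n, x):
    """The C^1 function n x^2 / (1 + n |x|) from the example above."""
    return n * x * x / (1.0 + n * abs(x))


def sup_error(n, samples=2001):
    """Maximum of |f_n(x) - |x|| over a uniform grid on [-1, 1].

    The grid contains the endpoints x = -1 and x = 1, where the
    maximum of the error is attained.
    """
    grid = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(f_n(n, x) - abs(x)) for x in grid)


# Compare the sampled error with the exact value 1 / (1 + n).
errors = {n: (sup_error(n), 1.0 / (1.0 + n)) for n in (1, 10, 100, 1000)}
```

For each `n` the two entries of `errors[n]` agree up to rounding, and both tend to zero as `n` grows.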
Since $C^1([a, b])$ is in general not closed in $C([a, b])$, we automatically deduce
that $(C^1([a, b]), \|\cdot\|_\infty)$ is not a Banach space (it is a normed linear space
indeed, but it is not complete!). In order to make $C^1([a, b])$ a complete normed
linear space we need to equip it with the norm

$$\|f\|_{C^1} = \|f\|_\infty + \|f'\|_\infty.$$

Such a norm incorporates the information on the first derivative as well.


Let us prove that $(C^1([a, b]), \|\cdot\|_{C^1})$ is complete. Assume $f_n \in C^1([a, b])$ is
a Cauchy sequence. Since

$$\|f_n - f_m\|_\infty \leq \|f_n - f_m\|_\infty + \|f_n' - f_m'\|_\infty = \|f_n - f_m\|_{C^1}$$

we deduce that $f_n$ is Cauchy in the space $(C([a, b]), \|\cdot\|_\infty)$, which is a Banach
space. Therefore, there exists $f \in C([a, b])$ such that $f_n \to f$ in the $\|\cdot\|_\infty$ norm.
Similarly, we deduce that $f_n'$ is Cauchy in $(C([a, b]), \|\cdot\|_\infty)$ and for the same
reason there exists $g \in C([a, b])$ such that $f_n' \to g$ in $\|\cdot\|_\infty$. Now, if we prove
that $f$ is continuously differentiable with $f' = g$ then we would get that

$$\|f_n - f\|_\infty + \|f_n' - f'\|_\infty \to 0$$

which would prove the assertion. On the other hand, since $f_n$ is continuously
differentiable, from the fundamental theorem of integral calculus we get

$$f_n(x) = f_n(a) + \int_a^x f_n'(y)\,dy.$$

By letting $n \to +\infty$ on both sides of the above identity, we get

$$f(x) = f(a) + \int_a^x g(y)\,dy,$$

because on the left hand side $f_n(x)$ converges pointwise to $f(x)$ (uniform
implies pointwise), and likewise $f_n(a) \to f(a)$, while on the right hand side we
know that we can interchange limit and integral if the convergence is uniform
($f_n'$ converges uniformly to $g$). Hence, since $g$ is continuous, the fundamental
theorem of integral calculus implies once again that $f$ is differentiable with
$f' = g$.

2.21 example. The above example can be extended as follows to functions
with $k$ derivatives. The space $C^k([a, b])$ of $k$-times continuously differentiable
functions on $[a, b]$ is not a Banach space with respect to the sup-norm $\|\cdot\|_\infty$
for $k \geq 1$, since the uniform limit of continuously differentiable functions need
not be differentiable. We define the $C^k$-norm by

$$\|f\|_{C^k} = \|f\|_\infty + \|f'\|_\infty + \|f''\|_\infty + \ldots + \|f^{(k)}\|_\infty.$$

Then $C^k([a, b])$ is a Banach space with respect to the $C^k$-norm. Convergence
with respect to the $C^k$-norm is uniform convergence of functions and their
first $k$ derivatives. We omit the details.

2.4 The contraction mapping theorem


In this subsection we state and prove the contraction mapping theorem, which
is one of the simplest and most useful methods for the construction of solutions
of linear and nonlinear equations.

2.22 definition. Let $(X, d)$ be a metric space. A mapping $T : X \to X$ is a
contraction mapping, or contraction, if there exists a constant $c$, with $0 \leq c < 1$,
such that

$$d(T(x), T(y)) \leq c\, d(x, y) \tag{15}$$

for all $x, y \in X$.

Thus, a contraction maps points closer together. In particular, for every
$x \in X$, and any $r > 0$, all points $y$ in the ball $B_r(x)$ are mapped into a ball
$B_s(T(x))$, with $s < r$. Clearly, all contractions are Lipschitz continuous, and
hence uniformly continuous.
If T : X → X, then a point x ∈ X such that

T (x) = x (16)

is called a fixed point of $T$. The contraction mapping theorem states that a
contraction on a complete metric space has a unique fixed point. The contraction
mapping theorem is only one example of what are more generally called
fixed-point theorems. For example, the Schauder fixed point theorem states that
a continuous mapping on a convex, compact subset of a Banach space has a
fixed point. We will not discuss the proof in this course.
In general, the condition that c is strictly less than one is needed for the
uniqueness and existence of the fixed point. For example, if $X = \{0, 1\}$ is the
discrete metric space with metric determined by $d(0, 1) = 1$, then the map $T$
defined by T (0) = 1, T (1) = 0 satisfies (15) with c = 1, but T does not have
any fixed points. On the other hand, the identity map on any metric space
satisfies (15) with c = 1, and every point is a fixed point. It is worth noting
that (16), and hence its solutions, do not depend on the metric d. Thus, if we
can find any metric on X such that X is complete and T is a contraction on X,
then we obtain the existence and uniqueness of a fixed point.

2.23 theorem (Contraction mapping). If $T : X \to X$ is a contraction mapping on
a complete metric space $(X, d)$, then $T$ has exactly one fixed point.

Proof. The proof is constructive, meaning that we will explicitly construct a
sequence converging to the fixed point. Let $x_0 \in X$ be any point in $X$. We
define a sequence $(x_n)$ in $X$ by

$$x_{n+1} = T(x_n), \quad \text{for } n \geq 0.$$

To simplify the notation, we often omit the parentheses around the argument
of a map. We denote the n-th iterate of T by T n , so that xn = T n x0 .
First, we show that $(x_n)$ is a Cauchy sequence. If $n \geq m \geq 1$, then from
(15) and the triangle inequality, we have

$$\begin{aligned}
d(x_n, x_m) &= d(T^n x_0, T^m x_0) \\
&\leq c^m\, d(T^{n-m} x_0, x_0) \\
&\leq c^m \left[ d(T^{n-m} x_0, T^{n-m-1} x_0) + \ldots + d(T x_0, x_0) \right] \\
&\leq c^m \left[ \sum_{k=0}^{n-m-1} c^k \right] d(x_1, x_0) \\
&\leq c^m \left[ \sum_{k=0}^{+\infty} c^k \right] d(x_1, x_0) \\
&= \frac{c^m}{1 - c}\, d(x_1, x_0),
\end{aligned}$$

which implies that $(x_n)$ is a Cauchy sequence since $0 \leq c < 1$. Since $X$ is
complete, $(x_n)$ converges to a limit $x \in X$. By continuity of $T$, we get

$$Tx = T\left( \lim_{n \to +\infty} x_n \right) = \lim_{n \to +\infty} T x_n = \lim_{n \to +\infty} x_{n+1} = x,$$

which shows that $x$ is a fixed point. Finally, let $x, y \in X$ be two fixed points,
then

$$0 \leq d(x, y) = d(Tx, Ty) \leq c\, d(x, y).$$


Since c < 1, we have d( x, y) = 0, so x = y and the fixed point is unique.
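
The constructive proof translates directly into an algorithm. The Python sketch below is only an illustration under our own assumptions: the name `fixed_point_iteration`, the tolerance, and the test map $T = \cos$ on $X = [0, 1]$ (a contraction there with $c = \sin 1 < 1$, by the mean value theorem) are not part of the notes. The stopping rule uses the bound $d(x_m, x) \leq \frac{c^m}{1 - c}\, d(x_1, x_0)$, obtained by letting $n \to +\infty$ in the Cauchy estimate of the proof.

```python
import math


def fixed_point_iteration(T, x0, c, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = T(x_n) for a contraction T with constant c < 1.

    Stops at the first m for which the a priori bound
    c^m / (1 - c) * d(x_1, x_0) on d(x_m, x) drops below tol.
    Returns the iterate x_m and the number of steps m.
    """
    x = T(x0)
    d1 = abs(x - x0)  # d(x_1, x_0) on the real line
    steps = 1
    while c ** steps / (1.0 - c) * d1 > tol and steps < max_iter:
        x = T(x)
        steps += 1
    return x, steps


# Assumed example: T = cos maps [0, 1] into itself and |T'| <= sin(1) < 1.
x_star, steps = fixed_point_iteration(math.cos, 1.0, math.sin(1.0))
```

The a priori bound is pessimistic in general, so the iteration typically reaches the requested accuracy earlier than the stopping rule certifies.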

In the Exercises we consider a simple application of the contraction mapping
theorem in a finite dimensional case. Applications to spaces of infinite
dimensions are more challenging. A classical one is the famous Cauchy-Lipschitz
existence and uniqueness theorem for systems of ODEs.

2.5 Exercises
1. For the following sequences of functions on a specific domain $K \subset \mathbb{R}$,
answer the following questions: 1) find (if it exists) a function $f$ on the
same domain $K$ such that $f_n$ converges pointwise to $f$; 2) say whether or
not $f_n$ converges to $f$ uniformly; motivate all the answers with suitable
mathematical reasoning:

• $f_n(x) = x^n$ on $K = [0, 1]$,
• $f_n(x) = n x e^{-nx}$ on $K = [0, 1]$ and on $K = \mathbb{R}$,
• $f_n(x) = (1 - x) x^n$ on $K = [0, 1]$,
• $f_n(x) = \frac{nx}{1 + n^2 x^2}$ on $K = [-1, 1]$,
• $f_n(x) = \frac{n^2 x^2}{1 + n^2 x^2}$ on $K = [-1, 1]$,
• $f_n(x) = \frac{x}{1 + n^2 x^2}$ on $K = \mathbb{R}$,
• $f_n(x) = n x e^{-n^2 x}$ on $K = [0, 1]$,
• $f_n(x) = \sin(x/n)$.

2. Let $(A, \|\cdot\|_A)$ and $(B, \|\cdot\|_B)$ be two normed spaces. A linear isometry
between $A$ and $B$ is a linear map $T : A \to B$ such that $\|Tx\|_B = \|x\|_A$ for all
$x \in A$. Prove that every surjective, linear isometry is also invertible. Prove that
both $T$ and $T^{-1}$ are continuous.

3. Consider the Banach space X = C ([−1, 1]) equipped with the usual
infinity norm k · k. Say which of the following subsets of X are closed
and which ones are dense in X. For those sets that are not closed, find
the closure of the set.

(a) H = { f ∈ X : f (0) = 0}
(b) H = { f ∈ X : f ( x ) = 0 for all x ∈ [−1, 0]}
(c) H = { f ∈ X : f is a polynomial of degree ≤ 1}
(d) H = { f ∈ X : f is a polynomial}
(e) H = { f ∈ X : f is a Lipschitz function}
(f) H = { f ∈ X : f is a Lipschitz function with Lipschitz constant ≤ 1}
(g) H = { f ∈ X : f is even}
(h) H = { f ∈ X : f is odd}
(i) H = { f ∈ X : f is strictly increasing}
(j) H = { f ∈ X : f has a local minimum at x = 0}


4. Consider the Banach space X = C ([−1, 1]) equipped with the usual
infinity norm k · k. Consider the set

H = { f ∈ X : f is strictly decreasing} .

Find the closure $\overline{H}$ of $H$ in $X$. Justify your answer.

5. Prove that an equicontinuous family of functions from a compact metric


space to a metric space is uniformly equicontinuous.

6. For each n ∈ N, consider f n ( x ) = sin(nπx ). Is the family of functions


{ f n : n ∈ N} compact in C ([0, 1]) equipped with the uniform norm?
Motivate your answer.

7. Let $\mathcal{F}$ be the subset of $C([0, 1])$ that consists of functions $f$ of the form

$$f(x) = \sum_{n=1}^{+\infty} a_n \sin(n \pi x) \quad \text{with} \quad \sum_{n=1}^{+\infty} n |a_n| \leq 1.$$

(a) Prove that actually $f$ is an element of $C([0, 1])$
(b) Prove that $\mathcal{F}$ is bounded in $C([0, 1])$
(c) Prove that $\mathcal{F}$ is precompact in $C([0, 1])$

8. Let $\{f_n\}_{n \in \mathbb{N}} \subset C([0, 1])$ be such that $\sup_{x \in [0, 1]} |f_n(x)| \leq 1$ for any $n \in \mathbb{N}$.
Define $F_n : [0, 1] \to \mathbb{R}$ by

$$F_n(x) = \int_0^x f_n(t)\,dt.$$

Show that the sequence $\{F_n\}_{n \in \mathbb{N}}$ has a subsequence that converges
uniformly on $[0, 1]$.

9. Let $G$ be the following subset of $C([0, \pi])$:

$$G = \left\{ g \in C([0, \pi]) : g(x) = \int_0^\pi \sin(xy) f(y)\,dy,\ f \in C([0, \pi]),\ \|f\|_\infty \leq 1 \right\}.$$

Prove that $G$ is relatively compact in $C([0, \pi])$.

10. Suppose that f : C → R is a continuously differentiable function on an


open, convex subset C of Rn , and that the partial derivatives of f are
bounded on C. Prove that f is Lipschitz.

11. Prove that a family of continuously differentiable functions on an open,


convex subset C of Rn with uniformly bounded partial derivatives is
equicontinuous.

12. Give a counterexample to show that f n → f in C ([0, 1]) and f n continu-


ously differentiable does not imply that f is continuously differentiable.


13. Prove that the set of Lipschitz continuous functions on [0, 1] with Lips-
chitz constant less than or equal to one and zero integral is compact in
C ([0, 1]).

14. Prove that C ([ a, b]) is separable.

15. Consider the discrete dynamical system

xn+1 = Txn , x0 ∈ [0, 1],

defined on [0, 1] with

Tx = 4µx (1 − x ).

Such a system describes the logistic growth of a population. Prove that if
$0 \leq \mu < 1/4$ there is a unique initial datum $x_0$ such that $x_n = x_0$ for all
$n \in \mathbb{N}$. What happens if $\mu \geq 1/4$?

16. An infinite sum of functions $\sum_{n=1}^{+\infty} f_n(x)$ converges totally if the series
$\sum_{n=1}^{+\infty} \|f_n\|_\infty$ is convergent. Prove that if $\sum_{n=1}^{+\infty} f_n(x)$ converges totally then
$\sum_{n=1}^{+\infty} f_n(x)$ converges uniformly.

3 Measure and integration. $L^p$ spaces

Using the sup norm is by far the simplest way to measure the distance between
two functions. On the other hand, this distance has the downside of
being sometimes ‘too strong’. In many applications, convergence with respect
to a weaker distance may be extremely useful, for example one based on integral
quantities. For instance, given two functions $f$ and $g$ on the interval $[0, 1]$, we
may ask ourselves how much they differ in an integral sense by evaluating the
quantity

$$\int_0^1 |f(x) - g(x)|^2\,dx.$$

The above quantity (its square root, more precisely) has very likely all the
properties of a distance. Defining said quantity relies on the concept of integral
of a function of real variables. We all are familiar with Riemann’s integral,
which therefore seems to be the natural candidate concept to use in this context.
However, we will see very soon that there is a major theoretical issue
with classical Riemann integration theory when we try to adapt it to the goals
of functional analysis. Hence, in this section we will introduce the Lebesgue
theory of integration, which will allow us to define the so-called $L^p$ spaces,
also called Lebesgue spaces.
We all know that there is a close interplay between integrals of functions
$f : \mathbb{R}^d \to \mathbb{R}$ and the way we measure sets in $\mathbb{R}^d$. For example, if we define
the function $f : A \to \mathbb{R}$ as $f(x) \equiv 1$, with a given $A \subset \mathbb{R}^3$, we expect that
$\int_A f(x)\,dx$ returns the 3-dimensional volume of the set $A$. Now, Riemann’s theory
of integrals relies on the so-called Peano-Jordan theory of measure, which
we will briefly recall here. Then, we shall introduce Lebesgue measure theory,
which is sort of a counterpart (on the measure side) of Lebesgue integral theory.

3.1 Integrals and measures

We all are familiar with Riemann’s integration theory.


Given a bounded function $f : [a, b] \to \mathbb{R}$, Riemann’s idea to compute the
integral relies essentially on approximating the area of the subgraph of $f$ (with
the convention that regions in which $f$ is negative give a contribution with
negative sign) with piecewise constant functions, i.e. with functions which are
constant on intervals. The integral $\int_a^b h(x)\,dx$ of a piecewise constant function
$h$ is trivially computed as the sum of areas of rectangles. We set

$$\underline{\int_a^b} f(x)\,dx := \sup \left\{ \int_a^b h(x)\,dx : h \leq f \text{ and } h \text{ is piecewise constant} \right\},$$


and

$$\overline{\int_a^b} f(x)\,dx := \inf \left\{ \int_a^b h(x)\,dx : h \geq f \text{ and } h \text{ is piecewise constant} \right\}.$$

If $\underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx$ we say that $f$ is Riemann integrable, and define

$$\int_a^b f(x)\,dx = \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx.$$

The class of Riemann integrable functions contains e.g. continuous functions
and piecewise continuous functions.
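
The lower and upper integrals can be approximated by explicit piecewise constant functions. The following Python sketch is an illustration under assumptions of ours: `darboux_sums` is a hypothetical helper, the partition is uniform, and $f$ is assumed nondecreasing, so that on each subinterval the best piecewise constant minorant and majorant take the values of $f$ at the left and right endpoint respectively.

```python
def darboux_sums(f, a, b, n):
    """Integrals of piecewise constant functions below and above f.

    Assumes f is nondecreasing on [a, b]: on each of the n equal
    subintervals the infimum of f is its value at the left endpoint
    and the supremum is its value at the right endpoint.
    """
    h = (b - a) / n
    lower = sum(f(a + i * h) for i in range(n)) * h
    upper = sum(f(a + (i + 1) * h) for i in range(n)) * h
    return lower, upper


# f(x) = x^2 on [0, 1]: the exact integral 1/3 is squeezed between the sums,
# and their gap is (f(1) - f(0)) / n.
lower, upper = darboux_sums(lambda x: x * x, 0.0, 1.0, 1000)
```

As `n` grows the gap `upper - lower` tends to zero, which is the Riemann integrability of $f$ in this monotone case.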
Riemann’s integral can be extended also to functions defined on subsets of
Rd . We refer to [1].
Strictly related with integration is the way we measure sets. In one space
dimension, the measure of a set is (roughly speaking) the length of the set,
which reduces e.g. to b − a in the case of an interval [ a, b]. In two dimensions
the measure of A ⊂ R2 is the area of A, in three dimensions the measure of
A ⊂ R3 is the volume of A. The goal of measure theory is to define the mea-
sure of elementary sets (intervals in one dimension, rectangles in two dimen-
sions, etc) and use them to define the measure of more and more complicated
sets, e.g. open sets, closed sets, etc.

3.1 fact. Let $R \subset \mathbb{R}^n$ be a rectangle, i.e. $R$ is the Cartesian product of $n$ intervals
of the form $[a, b)$, $(a, b]$, $(a, b)$, $[a, b]$. The measure $m(R)$ of a rectangle can be
easily computed as the product of the lengths of said intervals. An elementary set
$I \subset \mathbb{R}^n$ is the finite union of rectangles. Every elementary set can be written as
the union of pairwise disjoint rectangles (easy exercise), i.e. $I = R_1 \cup \ldots \cup R_m$
with $R_i \cap R_j = \emptyset$ if $i \neq j$. Hence, the measure of the elementary set $I$ above
can be computed as

$$m(I) = \sum_{i=1}^m m(R_i).$$

A classical way to extend the notion of measure in $\mathbb{R}^n$ to more complicated
sets leads to Peano-Jordan’s theory.

3.2 fact. In Peano-Jordan’s theory, a set is measurable if it can be well
approximated by elementary sets from outside and from inside. More precisely,
let $A \subset \mathbb{R}^n$ be a bounded set. We set

$$m_{*,PJ}(A) = \sup\{ m(I) : I \text{ is an elementary set and } I \subset A \} \quad \text{(inner measure)}$$

$$m^{*,PJ}(A) = \inf\{ m(I) : I \text{ is an elementary set and } I \supset A \} \quad \text{(outer measure)}$$

A set $A \subset \mathbb{R}^n$ is Peano-Jordan measurable if $m_{*,PJ}(A) = m^{*,PJ}(A)$. In this case
we denote the Peano-Jordan measure of $A$ as $m_{PJ}(A) = m_{*,PJ}(A) = m^{*,PJ}(A)$.


Peano-Jordan’s theory works well with sets with thin boundaries, namely
sets $A$ with a boundary $\partial A$ with $m^{*,PJ}(\partial A) = 0$. Moreover, it fits well with
Riemann’s integration theory, in a sense which is better explained as follows.

3.3 example. Let $f : [a, b] \to [0, +\infty)$ be Riemann integrable. Then, the subgraph
of $f$

$$A = \{ (x, y) \in \mathbb{R}^2 : a \leq x \leq b,\ 0 \leq y \leq f(x) \}$$

is Peano-Jordan measurable and

$$m_{PJ}(A) = \int_a^b f(x)\,dx.$$

Peano-Jordan’s and Riemann’s theories cover a fairly large class of sets
and functions respectively. However, these theories are not well suited
to σ-additivity properties, i.e. they do not work well with countable
unions of measurable sets. More precisely, consider a sequence of Peano-Jordan
measurable sets $\{E_k\}_{k=1}^{+\infty}$, $E_k \subset \mathbb{R}^n$. In general, we are not guaranteed
that the union $\bigcup_{k=1}^{+\infty} E_k$ is Peano-Jordan measurable. This gap in the theory is only
seemingly harmless. Its consequences on more advanced mathematical theories
involving integration are indeed very serious. Moreover, Peano-Jordan
theory cannot cover unbounded sets.

3.4 example. The set $A = [0, 1] \cap \mathbb{Q}$ is countable. Write it as a sequence
$A = \{x_k\}_k$ without repetitions. Clearly, each point has measure zero, therefore
$m_{PJ}(\{x_k\}) = 0$. We would naturally expect that

$$m_{PJ}(A) = m_{PJ}\left( \bigcup_{k=1}^{+\infty} \{x_k\} \right) = \sum_{k=1}^{+\infty} m_{PJ}(\{x_k\}),$$

but this is false, since $A$ is not even measurable according to Peano-Jordan
theory. We leave the proof of this claim as an exercise.

3.5 exercise. Let $f_n : [a, b] \to \mathbb{R}$ be a sequence of Riemann integrable
functions. Assume that $f_n \to f$ uniformly. Prove that

$$\lim_{n \to +\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$

In 1902, measure theory was greatly improved by Henri Lebesgue, who


formulated an extended version of Peano-Jordan’s theory which fixed the
above mentioned bugs.


3.2 An overview of Lebesgue measure theory

3.6 definition. Let $A \subset \mathbb{R}^n$ be a bounded open set. We set

$$m(A) = \sup\{ m(I) : I \subset A \text{ and } I \text{ is an elementary set} \}.$$

Let $K \subset \mathbb{R}^n$ be a compact set. We set

$$m(K) = \inf\{ m(I) : I \supset K \text{ and } I \text{ is an elementary set} \}.$$

For an arbitrary bounded subset $E \subset \mathbb{R}^n$ we define the Lebesgue outer measure

$$m^*(E) = \inf\{ m(A) : A \supset E \text{ and } A \text{ is an open set} \},$$

and the Lebesgue inner measure

$$m_*(E) = \sup\{ m(K) : K \subset E \text{ and } K \text{ is a compact set} \}.$$

A bounded subset $E \subset \mathbb{R}^n$ is said to be Lebesgue measurable if $m^*(E) = m_*(E)$.
If $E \subset \mathbb{R}^n$ is unbounded, we say that $E$ is Lebesgue measurable if
$E \cap B_R(0)$ is Lebesgue measurable for all $R \geq 0$, and set

$$m(E) = \lim_{R \to +\infty} m(E \cap B_R(0))$$

($m(E)$ may be infinite!).

It can be easily proven that this concept of measurability satisfies the
so-called Boolean closure, i.e. if $E, F \subset \mathbb{R}^n$ are measurable then so are $E \cup F$,
$E \cap F$, $E \setminus F$. Moreover, the empty set is measurable with $m(\emptyset) = 0$, and $\mathbb{R}^n$ is
measurable with $m(\mathbb{R}^n) = +\infty$.

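3.7 exercise. Prove that a bounded set $E \subset \mathbb{R}^n$ is Lebesgue measurable if and
only if for any $\varepsilon > 0$ there exist $K_\varepsilon \subset E \subset A_\varepsilon$, with $K_\varepsilon$ compact and $A_\varepsilon$ open,
such that $m(A_\varepsilon \setminus K_\varepsilon) < \varepsilon$.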

As we expect, Lebesgue measure extends Peano-Jordan measures.

3.8 exercise. Let $E \subset \mathbb{R}^n$ be Peano-Jordan measurable. Prove that $E$ is Lebesgue
measurable and that the two measures coincide. Hint: prove that

$$m^{*,PJ}(E) \geq m^*(E), \quad m_{*,PJ}(E) \leq m_*(E).$$

In measure theory, sets with zero measure are particularly important. We


say that a property holds almost everywhere in Rn if it holds outside a set with
zero Lebesgue measure.

3.9 exercise. Prove that if $E \subset \mathbb{R}^n$ satisfies $m^*(E) = 0$ then $E$ is Lebesgue
measurable and $m(E) = 0$.

3.10 proposition. If E, F ⊂ Rn are measurable, then


(a) E ⊂ F implies m( E) ≤ m( F ),

(b) m( E ∪ F ) ≤ m( E) + m( F ),

(c) m( E ∪ F ) = m( E) + m( F ) if E and F are disjoint,

(d) m( E \ F ) = m( E) − m( F ) if E ⊃ F and m( F ) < +∞.

Proof. Omitted.

As a consequence of (a) above, all bounded measurable sets have finite
Lebesgue measure.
We mentioned above the gaps concerning σ-additivity properties in Peano-Jordan’s
theory. The next theorem collects fundamental properties of Lebesgue
measure which mark a basic improvement in the theory.

3.11 theorem. Let $\{E_k\}_k$ be a sequence of measurable sets in $\mathbb{R}^n$. Then, $\bigcap_{k=1}^{+\infty} E_k$
and $\bigcup_{k=1}^{+\infty} E_k$ are Lebesgue measurable. Moreover,

(a) $m\left( \bigcup_{k=1}^{+\infty} E_k \right) \leq \sum_{k=1}^{+\infty} m(E_k)$ (countable subadditivity),

(b) $m\left( \bigcup_{k=1}^{+\infty} E_k \right) = \sum_{k=1}^{+\infty} m(E_k)$ if the $E_k$’s are pairwise disjoint (countable additivity),

(c) $m\left( \bigcup_{k=1}^{+\infty} E_k \right) = \lim_{k \to +\infty} m(E_k)$ if $E_k \subset E_{k+1}$ for all $k \in \mathbb{N}$ (continuity of
Lebesgue measure w.r.t. increasing families of sets),

(d) $m\left( \bigcap_{k=1}^{+\infty} E_k \right) = \lim_{k \to +\infty} m(E_k)$ if $E_k \supset E_{k+1}$ for all $k \in \mathbb{N}$ and $m(E_1) < +\infty$
(continuity of Lebesgue measure w.r.t. decreasing families of sets).

Proof. Omitted.

3.12 example. Let $E_k = [k, +\infty) \subset \mathbb{R}$. Clearly, $m([k, +\infty)) = +\infty$ for all $k$. We
have

$$m\left( \bigcap_{k=1}^{+\infty} E_k \right) = m(\emptyset) = 0 \neq +\infty = \lim_{k \to +\infty} m(E_k),$$

so the assumption $m(E_1) < +\infty$ is needed in point (d) of the previous Theorem.

3.13 fact. Lebesgue measure is translation invariant. More precisely, given a
measurable set $E \subset \mathbb{R}^n$ and given $h \in \mathbb{R}^n$, the set $E_h = \{ y = x + h : x \in E \}$ is
measurable and one has $m(E_h) = m(E)$.

3.14 example. In Example 3.4 we stated that the set $E = \mathbb{Q} \cap [0, 1]$ is not
Peano-Jordan measurable. Due to Theorem 3.11 (b), $E$ is actually Lebesgue
measurable, since

$$E = \bigcup_{k=1}^{+\infty} \{x_k\}$$

where $\{x_k\}_{k=1}^{+\infty}$ is an enumeration (without repetitions) of the rational numbers
in $[0, 1]$. Therefore,

$$m(E) = \sum_{k=1}^{+\infty} m(\{x_k\}) = 0,$$

since all singletons $\{x_k\}$ have zero measure. Now, we left Theorem 3.11 (b)
without a proof, but due to the importance of this example, let us provide an
alternative proof that $E$ is Lebesgue measurable and has zero measure. Let us
fix $\varepsilon > 0$. Let $\{x_k\}_{k=1}^{+\infty}$ be the enumeration of $E$ introduced above. We have

$$E \subset \bigcup_{k=1}^{+\infty} \left( x_k - \frac{\varepsilon}{2^k},\ x_k + \frac{\varepsilon}{2^k} \right).$$

Now, we take for granted that

(a) Lebesgue outer measure is monotone, i.e. $E \subset F$ implies $m^*(E) \leq m^*(F)$
(easy exercise).

(b) Lebesgue outer measure is countably sub-additive, i.e. $m^*\left( \bigcup_{k=1}^{+\infty} E_k \right) \leq \sum_{k=1}^{+\infty} m^*(E_k)$ (exercise).

Hence, we obtain

$$m^*(E) \leq \sum_{k=1}^{+\infty} m\left( \left( x_k - \frac{\varepsilon}{2^k},\ x_k + \frac{\varepsilon}{2^k} \right) \right) = 2\varepsilon \sum_{k=1}^{+\infty} 2^{-k} = 2\varepsilon,$$

and since $\varepsilon > 0$ is arbitrary we obtain $m^*(E) = 0$, which proves the assertion
in view of Exercise 3.9.
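
The covering argument can be made concrete for finitely many rationals. The Python sketch below is only an illustration: the particular enumeration (by increasing denominator) and the names `enumerate_rationals` and `covering_intervals` are our own choices, and exact arithmetic with `fractions.Fraction` is used so that no rounding enters the comparison with $2\varepsilon$.

```python
from fractions import Fraction


def enumerate_rationals(count):
    """First `count` rationals of [0, 1], without repetitions.

    Assumed enumeration: 0, 1, then p/q in lowest terms by increasing q.
    """
    found = [Fraction(0), Fraction(1)]
    q = 2
    while len(found) < count:
        for p in range(1, q):
            x = Fraction(p, q)
            if x.denominator == q:  # p/q is in lowest terms, so it is new
                found.append(x)
        q += 1
    return found[:count]


def covering_intervals(count, eps):
    """Intervals (x_k - eps/2^k, x_k + eps/2^k) for k = 1, ..., count."""
    return [(x - eps / 2 ** k, x + eps / 2 ** k)
            for k, x in enumerate(enumerate_rationals(count), start=1)]


eps = Fraction(1, 100)
intervals = covering_intervals(200, eps)
total_length = sum(b - a for a, b in intervals)  # equals 2*eps*(1 - 2**-200)
```

However many rationals are covered, `total_length` stays below `2 * eps`, in agreement with the estimate above.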

Lebesgue’s theory has the advantage of providing a class of measurable


sets much larger than the one provided by Peano-Jordan’s theory. A famous
example due to Vitali shows, however, that there exists at least one subset of
R which is not Lebesgue measurable.
In the remaining sections we shall use the expression almost everywhere
for a property that holds everywhere but on a set of zero Lebesgue measure.
We observe also that Lebesgue measure depends on the dimension d of the
Euclidean space. Indeed, a segment with positive length has positive Lebesgue
measure in dimension one and zero Lebesgue measure in dimension two.
When necessary, we use the notation md ( A) for the measure of a set A to
emphasize the dimension of the measure.

3.3 Lebesgue integral


Starting from the class of Lebesgue measurable sets, we introduce a class of
functions on $\mathbb{R}^d$ which turns out to be the most suitable setting to define a
concept of integral. In what follows we adopt the notation $\overline{\mathbb{R}}$ to denote the
extended real line $[-\infty, +\infty]$, given by $\mathbb{R} \cup \{-\infty\} \cup \{+\infty\}$.
Let $\Omega \subset \mathbb{R}^n$, and let $f : \Omega \to \overline{\mathbb{R}}$. We say that $f$ is measurable in $\Omega$ if

for all $\alpha \in \mathbb{R}$, the set $\{ x \in \Omega : f(x) > \alpha \}$ is Lebesgue measurable.

Such a definition is equivalent to requiring any one of the following:

• for all $\alpha \in \mathbb{R}$, the set $\{ x \in \Omega : f(x) \geq \alpha \}$ is Lebesgue measurable,
• for all $\alpha \in \mathbb{R}$, the set $\{ x \in \Omega : f(x) \leq \alpha \}$ is Lebesgue measurable,
• for all $\alpha \in \mathbb{R}$, the set $\{ x \in \Omega : f(x) < \alpha \}$ is Lebesgue measurable,
• for all $U \subset \mathbb{R}$ open, the set $\{ x \in \Omega : f(x) \in U \}$ is Lebesgue measurable.

Clearly, continuous functions are measurable. Given $f$, $g$ measurable and
$c \in \mathbb{R}$, one has that the functions $f + g$, $cf$, $fg$, $f/g$ (with $g \neq 0$), $\max\{f, g\}$,
$\min\{f, g\}$, $f^+$, $f^-$, $|f|$ are measurable. Moreover, given a sequence $\{f_k\}_k$ of
measurable functions, $\sup_{k \geq 1} f_k$, $\inf_{k \geq 1} f_k$, $\liminf_{k \to +\infty} f_k$, $\limsup_{k \to +\infty} f_k$, and
$\lim_{k \to +\infty} f_k$ (when it exists) are measurable functions.

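3.15 definition. A simple function $\phi : \mathbb{R}^n \to \mathbb{R}$ is a measurable function on
$\mathbb{R}^n$ which attains only a finite number of values. Let $\alpha_1, \ldots, \alpha_k \in \mathbb{R}$ be those
values, and let $E_j = \{ x \in \mathbb{R}^n : \phi(x) = \alpha_j \}$ for $j = 1, \ldots, k$. Then, $\phi$ can be
represented as

$$\phi(x) = \sum_{j=1}^k \alpha_j 1_{E_j}(x), \tag{17}$$

where

$$1_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A, \end{cases} \quad \text{for a measurable set } A.$$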

We observe that the above representation (17) is, in general, not unique if
we do not impose that each Ej is the pre-image of α j , as we might re-define
the sets Ej and the constants α j and obtain the same function φ. However,
such representation is unique if we assume Ei ∩ Ej = ∅ for i 6= j and the
constants αi without repetitions. 1 A above is called indicator function. The class
of simple functions is closed under trivial operations such as sum, difference,
multiplication by a real number.
We now define the concept of Lebesgue integral for simple functions. Let
$\phi : \mathbb{R}^n \to \mathbb{R}$ be a simple function which is bounded and zero outside a compact
set of $\mathbb{R}^n$. Assume

$$\phi(x) = \sum_{j=1}^k \alpha_j 1_{E_j}(x).$$

Then, we set

$$\int \phi(x)\,dx = \sum_{j=1}^k \alpha_j m(E_j).$$

The above definition is well posed, in the sense that it is independent of the
choice of the representation of φ as finite combination of indicator functions.
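
A small computation illustrates this independence. In the Python sketch below, which is ours and not part of the notes, a simple function on $\mathbb{R}$ is stored as a list of pairs (coefficient, half-open interval $[a, b)$), a deliberately restricted class of simple functions for which $m([a, b)) = b - a$; the names `integral_simple` and `evaluate` are hypothetical.

```python
def integral_simple(terms):
    """Integral of sum_j alpha_j * 1_{[a_j, b_j)} as sum_j alpha_j * (b_j - a_j)."""
    return sum(alpha * (b - a) for alpha, (a, b) in terms)


def evaluate(terms, x):
    """Value at x of the simple function described by `terms`."""
    return sum(alpha for alpha, (a, b) in terms if a <= x < b)


# Two representations of the same function: the first uses overlapping sets,
# the second uses the disjoint level sets of the values 1, 3, 2.
overlapping = [(1, (0, 2)), (2, (1, 3))]
disjoint = [(1, (0, 1)), (3, (1, 2)), (2, (2, 3))]

same_values = all(evaluate(overlapping, x) == evaluate(disjoint, x)
                  for x in (-1, 0, 0.5, 1, 1.5, 2, 2.5, 3, 4))
same_integral = integral_simple(overlapping) == integral_simple(disjoint)
```

Both representations give the integral $1 \cdot 2 + 2 \cdot 2 = 1 + 3 + 2 = 6$.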

The integral of simple functions satisfies

$$\int (\phi + \lambda \psi)\,dx = \int \phi\,dx + \lambda \int \psi\,dx,$$

for all $\phi$, $\psi$ simple and for all $\lambda \in \mathbb{R}$.

3.16 exercise. Let $\phi$, $\psi$ be simple functions. Assume $\phi \leq \psi$ almost everywhere
on $\mathbb{R}^d$. Prove that $\int \phi \leq \int \psi$.

In Riemann’s theory, the class of integrable functions is determined by the
property of being well approximated by piecewise constant functions from above
and from below. This requirement in general is quite selective, as it takes some
nontrivial functions (such as the Dirichlet function) out of the set of integrable
functions. In Lebesgue’s theory, the minimal requirement of the function being
measurable is essentially enough in order to compute the integral.

3.17 exercise. Let $f : \mathbb{R}^d \to [0, +\infty)$ be measurable, bounded, and zero outside
the set $B_M(0)$. Then, prove that there exist two sequences $\psi_k$, $\phi_k$ of simple
functions such that $\psi_k(x) = \phi_k(x) = 0$ for $x \notin B_M(0)$, $\psi_k \leq f \leq \phi_k$ for all
$k$, and such that $\phi_k - \psi_k \to 0$ uniformly in $\mathbb{R}^d$. Hint: let $M > 0$ such that
$\sup_{x \in \mathbb{R}^d} f(x) \leq M < +\infty$ and $f(x) = 0$ for all $|x| \geq M$. For all $n \in \mathbb{N}$ and
$k = 0, \ldots, 2^n$ consider the sets

$$E_k = \left\{ x \in \mathbb{R}^d : M \frac{k - 1}{2^n} \leq f(x) < M \frac{k}{2^n} \right\} \cap B_M(0),$$

and set $\psi_n(x) = \sum_{k=1}^{2^n} M \frac{k - 1}{2^n} 1_{E_k}$ and $\phi_n(x) = \sum_{k=1}^{2^n} M \frac{k}{2^n} 1_{E_k}$. Prove that $\phi_n$ and
$\psi_n$ satisfy the assertion.
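
The dyadic construction of the hint is easy to experiment with. The Python sketch below is an illustration under our own assumptions: it works pointwise on a value $v = f(x)$ with $0 \leq v < M$ (the strict inequality is what the sets $E_k$ require), and the name `dyadic_bounds` is hypothetical. It returns the values $\psi_n(x)$ and $\phi_n(x)$ at a point where $f$ takes the value $v$.

```python
import math


def dyadic_bounds(value, M, n):
    """Values (psi_n, phi_n) at a point where f equals `value`.

    Assumes 0 <= value < M. The index k is the unique one with
    M (k - 1) / 2^n <= value < M k / 2^n, as in the sets E_k of the hint.
    """
    k = math.floor(value * 2 ** n / M) + 1
    return M * (k - 1) / 2 ** n, M * k / 2 ** n


# Assumed test function: f(x) = x^2 sampled on [0, 1], with bound M = 2.
M = 2.0
samples = [(i / 100) ** 2 for i in range(101)]
gaps = {n: max(phi - psi for psi, phi in (dyadic_bounds(v, M, n) for v in samples))
        for n in (1, 5, 10)}
```

The gap `phi - psi` equals $M / 2^n$ at every point, which is the uniform convergence $\phi_n - \psi_n \to 0$ asked for in the exercise.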

As a consequence of the above exercise and of Exercise 3.16, given
$f : \mathbb{R}^d \to [0, +\infty)$ measurable, bounded, and zero outside the set $B_M(0)$, one has

$$\int \phi_k\,dx - \int \psi_k\,dx = \int (\phi_k - \psi_k)\,dx \leq \int \|\phi_k - \psi_k\|_\infty 1_{B_M(0)}(x)\,dx \leq m(B_M(0))\, \|\phi_k - \psi_k\|_\infty,$$

and the right hand side goes to zero as $k \to +\infty$.

The above argument shows that, at least for measurable, nonnegative,
bounded functions which are zero outside a compact set, approximation of
the integral via simple functions works well both from above and from below.
Thus, we can define a notion of integral in this class in either direction.
We choose to approximate from below.

3.18 definition (Lebesgue integral of a measurable function). Let $f : \mathbb{R}^d \to [0, +\infty]$
be measurable. We set

$$\int f(x)\,dx = \sup \left\{ \int \phi(x)\,dx : \phi \text{ is simple and } \phi \leq f \right\}.$$

Let $f : \mathbb{R}^d \to [-\infty, +\infty]$ be measurable and assume that at least one of
$f^+ = \max\{f, 0\}$ and $f^- = \max\{-f, 0\}$ has finite integral. Then we set

$$\int f(x)\,dx = \int f^+(x)\,dx - \int f^-(x)\,dx.$$

A measurable function is called summable (or $L^1$) if $\int |f(x)|\,dx < +\infty$. For a
given measurable set $E \subset \mathbb{R}^d$, we set

$$\int_E f(x)\,dx = \int f(x) 1_E(x)\,dx.$$

An elementary property which can be proven easily is monotonicity of the
Lebesgue integral, that is, if $f \leq g$ almost everywhere and $\int f\,dx$ and $\int g\,dx$
make sense, we have

$$\int f\,dx \leq \int g\,dx.$$

The proof easily follows from the definition of Lebesgue integral and is left as
an exercise.
Before proving more elementary properties of the Lebesgue integral, we
need to prove a first result solving the limit-integral interchange property.

3.19 theorem (Beppo-Levi or monotone convergence). Let $f_n : \mathbb{R}^d \to [0, +\infty]$ be a sequence of measurable functions. Assume that

$$f_n(x) \le f_{n+1}(x) \quad \text{almost everywhere, for all } n \in \mathbb{N}.$$

Then, there exists $f : \mathbb{R}^d \to [0, +\infty]$ measurable such that $f_n \to f$ almost everywhere, and

$$\lim_{n \to +\infty} \int f_n(x)\,dx = \int f(x)\,dx.$$

Proof. Let $A \subset \mathbb{R}^d$ be such that $m(A) = 0$ and $f_n(x) \le f_{n+1}(x)$ for all $n \in \mathbb{N}$ and all $x \in \mathbb{R}^d \setminus A$. We set $\tilde f_n(x)$ to be equal to $f_n(x)$ on $\mathbb{R}^d \setminus A$ and $0$ on $A$, so that $\tilde f_n$ is monotone everywhere. We set

$$f(x) = \sup_{n \in \mathbb{N}} \tilde f_n(x).$$

Since $\tilde f_n$ is measurable (exercise), $f$ is also measurable. We observe that, due to the monotonicity, $\lim_{n \to +\infty} \int f_n\,dx = \lim_{n \to +\infty} \int \tilde f_n\,dx$ exists (possibly equal to $+\infty$). Moreover, since $\tilde f_n \le f$ for all $n$, we immediately get

$$\lim_{n \to +\infty} \int f_n\,dx = \sup_{n \in \mathbb{N}} \int f_n\,dx \le \int f\,dx.$$

To prove the opposite inequality, let $\varphi$ be a simple function which is zero outside a ball with $\varphi \le f$, without restriction $\varphi \ge 0$. For a $t \in (0, 1)$ we define the (Lebesgue measurable) set

$$E_n = \left\{ x \in \mathbb{R}^d : t\varphi(x) \le \tilde f_n(x) \right\}.$$

We represent the simple function $\varphi$ as

$$\varphi(x) = \sum_{i=1}^{N} \alpha_i \mathbf{1}_{B_i}(x),$$

for pairwise disjoint $B_i$'s and distinct $\alpha_i$'s. We claim that $\bigcup_{n \in \mathbb{N}} E_n = \mathbb{R}^d$. To see this, let $x \in \mathbb{R}^d$. If $\varphi(x) = 0$, then $x \in E_n$ for every $n$. Otherwise, since $\varphi(x) \le f(x)$, we have $t\varphi(x) < f(x)$. Since $\tilde f_n(x) \to f(x)$, for some $n \in \mathbb{N}$ we have $t\varphi(x) \le \tilde f_n(x)$, that is $x \in E_n$. Now, for a fixed $n$,

$$\int_{\mathbb{R}^d} \tilde f_n(x)\,dx \ge \int_{E_n} \tilde f_n(x)\,dx \ge t \int_{E_n} \varphi(x)\,dx = t \int_{E_n} \sum_{i=1}^{N} \alpha_i \mathbf{1}_{B_i}(x)\,dx = t \sum_{i=1}^{N} \alpha_i\, m(B_i \cap E_n).$$

Hence,

$$\sup_n \int_{\mathbb{R}^d} \tilde f_n(x)\,dx \ge t \sum_{i=1}^{N} \alpha_i\, m(B_i \cap E_n).$$

Since the family $B_i \cap E_n$ is increasing with respect to $n$, by continuity of the Lebesgue measure we can send $n \to +\infty$ on the right hand side and get

$$\sup_n \int_{\mathbb{R}^d} \tilde f_n(x)\,dx \ge t \sum_{i=1}^{N} \alpha_i\, m(B_i) = t \int_{\mathbb{R}^d} \varphi\,dx.$$

Due to the arbitrariness of $t \in (0,1)$, we get

$$\lim_{n \to +\infty} \int_{\mathbb{R}^d} f_n(x)\,dx \ge \int_{\mathbb{R}^d} \varphi\,dx$$

for all simple functions $\varphi \le f$. By taking the sup with respect to $\varphi$ we obtain the desired inequality.

We now prove more elementary properties.

3.20 proposition. Let $f, g$ be measurable functions for which the Lebesgue integral makes sense.$^6$ Let $\lambda, \mu \in \mathbb{R}$. Then,

• $\int (\lambda f(x) + \mu g(x))\,dx = \lambda \int f(x)\,dx + \mu \int g(x)\,dx$.

• $\left| \int f(x)\,dx \right| \le \int |f(x)|\,dx$.

• If $f \ge 0$, $E \subset F$ and $E, F$ are measurable, then $\int_E f(x)\,dx \le \int_F f(x)\,dx$.

• If $f \ge 0$ almost everywhere and $\lambda > 0$, then

$$m(\{ x \in \mathbb{R}^d : f(x) \ge \lambda \}) \le \frac{1}{\lambda} \int f(x)\,dx.$$

$^6$ According to the above definition, this means either $f$ is nonnegative or one of $f_+ = \max\{f, 0\}$ and $f_- = \max\{-f, 0\}$ has finite integral.

The proof is omitted and left as an exercise. Notice that Beppo-Levi is


needed in the linearity property.
As expected, Lebesgue integral extends Riemann integral, i.e. if f is Rie-
mann integrable then it is Lebesgue measurable and the two integrals coin-
cide. This can be seen as a simple exercise by noticing for example that simple
functions contain piecewise constant functions as a subset. On the other hand,
there are functions which are Lebesgue integrable but not Riemann integrable,
for example $f = \mathbf{1}_{\mathbb{Q} \cap [0,1]}$; the details are left as an exercise.


We now state some fundamental theorems regarding the Lebesgue inte-
gral.

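3.21 theorem (Fubini). Let $f(x, y)$ be a summable function on $\mathbb{R}^n \times \mathbb{R}^m$. Then,

(i) The function $\mathbb{R}^n \ni x \mapsto f(x, y)$ is summable on $\mathbb{R}^n$ for almost all $y \in \mathbb{R}^m$.

(ii) The function $\mathbb{R}^m \ni y \mapsto \int_{\mathbb{R}^n} f(x, y)\,dx$ is summable on $\mathbb{R}^m$ and we have

$$\int_{\mathbb{R}^{n+m}} f(x, y)\,dx\,dy = \int_{\mathbb{R}^m} dy \int_{\mathbb{R}^n} f(x, y)\,dx. \tag{18}$$

If $f$ is nonnegative and not necessarily summable, the same conclusion of (18) holds if one of the three integrals

$$\int_{\mathbb{R}^{n+m}} f(x, y)\,dx\,dy, \qquad \int_{\mathbb{R}^m} dy \int_{\mathbb{R}^n} f(x, y)\,dx, \qquad \int_{\mathbb{R}^n} dx \int_{\mathbb{R}^m} f(x, y)\,dy$$

is finite.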

The proof is omitted.

3.22 exercise. Let $f : \mathbb{R}^d \to [-\infty, +\infty]$ be a measurable function. Assume

$$\int_E f(x)\,dx = 0$$

for all measurable sets $E \subset \mathbb{R}^d$. Prove that $f = 0$ almost everywhere.

The property proven for the Riemann integral in exercise 3.5 is a very important one. It is called limit-integral exchange property. A downside of Riemann integration is that such a property only holds in general under the quite strict assumption that the sequence $f_n$ converges uniformly. The most natural class in which we would like to investigate such a property for Lebesgue integration is the class of function sequences $f_n : \mathbb{R}^d \to \mathbb{R}$ which are measurable and converge almost everywhere to some $f : \mathbb{R}^d \to \mathbb{R}$, i.e. such that $f_n(x) \to f(x)$ as $n \to +\infty$ for all $x \in \mathbb{R}^d \setminus A$ with $m(A) = 0$.
A first case in which the property is valid is when the sequence is monotone increasing almost everywhere, as proven in Beppo-Levi's theorem. In general, assuming $f_n \to f$ almost everywhere does not necessarily imply that the above limit exchange property is true. Under the assumption that the sequence is nonnegative, the following property can be proven.

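3.23 theorem (Fatou's lemma). Let $f_n : \mathbb{R}^d \to [0, +\infty]$ be a sequence of measurable functions. Then

$$\int \left( \liminf_{n \to +\infty} f_n(x) \right) dx \le \liminf_{n \to +\infty} \int f_n(x)\,dx.$$

Proof. We set $g_n(x) = \inf_{k \ge n} f_k(x)$ for all $n \in \mathbb{N}$. Clearly, we have

$$g_n(x) \le g_{n+1}(x) \quad \text{for all } n \in \mathbb{N} \text{ and for all } x \in \mathbb{R}^d.$$

Therefore, we can apply Beppo-Levi's theorem and get

$$\lim_{n \to +\infty} \int g_n(x)\,dx = \int \left( \lim_{n \to +\infty} g_n(x) \right) dx.$$

Since $g_n(x) \le f_n(x)$ for all $n \in \mathbb{N}$, monotonicity of the integral gives $\int g_n\,dx \le \int f_n\,dx$, and therefore

$$\int \left( \lim_{n \to +\infty} g_n(x) \right) dx = \lim_{n \to +\infty} \int g_n(x)\,dx \le \liminf_{n \to +\infty} \int f_n(x)\,dx.$$

The integrand in the left hand side above is $\liminf_{n \to +\infty} f_n(x)$.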

The following examples show that the strict inequality in Fatou's lemma can easily occur.

3.24 example (Concentration). Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by

$$f_n(x) = n\,\mathbf{1}_{[0,1/n]}(x).$$

The sequence $f_n$ converges almost everywhere to $f = 0$. Moreover, it is easily seen that

$$\int_{\mathbb{R}} f_n(x)\,dx = 1 \quad \text{for all } n \in \mathbb{N}.$$

Hence

$$0 = \int_{\mathbb{R}} 0\,dx < \liminf_{n \to +\infty} \int_{\mathbb{R}} f_n(x)\,dx = 1.$$

3.25 example (Travelling wave). Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by

$$f_n(x) = \mathbf{1}_{[n,n+1]}(x).$$

The sequence $f_n$ converges almost everywhere to $f = 0$. Moreover, it is easily seen that

$$\int_{\mathbb{R}} f_n(x)\,dx = 1 \quad \text{for all } n \in \mathbb{N}.$$

Hence

$$0 = \int_{\mathbb{R}} 0\,dx < \liminf_{n \to +\infty} \int_{\mathbb{R}} f_n(x)\,dx = 1.$$
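
The two examples can be followed with exact arithmetic. The sketch below is only illustrative: the representation of a step function as a triple `(height, left, right)` and the helper names are assumptions introduced here. It shows that the integral stays equal to 1 while the value at a fixed point $x > 0$ is eventually 0.

```python
from fractions import Fraction

def concentration(n):
    """f_n = n * 1_[0, 1/n], stored as (height, left, right)."""
    return Fraction(n), Fraction(0), Fraction(1, n)

def travelling(n):
    """f_n = 1_[n, n+1], stored as (height, left, right)."""
    return Fraction(1), Fraction(n), Fraction(n + 1)

def integral(step):
    """Integral over R of the step function height * 1_[left, right]."""
    height, left, right = step
    return height * (right - left)

def value_at(step, x):
    """Pointwise value of the step function at x."""
    height, left, right = step
    return height if left <= x <= right else Fraction(0)

x = Fraction(1, 10)  # an arbitrary fixed point x > 0
for n in (1, 10, 100, 1000):
    c, t = concentration(n), travelling(n)
    # integrals stay equal to 1; the pointwise values vanish once n is large
    print(n, integral(c), value_at(c, x), integral(t), value_at(t, x))
```

In both cases the mass 1 is preserved for every $n$ but is lost in the pointwise limit, either by concentrating at the origin or by escaping to infinity.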

The following theorem provides a quite general sufficient condition which removes the possibility that mass concentrates at one point or escapes to infinity as it does in the previous examples.

3.26 theorem (Lebesgue's dominated convergence). Let $f_n : \mathbb{R}^d \to [-\infty, +\infty]$ be a sequence of measurable functions. Suppose that
(i) There exists a measurable function $f : \mathbb{R}^d \to [-\infty, +\infty]$ such that $f_n \to f$ almost everywhere.
(ii) There exists a summable function $g : \mathbb{R}^d \to [-\infty, +\infty]$ such that $|f_n(x)| \le g(x)$ for all $n \in \mathbb{N}$ and for almost every $x \in \mathbb{R}^d$.
Then,

$$\int f(x)\,dx = \lim_{n \to +\infty} \int f_n(x)\,dx.$$

Proof. For all $n \in \mathbb{N}$, let

$$h_n(x) = g(x) - f_n(x).$$

Since $h_n \ge 0$ almost everywhere, we can apply Fatou's lemma and get

$$\int \liminf_{n \to +\infty} \big( g(x) - f_n(x) \big)\,dx = \int \liminf_{n \to +\infty} h_n(x)\,dx \le \liminf_{n \to +\infty} \int h_n(x)\,dx = \liminf_{n \to +\infty} \int (g(x) - f_n(x))\,dx,$$

and since $\int g(x)\,dx < +\infty$ we can use trivial properties of $\liminf$ and $\limsup$ (the left hand side equals $\int g\,dx - \int f\,dx$, the right hand side equals $\int g\,dx - \limsup_{n} \int f_n\,dx$) and get

$$\int f(x)\,dx \ge \limsup_{n \to +\infty} \int f_n(x)\,dx. \tag{19}$$

We now set

$$H_n(x) = g(x) + f_n(x),$$

and since $H_n \ge 0$ almost everywhere we get by Fatou's lemma

$$\int \liminf_{n \to +\infty} \big( g(x) + f_n(x) \big)\,dx = \int \liminf_{n \to +\infty} H_n(x)\,dx \le \liminf_{n \to +\infty} \int H_n(x)\,dx = \liminf_{n \to +\infty} \int (g(x) + f_n(x))\,dx,$$

which implies

$$\int f(x)\,dx \le \liminf_{n \to +\infty} \int f_n(x)\,dx. \tag{20}$$

(19) and (20) imply

$$\limsup_{n \to +\infty} \int f_n(x)\,dx \le \int f(x)\,dx \le \liminf_{n \to +\infty} \int f_n(x)\,dx,$$

and the assertion is proven since

$$\liminf_{n \to +\infty} \int f_n(x)\,dx \le \limsup_{n \to +\infty} \int f_n(x)\,dx.$$

3.4 $L^p$ spaces

In this subsection we introduce one of the main classes of Banach spaces used in functional analysis, i.e. the $L^p$ spaces. They are constructed as function spaces on $\mathbb{R}^d$, and their theory makes use of the Lebesgue measure-integration theory developed above.
The theory we develop in this chapter will be stated for functions on $\mathbb{R}^d$ with values in $\mathbb{R}$, but everything can be easily generalised to the case of functions with values in $\mathbb{C}$.

3.27 definition. Let $p \in [1, +\infty)$, and let $\Omega \subset \mathbb{R}^d$ be a Lebesgue measurable set. For a measurable function $f : \Omega \to \mathbb{R}$ we define the $L^p$ norm of $f$ on $\Omega$ as the (finite or infinite) number

$$\|f\|_{L^p(\Omega)} := \left( \int_\Omega |f(x)|^p\,dx \right)^{1/p}.$$

Moreover, we set

$$C := \{ \alpha \in \mathbb{R} : |f(x)| \le \alpha \text{ almost everywhere on } \Omega \}.$$

The $L^\infty$ norm of $f$ (also called the essential supremum of $f$) is defined as

$$\|f\|_{L^\infty(\Omega)} = \inf C.$$

The essential supremum of $|f|$ is the minimum essential upper bound for $|f|$, namely the minimum $\alpha$ such that $|f| \le \alpha$ almost everywhere. It is easily seen that, in general

$$\|f\|_{L^\infty(\Omega)} \le \sup_{x \in \Omega} |f(x)|,$$

and very simple examples can be constructed in which the strict inequality holds above (basically we get a strict inequality any time $|f|$ achieves its supremum $\bar f$ only on a set of measure zero, and it is bounded above by a value strictly less than $\bar f$ elsewhere).

3.28 exercise. Prove that if $f : \mathbb{R}^d \to \mathbb{R}$ is continuous then $\|f\|_{L^\infty(\mathbb{R}^d)} = \sup_{x \in \mathbb{R}^d} |f(x)|$.

3.29 remark. A simple consequence of the above definition is that

$$|f(x)| \le \|f\|_{L^\infty(E)} \quad \text{almost everywhere on } E.$$

To see this, for all $k \in \mathbb{N}$ let

$$C_k = \left\{ x \in E : |f(x)| \le \|f\|_{L^\infty} + \frac{1}{k} \right\}.$$

Clearly, $m(E \setminus C_k) = 0$ for all $k \in \mathbb{N}$, because, for all $k \in \mathbb{N}$, the value $\|f\|_{L^\infty} + \frac{1}{k}$ is an upper bound for $|f|$ almost everywhere. Hence, $m\left( \bigcup_{k \in \mathbb{N}} (E \setminus C_k) \right) = 0$, and

$$\bigcup_{k \in \mathbb{N}} (E \setminus C_k) \supset E \setminus \{ x \in E : |f(x)| \le \|f\|_{L^\infty} \},$$

which implies

$$m\left( E \setminus \{ x \in E : |f(x)| \le \|f\|_{L^\infty} \} \right) = 0,$$

and therefore the assertion is proven.

Clearly, we would like to define a norm by means of the $L^p$ and $L^\infty$ norms. The problem is that, in general, $\|f\|_{L^p} = 0$ does not imply $f \equiv 0$, which is one of the axioms of a norm. Indeed, for all $p \ge 1$ the statement $\|f\|_{L^p(E)} = 0$ only implies that $f(x) = 0$ almost everywhere on $E$, but $f$ could still be nonzero on a set of null measure. We will see in a few pages how to bypass this problem.
Let $p \in [1, +\infty]$. The conjugate of $p$ is the number $p'$ defined by

$$\frac{1}{p} + \frac{1}{p'} = 1,$$

with the convention that $1/(+\infty) = 0$. In particular, $1$ is the conjugate of $+\infty$ and vice versa.

3.30 theorem (Hölder inequality). Let $f, g : E \to \mathbb{R}$ be measurable functions, and let $p, q \in [1, +\infty]$ be conjugate. Then,

$$\|fg\|_{L^1(E)} \le \|f\|_{L^p(E)} \|g\|_{L^q(E)}.$$

Proof. If $p = 1$ and $q = +\infty$, we have

$$\|fg\|_{L^1(E)} = \int_E |f(x) g(x)|\,dx \le \|g\|_{L^\infty} \int_E |f(x)|\,dx = \|g\|_{L^\infty} \|f\|_{L^1},$$

where the first inequality is justified by the fact that $g$ can be redefined on a set of measure zero in a way that $|g| \le \|g\|_{L^\infty}$ everywhere, and this does not affect the integral. The case $p = +\infty$, $q = 1$ is symmetric.
In the general case $p \in (1, +\infty)$, the statement is trivial if either $f$ or $g$ is zero almost everywhere, or if one of the two norms on the right hand side is infinite. Otherwise, we clearly have $0 < \|f\|_{L^p} < +\infty$ and $0 < \|g\|_{L^q} < +\infty$. For a fixed $\alpha > 0$ we have

$$|f(x) g(x)| = \left| \frac{f(x)}{\alpha} \right| |\alpha g(x)| \le \frac{1}{p} \left| \frac{f(x)}{\alpha} \right|^p + \frac{1}{q} |\alpha g(x)|^q,$$

where we have used Young's inequality (Exercise 1.79). By integrating the above inequality on $E$ we get

$$\|fg\|_{L^1(E)} \le \frac{1}{p \alpha^p} \|f\|_{L^p(E)}^p + \frac{\alpha^q}{q} \|g\|_{L^q(E)}^q.$$

We now choose $\alpha$ such that the two terms on the above right hand side become multiples of the same quantity, namely

$$\alpha := \frac{\|f\|_{L^p}^{1/q}}{\|g\|_{L^q}^{1/p}},$$

which yields

$$\|fg\|_{L^1(E)} \le \frac{1}{p} \frac{\|g\|_{L^q}}{\|f\|_{L^p}^{p/q}} \|f\|_{L^p}^p + \frac{1}{q} \frac{\|f\|_{L^p}}{\|g\|_{L^q}^{q/p}} \|g\|_{L^q}^q,$$

and the definition of $p$ and $q$ (which gives $p - p/q = 1$ and $q - q/p = 1$) implies that the right hand side above equals $\left( \frac{1}{p} + \frac{1}{q} \right) \|f\|_{L^p} \|g\|_{L^q} = \|f\|_{L^p} \|g\|_{L^q}$.

3.31 remark. Hölder inequality can be also rephrased as follows:

$$\int_E |f(x)|^\alpha |g(x)|^\beta\,dx \le \left( \int_E |f(x)|\,dx \right)^\alpha \left( \int_E |g(x)|\,dx \right)^\beta,$$

provided $\alpha, \beta > 0$ and $\alpha + \beta = 1$.

3.32 theorem (Minkowski's inequality). Let $f, g : E \to \mathbb{R}$ be measurable functions. Let $p \in [1, +\infty]$. Then,

$$\|f + g\|_{L^p(E)} \le \|f\|_{L^p(E)} + \|g\|_{L^p(E)}.$$

Proof. The case $p = +\infty$ is trivial, since for all $x \in E$ one has

$$|f(x) + g(x)| \le |f(x)| + |g(x)|,$$

and the right hand side is controlled by $\|f\|_{L^\infty} + \|g\|_{L^\infty}$ almost everywhere on $E$. This implies the assertion.
The case $p = 1$ is straightforward.
Let $p \in (1, +\infty)$. The statement is trivially satisfied if either $\|f\|_{L^p}$ or $\|g\|_{L^p}$ equals $+\infty$. Therefore, assume they are both finite. Then $\|f + g\|_{L^p}$ is finite as well, because $|f + g|^p \le 2^{p-1}(|f|^p + |g|^p)$, and we may assume $\|f + g\|_{L^p} > 0$. We observe that

$$|f(x) + g(x)|^p \le |f(x) + g(x)|^{p-1} (|f(x)| + |g(x)|) \tag{21}$$

$$= |f(x) + g(x)|^{p-1} |f(x)| + |f(x) + g(x)|^{p-1} |g(x)|. \tag{22}$$

Integrating on $E$ yields, via the Hölder inequality with exponents $p/(p-1)$ and $p$,

$$\int_E |f(x) + g(x)|^{p-1} |f(x)|\,dx \le \left( \int_E |f(x) + g(x)|^p\,dx \right)^{\frac{p-1}{p}} \left( \int_E |f(x)|^p\,dx \right)^{\frac{1}{p}} = \|f + g\|_{L^p}^{p-1} \|f\|_{L^p}.$$

By performing the same manipulation on the last term in (22), we get

$$\int_E |f(x) + g(x)|^p\,dx = \|f + g\|_{L^p}^p \le \|f + g\|_{L^p}^{p-1} \left( \|f\|_{L^p} + \|g\|_{L^p} \right),$$

which proves the assertion after dividing by $\|f + g\|_{L^p}^{p-1}$.
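
Both the Hölder and the Minkowski inequality can be sanity-checked on step functions, for which the integrals are finite sums. The sketch below is only an illustration under stated assumptions: the functions live on $[0,1)$ split into equal cells, and the helper names `lp_norm` and `conjugate` as well as the random test data are introduced here and are not part of the notes.

```python
import random

def lp_norm(values, p):
    """L^p norm on [0, 1) of the step function equal to values[i] on the i-th of len(values) equal cells."""
    n = len(values)
    if p == float("inf"):
        return max(abs(v) for v in values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)

def conjugate(p):
    """Conjugate exponent q with 1/p + 1/q = 1, using the convention 1/inf = 0."""
    if p == 1:
        return float("inf")
    if p == float("inf"):
        return 1.0
    return p / (p - 1.0)

random.seed(0)  # fixed seed so that the run is reproducible
f = [random.uniform(-2, 2) for _ in range(50)]
g = [random.uniform(-2, 2) for _ in range(50)]

for p in (1, 1.5, 2, 3, float("inf")):
    q = conjugate(p)
    fg = [a * b for a, b in zip(f, g)]
    f_plus_g = [a + b for a, b in zip(f, g)]
    # small tolerance only to absorb floating point rounding
    holder_ok = lp_norm(fg, 1) <= lp_norm(f, p) * lp_norm(g, q) + 1e-12
    minkowski_ok = lp_norm(f_plus_g, p) <= lp_norm(f, p) + lp_norm(g, p) + 1e-12
    print(p, q, holder_ok, minkowski_ok)
```

A numerical check of this kind proves nothing, but it is a quick way to catch a wrong exponent when manipulating the inequalities.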

Now, clearly the $L^p$ norm verifies

• $\|f\|_{L^p} \ge 0$,
• $\|\lambda f\|_{L^p} = |\lambda| \|f\|_{L^p}$,
• $\|f + g\|_{L^p} \le \|f\|_{L^p} + \|g\|_{L^p}$,
in the class of measurable functions on which the above quantities are finite. But this is not enough to make $\|\cdot\|_{L^p}$ a norm on such space, because in general $\|f\|_{L^p} = 0$ does not imply $f \equiv 0$. Please notice that the underlying linear space on which we are defining (or trying to define) a norm here is the space of measurable functions on $\mathbb{R}^d$ such that $\|f\|_{L^p} < +\infty$ (verify as an exercise that such a set is a linear space).

3.33 fact. Let us recall a simple fact from set theory. Given a set $X$, an equivalence on $X$ is a subset $E$ of $X \times X$ such that
(i) $(x, x) \in E$ for all $x \in X$,

(ii) $(x, y) \in E$ if and only if $(y, x) \in E$,

(iii) If $(x, y)$ and $(y, z)$ are in $E$, then $(x, z) \in E$.


We use the notation x ∼ y to denote ( x, y) ∈ E. Given an equivalence on X,
and given x ∈ X, we set

[ x ] = { y ∈ X : x ∼ y },

called the equivalence class of x. We call X/ ∼ the set of all equivalence classes
for the relation E. Such set is called the quotient set.

3.34 exercise. Let $X$ be a real (or complex) linear space. Let $\sim$ be an equivalence on $X$. Given $[x], [y] \in X/\sim$ and $\lambda, \mu \in \mathbb{R}$, set

$$\lambda [x] + \mu [y] = [\lambda x + \mu y] \in X/\sim.$$

Prove that such a definition is well posed (i.e. the class $[x] + [y]$ does not depend on the choice of the vectors $x, y$). Prove that $X/\sim$ is a real (or complex) linear space with the above defined operation.

3.35 definition. Let $E \subset \mathbb{R}^d$ be a measurable set and $p \in [1, +\infty]$. We call $\mathcal{L}^p(E)$ the set of measurable functions $f : E \to \mathbb{R}$ such that $\|f\|_{L^p} < +\infty$. Now, we set the following equivalence on $\mathcal{L}^p(E)$. Let $f, g \in \mathcal{L}^p(E)$. We say that $f \sim g$ if $f(x) = g(x)$ for almost every $x \in E$.$^7$ We set

$$L^p(E) = \mathcal{L}^p(E)/\sim,$$

i.e. $L^p(E)$ is the quotient (vector) space of $\mathcal{L}^p(E)$ through the relation $\sim$.$^8$ For a given equivalence class $[f] \in L^p(E)$, we define the norm of $[f]$ as $\|f\|_{L^p(E)}$ for an arbitrary representative $f \in [f]$. From now on, by abuse of notation, we shall identify $[f]$ with its representative $f$. The space $L^p(E)$ is the $L^p$ space on $E$.

3.36 exercise. Prove that the norm of $f \in L^p(E)$ is well defined.

3.37 remark. Clearly, the $L^p$ norm on $L^p(E)$ is now a norm, since all the 'good' properties proven above are easily inherited by the norm on the quotient space, and furthermore $\|f\|_{L^p} = 0$ implies $f = 0$ almost everywhere, hence $[f] = 0$. Therefore, $L^p(E)$ is a normed space.

We shall say that a sequence $f_n \in L^p(E)$ converges in $L^p$ to $f \in L^p$ if

$$\|f_n - f\|_{L^p} \to 0 \quad \text{as } n \to +\infty.$$

3.38 theorem (Riesz-Fischer). Let $E \subset \mathbb{R}^d$ be a measurable set and $p \in [1, +\infty]$. Then the space $L^p(E)$ is a Banach space. Moreover, if $p \in [1, +\infty)$ and $\{f_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence in $L^p(E)$, then there exist two functions $f, h \in L^p(E)$ and a subsequence $f_{n_k}$ of $f_n$ such that

(a) $|f_{n_k}(x)| \le h(x)$ almost everywhere on $E$,

(b) $f_{n_k} \to f$ almost everywhere on $E$.

Proof. Omitted.

$^7$ Prove that $\sim$ is actually an equivalence!
$^8$ All the vector space operations on the set of measurable functions with finite $L^p$ norm are inherited by the quotient space by considering operations between representatives. For instance, given $[f], [g] \in L^p$, $[f] + [g]$ is the equivalence class $[f + g]$. Prove that such operation is well defined. Define similarly $\lambda [f]$ for some $\lambda \in \mathbb{R}$.

The theorem above is quite important. Apart from stating that $L^p$ spaces are complete, it investigates the interplay between $L^p$ convergence and almost everywhere convergence. More precisely, it says that if a sequence $f_n$ converges in $L^p$ to some $f$, then $f_n$ has a subsequence that converges almost everywhere. The next example shows that, in general, convergence in $L^p$ does not imply convergence almost everywhere of the whole sequence.

3.39 example. Let $f_n : [0, 1] \to \mathbb{R}$ be defined as follows: $f_1(x) = \mathbf{1}_{[0,1/2)}(x)$, $f_2(x) = \mathbf{1}_{[1/2,1)}(x)$, and in general, for every integer $k \ge 1$ and every $j = 1, \dots, 2^k$,

$$f_n(x) = \mathbf{1}_{[(j-1)2^{-k},\, j 2^{-k})}(x) \quad \text{with } n = 2^k - 2 + j,$$

so that the $2^k$ dyadic intervals of length $2^{-k}$ are listed one after the other, level by level. One can easily see that the $L^1([0, 1])$ norm of $f_n$ equals $2^{-k}$ and hence tends to zero as $n \to +\infty$, so that $f_n \to 0$ in $L^1([0, 1])$. However, for every $x \in [0, 1)$, the set of integers $n$ such that $f_n(x) = 1$ is infinite, and therefore $f_n(x)$ cannot converge to zero. This is true for every $x \in [0, 1)$. Therefore, $f_n$ does not converge to zero on a set of measure 1. Hence, it is not true that $f_n$ converges to zero almost everywhere.
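
The behaviour of such a "typewriter" sequence can be inspected directly. The following is a minimal sketch in which the dyadic intervals of each level $k$ are listed one after another; this indexing, the helper name `typewriter_interval` and the sample point $x = 1/3$ are choices made here for illustration.

```python
from fractions import Fraction

def typewriter_interval(n):
    """Return (left, right) of the dyadic interval [left, right) carrying f_n, n >= 1.

    Level k >= 1 consists of the 2^k intervals of length 2^-k; the levels are
    listed one after the other, so n = 1, 2 is level 1, n = 3..6 is level 2, etc.
    """
    k = 1
    first = 1  # index of the first function of level k
    while n >= first + 2**k:
        first += 2**k
        k += 1
    j = n - first  # position inside level k, 0 <= j < 2^k
    return Fraction(j, 2**k), Fraction(j + 1, 2**k)

def f(n, x):
    """Value of the indicator f_n at the point x."""
    left, right = typewriter_interval(n)
    return 1 if left <= x < right else 0

x = Fraction(1, 3)
# indices 1..62 cover exactly the levels k = 1, ..., 5
hits = [n for n in range(1, 63) if f(n, x) == 1]
norms = [typewriter_interval(n)[1] - typewriter_interval(n)[0] for n in (1, 3, 7, 15, 31)]
print(hits)   # one index per level: f_n(x) = 1 keeps recurring
print(norms)  # L^1 norms at the start of each level: they halve at every level
```

The point $x$ is hit exactly once per level, so $f_n(x) = 1$ for infinitely many $n$, while the $L^1$ norm $2^{-k}$ goes to zero.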

On the other hand, does almost everywhere convergence imply $L^p$ convergence? This is also not true in general, as one can deduce from the example $f_n : \mathbb{R} \to \mathbb{R}$,

$$f_n(x) = \mathbf{1}_{[n,n+1)}(x),$$

in which $f_n$ converges to zero almost everywhere but not in $L^1$ (Exercise).

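3.40 definition (Support of a continuous function). Let $\Omega \subset \mathbb{R}^d$ be an open set. Let $f \in C(\Omega)$.$^9$ The support of $f$ in $\Omega$ is the closure of the set where $f$ does not vanish,

$$\mathrm{spt}(f) = \overline{\{ x \in \Omega : f(x) \ne 0 \}}.$$

If $\mathrm{spt}(f)$ is compact, we say that $f$ is compactly supported. The space of continuous compactly supported functions on $\Omega$ is denoted by $C_c(\Omega)$. We notice that $C_c(\Omega) \subset C_b(\Omega) \subset C(\Omega)$.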

We now recall the notion of distance between sets.

3.41 definition. Let $x \in \mathbb{R}^d$, $A, B \subset \mathbb{R}^d$. We set

$$d(x, A) = \inf\{ \|x - y\| : y \in A \},$$

and

$$d(A, B) = \inf\{ \|x - y\| : x \in A,\ y \in B \}.$$

Here $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^d$.

$^9$ We recall that $C(\Omega)$ is the space of continuous functions from $\Omega$ to $\mathbb{R}$.

3.42 exercise. The distance function defined above has the following properties.

• $d(A, B) = \inf_{x \in A} d(x, B) = \inf_{y \in B} d(y, A)$ (exercise).

• The map $\mathbb{R}^d \ni x \mapsto d(x, A)$ is continuous, indeed, the map is Lipschitz continuous with Lipschitz constant 1, i.e.

$$|d(x, A) - d(y, A)| \le \|x - y\|.$$

Left as an exercise. Hint: take an arbitrary point $z \in A$ and use the triangle inequality.

• If $K, C \subset \mathbb{R}^d$ with $K$ compact, $C$ closed, and $K \cap C = \emptyset$, then $d(K, C) > 0$ (exercise).

• Let $A \subset \mathbb{R}^d$ and $\delta > 0$. Set

$$A_\delta = \{ x \in \mathbb{R}^d : d(x, A) \le \delta \}.$$

Prove that $A \subset A_\delta$ and $A_\delta$ is closed. Moreover, if $K \subset \mathbb{R}^d$ is compact then $K_\delta$ is compact. This is an easy exercise.

• As a consequence of the above exercises, let $\Omega \subset \mathbb{R}^d$ be open, and let $K \subset \Omega$ be compact. Let $\delta_0 = d(K, \mathbb{R}^d \setminus \Omega)$. Clearly, $\delta_0 > 0$. Hence, for all $\delta < \delta_0$ one has

$$K \subset K_\delta \subset \Omega.$$

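The next proposition is a special case of a more general result in topology known as Urysohn's lemma.$^{10}$

3.43 proposition. Let $K \subset \Omega \subset \mathbb{R}^d$, with $K$ compact and $\Omega$ open. Then, there exists $\varphi \in C_c(\Omega)$ such that $\varphi(x) = 1$ for all $x \in K$ and $0 \le \varphi(x) \le 1$ for all $x \in \Omega$.

Proof. Let $\delta$ be such that $0 < \delta < d(K, \mathbb{R}^d \setminus \Omega)$. Set

$$\varphi(x) = \frac{d(x, \mathbb{R}^d \setminus K_\delta)}{d(x, \mathbb{R}^d \setminus K_\delta) + d(x, K)}.$$

Clearly, $d(x, \mathbb{R}^d \setminus K_\delta) + d(x, K) \ne 0$ for all $x \in \Omega$. Indeed, if $d(x, K) = 0$ then $x \in K$, and therefore $\|x - y\| \ge \delta$ for all $y \in \mathbb{R}^d \setminus K_\delta$, so that $d(x, \mathbb{R}^d \setminus K_\delta) \ge \delta > 0$. In particular $\varphi$ is continuous, by exercise 3.42. Moreover, $\varphi(x) \in [0, 1]$ for all $x \in \Omega$, and $x \in K$ implies $d(x, K) = 0$ and $\varphi(x) = 1$. Finally, $\varphi(x) \ne 0$ only if $d(x, \mathbb{R}^d \setminus K_\delta) \ne 0$, which implies $x \in K_\delta$. Hence, the support of $\varphi$ is contained in the compact set $K_\delta \subset \Omega$, and is therefore compact in $\Omega$.

$^{10}$ http://en.wikipedia.org/wiki/Urysohn's_lemma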
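
In dimension one the cut-off function of the proof can be written down explicitly. The sketch below assumes the concrete setting $K = [0, 1]$, $\Omega = (-1, 2)$ and $\delta = 1/2$; these choices and the helper names are illustrative and not taken from the notes.

```python
def dist_to_interval(x, a, b):
    """Distance from x to the closed interval [a, b]."""
    return max(a - x, 0.0, x - b)

def dist_to_complement(x, a, b):
    """Distance from x to the complement of the closed interval [a, b]."""
    return max(min(x - a, b - x), 0.0)

# Assumed setting: K = [0, 1], Omega = (-1, 2), so that d(K, R \ Omega) = 1.
a, b = 0.0, 1.0
delta = 0.5  # any 0 < delta < 1 works; here K_delta = [a - delta, b + delta]

def phi(x):
    """phi(x) = d(x, R \\ K_delta) / (d(x, R \\ K_delta) + d(x, K))."""
    num = dist_to_complement(x, a - delta, b + delta)
    return num / (num + dist_to_interval(x, a, b))

for x in (-0.75, -0.5, -0.25, 0.0, 0.5, 1.0, 1.25, 1.5, 1.75):
    print(x, phi(x))  # equal to 1 on K, 0 outside K_delta, in between otherwise
```

The function is piecewise linear here: it ramps from 0 to 1 across the layer $K_\delta \setminus K$ of width $\delta$ on each side of $K$.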

In what follows we shall denote with S(Ω) the space of simple functions
on Ω which are zero outside a bounded set.

3.44 theorem (Density of $C_c$ in $L^p$). Let $\Omega \subset \mathbb{R}^d$ be an open set.

(i) The space $S(\Omega)$ is dense in $L^p(\Omega)$ if $p \in [1, +\infty)$.

(ii) $C_c(\Omega)$ is dense in $L^p(\Omega)$ if $p \in [1, +\infty)$.

(iii) $C_c(\Omega)$ is not dense in $L^\infty(\Omega)$. $S(\Omega)$ is dense in $L^\infty(\Omega)$ if $\Omega$ is bounded.

Proof. Proof of (i). Let $f \in L^p(\Omega)$. We have to construct a sequence of simple functions $\phi_j \in S(\Omega)$ with $\|\phi_j - f\|_{L^p} \to 0$ as $j \to +\infty$. Assume first that $f \ge 0$. We know from Exercise 3.17 that there exists a sequence of nonnegative simple functions $\phi_j \in S(\Omega)$ with $\phi_j \nearrow f$ almost everywhere in case $f$ is bounded and zero outside a bounded set. In the general case of $f \ge 0$, for a given $n \in \mathbb{N}$ there exists a sequence $\phi_{n,j} \nearrow \min\{f, n\}\,\mathbf{1}_{B_n(0)}$ as $j \to +\infty$. Consider the diagonal sequence $\phi_j := \phi_{j,j} \in S(\Omega)$. Let $\varepsilon > 0$. For almost every $x \in \Omega$ one has $x \in B_j(0)$ and $f(x) \le j$ for $j$ large enough, and hence $|f(x) - \phi_{j,j}(x)| \le \varepsilon$ for $j$ large enough. Hence, the claim is true also for a general $f \ge 0$, that is, $\phi_j \to f$ almost everywhere with $0 \le \phi_j \le f$. Now, this implies

$$0 \le |f - \phi_j|^p \le |f|^p,$$

almost everywhere on $\Omega$. Therefore, we can apply Lebesgue dominated convergence theorem 3.26 to get

$$\int |f(x) - \phi_j(x)|^p\,dx \to 0 \quad \text{as } j \to +\infty.$$

The general case of a sign changing $f$ can be solved by splitting $f = f_+ - f_-$, constructing sequences of simple functions as above for $f_+$ and $f_-$, and applying the previous step.
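Proof of (ii). Let $f \in L^p(\Omega)$ and let $\varepsilon > 0$. Due to (i) there exists a simple function $\phi$ on $\Omega$ such that $\|f - \phi\|_{L^p} \le \frac{\varepsilon}{2}$. The proof will be completed once we find a continuous, compactly supported function $g$ on $\Omega$ such that $\|g - \phi\|_{L^p} \le \frac{\varepsilon}{2}$. Assume first that $\phi = \alpha \mathbf{1}_F$ for some measurable bounded set $F \subset \Omega$ and some $\alpha \in \mathbb{R}$, $\alpha \ne 0$. Fix $\sigma > 0$. Let $K \subset F \subset A$, $K$ compact and $A$ open, such that $m(A) - m(K) < \sigma$. From Proposition 3.43, we know that there exists a function $\tilde g \in C_c(A)$ such that $0 \le \tilde g \le 1$ and $\tilde g = 1$ on $K$. Let $g = \alpha \tilde g$. We have

$$\|g - \phi\|_{L^p}^p = \int_\Omega |g(x) - \phi(x)|^p\,dx = \int_\Omega |\alpha \tilde g - \alpha \mathbf{1}_F|^p\,dx \le |\alpha|^p\, m(A \setminus K) \le |\alpha|^p \sigma,$$

and choosing $\sigma = (\varepsilon / (2|\alpha|))^p$ one has $\|g - \phi\|_{L^p} \le \frac{\varepsilon}{2}$. Assume now that

$$\phi = \sum_{j=1}^{N} \alpha_j \mathbf{1}_{F_j},$$

with $F_j$ measurable and bounded sets. From the previous case we can find continuous functions $g_j$ such that

$$\|g_j - \mathbf{1}_{F_j}\|_{L^p} \le \frac{\varepsilon}{2 \sum_{j=1}^{N} |\alpha_j|}.$$

Setting $g = \sum_{j=1}^{N} \alpha_j g_j$, we have

$$\|g - \phi\|_{L^p} = \Big\| \sum_{j=1}^{N} \alpha_j (g_j - \mathbf{1}_{F_j}) \Big\|_{L^p} \le \sum_{j=1}^{N} |\alpha_j|\, \|g_j - \mathbf{1}_{F_j}\|_{L^p} \le \frac{\varepsilon}{2},$$

and the assertion (ii) is proven.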


Proof of (iii). $C_c(\Omega)$ cannot be dense in $L^\infty(\Omega)$. Indeed, take $f \in L^\infty(\Omega)$ with a jump discontinuity at one point, for instance the indicator function of a ball contained in $\Omega$. The density property would imply that there exists a sequence of continuous functions $f_j$ on $\Omega$ that converge in $L^\infty$ to $f$. But for continuous functions the convergence in $L^\infty$ is equivalent to the uniform convergence, and this is in contradiction with the well known fact that a uniform limit of continuous functions is continuous.
If $\Omega$ is bounded, then the statement that simple functions are dense in $L^\infty$ is an immediate consequence of exercise 3.17.

3.45 theorem (Separability of $L^p$). $L^p(\Omega)$ is separable if $p \in [1, +\infty)$. $L^\infty(\Omega)$ is not separable.

Proof. We omit the proof of the case $p < +\infty$, which is obtained from the previous theorem using the density of simple functions and via approximation by rectangles with rational edges.
Case $p = +\infty$. If we prove that there exists a family of open balls in $L^\infty(\Omega)$ which are pairwise disjoint and with an uncountable cardinality, the proof will be completed. Indeed, if such property is satisfied, any dense subset $S$ in $L^\infty$ should have at least one element in each of the above open balls, and this makes it impossible for $S$ to be countable. Now, given two open balls $B, B' \subset \Omega$, assuming that $B \ne B'$, one has

$$\|\mathbf{1}_B - \mathbf{1}_{B'}\|_{L^\infty} = 1,$$

and the proof is an easy exercise. Now, for a given ball $B \subset \Omega$, set

$$U_B = \left\{ g \in L^\infty : \|g - \mathbf{1}_B\|_{L^\infty} < \frac{1}{2} \right\}.$$

Clearly, the family

$$U = \{ U_B : B \text{ is an open ball in } \Omega \}$$

is uncountable, and every two distinct elements in $U$ are disjoint. This proves the assertion.

3.46 remark. Let $\Omega \subset \mathbb{R}^d$ be open, and let $p, q \in [1, +\infty]$ with $p \le q$. Is there any relation between $L^p$ and $L^q$? More precisely, is one of the two spaces a subset of the other one? In general the answer is negative. As an example, let $p = 1$ and $q = 2$, and let $\Omega = (0, +\infty) \subset \mathbb{R}$. Take $f(x) = \frac{1}{1+x}$. Clearly, $f \notin L^1(\Omega)$, whereas $f \in L^2(\Omega)$. Now, let $g(x) = \frac{1}{\sqrt{x}} \mathbf{1}_{(0,1)}(x)$. Clearly, $g \in L^1(\Omega)$ but $g \notin L^2(\Omega)$. Hence, $L^1(\Omega)$ is not a subset of $L^2(\Omega)$ and $L^2(\Omega)$ is not a subset of $L^1(\Omega)$.
On the other hand, if $m(\Omega) < +\infty$, the $L^p$ spaces are ordered. Indeed, let $p \le q$: then $L^q(\Omega) \subseteq L^p(\Omega)$. To see this, assume first $q < +\infty$. We compute

$$\int_\Omega |f|^p\,dx = \int_{|f| \ge 1} |f|^p\,dx + \int_{|f| < 1} |f|^p\,dx \le \int_{|f| \ge 1} |f|^q\,dx + m(\Omega) \le \int_\Omega |f|^q\,dx + m(\Omega),$$

and hence $\|f\|_{L^q} < +\infty$ implies $\|f\|_{L^p} < +\infty$. Now, let us consider the case $q = +\infty$. We have

$$\int_\Omega |f|^p\,dx \le \|f\|_{L^\infty}^p\, m(\Omega),$$

and this proves the assertion.
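
The two counterexamples on $(0, +\infty)$ can be made quantitative through the closed-form values of the truncated integrals. The sketch below is only illustrative: the truncation parameters `R` and `eps` and the function names are assumptions introduced here, while the antiderivatives are the elementary ones.

```python
import math

# f(x) = 1/(1+x) on (0, R): closed-form integrals of |f| and |f|^2.
def f_l1(R):
    return math.log(1.0 + R)          # integral of 1/(1+x) over (0, R)

def f_l2_squared(R):
    return 1.0 - 1.0 / (1.0 + R)      # integral of 1/(1+x)^2 over (0, R)

# g(x) = x^(-1/2) on (eps, 1): closed-form integrals of |g| and |g|^2.
def g_l1(eps):
    return 2.0 - 2.0 * math.sqrt(eps)  # integral of x^(-1/2) over (eps, 1)

def g_l2_squared(eps):
    return -math.log(eps)              # integral of 1/x over (eps, 1)

for R in (1e1, 1e3, 1e6):
    print(R, f_l1(R), f_l2_squared(R))        # first grows without bound, second stays below 1
for eps in (1e-1, 1e-3, 1e-6):
    print(eps, g_l1(eps), g_l2_squared(eps))  # first stays below 2, second grows without bound
```

The function $f$ fails to be summable because of its slow decay at infinity, while $g$ fails to be square summable because of its singularity at the origin; on a set of finite measure only the second mechanism is available, which is why the inclusion $L^q \subseteq L^p$ holds there.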

Having defined the new family of functional spaces $L^p(\Omega)$ for $p \in [1, +\infty]$ allows us to consider a new notion of convergence for sequences of functions. Given $\Omega \subset \mathbb{R}^d$ a measurable set, we have that $L^p(\Omega)$ is a complete normed space, i.e. a Banach space. As such, it encompasses a notion of convergence. A sequence $f_n \in L^p(\Omega)$ converges to $f$ in $L^p$ if $\|f_n - f\|_{L^p} \to 0$ as $n \to +\infty$. The above remark shows that there is no relationship between convergence in $L^p(\Omega)$ and convergence in $L^q(\Omega)$ for $p \ne q$ unless $\Omega$ has finite measure. In this case, the convergence in $L^1$ is weaker than any other $L^p$ convergence, whereas the $L^\infty$ one is the strongest. The uniform convergence is stronger than the $L^\infty$ convergence on an arbitrary measurable set $\Omega$ (even if $\Omega$ is unbounded).

3.47 definition. Let $\Omega \subset \mathbb{R}^d$ be open. Let $p \in [1, +\infty]$. The vector space $L^p_{loc}(\Omega)$ is the set of all measurable functions $f : \Omega \to \mathbb{R}$ such that, for every compact subset $K \subset \Omega$, one has $f \mathbf{1}_K \in L^p(\Omega)$, or equivalently $f \in L^p(K)$.

3.48 exercise. Let $p \in [1, +\infty]$. Prove that if $f \in L^p$ then $f \in L^p_{loc}$. Show that the converse is not true in general. Prove that $L^q_{loc} \subset L^p_{loc}$ if $p \le q$.

Note in particular that $L^p_{loc}(\Omega) \subset L^1_{loc}(\Omega)$ for all $p \ge 1$.

3.5 Convolution, regularisation and $L^p_{loc}$ spaces

We first define the convolution product of a function $f \in L^1(\mathbb{R}^d)$ with a function $g \in L^p(\mathbb{R}^d)$.

3.49 theorem (Young's inequality for convolutions). Let $f \in L^1(\mathbb{R}^d)$ and $g \in L^p(\mathbb{R}^d)$ with $p \in [1, +\infty]$. Then, for almost every $x \in \mathbb{R}^d$ the function $y \mapsto f(x - y) g(y)$ is summable on $\mathbb{R}^d$ and we define

$$(f * g)(x) = \int_{\mathbb{R}^d} f(x - y) g(y)\,dy.$$

In addition $f * g \in L^p(\mathbb{R}^d)$ and we have

$$\|f * g\|_{L^p(\mathbb{R}^d)} \le \|f\|_{L^1(\mathbb{R}^d)} \|g\|_{L^p(\mathbb{R}^d)}.$$

Proof. The conclusion is trivial if $p = +\infty$. We now consider the case $p = 1$. Set $F(x, y) = f(x - y) g(y)$. For a.e. $y \in \mathbb{R}^d$

$$\int_{\mathbb{R}^d} |F(x, y)|\,dx = |g(y)| \int_{\mathbb{R}^d} |f(x - y)|\,dx = |g(y)|\, \|f\|_{L^1(\mathbb{R}^d)} < +\infty.$$

Moreover,

$$\int_{\mathbb{R}^d} dy \int_{\mathbb{R}^d} |F(x, y)|\,dx = \|g\|_{L^1(\mathbb{R}^d)} \|f\|_{L^1(\mathbb{R}^d)} < +\infty.$$

From Fubini's theorem we get that $F \in L^1(\mathbb{R}^d \times \mathbb{R}^d)$ and that we can exchange the order of integration to obtain the desired assertion.
We omit the proof of the case $p \in (1, +\infty)$.
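
A discrete analogue of the inequality can be checked by hand on finite sequences. The sketch below replaces the Lebesgue measure on $\mathbb{R}^d$ by the counting measure on the integers, which is an assumption made purely for illustration (the corresponding inequality $\|f * g\|_{\ell^p} \le \|f\|_{\ell^1} \|g\|_{\ell^p}$ is the sequence-space version of the theorem); the sample sequences and helper names are hypothetical.

```python
def convolve(f, g):
    """Full discrete convolution (f * g)[k] = sum_j f[j] * g[k - j] of finite sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def lp_norm(seq, p):
    """l^p norm of a finite sequence (counting measure on the integers)."""
    if p == float("inf"):
        return max(abs(v) for v in seq)
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)

f = [0.5, -1.0, 2.0, 0.25]
g = [1.0, 3.0, -2.0, 0.0, 1.5]

for p in (1, 2, 4, float("inf")):
    lhs = lp_norm(convolve(f, g), p)
    rhs = lp_norm(f, 1) * lp_norm(g, p)
    print(p, lhs, rhs, lhs <= rhs + 1e-12)  # tolerance only absorbs rounding
```

The role of $f$ is asymmetric: it is always measured in the $L^1$ norm, and the convolution stays in the same space as $g$.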

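In the sequel we shall denote

$$\check f(x) = f(-x).$$

3.50 remark. Let $f \in L^1(\mathbb{R}^d)$, $g \in L^p(\mathbb{R}^d)$, and $h \in L^{p'}(\mathbb{R}^d)$. Then we have

$$\int_{\mathbb{R}^d} (f * g)\, h\,dx = \int_{\mathbb{R}^d} g\, (\check f * h)\,dx.$$

To prove this, let

$$F(x, y) = f(x - y) g(y) h(x),$$

which belongs to $L^1(\mathbb{R}^d \times \mathbb{R}^d)$ because

$$\int_{\mathbb{R}^d} |h(x)| \int_{\mathbb{R}^d} |f(x - y)||g(y)|\,dy\,dx < +\infty$$

in view of Hölder's inequality and the previous Theorem. Moreover,

$$\int_{\mathbb{R}^d} (f * g)(x) h(x)\,dx = \int_{\mathbb{R}^d} dx \int_{\mathbb{R}^d} F(x, y)\,dy = \int_{\mathbb{R}^d} dy \int_{\mathbb{R}^d} F(x, y)\,dx = \int_{\mathbb{R}^d} g(y) (\check f * h)(y)\,dy.$$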

We now want to refine our concept of support for $L^p$ functions. The one we have defined so far only applies to continuous functions. The problem with the $L^p$ space is that it is a set of equivalence classes, therefore the usual definition does not apply. As an example, consider the indicator function of the set $D = [0, 1] \cap \mathbb{Q}$. The support of this function with the usual definition would be $[0, 1]$, because the latter is the closure of $D$. However, this function is almost everywhere equal to zero, and the support of zero is the empty set. Hence, this definition is not well posed. To bypass this problem, we proceed as follows.
Let $f : \mathbb{R}^d \to \mathbb{R}$ be any function. Consider the family $\{\omega_i\}_{i \in I}$ of all open sets on which $f = 0$ almost everywhere. Set $\omega = \bigcup_{i \in I} \omega_i$. We claim that $f = 0$ almost everywhere on $\omega$. To see this, consider a countable family of open sets $O_n$ in $\mathbb{R}^d$ such that every open set can be written as union of some $O_n$. This is doable for instance by considering open balls with center having rational components and rational radius. Write $\omega_i = \bigcup_{n \in A_i} O_n$, which implies $\omega = \bigcup_{n \in B} O_n$ where $B = \bigcup_{i \in I} A_i$. For all $n \in B$ we have $n \in A_i$ for some $i \in I$. Since $f$ is zero almost everywhere on $\omega_i$, we have $f = 0$ almost everywhere on $O_n$. Then, $f$ is zero almost everywhere on every $O_n$ included in $\omega$, which, $B$ being countable, implies that $f = 0$ almost everywhere on $\omega$.
We then set, by definition, $\mathrm{supp}(f)$ as $\mathbb{R}^d \setminus \omega$. We immediately see that if $f_1 = f_2$ almost everywhere then $\mathrm{supp}(f_1) = \mathrm{supp}(f_2)$. This is due to the fact that if $f_1$ and $f_2$ coincide almost everywhere they cannot differ on a nontrivial open set.

3.51 exercise. Check that the above definition coincides with the usual one in
case f is continuous.

We recall that, given two subsets A, B ⊂ Rn , the set A + B is defined as

A + B = { x ∈ Rn : x = y + z for some y ∈ A and z ∈ B} .

3.52 proposition. Let f ∈ L1 (Rd ) and g ∈ L p (Rd ) with p ∈ [1, +∞]. Then,

supp( f ∗ g) ⊂ supp( f ) + supp( g).

Proof. Omitted

We remark that if both $f$ and $g$ have compact support then so does $f * g$ (Exercise!).
We now start investigating how convolutions inherit the regularity of just one of the two factors.

3.53 proposition. Let $f \in C_c(\mathbb{R}^d)$ and $g \in L^1_{loc}(\mathbb{R}^d)$. Then $f * g$ is continuous on $\mathbb{R}^d$.

Proof. Let $x_n \to x$. We first notice that

$$y \mapsto f(x - y) g(y)$$

has a well defined Lebesgue integral, because $y$ ranges in the compact set $x - \mathrm{supp}(f)$ and $g$ is summable on that set. Now, by possibly fattening the support of $f$, we can find a compact set $K$ containing the set $x_n - \mathrm{supp}(f)$ for large enough $n$. Therefore, if $y \notin K$, then $x_n - y$ does not belong to the support of $f$ and therefore $f(x_n - y) = 0$. Since $f$ is continuous and compactly supported, $f$ is uniformly continuous. Hence, using that every uniformly continuous function $f$ has a modulus of continuity

$$\omega(\delta) = \sup\{ |f(x) - f(y)| : |x - y| \le \delta \}$$

tending to zero as $\delta \searrow 0$, we get

$$|f(x_n - y) - f(x - y)| \le \omega(\delta_n) \mathbf{1}_K(y),$$

as $|x_n - x| \le \delta_n \searrow 0$. By integrating with respect to $y$ we get

$$|(f * g)(x_n) - (f * g)(x)| \le \int_{\mathbb{R}^d} |g(y)||f(x_n - y) - f(x - y)|\,dy \le \omega(\delta_n) \int_K |g(y)|\,dy,$$

which proves the assertion since the last integral above is finite.

3.54 proposition. Let f ∈ Cck Rd ) and g ∈ L1loc (Rd ). Then f ∗ g ∈ C k (Rd ) and

Dα ( f ∗ g) = ( Dα f ) ∗ g

for any multi-index α with length less than k.

Proof. By induction we only need to prove the case k = 1. Let x ∈ Rd . We


claim that f ∗ g is differentiable at x and that ∇( f ∗ g)( x ) = ((∇ f ) ∗ g)( x ). Let
us fix h ∈ Rd with |h| < 1. For all y ∈ Rd we have

| f ( x + h − y) − f ( x − y) − h · ∇ f ( x − y)|
Z1
= (h · ∇ f ( x − y + sh) − h · ∇ f ( x − y))ds .
0

Now, due to the uniform continuity of f and its first derivative on supp( f ), the
aboev integral can be controlled by |h|ω (h) for some modulus of continuity
ω (h) & 0 as |h| & 0. Let K be a compact set such that x + B1 (0) − supp( f ) ⊂
K. If y 6∈ K then x − y + h 6∈ supp( f ) for all h with |h| < 1. Therefore, for
y 6∈ K and |h| < 1,

f ( x + h − y) − f ( x − y) − h · ∇ f ( x − y) = 0 .

Therefore, similarly to the previous proposition

| f ( x + h − y) − f ( x − y) − h · ∇ f ( x − y)| ≤ |h|ω (|h|)1K (y) .


Hence,

\[ |(f * g)(x + h) - (f * g)(x) - h \cdot ((\nabla f) * g)(x)| \le \int_{\mathbb{R}^d} |g(y)|\, |f(x + h - y) - f(x - y) - h \cdot \nabla f(x - y)|\, dy \le |h|\,\omega(|h|) \int_K |g(y)|\, dy, \]

which implies the assertion by letting $|h| \searrow 0$.

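3.55 definition (Mollifiers). A sequence of mollifiers $\rho_n$ is a sequence of functions on $\mathbb{R}^d$ such that

\[ \rho_n \in C_c^\infty(\mathbb{R}^d), \qquad \mathrm{supp}(\rho_n) \subset B_{1/n}(0), \qquad \int_{\mathbb{R}^d} \rho_n(x)\, dx = 1, \qquad \rho_n \ge 0. \]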

It is very easy to generate a family of mollifiers as follows. Take

\[ \rho(x) = \begin{cases} e^{1/(|x|^2 - 1)} & \text{if } |x| < 1, \\ 0 & \text{if } |x| \ge 1. \end{cases} \]

Then set $\rho_n(x) = C n^d \rho(nx)$ with

\[ C = \left( \int_{\mathbb{R}^d} \rho(x)\, dx \right)^{-1}. \]
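The construction above can be checked numerically in dimension $d = 1$. The following is a minimal sketch, not part of the original notes: the helper names `rho`, `integrate` and `rho_n` are hypothetical, and the normalising constant is only approximated by a midpoint rule.

```python
import math


def rho(x: float) -> float:
    """Unnormalised bump: exp(1 / (x^2 - 1)) for |x| < 1, zero otherwise."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(1.0 / (x * x - 1.0))


def integrate(func, a: float, b: float, steps: int = 20000) -> float:
    """Composite midpoint rule on [a, b] (an assumed, simple quadrature)."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


# Normalising constant C = (integral of rho)^(-1), here with d = 1.
C = 1.0 / integrate(rho, -1.0, 1.0)


def rho_n(n: int, x: float) -> float:
    """Mollifier rho_n(x) = C * n^d * rho(n x) with d = 1."""
    return C * n * rho(n * x)


for n in (1, 4, 16):
    # The mass stays (numerically) equal to one while the support shrinks.
    mass = integrate(lambda x: rho_n(n, x), -1.0 / n, 1.0 / n)
    print(n, round(mass, 6), rho_n(n, 1.0 / n))
```

The scaling factor $n^d$ is exactly what keeps the integral equal to one when the support shrinks to $B_{1/n}(0)$.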

3.56 proposition. Assume $f \in C(\mathbb{R}^d)$. Then $\rho_n * f \to f$ uniformly on compact sets.

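Proof. Fix a compact set $K$ in $\mathbb{R}^d$. Since $f$ is uniformly continuous on compact sets, given $\varepsilon > 0$ there exists a $\delta > 0$ such that

\[ |f(x - y) - f(x)| < \varepsilon \]

provided $|y| < \delta$ and for all $x \in K$. Clearly, the $\delta$ depends on $\varepsilon$ and on $K$. Now, since $\rho_n$ has unit integral and is supported in $B_{1/n}(0)$,

\[ (\rho_n * f)(x) - f(x) = \int_{\mathbb{R}^d} \rho_n(y) \big( f(x - y) - f(x) \big)\, dy = \int_{B_{1/n}(0)} \rho_n(y) \big( f(x - y) - f(x) \big)\, dy . \]

For $n > 1/\delta$ and $x \in K$ we get

\[ |(\rho_n * f)(x) - f(x)| \le \varepsilon \int_{\mathbb{R}^d} \rho_n(y)\, dy = \varepsilon . \]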


3.57 theorem. Assume f ∈ L p (Rd ) with p ∈ [1, +∞). Then (ρn ∗ f ) → f in L p .

Proof. Given $\varepsilon > 0$, we know there is a function $f_1 \in C_c(\mathbb{R}^d)$ with $\|f - f_1\|_{L^p} < \varepsilon$. We know that $\rho_n * f_1$ converges to $f_1$ uniformly on compact sets. On the other hand

\[ \mathrm{supp}(\rho_n * f_1) \subset B_{1/n}(0) + \mathrm{supp}(f_1) \subset B_1(0) + \mathrm{supp}(f_1), \]

which is contained in a fixed compact set. Hence, it easily follows that

\[ \|\rho_n * f_1 - f_1\|_{L^p} \to 0 . \]

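Now,

\[ \rho_n * f - f = \rho_n * (f - f_1) + (\rho_n * f_1 - f_1) + (f_1 - f), \]

which gives

\[ \|\rho_n * f - f\|_{L^p} \le 2 \|f - f_1\|_{L^p} + \|\rho_n * f_1 - f_1\|_{L^p}, \]

as a consequence of Young's inequality for convolutions, which yields $\|\rho_n * (f - f_1)\|_{L^p} \le \|\rho_n\|_{L^1} \|f - f_1\|_{L^p} = \|f - f_1\|_{L^p}$. Therefore,

\[ \limsup_{n \to +\infty} \|\rho_n * f - f\|_{L^p} \le 2\varepsilon \]

and the assertion follows from the arbitrariness of $\varepsilon > 0$.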
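The convergence stated in the theorem can be observed numerically for $p = 1$ and $d = 1$. The sketch below is only an illustration under stated assumptions: $f$ is taken to be the indicator of $[0, 1]$, the convolution and the $L^1$ norm are approximated by midpoint rules, and all names (`mollified`, `l1_error`, the grid sizes) are hypothetical choices.

```python
import math


def rho(t: float) -> float:
    """Unnormalised bump supported in (-1, 1)."""
    return math.exp(1.0 / (t * t - 1.0)) if abs(t) < 1.0 else 0.0


M = 200  # quadrature nodes across the support of the mollifier
NODES = [-1.0 + (j + 0.5) * 2.0 / M for j in range(M)]
_raw = [rho(t) for t in NODES]
_total = sum(_raw)
WEIGHTS = [w / _total for w in _raw]  # discrete weights summing to one


def f(x: float) -> float:
    """Indicator of [0, 1]: an L^1 function that is not continuous."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def mollified(n: int, x: float) -> float:
    """Approximates (rho_n * f)(x), using the substitution y = t / n."""
    return sum(w * f(x - t / n) for t, w in zip(NODES, WEIGHTS))


def l1_error(n: int, grid: int = 1200) -> float:
    """Midpoint-rule approximation of the L^1 norm of rho_n * f - f on [-1, 2]."""
    a, b = -1.0, 2.0
    h = (b - a) / grid
    xs = (a + (i + 0.5) * h for i in range(grid))
    return h * sum(abs(mollified(n, x) - f(x)) for x in xs)


errors = {n: l1_error(n) for n in (2, 4, 8)}
for n, err in errors.items():
    print(n, round(err, 4))
```

The error is concentrated near the two jumps of $f$, on intervals of length $2/n$, which is why it shrinks as $n$ grows even though $\rho_n * f$ never converges uniformly to $f$.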

3.58 corollary. Let $\Omega \subset \mathbb{R}^d$ be open and let $p \in [1, +\infty)$. Then $C_c^\infty(\Omega)$ is dense in $L^p(\Omega)$.

Proof. The proof is rather technical, but it is essentially based on the above
results. We omit it.

3.59 proposition. Let $u \in L^1_{loc}(\Omega)$, $\Omega \subset \mathbb{R}^d$ open. If $\int_\Omega u \varphi\, dx = 0$ for all $\varphi \in C_c^\infty(\Omega)$, then $u = 0$ almost everywhere on $\Omega$.

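Proof. Let $g \in L^\infty(\mathbb{R}^d)$ be a function with compact support contained in $\Omega$. Set $g_n = \rho_n * g$. Hence, for large enough $n$, $g_n \in C_c^\infty(\Omega)$, with supports contained in a fixed compact subset of $\Omega$. Hence, by assumption we have for all $n$ large enough

\[ \int_\Omega u\, g_n\, dx = 0 . \]

Since $g_n \to g$ in $L^1(\mathbb{R}^d)$, there exists a subsequence of $g_n$ (still denoted by $g_n$ for simplicity) converging to $g$ almost everywhere on $\mathbb{R}^d$. Moreover, Young's inequality for convolutions implies $\|g_n\|_{L^\infty} \le \|g\|_{L^\infty}$, so that $|u\, g_n|$ is dominated by $\|g\|_{L^\infty} |u|$ restricted to that compact set, which is summable. Hence, by dominated convergence we get

\[ \int_\Omega u\, g\, dx = 0 . \]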


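Let $K$ be a compact subset of $\Omega$ and set

\[ g(x) = \begin{cases} \mathrm{sign}(u(x)) & \text{if } x \in K, \\ 0 & \text{otherwise}. \end{cases} \]

We deduce

\[ 0 = \int_{\mathbb{R}^d} u\, g\, dx = \int_K u\, \mathrm{sign}(u)\, dx = \int_K |u|\, dx . \]

Hence, $|u| = 0$ almost everywhere on $K$. Since $K$ is arbitrary, $|u| = 0$ almost everywhere on $\Omega$.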

3.6 A criterion for strong compactness in L p

In this subsection we shall use the shift function

(τh f )( x ) = f ( x + h) .

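3.60 theorem (Kolmogorov-Riesz-Fréchet). Let $\mathcal{F}$ be a bounded set in $L^p(\mathbb{R}^d)$ with $1 \le p < +\infty$. Assume further that

\[ \lim_{|h| \searrow 0} \|\tau_h f - f\|_{L^p(\mathbb{R}^d)} = 0 \quad \text{uniformly in } f \in \mathcal{F}, \tag{23} \]

that is, for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\|\tau_h f - f\|_{L^p(\mathbb{R}^d)} < \varepsilon$ for all $|h| < \delta$ and for all $f \in \mathcal{F}$. Then, $\mathcal{F}|_\Omega$ is relatively compact in $L^p(\Omega)$ for any measurable $\Omega \subset \mathbb{R}^d$ having finite measure.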

Proof. The proof is performed in four steps.


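Step 1. Under the assumptions above, we claim that

\[ \|\rho_n * f - f\|_{L^p(\mathbb{R}^d)} \le \varepsilon \]

for all $f \in \mathcal{F}$ and for all $n > 1/\delta$. Indeed, Hölder's inequality (with $p'$ the conjugate exponent of $p$) implies

\[ |(\rho_n * f)(x) - f(x)| \le \int \rho_n(y)\, |f(x - y) - f(x)|\, dy = \int \rho_n(y)^{1/p} |f(x - y) - f(x)|\, \rho_n(y)^{1/p'}\, dy \]
\[ \le \left( \int \rho_n(y)\, |f(x - y) - f(x)|^p\, dy \right)^{1/p} \left( \int \rho_n(y)\, dy \right)^{1/p'} = \left( \int \rho_n(y)\, |f(x - y) - f(x)|^p\, dy \right)^{1/p} . \]

Hence, by Fubini's theorem,

\[ \int |(\rho_n * f)(x) - f(x)|^p\, dx \le \iint \rho_n(y)\, |f(x - y) - f(x)|^p\, dy\, dx = \int \rho_n(y)\, \|\tau_{-y} f - f\|_{L^p(\mathbb{R}^d)}^p\, dy , \]

and, since $\rho_n$ is supported in $\{|y| < 1/n\}$, by assumption the above is controlled, for $n > 1/\delta$, by $\varepsilon^p$.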


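Step 2. We claim that there exists a constant $C_n$ depending only on $n$ such that

\[ \|\rho_n * f\|_{L^\infty} \le C_n \|f\|_{L^p} \quad \text{for all } f \in \mathcal{F} \]

and

\[ |(\rho_n * f)(x_1) - (\rho_n * f)(x_2)| \le C_n \|f\|_{L^p}\, |x_1 - x_2| \quad \text{for all } f \in \mathcal{F} \text{ and for all } x_1, x_2 \in \mathbb{R}^d . \]

Indeed, by Hölder's inequality we have

\[ |(\rho_n * f)(x)| \le \int \rho_n(y)\, |f(x - y)|\, dy \le \left( \int \rho_n(y)^{p'}\, dy \right)^{1/p'} \|f\|_{L^p} \]

and, since $\nabla(\rho_n * f) = (\nabla \rho_n) * f$, similarly we get

\[ \|\nabla(\rho_n * f)\|_{L^\infty} \le \|\nabla \rho_n\|_{L^{p'}} \|f\|_{L^p}, \]

which implies the assertion since the $L^\infty$ norm of $\nabla(\rho_n * f)$ controls the difference quotients of $\rho_n * f$.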
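Step 3. Given $\varepsilon > 0$ and $\Omega$ with finite measure, we can find $\omega \subset \Omega$ bounded and measurable such that

\[ \|f\|_{L^p(\Omega \setminus \omega)} < \varepsilon \quad \text{for all } f \in \mathcal{F} . \]

Indeed, we write

\[ \|f\|_{L^p(\Omega \setminus \omega)} \le \|f - (\rho_n * f)\|_{L^p(\mathbb{R}^d)} + \|\rho_n * f\|_{L^p(\Omega \setminus \omega)} . \]

The first term on the right-hand side is made small by Step 1, choosing $n$ large, and for such a fixed $n$ the last term is controlled by

\[ m(\Omega \setminus \omega)^{1/p}\, C_n\, \|f\|_{L^p(\mathbb{R}^d)}, \]

which can be made small, uniformly in $f \in \mathcal{F}$ since $\mathcal{F}$ is bounded, by choosing $m(\Omega \setminus \omega)$ small. This is always possible since the measure of $\Omega$ is finite.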
Step 4. Since $L^p(\Omega)$ is complete, to conclude we need to show that $\mathcal{F}|_\Omega$ is totally bounded. Let $\varepsilon > 0$. Let us fix $\omega \subset \Omega$ as above, and let us fix $n > 1/\delta$. The family $\mathcal{H} := (\rho_n * \mathcal{F})|_\omega$ satisfies all assumptions of the Arzelà-Ascoli Theorem. Therefore, $\mathcal{H}$ is relatively compact in $C(\omega)$. Since $\omega$ has finite measure, it is easily checked that $\mathcal{H}$ has in fact compact closure in $L^p(\omega)$. Hence, by total boundedness we can cover $\mathcal{H}$ by a finite number of balls in $L^p(\omega)$ with radius $\varepsilon$. Therefore, there exist finitely many $g_i \in L^p(\omega)$, $i = 1, \ldots, k$, such that

\[ \mathcal{H} \subset \bigcup_{i=1}^{k} B_\varepsilon(g_i) . \]

Now, for all $i \in \{1, \ldots, k\}$ we set $\tilde{g}_i : \Omega \to \mathbb{R}$ to be equal to $g_i$ on $\omega$ and zero elsewhere. We claim that $\mathcal{F}|_\Omega$ can be covered by the balls of centers $\tilde{g}_i$ and radius $3\varepsilon$. Let $f \in \mathcal{F}$. There is some $i$ such that

\[ \|\rho_n * f - g_i\|_{L^p(\omega)} < \varepsilon . \]

Since

\[ \|f - \tilde{g}_i\|_{L^p(\Omega)}^p = \int_{\Omega \setminus \omega} |f|^p\, dx + \int_\omega |f - g_i|^p\, dx , \]

we have

\[ \|f - \tilde{g}_i\|_{L^p(\Omega)} \le \varepsilon + \|f - g_i\|_{L^p(\omega)} \le \varepsilon + \|f - \rho_n * f\|_{L^p(\mathbb{R}^d)} + \|\rho_n * f - g_i\|_{L^p(\omega)} \le 3\varepsilon . \]

Hence, we conclude that $\mathcal{F}|_\Omega$ can be covered by finitely many balls with radius $3\varepsilon$, which implies total boundedness and the thesis.

3.61 remark. As a consequence of the previous theorem, let $\mathcal{F}$ be a bounded subset of $L^p(\mathbb{R}^d)$ with $p \in [1, +\infty)$ such that (23) holds and such that for every $\varepsilon > 0$ there exists a bounded set $\Omega \subset \mathbb{R}^d$ such that

\[ \int_{\mathbb{R}^d \setminus \Omega} |f|^p\, dx < \varepsilon \quad \text{for all } f \in \mathcal{F} . \]

Then, $\mathcal{F}$ has compact closure in $L^p(\mathbb{R}^d)$.

In general, if we want to achieve strong compactness in $L^p$ on the whole space $\mathbb{R}^d$, we need an additional assumption as in the previous remark. Otherwise there may be situations in which (23) alone (together with boundedness) is not sufficient for compactness, see the Exercises.

3.7 Exercises

1. Let $A, B \subset \mathbb{R}^d$ be two Lebesgue measurable sets. Assume that $A = B \setminus C$ with $C$ a measurable set with $m(C) = 0$. Then prove that $m(A) = m(B)$.

2. Find an example of a sequence of measurable functions $f_n$ on $\mathbb{R}$ which does not satisfy the assumptions of Fatou's lemma and for which

\[ \int \Big( \liminf_{n \to +\infty} f_n(x) \Big)\, dx > \liminf_{n \to +\infty} \int f_n(x)\, dx . \]

3. Show that the indicator function $1_A$ of a set $A \subset \mathbb{R}^d$ is measurable if and only if the set $A$ is Lebesgue measurable.

4. Suppose $\{f_n\}_n$ is a sequence of measurable, nonnegative functions. Assume $f_n \to f$ almost everywhere on $\mathbb{R}^d$. Prove that $f$ is almost everywhere nonnegative.


5. Let $f : \mathbb{R}^d \to [0, +\infty]$ be a summable function. Show that for every $\varepsilon > 0$ there exists a measurable set $E \subset \mathbb{R}^d$ such that

\[ \int_{\mathbb{R}^d \setminus E} |f(x)|\, dx < \varepsilon \]

(Hint: use the dominated convergence theorem).

6. For each of the following sequences of functions (restricted to the domain $I$),

• determine whether or not they converge almost everywhere on $I$, and in the affirmative case find the almost everywhere limit $f$,
• say whether or not $\int_I f_n\, dx \to \int_I f\, dx$ as $n \to +\infty$,
• say whether or not the sequence converges uniformly to $f$ on $I$:

(a) $f_n(x) = \dfrac{nx}{1 + n^2 x^2}$, $I = [0, 1]$.
(b) $f_n(x) = n x e^{-n x^2}$, $I = (0, 1)$.
(c) $f_n(x) = \dfrac{n^2 x^2}{n^4 + x^2}$, $I = (1, +\infty)$.
(d) $f_n(x) = \dfrac{nx}{1 + n^2 x^2}$, $I = (0, +\infty)$.
(e) $f_n(x) = n x e^{-n^2 x^2}$, $I = [0, 1]$.
(f) $f_n(x) = n x e^{-n^2 x^2}$, $I = [1, +\infty)$.
(g) $f_n(x) = n x e^{-n^2 x^2}$, $I = [0, +\infty)$.
(h) $f_n(x) = \dfrac{1}{1 + x^n}$, $I = [0, +\infty)$.

7. Let $E \subset \mathbb{R}^d$ be a measurable set. Let $f \in L^p(E)$ and $g \in L^q(E)$ for some $p, q \in [1, +\infty]$. Let $r \in [1, +\infty]$ be such that

\[ \frac{1}{r} = \frac{1}{p} + \frac{1}{q} . \]

Prove that $f \cdot g \in L^r(E)$, and

\[ \|f g\|_{L^r} \le \|f\|_{L^p} \|g\|_{L^q} . \]

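8. On $\mathbb{R}^d$, let

\[ f_0(x) = \begin{cases} |x|^{-\alpha} & \text{if } |x| < 1, \\ 0 & \text{if } |x| \ge 1, \end{cases} \]

and

\[ f_\infty(x) = \begin{cases} |x|^{-\alpha} & \text{if } |x| \ge 1, \\ 0 & \text{if } |x| < 1. \end{cases} \]

Show that

• $f_0 \in L^p$ if and only if $p\alpha < d$.
• $f_\infty \in L^p$ if and only if $p\alpha > d$.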

9. For each of the following functions defined on the set I ⊂ R, say for
which p ∈ [1, +∞] the function f belongs to L p ( I ):
sin | x |
(a) f ( x ) = | x |2
, I = [−1, 1].
(b) f ( x ) = x2 /3 log x, I = (0, 1).
q
3
(c) f ( x ) = 1+ xx2 + x4 , I = [0, +∞).
(d) f ( x ) = arctan x
x , I = (0, +∞).
1
(e) f ( x ) = x log x , I = (0, 1).

(f) f ( x ) = 1
x log x , I = (1, +∞).

10. For each of the following sequences of functions defined on the set I ⊂
R,

• determine whether or not they converge almost everywhere on I,


and in the affirmative case find the almost everywhere limit f ,
• say whether or not f n → f in L p ( I ) for the index p indicated:

n2 x 2
(a) f n ( x ) = 1+ n3 x 3
, I = [0, +∞), p = 2,
(b) f n ( x ) = 1
1+nx1/3
, I = [0, +∞), p = 2,
(c) f n ( x ) = n sin( x/n)e−2x , I = [0, +∞), p = 1,
1√
(d) f n ( x ) = 1+ n x
, I = [0, 1], p = 1.
(e) f n ( x ) = ne−nx , I = (0, 1), p ∈ [1, +∞).
(f) f n ( x ) = n1/3 e−nx , I = (0, +∞), p ∈ [1, +∞]

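11. Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined via

\[ f_n(x) = \begin{cases} -1 & \text{if } x < -1/n, \\ nx & \text{if } -1/n \le x \le 1/n, \\ 1 & \text{if } x > 1/n. \end{cases} \]

• Find the almost everywhere limit $f$ of $f_n$ as $n \to +\infty$.
• Prove that $f_n$ does not converge uniformly to $f$.
• Prove that $f_n$ converges to $f$ in $L^p(\mathbb{R})$ if and only if $p \in [1, +\infty)$.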

12. Let B be a bounded subset of L p (Rd ) with p finite and let G ∈ L1 (Rd ).
Consider the set

F = { G ∗ f : f ∈ B} .

Prove that F |Ω has compact closure in L p (Ω) for any measurable set Ω
with finite measure.


13. Let ϕ ∈ Cc∞ (Rd ) and let

F = {ψn ( x ) = ϕ( x + n) : n ∈ N} .

Prove that F satisfies (23) but it doesn’t have compact closure in L p (Rd )
for p finite.

Part II
Bounded linear operators and
Hilbert spaces
4 Introduction to linear operators on Banach spaces
Many linear equations can be formulated in terms of a suitable linear operator
acting on a Banach space (see Problem 0.2). In this section we study linear
operators acting on Banach spaces in greater detail.

4.1 Bounded linear maps


We recall the concept of linear operator, which should be well known from
basic linear algebra. A linear map or linear operator T between real (or complex)
linear spaces X, Y is a function T : X → Y such that

T (λx + µy) = λTx + µTy, for all λ, µ ∈ R (or C) and x, y ∈ X.

A linear map T : X → X is called a linear transformation of X, or a linear


operator on X. If a linear map T : X → Y is one-to-one and onto, then we say
that T is invertible, and define the inverse map T −1 : Y → X by T −1 y = x if
and only if Tx = y, so that TT −1 = IY , T −1 T = IX . The linearity of T implies
the linearity of $T^{-1}$ (exercise!). A useful exercise in this context is to prove that for a linear operator $T$, the image of the zero vector in $X$, $T(0)$, is always equal to the zero vector in $Y$.
A natural question arises: are linear operators continuous? In the finite-dimensional case they are. In infinite dimension, however, this is not always the case.

4.1 proposition. Let ( X, k · k X ) and (Y, k · kY ) be two normed spaces. Let T : X →


Y be a linear map. Then, the following are equivalent:

(i) T is continuous at x = 0 ∈ X.

(ii) T is continuous on all points x ∈ X.

(iii) T maps bounded sets of X into bounded sets of Y.

(iv) There exists $M > 0$ such that

\[ \|T(x)\|_Y \le M \|x\|_X \quad \text{for all } x \in X. \tag{24} \]

Proof. (i) implies (ii): Assume T is continuous at 0 ∈ X. Let x ∈ X. Let { xn }n


be a sequence in X which converges to x as n → +∞. Consider

k T ( xn ) − T ( x )kY = k T ( xn − x )kY → 0,

because xn − x converges to zero and T is continuous at zero.


(ii) implies (iii): Let $A \subset X$ be a bounded set. This means $A \subseteq B_R(0)$ for some $R \ge 0$. We claim there exists $S \ge 0$ such that $T(A) \subseteq B_S(0) \subset Y$. Assume by contradiction that for every $n \in \mathbb{N}$ there exists $x_n \in A$ such that $\|T(x_n)\|_Y \ge n$. Set $v_n := x_n / n$. Since $\|x_n\|_X \le R$, then $v_n \to 0$ as $n \to +\infty$. On the other hand,

\[ \|T(v_n)\|_Y = \left\| \frac{T(x_n)}{n} \right\|_Y = \frac{1}{n} \|T(x_n)\|_Y \ge 1 . \]

This is a contradiction with the fact that $T$ is continuous at zero.


(iii) implies (iv): By contradiction, assume that for all $n \in \mathbb{N}$ there exists $x_n \in X$ with $\|T(x_n)\|_Y > n \|x_n\|_X$; in particular $x_n \ne 0$. Set $y_n = x_n / \|x_n\|_X$. Clearly $\|y_n\|_X = 1$. Moreover,

\[ \|T(y_n)\|_Y = \frac{1}{\|x_n\|_X} \|T(x_n)\|_Y > n , \]

which implies the set $T(\{y_n\}_n)$ is unbounded, whereas $\{y_n\}_n$ is bounded, a contradiction.
(iv) implies (i): Let { xn }n converge to zero as n → +∞. Then, the condition
k T ( x )kY ≤ Mk x k X easily implies k T ( xn )kY → 0, which gives the continuity
of T at zero.

4.2 definition. Let $X$ and $Y$ be normed spaces. A linear operator $T : X \to Y$ is called a bounded operator if it is continuous. The (operator) norm of $T$ is the number

\[ \|T\| = \sup_{x \ne 0} \frac{\|T(x)\|_Y}{\|x\|_X} , \]

i. e. $\|T\|$ is the infimum of all $M$ such that condition (24) is satisfied. The space of all bounded linear operators from $X$ to $Y$ is denoted by $L(X, Y)$.

A linear operator that is not bounded is called an unbounded operator.

4.3 exercise. Let $T : X \to Y$ be linear and bounded and let $\|T\|$ be its norm. Then,

\[ \|T\| = \inf\{ M \ge 0 : \|Tx\| \le M \|x\| \text{ for all } x \in X \} . \]

Moreover,

\[ \|T\| = \sup_{\|x\| \le 1} \|Tx\| = \sup_{\|x\| = 1} \|Tx\| . \]

To prove the latter, we observe first of all that, for $x \ne 0$,

\[ \frac{\|Tx\|}{\|x\|} = \left\| T\left( \frac{x}{\|x\|} \right) \right\| , \]

which shows that $\|T\|$ is the supremum of $\|Tz\|$ on the set $\|z\| = 1$. Then, for all $x \in X$ with $0 < \|x\| \le 1$ we set $\tilde{x} = x / \|x\|$ and observe

\[ \|T\tilde{x}\| = \frac{\|Tx\|}{\|x\|} \ge \|Tx\| , \]

which implies the supremum of $\|Tx\|$ on $\|x\| \le 1$ is bounded from above by the supremum of $\|Tx\|$ on $\|x\| = 1$. The opposite inequality is trivial.

4.4 exercise. With the notation in the previous definition, prove that k T ( x )kY ≤
k T kk x k X .

4.5 exercise. Prove that the operator norm k · k is a norm on the linear space
L( X, Y ).

4.6 proposition. Let $X$ be a normed space and $Y$ be a Banach space. Then the space $L(X, Y)$ is a Banach space.

Proof. We only need to prove that $L(X, Y)$ is complete. Let $T_n$ be a Cauchy sequence in $L(X, Y)$ with respect to the operator norm $\|\cdot\|$. This means that, for all $\varepsilon > 0$, there exists an $N(\varepsilon)$ such that $\|T_n - T_m\| \le \varepsilon$ for all $n, m \ge N(\varepsilon)$. Hence, for all $x \in X$,

\[ \|T_n(x) - T_m(x)\|_Y = \|(T_n - T_m)(x)\|_Y \le \|T_n - T_m\| \|x\|_X \le \varepsilon \|x\|_X . \]

The above shows that the sequence $T_n(x)$ is a Cauchy sequence in $Y$, which is complete, i. e. there exists an element $y \in Y$ such that $T_n(x) \to y$. Since $y$ depends on $x$, we set $T(x) := y$. It is easily shown that $T$ is a linear operator. Indeed, for $x_1, x_2 \in X$, we have

\[ T(x_1 + x_2) = \lim_{n \to +\infty} T_n(x_1 + x_2) = \lim_{n \to +\infty} \big( T_n(x_1) + T_n(x_2) \big) = T(x_1) + T(x_2), \]

and homogeneity follows in the same way. Moreover, $T$ is a bounded operator. To see this, let $\varepsilon > 0$ and $N = N(\varepsilon)$. Letting $m \to +\infty$ in the Cauchy estimate above gives $\|T_n(x) - T(x)\|_Y \le \varepsilon \|x\|_X$ for all $n \ge N$ and all $x \in X$. Therefore,

\[ \|T(x)\|_Y \le \|(T - T_N)(x)\|_Y + \|T_N(x)\|_Y \le \varepsilon \|x\|_X + M_N \|x\|_X , \]

where $M_N = \|T_N\|$. This implies $\|T(x)\|_Y \le (\varepsilon + M_N) \|x\|_X$, i. e. $T$ is a bounded operator. Finally, the same estimate shows that $\|T_n - T\| \le \varepsilon$ for all $n \ge N(\varepsilon)$, that is, $T_n \to T$ in the operator norm.

4.7 definition. We say that a sequence Tn ∈ L( X, Y ) is norm-convergent to


T ∈ L( X, Y ) if k Tn − T k → 0 as n → +∞. We say that Tn converges to T
pointwise if k Tn ( x ) − T ( x )kY → 0 as n → +∞ for all x ∈ X.

4.8 exercise. Prove that T : X → Y is bounded if and only if T maps the unit
ball {k x k ≤ 1} into a bounded set.

4.9 example. The linear map $A : \mathbb{R} \to \mathbb{R}$ defined by $Ax = ax$, where $a \in \mathbb{R}$, is bounded, and has norm $\|A\| = |a|$.

4.10 example. The identity map I : X → X is bounded on any normed space


X, and has norm one. If a map has norm zero, then it is the zero map 0x = 0.

4.11 theorem. Every linear operator on a finite-dimensional linear space is bounded.

Proof. Suppose that $A : X \to X$ is a linear map and $X$ is finite dimensional. Let $\{e_1, \ldots, e_n\}$ be a basis of $X$. If $x = \sum_{i=1}^n x_i e_i \in X$, then (10) implies that

\[ \|Ax\| \le \sum_{i=1}^n |x_i| \|Ae_i\| \le \max_{1 \le i \le n} \{\|Ae_i\|\} \sum_{i=1}^n |x_i| \le \frac{1}{m} \max_{1 \le i \le n} \{\|Ae_i\|\}\, \|x\| , \]

so $A$ is bounded.

Linear maps on infinite-dimensional normed spaces may be unbounded.

4.12 example. Let $X = C^\infty([0, 1])$ consist of the smooth functions on $[0, 1]$ that have continuous derivatives of all orders, equipped with the maximum norm $\|\cdot\|_\infty$. The space $X$ is a normed space, but is not a Banach space, since it is incomplete. The differentiation operator $Du = u'$ is an unbounded linear map $D : X \to X$. For example, the function $u(x) = e^{\lambda x}$ satisfies $Du = \lambda u$. Thus, $\|Du\| / \|u\| = |\lambda|$ may be arbitrarily large. The unboundedness of differential operators is a fundamental difficulty in their study.
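The ratio $\|Du\|_\infty / \|u\|_\infty = |\lambda|$ for $u(x) = e^{\lambda x}$ can be reproduced on a grid. This is a minimal sketch with assumptions: the sup norm is approximated by a maximum over sample points, the derivative is written down by hand rather than computed, and the names `sup_norm`, `ratio` and `GRID` are hypothetical.

```python
import math

GRID = [i / 1000 for i in range(1001)]  # sample points in [0, 1]


def sup_norm(func) -> float:
    """Maximum of |func| over the sample grid (stand-in for the sup norm)."""
    return max(abs(func(x)) for x in GRID)


def ratio(lam: float) -> float:
    """Returns ||Du|| / ||u|| for u(x) = exp(lam * x), with Du written exactly."""
    u = lambda x: math.exp(lam * x)
    du = lambda x: lam * math.exp(lam * x)
    return sup_norm(du) / sup_norm(u)


ratios = {lam: ratio(lam) for lam in (1.0, 10.0, 100.0)}
for lam, r in ratios.items():
    print(lam, r)
```

No constant $M$ can bound all these ratios, which is exactly the failure of condition (iv) in Proposition 4.1.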

4.13 exercise (Operator norms of finite dimensional matrices). We now show how, in the finite dimensional case, the norm of an operator can be computed in terms of the associated matrix, depending on the norm chosen on the space. Suppose that $A : X \to Y$ is a linear map between finite-dimensional real linear spaces $X, Y$ with $\dim X = n$, $\dim Y = m$. We choose bases $\{e_1, e_2, \ldots, e_n\}$ of $X$ and $\{f_1, f_2, \ldots, f_m\}$ of $Y$. Then

\[ A(e_j) = \sum_{i=1}^m a_{ij} f_i , \]

for a suitable $m \times n$ matrix $(a_{ij})$ with real entries. We expand $x \in X$ as

\[ x = \sum_{i=1}^n x_i e_i , \]

where $x_i \in \mathbb{R}$ is the $i$-th component of $x$. It follows from the linearity of $A$ that

\[ A(x) = A\left( \sum_{j=1}^n x_j e_j \right) = \sum_{j=1}^n x_j A(e_j) = \sum_{i=1}^m y_i f_i , \qquad y_i = \sum_{j=1}^n a_{ij} x_j . \]

Thus, given a choice of bases for $X, Y$, we may represent $A$ as a linear map $A : \mathbb{R}^n \to \mathbb{R}^m$, where

\[ \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} . \tag{25} \]

We will often use the same notation $A$ to denote a linear map on a finite-dimensional space and its associated matrix, but it is important not to confuse the geometrical notion of a linear map with the matrix of numbers that represents it.
Each pair of norms on $\mathbb{R}^n$ and $\mathbb{R}^m$ induces a corresponding operator norm (or matrix norm) on $A$. We first consider the Euclidean norm, or 2-norm, $\|A\|_2$ of $A$. The Euclidean norm of a vector $x$ is given by $\|x\|_2 = (x, x)^{1/2}$, where $(x, y) = x^T y$. We may compute the Euclidean norm of $A$ by maximizing $\|Ax\|_2^2$ on the unit sphere $\|x\|_2 = 1$. The maximizer $x$ is a critical point of the function

\[ f(x, \lambda) = (Ax, Ax) - \lambda \{ (x, x) - 1 \}, \]

where $\lambda \in \mathbb{R}$ is a Lagrange multiplier. Computing $\nabla f$ and setting it equal to zero, we find that $x$ satisfies

\[ A^T A x = \lambda x . \tag{26} \]

Hence, $x$ is an eigenvector of the matrix $A^T A$ and $\lambda$ is an eigenvalue. The matrix $A^T A$ is an $n \times n$ symmetric matrix, with real, nonnegative eigenvalues (this easily follows after taking the scalar product of (26) with $x$). At an eigenvector $x$ of $A^T A$ that satisfies (26), normalized so that $\|x\|_2 = 1$, we have $(Ax, Ax) = \lambda$. Thus, the maximum value of $\|Ax\|_2^2$ on the unit sphere is the maximum eigenvalue of $A^T A$.
We define the spectral radius $r(B)$ of a square matrix $B$ to be the maximum absolute value of its eigenvalues. It follows that the Euclidean norm of $A$ is given by

\[ \|A\|_2 = \sqrt{r(A^T A)} . \tag{27} \]

In the case of linear maps $A : \mathbb{C}^n \to \mathbb{C}^m$ on finite dimensional complex linear spaces, equation (27) holds with $A^T$ replaced by $A^*$, where $A^*$ is the Hermitian conjugate of $A$.
To compute the maximum norm of $A$, we observe from (25) that

\[ |y_i| \le |a_{i1}| |x_1| + |a_{i2}| |x_2| + \ldots + |a_{in}| |x_n| \le (|a_{i1}| + \ldots + |a_{in}|) \|x\|_\infty . \]

Taking the maximum of this inequality with respect to $i$ and comparing the result with the definition of operator norm, we conclude that

\[ \|A\|_\infty \le \max_{1 \le i \le m} (|a_{i1}| + \ldots + |a_{in}|) . \]


Conversely, suppose that the maximum on the right-hand side of this inequality is attained at $i = i_0$. Let $x$ be the vector with components $x_j = \mathrm{sign}\, a_{i_0 j}$, where $\mathrm{sign}$ is the sign function

\[ \mathrm{sign}\, x = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases} \]

Then, if $A$ is nonzero, we have $\|x\|_\infty = 1$, and

\[ \|Ax\|_\infty = |a_{i_0 1}| + \ldots + |a_{i_0 n}| , \]

which shows that

\[ \|A\|_\infty = \max_{1 \le i \le m} \left( \sum_{j=1}^n |a_{ij}| \right) . \]

A similar argument (exercise) shows that the sum norm of $A$ is given by the maximum column sum

\[ \|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^m |a_{ij}| . \]

For $1 < p < +\infty$, one can show (we omit the proof) that

\[ \|A\|_p \le \|A\|_1^{1/p}\, \|A\|_\infty^{1 - 1/p} . \]

There are norms on the space $L(\mathbb{R}^n, \mathbb{R}^m) = \mathbb{R}^{m \times n}$ of $m \times n$ matrices that are not associated with any vector norms on $\mathbb{R}^n$ and $\mathbb{R}^m$. An example is the Hilbert-Schmidt norm

\[ \|A\| = \left( \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 \right)^{1/2} . \]
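The formulas of this exercise are easy to evaluate on a small matrix. The sketch below is illustrative only: the function names are hypothetical, matrices are plain lists of rows, and the Euclidean norm is approximated by power iteration on $A^T A$, which assumes that the starting vector is not orthogonal to the dominant eigenvector.

```python
import math


def norm_inf(A):
    """Operator norm induced by the maximum norm: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def norm_1(A):
    """Operator norm induced by the sum norm: largest absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


def norm_hs(A):
    """Hilbert-Schmidt norm: square root of the sum of squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))


def norm_2(A, iterations=200):
    """Euclidean operator norm sqrt(r(A^T A)), via power iteration on A^T A."""
    m, n = len(A), len(A[0])
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]  # y = A x
        z = [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]  # z = A^T y
        lam = math.sqrt(sum(v * v for v in z))  # tends to the top eigenvalue
        if lam == 0.0:
            return 0.0
        x = [v / lam for v in z]  # renormalise so that ||x||_2 = 1
    return math.sqrt(lam)


A = [[1.0, -2.0], [3.0, 4.0]]
print(norm_inf(A), norm_1(A), norm_2(A), norm_hs(A))
```

For this matrix $A^T A$ has entries $10, 10, 10, 20$, so its largest eigenvalue is $15 + \sqrt{125}$, and the computed 2-norm can be compared with the bound $\|A\|_2 \le \|A\|_1^{1/2} \|A\|_\infty^{1/2}$.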

Next, we give some examples of linear operators on infinite-dimensional


spaces.

4.14 example. Let $X = \ell^\infty(\mathbb{N})$ be the space of bounded sequences $x = (x_1, x_2, \ldots)$ with the norm

\[ \|x\|_\infty = \sup_{i \in \mathbb{N}} |x_i| . \]

A linear map $A : X \to X$ is represented by an infinite matrix $(a_{ij})_{i,j=1}^{+\infty}$, where

\[ (Ax)_i = \sum_{j=1}^{+\infty} a_{ij} x_j . \]

In order for this sum to converge for any $x \in \ell^\infty(\mathbb{N})$, we require that

\[ \sum_{j=1}^{+\infty} |a_{ij}| < +\infty \]

for each $i \in \mathbb{N}$, and in order for $Ax$ to belong to $\ell^\infty(\mathbb{N})$, we require that

\[ \sup_{i \in \mathbb{N}} \left( \sum_{j=1}^{+\infty} |a_{ij}| \right) < +\infty . \]

Then $A$ is a bounded linear operator on $\ell^\infty(\mathbb{N})$, and its norm is the supremum of the row sums

\[ \|A\|_\infty = \sup_{i \in \mathbb{N}} \left( \sum_{j=1}^{+\infty} |a_{ij}| \right) . \]

The details are omitted.

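4.15 example. Let $X = \ell^p(\mathbb{N})$. Consider the operator $T : X \to X$ defined by

\[ (Tx)_n = \alpha_n x_n , \]

for all $x \in \ell^p$, with $\alpha = (\alpha_n)_n$ a given sequence of real numbers. Let us figure out what condition we should impose on $\alpha_n$ in order to have $T$ bounded from $X$ into itself, and let us compute the operator norm. A simple estimate gives

\[ \|Tx\|_{\ell^p}^p \le \sup_n |\alpha_n|^p\, \|x\|_{\ell^p}^p , \]

so $T$ is bounded as soon as $\alpha \in \ell^\infty$. The above estimate is sharp: let $\alpha_{n_k}$ be a subsequence such that $|\alpha_{n_k}|$ converges to $\|\alpha\|_{\ell^\infty}$ as $k \to +\infty$. Let $x^k$ be the sequence in $\ell^p$ defined by

\[ (x^k)_i = \delta_{i, n_k}, \qquad i = 1, 2, 3, \ldots . \]

We immediately see that $\|Tx^k\|_{\ell^p} = |\alpha_{n_k}|$ tends to $\|\alpha\|_{\ell^\infty}$ as $k \to +\infty$ and $\|x^k\|_{\ell^p} = 1$. Therefore,

\[ \|T\| = \|\alpha\|_{\ell^\infty} . \]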
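A truncated version of this example shows how the operator norm is approached along the canonical vectors. The sketch is an illustration under assumptions: sequences are cut off after `N` entries, the choice $\alpha_n = 1 - 1/n$ and the exponent `P` are arbitrary, and the helper names are hypothetical.

```python
N = 50   # truncation length (an assumption: only the first N coordinates are kept)
P = 3.0  # exponent of the l^p norm


def apply_T(alpha, x):
    """Diagonal operator (T x)_n = alpha_n * x_n."""
    return [a * v for a, v in zip(alpha, x)]


def lp_norm(x, p):
    """l^p norm of a finite sequence."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


alpha = [1.0 - 1.0 / n for n in range(1, N + 1)]  # sup |alpha_n| = 1, never attained

ratios = []
for k in range(N):
    e_k = [1.0 if i == k else 0.0 for i in range(N)]  # canonical vector
    ratios.append(lp_norm(apply_T(alpha, e_k), P) / lp_norm(e_k, P))

print(ratios[0], ratios[9], ratios[-1])
```

With this choice of $\alpha$ the ratios increase towards $\|\alpha\|_{\ell^\infty} = 1$ without reaching it, so the supremum defining $\|T\|$ is not a maximum.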

4.16 example. Let $X = C([0, 1])$ with the maximum norm, and

\[ k : [0, 1] \times [0, 1] \to \mathbb{R} \]

be a continuous function. We define the linear Fredholm integral operator $K : X \to X$ by

\[ Kf(x) = \int_0^1 k(x, y) f(y)\, dy . \]

Then $K$ is bounded and

\[ \|K\| \le \max_{0 \le x \le 1} \left( \int_0^1 |k(x, y)|\, dy \right) . \]

The details are left as an exercise.

4.2 The kernel and range of a linear map

The kernel and range are two important linear subspaces associated with a
linear map.

4.17 definition. Let T : X → Y be a linear map between linear spaces X, Y.


The null space or kernel of T, denoted by ker T, is the subset of X defined by

ker T = { x ∈ X : Tx = 0}.

The range of T, denoted by RanT, is the subset of Y defined by

RanT = {y ∈ Y : there exists x ∈ X such that Tx = y}.

The word ‘kernel’ is also used in a completely different sense to refer to


the kernel of an integral operator. A linear map T : X → Y is one-to-one if
and only if ker T = {0}, and is onto if and only if RanT = Y.

4.18 exercise. Let T : X → Y be a linear map between linear spaces X, Y.


Prove that ker T is a linear subspace of X and RanT is a linear subspace of Y.
If X and Y are normed linear spaces and T is bounded, prove that the kernel
of T is a closed linear subspace of X.

The nullity of T is the dimension of the kernel of T, and the rank of T is the
dimension of the range of T. We now consider some examples.

4.19 example. The right shift operator on $\ell^\infty(\mathbb{N})$ is defined by

\[ S(x_1, x_2, x_3, \ldots) = (0, x_1, x_2, \ldots), \]

and the left shift operator $T$ by

\[ T(x_1, x_2, x_3, \ldots) = (x_2, x_3, x_4, \ldots). \]

These maps have norm one (exercise!). Their matrices are the infinite-dimensional Jordan blocks

\[ [S] = \begin{pmatrix} 0 & 0 & 0 & \ldots \\ 1 & 0 & 0 & \ldots \\ 0 & 1 & 0 & \ldots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \qquad [T] = \begin{pmatrix} 0 & 1 & 0 & \ldots \\ 0 & 0 & 1 & \ldots \\ 0 & 0 & 0 & \ldots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} . \]

The kernel of $S$ is $\{0\}$ and the range of $S$ is the subspace

\[ \mathrm{Ran}\, S = \{ (0, x_2, x_3, \ldots) \in \ell^\infty(\mathbb{N}) \} . \]

The range of $T$ is the whole space $\ell^\infty(\mathbb{N})$, and the kernel of $T$ is the one-dimensional subspace

\[ \ker T = \{ (x_1, 0, 0, \ldots) : x_1 \in \mathbb{R} \} . \]

The operator $S$ is one-to-one but not onto, and $T$ is onto but not one-to-one. This cannot happen for linear maps $T : X \to X$ on a finite-dimensional space $X$, such as $X = \mathbb{R}^n$. In that case, $\ker T = \{0\}$ if and only if $\mathrm{Ran}\, T = X$.
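The asymmetry between the two shifts can be made concrete by modelling a sequence as a function of its index. This is a minimal sketch; the type alias `Seq` and the helper `head` are hypothetical, and indices start at 1 as in the text.

```python
from typing import Callable, List

Seq = Callable[[int], float]  # i -> x_i for i = 1, 2, 3, ...


def S(x: Seq) -> Seq:
    """Right shift: (0, x_1, x_2, ...)."""
    return lambda i: 0.0 if i == 1 else x(i - 1)


def T(x: Seq) -> Seq:
    """Left shift: (x_2, x_3, x_4, ...)."""
    return lambda i: x(i + 1)


def head(x: Seq, n: int = 5) -> List[float]:
    """First n entries of the sequence."""
    return [x(i) for i in range(1, n + 1)]


x: Seq = lambda i: 1.0 / i

print(head(x))        # the original sequence
print(head(T(S(x))))  # T S = identity: the sequence is recovered
print(head(S(T(x))))  # S T loses the first entry: it is replaced by zero
```

So $TS$ is the identity while $ST$ is not, consistent with $S$ being one-to-one but not onto and $T$ being onto but not one-to-one.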

4.20 example. Let $X = C([0, 1])$ with the sup norm. We define the integral operator $K : X \to X$ by

\[ Kf(x) = \int_0^x f(y)\, dy . \tag{28} \]

An integral operator like this one, with a variable range of integration, is called a Volterra integral operator. Then, $K$ is bounded, with $\|K\| \le 1$, since

\[ \|Kf\|_\infty \le \sup_{0 \le x \le 1} \int_0^x |f(y)|\, dy \le \int_0^1 |f(y)|\, dy \le \|f\|_\infty . \]

In fact, $\|K\| = 1$, since $K(1) = x$, and $\|x\|_\infty = 1$. The range of $K$ is the set of continuously differentiable functions on $[0, 1]$ that vanish at $x = 0$. This is a linear subspace of $C([0, 1])$ but it is not closed. The lack of closure of the range of $K$ is due to the smoothing effect of $K$, which maps continuous functions to differentiable functions.
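A discretised version of the Volterra operator illustrates both the bound $\|Kf\|_\infty \le \|f\|_\infty$ and the fact that it is attained at $f = 1$. The sketch below rests on assumptions: functions are sampled on a uniform grid, the integral is replaced by a cumulative trapezoidal rule, and the names `volterra` and `sup_norm` are hypothetical.

```python
import math

N = 1000
XS = [i / N for i in range(N + 1)]  # uniform grid on [0, 1]


def volterra(values):
    """Cumulative trapezoidal approximation of x -> integral of f over [0, x]."""
    out = [0.0]
    for i in range(1, len(values)):
        out.append(out[-1] + 0.5 * (values[i - 1] + values[i]) / N)
    return out


def sup_norm(values):
    """Maximum absolute value over the grid (stand-in for the sup norm)."""
    return max(abs(v) for v in values)


ones = [1.0] * (N + 1)
K_ones = volterra(ones)  # approximates the function x

osc = [math.cos(20.0 * math.pi * x) for x in XS]
K_osc = volterra(osc)    # approximates sin(20 pi x) / (20 pi)

print(sup_norm(K_ones), sup_norm(K_osc))
```

The oscillating input has sup norm one but a very small image, which shows that $\|Kf\|_\infty$ can be far below $\|f\|_\infty$ even though the operator norm equals one.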

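4.21 theorem. Let $X, Y, Z$ be normed linear spaces. If $T \in L(X, Y)$ and $S \in L(Y, Z)$, then $ST \in L(X, Z)$, and

\[ \|ST\| \le \|S\| \|T\| . \]

Proof. Exercise.

4.22 example. Consider the linear maps $A, B$ on $\mathbb{R}^2$ with matrices

\[ A = \begin{pmatrix} \lambda & 0 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ 0 & \mu \end{pmatrix} . \]

These matrices have the Euclidean (or sum, or maximum) norms $\|A\| = |\lambda|$ and $\|B\| = |\mu|$, but $\|AB\| = 0$. Hence the inequality in Theorem 4.21 may be strict.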


4.3 Compact operators


A particularly important class of bounded operators is the class of compact
operators.

4.23 definition. A linear operator $T : X \to Y$ is compact if $T(B)$ is a precompact subset of $Y$ for every bounded subset $B$ of $X$.

An equivalent formulation is that $T$ is compact if and only if every bounded sequence $(x_n)$ in $X$ has a subsequence $(x_{n_k})_k$ such that $(Tx_{n_k})_k$ converges in $Y$. We do not require the range of $T$ to be closed, so $T(B)$ need not be compact even if $B$ is a closed bounded set. Another equivalent formulation is that $T$ is compact if and only if $T$ maps the closed unit ball $\{\|x\| \le 1\}$ of $X$ into a precompact subset of $Y$.

4.24 example. We immediately give a classical example of a compact linear operator on an infinite dimensional Banach space. Let $X = C([0, 1])$ and consider the operator $T \in L(X)$ defined by

\[ (Tf)(x) = \int_0^x f(y)\, dy , \]

called the Volterra operator. It is an easy exercise to verify that $T$ is linear and bounded. Now, let $f \in X$ be in a bounded set $B \subset X$. Since $B$ is bounded, there is a constant $M$ such that $\|f\|_\infty \le M$ for all $f \in B$. Now, for all $f \in B$, we have that $Tf$ is differentiable with $(Tf)' = f$. Moreover,

\[ \|Tf\|_\infty \le \sup_{x \in [0, 1]} \int_0^x |f(y)|\, dy \le \|f\|_\infty \le M, \qquad \|(Tf)'\|_\infty = \|f\|_\infty \le M . \]

Since $(Tf)(0) = 0$ for all $f \in B$, we can use example 2.17 (consequence of Arzelà-Ascoli) to show that $T(B)$ is precompact. Hence, $T$ is a compact operator.

We leave the proof of the following properties of compact operators as an


exercise.

4.25 proposition. Let X, Y, Z be Banach spaces.


(a) If S, T ∈ L( X, Y ) are compact, then any linear combination of S and T is
compact.
(b) Let S ∈ L( X, Y ) and T ∈ L(Y, Z ). If S is bounded and T is compact, or if S is
compact and T is bounded, then TS ∈ L( X, Z ) is compact.
(c) If T is bounded and RanT has finite dimension, then T is compact. In this case
we say that T is a finite-rank operator.

4.26 example. If $(T_n)$ is a sequence of compact operators in $L(X, Y)$ converging uniformly (that is, in the operator norm) to $T$, then $T$ is compact. To see this, let $\varepsilon > 0$ and let $N_\varepsilon \in \mathbb{N}$ be such that $\|T - T_{N_\varepsilon}\| < \varepsilon/2$. Let $B$ be the closed unit ball of $X$. Since $T_{N_\varepsilon}(B)$ is precompact, then $T_{N_\varepsilon}(B)$ is totally bounded. Hence, there exist $y_1, \ldots, y_{M_\varepsilon} \in T_{N_\varepsilon}(B)$ such that $T_{N_\varepsilon}(B) \subset \bigcup_{i=1}^{M_\varepsilon} B_{\varepsilon/2}(y_i)$. Therefore, for a given $x \in B$ there exists $i \in \{1, \ldots, M_\varepsilon\}$ such that $\|T_{N_\varepsilon} x - y_i\| < \varepsilon/2$. Hence,

\[ \|Tx - y_i\| \le \|Tx - T_{N_\varepsilon} x\| + \|T_{N_\varepsilon} x - y_i\| < \varepsilon . \]

This proves that $T(B)$ is totally bounded, i. e. $T(B)$ is precompact.

As a consequence of the previous example, if $X$ and $Y$ are Banach spaces the space $K(X, Y)$ of compact linear operators from $X$ into $Y$ is a closed linear subspace of $L(X, Y)$. Moreover, if $(T_n)$ is a sequence of finite-rank operators converging uniformly to $T$, then $T$ is a compact operator. The converse is also true for compact operators on many Banach spaces, including Hilbert spaces, although there exist separable Banach spaces on which some compact operators cannot be approximated by finite-rank operators.
We conclude this section with some important considerations on the compactness of the unit ball in infinite dimensional spaces. The compact sets of the Euclidean space $\mathbb{R}^d$ are characterised as those sets which are closed and bounded, according to the Heine-Borel theorem. This property is valid for all finite dimensional normed spaces. Indeed, this property characterises finite dimensional normed spaces, i. e. if the dimension of the space is infinite such property is no longer true. This is seen more precisely in the next theorem. First we prove the following Lemma.

4.27 lemma (Riesz's lemma). Let $(E, \|\cdot\|)$ be a normed space. Let $M \subset E$ be a proper closed linear subspace of $E$. Let $\alpha \in (0, 1)$. Then, there exists a vector $x_\alpha \in E$ such that $\|x_\alpha\| = 1$ and $\inf\{\|x - x_\alpha\| : x \in M\} \ge \alpha$.

Proof. For a given $y \in E \setminus M$, let $d = \inf_{x \in M} \|x - y\|$. Since $M$ is closed, $d > 0$. Indeed, if $d = 0$ there would be no open ball centered at $y$ entirely contained in $E \setminus M$, which would contradict $E \setminus M$ being an open set. Hence, since $\alpha \in (0, 1)$ and therefore $d/\alpha > d$, there exists a vector $x_0 \in M$ with $d \le \|y - x_0\| \le d/\alpha$. Set $x_\alpha = \dfrac{y - x_0}{\|y - x_0\|}$. Clearly $\|x_\alpha\| = 1$. Moreover, for all $x \in M$ we have

\[ \|x - x_\alpha\| = \left\| x - \frac{y - x_0}{\|y - x_0\|} \right\| = \frac{\big\| \|y - x_0\|\, x + x_0 - y \big\|}{\|y - x_0\|} \ge \frac{\alpha}{d} \big\| \|y - x_0\|\, x + x_0 - y \big\| . \]

Since $\|y - x_0\|\, x + x_0 \in M$, the quantity $\big\| \|y - x_0\|\, x + x_0 - y \big\|$ is at least $d$, and therefore $\|x - x_\alpha\| \ge \alpha$.

We are now ready to prove the next important theorem.

4.28 theorem. Let ( E, k · k) be a normed space, and let B1 (0) be the closed unit ball
of E. Then, B1 (0) is compact if and only if the dimension of E is finite.

Proof. Assume first that $E$ is finite dimensional, and let $\{e_1, \dots, e_d\}$ be a
basis for $E$. For $x = \sum_{i=1}^d x_i e_i$, set $T(x) = (x_1, \dots, x_d) \in \mathbb{R}^d$. The linear
map $T : E \to \mathbb{R}^d$ is a homeomorphism (that is, a continuous, invertible map
with continuous inverse). Then $\sup_{x \in B_1(0)} \|T(x)\|_2 < +\infty$, where
$\|T(x)\|_2 = \left( \sum_{i=1}^d |x_i|^2 \right)^{1/2}$ is the Euclidean norm on $\mathbb{R}^d$. Hence, we can
apply the Heine-Borel theorem on $\mathbb{R}^d$: since the set $T(B_1(0))$ is closed and
bounded in the Euclidean norm, it is compact. Since $T^{-1}$ is continuous,
$B_1(0) = T^{-1}(T(B_1(0)))$ is also compact.
Now, assume that $B_1(0)$ is compact. Assume by contradiction that $E$ is not
finite dimensional. Pick an element $x_1 \in E$ with $\|x_1\| = 1$, and denote by $S_1$ the
linear subspace generated by $x_1$ (which is closed, being finite dimensional).
According to Riesz's lemma 4.27, there exists an element $x_2 \in E$ with
$\|x_2\| = 1$ and $\inf_{x \in S_1} \|x - x_2\| > \frac{1}{2}$. Now let $S_2$ be the linear subspace
generated by $x_1$ and $x_2$. Since $E$ is not finite dimensional, $S_2$ is a proper
(closed) subspace, and hence there exists $x_3 \in E$ with $\|x_3\| = 1$ and
$\inf_{x \in S_2} \|x - x_3\| > \frac{1}{2}$. Proceeding inductively, we construct a sequence
$x_n$ with $\|x_n\| = 1$ which satisfies $\|x_n - x_m\| > \frac{1}{2}$ for all $n \ne m$. Therefore,
no convergent subsequence can be extracted from $\{x_n\}_n$, i.e. $B_1(0)$ is not
compact, a contradiction.
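
As a concrete illustration of the second half of the proof, in $\ell^2(\mathbb{N})$ one may take for $x_n$ the standard unit vectors: they have norm 1 and mutual distance $\sqrt{2} > 1/2$, so Riesz's lemma is not even needed there. The Python sketch below (NumPy; the truncation to $\mathbb{R}^{20}$ and the helper names are assumptions made only for illustration) checks these two facts numerically.

```python
import numpy as np

def unit_vectors(n):
    """First n standard unit vectors of l^2, truncated to R^n (rows of the identity)."""
    return np.eye(n)

def min_pairwise_distance(vectors):
    """Smallest Euclidean distance between two distinct rows."""
    m = len(vectors)
    return min(
        float(np.linalg.norm(vectors[i] - vectors[j]))
        for i in range(m)
        for j in range(i + 1, m)
    )

E = unit_vectors(20)
norms = np.linalg.norm(E, axis=1)   # every vector lies on the unit sphere
gap = min_pairwise_distance(E)      # uniform separation between distinct vectors
print(norms.max(), gap)
```

Since the separation does not shrink as more vectors are added, no subsequence can be Cauchy, which is exactly the obstruction to compactness used in the proof.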

4.4 Dual spaces


The dual space of a linear space consists of the scalar-valued linear maps on
the space. Duality methods play a crucial role in many parts of analysis. In this
subsection we consider real linear spaces for definiteness, but all the results
hold for complex linear spaces too.

4.29 definition. A linear map from a linear space $X$ to $\mathbb{R}$ is called a
linear functional, or linear form, on $X$. The space of linear functionals
on $X$ is called the algebraic dual space of $X$, and the space of continuous linear
functionals on $X$ is called the topological dual space of $X$.

In terms of notation, we denote by $X^*$ the algebraic dual and by $X'$ the
topological dual. From now on, the topological dual will be called simply the
dual space. In fact, $X' = \mathcal{L}(X, \mathbb{R})$. A linear functional $\varphi \in X^*$ belongs to $X'$ if
and only if there is a constant $M$ such that
\[ |\varphi(x)| \le M \|x\| \quad \text{for all } x \in X, \]
and we define the dual norm of $\varphi$ as the operator norm of $\varphi$, that is
\[ \|\varphi\| = \sup_{x \ne 0} \frac{|\varphi(x)|}{\|x\|} = \sup_{\|x\| = 1} |\varphi(x)|. \]

Clearly, $X^*$ is a linear space with the obvious structure, whereas $X'$ is a
Banach space because $\mathbb{R}$ is complete. If $X$ is finite dimensional, then $X' = X^*$.
Moreover, in this case $X^*$ is linearly isomorphic to $X$. To see this, pick a basis
$\{e_1, \dots, e_n\}$ of $X$. The map $\omega_i : X \to \mathbb{R}$ defined by
\[ \omega_i \Big( \sum_{j=1}^n x_j e_j \Big) = x_i \]
is an element of the algebraic dual space $X^*$. The linearity of $\omega_i$ is obvious.
The action of a general element $\varphi$ of the dual space, $\varphi : X \to \mathbb{R}$, on a vector
$x \in X$ is given by a linear combination of the components of $x$, since
\[ \varphi \Big( \sum_{i=1}^n x_i e_i \Big) = \sum_{i=1}^n \varphi_i x_i, \]
where $\varphi_i = \varphi(e_i) \in \mathbb{R}$. It follows that, as a map,
\[ \varphi = \sum_{i=1}^n \varphi_i \omega_i. \]
Thus, $\{\omega_1, \dots, \omega_n\}$ is a basis for $X'$, called the dual basis of $\{e_1, \dots, e_n\}$, and
both $X'$ and $X^*$ are linearly isomorphic to $\mathbb{R}^n$. The dual basis has the property
that
\[ \omega_i(e_j) = \delta_{ij}, \]
where $\delta_{ij}$ is the Kronecker delta, defined by
\[ \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases} \]
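
The dual basis can be computed explicitly in coordinates. In the sketch below (Python with NumPy; the particular basis of $\mathbb{R}^3$ and the functional are hypothetical choices for illustration) the basis vectors are the columns of a matrix, and the dual basis functionals are the rows of its inverse.

```python
import numpy as np

# Hypothetical basis of R^3, stored as the columns of E.
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# Row i of W represents omega_i, acting as omega_i(x) = W[i] @ x.
W = np.linalg.inv(E)

# Matrix of pairings omega_i(e_j); it should be the Kronecker delta.
pairings = W @ E

# A functional phi(x) = c @ x has dual-basis coefficients phi_i = phi(e_i).
c = np.array([2.0, -1.0, 3.0])
phi_coeffs = c @ E               # entries phi(e_1), phi(e_2), phi(e_3)
reconstructed = phi_coeffs @ W   # sum_i phi_i * omega_i, as a row vector
```

Choosing a different basis changes `W`, which reflects the fact that the resulting isomorphism between the space and its dual depends on the basis.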

Although a finite-dimensional space is linearly isomorphic to its dual space,
there is no canonical way to identify the space with its dual; there are many
isomorphisms, depending on the arbitrary choice of a basis. In the following
chapters, we will study Hilbert spaces, and show that the topological dual
space of a Hilbert space can be identified with the original space in a natural
way through the inner product. The dual of an infinite-dimensional Banach
space is, in general, different from the original space.

4.30 example. Let $p \in [1, +\infty)$. We want to prove that the dual of $\ell^p(\mathbb{N})$ is
(essentially) $\ell^q(\mathbb{N})$, where $1/p + 1/q = 1$. We do it in several steps.

Step 1. For all $x = (x_n) \in \ell^p(\mathbb{N})$, we have $x = \sum_{i=1}^{+\infty} x_i e_i$, where $e_i$ denotes
the usual unit vector with 0's everywhere except at the $i$-th component, which is
equal to 1. To prove that, we observe that $\sum_{i=1}^n x_i e_i \in \ell^p$ for all $n \in \mathbb{N}$, and
\[ \Big\| x - \sum_{i=1}^n x_i e_i \Big\|_{\ell^p}^p = \sum_{i=n+1}^{+\infty} |x_i|^p, \]
and the last term converges to zero as $n \to +\infty$ since the series $\sum_{i=1}^{+\infty} |x_i|^p$ is
convergent, as a consequence of $x \in \ell^p(\mathbb{N})$. Note that this property is false for
$p = +\infty$ (for example, take $x$ to be a nonzero constant sequence).

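Step 2. Let $f \in (\ell^p(\mathbb{N}))'$. Then there exists a $y \in \ell^q(\mathbb{N})$, $y = (\alpha_i)$, such that
$f(x) = \sum_{i=1}^{+\infty} x_i \alpha_i$. To see this, let $\alpha_i = f(e_i)$. For all $x \in \ell^p(\mathbb{N})$, by Step 1 and
by continuity and linearity of $f$, we have
\[ f(x) = \sum_{i=1}^{+\infty} x_i f(e_i) = \sum_{i=1}^{+\infty} x_i \alpha_i. \]
Consider now the case $p \in (1, +\infty)$. For $n \in \mathbb{N}$, define
$x_n = \sum_{i=1}^n |\alpha_i|^{q/p} \operatorname{sign}(\alpha_i) e_i$. Note that $\operatorname{sign}(\alpha) \alpha = |\alpha|$. We have
\[ \|x_n\|_{\ell^p} = \Big( \sum_{i=1}^n |\alpha_i|^q \Big)^{1/p}, \]
and
\[ f(x_n) = \sum_{i=1}^n |\alpha_i|^{q/p} \operatorname{sign}(\alpha_i) f(e_i) = \sum_{i=1}^n |\alpha_i|^{q/p + 1} = \sum_{i=1}^n |\alpha_i|^q. \]
But since $f$ is bounded, $|f(x_n)| \le \|f\|_{(\ell^p)'} \|x_n\|_{\ell^p}$, so
\[ \sum_{i=1}^n |\alpha_i|^q \le \|f\|_{(\ell^p)'} \Big( \sum_{i=1}^n |\alpha_i|^q \Big)^{1/p}. \]
Rearranging (recall that $1 - 1/p = 1/q$), we get
\[ \Big( \sum_{i=1}^n |\alpha_i|^q \Big)^{1/q} \le \|f\|_{(\ell^p)'} \]
for every $n \in \mathbb{N}$, i.e. the sequence $y = (\alpha_i)$ belongs to $\ell^q$ and satisfies
$\|y\|_{\ell^q} \le \|f\|_{(\ell^p)'}$. In the case $p = 1$, define $x_n = \operatorname{sign}(\alpha_n) e_n$. So $\|x_n\|_{\ell^1} \le 1$ and
$f(x_n) = \alpha_n \operatorname{sign}(\alpha_n) = |\alpha_n|$, which shows
\[ |\alpha_n| = |f(x_n)| \le \|f\|_{(\ell^1)'} \|x_n\|_{\ell^1} \le \|f\|_{(\ell^1)'}. \]
Hence, $y = (\alpha_i) \in \ell^\infty(\mathbb{N})$ and $\|y\|_{\ell^\infty} \le \|f\|_{(\ell^1)'}$.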


Step 3. Let $y = (\alpha_i) \in \ell^q(\mathbb{N})$. We prove that $y$ defines a functional
$f_y \in (\ell^p(\mathbb{N}))'$ as follows:
\[ f_y(x) = \sum_{i=1}^{+\infty} x_i \alpha_i \quad \text{for all } x = (x_i) \in \ell^p(\mathbb{N}), \]
which has the property $\|f_y\|_{(\ell^p)'} = \|y\|_{\ell^q}$. The linearity of $f_y$ is trivial and left
as an exercise. To prove that $f_y$ is bounded, we observe that, by the Hölder
inequality (discrete version),
\[ |f_y(x)| \le \sum_{i=1}^{+\infty} |x_i| |\alpha_i| \le \Big( \sum_{i=1}^{+\infty} |x_i|^p \Big)^{1/p} \Big( \sum_{i=1}^{+\infty} |\alpha_i|^q \Big)^{1/q} = \|x\|_{\ell^p} \|y\|_{\ell^q}. \]
The above computation also proves that $\|f_y\|_{(\ell^p)'} \le \|y\|_{\ell^q}$. Now, clearly
$f_y(e_i) = \alpha_i$, so the same analysis as in Step 2 applies, and in particular
$\|y\|_{\ell^q} \le \|f_y\|_{(\ell^p)'}$, which shows that $\|y\|_{\ell^q} = \|f_y\|_{(\ell^p)'}$.
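
The extremal sequence used in Step 2 can be tested numerically. The following Python sketch (NumPy; the exponent $p = 3$, the finitely supported random sequence and the helper `lp_norm` are assumptions chosen for illustration) builds $x = (|\alpha_i|^{q/p} \operatorname{sign}(\alpha_i))$ from a given $y = (\alpha_i)$ and checks that Hölder's inequality becomes an equality for this pair, so that the pairing divided by the $\ell^p$ norm of $x$ recovers the $\ell^q$ norm of $y$.

```python
import numpy as np

def lp_norm(v, p):
    """Discrete l^p norm of a finitely supported sequence."""
    return float(np.sum(np.abs(v) ** p) ** (1.0 / p))

p = 3.0
q = p / (p - 1.0)             # conjugate exponent: 1/p + 1/q = 1

rng = np.random.default_rng(0)
y = rng.standard_normal(50)   # plays the role of (alpha_i), finitely supported

# Extremal element: x_i = |alpha_i|^(q/p) * sign(alpha_i)
x = np.abs(y) ** (q / p) * np.sign(y)

pairing = float(np.sum(x * y))                  # f_y(x) = sum_i x_i alpha_i
holder_bound = lp_norm(x, p) * lp_norm(y, q)    # right-hand side of Hoelder
ratio = pairing / lp_norm(x, p)                 # should equal ||y||_q
```

For a generic $x$ the pairing is strictly smaller than the Hölder bound; the specific choice above is what makes the dual norm of $f_y$ equal to $\|y\|_{\ell^q}$.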

Example 4.30 shows that, given $p \in [1, +\infty)$, there is a map
$F : \ell^q(\mathbb{N}) \to (\ell^p(\mathbb{N}))'$, with $1/q + 1/p = 1$, defined as follows: for all
$y \in \ell^q(\mathbb{N})$, $F(y) \in (\ell^p(\mathbb{N}))'$ is given by
\[ F(y)(x) = \sum_{i=1}^{+\infty} x_i y_i, \]
with the following properties:

• F is one-to-one and onto (proven in example 4.30)
• F is linear (easy exercise)
• F is an isometry, i.e. $\|y\|_{\ell^q} = \|F(y)\|_{(\ell^p)'}$ (proven in example 4.30).
So, essentially, the two normed linear spaces $\ell^q(\mathbb{N})$ and $(\ell^p(\mathbb{N}))'$ are
identified.
The above identification is false if $p = +\infty$. It can be proven that the space
$c_0$ of sequences converging to zero, equipped with the $\|\cdot\|_{\ell^\infty}$ norm, is a Banach
space, and that the dual space of $c_0$ is $\ell^1$. The identification (in the above
sense) of the dual of $\ell^\infty$ goes beyond the scope of this course.
We continue this subsection with the goal of performing a similar
identification for the dual of the $L^p$ spaces defined in subsection 3.4. Let
$\Omega \subset \mathbb{R}^d$ be a measurable set. Suppose $1 \le p \le +\infty$ and $g \in L^q(\Omega)$ with
$1/p + 1/q = 1$. We define $\varphi_g : L^p(\Omega) \to \mathbb{R}$ by
\[ \varphi_g(f) = \int_\Omega f(x) g(x) \, dx \quad \text{for every } f \in L^p(\Omega). \]
Hölder's inequality implies that $\varphi_g$ is a bounded linear functional on $L^p$, with
\[ \|\varphi_g\|_{(L^p)'} = \sup_{\|f\|_{L^p} \le 1} |\varphi_g(f)| \le \sup_{\|f\|_{L^p} \le 1} \|g\|_{L^q} \|f\|_{L^p} \le \|g\|_{L^q}. \]
Now, assume $1 < p < +\infty$ and set $f_0 = |g|^{q-2} g$. We have
\[ \varphi_g(f_0) = \int_\Omega |g(x)|^q \, dx = \|g\|_{L^q}^q, \]
and, since $p(q - 1) = q$,
\[ \|f_0\|_{L^p} = \Big( \int_\Omega |g(x)|^{p(q-1)} \, dx \Big)^{1/p} = \Big( \int_\Omega |g(x)|^q \, dx \Big)^{1/p} = \|g\|_{L^q}^{q/p}, \]
which implies that $f_0 \in L^p(\Omega)$ and
\[ \|f_0\|_{L^p} \|g\|_{L^q} = \|g\|_{L^q}^{q/p + 1} = \|g\|_{L^q}^q = |\varphi_g(f_0)|, \]
hence $\|\varphi_g\|_{(L^p)'} = \|g\|_{L^q}$.

4.31 theorem (Riesz representation Theorem for $L^p$ spaces). Let $\Omega \subset \mathbb{R}^d$ be a
measurable set. Let $1 \le p < +\infty$. Then, for every $\varphi \in (L^p(\Omega))'$ there is a
$g \in L^q(\Omega)$ with $1/p + 1/q = 1$ such that
\[ \varphi(f) = \int_\Omega f(x) g(x) \, dx \]
for all $f \in L^p(\Omega)$. Moreover, $\|\varphi\|_{(L^p)'} = \|g\|_{L^q}$.

We will not give the proof of the above theorem. The computation performed
before the statement of the theorem only shows that the multiplication
functional $f \mapsto \int_\Omega f g \, dx$ with $g \in L^q$ is a bounded (linear) functional on $L^p$, and
that the map associating $g \in L^q$ to $\varphi_g \in (L^p)'$ is an isometry. It remains to
show that such a map is onto, which goes beyond our scope.
According to Theorem 4.31, we may identify $(L^p)'$ with $L^q$, where
$1/p + 1/q = 1$. When $p = q = 2$, we recover the result of the Riesz representation
theorem on a Hilbert space, which we will prove later on. The dual of $L^1$ is
$L^\infty$, but the dual of $L^\infty$ is strictly larger than $L^1$.

4.32 example. Consider $X = C([a, b])$. For any $g \in L^1([a, b])$, the formula
\[ \varphi(f) = \int_a^b f(x) g(x) \, dx \]
defines a continuous linear functional $\varphi$ on $X$. However, not all continuous
linear functionals are of the above form. For example, if $x_0 \in [a, b]$, then the
evaluation of $f$ at $x_0$ is a continuous linear functional. That is, if we define
$\delta_{x_0} : C([a, b]) \to \mathbb{R}$ by
\[ \delta_{x_0}(f) = f(x_0), \]
then $\delta_{x_0}$ is a continuous linear functional on $C([a, b])$ (easy exercise).

Since $X'$ is a Banach space, we can form its dual space $X''$, called the bidual
of $X$. There is no natural way to identify an element of $X$ with an element of
the dual $X'$, but we can naturally identify an element of $X$ with an element of
the bidual $X''$. If $x \in X$, then we define $F_x \in X''$ by evaluation at $x$:
\[ F_x(\varphi) = \varphi(x) \quad \text{for every } \varphi \in X'. \]
We leave as an exercise to prove that $F_x \in X''$. In this way, we may regard $X$ as
a subspace of $X''$. Indeed, one can prove that the identification
$X \ni x \mapsto F_x \in X''$ is isometric, i.e. $\|F_x\|_{X''} = \|x\|_X$ (we leave the details as an
exercise). This holds for arbitrary normed spaces $X$, and in general such an
identification is not onto. Now, if all continuous linear functionals $F$ on $X'$ are
of the form $F(\varphi) = \varphi(x)$ for some $x \in X$, then $X$ and $X''$ essentially coincide
under the identification $x \mapsto F_x$, and we say that $X$ is reflexive.

If $1 < p < +\infty$, then $(L^p)'' = L^p$ and $L^p$ is reflexive, but $L^1$ and $L^\infty$ are not
reflexive. Similarly to the $\ell^p$ spaces, the situation of $L^\infty$ is special. The dual of
$L^\infty$ is a space of measures; we omit the details and refer to [2].

4.5 An overview of fundamental principles of functional analysis

We provide, in this section, a brief overview of some results that are of great
relevance in functional analysis, although we shall only prove some of them.
We start with a famous Theorem by Hahn and Banach, which basically
says that in any linear space we can always extend a linear functional defined
on a linear subspace in a suitable way.
A remark on the notation. In functional analysis the “action” of a
functional $f \in X'$ on an element $x \in X$ is often denoted by
\[ \langle f, x \rangle_{X' \times X} = \langle f, x \rangle. \]

4.33 theorem (Hahn-Banach analytic form). Let $E$ be a real linear space. Let
$p : E \to \mathbb{R}$ be a function satisfying

(i) $p(\lambda x) = \lambda p(x)$ for all $x \in E$ and all $\lambda > 0$,

(ii) $p(x + y) \le p(x) + p(y)$ for all $x, y \in E$.

Let $G \subset E$ be a linear subspace of $E$ and let $g : G \to \mathbb{R}$ be a linear functional
defined on $G$ such that
\[ g(x) \le p(x) \quad \text{for all } x \in G. \]
Then, there exists a linear functional $f$ defined on all of $E$ that extends $g$, i.e.
$g(x) = f(x)$ for all $x \in G$, and such that $f(x) \le p(x)$ for all $x \in E$.

Here are some important consequences of the above theorem.

4.34 corollary. Let $E$ be a normed space. Let $G \subset E$ be a linear subspace. If
$g : G \to \mathbb{R}$ is a continuous linear functional, then there exists $f \in E'$ that
extends $g$ and such that
\[ \|f\|_{E'} = \sup_{x \in G, \, \|x\| \le 1} |g(x)| = \|g\|_{G'}. \]

Proof. Use Theorem 4.33 with $p(x) = \|g\|_{G'} \|x\|$.

4.35 corollary. Let $E$ be a normed space. For every $x_0 \in E$ there exists $f_0 \in E'$
such that
\[ \|f_0\| = \|x_0\| \quad \text{and} \quad \langle f_0, x_0 \rangle = \|x_0\|^2. \]

Proof. Use Corollary 4.34 with $G = \{t x_0 : t \in \mathbb{R}\}$ and $g(t x_0) = t \|x_0\|^2$, so
that $\|g\|_{G'} = \|x_0\|$.

In general, the element $f_0 \in E'$ in Corollary 4.35 is not unique for a given
$x_0 \in E$. It is unique in special cases, e.g. when $E$ is a Hilbert space or an $L^p$
space with $p \in (1, +\infty)$, as we shall see later on. In general, for an element
$x_0 \in E$ we set
\[ F(x_0) = \{ f_0 \in E' : \|f_0\| = \|x_0\| \text{ and } \langle f_0, x_0 \rangle = \|x_0\|^2 \}. \]
The multi-valued map $E \ni x_0 \mapsto F(x_0) \in \mathcal{P}(E')$ is called the duality map from
$E$ into $E'$.

4.36 corollary. Let $x_0 \in E$. If $f(x_0) = 0$ for all $f \in E'$, then $x_0 = 0$.

Proof. Assume by contradiction that $x_0 \ne 0$. Then, by Corollary 4.35 there
exists $f_0 \in E'$ with $f_0 \ne 0$ such that $\|f_0\| = \|x_0\|$ and $\langle f_0, x_0 \rangle = \|x_0\|^2 \ne 0$,
and this is a contradiction.

4.37 corollary. Let $E$ be a normed space. For every $x \in E$ we have
\[ \|x\| = \sup_{f \in E', \, \|f\| \le 1} |\langle f, x \rangle| = \max_{f \in E', \, \|f\| \le 1} |\langle f, x \rangle|. \]

Proof. The assertion is trivial if $x = 0$. If $x \ne 0$, it is clear that
\[ \sup_{f \in E', \, \|f\| \le 1} |\langle f, x \rangle| \le \|x\|. \]
On the other hand, we know from Corollary 4.35 that there is some $f_0 \in E'$
such that $\|f_0\| = \|x\|$ and $\langle f_0, x \rangle = \|x\|^2$. Set $f_1 := f_0 / \|x\|$. We have $\|f_1\| = 1$
and $\langle f_1, x \rangle = \|x\|$, so the supremum is attained at $f_1$.

The above corollary is quite important, because it allows us to state the
following “duality principle”. Given a Banach space $X$ and its dual $X'$, by
definition of the dual norm $\|f\|$ of $f \in X'$ we have
\[ \|f\| = \sup_{\|x\| = 1} |\langle f, x \rangle|. \]
The above corollary implies the symmetric statement, namely that for all
$x \in X$ we have
\[ \|x\| = \sup_{\|f\| = 1} |\langle f, x \rangle|. \]

We now state the so-called Baire category theorem, a very important
but, at the same time, very abstract result, which has substantial consequences
for the theory of linear operators.
Recall that a subset $B$ of a metric space $X$ is called nowhere dense if its
closure has empty interior.

4.38 definition. A metric space $X$ is called a Baire first category space if $X$ can
be written as the union of a countable family of nowhere dense closed sets.
The space $X$ is called a Baire second category space if it is not a first category
space, i.e. if, given a sequence of closed subsets $\{F_n\}_{n \in \mathbb{N}}$ of $X$ with
$X = \bigcup_{n=1}^{+\infty} F_n$, at least one of the $F_n$ has nonempty interior.

4.39 theorem (Baire). Let $X$ be a nonempty complete metric space. Then $X$ is a
Baire second category space.

A first important consequence is the following theorem.

4.40 theorem (Banach-Steinhaus, uniform boundedness principle). Let $E$ and
$F$ be two Banach spaces and let $\{T_i\}_{i \in I}$ be a family (not necessarily countable) of
continuous linear operators from $E$ into $F$. Assume that
\[ \sup_{i \in I} \|T_i(x)\| < +\infty \quad \text{for all } x \in E. \qquad (29) \]
Then,
\[ \sup_{i \in I} \|T_i\|_{\mathcal{L}(E,F)} < +\infty. \qquad (30) \]
In other words, there exists a constant $c$ such that
\[ \|T_i(x)\| \le c \|x\| \quad \text{for all } x \in E \text{ and for all } i \in I. \]

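Proof. For every $n \ge 1$ let
\[ X_n = \{ x \in E : \|T_i x\| \le n \text{ for all } i \in I \}. \]
Clearly, $X_n$ is closed, and from the assumption (29) we have
\[ \bigcup_{n \ge 1} X_n = E. \]
From the Baire category theorem 4.39, the interior of $X_{n_0}$ is nonempty for at
least one $n_0 \in \mathbb{N}$. Therefore there exists a (closed) ball $B_r(x_0) \subset X_{n_0}$ for some
$x_0 \in X_{n_0}$ and some $r > 0$. This implies
\[ \|T_i(x_0 + r z)\| \le n_0 \quad \text{for all } i \in I \text{ and for all } z \in B_1(0). \]
Therefore, for all $x \in E$ with $x \ne 0$ we have
\[ \frac{r}{\|x\|} \|T_i(x)\| = \Big\| T_i \Big( \frac{r x}{\|x\|} \Big) \Big\| \le \Big\| T_i \Big( x_0 + \frac{r x}{\|x\|} \Big) \Big\| + \|T_i(x_0)\| \le n_0 + \|T_i(x_0)\| \le 2 n_0, \]
where the last step uses $x_0 \in X_{n_0}$. Hence $\|T_i(x)\| \le (2 n_0 / r) \|x\|$ for all $i \in I$,
and this proves the assertion.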

4.41 corollary. Let $E$ and $F$ be two Banach spaces. Let $\{T_n\}_n$ be a sequence of
continuous linear operators from $E$ into $F$ such that for every $x \in E$, $T_n(x)$
converges to a limit denoted by $T(x)$ as $n \to +\infty$. Then we have

(a) $\sup_n \|T_n\|_{\mathcal{L}(E,F)} < +\infty$,

(b) $T \in \mathcal{L}(E, F)$,

(c) $\|T\|_{\mathcal{L}(E,F)} \le \liminf_{n \to +\infty} \|T_n\|_{\mathcal{L}(E,F)}$.

Proof. (a) follows directly from Theorem 4.40, and thus there exists a constant
$c$ such that
\[ \|T_n(x)\| \le c \|x\| \quad \text{for all } n \in \mathbb{N} \text{ and for all } x \in E. \]
Hence, for all $n \in \mathbb{N}$ and all $x \in E$,
\[ \|T(x)\| \le \|(T - T_n)(x)\| + \|T_n(x)\| \le c \|x\| + \|(T - T_n)(x)\|, \]
and taking the limit as $n \to +\infty$ we get
\[ \|T(x)\| \le c \|x\| \quad \text{for all } x \in E. \]
Since $T$ is clearly linear, we obtain (b).
Finally, we have
\[ \|T_n(x)\| \le \|T_n\|_{\mathcal{L}(E,F)} \|x\|. \]
By taking the lim inf on both sides, recalling that $T_n(x)$ converges to $T(x)$, we
get
\[ \|T(x)\| \le \liminf_{n \to +\infty} \|T_n\|_{\mathcal{L}(E,F)} \|x\|, \]
which proves (c).

Another important consequence of Baire's category theorem is the
following result, whose proof is nontrivial and will be omitted.

4.42 theorem (Open mapping theorem). Let $E$ and $F$ be two Banach spaces and
let $T$ be a continuous linear operator from $E$ into $F$ that is bijective (i.e. one-to-one
and surjective). Then $T^{-1}$ is also continuous from $F$ into $E$.

The following result is an application of the open mapping theorem. It
provides a useful way to show that an operator $T$ has closed range, a property
that is sometimes useful in applications. The proposition states that $T$ has
closed range if one can estimate the norm of the solution $x$ of the equation
$Tx = y$ in terms of the norm of the right-hand side $y$.

4.43 proposition. Let $T : X \to Y$ be a bounded linear map between Banach spaces
$X$, $Y$. The following statements are equivalent:
(a) There is a constant $c > 0$ such that
\[ c \|x\| \le \|Tx\| \quad \text{for all } x \in X. \]
(b) $T$ has closed range, and the only solution of the equation $Tx = 0$ is $x = 0$.

Proof. First, suppose that $T$ satisfies (a). Then $Tx = 0$ implies that $\|x\| = 0$,
so $x = 0$. To show that $\operatorname{Ran} T$ is closed, suppose that $(y_n)$ is a convergent
sequence in $\operatorname{Ran} T$, with $y_n \to y \in Y$. Then there is a sequence $(x_n)$ in $X$ such
that $T x_n = y_n$. The sequence $(x_n)$ is Cauchy, since $(y_n)$ is Cauchy and
\[ \|x_n - x_m\| \le \frac{1}{c} \|T(x_n - x_m)\| = \frac{1}{c} \|y_n - y_m\|. \]
Hence, since $X$ is complete, we have $x_n \to x$ for some $x \in X$. Since $T$ is
bounded, we have
\[ Tx = \lim_{n \to +\infty} T x_n = \lim_{n \to +\infty} y_n = y, \]
so $y \in \operatorname{Ran} T$ and $\operatorname{Ran} T$ is closed.

Conversely, suppose that $T$ satisfies (b). Since $\operatorname{Ran} T$ is closed, it is a Banach
space. Since $T : X \to Y$ is one-to-one, the operator $T : X \to \operatorname{Ran} T$ is a
one-to-one, onto map between Banach spaces. The open mapping theorem implies
that $T^{-1} : \operatorname{Ran} T \to X$ is bounded, and hence there is a constant $C > 0$ such
that
\[ \|T^{-1} y\| \le C \|y\| \quad \text{for all } y \in \operatorname{Ran} T. \]
Setting $y = Tx$, we see that $c \|x\| \le \|Tx\|$ for all $x \in X$, where $c = 1/C$.

4.44 example. Consider the operator $T = I + K$ on $C([0, 1])$, where $K$ is
defined in (28). The range of $T$ is the whole space $C([0, 1])$ and is therefore
closed. To prove this statement, we observe that $g = Tf$ if and only if
\[ f(x) + \int_0^x f(y) \, dy = g(x). \]
Writing $F(x) = \int_0^x f(y) \, dy$, we have $F' = f$ and
\[ F' + F = g, \qquad F(0) = 0. \]
The solution of this initial value problem is
\[ F(x) = \int_0^x e^{y - x} g(y) \, dy. \]
Differentiating this expression with respect to $x$, we find that $f$ is given by
\[ f(x) = g(x) - \int_0^x e^{y - x} g(y) \, dy. \]
Thus, the operator $T = I + K$ is invertible on $C([0, 1])$ and
\[ (I + K)^{-1} = I - L, \]
where $L$ is the Volterra integral operator
\[ Lg(x) = \int_0^x e^{y - x} g(y) \, dy. \]
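
The inversion formula can be sanity-checked numerically. The sketch below (Python with NumPy) discretizes $[0, 1]$ on a uniform grid and approximates the integrals with the trapezoidal rule; the grid size, the test function $g(x) = \cos(3x)$ and the helper `cumulative_integral` are assumptions chosen for illustration. It computes $f = (I - L) g$ and then measures how far $(I + K) f$ is from $g$ in the sup norm.

```python
import numpy as np

def cumulative_integral(values, dx):
    """Trapezoidal approximation of x -> int_0^x values(y) dy on a uniform grid."""
    increments = 0.5 * (values[1:] + values[:-1]) * dx
    return np.concatenate(([0.0], np.cumsum(increments)))

x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]

g = np.cos(3.0 * x)   # hypothetical right-hand side

# Lg(x) = int_0^x e^{y-x} g(y) dy = e^{-x} * int_0^x e^{y} g(y) dy
Lg = np.exp(-x) * cumulative_integral(np.exp(x) * g, dx)
f = g - Lg            # candidate solution f = (I - L) g

# Apply T = I + K, where Kf(x) = int_0^x f(y) dy
Tf = f + cumulative_integral(f, dx)
residual = float(np.max(np.abs(Tf - g)))
print(residual)
```

The residual is only as small as the quadrature allows; refining the grid should reduce it further.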

4.45 example. Consider the Volterra integral operator $K : C([0, 1]) \to C([0, 1])$
defined in (28). Then
\[ K[\cos(n \pi x)](x) = \int_0^x \cos(n \pi y) \, dy = \frac{\sin(n \pi x)}{n \pi}. \]
We have $\|\cos(n \pi x)\|_\infty = 1$ for every $n \in \mathbb{N}$, but $\|K[\cos(n \pi x)]\|_\infty \to 0$ as
$n \to +\infty$. Thus, it is not possible to estimate $\|f\|$ in terms of $\|Kf\|$, consistent
with the fact that the range of $K$ is not closed.
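
The decay of the sup norm of $K[\cos(n \pi x)]$ can be observed numerically. In the following Python sketch (NumPy; the grid, the chosen values of $n$ and the helper `volterra` are illustrative assumptions) the integral is approximated with the trapezoidal rule and the resulting sup norms are stored for comparison with $1/(n \pi)$.

```python
import numpy as np

def volterra(values, dx):
    """Trapezoidal approximation of Kf(x) = int_0^x f(y) dy on a uniform grid."""
    increments = 0.5 * (values[1:] + values[:-1]) * dx
    return np.concatenate(([0.0], np.cumsum(increments)))

x = np.linspace(0.0, 1.0, 10001)
dx = x[1] - x[0]

sup_norms = {}
for n in (1, 2, 5, 10):
    f = np.cos(n * np.pi * x)          # sup norm of f is 1 for every n
    sup_norms[n] = float(np.max(np.abs(volterra(f, dx))))

for n, value in sup_norms.items():
    print(n, value, 1.0 / (n * np.pi))
```

No constant $c > 0$ can satisfy $c \|f\|_\infty \le \|Kf\|_\infty$ along this family, which is the failure of condition (a) in Proposition 4.43.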

4.6 Weak topologies and weak convergences


4.46 exercise. On a given set X one may consider two distinct topologies τ
and σ. We say that τ is weaker than σ (or equivalently that σ is stronger than τ)
if τ ⊂ σ. Prove that if τ is weaker than σ then every sequence xn converging
to x in σ converges to x in τ too.

Let ( E, k · k) be a normed space. Let us, for the moment, ignore the usual
topology on E induced by the norm k · k.
For a given family $\mathcal{F}$ of maps $f : E \to \mathbb{R}$, one can consider the coarsest
topology $\tau$ that makes all maps $f \in \mathcal{F}$ continuous. Here $\mathbb{R}$ is considered as a
topological space with the usual Euclidean topology.
The topology $\tau$ is constructed as the topology generated by the inverse
images of all open sets in $\mathbb{R}$ via all maps $f \in \mathcal{F}$. More in detail, one considers
the family $\mathcal{C} = \{ f^{-1}(O) : O \subset \mathbb{R} \text{ open}, f \in \mathcal{F} \}$ and takes first the family $\mathcal{I}$ of
the intersections of finitely many elements of $\mathcal{C}$.
Finally, one takes $\tau$ as the family of all unions of sets in $\mathcal{I}$. Such a topology
is denoted by $\tau = \tau(\mathcal{C})$, and is called the inverse limit topology of $\mathcal{F}$. (Exercise:
show that $\tau(\mathcal{C})$ is a topology.)
The set $\mathcal{C}$ is called a sub-basis for $\tau$. More precisely, we say that a set $\mathcal{C}$
is a sub-basis for a given topology $\tau$ if $\tau$ is the coarsest (weakest) topology
containing $\mathcal{C}$. Moreover, the family $\mathcal{I}$ of finite intersections of sets in $\mathcal{C}$ is a
basis for the topology $\tau$. This means that every open set in $\tau(\mathcal{C})$ can be written
as a union of sets in $\mathcal{I}$ (exercise!).
Finally, for a given point $x \in E$, the family
\[ \mathcal{U}_x = \{ f_1^{-1}(O_1) \cap \dots \cap f_k^{-1}(O_k) : O_1, \dots, O_k \text{ open in } \mathbb{R}, \ f_1, \dots, f_k \in \mathcal{F}, \text{ with } f_j(x) \in O_j \text{ for all } j = 1, \dots, k \} \]
is a basis of neighborhoods for $x$ in the topology $\tau$, which means that every open
neighborhood of $x$ in $\tau$ contains an open set of the family $\mathcal{U}_x$.


4.47 exercise. With the notation above, prove that a sequence $\{x_n\}_n \subset E$
converges to $x \in E$ in $\tau(\mathcal{C})$ if and only if $f(x_n) \to f(x)$ for all $f \in \mathcal{F}$.
Solution: If $x_n \to x$ in $\tau$, then $f(x_n) \to f(x)$ for all $f \in \mathcal{F}$ because all
$f \in \mathcal{F}$ are continuous in the topology $\tau$ (the inverse image of an open set
$O \subset \mathbb{R}$ via $f$ is open in $\tau(\mathcal{C})$!). Vice versa, assume $f(x_n) \to f(x)$ for all $f \in \mathcal{F}$.
Let $U_x \in \mathcal{U}_x$ be a basic neighborhood of $x$; in particular,
$U_x = f_1^{-1}(O_1) \cap \dots \cap f_k^{-1}(O_k)$ for some $f_1, \dots, f_k \in \mathcal{F}$ and some open sets
$O_1, \dots, O_k \subset \mathbb{R}$ with $f_i(x) \in O_i$ for all $i = 1, \dots, k$. By assumption,
$f_i(x_n) \to f_i(x)$, hence there exist integers $N_i \in \mathbb{N}$ such that $f_i(x_n) \in O_i$ for all
$n \ge N_i$, for all $i = 1, \dots, k$. Set $N := \max\{N_1, \dots, N_k\}$. Then, for all $n \ge N$ one
has $f_i(x_n) \in O_i$ for all $i = 1, \dots, k$, i.e. $x_n \in f_1^{-1}(O_1) \cap \dots \cap f_k^{-1}(O_k)$, i.e.
$x_n \in U_x$. This implies that $x_n \to x$ in $\tau$.

4.48 exercise. With the notation above, let Z be a topological space and let
ψ : Z → E. Then ψ is continuous if and only if f ◦ ψ is continuous from Z into
R for every f ∈ F .
Solution: If ψ is continuous then f ◦ ψ is also continuous for all f ∈ F (all
the f ∈ F are continuous in τ, and the composition of continuous functions
is continuous). Vice versa, assume f ◦ ψ is continuous for every f ∈ F . We
have to prove that $\psi^{-1}(U)$ is open for all sub-basic open sets $U \subset E$ in the
$\tau$ topology. But we know that sub-basic open sets in $\tau$ are all of the form
$U = f^{-1}(O)$ for some $f \in \mathcal{F}$ and some open set $O \subset \mathbb{R}$. Hence,
$\psi^{-1}(U) = \psi^{-1}(f^{-1}(O)) = (f \circ \psi)^{-1}(O)$, and the latter is an open set by the
continuity of $f \circ \psi$.

Let $E$ be a normed space and let $f \in E'$. We denote by $\varphi_f : E \to \mathbb{R}$ the
linear functional $\varphi_f(x) = \langle f, x \rangle$. As $f$ runs through $E'$ we obtain a collection
$\{\varphi_f\}_{f \in E'}$ of maps from $E$ into $\mathbb{R}$. We now ignore the usual topology induced
by $\|\cdot\|$ on $E$ and define a new topology on the set $E$ as follows.

4.49 definition. The weak topology $\sigma(E, E')$ on $E$ is the inverse limit topology
of the family of maps $\{\varphi_f\}_{f \in E'}$.

4.50 remark. Note that all $f \in E'$ are continuous functionals on $(E, \|\cdot\|)$. Since
$\sigma(E, E')$ is the coarsest topology that makes all $f \in E'$ continuous, we deduce
that the weak topology $\sigma(E, E')$ on $E$ is weaker than the usual topology induced on
$E$ by the norm $\|\cdot\|$, which we will from now on refer to as the strong topology
on $E$.

4.51 proposition. Let $x_0 \in E$. Given $\varepsilon > 0$ and a finite set $\{f_1, f_2, \dots, f_k\} \subset E'$,
consider
\[ V = V(f_1, \dots, f_k; \varepsilon) = \{ x \in E : |\langle f_i, x - x_0 \rangle| < \varepsilon \text{ for all } i = 1, \dots, k \}. \]
Then $V$ is a neighborhood of $x_0$ for the topology $\sigma(E, E')$. Moreover, we obtain a
basis of neighborhoods of $x_0$ by varying $\varepsilon$, $k$, and the $f_i$'s in $E'$.

Proof. From the discussion above we already know that the sets $V$ in the
statement are a basis of neighborhoods for $x_0$ if the condition $|\langle f_i, x - x_0 \rangle| < \varepsilon$ is
replaced by $f_i(x) \in O_i$ for some open neighborhood $O_i$ of $f_i(x_0)$. The statement
follows by recalling that the open sets of $\mathbb{R}$ are characterised as the sets $O$
such that for all $y \in O$ there exists an open interval $(y - \varepsilon, y + \varepsilon) \subset O$ for
some $\varepsilon > 0$.

Let $\{x_n\}_n$ be a sequence in $E$. If $x_n$ converges to $x \in E$ in the $\sigma(E, E')$
topology we shall use the notation
\[ x_n \rightharpoonup x. \]
We shall sometimes say $x_n \rightharpoonup x$ weakly in $\sigma(E, E')$. The convergence of $x_n$
to $x$ in the usual topology will be sometimes emphasised by saying $x_n \to x$
strongly, meaning $\|x_n - x\| \to 0$.

4.52 proposition. Let $\{x_n\}_n \subset E$ be a sequence in $E$. Then

(i) $x_n \rightharpoonup x$ weakly in $\sigma(E, E')$ if and only if $\langle f, x_n \rangle \to \langle f, x \rangle$ for all $f \in E'$.

(ii) If $x_n \to x$ strongly, then $x_n \rightharpoonup x$ weakly in $\sigma(E, E')$.

(iii) If $x_n \rightharpoonup x$ weakly in $\sigma(E, E')$, then $\{\|x_n\|\}_n$ is bounded and
\[ \|x\| \le \liminf_{n \to +\infty} \|x_n\|. \]

(iv) If $x_n \rightharpoonup x$ weakly in $\sigma(E, E')$ and if $f_n \to f$ strongly in $E'$ (i.e.
$\|f_n - f\|_{E'} \to 0$), then $\langle f_n, x_n \rangle \to \langle f, x \rangle$.

Proof. (i) is a consequence of Exercise 4.47 and the definition of the weak
topology $\sigma(E, E')$ on $E$.
(ii) is a consequence of (i), since
\[ |\langle f, x_n \rangle - \langle f, x \rangle| \le \|f\|_{E'} \|x_n - x\|. \]
Alternatively, it is a consequence of the fact that the weak topology is weaker
than the norm topology.
(iii) follows from Theorem 4.40. Indeed, for every $n \in \mathbb{N}$ define the map
$T_n : E' \to \mathbb{R}$ as $T_n(f) = \langle f, x_n \rangle$. Since $x_n$ converges weakly to $x$, for every
$f \in E'$ the real sequence $\langle f, x_n \rangle$ is convergent, and hence bounded. Therefore,
for all $f \in E'$ the sequence $(T_n(f))_n$ is bounded in $\mathbb{R}$. Hence, from the
Banach-Steinhaus theorem 4.40 (applied on the Banach space $E'$) we have
$\sup_{n \in \mathbb{N}} \|T_n\|_{\mathcal{L}(E', \mathbb{R})} < +\infty$, and there exists a constant $c \in \mathbb{R}$ such that
\[ |T_n(f)| = |\langle f, x_n \rangle| \le c \|f\|_{E'} \quad \text{for all } n \in \mathbb{N}. \]
Hence, Corollary 4.37 implies
\[ \|x_n\| = \sup_{f \in E', \, \|f\|_{E'} \le 1} |\langle f, x_n \rangle| \le c, \]
which proves that $\|x_n\|$ is a bounded sequence. Now, taking the $\liminf_{n \to +\infty}$
in the inequality $|\langle f, x_n \rangle| \le \|f\|_{E'} \|x_n\|$ we obtain
\[ |\langle f, x \rangle| \le \|f\|_{E'} \liminf_{n \to +\infty} \|x_n\|, \]
which implies, once again by Corollary 4.37,
\[ \|x\| = \sup_{\|f\| \le 1} |\langle f, x \rangle| \le \liminf_{n \to +\infty} \|x_n\|. \]
(iv) follows from the inequality
\[ |\langle f_n, x_n \rangle - \langle f, x \rangle| \le |\langle f_n - f, x_n \rangle| + |\langle f, x_n - x \rangle| \le \|f_n - f\| \|x_n\| + |\langle f, x_n - x \rangle|. \]
Now, due to (iii), $\|x_n\|$ is uniformly bounded and the first term on the r.h.s.
above goes to zero. The second term goes to zero because of (i).
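
A standard example showing that the converse of (ii) fails in infinite dimensions is the sequence of unit vectors $e_n$ in $\ell^2(\mathbb{N})$: by example 4.30 with $p = q = 2$, every continuous functional is of the form $x \mapsto \sum_i x_i \alpha_i$ with $(\alpha_i) \in \ell^2$, so $\langle f, e_n \rangle = \alpha_n \to 0$, while $\|e_n\| = 1$. The Python sketch below (NumPy; the truncation length `N`, the choice $\alpha_i = 1/i$ and the helper name are assumptions for illustration) evaluates these pairings.

```python
import numpy as np

N = 10_000                        # truncation length (assumption of this sketch)
alpha = 1.0 / np.arange(1, N + 1) # alpha = (1, 1/2, 1/3, ...), an element of l^2

def unit_vector(n, length=N):
    """Truncated n-th unit vector e_n (1-based index)."""
    e = np.zeros(length)
    e[n - 1] = 1.0
    return e

indices = (1, 10, 100, 1000)
pairings = [float(alpha @ unit_vector(n)) for n in indices]        # <f, e_n> = 1/n
norms = [float(np.linalg.norm(unit_vector(n))) for n in indices]   # always 1
print(pairings, norms)
```

Here the weak limit is $0$ and $\|0\| = 0 < 1 = \liminf \|e_n\|$, so the inequality in (iii) can be strict.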

4.53 proposition. When $E$ is finite-dimensional, the weak topology $\sigma(E, E')$ and
the usual topology are the same. In particular, a sequence $\{x_n\}_n$ converges weakly if
and only if it converges strongly.

Proof. Since the weak topology is always weaker than the strong topology,
it suffices to check that every strongly open set is weakly open. Let $x_0 \in E$
and let $U$ be a neighborhood of $x_0$ in the strong topology. We have to find a
neighborhood $V$ of $x_0$ in the weak topology $\sigma(E, E')$ such that $V \subset U$. In other
words, we have to find $f_1, \dots, f_k \in E'$ and $\varepsilon > 0$ such that
\[ V = \{ x \in E : |\langle f_i, x - x_0 \rangle| < \varepsilon \text{ for all } i = 1, \dots, k \} \subset U. \]
Fix $r > 0$ such that $B_r(x_0) \subset U$. Pick a basis $e_1, e_2, \dots, e_k$ of $E$ such that
$\|e_i\| = 1$ for all $i$. Hence, every $x \in E$ can be written as $x = \sum_{i=1}^k x_i e_i$, and the
maps $x \mapsto x_i$ are continuous linear functionals on $E$, denoted by $f_i$. Choosing
those functionals for the neighborhood $V$ we have, for all $x \in V$,
\[ \|x - x_0\| \le \sum_{i=1}^k |\langle f_i, x - x_0 \rangle| < k \varepsilon. \]
Choosing $\varepsilon = r/k$, we obtain $V \subset B_r(x_0) \subset U$ as desired.

Open (resp. closed) sets in the weak topology σ( E, E0 ) are always open
(resp. closed) in the strong topology. In any infinite dimensional space the
weak topology is strictly coarser than the strong topology: i. e., there exist
open (resp. closed) sets in the strong topology that are not open (resp. closed)
in the weak topology. Here are two examples.

4.54 example. The unit sphere $S = \{ x \in E : \|x\| = 1 \}$, with $E$ infinite
dimensional, is never closed in the weak topology $\sigma(E, E')$. More precisely, we
have
\[ \overline{S}^{\sigma(E,E')} = B_1(0) = \{ x \in E : \|x\| \le 1 \}, \]
where $\overline{S}^{\sigma(E,E')}$ denotes the closure of $S$ in the topology $\sigma(E, E')$. First, let us
check that every $x_0 \in E$ with $\|x_0\| < 1$ belongs to $\overline{S}^{\sigma(E,E')}$. Indeed, let $V$ be a
neighborhood of $x_0$ in $\sigma(E, E')$. We have to prove that $V \cap S \ne \emptyset$. In view of
Proposition 4.51, we may always assume that $V$ has the form
\[ V = \{ x \in E : |\langle f_i, x - x_0 \rangle| < \varepsilon \text{ for all } i = 1, \dots, k \}, \]
with $\varepsilon > 0$ and $f_1, \dots, f_k \in E'$.
Now, we claim that there exists $y_0 \in E$, $y_0 \ne 0$, such that $\langle f_i, y_0 \rangle = 0$ for
all $i = 1, \dots, k$. Assume by contradiction that the claim is false. We define the
following linear map $\varphi : E \to \mathbb{R}^k$,
\[ \varphi(x) = (\langle f_1, x \rangle, \dots, \langle f_k, x \rangle) \in \mathbb{R}^k. \]
If the claim is false, then for all $x \in E$ with $x \ne 0$ there exists $i \in \{1, \dots, k\}$
such that $f_i(x) \ne 0$, which means that for all $x \ne 0$ the vector $\varphi(x) \in \mathbb{R}^k$ is
nonzero. Hence, $\varphi$ is injective, i.e. a linear isomorphism from $E$ onto $\varphi(E)$, and
hence $\dim(E) = \dim(\varphi(E)) \le k$, which contradicts the assumption that $E$ has
infinite dimension. This proves the claim.
Now, the function $[0, +\infty) \ni t \mapsto g(t) := \|x_0 + t y_0\|$ is continuous on
$[0, +\infty)$. Moreover, $g(0) < 1$, and $\lim_{t \to +\infty} g(t) = +\infty$. Hence, there exists
$t_0 > 0$ such that $\|x_0 + t_0 y_0\| = 1$, i.e. $x_0 + t_0 y_0 \in S$. Now, for all $i \in \{1, \dots, k\}$
we have $\langle f_i, (x_0 + t_0 y_0) - x_0 \rangle = t_0 \langle f_i, y_0 \rangle = 0 < \varepsilon$, so $x_0 + t_0 y_0 \in V$.
Therefore, $x_0 + t_0 y_0 \in V \cap S$.
The above argument proves that $S \subset B_1(0) \subset \overline{S}^{\sigma(E,E')}$. To prove
$B_1(0) \supset \overline{S}^{\sigma(E,E')}$ it suffices to prove that $B_1(0)$ is closed in the weak topology.
We omit the proof.
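
The construction of the point $x_0 + t_0 y_0$ can be imitated in coordinates. The Python sketch below (NumPy) works in $\mathbb{R}^6$ with the Euclidean norm and $k = 3$ functionals, which is only a finite-dimensional stand-in: there a common zero $y_0$ of the $f_i$ exists because $k$ is smaller than the dimension, whereas in infinite dimensions it exists for every $k$. All numerical choices are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 3                        # ambient dimension and number of functionals

F = rng.standard_normal((k, n))    # row i represents f_i(x) = F[i] @ x
x0 = rng.standard_normal(n)
x0 *= 0.5 / np.linalg.norm(x0)     # rescale so that ||x0|| = 0.5 < 1

# A direction annihilated by every f_i: the last right-singular vector of F
# lies in the null space of F because rank(F) <= k < n. It has unit norm.
_, _, Vt = np.linalg.svd(F)
y0 = Vt[-1]

# Solve ||x0 + t*y0||^2 = 1, i.e. t^2 + 2*b*t + c = 0, for the positive root.
b = float(x0 @ y0)
c = float(x0 @ x0) - 1.0           # negative, since ||x0|| < 1
t0 = -b + np.sqrt(b * b - c)

point = x0 + t0 * y0               # lies on the unit sphere and in V
```

Since `F @ (point - x0)` vanishes, the point belongs to the neighborhood $V$ for every $\varepsilon > 0$, exactly as in the argument above.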

4.55 example. The open unit ball
\[ U = \{ x \in E : \|x\| < 1 \}, \]
with $E$ infinite dimensional, is never open in the weak topology $\sigma(E, E')$.
Suppose by contradiction that $U$ is weakly open. Then its complement
$U^c = \{ x \in E : \|x\| \ge 1 \}$ is weakly closed. It follows that $B_1(0) \cap U^c = S$ is weakly
closed, which contradicts example 4.54.

One can prove that the weak topology is never metrizable in infinite di-
mensions, i. e. there is no metric on E that induces the weak topology σ ( E, E0 ).
The proof is postponed.
We now turn our attention to the dual space $E'$. As we know, $E'$ is a normed
space with the operator norm
\[ \|f\|_{E'} = \sup_{\|x\| \le 1} |f(x)|. \]

Hence, one could consider the dual space of $E'$, i.e. the space of all continuous
linear functionals defined on $E'$, given by the bidual $E''$. As the dual of the space
$E'$, the space $E''$ is naturally equipped with the norm
\[ \|\xi\|_{E''} = \sup_{f \in E', \, \|f\|_{E'} \le 1} |\langle \xi, f \rangle|, \qquad \xi \in E''. \]
Recall the canonical injection $J : E \to E''$ defined as follows. Given $x \in E$, the
map $E' \ni f \mapsto \langle f, x \rangle$ is a continuous linear functional on $E'$. This is due to the
inequality
\[ |\langle f, x \rangle| \le \|x\|_E \|f\|_{E'}. \]
Thus, such a map is an element of $E''$. We denote such an element by $J(x)$. By
definition, we have
\[ \langle J(x), f \rangle_{E'', E'} = \langle f, x \rangle_{E', E} \quad \text{for all } f \in E'. \]
It is clear that $J$ is linear and that $J$ is an isometry, that is, $\|J(x)\|_{E''} = \|x\|_E$.
Indeed, we have
\[ \|J(x)\|_{E''} = \sup_{\|f\|_{E'} \le 1} |\langle J(x), f \rangle| = \sup_{\|f\|_{E'} \le 1} |\langle f, x \rangle| = \|x\|_E, \]
and the last step is due to Corollary 4.37.


So far, we have two topologies on $E'$:

(a) the usual strong topology associated to $\|\cdot\|_{E'}$,

(b) the weak topology $\sigma(E', E'')$.

We are now going to define a third topology on $E'$, called the weak∗ topology
and denoted by $\sigma(E', E)$. For every $x \in E$ consider the linear functional
$\varphi_x : E' \to \mathbb{R}$ defined by $E' \ni f \mapsto \varphi_x(f) = \langle f, x \rangle$. As $x$ runs through $E$ we obtain a
collection $\{\varphi_x\}_{x \in E}$ of maps from $E'$ into $\mathbb{R}$.

4.56 definition. The weak∗ topology $\sigma(E', E)$ on $E'$ is the inverse limit topology
of the family of maps $\{\varphi_x\}_{x \in E}$.

Notice that $\varphi_x$ is just another notation for $J(x)$ above. Hence, for all $x \in E$,
the linear functional $\varphi_x : E' \to \mathbb{R}$ is continuous (as a function from the normed
space $(E', \|\cdot\|_{E'})$ to $\mathbb{R}$), and therefore $\varphi_x \in E''$. This fact implies that the weak∗
topology $\sigma(E', E)$ on $E'$ is coarser than the weak topology $\sigma(E', E'')$ on $E'$, i.e.
$\sigma(E', E)$ has fewer open sets (resp. closed sets) than $\sigma(E', E'')$, which in turn
has fewer open sets than the strong topology induced on $E'$ by the operator
norm $\|\cdot\|_{E'}$.
One may wonder why there is such an interest in defining weaker and
weaker topologies. The reason is the following: a coarser topology has more
compact sets, since there are fewer open covers to test in the compactness
condition.
We now state some general properties of the weak∗ topology without
proofs.


4.57 proposition. Let $f_0 \in E'$. Given a finite set $\{x_1, \dots, x_k\} \subset E$ and $\varepsilon > 0$, consider

$$V = V(x_1, \dots, x_k; \varepsilon) = \{ f \in E' : |\langle f - f_0, x_i\rangle| < \varepsilon \ \text{for all } i = 1, \dots, k \}.$$

Then $V$ is a neighborhood of $f_0$ for the topology $\sigma(E', E)$. Moreover, we obtain a basis of neighborhoods of $f_0$ for $\sigma(E', E)$ by varying $\varepsilon$, $k$, and the $x_i$'s in $E$.

Let $\{f_n\}_n$ be a sequence in $E'$. If $f_n$ converges to $f \in E'$ in the $\sigma(E', E)$ topology we shall use the notation

$$f_n \overset{*}{\rightharpoonup} f.$$

We shall sometimes say $f_n \overset{*}{\rightharpoonup} f$ weakly∗ in $\sigma(E', E)$. The convergence of $f_n$ to $f$ in the weak topology $\sigma(E', E'')$ will be denoted by $f_n \rightharpoonup f$ in $\sigma(E', E'')$. The convergence of $f_n$ to $f$ in the strong topology of $E'$ will be sometimes emphasised by saying $f_n \to f$ strongly, meaning $\|f_n - f\|_{E'} \to 0$.
By mimicking the proof of Proposition 4.52, one can prove (we won't) the following statements for a general sequence $f_n \in E'$:

(i) $f_n \overset{*}{\rightharpoonup} f$ in $\sigma(E', E)$ if and only if $\langle f_n, x\rangle \to \langle f, x\rangle$ for all $x \in E$.

(ii) If $f_n \to f$ strongly, then $f_n \rightharpoonup f$ in $\sigma(E', E'')$. If $f_n \rightharpoonup f$ in $\sigma(E', E'')$, then $f_n \overset{*}{\rightharpoonup} f$ in $\sigma(E', E)$.

(iii) If $f_n \overset{*}{\rightharpoonup} f$ in $\sigma(E', E)$ then $\{\|f_n\|\}_n$ is bounded and $\|f\| \le \liminf_{n \to +\infty} \|f_n\|$.

(iv) If $f_n \overset{*}{\rightharpoonup} f$ in $\sigma(E', E)$ and if $x_n \to x$ strongly in $E$, then $\langle f_n, x_n\rangle \to \langle f, x\rangle$.

When $\dim(E) < +\infty$ the three topologies (strong, weak, weak∗) on $E'$ coincide. Indeed, the canonical injection $J : E \to E''$ is in this case surjective (since $\dim(E) = \dim(E'')$), and therefore $\sigma(E', E) = \sigma(E', E'')$.
We are now ready to state one of the main results of this part. As we
observed above, weakening a topology implies having more compact sets.
As seen in Theorem 4.28, one of the main points with the strong topology
on a normed space of infinite dimension is that the closed unit ball is not
a compact set. Such a situation changes drastically when passing from the
strong topology to weak topologies. The next result is a first big step toward
this direction.

4.58 theorem (Banach-Alaoglu-Bourbaki). The closed unit ball

$$B_{E'} = \{ f \in E' : \|f\|_{E'} \le 1 \}$$

is compact in the weak∗ topology $\sigma(E', E)$.

The next result describes a basic property of reflexive spaces.


4.59 theorem (Kakutani). Let $E$ be a Banach space. Then $E$ is reflexive if and only if

$$B_E = \{ x \in E : \|x\| \le 1 \}$$

is compact in the weak topology $\sigma(E, E')$.

We conclude with a result which is probably the most important one of


this section.

4.60 theorem (Weak sequential compactness on reflexive spaces). Assume that $E$ is a reflexive Banach space and let $\{x_n\}_n$ be a bounded sequence in $E$. Then there exists a subsequence $x_{n_k}$ that converges in the weak topology $\sigma(E, E')$.

4.7 Weak convergences in $\ell^p$ and $L^p$ spaces


4.61 example. Let $p \in (1, +\infty)$ and $X = \ell^p(\mathbb{N})$. By definition, a sequence $(x_n) \subset X$ converges weakly to $x$ if $\varphi(x_n) \to \varphi(x)$ as $n \to +\infty$ for all $\varphi \in (\ell^p(\mathbb{N}))'$. Example 4.30 shows that this is equivalent to requiring

$$\sum_{k=1}^{+\infty} x_{n,k}\, y_k \to \sum_{k=1}^{+\infty} x_k\, y_k \quad \text{as } n \to +\infty$$

for all $y = (y_k) \in \ell^q(\mathbb{N})$ with $1/p + 1/q = 1$. We show with the following example that weak convergence in general does not imply strong convergence. Consider the sequence $(x_n)$ in $\ell^p(\mathbb{N})$ defined by $x_n = (x_{n,k})_k$ with $x_{n,k} = \delta_{n,k}$, $\delta_{n,k}$ being the usual Kronecker delta. We show that $x_n$ converges weakly to zero in $\ell^p(\mathbb{N})$. To see that, let $y = (y_k)$ be an element of $\ell^q(\mathbb{N})$ with $1/q + 1/p = 1$. Compute

$$\sum_{k=1}^{+\infty} x_{n,k}\, y_k = \sum_{k=1}^{+\infty} \delta_{n,k}\, y_k = y_n.$$

Since $y \in \ell^q(\mathbb{N})$, the series $\sum_{k=1}^{+\infty} |y_k|^q$ converges, which implies that $y_n \to 0$ as $n \to +\infty$, and the assertion is proven. However, $x_n$ does not converge strongly to zero. Indeed,

$$\|x_n - 0\|_{\ell^p}^p = \sum_{k=1}^{+\infty} |x_{n,k}|^p = 1 \not\to 0.$$

This argument clearly does not work if $p = 1$. In this case, the dual space is identified with $\ell^\infty(\mathbb{N})$, and therefore it is no longer true that $y_n \to 0$ as $n \to +\infty$ for every $y$ in the dual. In fact, one can prove that $\ell^1(\mathbb{N})$ has the so-called Schur property, namely that every weakly convergent sequence is also strongly convergent; we omit the details.
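The computations of this example can be checked on a truncated model. The following Python sketch is only an illustration under stated assumptions: sequences are cut off after `K` entries (a hypothetical truncation parameter), the test sequence $y_k = 1/k$ is one arbitrary element of $\ell^q$, and the helper names `e`, `pairing` and `lp_norm` are introduced here for illustration. It shows the pairing of $x_n$ with $y$ decaying while the $\ell^p$ norm stays equal to 1, and it also shows what goes wrong for $p = 1$: the constant sequence lies in $\ell^\infty$ and its pairing with $x_n$ does not decay.

```python
# Truncated model: a sequence is represented by its first K entries (assumption).
K = 2000
p, q = 3.0, 1.5  # conjugate exponents: 1/p + 1/q = 1


def e(n):
    """n-th coordinate sequence x_n = (delta_{n,k})_k, truncated (n is 1-based)."""
    return [1.0 if k == n else 0.0 for k in range(1, K + 1)]


def pairing(x, y):
    """Duality pairing sum_k x_k y_k on the truncated sequences."""
    return sum(a * b for a, b in zip(x, y))


def lp_norm(x, p):
    """l^p norm of a truncated sequence."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)


y = [1.0 / k for k in range(1, K + 1)]  # y_k = 1/k lies in l^q for every q > 1
ones = [1.0] * K                        # constant sequence, an element of l^infty

for n in (1, 10, 100, 1000):
    x = e(n)
    # columns: n, pairing with y (decays), l^p norm (stays 1), pairing with ones
    print(n, pairing(x, y), lp_norm(x, p), pairing(x, ones))
```

The third printed column is the reason the convergence is not strong, and the last column is the reason the argument cannot be repeated with the dual of $\ell^1$.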
Let us now consider the case $p = +\infty$. In this case, we cannot easily identify the dual space of $X = \ell^\infty(\mathbb{N})$, hence weak convergence is difficult to characterize. But since $X$ is the dual of $\ell^1(\mathbb{N})$, we can easily define weak-∗ convergence as follows: a sequence $(x_n) \subset \ell^\infty(\mathbb{N})$ converges weakly-∗ to $x$ if and only if

$$\sum_{k=1}^{+\infty} x_{n,k}\, y_k \to \sum_{k=1}^{+\infty} x_k\, y_k \quad \text{as } n \to +\infty$$

for all $y = (y_k) \in \ell^1(\mathbb{N})$. The above example $x_{n,k} = \delta_{n,k}$ works also in this case to show that $x_n$ converges to zero in the weak-∗ sense but not strongly.

Let us now turn to weak convergence in $L^p$ spaces. According to Theorem 4.31, weak convergence can be characterized in terms of convergence of integrals of products. This permits us to reformulate the notion of weak convergence in $L^p$ spaces as follows. Here we shall always think of $L^p$ as $L^p(\Omega)$ for some measurable set $\Omega \subset \mathbb{R}^d$.

4.62 definition. Suppose that $1 \le p < +\infty$. A sequence $(f_n)$ converges weakly to $f$ in $L^p$, written $f_n \rightharpoonup f$, if

$$\lim_{n \to +\infty} \int f_n\, g\, dx = \int f\, g\, dx \quad \text{for every } g \in L^q,$$

where $q$ is the Hölder conjugate of $p$, $1/p + 1/q = 1$. When $p = +\infty$ and $q = 1$, the condition above corresponds to weak-∗ convergence of $f_n$ to $f$ in $L^\infty$.

As in the case of $\ell^p$ spaces, weak $L^p$ convergence does not imply strong $L^p$-convergence, i.e. convergence in the $L^p$ norm. The following example illustrates three typical ways in which a weakly convergent sequence of functions can fail to be strongly convergent.

4.63 example. Let $g \in L^p(\mathbb{R})$ be a fixed nonzero function, where $1 < p < +\infty$. For each of the following three sequences, we have $f_n \rightharpoonup 0$ weakly as $n \to +\infty$ but not $f_n \to 0$ strongly, in $L^p(\mathbb{R})$.

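(a) $f_n(x) = g(x) \sin nx$ (oscillation). Consider the case $g = 1_{[0,\pi]}$. For every polynomial $P$ we have

$$\int_{\mathbb{R}} f_n P\, dx = \int_0^\pi \sin(nx)\, P(x)\, dx = \frac{1}{n} \left[ P(0) - P(\pi) \cos n\pi + \int_0^\pi P'(x) \cos nx\, dx \right],$$

hence $\int_{\mathbb{R}} f_n P\, dx \to 0$ as $n \to +\infty$. We know that polynomials are dense in $C([0,\pi])$ with respect to the uniform norm, and that continuous functions are dense in $L^q([0,\pi])$; since the uniform norm controls the $L^q$ norm on a bounded interval, polynomials are dense in $L^q([0,\pi])$. Therefore, for a general $f \in L^q(\mathbb{R})$ with $1/q + 1/p = 1$, and an arbitrary $\varepsilon > 0$, let $P$ be a polynomial on $[0,\pi]$ such that $\|f - P\|_{L^q([0,\pi])} < \varepsilon$. Consider

$$\left| \int_{\mathbb{R}} f_n f\, dx \right| \le \left| \int_{\mathbb{R}} f_n P\, dx \right| + \left| \int_0^\pi f_n (f - P)\, dx \right| \le \left| \int_{\mathbb{R}} f_n P\, dx \right| + \|f_n\|_{L^p([0,\pi])}\, \|f - P\|_{L^q([0,\pi])}.$$

Now, the last term above can be controlled by

$$\|f_n\|_{L^p([0,\pi])}\, \|f - P\|_{L^q([0,\pi])} \le \varepsilon \left( \int_0^\pi |\sin nx|^p\, dx \right)^{1/p} \le \pi^{1/p}\, \varepsilon,$$

and then we can send $n \to +\infty$ and get

$$\limsup_{n \to +\infty} \left| \int_{\mathbb{R}} f_n f\, dx \right| \le \pi^{1/p}\, \varepsilon,$$

in view of the previous case of polynomials. Since $\varepsilon$ is arbitrary, we have proven that $f_n \rightharpoonup 0$ weakly in $L^p$. On the other hand, $f_n$ does not converge strongly to zero in $L^p$, since $\int_0^\pi |\sin(nx)|^p\, dx$ can be easily proven not to converge to zero as $n \to +\infty$ (exercise).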

(b) $f_n(x) = n^{1/p} g(nx)$ (concentration). Once again, for simplicity let us consider the case $g = 1_{[0,1]}$, so that $f_n = n^{1/p}\, 1_{[0,1/n]}$. For all $f \in L^q(\mathbb{R})$ with $q$ the conjugate exponent of $p$, Hölder's inequality implies

$$\begin{aligned}
\left| \int_{\mathbb{R}} f_n(x) f(x)\, dx \right| &\le n^{1/p} \int_0^{1/n} |f(x)|\, dx \\
&\le n^{1/p} \left( \int_0^{1/n} dx \right)^{1/p} \left( \int_0^{1/n} |f(x)|^q\, dx \right)^{1/q} \\
&= n^{1/p}\, \frac{1}{n^{1/p}} \left( \int_0^{1/n} |f(x)|^q\, dx \right)^{1/q} = \left( \int_0^{1/n} |f(x)|^q\, dx \right)^{1/q}.
\end{aligned}$$

Now, the functions $h_n = |f|^q\, 1_{[0,1/n]}$ satisfy $0 \le h_n \le |f|^q$, where the latter is an integrable function, and $h_n \to 0$ almost everywhere. Therefore, Lebesgue's dominated convergence theorem implies $\left( \int_0^{1/n} |f(x)|^q\, dx \right)^{1/q} \to 0$ as $n \to +\infty$. This shows that $f_n \rightharpoonup 0$ weakly in $L^p(\mathbb{R})$. On the other hand,

$$\|f_n\|_{L^p(\mathbb{R})}^p = \int_0^{1/n} |n^{1/p}|^p\, dx = 1 \quad \text{for all } n,$$

which implies that $f_n$ does not converge to zero in $L^p$.


(c) $f_n(x) = g(x - n)$ (escape to infinity). Using the example $g = 1_{[0,1]}$, it is immediately seen that $f_n$ does not converge to zero in $L^p$. On the other hand, for all $f \in L^q(\mathbb{R})$,

$$\left| \int_{\mathbb{R}} f_n f\, dx \right| = \left| \int_n^{n+1} f\, dx \right| \le \left( \int_n^{n+1} |f(x)|^q\, dx \right)^{1/q},$$

and the last term converges to zero similarly to case (b) above.
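The oscillation and concentration mechanisms of this example can be observed numerically. The sketch below is an illustration only, under assumptions that are not part of the notes: the exponent is fixed to $p = 2$, the test function `phi` (a Gaussian, which belongs to every $L^q(\mathbb{R})$) is a hypothetical choice, and integrals are approximated by a composite midpoint rule. In both cases the pairing with `phi` becomes small while the $p$-th power of the $L^p$ norm does not.

```python
import math


def integrate(h, a, b, m=20000):
    """Composite midpoint rule for the integral of h over [a, b]."""
    dx = (b - a) / m
    return sum(h(a + (i + 0.5) * dx) for i in range(m)) * dx


p = 2.0
phi = lambda x: math.exp(-x * x)  # hypothetical test function


def oscillation(n):
    """f_n = 1_[0,pi] sin(nx): returns (pairing with phi, ||f_n||_p^p)."""
    pair = integrate(lambda x: math.sin(n * x) * phi(x), 0.0, math.pi)
    norm_p = integrate(lambda x: abs(math.sin(n * x)) ** p, 0.0, math.pi)
    return pair, norm_p


def concentration(n):
    """f_n = n^(1/p) 1_[0,1/n]: returns (pairing with phi, ||f_n||_p^p)."""
    c = n ** (1.0 / p)
    pair = integrate(lambda x: c * phi(x), 0.0, 1.0 / n)
    norm_p = integrate(lambda x: c ** p, 0.0, 1.0 / n)
    return pair, norm_p


for n in (10, 40, 160):
    print(n, oscillation(n), concentration(n))
```

A single test function does not prove weak convergence, of course; the sketch only makes the two competing behaviours visible.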

4.8 Exercises
1. Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by

$$f_n(x) = \begin{cases} x - \dfrac{1}{2n} & x \ge 1/n \\[4pt] \dfrac{n x^2}{2} & -1/n \le x \le 1/n \\[4pt] -x - \dfrac{1}{2n} & x \le -1/n. \end{cases}$$

Prove that $f_n$ is continuously differentiable for all $n \in \mathbb{N}$. Find (if it exists) the uniform limit $f$ of $f_n$ on $\mathbb{R}$ as $n \to +\infty$. Is $f$ differentiable?
2. Is the space $C([0,1])$ complete with respect to the $L^p$ norm for $p \in [1, +\infty)$? Justify your answer.
3. Prove that the set

{ f ∈ C ([0, 1]) : f (0) = 0}

is a closed linear subspace of C ([0, 1]).


4. Consider the operator $K : C([0,1]) \to C([0,1])$ defined by

$$K f(x) = \int_0^1 \sin(\pi(x - y))\, f(y)\, dy.$$

(a) Prove that K is a bounded linear operator.


(b) Find the range of K.
5. Let $f(x) = |x|$ be defined for $x \in [-1, 1]$.
• Prove that the function $f$ is an element of the Sobolev space $W^{1,p}$ for all $p \in [1, +\infty]$ (Hint: show that the $W^{1,p}$ norms of $f$ are finite for all $p$).
• For $n \in \mathbb{N}$ consider the sequence

$$f_n(x) = \begin{cases} -x - \dfrac{1}{2n} & \text{if } -1 \le x < -1/n \\[4pt] \dfrac{n x^2}{2} & \text{if } -1/n \le x \le 1/n \\[4pt] x - \dfrac{1}{2n} & \text{if } 1/n < x \le 1. \end{cases}$$

Show that $f_n \to f$ as $n \to +\infty$ in $W^{1,p}$ for all $p \in [1, +\infty)$ but not for $p = +\infty$.

6. Consider the operator $T : L^1([0,1]) \to C([0,1])$

$$(T f)(x) = \int_0^x t\, f(t)\, dt.$$

• Prove that $T$ is a linear operator.

• Prove that $T$ is a bounded operator.

7. Suppose that $k : [0,1] \times [0,1] \to \mathbb{R}$ is a continuous function. Prove that the integral operator $K : C([0,1]) \to C([0,1])$ defined by

$$K f(x) = \int_0^1 k(x, y)\, f(y)\, dy$$

is compact.

8. Let $T : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$ be defined by

$$(T x)_n = \arctan(n)\, x_n, \qquad n \in \mathbb{N}.$$

Show that $T$ is a bounded linear operator and compute the operator norm $\|T\|$.

9. Let $T : \ell^\infty(\mathbb{N}) \to \ell^\infty(\mathbb{N})$ be defined by

$$(T x)_n = \frac{n^2}{1 + n^2}\, (x_n + x_{n+1}), \qquad n \in \mathbb{N}.$$

Show that $T$ is a bounded linear operator and compute the operator norm $\|T\|$.

10. Let $g : \mathbb{R}^d \to [0, +\infty)$ be a measurable nonnegative function, and let $p \in (1, +\infty)$. Consider the operator

$$f \mapsto T f(x) = g(x)\, f(x).$$

Find a condition on $g$ such that the above operator $T$ is a linear and bounded operator from $L^p(\mathbb{R}^d)$ to $L^p(\mathbb{R}^d)$.

11. Consider the operator $A : C^1((0,1)) \to C((0,1))$ defined by

$$(A f)(x) = \frac{d}{dx} f(x).$$

Show that $A$ is linear but not bounded.

12. Let $\ell_c(\mathbb{N})$ be the space of all real-valued sequences of the form $x = (x_1, x_2, \dots, x_n, 0, 0, \dots)$, whose terms vanish from some point onwards.

(a) Prove that $\ell_c(\mathbb{N})$ is an infinite dimensional linear subspace of $\ell^p(\mathbb{N})$ for all $p \in [1, +\infty]$.
(b) Prove that $\ell_c(\mathbb{N})$ is not closed in $\ell^p(\mathbb{N})$ for all $p \in [1, +\infty]$.
(c) Prove that $\ell_c(\mathbb{N})$ is dense in $\ell^p(\mathbb{N})$ for all $p \in [1, +\infty)$.
(d) Prove that the closure of $\ell_c(\mathbb{N})$ in $\ell^\infty(\mathbb{N})$ is the space of all sequences that converge to zero.

13. Let $T : L^2([0,\pi]) \to L^2([0,\pi])$ be defined as

$$(T f)(x) = \int_0^\pi \cos(x + 2y)\, f(y)\, dy.$$

Find the kernel and the range of $T$.

14. Let $T : L^1(\mathbb{R}) \to \mathbb{R}$ be defined as

$$T(f) = \int_{\mathbb{R}} \sin x\, f(x)\, dx.$$

Show that $T$ is a linear and continuous functional on $L^1$ and compute its norm.

15. Let $x_0 \in [0,1]$. Let $T : C([0,1]) \to \mathbb{R}$ be defined as

$$T(f) = f(x_0).$$

Prove that $T$ is a linear and bounded functional and compute its norm.

16. Let $(x_n)$ be the sequence in $\ell^2$ defined by

$$x_{n,k} = \begin{cases} 1 & \text{if } n = k \\ 0 & \text{otherwise.} \end{cases}$$

Prove that $x_n$ converges to zero weakly in $\ell^2$. Does $(x_n)$ converge to zero strongly?

17. Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined as

$$f_n(x) = \sqrt{n}\, 1_{[0,1/n]}(x).$$

Prove that $f_n$ converges weakly to zero in $L^2(\mathbb{R})$.

18. Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by $f_n(x) = n\, 1_{[0,1/n]}(x)$.

(a) Prove that $f_n$ is uniformly bounded in $L^1(\mathbb{R})$.

(b) Is it possible to extract a subsequence of $f_n$ which converges weakly in $L^1$?

5 Hilbert spaces
Hilbert spaces are Banach spaces with a norm that is derived from an inner
product, so they have an extra feature in comparison with arbitrary Banach
spaces, which makes them still more special. We can use the inner product
to introduce the notion of orthogonality in a Hilbert space, and the geometry
of Hilbert spaces is in almost complete agreement with our intuition of linear
spaces with an arbitrary (finite or infinite) number of orthogonal coordinate
axes. By contrast, the geometry of infinite-dimensional Banach spaces can be
surprisingly complicated and quite different from what naive extrapolations
of the finite-dimensional situation would suggest.

5.1 Inner products


To be specific, we consider complex linear spaces throughout this section.
We use a bar to denote the complex conjugate of a complex number. The
corresponding results for real linear spaces are obtained by replacing C by R
and omitting the complex conjugates.

5.1 definition. An inner product on a complex linear space $X$ is a map

$$(\cdot, \cdot) : X \times X \to \mathbb{C}$$

such that, for all $x, y, z \in X$ and $\lambda, \mu \in \mathbb{C}$:

(a) $(x, \lambda y + \mu z) = \lambda (x, y) + \mu (x, z)$ (linear in the second argument);

(b) $(y, x) = \overline{(x, y)}$ (Hermitian symmetric);

(c) $(x, x) \ge 0$ (nonnegative);

(d) $(x, x) = 0$ if and only if $x = 0$ (positive definite).


We call a linear space with an inner product an inner product space or a pre-
Hilbert space.

From (a) and (b) it follows that $(\cdot, \cdot)$ is antilinear, or conjugate linear, in the first argument, meaning that

$$(\lambda x + \mu y, z) = \bar{\lambda}\, (x, z) + \bar{\mu}\, (y, z).$$

If $X$ is real, then $(\cdot, \cdot)$ is bilinear, meaning that it is a linear function of each argument. If $X$ is complex, then $(\cdot, \cdot)$ is said to be sesquilinear.
There are two conventions for the linearity of the inner product. In most of
the mathematically oriented literature (·, ·) is linear in the first component. We
adopt the convention that the inner product is linear in the second component,
which is more common in applied mathematics and physics.
If $X$ is a linear space with an inner product $(\cdot, \cdot)$, then we can define a norm on $X$ by

$$\|x\| = \sqrt{(x, x)}.$$

To see that the above $\|\cdot\|$ is actually a norm, due to the properties (a)-(d) above we only need to prove the triangle inequality. This property follows from the following one.

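5.2 theorem (Cauchy-Schwarz inequality). Let $X$ be an inner product space, and let $x, y \in X$. Then

$$|(x, y)| \le \|x\|\, \|y\|. \tag{31}$$

Proof. By the nonnegativity of the inner product we have

$$0 \le (\lambda x - \mu y, \lambda x - \mu y)$$

for all $x, y \in X$ and $\lambda, \mu \in \mathbb{C}$. Expansion of the inner product implies

$$\bar{\lambda} \mu\, (x, y) + \lambda \bar{\mu}\, (y, x) \le |\lambda|^2 \|x\|^2 + |\mu|^2 \|y\|^2.$$

If $(x, y) = r e^{i\varphi}$, where $r = |(x, y)|$ and $\varphi = \operatorname{Arg}(x, y)$, then we choose

$$\lambda = \|y\|\, e^{i\varphi}, \qquad \mu = \|x\|.$$

It follows that

$$2 \|x\|\, \|y\|\, |(x, y)| \le 2 \|x\|^2 \|y\|^2,$$

which proves the result after dividing by $2\|x\|\,\|y\|$ when $x$ and $y$ are both nonzero; if $x = 0$ or $y = 0$, inequality (31) is trivial.

As a consequence of (31), given $x, y \in X$ we have

$$\|x + y\|^2 = (x + y, x + y) = \|x\|^2 + \|y\|^2 + (x, y) + (y, x) \le \|x\|^2 + \|y\|^2 + 2 \|x\|\, \|y\| = (\|x\| + \|y\|)^2,$$

and this proves the triangle inequality.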

5.3 definition. A Hilbert space is a complete inner product space.

5.4 example. The standard inner product on $\mathbb{C}^n$ is given by

$$(x, y) = \sum_{j=1}^{n} \overline{x_j}\, y_j,$$

where $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$, with $x_j, y_j \in \mathbb{C}$. This space is complete, and therefore it is a finite-dimensional Hilbert space.
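The convention adopted in these notes (linear in the second argument, conjugate linear in the first) is easy to get wrong in computations, so a small numerical check may help. The following Python sketch is an illustration only: the vectors `x`, `y` and the scalar `lam` are arbitrary sample data, and the helper names `inner` and `norm` are introduced here for illustration.

```python
import math


def inner(x, y):
    """Standard inner product on C^n, conjugate linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)


x = [1 + 2j, -1j, 3.0 + 0j]
y = [2 - 1j, 4 + 0j, 1j]
lam = 0.5 - 2j

scaled_x = [lam * a for a in x]
scaled_y = [lam * b for b in y]

# linearity in the second argument, conjugate linearity in the first
print(inner(x, scaled_y), lam * inner(x, y))
print(inner(scaled_x, y), lam.conjugate() * inner(x, y))
# Hermitian symmetry and the Cauchy-Schwarz inequality (31)
print(inner(y, x), inner(x, y).conjugate())
print(abs(inner(x, y)), norm(x) * norm(y))
```

Each printed pair should agree (up to rounding) in the first three lines, while in the last line the first number should not exceed the second.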

5.5 example. Let $C([a, b])$ denote the space of all complex-valued continuous functions defined on the interval $[a, b]$. We define an inner product on $C([a, b])$ by

$$(f, g) = \int_a^b \overline{f(x)}\, g(x)\, dx,$$

where $f, g : [a, b] \to \mathbb{C}$ are continuous functions. This space is not complete, so it is not a Hilbert space.

5.6 example. Let $\Omega \subset \mathbb{R}^d$ be an open set. Given $f, g \in L^2(\Omega)$ with complex values, i.e. $f, g : \Omega \to \mathbb{C}$, we define as in the previous example

$$(f, g) = \int_\Omega \overline{f(x)}\, g(x)\, dx. \tag{32}$$

Then, it is easily seen that

$$(f, f) = \|f\|_{L^2(\Omega)}^2,$$

which proves that the $L^2$ norm is induced by the inner product (32)¹¹. Since $L^2$ is a complete space, we have just proven that $L^2$ is a Hilbert space. $L^2$ is the only $L^p$ space to be a Hilbert space.

5.7 example. We define the Hilbert space $\ell^2(\mathbb{Z})$ of bi-infinite complex sequences by

$$\ell^2(\mathbb{Z}) = \left\{ (z_n)_{n=-\infty}^{+\infty} : \sum_{n=-\infty}^{+\infty} |z_n|^2 < +\infty \right\}.$$

The space $\ell^2(\mathbb{Z})$ is a complex linear space, with the obvious operations of addition and multiplication by a scalar. An inner product on it is given by

$$(x, y) = \sum_{n=-\infty}^{+\infty} \overline{x_n}\, y_n.$$

The space $\ell^2(\mathbb{N})$ of square-summable sequences $(z_n)_{n=1}^{+\infty}$ is defined in an analogous way. The fact that these spaces are complete follows from the completeness of the $\ell^p(\mathbb{N})$ spaces proven earlier in this course.

5.8 theorem (Parallelogram law). On an inner product space $X$ we have

$$\|x + y\|^2 + \|x - y\|^2 = 2 \|x\|^2 + 2 \|y\|^2$$

for all $x, y \in X$.

¹¹ The $L^p$ spaces defined in Section 3.4 consist of functions with real values, but the whole $L^p$ theory can be extended to complex-valued functions.

Proof. We compute

$$\begin{aligned}
\|x + y\|^2 + \|x - y\|^2 &= 2 \|x\|^2 + 2 \|y\|^2 + (x, y) + (y, x) - (x, y) - (y, x) \\
&= 2 \|x\|^2 + 2 \|y\|^2.
\end{aligned}$$
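The parallelogram law gives a quick necessary test: a norm that violates it cannot be induced by any inner product. The Python sketch below is an illustration only (the two vectors and the helper names `p_norm` and `parallelogram_defect` are chosen here for illustration); it evaluates the defect of the law for the $p$-norm on $\mathbb{R}^2$, which is consistent with the remark in Example 5.6 that only $p = 2$ yields a Hilbert space.

```python
def p_norm(x, p):
    """The l^p norm of a finite vector."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)


def parallelogram_defect(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 computed in the p-norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * p_norm(x, p) ** 2 - 2 * p_norm(y, p) ** 2)


x, y = [1.0, 0.0], [0.0, 1.0]
for p in (1, 2, 4):
    # the defect vanishes (up to rounding) only for p = 2
    print(p, parallelogram_defect(x, y, p))
```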

5.9 exercise. Use the Cauchy-Schwarz inequality to prove that the inner product is a continuous function on an inner product space with respect to both components.

5.2 Orthogonality

Let $H$ be a Hilbert space. We denote its inner product by $\langle \cdot, \cdot \rangle$, which is another common notation for inner products that is often reserved for Hilbert spaces.
The inner product structure of a Hilbert space allows us to introduce the concept of orthogonality, which makes it possible to visualize vectors and linear subspaces of a Hilbert space in a geometric way.

5.10 definition. If $x, y$ are vectors in a Hilbert space $H$, then we say that $x$ and $y$ are orthogonal, written $x \perp y$, if $\langle x, y \rangle = 0$. We say that subsets $A$ and $B$ are orthogonal, written $A \perp B$, if $x \perp y$ for every $x \in A$ and $y \in B$. The orthogonal complement $A^\perp$ of a subset $A$ is the set of vectors orthogonal to $A$,

$$A^\perp = \{ x \in H : x \perp y \ \text{for all } y \in A \}.$$

5.11 theorem. The orthogonal complement of a subset of a Hilbert space is a closed


linear subspace.

Proof. Let $H$ be a Hilbert space and $A$ a subset of $H$. If $y, z \in A^\perp$ and $\lambda, \mu \in \mathbb{C}$, then the linearity of the inner product implies that

$$\langle x, \lambda y + \mu z \rangle = \lambda \langle x, y \rangle + \mu \langle x, z \rangle = 0$$

for all $x \in A$. Therefore, $\lambda y + \mu z \in A^\perp$, so $A^\perp$ is a linear subspace.
To show that $A^\perp$ is closed, we show that if $(y_n)$ is a convergent sequence in $A^\perp$, then the limit $y$ also belongs to $A^\perp$. Let $x \in A$. From the continuity of the inner product we have

$$\langle x, y \rangle = \langle x, \lim_{n \to +\infty} y_n \rangle = \lim_{n \to +\infty} \langle x, y_n \rangle = 0,$$

since $\langle x, y_n \rangle = 0$ for every $x \in A$ and $y_n \in A^\perp$. Hence, $y \in A^\perp$.

The following theorem expresses one of the fundamental geometrical properties of Hilbert spaces. While the result may appear obvious, the proof is not trivial.

5.12 theorem (Orthogonal Projection). Let $M$ be a closed linear subspace of a Hilbert space $H$.

(a) For each $x \in H$ there is a unique closest point $y \in M$ such that

$$\|x - y\| = \min_{z \in M} \|x - z\|.$$

(b) The point $y \in M$ closest to $x \in H$ is the unique element of $M$ with the property that $(x - y) \perp M$.

Proof. Let $d$ be the distance of $x$ from $M$,

$$d = \inf \{ \|x - y\| : y \in M \}.$$

First, we prove that there is a closest point $y \in M$ at which this infimum is attained, meaning that $\|x - y\| = d$. From the definition of $d$, there is a sequence of elements $y_n \in M$ such that

$$\lim_{n \to +\infty} \|x - y_n\| = d.$$

Thus, for all $\varepsilon > 0$, there is an $N$ such that

$$\|x - y_n\| \le d + \varepsilon \quad \text{when } n \ge N.$$

We show that the sequence $(y_n)$ is Cauchy. From the parallelogram law, we have

$$\|y_m - y_n\|^2 + \|2x - y_m - y_n\|^2 = 2 \|x - y_m\|^2 + 2 \|x - y_n\|^2.$$

Since $(y_n + y_m)/2 \in M$, the definition of $d$ implies

$$\|x - (y_m + y_n)/2\| \ge d.$$

Hence, for all $m, n \ge N$, we get

$$\begin{aligned}
\|y_m - y_n\|^2 &= 2 \|x - y_m\|^2 + 2 \|x - y_n\|^2 - \|2x - y_m - y_n\|^2 \\
&\le 4 (d + \varepsilon)^2 - 4 d^2 = 4 \varepsilon (2d + \varepsilon).
\end{aligned}$$

Therefore, $(y_n)$ is Cauchy. Since a Hilbert space is complete, there is a $y$ such that $y_n \to y$, and since $M$ is closed, we have $y \in M$. The norm is continuous, so $\|x - y\| = \lim_{n \to +\infty} \|x - y_n\| = d$.
Second, we prove the uniqueness of a vector $y \in M$ that minimizes $\|x - y\|$. Suppose $y$ and $y'$ both minimize the distance to $x$, meaning that

$$\|x - y\| = \|x - y'\| = d.$$

Then the parallelogram law implies that

$$2 \|x - y\|^2 + 2 \|x - y'\|^2 = \|2x - y - y'\|^2 + \|y - y'\|^2.$$

Hence, since $(y + y')/2 \in M$,

$$\|y - y'\|^2 = 4 d^2 - 4 \|x - (y + y')/2\|^2 \le 0,$$

therefore $\|y - y'\| = 0$, so that $y = y'$.
Third, we show that the unique $y \in M$ found above satisfies the condition that the 'error' vector $x - y$ is orthogonal to $M$. Since $y$ minimizes the distance to $x$, we have for every $\lambda \in \mathbb{C}$ and $z \in M$ that

$$\|x - y\|^2 \le \|x - y - \lambda z\|^2.$$

Expanding the right-hand side of this inequality, we obtain that

$$2 \operatorname{Re} \big( \lambda \langle x - y, z \rangle \big) \le |\lambda|^2 \|z\|^2.$$

Suppose that $\langle x - y, z \rangle = |\langle x - y, z \rangle|\, e^{i\varphi}$. Choosing $\lambda = \varepsilon e^{-i\varphi}$, where $\varepsilon > 0$, and dividing by $\varepsilon$, we get

$$2 |\langle x - y, z \rangle| \le \varepsilon \|z\|^2.$$

Taking the limit as $\varepsilon \to 0^+$, we find that $\langle x - y, z \rangle = 0$, so $(x - y) \perp M$.
Finally, we show that $y$ is the only element in $M$ such that $x - y \perp M$. Suppose that $y'$ is another such element in $M$. Then $y - y' \in M$, and, for any $z \in M$ we have

$$\langle z, y - y' \rangle = \langle z, x - y' \rangle - \langle z, x - y \rangle = 0.$$

In particular, we may take $z = y - y'$, and therefore we must have $y = y'$.

The point $y \in M$ above is called the orthogonal projection of $x$ onto $M$.
The proof of part (a) applies if $M$ is any closed convex subset of $H$ (exercise). Theorem 5.12 can also be stated in terms of a decomposition of $H$ into an orthogonal direct sum of closed subspaces.

5.13 definition. If $M$ and $N$ are orthogonal closed linear subspaces of a Hilbert space $H$, then we define the orthogonal direct sum, or simply direct sum, $M \oplus N$ of $M$ and $N$ by

$$M \oplus N = \{ y + z : y \in M \ \text{and } z \in N \}.$$

Theorem 5.12 states that if $M$ is a closed subspace, then any $x \in H$ may be uniquely represented as $x = y + z$, where $y \in M$ is the best approximation to $x$ and $z \perp M$. We therefore have the following corollary.

5.14 corollary. If $M$ is a closed linear subspace of a Hilbert space $H$, then $H = M \oplus M^\perp$.

Thus, every closed linear subspace $M$ of a Hilbert space has a closed complementary subspace $M^\perp$. If $M$ is a linear subspace that is not closed, then we may still decompose $H$ as $H = \overline{M} \oplus M^\perp$, where $\overline{M}$ denotes the closure of $M$.


5.3 Orthonormal bases


A subset $U$ of nonzero vectors in a Hilbert space $H$ is orthogonal if any two distinct elements in $U$ are orthogonal. A set of vectors $U$ is orthonormal if it is orthogonal and $\|u\| = 1$ for all $u \in U$, in which case the vectors $u$ are said to be normalized. An orthonormal basis of a Hilbert space is an orthonormal set such that every vector in the space can be expanded in terms of the basis, in a way that we make precise below.
In this section we show that every Hilbert space has an orthonormal basis, which may be finite, countably infinite, or uncountable. Two Hilbert spaces whose orthonormal bases have the same cardinality are isomorphic, but many different concrete realizations of a given abstract Hilbert space arise in applications. The most important case in practice is that of a separable Hilbert space, which has a finite or countably infinite orthonormal basis. As shown below, this condition is equivalent to the separability of the Hilbert space as a metric space, meaning that it contains a countable dense subset.

5.15 example. A set of vectors $\{e_1, \dots, e_n\}$ is an orthonormal basis of the finite-dimensional Hilbert space $\mathbb{C}^n$ if:
(a) $\langle e_j, e_k \rangle = \delta_{jk}$ for $1 \le j, k \le n$;
(b) for all $x \in \mathbb{C}^n$ there are unique coordinates $x_k \in \mathbb{C}$ such that

$$x = \sum_{k=1}^{n} x_k e_k,$$

where $\delta_{jk}$ is the Kronecker delta symbol

$$\delta_{jk} = \begin{cases} 1 & \text{if } j = k \\ 0 & \text{if } j \ne k. \end{cases}$$

The orthonormality of the basis implies that $x_k = \langle e_k, x \rangle$. For example, the standard orthonormal basis of $\mathbb{C}^n$ consists of the vectors

$$e_1 = (1, 0, \dots, 0), \quad e_2 = (0, 1, 0, \dots, 0), \quad \dots, \quad e_n = (0, 0, \dots, 1).$$

5.16 example. Consider the Hilbert space $\ell^2(\mathbb{Z})$ defined in Example 5.7. An orthonormal basis of $\ell^2(\mathbb{Z})$ is the set of coordinate basis vectors $\{e_n : n \in \mathbb{Z}\}$ given by

$$e_n = (\delta_{kn})_{k=-\infty}^{+\infty}.$$

5.17 example. The set of functions $\{e_n : n \in \mathbb{Z}\}$, given by

$$e_n(x) = \frac{1}{\sqrt{2\pi}}\, e^{inx},$$

is an orthonormal basis of the space $L^2(\mathbb{T})$ of $2\pi$-periodic functions, called the Fourier basis. We will study it in detail below. As we will see, the map $\mathcal{F}^{-1} : \ell^2(\mathbb{Z}) \to L^2(\mathbb{T})$ defined by

$$\mathcal{F}^{-1}((c_k)_k) = \frac{1}{\sqrt{2\pi}} \sum_{k=-\infty}^{+\infty} c_k\, e^{ikx}$$

is a Hilbert space isomorphism between $\ell^2(\mathbb{Z})$ and $L^2(\mathbb{T})$. Such a map is called the inverse Fourier transform. Both Hilbert spaces are separable with a countably infinite basis.

5.18 theorem (Bessel's inequality). Let $U = \{u_n : n \in \mathbb{N}\}$ be an orthonormal sequence in a Hilbert space $H$ and $x \in H$. Then,

(a) $\sum_{n \in \mathbb{N}} |\langle u_n, x \rangle|^2 \le \|x\|^2$;

(b) $x_U := \sum_{n=1}^{+\infty} \langle u_n, x \rangle u_n$ is a convergent sum;

(c) $x - x_U \in U^\perp$.

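Proof. We begin by computing $\left\| x - \sum_{n=1}^{N} \langle u_n, x \rangle u_n \right\|^2$ for any $N \in \mathbb{N}$:

$$\begin{aligned}
\left\| x - \sum_{n=1}^{N} \langle u_n, x \rangle u_n \right\|^2 &= \left\langle x - \sum_{n=1}^{N} \langle u_n, x \rangle u_n,\ x - \sum_{m=1}^{N} \langle u_m, x \rangle u_m \right\rangle \\
&= \langle x, x \rangle - \sum_{m=1}^{N} \langle u_m, x \rangle \langle x, u_m \rangle - \sum_{n=1}^{N} \overline{\langle u_n, x \rangle} \langle u_n, x \rangle + \sum_{n,m=1}^{N} \overline{\langle u_n, x \rangle} \langle u_m, x \rangle \langle u_n, u_m \rangle \\
&= \|x\|^2 - \sum_{n=1}^{N} |\langle u_n, x \rangle|^2.
\end{aligned}$$

Hence,

$$\sum_{n=1}^{N} |\langle u_n, x \rangle|^2 = \|x\|^2 - \left\| x - \sum_{n=1}^{N} \langle u_n, x \rangle u_n \right\|^2 \le \|x\|^2.$$

Since $\sum_{n=1}^{N} |\langle u_n, x \rangle|^2$ is a sum of nonnegative numbers that is bounded from above by $\|x\|^2$, it is the partial sum of a convergent series. Therefore the sum converges and satisfies (a). The convergence claimed in (b) follows from the fact that, for given $N, M \in \mathbb{N}$ with $N > M$, one has

$$\left\| \sum_{n=M+1}^{N} \langle u_n, x \rangle u_n \right\|^2 = \sum_{n,m=M+1}^{N} \overline{\langle u_n, x \rangle} \langle u_m, x \rangle \langle u_n, u_m \rangle = \sum_{n=M+1}^{N} |\langle u_n, x \rangle|^2.$$

Now, since the right-hand side goes to zero as $N, M \to +\infty$ because of (a), the partial sums of the series in (b) form a Cauchy sequence, which converges because $H$ is complete. In order to prove (c), we consider any $u_{k_0} \in U$. Using the orthonormality of $U$ and the continuity of the inner product, we find that

$$\left\langle x - \sum_{n=1}^{+\infty} \langle u_n, x \rangle u_n,\ u_{k_0} \right\rangle = \langle x, u_{k_0} \rangle - \sum_{n=1}^{+\infty} \overline{\langle u_n, x \rangle} \langle u_n, u_{k_0} \rangle = \langle x, u_{k_0} \rangle - \langle x, u_{k_0} \rangle = 0.$$

Hence, $x - \sum_{n=1}^{+\infty} \langle u_n, x \rangle u_n \in U^\perp$.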
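The three statements of Theorem 5.18 can be checked in a finite-dimensional toy case. The Python sketch below is an illustration only: the orthonormal pair `U` in $\mathbb{C}^3$ and the vector `x` are arbitrary sample data, and the helper name `inner` is introduced here for illustration. It computes the coefficients $\langle u_n, x \rangle$, the vector $x_U$ and the residual $x - x_U$, then compares $\sum |\langle u_n, x \rangle|^2$ with $\|x\|^2$.

```python
import math


def inner(x, y):
    """Inner product on C^n, linear in the second argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


s = 1 / math.sqrt(2)
U = [[1 + 0j, 0j, 0j], [0j, s + 0j, s * 1j]]  # an orthonormal pair in C^3
x = [2 - 1j, 1 + 1j, 3 + 0j]

coeffs = [inner(u, x) for u in U]             # the coefficients <u_n, x>
x_U = [sum(c * u[k] for c, u in zip(coeffs, U)) for k in range(3)]
residual = [a - b for a, b in zip(x, x_U)]    # x - x_U

bessel_lhs = sum(abs(c) ** 2 for c in coeffs)
norm_sq = inner(x, x).real

print(bessel_lhs, norm_sq)                    # Bessel: first value <= second
print([abs(inner(u, residual)) for u in U])   # residual is orthogonal to U
```

The gap between the two numbers in the first printed line is exactly $\|x - x_U\|^2$, in agreement with the first computation of the proof.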

Given a subset $U \subset H$, we define the closed linear span $[U]$ of $U$ by

$$[U] = \left\{ \sum_{u \in U} c_u u : c_u \in \mathbb{C} \ \text{and } \sum_{u \in U} c_u u \ \text{converges} \right\}.$$

Equivalently, $[U]$ is the smallest closed linear subspace that contains $U$. We leave the proof of the following lemma to the student.

5.19 lemma. If $U = \{u_n : n \in \mathbb{N}\}$ is an orthonormal set in a Hilbert space $H$, then

$$[U] = \left\{ \sum_{n=1}^{+\infty} c_n u_n : c_n \in \mathbb{C} \ \text{and } \sum_{n=1}^{+\infty} |c_n|^2 < +\infty \right\}.$$

By combining Theorem 5.12 and Theorem 5.18 we see that $x_U$, defined in part (b) of Theorem 5.18, is the unique element of $[U]$ satisfying

$$\|x - x_U\| = \min_{u \in [U]} \|x - u\|.$$

In particular, if $[U] = H$, then $x_U = x$, and every $x \in H$ may be expanded in terms of elements of $U$. The following theorem gives equivalent conditions for this property of $U$, called completeness.

5.20 theorem. If $U = \{u_n : n \in \mathbb{N}\}$ is an orthonormal sequence of a Hilbert space $H$, then the following conditions are equivalent:

(a) $\langle u_n, x \rangle = 0$ for all $n \in \mathbb{N}$ implies $x = 0$;

(b) $x = \sum_{n \in \mathbb{N}} \langle u_n, x \rangle u_n$ for all $x \in H$;

(c) $\|x\|^2 = \sum_{n \in \mathbb{N}} |\langle u_n, x \rangle|^2$ for all $x \in H$;

(d) $[U] = H$.

Proof. Omitted.

In view of this theorem, we may now introduce the following definition.

5.21 definition. An orthonormal set $U = \{u_n : n \in \mathbb{N}\}$ of a Hilbert space $H$ is complete if it satisfies any of the equivalent conditions (a)-(d) in Theorem 5.20. A complete orthonormal subset of $H$ is also called an orthonormal basis of $H$.

Condition (a) is often the easiest to verify. Condition (b) is the property that is used most often. Condition (c) is a special case of the so-called Parseval's identity (see below). Condition (d) simply expresses completeness of the basis.
The following identity shows that a Hilbert space $H$ with orthonormal basis $\{u_n : n \in \mathbb{N}\}$ is isomorphic to the sequence space $\ell^2(\mathbb{N})$. The proof is omitted.

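5.22 theorem (Parseval's identity). Suppose that $U = \{u_n : n \in \mathbb{N}\}$ is an orthonormal basis of a Hilbert space $H$. If $x = \sum_{n \in \mathbb{N}} a_n u_n$ and $y = \sum_{n \in \mathbb{N}} b_n u_n$, where $a_n = \langle u_n, x \rangle$ and $b_n = \langle u_n, y \rangle$, then

$$\langle x, y \rangle = \sum_{n \in \mathbb{N}} \overline{a_n}\, b_n.$$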

Orthonormal bases play an essential role in Hilbert spaces. It can be proven (but we shall omit the proof) that an arbitrary Hilbert space can be equipped with an orthonormal basis. In general, this basis may be countable or uncountable. One can prove that a separable Hilbert space always has a countable orthonormal basis. The procedure used to prove this is the so-called Gram-Schmidt orthonormalization procedure, i.e. an algorithm for the construction of an orthonormal basis from any countable linearly independent set whose linear span is dense in $H$. We omit the details and state this important assertion as a theorem. In fact, the existence of a countable orthonormal basis is equivalent to the separability of the Hilbert space.
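Although the details are omitted in these notes, the Gram-Schmidt idea is short enough to sketch in finite dimensions. The Python code below is a minimal illustration under stated assumptions, not the construction used in the proof: it works on a finite list of vectors in $\mathbb{R}^n$ or $\mathbb{C}^n$, it subtracts projections one at a time (the variant usually called modified Gram-Schmidt), and it discards vectors that are numerically dependent on the previous ones using a hypothetical tolerance `tol`. The names `inner` and `gram_schmidt` are introduced here for illustration.

```python
import math


def inner(x, y):
    """Inner product, linear in the second argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a finite list of vectors, skipping dependent ones."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = inner(u, w)                        # coefficient <u, w>
            w = [a - c * b for a, b in zip(w, u)]  # remove the component along u
        n = math.sqrt(inner(w, w).real)
        if n > tol:                                # keep only genuinely new directions
            basis.append([a / n for a in w])
    return basis


# the third vector is the sum of the first two, so it is discarded
vs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0], [0.0, 1.0, 1.0]]
basis = gram_schmidt(vs)
print(len(basis))
print([[round(abs(inner(u, v)), 12) for v in basis] for u in basis])
```

The second printed line is the Gram matrix of the output, which should be the identity up to rounding.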

5.23 theorem. A Hilbert space is separable if and only if it has a countable orthonor-
mal basis.

What makes Hilbert spaces so powerful in many applications is the possi-


bility of expressing a problem in terms of a suitable orthonormal basis. Here
we consider Fourier series, which corresponds to the expansion of periodic
functions with respect to an orthonormal basis of trigonometric functions.
A function $f : \mathbb{R} \to \mathbb{C}$ is $2\pi$-periodic if

$$f(x + 2\pi) = f(x) \quad \text{for all } x \in \mathbb{R}.$$

The choice of $2\pi$ for the period is simply for convenience; different periods may be reduced to this case by rescaling the independent variable. A $2\pi$-periodic function on $\mathbb{R}$ may be identified with a function on the circle, or one-dimensional torus, $\mathbb{T} = \mathbb{R}/(2\pi\mathbb{Z})$, which is obtained by identifying points in $\mathbb{R}$ that differ by $2\pi n$ for some $n \in \mathbb{Z}$. We could instead represent a $2\pi$-periodic function $f : \mathbb{R} \to \mathbb{C}$ by a function on a closed interval $f : [a, a + 2\pi] \to \mathbb{C}$ such that $f(a) = f(a + 2\pi)$, but the choice of $a$ here is arbitrary, and it is clearer to think of the function as defined on the circle, rather than on an interval.
The space $C(\mathbb{T})$ is the space of continuous functions from $\mathbb{T}$ to $\mathbb{C}$, and $L^2(\mathbb{T})$ is the closure of $C(\mathbb{T})$ with respect to the $L^2$ norm

$$\|f\|_{L^2} = \left( \int_{\mathbb{T}} |f(x)|^2\, dx \right)^{1/2}.$$

Here, the integral over $\mathbb{T}$ is an integral with respect to $x$ taken over any interval of length $2\pi$. We recall that $L^2(\mathbb{T})$ is a Hilbert space with the inner product

$$\langle f, g \rangle = \int_{\mathbb{T}} \overline{f(x)}\, g(x)\, dx.$$

The Fourier basis elements are the functions

$$e_n(x) = \frac{1}{\sqrt{2\pi}}\, e^{inx}.$$

We recall by Euler's formula that

$$e^{inx} = \cos(nx) + i \sin(nx).$$

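Our first objective is to prove that $\{e_n : n \in \mathbb{Z}\}$ is an orthonormal basis of $L^2(\mathbb{T})$. The orthonormality of the functions $e_n$ is a simple computation:

$$\begin{aligned}
\langle e_m, e_n \rangle &= \int_{\mathbb{T}} \overline{\frac{1}{\sqrt{2\pi}}\, e^{imx}}\ \frac{1}{\sqrt{2\pi}}\, e^{inx}\, dx \\
&= \frac{1}{2\pi} \int_0^{2\pi} e^{i(n - m)x}\, dx \\
&= \begin{cases} 1 & \text{if } m = n, \\ 0 & \text{if } m \ne n. \end{cases}
\end{aligned}$$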

Thus, the main result that we have to prove is the completeness of $\{e_n : n \in \mathbb{Z}\}$. We denote the set of all finite linear combinations of the $e_n$ by $\mathcal{P}$. Functions in $\mathcal{P}$ are called trigonometric polynomials. We will prove that any continuous function on $\mathbb{T}$ can be approximated uniformly by trigonometric polynomials. Since uniform convergence on $\mathbb{T}$ implies $L^2$-convergence, and continuous functions are dense in $L^2(\mathbb{T})$, it follows that the trigonometric polynomials are dense in $L^2(\mathbb{T})$, so $\{e_n\}$ is a basis.

5.24 theorem. The trigonometric polynomials are dense in C (T) with respect to the
uniform norm.

Proof. Omitted

The result in Theorem 5.24 and the density of continuous functions in $L^2$ imply that trigonometric polynomials are dense in $L^2$. Hence, the closed linear span of the set $U = \{e_n : n \in \mathbb{Z}\}$ is the whole Hilbert space $L^2(\mathbb{T})$. Due to Theorem 5.20, this implies that $U$ is an orthonormal basis for $L^2(\mathbb{T})$. This means that any function $f \in L^2(\mathbb{T})$ can be expanded in a Fourier series as

$$f(x) = \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{+\infty} \hat{f}_n\, e^{inx}, \tag{33}$$

where

$$\hat{f}_n = \langle e_n, f \rangle = \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} f(x)\, e^{-inx}\, dx.$$

The identity (33) means convergence of the partial sums of the series to $f$ in the $L^2$-norm, i.e.

$$\lim_{N \to +\infty} \int_{\mathbb{T}} \left| \frac{1}{\sqrt{2\pi}} \sum_{n=-N}^{N} \hat{f}_n\, e^{inx} - f(x) \right|^2 dx = 0.$$

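Moreover, Parseval's identity implies that

$$\int_{\mathbb{T}} \overline{f(x)}\, g(x)\, dx = \sum_{n=-\infty}^{+\infty} \overline{\hat{f}_n}\, \hat{g}_n.$$

In particular, the $L^2$-norm of an $L^2(\mathbb{T})$-function can be computed either in terms of the function or its Fourier coefficients, since

$$\int_{\mathbb{T}} |f(x)|^2\, dx = \sum_{n=-\infty}^{+\infty} |\hat{f}_n|^2.$$

Thus, the periodic Fourier transform $\mathcal{F} : L^2(\mathbb{T}) \to \ell^2(\mathbb{Z})$ that maps a function to its sequence of Fourier coefficients, by

$$\mathcal{F} f = \big( \hat{f}_n \big)_{n=-\infty}^{+\infty},$$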

is a Hilbert space isomorphism between L2 (T) and `2 (Z). The projection the-
orem, Theorem 5.12, implies that the partial sum

N
1
FN ( x ) = √

∑ fˆn einx
n=− N

is the best approximation of f by a trigonometric polynomial of degree N in


the sense of the L2 -norm.
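As an illustration of the norm identity above, the following sketch computes the Fourier coefficients of a sample trigonometric polynomial by a Riemann sum and compares the two sides. The test function, the sample count `SAMPLES` and the range of frequencies are assumptions made only for this example.

```python
import cmath
import math

SAMPLES = 1024
H = 2 * math.pi / SAMPLES

def f(x):
    """Sample function in L^2(T); only frequencies 2 and 4 are present."""
    return 3 * math.sin(4 * x) - 7 * math.cos(2 * x)

def coefficient(n):
    """hat f_n = <e_n, f> = (1/sqrt(2 pi)) * integral of f(x) exp(-i n x) dx."""
    total = sum(f(k * H) * cmath.exp(-1j * n * k * H) for k in range(SAMPLES))
    return total * H / math.sqrt(2 * math.pi)

# Left-hand side: integral of |f|^2 over one period.
norm_sq = sum(abs(f(k * H)) ** 2 for k in range(SAMPLES)) * H

# Right-hand side: sum of |hat f_n|^2 over a range covering all nonzero modes.
coeff_sq = sum(abs(coefficient(n)) ** 2 for n in range(-8, 9))

print(norm_sq, coeff_sq)   # both close to 58 * pi
```

For this particular function the exact value is $9\pi + 49\pi = 58\pi$, since only the coefficients with $n = \pm 2$ and $n = \pm 4$ are nonzero.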
We conclude this section with some examples of orthogonal projections on
closed subspaces M of a Hilbert space, that is maps P : H → M provided by
the projection theorem.

5.25 example. The space $L^2(\mathbb{R})$ is the orthogonal direct sum of the space $M$ of
even functions and the space $N$ of odd functions. The orthogonal projections
$P$ and $Q$ of $L^2(\mathbb{R})$ onto $M$ and $N$, respectively, are given by

\[ P f(x) = \frac{f(x) + f(-x)}{2}, \qquad Q f(x) = \frac{f(x) - f(-x)}{2}. \]

Note that $I - P = Q$.
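A small numerical sketch of the even and odd projections follows. It assumes real-valued functions, takes the even part as $(f(x) + f(-x))/2$ and the odd part as $(f(x) - f(-x))/2$, and approximates the integral over $\mathbb{R}$ by a midpoint rule on a large symmetric interval; the names `even_part`, `odd_part` and `inner` are hypothetical.

```python
import math

def even_part(f):
    """P f(x) = (f(x) + f(-x)) / 2."""
    return lambda x: (f(x) + f(-x)) / 2

def odd_part(f):
    """Q f(x) = (f(x) - f(-x)) / 2."""
    return lambda x: (f(x) - f(-x)) / 2

def inner(f, g, radius=10.0, samples=4000):
    """Midpoint-rule approximation of the integral of f * g over
    [-radius, radius]; the grid is symmetric about the origin."""
    h = 2 * radius / samples
    points = [-radius + (k + 0.5) * h for k in range(samples)]
    return sum(f(x) * g(x) for x in points) * h

f = lambda x: math.exp(-(x - 1) ** 2)   # a sample real-valued L^2 function
Pf, Qf = even_part(f), odd_part(f)

print(inner(Pf, Qf))                 # close to 0: the two parts are orthogonal
print(Pf(0.3) + Qf(0.3) - f(0.3))    # close to 0: P + Q = I
```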

5.26 example. Suppose that $A$ is a measurable subset of $\mathbb{R}$ (for example, an
interval) with characteristic function

\[ \mathbf{1}_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases} \]

Then

\[ P_A f(x) = \mathbf{1}_A(x) f(x) \]

is an orthogonal projection of $L^2(\mathbb{R})$ onto the subspace of functions with
support contained in $A$.

A frequently encountered case is that of projections onto a one-dimensional


subspace of a Hilbert space H. For any vector u ∈ H with kuk = 1, the map
Pu defined by

\[ P_u x = \langle u, x \rangle u \]

projects a vector orthogonally onto its component in the direction u.

5.27 example. If $H = \mathbb{R}^n$, the orthogonal projection $P_u$ in the direction of a
unit vector $u$ has the rank one matrix $u u^T$. The component of a vector $x$ in the
direction $u$ is $P_u x = (u^T x) u$.
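The rank one projection can be tried on a concrete vector. The sketch below uses plain Python lists; the function name `project` and the sample vectors are chosen only for illustration.

```python
def project(u, x):
    """Return P_u x = (u^T x) u for a unit vector u in R^n."""
    coeff = sum(ui * xi for ui, xi in zip(u, x))
    return [coeff * ui for ui in u]

u = [3 / 5, 4 / 5]     # unit vector: (3/5)^2 + (4/5)^2 = 1
x = [2.0, 1.0]

p = project(u, x)                                # component of x along u
residual = [xi - pi for xi, pi in zip(x, p)]     # x - P_u x

# The residual is orthogonal to u, and projecting twice changes nothing.
print(sum(ri * ui for ri, ui in zip(residual, u)))
print(project(u, p), p)
```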


5.28 example. If $H = \ell^2(\mathbb{Z})$, and $u = e_n$, where $e_n = (\delta_{k,n})_{k=-\infty}^{+\infty}$, and $x = (x_k)$,
then $P_{e_n} x = x_n e_n$.

5.29 example. If $H = L^2(\mathbb{T})$ is the space of $2\pi$-periodic functions and $u = 1/\sqrt{2\pi}$
is the constant function with norm one, then the orthogonal projection
$P_u$ maps a function to its mean: $P_u f = \langle f \rangle$, where

\[ \langle f \rangle = \frac{1}{2\pi} \int_0^{2\pi} f(x)\, dx. \]

The corresponding orthogonal decomposition,

\[ f(x) = \langle f \rangle + \tilde{f}(x), \]

decomposes a function into a constant mean part $\langle f \rangle$ and a fluctuating part $\tilde{f}$
with zero mean.


5.4 The dual of a Hilbert space

A linear functional on a complex Hilbert space H is a linear map from H to C.


A linear functional ϕ is bounded, or continuous, if there exists a constant M
such that

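\[ |\varphi(x)| \le M \|x\| \quad \text{for all } x \in H. \tag{34} \]

The norm of a bounded linear functional $\varphi$ is

\[ \|\varphi\| = \sup_{\|x\| \le 1} |\varphi(x)|. \tag{35} \]

If $y \in H$, then

\[ \varphi_y(x) = \langle y, x \rangle \tag{36} \]

is a bounded linear functional on $H$, with $\|\varphi_y\| = \|y\|$.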

5.30 example. Suppose that $H = L^2(\mathbb{T})$. Then, for each $n \in \mathbb{Z}$, the functional
$\varphi_n : L^2(\mathbb{T}) \to \mathbb{C}$,

\[ \varphi_n(f) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{T}} f(x)\, e^{-inx}\, dx, \]

that maps a function to its $n$th Fourier coefficient is a bounded linear functional.
We have $\|\varphi_n\| = 1$ for every $n \in \mathbb{Z}$.

One of the fundamental facts about Hilbert spaces is that all bounded
linear functionals are of the form (36).

5.31 theorem (Riesz' representation theorem). Let $H$ be a Hilbert space, and let
$f \in H'$ be a linear and continuous functional on $H$. Then, there exists a unique $z \in H$
such that

\[ \langle f, x \rangle = (z, x) \quad \text{for all } x \in H. \tag{37} \]

The map $\sigma : H' \ni f \mapsto z \in H$ is a bijection of $H'$ onto $H$, it is an isometry, i.e.
$\|\sigma(f)\|_H = \|f\|_{H'}$, and it is anti-linear, i.e. $\sigma(f + \lambda g) = \sigma(f) + \overline{\lambda}\,\sigma(g)$ for all
$f, g \in H'$ and all $\lambda \in \mathbb{C}$.

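Proof. Let $N = \mathrm{Ker}(f)$. If $N = H$, then $f \equiv 0$, and we set $z = 0$. If $N \neq H$,
then there exists $z_0 \in N^\perp$ with $z_0 \neq 0$: indeed, for a given $x_0 \in H \setminus N$,
take $z_0 = Q(x_0)$ with $Q$ the orthogonal projection onto $N^\perp$. We observe that
$\langle f, z_0 \rangle \neq 0$ as $z_0 \notin \mathrm{Ker}(f)$. Moreover, for all $x \in H$ one has

\[ x - \frac{\langle f, x \rangle}{\langle f, z_0 \rangle}\, z_0 \in N. \]

Indeed,

\[ \left\langle f,\, x - \frac{\langle f, x \rangle}{\langle f, z_0 \rangle}\, z_0 \right\rangle = \langle f, x \rangle - \frac{\langle f, x \rangle}{\langle f, z_0 \rangle} \langle f, z_0 \rangle = 0. \]

Therefore, since $z_0 \in N^\perp$,

\[ 0 = \left( z_0,\, x - \frac{\langle f, x \rangle}{\langle f, z_0 \rangle}\, z_0 \right) = (z_0, x) - \frac{\langle f, x \rangle}{\langle f, z_0 \rangle} (z_0, z_0) = (z_0, x) - \frac{\langle f, x \rangle}{\langle f, z_0 \rangle} \|z_0\|^2, \]

which implies

\[ \langle f, x \rangle = \frac{\langle f, z_0 \rangle}{\|z_0\|^2}\, (z_0, x) = \left( \frac{\overline{\langle f, z_0 \rangle}}{\|z_0\|^2}\, z_0,\; x \right). \]

Hence, $z = \dfrac{\overline{\langle f, z_0 \rangle}}{\|z_0\|^2}\, z_0 = \sigma(f)$ is the desired element in $H$ for which (37) holds.
Now, we claim that there is just one vector $z \in H$ with the property (37).
Assume $\langle f, x \rangle = (z', x)$ for all $x \in H$. Then, $(z - z', x) = 0$ for all $x \in H$, which
implies $z = z'$. Moreover, $\sigma$ is a bijection: it is onto because every $z \in H$ defines a bounded
linear functional of the form (36), and it is one-to-one because if $\sigma(f_1) = \sigma(f_2)$ then

\[ \langle f_1, x \rangle = (\sigma(f_1), x) = (\sigma(f_2), x) = \langle f_2, x \rangle \]

for all $x \in H$, i.e. $f_1$ and $f_2$ coincide. The anti-linearity is an easy exercise. Let
us prove that $\sigma$ is an isometry. It suffices to prove that $\|\sigma(f)\|_H = \|f\|_{H'}$. Let
$x \in H$; we have (with $z = \sigma(f)$)

\[ |\langle f, x \rangle| = |(z, x)| \le \|x\| \|z\|, \]

which implies $\|f\| \le \|\sigma(f)\|$. Choosing $x = z$ we have $|\langle f, z \rangle| = \|z\|^2$, which
gives $\|f\| \ge \|z\|$ and proves the assertion.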
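In the finite-dimensional space $\mathbb{C}^n$ the Riesz representer can be written down explicitly, which gives a quick sanity check of (37) and of the isometry property. The sketch below is an illustration under stated assumptions: the functional is $f(x) = \sum_i a_i x_i$ for a hypothetical coefficient list `a`, and the inner product is conjugate-linear in the first argument, so that the representer is $z = \overline{a}$.

```python
import math

def pairing(a, x):
    """The functional f(x) = sum a_i x_i on C^n."""
    return sum(ai * xi for ai, xi in zip(a, x))

def inner(z, x):
    """(z, x) = sum conj(z_i) x_i, conjugate-linear in the first argument."""
    return sum(zi.conjugate() * xi for zi, xi in zip(z, x))

def norm(x):
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))

a = [1 + 2j, -3j, 0.5 + 0j]
z = [ai.conjugate() for ai in a]     # Riesz representer of f
x = [2 - 1j, 1 + 1j, 4j]

print(pairing(a, x), inner(z, x))            # equal: f(x) = (z, x)
print(abs(pairing(a, z)), norm(z) ** 2)      # equal: |f(z)| = ||z||^2
```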

5.32 corollary. Every Hilbert space is reflexive.

5.5 Exercises

1. Let $A$ be a subset of a Hilbert space $H$. Prove that $A^\perp = \overline{A}^{\perp}$.

2. Suppose that $H_1$ and $H_2$ are Hilbert spaces. Define

\[ H_1 \oplus H_2 = \{ (x_1, x_2) : x_1 \in H_1,\ x_2 \in H_2 \} \]

with the inner product

\[ \langle (x_1, x_2), (y_1, y_2) \rangle = \langle x_1, y_1 \rangle_{H_1} + \langle x_2, y_2 \rangle_{H_2}. \]

Prove that $H_1 \oplus H_2$ is a Hilbert space. Find the orthogonal complement
of the subspace $\{ (x_1, 0) : x_1 \in H_1 \}$.

3. Let $f, g \in H$, $H$ a Hilbert space. Assume that equality holds in the Cauchy-Schwarz
inequality for $f$ and $g$, i.e.

\[ (f, g) = \|f\| \|g\|. \]

Prove that $f = cg$ for some scalar $c \in \mathbb{C}$. Hint: assume first that $\|f\| = \|g\| = 1$
and use Pythagoras' theorem.
4. Let $\eta : [a, b] \to \mathbb{R}$ be a continuous function such that $\eta(t) > 0$ for all
$t \in [a, b]$. For two given functions $f, g : [a, b] \to \mathbb{C}$ define the product

\[ (f, g)_\eta := \int_a^b \overline{f(x)}\, g(x)\, \eta(x)\, dx. \]

Prove that $(\cdot, \cdot)_\eta$ is a scalar product, and prove that the resulting normed
space is a Hilbert space.
5. Prove that the vectors in an orthogonal set are linearly independent.
6. Let H = L2 (R), and set

M = { f ∈ H : f (t) = f (−t) almost everywhere in R}.

• Show that M is a closed subspace of H.


• Find an explicit expression for the orthogonal complement M⊥ .
• Find an explicit expression for the orthogonal projection of H onto
M.
7. Consider the Hilbert space $H = L^2(\mathbb{R}^d; \mathbb{R}^d)$ of all vector fields $v : \mathbb{R}^d \to \mathbb{R}^d$
equipped with the scalar product

\[ (u, v) = \int_{\mathbb{R}^d} u(x) \cdot v(x)\, dx. \]

Consider $u(x) = \nabla V(x) \in H$ for some $V \in C^1(\mathbb{R}^d)$ and $v \in H$ such that
$\mathrm{div}\, v = 0$. Prove that $u$ and $v$ are orthogonal.
8. Let $\{x_n : n \in \mathbb{N}\}$ be a countable orthonormal set in a Hilbert space.
Show that the sum $\sum_{n=1}^{+\infty} \frac{x_n}{n}$ converges unconditionally but not absolutely.
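9. Prove the following lemma: If $U = \{u_\alpha : \alpha \in I\}$ is an orthonormal set
in a Hilbert space $H$, then

\[ [U] = \left\{ \sum_{\alpha \in I} c_\alpha u_\alpha \;:\; c_\alpha \in \mathbb{C} \text{ and } \sum_{\alpha \in I} |c_\alpha|^2 < +\infty \right\}. \]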

10. Prove that the sets $\{e_n : n \ge 1\}$ defined by

\[ e_n(x) = \sqrt{\frac{2}{\pi}}\, \sin(nx), \]

and $\{f_n : n \ge 0\}$ defined by

\[ f_0(x) = \sqrt{\frac{1}{\pi}}, \qquad f_n(x) = \sqrt{\frac{2}{\pi}}\, \cos(nx) \quad \text{for } n \ge 1, \]

are both orthonormal bases of $L^2([0, \pi])$.

11. For each of the following functions f ∈ L2 ([0, π ]), find the Fourier coeffi-
cients of f with respect to both the bases {en : n ≥ 1} and { f n : n ≥ 0}
of the previous exercise:

(a) $f(x) = x^2$,
(b) $f(x) = |x|$,
(c) $f(x) = \begin{cases} 1 & x \in [0, \pi/2], \\ 2 & x \in (\pi/2, \pi], \end{cases}$
(d) $f(x) = 3 \sin(4x) - 7 \cos(2x)$.

12. * Define the Legendre polynomials $P_n$ by

\[ P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n. \]

• Compute the first few Legendre polynomials.
• Show that the Legendre polynomials are orthogonal in $L^2([-1, 1])$,
and that they are obtained by Gram-Schmidt orthogonalisation of
the monomials.
• Show that

\[ \int_{-1}^{1} P_n(x)^2\, dx = \frac{2}{2n + 1}. \]

• Prove that the set

\[ \left\{ \sqrt{\frac{2n + 1}{2}}\, P_n \;:\; n \in \mathbb{N} \right\} \]

is an orthonormal basis for $L^2([-1, 1])$.

13. * Let $H = L^2([0, 1])$. We say that $f \in H$ has a weak derivative in $L^2$ if there
exists a function $g \in H$ such that

\[ \int_0^1 g(x) \phi(x)\, dx = - \int_0^1 f(x) \phi'(x)\, dx \quad \text{for all } \phi \in C_c^1([0, 1]). \]

The function $g$ is called the weak derivative of $f$, and is denoted by $Df$.
We call $H_0^1 \subset H$ the set of all $f \in L^2$ with a weak derivative in $L^2$.

• Prove that, if $f \in H$ is a continuously differentiable function, then
the weak derivative $Df$ coincides almost everywhere with the classical
derivative $f'$.
• Prove that $H_0^1$ is a dense linear subspace of $L^2$. Hint: use the fact
that $C^1$ functions are dense in $H$.
• Equip $H_0^1$ with the product

\[ (f, g)_{H_0^1} := \int_0^1 f(x) g(x)\, dx + \int_0^1 Df(x)\, Dg(x)\, dx. \]

Prove that $H_0^1$ is a Hilbert space with the above product.

14. Suppose $(P_n)$ is a sequence of orthogonal projections on a Hilbert space
$H$ such that

\[ \mathrm{Ran}\, P_{n+1} \supset \mathrm{Ran}\, P_n, \qquad \bigcup_{n=1}^{+\infty} \mathrm{Ran}\, P_n = H. \]

Prove that $(P_n)$ converges strongly to the identity operator as $n \to +\infty$.
Show that $(P_n)$ does not converge to the identity operator with respect
to the operator norm unless $P_n = I$ for all $n$ sufficiently large.

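15. Let $H = L^2(\mathbb{T}^3; \mathbb{R}^3)$ be the Hilbert space of $2\pi$-periodic, square-integrable,
vector-valued functions $u : \mathbb{T}^3 \to \mathbb{R}^3$, with the inner product

\[ \langle u, v \rangle = \int_{\mathbb{T}^3} u(x) \cdot v(x)\, dx. \]

We define the subspaces $V$ and $W$ of $H$ by

\[ V = \{ v \in C^\infty(\mathbb{T}^3; \mathbb{R}^3) : \mathrm{div}\, v = 0 \}, \]

\[ W = \{ w \in C^\infty(\mathbb{T}^3; \mathbb{R}^3) : w = \nabla \varphi \text{ for some } \varphi : \mathbb{T}^3 \to \mathbb{R} \}. \]

Show that $H$ is the orthogonal direct sum of $V$ and $W$.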

6 Bounded operators on Hilbert spaces and introduction
to spectral theory

In this chapter we describe some important classes of bounded linear operators
on Hilbert spaces, including self-adjoint operators. We also introduce
the Fredholm alternative principle and some properties of weak convergence
on Hilbert spaces. Then, we introduce spectral theory for bounded linear operators
on Hilbert spaces and derive some basic results such as the spectral
decomposition of compact self-adjoint operators.

6.1 The adjoint of an operator

An important consequence of the Riesz representation theorem is the existence
of the adjoint of a bounded operator on a Hilbert space. The defining property
of the adjoint $A^* \in B(H)$ of an operator $A \in B(H)$ is that

\[ \langle x, Ay \rangle = \langle A^* x, y \rangle \quad \text{for all } x, y \in H. \]

To prove that $A^*$ exists and is uniquely defined, we have to show that for
every $x \in H$, there is a unique vector $z \in H$, depending linearly on $x$, such
that

\[ \langle z, y \rangle = \langle x, Ay \rangle \quad \text{for all } y \in H. \tag{38} \]

For fixed $x$, the map $\varphi_x$ defined by

\[ \varphi_x(y) = \langle x, Ay \rangle \]

is a bounded linear functional on $H$, with $\|\varphi_x\| \le \|A\| \|x\|$. By the Riesz
representation theorem, there is a unique $z \in H$ such that $\varphi_x(y) = \langle z, y \rangle$. This
$z$ satisfies (38), so we set $A^* x = z$. The linearity of $A^*$ is left as an exercise.

6.1 example. The matrix of the adjoint of a linear map on $\mathbb{R}^n$ with matrix $A$
is $A^T$, since

\[ x \cdot (Ay) = (A^T x) \cdot y. \]

In component notation, we have

\[ \sum_{i=1}^{n} x_i \left( \sum_{j=1}^{n} a_{ij} y_j \right) = \sum_{j=1}^{n} \left( \sum_{i=1}^{n} a_{ij} x_i \right) y_j. \]

The matrix of the adjoint of a linear map on $\mathbb{C}^n$ with complex matrix $A$ is the
Hermitian conjugate matrix,

\[ A^* = \overline{A^T}. \]
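The defining identity of the adjoint is easy to test for a small complex matrix. The sketch below assumes the inner product on $\mathbb{C}^n$ is conjugate-linear in the first argument; the helper names `inner`, `apply` and `adjoint` and the sample data are hypothetical.

```python
def inner(x, y):
    """<x, y> = sum conj(x_i) y_i, conjugate-linear in the first argument."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def apply(A, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def adjoint(A):
    """Hermitian conjugate: conjugate every entry and transpose."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

A = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
x = [1 - 1j, 2 + 0j]
y = [0.5j, -1 + 3j]

lhs = inner(x, apply(A, y))             # <x, A y>
rhs = inner(apply(adjoint(A), x), y)    # <A* x, y>
print(lhs, rhs)                         # the two values agree
```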

6.2 example. Suppose that $S$ and $T$ are the right and left shift operators on
the sequence space $\ell^2(\mathbb{N})$, defined by

\[ S(x_1, x_2, x_3, \dots) = (0, x_1, x_2, \dots), \qquad T(x_1, x_2, x_3, \dots) = (x_2, x_3, \dots). \]

Then $T = S^*$, since

\[ \langle x, Sy \rangle = \overline{x_2}\, y_1 + \overline{x_3}\, y_2 + \overline{x_4}\, y_3 + \dots = \langle Tx, y \rangle. \]
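The relation between the two shifts can be checked on finitely supported real sequences, stored as Python lists that are implicitly padded with zeros. This is only a sketch; the function names are chosen for illustration.

```python
from itertools import zip_longest

def right_shift(x):
    """S(x1, x2, ...) = (0, x1, x2, ...)."""
    return [0] + list(x)

def left_shift(x):
    """T(x1, x2, x3, ...) = (x2, x3, ...)."""
    return list(x[1:])

def inner(x, y):
    """Inner product of real finitely supported sequences (zero padded)."""
    return sum(a * b for a, b in zip_longest(x, y, fillvalue=0))

x = [1, 2, 3, 4]
y = [5, 6, 7]

print(inner(x, right_shift(y)), inner(left_shift(x), y))   # <x, Sy> = <Tx, y>
print(left_shift(right_shift(x)))    # T S x = x
print(right_shift(left_shift(x)))    # S T x loses the first entry of x
```

The last two lines show that $TS = I$ while $ST \neq I$, which is only possible in infinite dimensions.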

6.3 exercise. Let $K : L^2([0, 1]) \to L^2([0, 1])$ be an integral operator of the form

\[ K f(x) = \int_0^1 k(x, y) f(y)\, dy, \]

where $k : [0, 1] \times [0, 1] \to \mathbb{C}$. Prove that the adjoint operator

\[ K^* f(x) = \int_0^1 \overline{k(y, x)}\, f(y)\, dy \]

is the integral operator with the complex conjugate, transpose kernel.

6.4 exercise. Prove that A∗∗ = A for all A ∈ B( H ).

The adjoint plays a crucial role in studying the solvability of a linear equa-
tion

Ax = y (39)

where A : H → H is a bounded linear operator. Let z ∈ H be any solution of


the homogeneous adjoint equation,

A∗ z = 0.

We take the inner product of (39) with $z$. The inner product on the left-hand
side vanishes because

\[ \langle Ax, z \rangle = \langle x, A^* z \rangle = 0. \]

Hence, a necessary condition for a solution $x$ of (39) to exist is that $\langle y, z \rangle = 0$
for all $z \in \ker A^*$, meaning that $y \in (\ker A^*)^\perp$. This condition on $y$ is not
always sufficient to guarantee the solvability of (39); the most we can say for
general bounded operators is the following result. First, a simple exercise.

6.5 exercise. Let $A \subset H$. Then $A^{\perp\perp} = (A^\perp)^\perp = [A]$, where $[A]$ is the closed
linear span of $A$, i.e. the closure of the linear space generated by all finite
linear combinations of vectors of $A$. To see this, we first observe that $A^{\perp\perp}$ is a
closed subspace. Moreover, $A \subset A^{\perp\perp}$. Indeed, let $x \in A$, and let $y \in A^\perp$. Then
$\langle x, y \rangle = 0$, which means $x \in A^{\perp\perp}$. So $A^{\perp\perp}$ is a closed linear subspace that
contains $A$, and since $[A]$ is the smallest closed linear subspace containing $A$ we have
$A^{\perp\perp} \supset [A]$. On the other hand, assume that $[A] \neq A^{\perp\perp}$. Since $[A]$ is then a proper
closed subspace of $A^{\perp\perp}$, the projection theorem provides a nonzero $x \in A^{\perp\perp}$ with
$x \in [A]^\perp$. Since $[A] \supset A$, then $[A]^\perp \subset A^\perp$, and therefore
$x \in A^\perp$. But then $x \in A^\perp \cap A^{\perp\perp}$, which is only possible if $x = 0$, a contradiction. This proves
the assertion.

6.6 theorem. If $A : H \to H$ is a bounded linear operator, then

\[ \overline{\mathrm{ran}\, A} = (\ker A^*)^\perp, \qquad \ker A = (\mathrm{ran}\, A^*)^\perp. \tag{40} \]

Proof. If $x \in \mathrm{ran}\, A$, there is a $y \in H$ such that $x = Ay$. For any $z \in \ker A^*$, we
then have

\[ \langle x, z \rangle = \langle Ay, z \rangle = \langle y, A^* z \rangle = 0. \]

This proves that $\mathrm{ran}\, A \subset (\ker A^*)^\perp$. Since $(\ker A^*)^\perp$ is closed, it follows that
$\overline{\mathrm{ran}\, A} \subset (\ker A^*)^\perp$. On the other hand, if $x \in (\mathrm{ran}\, A)^\perp$, then for all $y \in H$ we
have

\[ 0 = \langle Ay, x \rangle = \langle y, A^* x \rangle. \]

Since $y \in H$ is arbitrary, this implies that $A^* x = 0$, i.e. $x \in \ker A^*$. Hence,
$(\mathrm{ran}\, A)^\perp \subset \ker A^*$. By taking the orthogonal complement of this relation, we
get

\[ (\ker A^*)^\perp \subset (\mathrm{ran}\, A)^{\perp\perp} = \overline{\mathrm{ran}\, A}, \]

which proves the first part of (40). To prove the second part, we apply the first
part to $A^*$ instead of $A$, use that the kernel of $A$ is a closed linear subspace,
use $A = A^{**}$, and take orthogonal complements. The details are left as an
exercise.

An equivalent formulation of this theorem is that if $A$ is a bounded linear
operator on $H$, then $H$ is the orthogonal direct sum

\[ H = \overline{\mathrm{ran}\, A} \oplus \ker A^*. \]

If $A$ has closed range, then we obtain the following necessary and sufficient
condition for the solvability of (39).

6.7 theorem. Suppose that A : H → H is a bounded linear operator on a Hilbert


space H with closed range. Then the equation Ax = y has a solution x if and only if
y is orthogonal to kerA∗ .

This theorem provides a useful general method of proving existence from


uniqueness: if A has closed range, and the solution of the adjoint problem
A∗ x = y is unique, then kerA∗ = {0}, so every y is orthogonal to kerA∗ .
Hence, a solution of Ax = y exists for every y ∈ H. The condition that A has
closed range is implied by an estimate of the form ck x k ≤ k Ax k, as shown
in Proposition 4.43. A commonly occurring dichotomy for the solvability of a
linear equation is summarized in the following Fredholm alternative principle.


6.8 definition. A bounded linear operator A : H → H on a Hilbert space H


satisfies the Fredholm alternative if one of the following two alternatives holds:

(a) Either Ax = 0, A∗ x = 0 have only the zero solution, and the equations
Ax = y, A∗ x = y have a unique solution x ∈ H for every y ∈ H,

(b) Or $Ax = 0$, $A^* x = 0$ have nontrivial, finite-dimensional solution spaces
of the same dimension, $Ax = y$ has a (nonunique) solution if and only
if $y \perp z$ for every solution $z$ of $A^* z = 0$, and $A^* x = y$ has a (nonunique)
solution if and only if $y \perp z$ for every solution $z$ of $Az = 0$.

Any linear operator A : Cn → Cn on a finite-dimensional space, associated


with an n × n system of linear equations Ax = y, satisfies the Fredholm alter-
native. The ranges of A and A∗ are closed because they are finite-dimensional.
From linear algebra, the rank of A∗ is equal to the rank of A, and therefore
the nullity of A is equal to the nullity of A∗ . The Fredholm alternative then
follows from Theorem 6.7. Two things can go wrong with the Fredholm alter-
native in Definition 6.8 for bounded operators A on an infinite-dimensional
space. First, ranA might be not closed; and second, even if ranA is closed, it
is not true, in general, that kerA and kerA∗ have the same dimension. As a
result, the equation Ax = y may be solvable for all y ∈ H even though A is
not one-to-one, or Ax = y may not be solvable for all y ∈ H even though A is
one-to-one. We illustrate these possibilities with some examples.

6.9 example. Consider the multiplication operator M : L2 ([0, 1]) → L2 ([0, 1])
defined by

M f ( x ) = x f ( x ).

Then M∗ = M, and M is one-to-one, so every g ∈ L2 ([0, 1]) is orthogonal to


kerM∗ ; but the range of M is a proper dense subspace of L2 ([0, 1]), so M f = g
is not solvable for every g ∈ L2 ([0, 1]). We will get back to this example below.

6.10 example. The range of the right shift operator $S : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$, defined
in Example 6.2, is closed since it consists of $y = (y_1, y_2, y_3, \dots) \in \ell^2(\mathbb{N})$
such that $y_1 = 0$. The left shift operator $T = S^*$ is singular since its kernel is
the one-dimensional space with basis $\{(1, 0, 0, \dots)\}$. The equation $Sx = y$, or
$(0, x_1, x_2, \dots) = (y_1, y_2, y_3, \dots)$, is solvable if and only if $y_1 = 0$, or $y \perp \ker T$,
which verifies Theorem 6.7 in this case. If a solution exists, then it is unique.
On the other hand, the equation $Tx = y$ is solvable for every $y \in \ell^2(\mathbb{N})$, even
though $T$ is not one-to-one, and the solution is not unique.

6.11 definition. A bounded linear operator A : H → H on a Hilbert space is


self-adjoint if A∗ = A.

Equivalently, a bounded linear operator $A$ on $H$ is self-adjoint if

\[ \langle x, Ay \rangle = \langle Ax, y \rangle \quad \text{for all } x, y \in H. \]

6.12 example. From Example 6.1, a linear map on $\mathbb{R}^n$ with matrix $A$ is self-adjoint
if and only if $A$ is symmetric, meaning that $A = A^T$, where $A^T$ is the
transpose of $A$. A linear map on $\mathbb{C}^n$ with matrix $A$ is self-adjoint if and only
if $A$ is Hermitian, meaning that $A = A^*$.

6.13 example. From Exercise 6.3, an integral operator $K : L^2([0, 1]) \to L^2([0, 1])$,

\[ K f(x) = \int_0^1 k(x, y) f(y)\, dy, \]

is self-adjoint if and only if $k(x, y) = \overline{k(y, x)}$.

Given a linear operator A : H → H, we may define a sesquilinear form

a: H×H →C

by $a(x, y) = \langle x, Ay \rangle$. If $A$ is self-adjoint, then this form is Hermitian symmetric,
or symmetric, meaning that

\[ a(x, y) = \overline{a(y, x)}. \]

It follows that the associated quadratic form $q(x) = a(x, x)$, or

\[ q(x) = \langle x, Ax \rangle, \]

is real-valued. We say that $A$ is nonnegative if it is self-adjoint and $\langle x, Ax \rangle \ge 0$
for all $x \in H$. We say that $A$ is positive, or positive definite, if it is self-adjoint
and $\langle x, Ax \rangle > 0$ for every nonzero $x \in H$. If $A$ is a positive, bounded operator,
then

\[ (x, y) = \langle x, Ay \rangle \]

defines an inner product on $H$. If, in addition, there is a constant $c > 0$ such
that

\[ \langle x, Ax \rangle \ge c \|x\|^2 \quad \text{for all } x \in H, \]

then we say that $A$ is bounded from below, and the norm associated with $(\cdot, \cdot)$ is
equivalent to the norm associated with $\langle \cdot, \cdot \rangle$.
The quadratic form associated with a self-adjoint operator determines the
norm of the operator.

6.14 lemma (Norm of a self-adjoint operator via quadratic form). If $A$ is a bounded
self-adjoint operator on a Hilbert space $H$, then

\[ \|A\| = \sup_{\|x\| = 1} |\langle x, Ax \rangle|. \]

Proof. Omitted.
As a corollary, we have the following result.


6.15 corollary. If $A$ is a bounded operator on a Hilbert space then $\|A^* A\| = \|A\|^2$.
If $A$ is self-adjoint, then $\|A^2\| = \|A\|^2$.

Proof. The definition of $\|A\|$ and the application of Lemma 6.14 to the self-adjoint
operator $A^* A$ imply that

\[ \|A\|^2 = \sup_{\|x\| = 1} |\langle Ax, Ax \rangle| = \sup_{\|x\| = 1} |\langle x, A^* A x \rangle| = \|A^* A\|. \]

Hence, if $A$ is self-adjoint, then $\|A\|^2 = \|A^2\|$.

6.2 Weak convergence in a Hilbert space


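A sequence $(x_n)$ in a Hilbert space $H$ converges weakly to $x \in H$ if

\[ \lim_{n \to +\infty} \langle x_n, y \rangle = \langle x, y \rangle \quad \text{for all } y \in H. \]

Weak convergence is usually written as

\[ x_n \rightharpoonup x \quad \text{as } n \to +\infty, \]

to distinguish it from strong, or norm, convergence.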

6.16 example. Suppose that $H = \ell^2(\mathbb{N})$. Let

\[ e_n = (0, 0, \dots, 0, 1, 0, \dots) \]

be the standard basis vector whose $n$-th term is 1 and whose other terms are
0. If $y = (y_1, y_2, y_3, \dots) \in \ell^2(\mathbb{N})$, then

\[ \langle e_n, y \rangle = y_n \to 0 \quad \text{as } n \to +\infty, \]

since $\sum |y_n|^2$ converges. Hence, $e_n \rightharpoonup 0$ as $n \to +\infty$. On the other hand,
$\|e_n - e_m\| = \sqrt{2}$ for all $n \neq m$, so the sequence $(e_n)$ cannot converge strongly.
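The contrast between weak and strong convergence in this example can be made concrete with a short computation. The sketch below stores real finitely supported sequences as dictionaries from index to value; the test vector `y` is a truncation of $(1, 1/2, 1/3, \dots)$, chosen only for illustration.

```python
import math

def basis_vector(n):
    """Sparse representation of e_n: only the n-th entry is nonzero."""
    return {n: 1.0}

def inner(x, y):
    """Inner product of two real sequences stored as index -> value dicts."""
    return sum(v * y.get(k, 0.0) for k, v in x.items())

def distance(x, y):
    """Norm of x - y."""
    keys = set(x) | set(y)
    return math.sqrt(sum((x.get(k, 0.0) - y.get(k, 0.0)) ** 2 for k in keys))

# Truncation of y = (1, 1/2, 1/3, ...), which is square-summable.
y = {k: 1.0 / k for k in range(1, 10001)}

pairings = [inner(basis_vector(n), y) for n in (1, 10, 100, 1000)]
gap = distance(basis_vector(3), basis_vector(7))

print(pairings)   # <e_n, y> = y_n decreases towards 0
print(gap)        # the distance between distinct basis vectors stays sqrt(2)
```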

Clearly, all the results we have proven on the weak topologies of Banach
spaces apply to Hilbert spaces.

6.17 example. In Example 6.16, we saw that the bounded sequence $(e_n)$ of
standard basis elements in $\ell^2(\mathbb{N})$ converges weakly to zero. The unbounded
sequence $(n e_n)$, where

\[ n e_n = (0, 0, \dots, 0, n, 0, \dots), \]

does not converge weakly, however, even though the coordinate sequence with
respect to the basis $(e_n)$ converges to zero. For example,

\[ x = \left( n^{-3/4} \right)_{n=1}^{+\infty} \]

belongs to $\ell^2(\mathbb{N})$, and $\langle n e_n, x \rangle = n^{1/4}$ does not converge as $n \to +\infty$.


The examples we saw in subsection 3.4 on the weak convergence in $L^p$
spaces in the case $p = 2$ are a special case of weak convergence in a Hilbert
space. In particular, the phenomena of oscillation, concentration, and escape
to infinity can occur.
As we know, the norm of the limit of a weakly convergent sequence may
be strictly less than the norms of the terms in the sequence, corresponding to
a loss of ‘energy’ in oscillations, at a singularity, or by escape to infinity in the
weak limit. In each case, the expansion of f n in any orthonormal basis contains
coefficients that wander off to infinity. If the norms of a weakly convergent
sequence converge to the norm of the weak limit, then the sequence converges
strongly.

6.18 proposition. If ( xn ) converges weakly to x and

lim k xn k = k x k,
n→+∞

then ( xn ) converges strongly to x.

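Proof. Expansion of the inner product gives

\[ \|x_n - x\|^2 = \|x_n\|^2 - \langle x_n, x \rangle - \langle x, x_n \rangle + \|x\|^2. \]

If $x_n$ converges weakly to $x$, then $\langle x_n, x \rangle \to \langle x, x \rangle = \|x\|^2$, and likewise
$\langle x, x_n \rangle \to \|x\|^2$. Hence, if we also
have $\|x_n\| \to \|x\|$, then $\|x_n - x\|^2 \to 0$, meaning that $x_n \to x$ strongly.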

6.3 The spectrum

Spectral theory provides a powerful way to understand linear operators by


decomposing the space on which they act into invariant subspaces on which
their action is simple. In the finite-dimensional case, the spectrum of a linear
operator consists of its eigenvalues. The action of the operator on the subspace
of eigenvectors with a given eigenvalue is just multiplication by the eigen-
value. As we will see, the spectral theory of bounded linear operators on infi-
nite dimensional spaces is more involved. For example, an operator may have
a continuous spectrum in addition to, or instead of, a point spectrum of eigenval-
ues. A particularly simple and important case is that of compact, self-adjoint
operators. Compact operators may be approximated by finite-dimensional op-
erators, and their spectral theory is close to that of finite-dimensional opera-
tors.
The student at this level should be familiar with the diagonalization of
square matrices, with concepts such as eigenvalue, eigenvector, eigenspace,
algebraic multiplicity and geometric multiplicity of an eigenvalue, characteristic
polynomial of a matrix. The student should also be familiar with the
following important result in linear algebra: every self-adjoint square (finite
dimensional) matrix has an orthonormal basis of eigenvectors (which means
it is diagonalisable).
A bounded linear operator on an infinite-dimensional Hilbert space need
not have any eigenvalues at all, even if it is self-adjoint. Thus, we cannot
hope to find, in general, an orthonormal basis of the space consisting entirely
of eigenvectors. It is therefore necessary to define the spectrum of a linear
operator on an infinite-dimensional space in a more general way than as the
set of eigenvalues. We denote the space of bounded linear operators on a
Hilbert space $H$ by $L(H)$.

6.19 definition. The resolvent set of an operator A ∈ L( H ), denoted by ρ( A),


is the set of complex numbers λ such that ( A − λI) : H → H is one-to-one and
onto. The spectrum of A, denoted by σ ( A), is the complement of the resolvent
set in C, meaning that σ ( A) = C \ ρ( A).

If $A - \lambda I$ is one-to-one and onto, then the open mapping theorem implies
that $(A - \lambda I)^{-1}$ is bounded. Hence, when $\lambda \in \rho(A)$, both $A - \lambda I$ and
$(A - \lambda I)^{-1}$ are one-to-one, onto, bounded linear operators.
As in the finite-dimensional case, a complex number $\lambda$ is called an eigenvalue
of $A$ if there is a nonzero vector $u \in H$ such that $Au = \lambda u$. In that case,
$\ker(A - \lambda I) \neq \{0\}$, so $A - \lambda I$ is not one-to-one, and $\lambda \in \sigma(A)$. This is not the
only way, however, that a complex number can belong to the spectrum. We
subdivide the spectrum of a bounded linear operator as follows.

6.20 definition. Suppose that A is a bounded linear operator on a Hilbert


space H.
(a) The point spectrum of A consists of all λ ∈ σ ( A) such that A − λI is not
one-to-one. In this case λ is called an eigenvalue of A.
(b) The continuous spectrum of A consists of all λ ∈ σ ( A) such that A − λI
is one-to-one but not onto, and ran( A − λI) is dense in H.
(c) The residual spectrum of A consists of all λ ∈ σ ( A) such that A − λI is
one-to-one but not onto, and ran( A − λI) is not dense in H.

6.21 example. Let $H = L^2([0, 1])$, and define the multiplication operator
$M : H \to H$ by

\[ M f(x) = x f(x). \]

Then $M$ is bounded with $\|M\| = 1$ (exercise!). If $M f = \lambda f$, then it is easily
seen that $f(x) = 0$ for almost every $x \in [0, 1]$, so $f = 0$ in $L^2([0, 1])$. Thus,
$M$ has no eigenvalues. If $\lambda \notin [0, 1]$, then $(x - \lambda)^{-1} f(x) \in L^2([0, 1])$ for any
$f \in L^2([0, 1])$ because $(x - \lambda)$ is bounded away from zero on $[0, 1]$. Thus,
$\mathbb{C} \setminus [0, 1]$ is in the resolvent set of $M$. If $\lambda \in [0, 1]$, then $M - \lambda I$ is not onto,
because $c (x - \lambda)^{-1} \notin L^2([0, 1])$ for $c \neq 0$, so the nonzero constant function
$c$ does not belong to the range of $M - \lambda I$. The range of $M - \lambda I$, however, is
dense. To see this, let $f \in L^2([0, 1])$, and let

\[ f_n(x) = \begin{cases} f(x) & \text{if } |x - \lambda| \ge 1/n, \\ 0 & \text{if } |x - \lambda| < 1/n. \end{cases} \]

Then $f_n$ converges to $f$ in $L^2([0, 1])$, and $f_n \in \mathrm{Ran}(M - \lambda I)$, since
$(x - \lambda)^{-1} f_n(x) \in L^2([0, 1])$. It follows that $\sigma(M) = [0, 1]$, and that every $\lambda \in \sigma(M)$
belongs to the continuous spectrum of $M$.

If $\lambda$ belongs to the resolvent set $\rho(A)$ of a linear operator $A$, then $A - \lambda I$
has an everywhere defined, bounded inverse. The operator

\[ R_\lambda = (\lambda I - A)^{-1} \]

is called the resolvent operator of $A$ at $\lambda$. The resolvent operator of $A$ is an
operator-valued function defined on the subset $\rho(A)$ of $\mathbb{C}$.
An operator-valued function $F : \Omega \to L(H)$, defined on an open subset $\Omega$
of the complex plane $\mathbb{C}$, is said to be analytic at $z_0 \in \Omega$ if there are operators
$F_n \in L(H)$ and a $\delta > 0$ such that

\[ F(z) = \sum_{n=0}^{+\infty} (z - z_0)^n F_n, \]

where the power series on the right-hand side converges with respect to the
operator norm on $L(H)$ in a disc $|z - z_0| < \delta$. We say that $F$ is analytic in $\Omega$ if
it is analytic at any point of $\Omega$.

6.22 exercise (Neumann Series). Suppose that K : X → X is a bounded linear


operator on a Banach space X with kK k < 1. Prove that I − K is invertible and

\[ (I - K)^{-1} = I + K + K^2 + K^3 + \dots, \]

where the series on the right hand side converges uniformly in L( X ).
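The convergence of the Neumann series can be observed for a small matrix. The following sketch is only a finite-dimensional illustration and not a proof of the exercise: the $2 \times 2$ matrix `K` and the number of terms are assumptions, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_sum(K, terms):
    """Partial sum I + K + K^2 + ... + K^(terms - 1)."""
    n = len(K)
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = matmul(power, K)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

K = [[0.2, 0.1], [0.0, 0.3]]               # a matrix with norm below 1
S = neumann_sum(K, 60)
I_minus_K = [[0.8, -0.1], [0.0, 0.7]]

# (I - K) S equals I - K^60, which is the identity up to rounding.
product = matmul(I_minus_K, S)
print(product)
```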

6.23 proposition. If $A$ is a bounded linear operator on a Hilbert space $H$, then the
resolvent set $\rho(A)$ is an open subset of $\mathbb{C}$ that contains the exterior disc
$\{\lambda \in \mathbb{C} : |\lambda| > \|A\|\}$. The resolvent $R_\lambda$ is an operator valued analytic function of $\lambda$ on $\rho(A)$.

Proof. Suppose that $\lambda_0 \in \rho(A)$. Then we may write

\[ \lambda I - A = (\lambda_0 I - A) \left[ I - (\lambda_0 - \lambda)(\lambda_0 I - A)^{-1} \right]. \]

If $|\lambda_0 - \lambda| < \|(\lambda_0 I - A)^{-1}\|^{-1}$, then we can invert the operator on the right-hand
side by the Neumann series (see Exercise 6.22). Hence, there is an open
disk in the complex plane with center $\lambda_0$ that is contained in $\rho(A)$. Moreover,
the resolvent $R_\lambda$ is given by an operator-norm convergent Taylor series in the
disc, so it is analytic in $\rho(A)$. To see this, we compute

\[ R_\lambda = \left[ I - (\lambda_0 - \lambda) R_{\lambda_0} \right]^{-1} R_{\lambda_0} = \sum_{k=0}^{+\infty} (\lambda_0 - \lambda)^k R_{\lambda_0}^{k+1}. \]

If $|\lambda| > \|A\|$, then the Neumann series also shows that $\lambda I - A = \lambda (I - \lambda^{-1} A)$ is invertible,
so $\lambda \in \rho(A)$.

Since the spectrum $\sigma(A)$ of $A$ is the complement of the resolvent set, it
follows that the spectrum is a closed subset of $\mathbb{C}$, and

\[ \sigma(A) \subset \{ z \in \mathbb{C} : |z| \le \|A\| \}. \]

The spectral radius of A, denoted by r ( A), is the radius of the smallest disk
centered at zero that contains σ ( A),

r ( A) = sup{|λ| : λ ∈ σ ( A)}.

We can refine the above proposition 6.23 as follows.

6.24 proposition. If $A$ is a bounded linear operator, then

\[ r(A) = \lim_{n \to +\infty} \|A^n\|^{1/n}. \tag{41} \]

If $A$ is self-adjoint, then $r(A) = \|A\|$.

Proof. Omitted.
Although the spectral radius of a self-adjoint operator is equal to its norm,
the spectral radius does not provide a norm on the space of all bounded
operators. In particular, $r(A) = 0$ does not imply that $A = 0$. If $r(A) = 0$,
then we say that $A$ is a quasinilpotent operator (as an example consider a nontrivial
Jordan block with zero diagonal, which is even nilpotent).
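Formula (41) can be explored numerically for $2 \times 2$ matrices. The sketch below is an illustration under a stated assumption: it uses the Frobenius norm in place of the operator norm, which does not change the limit because all norms on a finite-dimensional space are equivalent. The matrices and helper names are chosen only for this example.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(A):
    """Frobenius norm (assumption: used instead of the operator norm)."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def gelfand_estimate(A, n):
    """The quantity ||A^n||^(1/n) appearing in (41)."""
    power = A
    for _ in range(n - 1):
        power = matmul(power, A)
    return frobenius(power) ** (1.0 / n)

jordan = [[0.0, 1.0], [0.0, 0.0]]     # nonzero matrix whose square is zero
shifted = [[0.5, 1.0], [0.0, 0.5]]    # its only eigenvalue is 0.5

print(frobenius(jordan), gelfand_estimate(jordan, 2))   # norm 1, spectral radius 0
print(gelfand_estimate(shifted, 200))                   # slowly approaches 0.5
```

The first matrix has nonzero norm but spectral radius zero, so the spectral radius cannot be a norm.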

6.25 proposition. The spectrum of a bounded linear operator on a Hilbert space is


nonempty.

We omit the proof.

6.4 The spectral theorem for compact, self-adjoint operators


In this section, we analyze the spectrum of a compact, self-adjoint operator.
The spectrum consists entirely of eigenvalues, with the possible exception of
zero, which may belong to the continuous spectrum. We begin by proving
some basic properties of the spectrum of a bounded, self-adjoint operator.

6.26 lemma. The eigenvalues of a bounded, self-adjoint operator are real, and eigen-
vectors associated with different eigenvalues are orthogonal.

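Proof. If $A : H \to H$ is self-adjoint, and $Ax = \lambda x$ with $x \neq 0$, then

\[ \lambda \langle x, x \rangle = \langle x, Ax \rangle = \langle Ax, x \rangle = \overline{\lambda} \langle x, x \rangle, \]

and $\lambda = \overline{\lambda}$, i.e. $\lambda \in \mathbb{R}$. If $Ax = \lambda x$ and $Ay = \mu y$, where $\lambda$ and $\mu$ are real, then

\[ \lambda \langle x, y \rangle = \langle Ax, y \rangle = \langle x, Ay \rangle = \mu \langle x, y \rangle. \]

It follows that if $\lambda \neq \mu$, then $\langle x, y \rangle = 0$ and $x \perp y$.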
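Both statements of the lemma can be verified by hand on a $2 \times 2$ Hermitian matrix. In the sketch below the matrix `A` is an assumption chosen for illustration, its eigenvalues are computed from the characteristic polynomial, and the eigenvectors `v_plus` and `v_minus` were found by hand for this particular matrix.

```python
import cmath

A = [[2 + 0j, 1j], [-1j, 2 + 0j]]     # Hermitian: equals its conjugate transpose

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def inner(x, y):
    """<x, y> = sum conj(x_i) y_i."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

# Eigenvalues of a 2x2 matrix from its characteristic polynomial.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = cmath.sqrt(trace * trace - 4 * det)
eigenvalues = [(trace + root) / 2, (trace - root) / 2]

# Eigenvectors for the eigenvalues 3 and 1 of this particular matrix.
v_plus, v_minus = [1j, 1 + 0j], [-1j, 1 + 0j]

print(eigenvalues)               # both eigenvalues have zero imaginary part
print(inner(v_plus, v_minus))    # zero: the eigenvectors are orthogonal
```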

A linear subspace M of H is called an invariant subspace of a linear operator


A on H if Ax ∈ M for all x ∈ M. In that case, the restriction A| M of A to M is a


linear operator on M. Suppose that H = M ⊕ N is the direct sum of invariant


subspaces M and N of A. Then every x ∈ H may be written as x = y + z with
y ∈ M and z ∈ N, and

Ax = A| M y + A| N z.

Thus, the action of A on H is determined by the actions on the invariant


subspaces.

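6.27 example. Consider matrices acting on $\mathbb{C}^d = \mathbb{C}^m \oplus \mathbb{C}^n$, where $d = m + n$.
A $d \times d$ matrix $A$ leaves $\mathbb{C}^m$ invariant if it has the block form

\[ A = \begin{pmatrix} B & D \\ 0 & C \end{pmatrix}, \]

where $B$ is an $m \times m$ matrix, $D$ is $m \times n$, and $C$ is $n \times n$. The matrix $A$ leaves
both $\mathbb{C}^m$ and the complementary subspace $\mathbb{C}^n$ invariant if $D = 0$.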

An invariant subspace of a non-diagonalizable operator may have no com-


plementary invariant subspace. However, the orthogonal complement of an
invariant subspace of a self-adjoint operator is also invariant, as we prove in
the following lemma. Thus, we can decompose the action of a self-adjoint op-
erator on a linear space into action on smaller orthogonal invariant subspaces.

6.28 lemma. If A is a bounded, self-adjoint operator on a Hilbert space H and M is


an invariant subspace of A, then M⊥ is an invariant subspace of A.

Proof. If x ∈ M⊥ and y ∈ M, then

hy, Ax i = h Ay, x i = 0

because A = A∗ and Ay ∈ M. Therefore, Ax ∈ M⊥ .

Next we show that the whole spectrum - not just the point spectrum - of a
bounded, self-adjoint operator is real, and that the residual spectrum is empty.
We begin with a preliminary proposition.

6.29 proposition. If $\lambda$ belongs to the residual spectrum of a bounded operator $A$ on
a Hilbert space, then $\overline{\lambda}$ is an eigenvalue of $A^*$.

Proof. If $\lambda$ belongs to the residual spectrum of a bounded operator $A \in L(H)$,
then $\mathrm{Ran}(A - \lambda I)$ is not dense in $H$. Hence, there is a nonzero vector $x \in H$
such that $x \perp \mathrm{Ran}(A - \lambda I)$. Theorem 6.6 then implies that $x \in \mathrm{Ker}(A^* - \overline{\lambda} I)$.
Hence, $\overline{\lambda}$ is an eigenvalue of $A^*$.

6.30 lemma. If A is a bounded, self-adjoint operator on a Hilbert space, then the spectrum of A is real and is contained in the interval [−‖A‖, ‖A‖].

Proof. We have shown that r(A) ≤ ‖A‖, so we only have to prove that the spectrum is real. Suppose that λ = a + ib ∈ ℂ, where a, b ∈ ℝ and b ≠ 0. For any x ∈ H, we have

\[
\begin{aligned}
\|(A - \lambda I)x\|^2 &= \langle (A - \lambda I)x, (A - \lambda I)x \rangle \\
&= \langle (A - aI)x, (A - aI)x \rangle + \langle (-ib)x, (-ib)x \rangle \\
&\quad + \langle (A - aI)x, (-ib)x \rangle + \langle (-ib)x, (A - aI)x \rangle \\
&= \|(A - aI)x\|^2 + b^2 \|x\|^2 \ge b^2 \|x\|^2 ,
\end{aligned}
\]

where the two cross terms cancel because ⟨(A − aI)x, x⟩ is real for self-adjoint A.

It follows from this estimate and Proposition 4.43 that A − λI is one-to-one and has closed range. If Ran(A − λI) ≠ H, then λ belongs to the residual spectrum of A, and, by Proposition 6.29, λ̄ = a − ib is an eigenvalue of A∗ = A. Thus, A has an eigenvalue that does not belong to ℝ, which contradicts Lemma 6.26. It follows that λ ∈ ρ(A) if λ is not real.
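The key identity in the proof, ‖(A − λI)x‖² = ‖(A − aI)x‖² + b²‖x‖², can be sanity-checked in finite dimensions. The sketch below assumes a 2 × 2 Hermitian matrix, a vector x and a non-real λ chosen arbitrarily for illustration; it is a numerical check, not a proof.

```python
def apply(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def norm_sq(x):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(xi) ** 2 for xi in x)


def shifted(A, mu):
    """Return the matrix A - mu * I."""
    n = len(A)
    return [[A[i][j] - (mu if i == j else 0) for j in range(n)]
            for i in range(n)]


# Illustrative data (assumptions): a Hermitian matrix, a vector, and lambda = a + ib.
A = [[2, 1 - 1j],
     [1 + 1j, 3]]
x = [1 + 2j, -1j]
a, b = 0.5, 2.0
lam = complex(a, b)

lhs = norm_sq(apply(shifted(A, lam), x))                      # ||(A - lambda I)x||^2
rhs = norm_sq(apply(shifted(A, a), x)) + b ** 2 * norm_sq(x)  # ||(A - aI)x||^2 + b^2 ||x||^2
```

For this data both sides agree up to rounding, and in particular `lhs` is at least `b ** 2 * norm_sq(x)`, which is the lower bound used to conclude that A − λI is one-to-one with closed range.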

6.31 corollary. The residual spectrum of a bounded, self-adjoint operator is empty.

Proof. From Lemma 6.30, the point spectrum and the residual spectrum are disjoint subsets of ℝ. If λ ∈ ℝ belongs to the residual spectrum, Proposition 6.29 implies that λ̄ = λ is an eigenvalue of A∗ = A, which is a contradiction.

Bounded linear operators on an infinite dimensional Hilbert space do not


always behave like operators on a finite dimensional space. We have seen in
Example 6.21 that a bounded, self-adjoint operator may have no eigenvalues,
while the identity operator on an infinite dimensional Hilbert space has a nonzero eigenvalue of infinite multiplicity. The properties of compact operators are much closer to those of operators on finite dimensional spaces, and we will study their spectral theory next.

6.32 proposition. A nonzero eigenvalue of a compact, self-adjoint operator A has finite multiplicity. A countably infinite set of nonzero eigenvalues has zero as an accumulation point, and no other accumulation points.

Proof. Suppose by contradiction that λ is a nonzero eigenvalue with infinite multiplicity. Then, there is a sequence (e_n) of orthonormal eigenvectors. This sequence is bounded, but (Ae_n) does not have a convergent subsequence because Ae_n = λe_n. Indeed, for n ≠ m, ‖e_n − e_m‖² = ‖e_n‖² + ‖e_m‖² = 2, so that ‖Ae_n − Ae_m‖ = √2 |λ| > 0. This contradicts the compactness of A.
If A has a countably infinite set {λ_n} of nonzero eigenvalues, with e_n as corresponding orthonormal eigenvectors, then, since the eigenvalues are bounded by ‖A‖, there is a convergent subsequence (λ_{n_k}). If λ_{n_k} → λ and λ ≠ 0, then the orthogonal sequence of eigenvectors (f_{n_k}), where f_{n_k} = λ_{n_k}^{−1} e_{n_k} and ‖e_{n_k}‖ = 1, would be bounded. But (A f_{n_k}) has no convergent subsequence since A f_{n_k} = e_{n_k}.

To motivate the statement of the spectral theorem for compact, self-adjoint operators, suppose that x ∈ H is given by

\[ x = \sum_k c_k e_k + z, \tag{42} \]

where {e_k} is an orthonormal set of eigenvectors of A with corresponding nonzero eigenvalues λ_k, z ∈ ker(A), and c_k ∈ ℂ. Then

\[ Ax = \sum_k \lambda_k c_k e_k . \]

Let P_k denote the one-dimensional orthogonal projection onto the subspace spanned by e_k,

\[ P_k x = \langle e_k, x \rangle e_k . \]

From Lemma 6.26, we have z ⊥ e_k, so c_k = ⟨e_k, x⟩ and

\[ Ax = \sum_k \lambda_k P_k x. \tag{43} \]

If λ_k has finite multiplicity m_k > 1, meaning that the dimension of the associated eigenspace E_k ⊂ H is greater than one, then we may combine the one-dimensional projections associated with the same eigenvalue. In doing so, we may represent A by a sum of the same form as (43), in which the λ_k are distinct, and the P_k are orthogonal projections onto the eigenspaces E_k.
The spectral theorem for compact, self-adjoint operators states that any
x ∈ H can be expanded in the form (42), and that A can be expressed as a
sum of orthogonal projections as in (43).

6.33 theorem (Spectral theorem for compact, self-adjoint operators). Let A : H → H be a compact, self-adjoint operator on a Hilbert space H. There is an orthonormal basis of H consisting of eigenvectors of A. The nonzero eigenvalues of A form a finite or countably infinite set {λ_k} of real numbers, and

\[ A = \sum_k \lambda_k P_k , \tag{44} \]

where P_k is the orthogonal projection onto the finite-dimensional eigenspace of eigenvectors with eigenvalue λ_k. If the number of nonzero eigenvalues is countably infinite, then the series in (44) converges to A in the operator norm.

We omit the proof.
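In finite dimensions, the decomposition (44) reduces to the familiar spectral decomposition of a symmetric matrix. As a minimal sketch, assume the 2 × 2 real symmetric matrix below, whose eigenpairs are known in closed form; the matrix and the helper names are illustrative choices, not part of the theorem.

```python
import math


def outer(e):
    """Rank-one orthogonal projection onto span(e), for a real unit vector e."""
    return [[ei * ej for ej in e] for ei in e]


def weighted_sum(weights, matrices):
    """Entrywise sum of w_k * M_k over all k."""
    n = len(matrices[0])
    return [[sum(w * M[i][j] for w, M in zip(weights, matrices))
             for j in range(n)]
            for i in range(n)]


s = 1 / math.sqrt(2)
A = [[2.0, 1.0],
     [1.0, 2.0]]
eigenvalues = [3.0, 1.0]             # eigenvalues of A
eigenvectors = [[s, s], [s, -s]]     # corresponding orthonormal eigenvectors

projections = [outer(e) for e in eigenvectors]           # P_k x = <e_k, x> e_k
reconstructed = weighted_sum(eigenvalues, projections)   # sum_k lambda_k P_k
```

Here `reconstructed` agrees with `A` up to rounding, which is the finite-dimensional analogue of (44).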


The above theorem is only one simple version of the spectral theorem for compact operators. It can be generalized and extended to many other cases which will not be considered here. We emphasize in particular that the property

σ ( T ) \ {0} = σp ( T ) \ {0}

holds for all compact operators on a Hilbert space, not only for the self-adjoint
ones. We omit the proof.

6.5 More on compact operators


Before we can apply the spectral theorem for compact, self-adjoint operators, we have to check that an operator is compact. In this subsection we discuss some ways to do this, and also give some examples of compact operators.
The most direct way to prove that an operator is compact is to verify the definition by showing that if E is a bounded subset of H, then the set A(E) = {Ax : x ∈ E} is precompact, i.e. has compact closure. In many examples, this can be done by using an appropriate condition for compactness, such as the Arzelà-Ascoli theorem or the Kolmogorov-Riesz-Fréchet theorem. The following theorem characterizes precompact sets in a general, separable Hilbert space. We omit the proof.

6.34 theorem. Let E be a subset of an infinite-dimensional, separable Hilbert space H.

(a) If E is precompact, then for every orthonormal set {e_n : n ∈ ℕ} and every ε > 0, there is an N such that

\[ \sum_{n=N+1}^{+\infty} |\langle e_n, x \rangle|^2 < \varepsilon \quad \text{for all } x \in E. \tag{45} \]

(b) If E is bounded and there is an orthonormal basis {e_n} of H with the property that for every ε > 0 there is an N such that (45) holds, then E is precompact.

6.35 example. Let H = ℓ²(ℕ). The Hilbert cube

\[ C = \{ (x_1, x_2, x_3, \dots, x_n, \dots) : |x_n| \le 1/n \} \]

is closed and precompact. Hence C is a compact subset of H.

6.36 exercise. Consider the diagonal operator A : ℓ²(ℕ) → ℓ²(ℕ) defined by

\[ A(x_1, x_2, x_3, \dots, x_n, \dots) = (\lambda_1 x_1, \lambda_2 x_2, \lambda_3 x_3, \dots, \lambda_n x_n, \dots), \tag{46} \]

where λ_n ∈ ℝ and λ_n is decreasing with λ_n → 0 as n → +∞. Prove that A is compact.

Proposition 4.25 implies that the uniform limit of compact operators is


compact. An operator with finite rank is compact. Therefore, another way to
prove that A is compact is to show that A is the limit of a uniformly convergent
sequence of finite-rank operators. One such class of compact operators is the
class of Hilbert-Schmidt operators.

6.37 definition. A bounded linear operator A on a separable Hilbert space H is Hilbert-Schmidt if there is an orthonormal basis {e_n : n ∈ ℕ} such that

\[ \sum_{n=1}^{+\infty} \|A e_n\|^2 < +\infty. \tag{47} \]

If A is Hilbert-Schmidt, then

\[ \|A\|_{HS} = \sqrt{\sum_{n=1}^{+\infty} \|A e_n\|^2} \tag{48} \]

is called the Hilbert-Schmidt norm of A.

One can show that the sum in (47) is finite in every orthonormal basis if it
is finite in one orthonormal basis, and the norm (48) does not depend on the
choice of the basis.

6.38 theorem. A Hilbert-Schmidt operator is compact.

Proof. (Sketch) Suppose that A is a Hilbert-Schmidt operator and {e_n : n ∈ ℕ} is an orthonormal basis. If P_N is the orthogonal projection onto the finite-dimensional space spanned by {e_1, . . . , e_N}, then P_N A is a finite rank operator, and one can check that P_N A → A uniformly as N → +∞.

6.39 example. The diagonal operator A : ℓ²(ℕ) → ℓ²(ℕ) defined in (46) is Hilbert-Schmidt if and only if

\[ \sum_{n=1}^{+\infty} |\lambda_n|^2 < +\infty. \]
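The difference between compactness and the Hilbert-Schmidt property for diagonal operators can be explored numerically. The sketch below relies on the standard fact that a diagonal operator on ℓ² has operator norm equal to the supremum of the moduli of its diagonal entries; the two sequences λ_n = 1/n and λ_n = 1/√n, the cutoff M and the function names are illustrative assumptions.

```python
import math


def tail_sup(lam, N, M):
    """Largest |lam(n)| for N < n <= M.

    For a diagonal operator cut off at index M, this is the operator-norm
    distance to its rank-N truncation (first N diagonal entries kept).
    """
    return max(abs(lam(n)) for n in range(N + 1, M + 1))


def hs_partial_sum(lam, M):
    """Partial sum of |lam(n)|^2 for 1 <= n <= M, as in the Hilbert-Schmidt criterion."""
    return sum(abs(lam(n)) ** 2 for n in range(1, M + 1))


def harmonic(n):
    """lambda_n = 1/n: square-summable."""
    return 1.0 / n


def inv_sqrt(n):
    """lambda_n = 1/sqrt(n): tends to zero but is not square-summable."""
    return 1.0 / math.sqrt(n)
```

For both sequences `tail_sup` becomes small as N grows, consistent with approximation by finite-rank operators. The partial sums `hs_partial_sum(harmonic, M)` stay below π²/6, whereas `hs_partial_sum(inv_sqrt, M)` grows like the harmonic series, so only the first sequence satisfies the Hilbert-Schmidt condition.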

6.40 example. Let Ω ⊂ ℝ^n. One can show that an integral operator K on L²(Ω),

\[ K f(x) = \int_\Omega k(x, y) f(y)\,dy, \]

is Hilbert-Schmidt if and only if k ∈ L²(Ω × Ω), meaning that

\[ \int_{\Omega\times\Omega} |k(x,y)|^2\,dx\,dy < +\infty. \]

The Hilbert-Schmidt norm of K is

\[ \|K\|_{HS} = \left( \int_{\Omega\times\Omega} |k(x,y)|^2\,dx\,dy \right)^{1/2}. \]

If K is a self-adjoint, Hilbert-Schmidt operator then there is an orthonormal basis {ϕ_n : n ∈ ℕ} of L²(Ω) consisting of eigenvectors of K, such that

\[ \int_\Omega k(x,y)\varphi_n(y)\,dy = \lambda_n \varphi_n(x). \]

Then, one can prove that

\[ k(x,y) = \sum_{n=1}^{+\infty} \lambda_n \varphi_n(x)\varphi_n(y). \]

We omit the proof.

6.41 exercise. Prove that the Volterra operator

\[ (Tf)(x) = \int_0^x f(t)\,dt \]

is compact in H = L²([0, 1]).

Solution. Let B_1(0) be the closed unit ball of H. For f ∈ B_1(0) we consider

\[
\begin{aligned}
\int_0^1 |(Tf)(x+h) - (Tf)(x)|^2\,dx &= \int_0^1 \left| \int_x^{x+h} f(t)\,dt \right|^2 dx \\
&\le \int_0^1 \int_x^{x+h} dt \int_x^{x+h} f(t)^2\,dt\,dx \le |h| \int_0^1 \int_0^1 f(t)^2\,dt\,dx \le |h| .
\end{aligned}
\]

Hence,

\[ \|(Tf)(\cdot + h) - (Tf)\|_{L^2} \le |h|^{1/2}, \]

and the assertion follows from the Kolmogorov-Riesz-Fréchet theorem.
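The translation estimate above can be checked numerically for specific functions in the unit ball. The following is a rough sketch using midpoint quadrature, assuming h > 0 and that f is extended by zero outside [0, 1]; the function name, the step counts and the test functions are illustrative choices.

```python
import math


def shift_difference_sq(f, h, outer_steps=1000, inner_steps=100):
    """Approximate the integral over [0, 1] of |(Tf)(x+h) - (Tf)(x)|^2 for h > 0.

    (Tf)(x+h) - (Tf)(x) is the integral of f over [x, x+h]; f is taken to be
    zero outside [0, 1], so the inner integral is cut off at t = 1.
    Both integrals use the midpoint rule.
    """
    dx = 1.0 / outer_steps
    total = 0.0
    for i in range(outer_steps):
        x = (i + 0.5) * dx
        lo, hi = x, min(x + h, 1.0)
        dt = (hi - lo) / inner_steps
        increment = sum(f(lo + (j + 0.5) * dt) for j in range(inner_steps)) * dt
        total += increment ** 2 * dx
    return total


def constant_one(t):
    """f = 1 on [0, 1], with L^2 norm equal to 1."""
    return 1.0


def scaled_identity(t):
    """f(t) = sqrt(3) * t on [0, 1], with L^2 norm equal to 1."""
    return math.sqrt(3.0) * t
```

For f = 1 and 0 < h < 1 the integral can be computed by hand and equals h² − 2h³/3, which the quadrature reproduces closely. In both test cases the computed value stays below |h|, in agreement with the bound in the solution.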

6.42 exercise. Prove that the Volterra operator in the exercise above has no eigenvalues.
Solution. Assume λ ≠ 0 is an eigenvalue. Then, for some f ≠ 0 in L² we have

\[ \lambda f(x) = \int_0^x f(t)\,dt . \]

For x, y ∈ [0, 1],

\[ f(x) - f(y) = \frac{1}{\lambda} \int_y^x f(t)\,dt \]

and therefore

\[ |f(x) - f(y)| \le \frac{|x-y|^{1/2}}{|\lambda|} \|f\|_{L^2} , \]

which shows that f is continuous. Hence, the eigenvalue condition implies f is C¹. We can therefore write

\[ \lambda f'(x) = f(x) \]

and since f(0) = 0 we get, by uniqueness of solutions to the Cauchy problem, that f = 0, a contradiction. The case λ = 0 is also excluded: if the integral of f over [0, x] vanishes for all x ∈ [0, 1], then f = 0 almost everywhere.

6.6 Exercises
1. Let H be a Hilbert space. Prove that for all A, B ∈ L( H ) and λ ∈ C, one
has
(a) A∗∗ = A
(b) ( AB)∗ = B∗ A∗
(c) (λA)∗ = λ̄A∗
(d) (A + B)∗ = A∗ + B∗
(e) ‖A∗‖ = ‖A‖.
2. Let (un ) be a sequence of orthonormal vectors in a Hilbert space. Prove
that un converges to zero weakly.

3. Let A ∈ L(H) with H a Hilbert space. Prove that ρ(A∗) = {λ̄ : λ ∈ ρ(A)}.


4. Let k : [0, 1] × [0, 1] → ℝ be given by

\[ k(x,y) = \sum_{i=1}^{N} \alpha_i(x)\beta_i(y), \]

for some continuous functions α_1, . . . , α_N, β_1, . . . , β_N on [0, 1]. Prove that the operator K ∈ L(L²([0, 1])) defined by

\[ (Kf)(x) = \int_0^1 k(x,y) f(y)\,dy \]

is compact.
5. Let λ be an eigenvalue of A ∈ L(H) with H a Hilbert space. Is λ̄ in the spectrum of A∗? What can we say about the type of spectrum λ̄ belongs to?
6. Suppose A ∈ L( H ) with H a Hilbert space and λ, µ ∈ ρ( A). Prove that
the resolvent Rλ of A satisfies the resolvent equation

Rλ − Rµ = (λ − µ) Rλ Rµ .

7. Let H be a Hilbert space and M ⊂ H be a closed subspace. Let P : H → H be the orthogonal projection of H onto M. Find σ(P).
8. Let A be a bounded, self-adjoint, nonnegative operator on a complex
Hilbert space. Prove that σ ( A) ⊂ [0, +∞).


9. Let G be a multiplication operator on L²([0, 1]) defined by

\[ Gf(x) = x^2 f(x). \]

Prove that G is a bounded linear operator on L²([0, 1]) and that its spectrum is given by [0, 1]. Does G have eigenvalues? Motivate your answer.
10. Let K : L²([0, 1]) → L²([0, 1]) be the integral operator defined by

\[ Kf(x) = \int_0^x f(y)\,dy. \]

(a) Find the adjoint operator K∗.
(b) Find the operator norm ‖K‖ (Hint: use ‖K∗K‖ = ‖K‖²).
(c) Show that the spectral radius of K is equal to zero.
(d) Show that 0 belongs to the continuous spectrum of K.
11. Let ℓ²(ℕ) be the real Hilbert space of square-summable sequences. Define the right-shift operator S on ℓ²(ℕ) by

\[ S(x)_k = \begin{cases} x_{k-1} & \text{if } k \ge 2, \\ 0 & \text{if } k = 1, \end{cases} \]

where x = (x_k)_{k=1}^{+∞} is in ℓ²(ℕ). Prove the following facts:

(a) ‖S‖ = 1.
(b) The point spectrum of S is empty.
(c) σ(S) = [−1, 1].
12. Define the left-shift operator T on ℓ²(ℕ) by

\[ T(x)_k = x_{k+1} \quad \text{for all } k \ge 1, \]

where x = (x_k)_{k=1}^{+∞} is in ℓ²(ℕ). Prove the following facts:

(a) ‖T‖ = 1.
(b) The point spectrum of T is given by (−1, 1).
(c) σ(T) = [−1, 1].
13. Let H = L²([−π, π]) and let T : H → H be defined by

\[ (Tf)(x) = \int_{-\pi}^{\pi} |x - y| f(y)\,dy. \]

Using the Fourier series expansion

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{+\infty} \big( a_n \cos(nx) + b_n \sin(nx) \big), \]

find the spectrum of T.

14. Solve the following Fredholm integral equation for u(x) in L²([0, 2π]):

\[ u(x) = \cos x + \lambda \int_0^{2\pi} \sin(x - y) u(y)\,dy. \]

15. Solve the following Fredholm integral equation for u(x) in L²([0, 1]):

\[ u(x) = e^x + \lambda \int_0^1 (5x^2 - 3) y^2 u(y)\,dy. \]

16. Discuss the existence of solutions to the following integral equation

\[ u(x) = f(x) + \lambda \int_0^{2\pi} \sin(x + y) u(y)\,dy \]

for the two cases



(a) λ = 1/ π, f ( x ) = x2 ,
(b) λ = 1/π, f ( x ) = sin(3x )


The exam
The RaFA (Real and Functional Analysis) exam is a written exam only. It is
split into two parts, in agreement with the two parts in the lecture notes. For
each part, maximum 18 points are awarded, for a total maximum of 36 points.
In each of the two parts, the 18 points are distributed as follows:

• 5 points on as many true or false questions (TFQ).

• 7 points on one or two book work questions (BWQ).

• 6 points on two problems.

For each of the two parts, the student has maximum 90 minutes to submit
the exam script.
For each of the two parts, given the total score m ∈ [0, 18], the final mark F ∈ [0, 18] is computed via a renormalization formula subject to two constraints:
• F = m if 0 ≤ m ≤ 9,

• F ≤ m if 9 ≤ m ≤ 18.

The formula may vary according to the general performance of the students
in the same exam attempt. Given the two final (renormalized) marks F1 and
F2 , the final mark T is computed by T = F1 + F2 .
Students who score T ≥ 30 get 30 as a final mark. The cum laude is awarded
at the lecturer’s discretion. If T ≥ 18 then the exam is passed. If not, the
student may resit for a single part or for both. The student may resit (for one
part or both) as well in order to improve the final mark (even if the exam has
been already passed).

The true or false questions


In each of the five TFQs of each part, students are given a mathematical state-
ment. They are asked to say whether the statement is true or false and to
provide a brief motivation/explanation behind their answer (no more than 5
lines).
On each TFQ, 0 points are awarded if the answer (true/false) is incorrect,
0.5 points are awarded if the answer (true/false) is correct but no suitable
motivation/explanation is provided (or an unsuitable one is provided), 1 point
if the answer is correct and the motivation/explanation is suitable.
How to provide a suitable motivation?
In case the statement is a definition, then the students are asked to say if the
definition is correct. If so, the suitable motivation is just by definition. If not,
they should provide the correct definition. For example, assume the statement
is

The Lebesgue outer measure of a set E ⊂ Rd is the infimum of the measures of all
open sets O such that O ⊃ E.


In this case, the student should simply answer true, by definition. Assume
the statement is

The ‖·‖∞-norm of a continuous function f : [0, 1] → ℝ is defined as

\[ \|f\|_\infty = \sup_{x\in[0,1]} f(x) . \]

In this case, the student should answer false; the correct definition is

\[ \|f\|_\infty = \sup_{x\in[0,1]} |f(x)| . \]

In most of the cases, however, the TFQ is not a definition. We will consider
here some examples.
A typical situation arises when the statement looks like

Under conditions A, B, and C, the property P is true.

This typically refers to a general property, something which makes sense in all
metric spaces, or in all L^p spaces, or in all Hilbert spaces, etc. Let’s begin with
the easier case, namely when the statement is false. In this case, all the student
has to do is to provide a counterexample, that is an instance in which conditions
A, B, and C are satisfied and in which the property P is not satisfied. No proof
needs to be provided. It’s enough to mention the counterexample. For example,
if the statement is

Assume f_n → f weakly in L²(ℝ). Then f_n → f strongly in L²(ℝ).

The statement is false. All the student has to do in order to motivate the answer is to provide a counterexample. So, it’s ok to just say: consider f_n(x) = 1_{[n,n+1]}(x), which converges to zero weakly in L²(ℝ) but not strongly. Alternatively, in some
cases it is enough to provide an additional condition under which the statement
is clearly false. For example, if the statement is

All bounded subsets of a normed linear space are relatively compact,

then the correct answer is false; for example, the statement is incorrect if the space
is infinite dimensional.
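The counterexample f_n = 1_{[n,n+1]} mentioned above can be illustrated numerically. The sketch below pairs f_n with one particular test function, g(x) = 1/(1 + x²), chosen here only as an example of an element of L²(ℝ); a single test function does not prove weak convergence, it merely shows the mechanism.

```python
import math


def pairing_with_test_function(n):
    """Inner product of the indicator of [n, n+1] with g(x) = 1/(1 + x^2).

    The integral of g over [n, n+1] is arctan(n+1) - arctan(n).
    """
    return math.atan(n + 1) - math.atan(n)


def l2_norm_of_indicator(n):
    """L^2 norm of the indicator of [n, n+1]: square root of the interval length."""
    return math.sqrt((n + 1) - n)
```

The pairings decrease to zero as n grows, while the L² norm of f_n equals 1 for every n, so f_n cannot converge to zero strongly.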
Let us now consider the case in which the statement is true. In this case, a
very brief sketch of the proof should be provided. For example, consider the
statement

All Peano-Jordan measurable sets are Lebesgue measurable.

The answer is true. A suitable motivation could be the following: If a set E is PJ measurable this implies m^*_{PJ}(E) = m_{*,PJ}(E). Now, since m^*_L(E) ≤ m^*_{PJ}(E) and m_{*,PJ}(E) ≤ m_{*,L}(E), then we get

\[ m^*_L(E) \le m^*_{PJ}(E) = m_{*,PJ}(E) \le m_{*,L}(E) , \]

and since the opposite inequality m_{*,L}(E) ≤ m^*_L(E) is always true, the set is Lebesgue measurable.
I insist on this: only a short sketch of the proof has to be provided. If the
student is thinking of a longer proof as answer, then probably this is not the
expected correct answer.
In some other cases, the statement is true due to some abstract result
proven in the course. For example, the answer to the statement
The residual spectrum of the operator T : L2 ([0, 1]) → L2 ([0, 1]),
( T f )( x ) = x f ( x ) is empty
is true, because the operator T is self-adjoint, and self-adjoint bounded operators on
Hilbert spaces have no residual spectrum (from a general result seen in the course).
Finally, very often TFQ refer to a very specific situation. A typical example
is
The function f(x) = 1/(1 + x²) belongs to L¹(ℝ).
The correct answer is that the statement is true. The motivation is a simple calculation:

\[ \int_{\mathbb{R}} |f(x)|\,dx = \int_{\mathbb{R}} \frac{1}{1+x^2}\,dx = 2\int_0^{+\infty} \frac{1}{1+x^2}\,dx = 2 \lim_{R\to+\infty} \big( \arctan(R) - \arctan(0) \big) = \pi < +\infty . \]

Another typical example is a more general statement regarding a specific


situation. For instance the statement

Cc (R) = C0 (R),

for which the correct answer is false, and to prove it one should find an element being in one of the two sets but not in the other one, for example f(x) = e^{−x²}.
Here is a list of examples of TFQ given in previous exams. Please consider that this list is meant to give an idea of what these questions look like. It may happen (and it’s actually quite likely) that the actual TFQ at the exam will NOT be taken from this list.

• In a metric space, any bounded sequence has a convergent subsequence.

• Let f ∈ C ([0, 1]). Then there exists a sequence f n of C ∞ functions con-


verging uniformly to f .

• Every pointwise converging sequence f n ∈ C ([0, 1]) converges uniformly.

• In any metric space X, a subset A ⊂ X is either open or closed.

• A subset of a metric space is compact if and only if it is closed and


bounded.


• Let X = C ([0, 1]) be equipped with the uniform norm and let L be the
set of Lipschitz continuous functions on [0, 1]. Then L is closed in X.

• The function f(x) = x/(1 + x²) belongs to L²(ℝ).

• L²([0, +∞)) ⊂ L¹([0, +∞)).

• C([0, 1]) ⊂ L¹([0, 1]).

• C([0, +∞)) ⊂ L^∞([0, +∞)).

• L⁶(ℝ) ⊂ L⁴(ℝ).

• Every compact linear operator has finite range.

• Every linear and bounded operator with finite range is compact.

• The sequence f_n : [0, 1] → ℝ defined by f_n(x) = e^{sin(nx)} has a subsequence converging weakly in L²([0, 1]).

• Let M ⊂ H with H a Hilbert space. Then M⊥ is closed in H.

• Let T : H → H be a bounded self-adjoint linear operator on a Hilbert


space. Then σp ( T ) ⊂ R.

• The set { f ∈ C([0, 1]) : f ∈ C¹([0, 1]) and |f′(x)| ≤ 1 for all x ∈ [0, 1]} is relatively compact in (C([0, 1]), ‖·‖∞).

• The sequence f_n(x) = x^n converges to zero almost everywhere on (−1, 1).

• The sequence f_n(x) = e^{−n|x|} converges to zero in L²([−1, 1]).

• The sequence f_n(x) = e^{−nx} converges to zero in L²([0, +∞)).

• The sequence f_n(x) = e^{−n|x|} converges to zero in L^∞([−1, 1]).

• Let ( X, k · k) be a normed linear space. Then the set


B = { x ∈ X : 1 < k x k ≤ 2} is open.

• Every countable subset of Rd is Lebesgue measurable.

• If f n → f in L1 (R) then f n → f almost everywhere on R.

• Let T : X → Y be a bounded linear operator. Then the operator norm of T is defined as ‖T‖ = sup_{x∈X} ‖Tx‖_Y.

• On a Hilbert space H, let T : H → H be linear and bounded. Then, σ( T )


is defined as the set of λ ∈ C such that λI − T is not onto.

• The sequence f n ( x ) = 1[n,n+1] ( x ) converges to zero weakly in L1 (R).

• Let K ∈ L( H ) with H a Hilbert space and kK k < 1. Then I − K is


invertible.

• The operator M : L2 ([0, 1]) → L2 ([0, 1]) defined by ( M f )( x ) = e x f ( x )


has no eigenvalues.


• Let f_n, f be measurable on ℝ and assume f_n → f almost everywhere. Then ∫_ℝ f_n dx → ∫_ℝ f dx.

• The sequence f n ( x ) = arctan(nx ) converges in L1 (R).

• The set of continuous functions with compact support on R is closed


with respect to the uniform norm.
• The function f(x) = x²/(1 + x⁴) belongs to L¹(ℝ).

• L ∞ (R) ⊂ L2 (R).

• The space L∞ ([−1, 1]) is separable.

• Let H be a Hilbert space and let f n → f weakly on H. Then f n → f


strongly.

• The Hilbert space L2 ([0, 2]) has a countable orthonormal basis.

• Let X be a Banach space and let f n → f strongly on X. Then f n → f


weakly on X.

• Every compact operator T maps bounded sets into closed sets.

• All bounded linear operators are compact operators.

• Every bounded linear operator on a Hilbert space is self-adjoint.

• The range of a bounded linear operator on a Hilbert space is always


closed.

• The identity I on a Hilbert space is always a compact operator.

Book Work questions


The book work questions are theoretical questions, such as definitions, state-
ments of results, proofs of results. Sometimes the results are referred to via
their name, sometimes they are not.
Example:

State and prove Banach’s Fixed Point Theorem.

In this case, the student should know what this is, and provide the statement
and the proof.
Example:

Prove that an equicontinuous and bounded subset of C ([0, 1]) is relatively compact.

This is one side of Arzelà-Ascoli’s Theorem. The student should recognise


it, and prove it. The student should be aware that thorough answers should
be provided in BWQ, with all details etc. A typical misunderstanding may
happen here. Example: if the question is something like


Prove that, if H is a Hilbert space, then every bounded linear functional f ∈ H∗ is of the form f(x) = ⟨y, x⟩ for some y ∈ H,

the answer "the result is true via a direct application of Riesz’ representation theorem" is NOT ACCEPTABLE. I mean, this statement IS Riesz’ representation theorem. I am actually asking you to recognise it, and to prove the theorem!
I am providing in what follows a thorough list of BWQ, for both the first
and the second part. Unlike the TFQ, this time the list is exhaustive: so, the
BWQ will be taken from this list.

Full list of possible BWQ for the first part


1. Define the concept of metric, or distance, on a set X. Define the concept
of norm on a linear space.

2. Let ( X, k · k) be a normed linear space. Prove that d( x, y) = k x − yk is a


distance on X.

3. Define the concepts of convergent sequence and limit of a sequence on a


metric space.

4. Provide the definition of uniformly continuous function between two met-


ric spaces at some given point.

5. Prove that a function f : X → Y between two metric spaces is continuous


at x ∈ X if and only if it is sequentially continuous at x.

6. Define the concept of topological space. Let C ⊂ X be a subset of a topo-


logical space. Define the closure of C and the interior of C. When is C
called a dense subset of X? Define the concept of separable metric space.

7. Prove that a subset of a topological space is closed if and only if it coin-


cides with its closure.

8. Define the open sets of a metric space ( X, d). Prove they define a topol-
ogy.

9. Prove that a subset F of a metric space is closed if and only if it contains


all the limits of convergent sequences xn ∈ F.

10. Define the discrete distance on R. Find a suitable equivalent condition


for the convergence of a sequence in this topology. Find a sequence that
converges in the standard sense in R, but not with respect to the discrete
distance.

11. Define the concept of sequentially compact subset of a metric space.

12. Define the concept of compactness for a subset K of a topological space.


Define the concept of total boundedness for a subset K of a metric space.

13. Let ( X, d) be a metric space and A ⊂ X. Prove that A is dense in X if and


only if for all x ∈ X and for all ε > 0 there exists a ∈ A with d( a, x ) < ε.


14. Prove that a sequentially compact metric space is separable.

15. Prove that every compact subset of a metric space is closed and bounded.

16. Define the concept of equivalent norms on a linear space. Let X be an


n-dimensional linear space with a given basis. Define the ‖·‖₁-norm with respect to that basis. Prove that all norms on X are equivalent to the ‖·‖₁-norm.
17. Define the ℓ^p norms and the ℓ^p spaces for p ∈ [1, +∞].

18. State and prove Young’s inequality.

19. State and prove the discrete Hölder inequality.

20. State and prove the discrete Minkowski inequality.

21. Let X be a metric space and f n : X → R be a sequence of functions.


Define the concepts of pointwise and uniform convergence for f n . Prove
that uniform convergence implies pointwise convergence. Define the
uniform norm, or infinity norm, of function f : X → R.

22. Let f n : X → R be a sequence of continuous functions. Assume f n → f


uniformly for some f : X → R. Prove that f is continuous.

23. Let X be a metric space. Define the spaces Cb ( X ), Cc ( X ), and C0 ( X ) with


their standard norm. Prove that in general these spaces are all distinct.
Prove they all coincide with C ( X ) in case X is compact.

24. Let X be a metric space. For a family F ⊂ C ( X ), define the concept of


equicontinuity. Let K be a compact metric space and let F ⊂ C(K) be closed, bounded, and equicontinuous. Then prove that F is compact in C(K) with the standard infinity norm.

25. Find an example of a bounded set in C ([0, 1]) which is not relatively
compact.

26. Define the concept of Lipschitz continuous function f : X → R. Define the


Lipschitz constant of f. Prove that every Lipschitz continuous function f : [0, 1] → ℝ is continuous. Find an example showing that the converse is, in general, false.

27. Prove that the set of Lipschitz continuous functions f : X → R with


Lipschitz constant bounded by a given constant M is closed in C ( X )
with the infinity norm.

28. Define the linear space C1 ([0, 1]). Prove that C1 ([0, 1]) is not closed in
C ([0, 1]) equipped with the infinity norm.

29. Define the linear space X = C1 ([0, 1]) and the C1 norm on X. Prove that
C1 ([0, 1]) equipped with the C1 norm is a Banach space.

30. Define the concept of contraction mapping on a metric space. State and
prove the contraction mapping theorem.


31. Define the concepts of inner measure and outer measure in the sense of
Peano-Jordan and Lebesgue. Define the concepts of measurable set in both
theories. Provide an example of a subset of R which is Lebesgue mea-
surable but not Peano-Jordan measurable.

32. Let f_n : [a, b] → ℝ converge uniformly to f : [a, b] → ℝ. Prove that ∫_a^b f_n dx → ∫_a^b f dx.

33. Taking for granted that the Lebesgue outer measure is monotone and
countably subadditive, prove that the set D = Q ∩ [0, 1] is Lebesgue
measurable.

34. Define the concept of Lebesgue measurable function f : Rd → [−∞, +∞].


Define the concept of simple function f : Rd → [−∞, +∞]. Define the
Lebesgue integral of a simple function.

35. Define the Lebesgue integral of a measurable function. Say in which


cases it is not well defined.

36. State and prove Beppo-Levi’s Theorem, or monotone convergence Theorem.

37. State and prove Fatou’s Lemma.

38. Provide an example in which the strict inequality in Fatou’s Lemma


holds. Provide all calculations supporting your choice.

39. State and prove Lebesgue dominated convergence theorem.

40. Define the L p -norms of a measurable function f : Rd → [−∞, +∞]


for p ∈ [1, +∞). Define k f k L∞ for a measurable function f : Rd →
[−∞, +∞]. Prove | f ( x )| ≤ k f k L∞ almost everywhere.

41. State and prove Hölder’s inequality.

42. State and prove Minkowski’s inequality.

43. State (without proof) the Riesz-Fischer Theorem.

44. State and prove Urysohn’s Lemma on Rd .

45. Prove that the set of simple functions that are zero outside a compact set
is dense in L p (Rd ) for p ∈ [1, +∞).

46. Prove that Cc (Rd ) is dense in L p (Rd ) for p ∈ [1, +∞). (Use the density
of simple functions).

47. Prove that L∞ (Rd ) is not separable.

48. Let Ω ⊂ ℝ^d be bounded. Prove that if p, q ∈ [1, +∞] and p ≤ q then L^q(Ω) ⊂ L^p(Ω).

49. Define the L^p_loc(ℝ^d) spaces for p ∈ [1, +∞].

50. State and prove Young’s theorem for convolutions.


51. Let f ∈ Cc (Rd ) and g ∈ L1loc (Rd ). Prove that f ∗ g is continuous.

52. Let f ∈ Cc1 (R) and g ∈ L1loc (R). Prove that f ∗ g ∈ C1 (R).

53. Define the sequence of standard mollifiers ρn on Rd . Assume f ∈ C (Rd ).


Prove ρn ∗ f → f uniformly on compact sets.

54. Use density results to prove that if u ∈ L¹_loc(Ω) for Ω ⊂ ℝ^d an open set, and if ∫_Ω uφ dx = 0 for all φ ∈ C_c^∞(ℝ^d), then u = 0 almost everywhere on Ω.

55. State and prove Kolmogorov-Riesz-Frechet Theorem.

Full list of possible BWQ for the second part


1. Let (X, ‖·‖_X) and (Y, ‖·‖_Y) be two normed spaces. Let T : X → Y be a linear map. Prove that the following statements are equivalent:

(i) T is continuous at x = 0 ∈ X.
(ii) T is continuous at all points x ∈ X.
(iii) T maps bounded sets of X into bounded sets of Y.
(iv) There exists M > 0 such that

\[ \|T(x)\|_Y \le M \|x\|_X \quad \text{for all } x \in X. \tag{49} \]

2. Define the concept of bounded linear operator T between two normed


spaces X and Y and define the operator norm of T.

3. Let ‖T‖ be the operator norm of a bounded linear operator T : X → Y. Prove that

\[ \|T\| = \inf\{ M \ge 0 : \|Tx\| \le M\|x\| \text{ for all } x \in X \} . \]

4. Let ‖T‖ be the operator norm of a bounded linear operator T : X → Y. Prove that

\[ \|T\| = \sup_{\|x\|\le 1} \|Tx\| = \sup_{\|x\|=1} \|Tx\| . \]

5. Let X be a normed space and Y be a Banach space. Define L( X, Y ) and


its standard norm. Prove it is a complete normed space.

6. Prove that every linear operator on a finite-dimensional linear space is


bounded.

7. Provide an example of a linear operator which is not bounded.

8. Let T : X → Y be a bounded linear operator between normed spaces.


Prove that the null set of T is closed.

9. Define the concept of compact operator between two normed linear spaces.


10. Let X and Y be Banach spaces. Prove that the set of compact linear oper-
ators K ( X, Y ) is a closed subset of the space of bounded linear operators
L( X, Y ) equipped with the operator norm.

11. State and prove Riesz’ Lemma and use it to prove that in every infinite
dimensional normed space the closed unit ball is not sequentially com-
pact.

12. Define the algebraic dual space and the topological dual space of a
normed linear space. Define the dual norm of a Banach space.

13. Let p ∈ [1, +∞) and let q be its conjugate. Prove there exists an isometric
bijection between `q and the topological dual of ` p .

14. State (without proof) Riesz’ representation theorem for L p spaces.

15. State (without proof) the analytic form of Hahn-Banach theorem.

16. Using only the Hahn-Banach theorem, prove that in every normed space $E$
the following duality formula holds:

\[ \|x\| = \max_{f \in E',\ \|f\| \le 1} |f(x)|, \]

where $E'$ is the topological dual of $E$.

17. Define the concept of first Baire category space and second Baire category
space. State (without proof) Baire’s category Theorem.

18. State and prove Banach-Steinhaus Theorem.

19. State (without proof) the Open Mapping Theorem.

20. Let $T : X \to Y$ be a bounded linear map between Banach spaces $X$, $Y$.
Prove that the following statements are equivalent:

(a) There is a constant $c > 0$ such that

\[ c \|x\| \le \|Tx\| \quad \text{for all } x \in X. \]

(b) $T$ has closed range, and the only solution of the equation $Tx = 0$ is
$x = 0$.

21. For a given normed linear space $E$, provide a detailed definition of the
weak topology $\sigma(E, E')$ in terms of inverse limit topology. Define the
concept of weakly convergent sequence in $E$.

22. Let $x_n$ be a sequence in a normed linear space. Prove that if $x_n \to x$
strongly then $x_n \to x$ weakly.

23. Let $x_n$ be a sequence in a normed linear space. Prove that if $x_n \to x$
strongly then $\|x\| \le \liminf_{n \to +\infty} \|x_n\|$.


24. Provide an example of a subset $S$ of an arbitrary infinite dimensional
normed space which is closed in the strong topology but not in the
weak topology. Provide all the details.

25. For a given normed linear space E, define the bidual space and the con-
cept of weak star convergence.

26. State (without proof) Banach-Alaoglu-Bourbaki Theorem.

27. Define the concept of inner-product on a linear space. Define the concept
of Hilbert space.

28. State and prove Cauchy-Schwarz inequality.

29. State and prove the Parallelogram Law.

30. Prove that the orthogonal complement of a subset of a Hilbert space is a
closed linear subspace.

31. State and prove the Orthogonal Projection Theorem on a Hilbert space.

32. Let $U = \{u_n : n \in \mathbb{N}\}$ be an orthonormal sequence in a Hilbert
space $H$ and $x \in H$. Prove the following statements:

(a) $\sum_{n \in \mathbb{N}} |\langle u_n, x \rangle|^2 \le \|x\|^2$;
(b) $x_U := \sum_{n=1}^{+\infty} \langle u_n, x \rangle u_n$ is a convergent sum;
(c) $x - x_U \in U^\perp$.

33. Define the concept of complete orthonormal set (or orthonormal basis) of
a Hilbert space.

34. State (without proof) Parseval’s identity on a Hilbert space.

35. State and prove Riesz’ representation Theorem on a Hilbert space.

36. Let $A : H \to H$ be a bounded linear operator on a Hilbert space. State the
defining property of the adjoint operator $A^*$. Prove that $A^*$ is uniquely
defined.

37. Let $H$ be a Hilbert space and $A \subset H$. Define the orthogonal complement
$A^\perp$. Prove that $(A^\perp)^\perp$ equals the closure of the linear span of $A$.

38. Let $A : H \to H$ be a bounded linear operator on a Hilbert space. Prove
that $\overline{\operatorname{Ran}(A)} = (\operatorname{Ker}(A^*))^\perp$ and that
$\operatorname{Ker}(A) = (\operatorname{Ran}(A^*))^\perp$.

39. Let $x_n$ be a sequence in a Hilbert space that converges weakly to $x$ and
satisfies $\|x_n\| \to \|x\|$. Prove that $x_n \to x$ strongly.

40. Let $A : H \to H$ be a bounded linear operator on a Hilbert space. Define
the resolvent set of $A$, the spectrum of $A$, the point spectrum of $A$, the
continuous spectrum of $A$, and the residual spectrum of $A$.


41. Let $K : H \to H$ be a bounded linear operator on a Hilbert space. Assume
$\|K\| < 1$. Prove that $I - K$ is invertible and that its inverse can be
expressed through an infinite power series.

42. Let $A : H \to H$ be a bounded linear operator on a Hilbert space. Prove
that the resolvent set of $A$ is an open set. Prove that $\sigma(A)$ is a subset of
the closed ball centered at the origin with radius $\|A\|$.

43. Prove that the eigenvalues of a bounded, self-adjoint, linear operator
on a Hilbert space are real, and that eigenvectors associated with different
eigenvalues are orthogonal.

44. Let $A : H \to H$ be a bounded, linear, self-adjoint operator on a Hilbert
space. Let $M$ be an invariant subspace for $A$. Prove that $M^\perp$ is also
invariant for $A$.

45. Let $A : H \to H$ be a bounded linear operator on a Hilbert space. Let $\lambda$
be in the residual spectrum of $A$. Prove that $\lambda$ is an eigenvalue for $A^*$.

46. State (without proof) the Spectral Theorem for compact, self-adjoint operators
on Hilbert spaces.
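
As an illustration of the power series mentioned in question 41, the following is a minimal Python sketch in finite dimension. The $2 \times 2$ matrix `K` and the helper names `matmul` and `neumann_partial_sum` are illustrative assumptions, not taken from the course. It sums $I + K + K^2 + \dots$ for a matrix of norm smaller than one and checks that the partial sum behaves like an inverse of $I - K$.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_partial_sum(K, terms):
    """Return I + K + K^2 + ... + K^(terms - 1)."""
    n = len(K)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    power = [row[:] for row in identity]
    for _ in range(terms - 1):
        power = matmul(power, K)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical example matrix; its Frobenius norm is below 1, hence so is its operator norm.
K = [[0.2, 0.1], [0.0, 0.3]]
S = neumann_partial_sum(K, 60)
I_minus_K = [[(1.0 if i == j else 0.0) - K[i][j] for j in range(2)] for i in range(2)]
check = matmul(I_minus_K, S)  # equals I - K^60, which is numerically the identity
print(check)
```

The product $(I - K)(I + K + \dots + K^{m-1})$ equals $I - K^m$, and $\|K^m\| \le \|K\|^m \to 0$, which is the mechanism behind the general proof.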

Problems

There are four types of problems.

The first two types of problems may appear in Part I:

(a) Given a sequence of functions on $\mathbb{R}$, or a subset of $\mathbb{R}$, find their
pointwise limit $f$ (with all due calculations), say whether the convergence to
said limit is uniform or not, and explain why this is the case. An additional
question is often asked on the convergence of the sequence in some $L^p$
space.

(b) Given a specific set $A$ of functions, answer some topological questions
such as is the set closed?, is the set compact?, what is the closure of A?, etc.,
and provide a detailed motivation for the answer.

The last two types of problems may appear in Part II:

(c) Given an operator $T$ on some Banach space, prove the operator is linear,
prove the operator is bounded; sometimes you are asked to compute the
operator norm exactly, sometimes you are asked whether the operator
is compact and to say why.

(d) Given an operator $T$ on some Hilbert space (typically $\ell^2$ or $L^2$), answer
questions concerning the spectral properties, such as is a given number
$\lambda$ an eigenvalue?, prove a given number $\lambda$ is in the spectrum, prove there are
no eigenvalues, etc.


Examples of problems of type (a) (with model solution)


1. Let $f_n : [-1, 1] \to \mathbb{R}$, with $f_n(x) = n x^2 e^{-n x^2}$.

(i) Prove that $f_n$ has a pointwise limit $f$ on $[-1, 1]$.

Solution. For $x = 0$ we have $f_n(0) = 0$ for all $n$, hence $f_n(0) \to 0$
as $n \to +\infty$. For $x \neq 0$, the quantity $n x^2$ tends to infinity as
$n \to +\infty$, while $e^{-n x^2}$ tends to zero. Therefore we have an
indeterminate form $0 \times \infty$. However, it is well known from calculus that
the exponential function $e^y$ diverges faster than $y$ as $y \to +\infty$, so that
$y e^{-y} \to 0$. Hence, $f_n(x) \to 0$ as $n \to +\infty$. The pointwise limit is
$f(x) \equiv 0$.

(ii) Does $f_n \to f$ uniformly on $[-1, 1]$? Motivate your answer.

Solution. By definition, uniform convergence holds provided
$\|f_n - f\|_\infty \to 0$ as $n \to +\infty$. We compute, using that $f_n(-x) = f_n(x)$,

\[ \|f_n - f\|_\infty = \|f_n\|_\infty = \sup_{x \in [-1,1]} |f_n(x)| = \sup_{x \in [0,1]} |f_n(x)| = \sup_{x \in [0,1]} f_n(x). \]

The latter can be computed by calculating the first derivative

\[ f_n'(x) = 2nx e^{-nx^2} - 2n^2 x^3 e^{-nx^2} = 2nx e^{-nx^2} (1 - n x^2), \]

which shows that $f_n$ increases for $0 \le x < 1/\sqrt{n}$, has a maximum
at $x = 1/\sqrt{n}$, and decreases for $1/\sqrt{n} < x \le 1$. Hence,

\[ \|f_n - f\|_\infty = f_n(1/\sqrt{n}) = e^{-1}, \]

and the last quantity is a fixed positive number, and therefore cannot
converge to zero as $n \to +\infty$. Hence, there is no uniform convergence
on $[-1, 1]$.
(iii) Does $f_n \to f$ in $L^1([-1, 1])$? Motivate your answer.

Solution. We claim that the above assertion, namely

\[ \|f_n - f\|_{L^1([-1,1])} = \int_{-1}^{1} |f_n(x) - f(x)| \, dx \to 0 \quad \text{as } n \to +\infty, \]

is true. We have indeed

\[ |f_n(x) - f(x)| = n x^2 e^{-n x^2}, \]

and we have just proved that the last term can be controlled by $e^{-1}$.
Hence,

\[ |f_n(x) - f(x)| \le g(x) :\equiv e^{-1}. \]

Since $g$ has finite integral on $[-1, 1]$, and since $|f_n(x) - f(x)|$
converges to zero almost everywhere (in fact everywhere) due to point (i),
we may apply the Dominated Convergence Theorem to get the assertion.
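
The two conclusions of this problem can be checked numerically. The sketch below is not part of the original solution; the grid sizes and the helper names `f`, `sup_norm` and `l1_norm` are illustrative assumptions. It approximates the sup norm and the $L^1$ norm of $f_n(x) = n x^2 e^{-n x^2}$ on $[-1, 1]$ for a few values of $n$.

```python
import math

def f(n, x):
    """f_n(x) = n x^2 exp(-n x^2)."""
    return n * x * x * math.exp(-n * x * x)

def sup_norm(n, points=20001):
    """Approximate sup of |f_n| over a uniform grid on [-1, 1]."""
    return max(abs(f(n, -1 + 2 * i / (points - 1))) for i in range(points))

def l1_norm(n, cells=20000):
    """Midpoint-rule approximation of the integral of |f_n| over [-1, 1]."""
    h = 2 / cells
    return h * sum(abs(f(n, -1 + (i + 0.5) * h)) for i in range(cells))

for n in (1, 10, 100, 1000):
    # The first column should stay near exp(-1), the second should decrease to 0.
    print(n, sup_norm(n), l1_norm(n))
```

The sup norm stays close to $e^{-1}$ because the peak at $x = 1/\sqrt{n}$ keeps its height while becoming narrower, which is exactly why the integral still tends to zero.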


2. Let $f_n : [0, 1] \to \mathbb{R}$, with $f_n(x) = \sin\left( \frac{nx}{1+n} \right)$.

(i) Prove that $f_n$ has a pointwise limit $f$ on $[0, 1]$.

Solution. The ratio $\frac{nx}{1+n}$ converges to $x$ as $n \to +\infty$. Since
$\sin$ is a continuous function, the sequence converges to $f(x) = \sin x$ for
all $x \in [0, 1]$.

(ii) Does $f_n \to f$ uniformly on $[0, 1]$? Motivate your answer.

Solution. The $\sin$ function is Lipschitz on $[0, 1]$ with Lipschitz constant
given by the maximum absolute value of its derivative $\cos x$,
which is bounded by 1. Hence

\[ |f_n(x) - f(x)| = \left| \sin\frac{nx}{1+n} - \sin x \right| \le \left| \frac{nx}{1+n} - x \right| = \frac{|x|}{1+n} \le \frac{1}{1+n}, \]

and since the above estimate is valid for all $x \in [0, 1]$, we
deduce that $f_n \to f$ uniformly on $[0, 1]$.

(iii) Does $f_n \to f$ in $L^1([0, 1])$? Motivate your answer.

Solution. A theorem proven in the course states that if a sequence
converges uniformly on a compact interval, then the limit-integral
interchange property holds. Applied to $|f_n - f|$, which converges
uniformly to zero, this implies convergence in $L^1$.
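
The Lipschitz bound $1/(1+n)$ from point (ii) can be compared with the actual uniform error. The following is a minimal sketch, assuming a uniform grid on $[0, 1]$ is fine enough to stand in for the supremum; the helper name `sup_error` is hypothetical.

```python
import math

def sup_error(n, points=10001):
    """Approximate sup over [0, 1] of |sin(n x / (1 + n)) - sin(x)| on a uniform grid."""
    xs = (i / (points - 1) for i in range(points))
    return max(abs(math.sin(n * x / (1 + n)) - math.sin(x)) for x in xs)

for n in (1, 10, 100, 1000):
    # Second column: measured uniform error; third column: the bound 1 / (1 + n).
    print(n, sup_error(n), 1 / (1 + n))
```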
3. Let $f_n : [-1, 1] \to \mathbb{R}$, with $f_n(x) = \frac{n x^2}{1 + n x |x|}$.

(i) Prove that $f_n$ has a pointwise limit $f$ on $[-1, 1]$.

Solution. We have $f_n(0) = 0$ for all $n$, which implies $f_n(0) \to 0$ as
$n \to +\infty$. For $x \neq 0$, we have

\[ \frac{n x^2}{1 + n x |x|} = \frac{n x^2}{n (1/n + x |x|)} = \frac{x^2}{1/n + x |x|} \]

and the latter converges as $n \to +\infty$ to $x^2 / (x |x|) = \operatorname{sign}(x)$.
Hence, the pointwise limit is

\[ f(x) = \begin{cases} -1 & \text{if } -1 \le x < 0, \\ 0 & \text{if } x = 0, \\ 1 & \text{if } 0 < x \le 1. \end{cases} \]

(ii) Does $f_n \to f$ uniformly on $[-1, 1]$? Motivate your answer.

Solution. For all $n$ we have that $f_n$ is a continuous function. The
pointwise limit $f$ is instead a discontinuous function. A theorem
proven in the course shows that the uniform limit of continuous
functions is continuous. Hence, there can be no uniform convergence.

(iii) Does $f_n \to f$ in $L^2([-1, 1])$? Motivate your answer.

Solution. A direct computation gives

\[ \|f_n - f\|^2_{L^2([-1,1])} = \int_{-1}^{1} \left| \frac{n x^2}{1 + n x |x|} - \operatorname{sign}(x) \right|^2 dx = \int_{-1}^{1} \left| \frac{n x^2 - \operatorname{sign}(x) - n x^2}{1 + n x |x|} \right|^2 dx, \]


where we have ignored the value of the sign function at zero, since
a point has zero measure and so it does not affect the integral. This
implies

\[ \|f_n - f\|^2_{L^2([-1,1])} = \int_{-1}^{1} \frac{1}{(1 + n x |x|)^2} \, dx. \]

For all $x \in [-1, 1]$ with $x \neq 0$ and for $n$ large enough we have

\[ \frac{1}{(n x |x| + 1)^2} \le \frac{1}{(n x^2 - 1)^2} =: g_n(x). \]

Now, for all $x \neq 0$,

\[ g_{n+1}(x) = \frac{1}{((n+1) x^2 - 1)^2} \le \frac{1}{(n x^2 - 1)^2} = g_n(x), \]

because for $x \neq 0$ the term $n x^2$ becomes larger than one for $n$
large enough, so the assertion follows from the fact that $y \mapsto y^2$
is increasing for positive $y$. Since $g_n(x) \to 0$ almost everywhere
(that is, everywhere except at $x = 0$), the convergence follows from
Beppo Levi’s Theorem.

Examples of problems of type (b) (with model solution)

1. Let $X = C([-1, 1])$ be equipped with the uniform norm and let

\[ A = \{ f \in X : f(x_1) > f(x_2) \text{ for all } x_1, x_2 \in [-1, 1] \text{ such that } x_1 < x_2 \}. \]

Is $A$ closed in $X$? Motivate your answer.

Solution. The set $A$ is not closed. To see that, set for example

\[ f_n(x) = -\frac{1}{n} x. \]

For $x_1 < x_2$ we have

\[ f_n(x_1) - f_n(x_2) = \frac{1}{n} (x_2 - x_1) > 0, \]

so $f_n \in A$. However, $f_n$ converges uniformly to $f(x) = 0$ on $[-1, 1]$,
since $\|f_n - f\|_\infty = 1/n$. Now $f \notin A$ since for $x_1 < x_2$ we have

\[ f(x_1) - f(x_2) = 0, \]

which violates the condition to be in $A$.

2. Let X = C ([−1, 1]) be equipped with the uniform norm and let

A = { f ∈ X : f ( x ) ≥ 0 for all x ∈ [0, 1]} .


Is $A$ closed in $X$? Motivate your answer.

Solution. Let $f_n \to f$ uniformly with $f_n \in A$. Since uniform convergence
implies pointwise convergence, we have for all $x \in [0, 1]$

\[ f(x) = \lim_{n \to +\infty} f_n(x) \ge \lim_{n \to +\infty} 0 = 0. \]

Therefore, $f \in A$, which proves that $A$ is closed.

3. Let $X = C([-1, 1])$ be equipped with the uniform norm and let

\[ A = \{ f \in X : (x^2 + 1) f(x) < 1 \text{ for all } x \in [-1, 1] \}. \]

Find the closure $\overline{A}$ of $A$. Motivate your answer.

Solution. We claim that the closure of $A$ is given by the set

\[ B = \{ f \in X : (x^2 + 1) f(x) \le 1 \text{ for all } x \in [-1, 1] \}. \]

First of all, we claim that $B$ is closed. To see this, let $f_n \to f$ uniformly
with $f_n \in B$. Since uniform convergence implies pointwise convergence,
we get that

\[ (x^2 + 1) f_n(x) \to (x^2 + 1) f(x) \]

as $n \to +\infty$ for every $x \in [-1, 1]$, and hence the inequality in the
definition of $B$ is preserved in the limit. Since the defining property of $B$
is less restrictive than the one of $A$, it is clear that $A \subset B$. Since $B$ is
closed, we get

\[ \overline{A} \subset B. \]

We only need to prove the opposite inclusion. Let $f \in B$. We need to
find a sequence $f_n \in A$ with $f_n \to f$ uniformly. We choose

\[ f_n(x) = f(x) - \frac{1}{n}. \]

We get

\[ (x^2 + 1) f_n(x) = (x^2 + 1)(f(x) - 1/n) < (x^2 + 1) f(x) \le 1, \]

which proves $f_n \in A$. The difference between $f_n$ and $f$ is equal to $1/n$,
which clearly converges to zero uniformly.
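
To make the approximation argument concrete, here is a small Python sketch. The particular function $f(x) = 1/(x^2 + 1)$, which belongs to $B$ but not to $A$, and the helper names `in_A` and `sup_distance` are illustrative assumptions. It checks on a grid that $f_n = f - 1/n$ satisfies the strict inequality defining $A$ and that its uniform distance from $f$ is $1/n$.

```python
def f(x):
    """A sample element of B that is not in A: (x^2 + 1) f(x) = 1 everywhere."""
    return 1 / (x * x + 1)

def f_n(n, x):
    """The approximating sequence from the solution: f_n = f - 1/n."""
    return f(x) - 1 / n

# Uniform grid on [-1, 1]; it contains x = 0, where (x^2 + 1) f(x) is exactly 1.
grid = [-1 + 2 * i / 2000 for i in range(2001)]

def in_A(g):
    """Check the strict inequality (x^2 + 1) g(x) < 1 at every grid point."""
    return all((x * x + 1) * g(x) < 1 for x in grid)

def sup_distance(n):
    """Grid approximation of the uniform distance between f_n and f."""
    return max(abs(f_n(n, x) - f(x)) for x in grid)

print(in_A(f))                       # f itself fails the strict inequality
print(in_A(lambda x: f_n(50, x)))    # the shifted function satisfies it
print(sup_distance(50), 1 / 50)
```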

4. Let $X = C([-1, 1])$ be equipped with the uniform norm and let

\[ A = \{ f \in X : f \in C^1 \text{ and } |f(x)| + |f'(x)| \le 1 \text{ for all } x \in [-1, 1] \}. \]

Is $A$ relatively compact? Motivate your answer.

Solution. The defining property of $A$ gives in particular $\|f\|_\infty \le 1$ for


all $f \in A$, hence $A$ is bounded. Moreover, for $x, y \in [-1, 1]$ and $f \in A$,

\[ |f(x) - f(y)| \le \left| \int_x^y |f'(s)| \, ds \right| \le |x - y|, \]

which proves that $A$ is equi-Lipschitz continuous. The Arzelà-Ascoli
Theorem then implies that $A$ is relatively compact.

5. Let $X = C([-1, 1])$ be equipped with the uniform norm and let

\[ A = \{ f \in X : f \in C^1 \text{ and } |f'(x)| \le 1 \text{ for all } x \in [-1, 1] \}. \]

Is $A$ relatively compact? Motivate your answer.

Solution. The answer is no. The set $A$ contains all constant
functions $f(x) = c$. Hence, $A$ is not bounded, because the distance
between two constant functions can be made larger than any integer $n$.
Now, a result proven in the course shows that a relatively compact set is
bounded, which gives the assertion.

6. Let $X = C([0, 1])$ be equipped with the uniform norm and let
$S = \{ f \in X : f \in C^1 \text{ and } \|f'\|_\infty \le 1 \}$. Prove that $S$ is not
closed in $X$.
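
No model solution is given for this problem. One possible line of attack, offered here only as a hedged sketch and assuming that the interval is $[0, 1]$ and that $S$ consists of $C^1$ functions with $\|f'\|_\infty \le 1$, is to smooth out the corner of $|x - 1/2|$. The candidate sequence $f_n(x) = \sqrt{(x - 1/2)^2 + 1/n}$ is an assumption of this sketch, not taken from the notes. Each $f_n$ is $C^1$ with $|f_n'| < 1$, while the uniform limit $|x - 1/2|$ is not differentiable at $x = 1/2$.

```python
import math

def f_n(n, x):
    """Smooth approximation of |x - 1/2| (hypothetical candidate sequence)."""
    return math.sqrt((x - 0.5) ** 2 + 1 / n)

def df_n(n, x):
    """Derivative of f_n; its absolute value is strictly below 1."""
    return (x - 0.5) / math.sqrt((x - 0.5) ** 2 + 1 / n)

def limit(x):
    """The uniform limit, which has a corner at x = 1/2."""
    return abs(x - 0.5)

grid = [i / 1000 for i in range(1001)]

def sup_gap(n):
    """Grid approximation of the uniform distance between f_n and the limit."""
    return max(abs(f_n(n, x) - limit(x)) for x in grid)

def sup_derivative(n):
    """Grid approximation of the sup norm of f_n'."""
    return max(abs(df_n(n, x)) for x in grid)

for n in (1, 100, 10000):
    print(n, sup_gap(n), 1 / math.sqrt(n), sup_derivative(n))
```

The largest gap occurs at $x = 1/2$ and equals $1/\sqrt{n}$, so the convergence is uniform; a written proof would still need to justify that the limit is not in $C^1$.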

Examples of problems of type (c) (with model solution)


1. Let $X = C([0, 1])$ and $T : X \to X$ defined by

\[ (Tf)(x) = \int_0^x \frac{f(t)}{1 + t^2} \, dt, \quad x \in [0, 1], \ f \in X. \]

(a) Prove that $T$ is a compact operator.

Solution. Assuming $\|f\|_\infty \le 1$, we get with a simple estimate
$\|Tf\|_\infty \le 1$. Moreover, for all $x, y \in [0, 1]$,

\[ |(Tf)(x) - (Tf)(y)| \le \left| \int_y^x \frac{1}{1 + t^2} |f(t)| \, dt \right| \le \|f\|_\infty |x - y| \le |x - y|, \]

which proves equicontinuity. Then, the closed unit ball is mapped
by this operator into a pre-compact set due to the Arzelà-Ascoli Theorem.

(b) Find the operator norm of $T$.

Solution. A simple estimate shows

\[ |(Tf)(x)| \le \int_0^x \frac{|f(t)|}{1 + t^2} \, dt \le \|f\|_\infty \int_0^x \frac{1}{1 + t^2} \, dt = \|f\|_\infty \arctan x \le \|f\|_\infty \arctan 1 = \frac{\pi}{4} \|f\|_\infty, \]


hence $\|T\| \le \frac{\pi}{4}$. On the other hand, using $f(x) = 1$ for all $x \in [0, 1]$
we get

\[ \|Tf\|_\infty = \sup_{x \in [0,1]} \int_0^x \frac{1}{1 + t^2} \, dt = \sup_{x \in [0,1]} \arctan x = \frac{\pi}{4}, \]

which proves $\|T\| = \pi/4$.
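
The value $\pi/4$ can be cross-checked numerically. The sketch below is not part of the original solution; the midpoint rule, the grid sizes and the helper names `T` and `sup_norm_Tf` are illustrative assumptions. It approximates $\|Tf\|_\infty$ for the extremal function $f \equiv 1$ and for another function of sup norm one.

```python
import math

def T(f, x, cells=2000):
    """Midpoint-rule approximation of (T f)(x) = integral_0^x f(t) / (1 + t^2) dt."""
    h = x / cells
    return h * sum(f((i + 0.5) * h) / (1 + ((i + 0.5) * h) ** 2) for i in range(cells))

def sup_norm_Tf(f, points=101):
    """Approximate sup of |(T f)(x)| over a uniform grid on [0, 1]."""
    return max(abs(T(f, i / (points - 1))) for i in range(points))

def one(t):
    """The constant function 1, which attains the operator norm."""
    return 1.0

print(sup_norm_Tf(one), math.pi / 4)   # the two values should nearly agree
print(sup_norm_Tf(math.cos))           # another f with sup norm 1 stays below pi/4
```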

2. Let $X = C([0, 1])$ and $T : X \to X$ defined by

\[ (Tf)(x) = \int_0^x e^t f(t) \, dt, \quad x \in [0, 1], \ f \in X. \]

(a) Prove that $T$ is a bounded operator.

Solution. For $f \in X$ we estimate

\[ \|Tf\|_\infty = \sup_{x \in [0,1]} \left| \int_0^x e^t f(t) \, dt \right| \le \sup_{x \in [0,1]} \int_0^x e^t |f(t)| \, dt \le \|f\|_\infty \sup_{x \in [0,1]} \int_0^x e^t \, dt = \|f\|_\infty \sup_{x \in [0,1]} (e^x - 1) = (e - 1) \|f\|_\infty. \]

Hence, the definition of bounded operator is satisfied.

(b) Compute the operator norm of $T$.

Solution. Since the estimate in (a) is quite sharp, we expect $e - 1$
to be the operator norm. In order to prove that, it suffices to find
$f \in X$ with $\|f\|_\infty = 1$ such that $\|Tf\|_\infty = e - 1$. Let us choose
$f(x) \equiv 1$. We have

\[ \|Tf\|_\infty = \sup_{x \in [0,1]} \int_0^x e^t \, dt = \sup_{x \in [0,1]} (e^x - 1) = e - 1. \]

Hence, $\|T\| = e - 1$.
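
As a numerical companion to this computation, the following sketch evaluates $\|Tf\|_\infty$ for a few continuous functions of sup norm one and compares the results with $e - 1$. The test functions in `candidates`, the midpoint rule and the helper names are illustrative assumptions rather than part of the original solution.

```python
import math

def T(f, x, cells=2000):
    """Midpoint-rule approximation of (T f)(x) = integral_0^x e^t f(t) dt."""
    h = x / cells
    return h * sum(math.exp((i + 0.5) * h) * f((i + 0.5) * h) for i in range(cells))

def sup_norm_Tf(f, points=101):
    """Approximate sup of |(T f)(x)| over a uniform grid on [0, 1]."""
    return max(abs(T(f, i / (points - 1))) for i in range(points))

# Continuous test functions with sup norm 1 on [0, 1] (illustrative choices).
candidates = {
    "f = 1": lambda t: 1.0,
    "f = cos(t)": math.cos,
    "f = 1 - 2t": lambda t: 1.0 - 2.0 * t,
}
norms = {name: sup_norm_Tf(f) for name, f in candidates.items()}
for name, value in norms.items():
    print(name, value, "bound:", math.e - 1)
```

Only the constant function reaches the bound, which matches the choice made in the solution.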

3. Let $X = \ell^\infty$ and $T : X \to X$ defined by

\[ (Tx)_k = \frac{x_k}{k + 1} + \frac{x_{k+1}}{k^2 + 1}, \quad \text{for } x = (x_k)_{k=1}^{+\infty} \in X. \]

(a) Prove that $T$ is a bounded operator and compute the operator norm.



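Solution. For $x = (x_k)_{k=1}^{+\infty} \in X$ we have

\[ \|Tx\|_\infty = \sup_{k \in \mathbb{N}} |(Tx)_k| = \sup_{k \in \mathbb{N}} \left| \frac{x_k}{k + 1} + \frac{x_{k+1}}{k^2 + 1} \right| \le \|x\|_\infty \sup_{k \in \mathbb{N}} \left( \frac{1}{k + 1} + \frac{1}{k^2 + 1} \right) = \|x\|_\infty, \]

where the last supremum is attained at $k = 1$ because both fractions
decrease with $k$. The above estimate is quite sharp, so we expect
$\|T\| = 1$. In order to prove that this is the case, let us choose
$x_k = 1$ for all $k = 1, 2, 3, \ldots$. We have

\[ (Tx)_k = \frac{1}{k + 1} + \frac{1}{k^2 + 1} \]

and therefore

\[ \|Tx\|_\infty = \frac{1}{2} + \frac{1}{2} = 1, \]

which proves that $\|T\| = 1$.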
(b) Is $T$ a compact operator? Motivate your answer suitably.

Solution. For a fixed $N \in \mathbb{N}$ we define the operator

\[ (T_N x)_k = \begin{cases} (Tx)_k & \text{if } 1 \le k \le N, \\ 0 & \text{if } k > N. \end{cases} \]

For all $x \in X$ we have

\[ \|(T - T_N) x\|_\infty = \sup_{k \ge 1} |((T - T_N) x)_k| = \sup_{k > N} |(Tx)_k| \le \|x\|_\infty \sup_{k > N} \left( \frac{1}{k + 1} + \frac{1}{k^2 + 1} \right) \le \|x\|_\infty \left( \frac{1}{N + 1} + \frac{1}{N^2 + 1} \right), \]

and the above estimate shows that

\[ \|T - T_N\| \to 0 \]

as $N \to +\infty$. Now, $T_N$ is a finite-rank bounded operator for all
$N$, therefore $T_N$ is compact. We proved $T_N \to T$ in the operator
norm, and we know from a result proven in the course that the set
of compact operators is closed in the operator norm. Hence, $T$ is
compact.
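
Both parts of this problem rest on the size of the coefficient sum $\frac{1}{k+1} + \frac{1}{k^2+1}$. The sketch below tabulates it, under the assumption that a maximum over finitely many indices is an acceptable stand-in for the supremum (it is exact here, because the sum decreases with $k$); the helper names `weight` and `sup_weight` are hypothetical.

```python
def weight(k):
    """Sum of the absolute values of the two coefficients in (T x)_k."""
    return 1 / (k + 1) + 1 / (k * k + 1)

def sup_weight(start, stop=10_000):
    """Maximum of weight(k) for start <= k < stop, a finite stand-in for the supremum."""
    return max(weight(k) for k in range(start, stop))

# Part (a): the supremum over all k >= 1 is attained at k = 1.
print("bound for ||T||:", sup_weight(1))

# Part (b): the supremum over k > N bounds ||T - T_N|| and tends to zero.
for N in (1, 10, 100, 1000):
    print(N, sup_weight(N + 1), 1 / (N + 1) + 1 / (N * N + 1))
```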

Examples of problems of type (d) (with model solution)

1. Let $X = L^2((0, 1))$ and $T : X \to X$ defined by

\[ (Tf)(x) = 2x^2 f(x), \quad x \in (0, 1), \ f \in X. \]


(a) Prove that $T$ has no eigenvalues.

Solution. Assume $\lambda$ is an eigenvalue of $T$. Then, for some $f \in X$
which is not equal to zero a.e., we have

\[ 0 = (Tf - \lambda f)(x) = (2x^2 - \lambda) f(x) \quad \text{for a.e. } x \in (0, 1), \]

and since $f$ is not equal to zero a.e., there must be a set of positive
measure on which $2x^2 = \lambda$, which is impossible.
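(b) Prove that $3/2 \in \sigma(T)$.

Solution. Since there are no eigenvalues, the assertion follows if we
prove that the map

\[ X \ni f \mapsto \left( T - \frac{3}{2} I \right) f \in X \]

is not onto, or equivalently if we can find some $g \in X$ such that the
functional equation

\[ Tf - \frac{3}{2} f = g \]

has no solution $f \in X$. Now, the only possible candidate solution
for $g(x) = 1$ is

\[ f(x) = \frac{1}{2x^2 - 3/2}. \]

A simple computation shows

\[ f(x) = \frac{1}{2 \left( x - \frac{\sqrt{3}}{2} \right) \left( x + \frac{\sqrt{3}}{2} \right)}. \]

Hence,

\[ \|f\|^2_{L^2((0,1))} = \int_0^1 \frac{1}{4 \left( x - \frac{\sqrt{3}}{2} \right)^2 \left( x + \frac{\sqrt{3}}{2} \right)^2} \, dx \ge c \int_0^1 \frac{1}{\left( x - \frac{\sqrt{3}}{2} \right)^2} \, dx \]

for some $c > 0$. Please note that the function

\[ x \mapsto \frac{1}{\left( x + \frac{\sqrt{3}}{2} \right)^2} \]

is bounded from below by a positive constant on $[0, 1]$, since
$x + \frac{\sqrt{3}}{2} \le 1 + \frac{\sqrt{3}}{2}$ on said interval. A simple calculation
of the integral on the right-hand side shows the result is $+\infty$, because
$\frac{\sqrt{3}}{2} \in (0, 1)$. Hence, $f \notin X$ and the functional equation above
has no solution, which means that $g \notin \operatorname{Ran}(T - \frac{3}{2} I)$, as stated above.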
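
The divergence of the integral of $1/(x - \sqrt{3}/2)^2$ over $[0, 1]$ can be made explicit by cutting out a small neighbourhood of the singular point and letting its radius shrink. The sketch below uses the antiderivative $-1/(x - a)$; the helper name `truncated_integral` and the chosen values of `eps` are illustrative assumptions.

```python
import math

a = math.sqrt(3) / 2  # the point of (0, 1) where 2 x^2 - 3/2 vanishes

def truncated_integral(eps):
    """Exact integral of 1 / (x - a)^2 over [0, 1] minus (a - eps, a + eps).

    Valid for 0 < eps < 1 - a, so that the removed interval lies inside [0, 1].
    """
    left = 1 / eps - 1 / a           # integral over [0, a - eps]
    right = 1 / eps - 1 / (1 - a)    # integral over [a + eps, 1]
    return left + right

for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    # The value grows like 2 / eps, so it is unbounded as eps shrinks.
    print(eps, truncated_integral(eps))
```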


References
[1] Robert G. Bartle and Donald R. Sherbert. Introduction to real
analysis. John Wiley & Sons, Inc., New York, second edi-
tion, 1992. URL: http://iuuk.mff.cuni.cz/~andrew/bartle_
introduction-to-real-analysis-new-edition.pdf.

[2] Haim Brezis. Functional Analysis, Sobolev Spaces, and Partial Differential
Equations. Springer, 2011.

[3] John K. Hunter and Bruno Nachtergaele. Applied analysis. World Scientific
Publishing Co., Inc., River Edge, NJ, 2001.
