
What if?

An Open Introduction to
Non-Classical Logics

F20
What if?
The Open Logic Project

Instigator
Richard Zach, University of Calgary

Editorial Board
Aldo Antonelli,† University of California, Davis
Andrew Arana, Université de Lorraine
Jeremy Avigad, Carnegie Mellon University
Tim Button, University College London
Walter Dean, University of Warwick
Gillian Russell, Dianoia Institute of Philosophy
Nicole Wyatt, University of Calgary
Audrey Yap, University of Victoria

Contributors
Samara Burns, Columbia University
Dana Hägg, University of Calgary
Zesen Qian, Carnegie Mellon University
What if?
An Open Introduction to
Non-Classical Logics

Remixed by Audrey Yap & Richard Zach

Fall 2020
The Open Logic Project would like to acknowledge the gener-
ous support of the Taylor Institute of Teaching and Learning of
the University of Calgary, and the Alberta Open Educational Re-
sources (ABOER) Initiative, which is made possible through an
investment from the Alberta government.

Cover illustrations by Matthew Leadbeater, used under a Creative
Commons Attribution-NonCommercial 4.0 International License.

Typeset in Baskervald X and Nimbus Sans by LaTeX.

This version of What if? is revision 8be15d6 (2021-07-11), with
content generated from Open Logic Text revision 7219905
(2021-07-14). Free download at:
https://builds.openlogicproject.org/courses/what-if/

What if? by Audrey Yap & Richard Zach is licensed under a
Creative Commons Attribution 4.0 International License. It is
based on The Open Logic Text by the Open Logic Project, used
under a Creative Commons Attribution 4.0 International License.
Contents
Preface xiii

Introduction xiv

I Remind me, how does logic work again? 1

1 Syntax and Semantics 2


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . 2
1.2 Propositional Formulas . . . . . . . . . . . . . . . 4
1.3 Preliminaries . . . . . . . . . . . . . . . . . . . . 6
1.4 Valuations and Satisfaction . . . . . . . . . . . . 8
1.5 Semantic Notions . . . . . . . . . . . . . . . . . . 10
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2 Axiomatic Derivations 12
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . 12
2.2 Axiomatic Derivations . . . . . . . . . . . . . . . 14
2.3 Rules and Derivations . . . . . . . . . . . . . . . 16
2.4 Axiom and Rules for the Propositional Connectives 18
2.5 Examples of Derivations . . . . . . . . . . . . . . 19
2.6 Proof-Theoretic Notions . . . . . . . . . . . . . . 21
2.7 The Deduction Theorem . . . . . . . . . . . . . . 23
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 25


3 Sequent calculus 27
3.1 The Sequent Calculus . . . . . . . . . . . . . . . 27
3.2 Rules and Derivations . . . . . . . . . . . . . . . 28
3.3 Propositional Rules . . . . . . . . . . . . . . . . . 30
3.4 Structural Rules . . . . . . . . . . . . . . . . . . . 30
3.5 Derivations . . . . . . . . . . . . . . . . . . . . . 31
3.6 Examples of Derivations . . . . . . . . . . . . . . 33
3.7 Proof-Theoretic Notions . . . . . . . . . . . . . . 38
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 41

II Does everything have to be true or false? 42

4 Syntax and Semantics 43


4.1 Introduction . . . . . . . . . . . . . . . . . . . . . 43
4.2 Languages and Connectives . . . . . . . . . . . . 44
4.3 Formulas . . . . . . . . . . . . . . . . . . . . . . . 45
4.4 Matrices . . . . . . . . . . . . . . . . . . . . . . . 46
4.5 Valuations and Satisfaction . . . . . . . . . . . . 47
4.6 Semantic Notions . . . . . . . . . . . . . . . . . . 48
4.7 Many-valued logics as sublogics of C . . . . . . . 49
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 50

5 Three-valued Logics 51
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . 51
5.2 Łukasiewicz logic . . . . . . . . . . . . . . . . . . 51
5.3 Kleene logics . . . . . . . . . . . . . . . . . . . . 55
5.4 Gödel logics . . . . . . . . . . . . . . . . . . . . . 58
5.5 Designating not just T . . . . . . . . . . . . . . . 59
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 62

6 Sequent Calculus 66
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . 66
6.2 Rules and Derivations . . . . . . . . . . . . . . . 67
6.3 Structural Rules . . . . . . . . . . . . . . . . . . . 69
6.4 Propositional Rules for Selected Logics . . . . . . 69

7 Infinite-valued Logics 74
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . 74
7.2 Łukasiewicz logic . . . . . . . . . . . . . . . . . . 75
7.3 Gödel logics . . . . . . . . . . . . . . . . . . . . . 76
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 78

III But isn’t truth relative (to a world)? 79

8 Syntax and Semantics 80


8.1 Introduction . . . . . . . . . . . . . . . . . . . . . 80
8.2 The Language of Basic Modal Logic . . . . . . . 82
8.3 Simultaneous Substitution . . . . . . . . . . . . . 83
8.4 Relational Models . . . . . . . . . . . . . . . . . . 85
8.5 Truth at a World . . . . . . . . . . . . . . . . . . 86
8.6 Truth in a Model . . . . . . . . . . . . . . . . . . 88
8.7 Validity . . . . . . . . . . . . . . . . . . . . . . . . 88
8.8 Tautological Instances . . . . . . . . . . . . . . . 89
8.9 Schemas and Validity . . . . . . . . . . . . . . . . 92
8.10 Entailment . . . . . . . . . . . . . . . . . . . . . . 94
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 96

9 Axiomatic Derivations 99
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . 99
9.2 Proofs in K . . . . . . . . . . . . . . . . . . . . . . 101
9.3 Derived Rules . . . . . . . . . . . . . . . . . . . . 103
9.4 More Proofs in K . . . . . . . . . . . . . . . . . . 106
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 107

10 Modal Tableaux 109


10.1 Introduction . . . . . . . . . . . . . . . . . . . . . 109
10.2 Rules for K . . . . . . . . . . . . . . . . . . . . . 110
10.3 Tableaux for K . . . . . . . . . . . . . . . . . . . 113
10.4 Soundness for K . . . . . . . . . . . . . . . . . . . 114
10.5 Rules for Other Accessibility Relations . . . . . . 118
10.6 Soundness for Additional Rules . . . . . . . . . . 119

10.7 Simple Tableaux for S5 . . . . . . . . . . . . . . . 122


10.8 Completeness for K . . . . . . . . . . . . . . . . . 123
10.9 Countermodels from Tableaux . . . . . . . . . . . 126
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 129

IV Is this really necessary? 131

11 Frame Definability 132


11.1 Introduction . . . . . . . . . . . . . . . . . . . . . 132
11.2 Properties of Accessibility Relations . . . . . . . 133
11.3 Frames . . . . . . . . . . . . . . . . . . . . . . . . 136
11.4 Frame Definability . . . . . . . . . . . . . . . . . 137
11.5 First-order Definability . . . . . . . . . . . . . . . 140
11.6 Equivalence Relations and S5 . . . . . . . . . . . 141
11.7 Second-order Definability . . . . . . . . . . . . . 144
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 147

12 More Axiomatic Derivations 149


12.1 Normal Modal Logics . . . . . . . . . . . . . . . 149
12.2 Derivations and Modal Systems . . . . . . . . . . 151
12.3 Dual Formulas . . . . . . . . . . . . . . . . . . . . 153
12.4 Proofs in Modal Systems . . . . . . . . . . . . . . 154
12.5 Soundness . . . . . . . . . . . . . . . . . . . . . . 156
12.6 Showing Systems are Distinct . . . . . . . . . . . 156
12.7 Derivability from a Set of Formulas . . . . . . . . 158
12.8 Properties of Derivability . . . . . . . . . . . . . . 159
12.9 Consistency . . . . . . . . . . . . . . . . . . . . . 159
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 160

13 Completeness and Canonical Models 161


13.1 Introduction . . . . . . . . . . . . . . . . . . . . . 161
13.2 Complete 𝛴 -Consistent Sets . . . . . . . . . . . . 163
13.3 Lindenbaum’s Lemma . . . . . . . . . . . . . . . 164
13.4 Modalities and Complete Consistent Sets . . . . . 166
13.5 Canonical Models . . . . . . . . . . . . . . . . . . 169

13.6 The Truth Lemma . . . . . . . . . . . . . . . . . 169


13.7 Determination and Completeness for K . . . . . . 170
13.8 Frame Completeness . . . . . . . . . . . . . . . . 172
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 175

14 Modal Sequent Calculus 177


14.1 Introduction . . . . . . . . . . . . . . . . . . . . . 177
14.2 Rules for K . . . . . . . . . . . . . . . . . . . . . 177
14.3 Sequent Derivations for K . . . . . . . . . . . . . 178
14.4 Rules for Other Accessibility Relations . . . . . . 180
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 181

V But you can’t tell me what to think! 183

15 Epistemic Logics 184


15.1 Introduction . . . . . . . . . . . . . . . . . . . . . 184
15.2 The Language of Epistemic Logic . . . . . . . . . 185
15.3 Relational Models . . . . . . . . . . . . . . . . . . 187
15.4 Truth at a World . . . . . . . . . . . . . . . . . . 188
15.5 Accessibility Relations and Epistemic Principles . 190
15.6 Bisimulations . . . . . . . . . . . . . . . . . . . . 191
15.7 Public Announcement Logic . . . . . . . . . . . . 193
15.8 Semantics of Public Announcement Logic . . . . 195

VI Is this going to go on forever? 198

16 Temporal Logics 199


16.1 Introduction . . . . . . . . . . . . . . . . . . . . . 199
16.2 Semantics for Temporal Logic . . . . . . . . . . . 200
16.3 Properties of Temporal Frames . . . . . . . . . . 203
16.4 Additional Operators for Temporal Logic . . . . 204
16.5 Possible Histories . . . . . . . . . . . . . . . . . . 204

VII What if things were different? 207

17 Introduction 208
17.1 The Material Conditional . . . . . . . . . . . . . 208
17.2 Paradoxes of the Material Conditional . . . . . . 210
17.3 The Strict Conditional . . . . . . . . . . . . . . . 211
17.4 Counterfactuals . . . . . . . . . . . . . . . . . . . 213
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 214

18 Minimal Change Semantics 216


18.1 Introduction . . . . . . . . . . . . . . . . . . . . . 216
18.2 Sphere Models . . . . . . . . . . . . . . . . . . . . 218
18.3 Truth and Falsity of Counterfactuals . . . . . . . 220
18.4 Antecedent Strengthening . . . . . . . . . 221
18.5 Transitivity . . . . . . . . . . . . . . . . . . . . . 223
18.6 Contraposition . . . . . . . . . . . . . . . . . . . 225
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 225

VIII How can it be true if you can’t prove it? 227

19 Introduction 228
19.1 Constructive Reasoning . . . . . . . . . . . . . . 228
19.2 Syntax of Intuitionistic Logic . . . . . . . . . . . 230
19.3 The Brouwer-Heyting-Kolmogorov Interpretation 231
19.4 Natural Deduction . . . . . . . . . . . . . . . . . 235
19.5 Axiomatic Derivations . . . . . . . . . . . . . . . 238
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 240

20 Semantics 241
20.1 Introduction . . . . . . . . . . . . . . . . . . . . . 241
20.2 Relational models . . . . . . . . . . . . . . . . . . 242
20.3 Semantic Notions . . . . . . . . . . . . . . . . . . 244
20.4 Topological Semantics . . . . . . . . . . . . . . . 245
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 247

IX Wait, hear me out: what if it’s both true and false? 248

21 Paraconsistent logics 249

X Appendices 250

A Sets 251
A.1 Extensionality . . . . . . . . . . . . . . . . . . . . 251
A.2 Subsets and Power Sets . . . . . . . . . . . . . . . 253
A.3 Some Important Sets . . . . . . . . . . . . . . . . 254
A.4 Unions and Intersections . . . . . . . . . . . . . . 256
A.5 Pairs, Tuples, Cartesian Products . . . . . . . . . 259
A.6 Russell’s Paradox . . . . . . . . . . . . . . . . . . 261
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 263

B Relations 264
B.1 Relations as Sets . . . . . . . . . . . . . . . . . . 264
B.2 Special Properties of Relations . . . . . . . . . . . 266
B.3 Equivalence Relations . . . . . . . . . . . . . . . 268
B.4 Orders . . . . . . . . . . . . . . . . . . . . . . . . 269
B.5 Graphs . . . . . . . . . . . . . . . . . . . . . . . . 272
B.6 Operations on Relations . . . . . . . . . . . . . . 274
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 275

C Proofs 276
C.1 Introduction . . . . . . . . . . . . . . . . . . . . . 276
C.2 Starting a Proof . . . . . . . . . . . . . . . . . . . 278
C.3 Using Definitions . . . . . . . . . . . . . . . . . . 278
C.4 Inference Patterns . . . . . . . . . . . . . . . . . . 281
C.5 An Example . . . . . . . . . . . . . . . . . . . . . 289
C.6 Another Example . . . . . . . . . . . . . . . . . . 293
C.7 Proof by Contradiction . . . . . . . . . . . . . . . 295
C.8 Reading Proofs . . . . . . . . . . . . . . . . . . . 300
C.9 I Can’t Do It! . . . . . . . . . . . . . . . . . . . . 302

C.10 Other Resources . . . . . . . . . . . . . . . . . . 304


Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 305

D Induction 306
D.1 Introduction . . . . . . . . . . . . . . . . . . . . . 306
D.2 Induction on N . . . . . . . . . . . . . . . . . . . 307
D.3 Strong Induction . . . . . . . . . . . . . . . . . . 310
D.4 Inductive Definitions . . . . . . . . . . . . . . . . 311
D.5 Structural Induction . . . . . . . . . . . . . . . . 314
D.6 Relations and Functions . . . . . . . . . . . . . . 316
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 320

E The Greek Alphabet 321

Bibliography 322

About the Open Logic Project 323


Preface
This is an introductory textbook on non-classical logics. Or
rather, it will be when it’s done. Right now it’s a work in progress.
We will use it as the main text in our courses on non-classical
logic at the Universities of Victoria and Calgary, respectively, in
Fall 2020. It is based on material from the Open Logic Project.
The main text assumes familiarity with some elementary set
theory and the basics of (propositional) logic. The textbook Sets,
Logic, Computation, which is also based on the OLP, provides this
background. But the required material is also included here: the
basics of classical propositional logic in part I, and the material
on set theory, as well as some introductory material on proofs
and induction, in part X.

Introduction
Classical logic is very useful, widely used, has a long history,
and is relatively simple. But it has limitations: for instance, it
does not (and cannot) deal well with certain locutions of natural
language such as tense and subjunctive mood, nor with certain
constructions such as “Audrey knows that p.” It makes certain
assumptions, for instance that every sentence is either true or
false and never both. It pronounces some formulas to be tautologies
and some arguments to be valid, even though the English sentences
and arguments they formalize are ones that some do not consider
true or valid, at least not obviously. Thus it seems there
are examples where classical logic is not expressive enough, or
even where classical logic gets things wrong.
This book discusses some alternative, non-classical logics.
These non-classical logics are either more expressive than clas-
sical logic or have different tautologies or valid arguments. For
instance, temporal logic extends classical logic by operators that
express tense; conditional logics have an additional, different con-
ditional (“if—then”) that does not suffer from the so-called para-
doxes of the material conditional. All of these logics extend classi-
cal logic by new operators or connectives, and fall into the broad
category of intensional logics. Other logics such as many-valued,
intuitionistic, and paraconsistent logics have the same basic con-
nectives as classical logic, but different inferences count as valid.
In many-valued and intuitionistic logic, for instance, the law of
excluded middle A ∨ ¬A fails to hold; in paraconsistent logic the
inference ex contradictione quodlibet, ⊥ ⊨ A for arbitrary A, fails.


After we review the basics of classical propositional logic in
part I, we begin our discussion of non-classical logics in II. There
we will relax one assumption classical logic makes: that every-
thing is either true or false. There are good reasons to think
that some sentences of English are neither—they have some in-
termediate truth value. Examples of this are sentences involving
vagueness (“Mary is rich”), sentences the truth of which is not
yet determined (“There will be a sea battle tomorrow”), and—an
important case for philosophers—sentences that are paradoxical
such as “This sentence is false.” Some of the earliest non-classical
logics allow truth values in addition to the classical “true” and
“false”; they are called many-valued. We cover them in part II.
Modal logics are extensions of classical logic by the operators
□ (“box”) and ◇ (“diamond”), which attach to formulas. Intu-
itively, □ may be read as “necessarily” and ◇ as “possibly,” so
□p is “p is necessarily true” and ◇p is “p is possibly true.” As
necessity and possibility are fundamental metaphysical notions,
modal logic is obviously of great philosophical interest. It allows
the formalization of metaphysical principles such as “□p → p” (if
p is necessary, it is true) or “◇p → □◇p” (if p is possible, it is
necessarily possible).
For the logic which corresponds to the interpretation of □ as
“necessarily,” the semantics is relatively simple: instead of assign-
ing truth values to propositional variables, an interpretation M
assigns a set of “worlds” to them—intuitively, those worlds w at
which p is interpreted as true. On the basis of such an interpre-
tation, we can define a satisfaction relation. The definition of
this satisfaction relation makes □A satisfied at a world w iff A is
satisfied at all worlds: M,w ⊩ □A iff M,v ⊩ A for all worlds v .
This corresponds to Leibniz’s idea that what’s necessarily true is
what’s true in every possible world.
“Necessarily” is not the only way to interpret the □ operator,
but it is the standard one—“necessarily” and “possibly” are the
so-called alethic modalities. Other interpretations read □ as “it
is known (by some person A) that,” as “some person A believes

that,” “it ought to be the case that,” or “it will always be true that.”
These are epistemic, doxastic, deontic, and temporal modalities,
respectively. Different interpretations of □ will make different for-
mulas logically true, and pronounce different inferences as valid.
For instance, everything necessary and everything known is true,
so □A →A is a logical truth on the alethic and epistemic interpre-
tations. By contrast, not everything believed nor everything that
ought to be the case actually is the case, so □A → A is not a log-
ical truth on the doxastic or deontic interpretations. We discuss
modal logics in general in parts III and IV and epistemic logics
in particular in part V.
In order to deal with different interpretations of the modal op-
erators, the semantics is extended by a relation between worlds,
the so-called accessibility relation. Then M,w ⊩ □A if M,v ⊩ A
for all worlds v which are accessible from w. The resulting se-
mantics is very versatile and powerful, and the basic idea can be
used to provide semantic interpretations for logics based on other
intensional operators. One such logic is a close relative of modal
logic called temporal logic. Instead of having just one modality
□ (plus its dual ◇), it has temporal operators such as “always p,”
“p will be true,” etc. We study these in part VI.
Whereas the material conditional is best read as an English
indicative conditional (“If p is true then q is true”), subjunctive
conditionals are in the subjunctive mood: “if p were true then
q would be true.” While a material conditional with a false an-
tecedent is true, a subjunctive conditional need not be, e.g., “if
humans had tails, they would be able to fly.” In part VII, we
discuss logics of counterfactual conditionals.
Intuitionistic logic is a constructive logic based on L. E.
J. Brouwer’s branch of constructive mathematics. Intuitionistic
logic is philosophically interesting for this reason—it plays an
important role in constructive accounts of mathematics—but was
also proposed as a logic superior to classical logic by the influen-
tial English philosopher Michael Dummett in the 20th century.
As mentioned above, intuitionistic logic is non-classical because
it has fewer valid inferences and theorems, e.g., A ∨ ¬A and

¬¬A → A fail in general. Intuitively, this is a consequence of the


intuitionist principle that something shouldn’t count as true—you
should not assert it—unless you have a proof of it. And obviously
there are cases where we neither have a proof of A nor a proof
of ¬A. Intuitionistic logic can be given a relational semantics
very much like modal logic. We discuss it in part VIII.
One of the weirdest features of classical logic is the princi-
ple of explosion: from a contradiction, anything follows. This
principle flows from the way we set up the semantics of classical
logic, but it is very counterintuitive and goes against what we ac-
tually do when we reason. After all, once you discover that some
things you believe are contradictory, you don’t (usually) go on
to conclude arbitrary claims since they follow from your beliefs!
This has led logicians to develop systems of logic in which the
inference ex contradictione, quodlibet is blocked. Some of these are
simply further weakenings of classical logic, designed just to get
rid of explosion. Some are more philosophically motivated. Part
of what makes the principle of explosion weird is that when it pro-
nounces that A and ¬A together entail B, there is no connection
at all between the premises and the conclusion. And shouldn’t
there be such a connection in any valid argument? Shouldn’t the
premises be relevant to the conclusion? This leads to so-called
relevant (or relevance) logic. Another motivation is the philosoph-
ical position called dialetheism: the belief that there can be true
contradictions. (If you believe that contradictions can be true,
then you would not want them to entail anything whatsoever,
e.g., something false.) All of these logics fall under the umbrella
term paraconsistent. We discuss paraconsistent logics in part IX.
PART I

Remind me,
how does
logic work
again?

CHAPTER 1

Syntax and
Semantics
1.1 Introduction
Propositional logic deals with formulas that are built from propo-
sitional variables using the propositional connectives ¬, ∧, ∨, →,
and ↔. Intuitively, a propositional variable p stands for a sen-
tence or proposition that is true or false. Whenever the “truth
value” of the propositional variable in a formula is determined,
so is the truth value of any formulas formed from them using
propositional connectives. We say that propositional logic is truth
functional, because its semantics is given by functions of truth val-
ues. In particular, in propositional logic we leave out of consider-
ation any further determination of truth and falsity, e.g., whether
something is necessarily true rather than just contingently true,
or whether something is known to be true, or whether something
is true now rather than was true or will be true. We only consider
two truth values true (T) and false (F), and so exclude from dis-
cussion the possibility that a statement may be neither true nor
false, or only half true. We also concentrate only on connectives
where the truth value of a formula built from them is completely
determined by the truth values of its parts (and not, say, on its
meaning). In particular, whether the truth value of conditionals


in English is truth functional in this sense is contentious. The ma-


terial conditional → is; other logics deal with conditionals that
are not truth functional.
In order to develop the theory and metatheory of truth-
functional propositional logic, we must first define the syntax
and semantics of its expressions. We will describe one way of
constructing formulas from propositional variables using the con-
nectives. Alternative definitions are possible. Other systems will
choose different symbols, will select different sets of connectives
as primitive, and will use parentheses differently (or even not
at all, as in the case of so-called Polish notation). What all ap-
proaches have in common, though, is that the formation rules
define the set of formulas inductively. If done properly, every ex-
pression can result essentially in only one way according to the
formation rules. The inductive definition resulting in expressions
that are uniquely readable means we can give meanings to these
expressions using the same method—inductive definition.
Giving the meaning of expressions is the domain of seman-
tics. The central concept in semantics for propositional logic is
that of satisfaction in a valuation. A valuation v assigns truth val-
ues T, F to the propositional variables. Any valuation determines
a truth value v(A) for any formula A. A formula is satisfied in
a valuation v iff v(A) = T—we write this as v ⊨ A. This relation
can also be defined by induction on the structure of A, using the
truth functions for the logical connectives to define, say, satisfac-
tion of A ∧ B in terms of satisfaction (or not) of A and B.
On the basis of the satisfaction relation v ⊨ A for sentences
we can then define the basic semantic notions of tautology, en-
tailment, and satisfiability. A formula is a tautology, ⊨ A, if every
valuation satisfies it, i.e., v(A) = T for any v. It is entailed by
a set of formulas, 𝛤 ⊨ A, if every valuation that satisfies all the
formulas in 𝛤 also satisfies A. And a set of formulas is satisfi-
able if some valuation satisfies all formulas in it at the same time.
Because formulas are inductively defined, and satisfaction is in
turn defined by induction on the structure of formulas, we can
use induction to prove properties of our semantics and to relate

the semantic notions defined.

1.2 Propositional Formulas


Formulas of propositional logic are built up from propositional
variables and the propositional constant ⊥ using logical connectives.

1. A countably infinite set At0 of propositional variables p0 ,


p1 , . . .

2. The propositional constant for falsity ⊥.

3. The logical connectives: ¬ (negation), ∧ (conjunction), ∨


(disjunction), → (conditional)

4. Punctuation marks: (, ), and the comma.

We denote this language of propositional logic by L0 .


In addition to the primitive connectives introduced above,
we also use the following defined symbols: ↔ (biconditional), ⊤
(truth)
A defined symbol is not officially part of the language, but
is introduced as an informal abbreviation: it allows us to abbre-
viate formulas which would, if we only used primitive symbols,
get quite long. This is obviously an advantage. The bigger ad-
vantage, however, is that proofs become shorter. If a symbol is
primitive, it has to be treated separately in proofs. The more
primitive symbols, therefore, the longer our proofs.
You may be familiar with different terminology and symbols
than the ones we use above. Logic texts (and teachers) commonly
use either ∼, ¬, and ! for “negation”, ∧, ·, and & for “conjunction”.
Commonly used symbols for the “conditional” or “implication”
are →, ⇒, and ⊃. Symbols for “biconditional,” “bi-implication,”
or “(material) equivalence” are ↔, ⇔, and ≡. The ⊥ symbol is
variously called “falsity,” “falsum,” “absurdity,” or “bottom.” The
⊤ symbol is variously called “truth,” “verum,” or “top.”

Definition 1.1 (Formula). The set Frm(L0 ) of formulas of


propositional logic is defined inductively as follows:

1. ⊥ is an atomic formula.

2. Every propositional variable pi is an atomic formula.

3. If A is a formula, then ¬A is a formula.

4. If A and B are formulas, then (A ∧ B) is a formula.

5. If A and B are formulas, then (A ∨ B) is a formula.

6. If A and B are formulas, then (A → B) is a formula.

7. Nothing else is a formula.

The definition of formulas is an inductive definition. Essen-


tially, we construct the set of formulas in infinitely many stages.
In the initial stage, we pronounce all atomic formulas to be for-
mulas; this corresponds to the first few cases of the definition, i.e.,
the cases for ⊥, pi . “Atomic formula” thus means any formula of
this form.
The other cases of the definition give rules for constructing
new formulas out of formulas already constructed. At the second
stage, we can use them to construct formulas out of atomic for-
mulas. At the third stage, we construct new formulas from the
atomic formulas and those obtained in the second stage, and so
on. A formula is anything that is eventually constructed at such
a stage, and nothing else.

Definition 1.2. Formulas constructed using the defined opera-


tors are to be understood as follows:

1. ⊤ abbreviates ¬⊥.

2. A ↔ B abbreviates (A → B) ∧ (B → A).
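
The inductive character of Definition 1.1 can be mirrored directly
in a programming language as a recursive data type. Here is a
minimal sketch in Haskell (our own illustration, not part of the
text; the names Formula, Var, etc. are ours):

    -- One constructor per clause of Definition 1.1.
    data Formula
      = Bot                    -- the propositional constant ⊥
      | Var Int                -- propositional variables p0, p1, ...
      | Not Formula            -- ¬A
      | And Formula Formula    -- (A ∧ B)
      | Or Formula Formula     -- (A ∨ B)
      | Cond Formula Formula   -- (A → B)
      deriving (Eq, Show)

    -- The defined symbols of Definition 1.2, as abbreviations:
    top :: Formula
    top = Not Bot                         -- ⊤ abbreviates ¬⊥

    iff :: Formula -> Formula -> Formula
    iff a b = And (Cond a b) (Cond b a)   -- A ↔ B abbreviates (A → B) ∧ (B → A)

The clause “nothing else is a formula” corresponds to the fact
that values of this type can only be built using these six
constructors.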

Definition 1.3 (Syntactic identity). The symbol ≡ expresses


syntactic identity between strings of symbols, i.e., A ≡ B iff A
and B are strings of symbols of the same length and which con-
tain the same symbol in each place.

The ≡ symbol may be flanked by strings obtained by con-


catenation, e.g., A ≡ (B ∨ C ) means: the string of symbols A is
the same string as the one obtained by concatenating an opening
parenthesis, the string B, the ∨ symbol, the string C , and a clos-
ing parenthesis, in this order. If this is the case, then we know
that the first symbol of A is an opening parenthesis, A contains
B as a substring (starting at the second symbol), that substring
is followed by ∨, etc.

1.3 Preliminaries
Theorem 1.4 (Principle of induction on formulas). If some
property P holds for all the atomic formulas and is such that

1. it holds for ¬A whenever it holds for A;

2. it holds for (A ∧ B) whenever it holds for A and B;

3. it holds for (A ∨ B) whenever it holds for A and B;

4. it holds for (A → B) whenever it holds for A and B;

then P holds for all formulas.

Proof. Let S be the collection of all formulas with property P .


Clearly S ⊆ Frm(L0 ). S satisfies all the conditions of Defini-
tion 1.1: it contains all atomic formulas and is closed under
the logical operators. Frm(L0 ) is the smallest such class, so
Frm(L0 ) ⊆ S . So Frm(L0 ) = S , and every formula has prop-
erty P . □

Proposition 1.5. Any formula in Frm(L0 ) is balanced, in that it


has as many left parentheses as right ones.

Proposition 1.6. No proper initial segment of a formula is a formula.

Proposition 1.7 (Unique Readability). Any formula A in


Frm(L0 ) has exactly one parsing as one of the following

1. ⊥.

2. pn for some pn ∈ At0 .

3. ¬B for some formula B.

4. (B ∧ C ) for some formulas B and C .

5. (B ∨ C ) for some formulas B and C .

6. (B → C ) for some formulas B and C .

Moreover, this parsing is unique.

Proof. By induction on A. For instance, suppose that A has two


distinct readings as (B →C ) and (B ′ →C ′). Then B and B ′ must
be the same (or else one would be a proper initial segment of the
other); so if the two readings of A are distinct it must be because
C and C ′ are distinct readings of the same sequence of symbols,
which is impossible by the inductive hypothesis. □

Definition 1.8 (Uniform Substitution). If A and B are formu-


las, and pi is a propositional variable, then A[B/pi ] denotes the
result of replacing each occurrence of pi by an occurrence of B
in A; similarly, the simultaneous substitution of p1 , . . . , pn by
formulas B 1 , . . . , Bn is denoted by A[B 1 /p1 , . . . ,Bn /pn ].
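
Substitution is itself defined by recursion on the structure of A
(this is what Problem 1.3 asks to be made rigorous). A sketch in
Haskell, over the hypothetical Formula type sketched above:

    -- subst a b i computes A[B/p_i]: replace every occurrence of p_i in a by b.
    subst :: Formula -> Formula -> Int -> Formula
    subst Bot        _ _ = Bot
    subst (Var n)    b i = if n == i then b else Var n
    subst (Not a)    b i = Not (subst a b i)
    subst (And a c)  b i = And (subst a b i) (subst c b i)
    subst (Or a c)   b i = Or (subst a b i) (subst c b i)
    subst (Cond a c) b i = Cond (subst a b i) (subst c b i)

Simultaneous substitution A[B1/p1, . . . , Bn/pn] would replace all
the variables in a single pass rather than by iterating subst.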

1.4 Valuations and Satisfaction

Definition 1.9 (Valuations). Let {T, F} be the set of the two


truth values, “true” and “false.” A valuation for L0 is a func-
tion v assigning either T or F to the propositional variables of
the language, i.e., v : At0 → {T, F}.

Definition 1.10. Given a valuation v, define the evaluation func-


tion v : Frm(L0 ) → {T, F} inductively by:

v(⊥) = F;
v(pn ) = v(pn );
v(¬A) = T if v(A) = F, and F otherwise;
v(A ∧ B) = T if v(A) = T and v(B) = T, and F if v(A) = F or v(B) = F;
v(A ∨ B) = T if v(A) = T or v(B) = T, and F if v(A) = F and v(B) = F;
v(A → B) = T if v(A) = F or v(B) = T, and F if v(A) = T and v(B) = F.

The clauses correspond to the following truth tables:

A   ¬A
T   F
F   T

A   B   A ∧ B
T   T   T
T   F   F
F   T   F
F   F   F

A   B   A ∨ B
T   T   T
T   F   T
F   T   T
F   F   F

A B A→B
T T T
T F F
F T T
F F T
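
Since Definition 1.10 is a definition by recursion on formulas, it
translates directly into a recursive function. A sketch in Haskell,
continuing the hypothetical Formula type sketched in section 1.2:

    -- Truth values and valuations.
    data TV = T | F deriving (Eq, Show)

    type Valuation = Int -> TV    -- the value assigned to each variable p_n

    -- The evaluation function of Definition 1.10.
    eval :: Valuation -> Formula -> TV
    eval _ Bot        = F
    eval v (Var n)    = v n
    eval v (Not a)    = if eval v a == F then T else F
    eval v (And a b)  = if eval v a == T && eval v b == T then T else F
    eval v (Or a b)   = if eval v a == T || eval v b == T then T else F
    eval v (Cond a b) = if eval v a == F || eval v b == T then T else F

Each clause of eval is just the corresponding truth table read off
row by row.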

Theorem 1.11 (Local Determination). Suppose that v1 and v2


are valuations that agree on the propositional letters occurring in A, i.e.,
v1 (pn ) = v2 (pn ) whenever pn occurs in A. Then v1 and
v2 also agree on A, i.e., v1 (A) = v2 (A).

Proof. By induction on A. □

Definition 1.12 (Satisfaction). Using the evaluation function,


we can define the notion of satisfaction of a formula A by a valu-
ation v, v ⊨ A, inductively as follows. (We write v ⊭ A to mean
“not v ⊨ A.”)

1. A ≡ ⊥: v ⊭ A.

2. A ≡ pi : v ⊨ A iff v(pi ) = T.

3. A ≡ ¬B: v ⊨ A iff v ⊭ B.

4. A ≡ (B ∧ C ): v ⊨ A iff v ⊨ B and v ⊨ C .

5. A ≡ (B ∨ C ): v ⊨ A iff v ⊨ B or v ⊨ C (or both).

6. A ≡ (B → C ): v ⊨ A iff v ⊭ B or v ⊨ C (or both).

If 𝛤 is a set of formulas, v ⊨ 𝛤 iff v ⊨ A for every A ∈ 𝛤.

Proposition 1.13. v ⊨ A iff v(A) = T.

Proof. By induction on A. □

1.5 Semantic Notions


We define the following semantic notions:

Definition 1.14. 1. A formula A is satisfiable if for some v,


v ⊨ A; it is unsatisfiable if for no v, v ⊨ A;

2. A formula A is a tautology if v ⊨ A for all valuations v ;

3. A formula A is contingent if it is satisfiable but not a tautol-


ogy;

4. If 𝛤 is a set of formulas, 𝛤 ⊨ A (“𝛤 entails A”) if and only


if v ⊨ A for every valuation v for which v ⊨ 𝛤.

5. If 𝛤 is a set of formulas, 𝛤 is satisfiable if there is a valua-


tion v for which v ⊨ 𝛤, and 𝛤 is unsatisfiable otherwise.
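
For finite sets of formulas these notions can be checked
mechanically: by Theorem 1.11 only the values of the finitely many
variables occurring in the formulas matter, so we can simply try
all combinations. A brute-force sketch in Haskell, building on the
hypothetical Formula and eval above (sat v a corresponds to v ⊨ A
via Proposition 1.13):

    import Data.List (nub)

    sat :: Valuation -> Formula -> Bool
    sat v a = eval v a == T            -- v ⊨ A iff v(A) = T (Proposition 1.13)

    -- The variables occurring in a formula.
    vars :: Formula -> [Int]
    vars Bot        = []
    vars (Var n)    = [n]
    vars (Not a)    = vars a
    vars (And a b)  = nub (vars a ++ vars b)
    vars (Or a b)   = nub (vars a ++ vars b)
    vars (Cond a b) = nub (vars a ++ vars b)

    -- All ways of assigning T or F to a finite list of variables
    -- (variables not listed are sent to F; by Theorem 1.11 this is harmless).
    valuations :: [Int] -> [Valuation]
    valuations []       = [const F]
    valuations (n : ns) =
      [ \m -> if m == n then t else v m | t <- [T, F], v <- valuations ns ]

    tautology :: Formula -> Bool
    tautology a = all (\v -> sat v a) (valuations (vars a))

    satisfiable :: [Formula] -> Bool
    satisfiable gamma =
      any (\v -> all (sat v) gamma) (valuations (nub (concatMap vars gamma)))

    -- Entailment for a finite 𝛤, clause (4) of Definition 1.14.
    entails :: [Formula] -> Formula -> Bool
    entails gamma a =
      all (\v -> not (all (sat v) gamma) || sat v a)
          (valuations (nub (concatMap vars (a : gamma))))

This check is exponential in the number of variables, but it shows
that for finite 𝛤 all of these semantic notions are decidable.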

Proposition 1.15. 1. A is a tautology if and only if ∅ ⊨ A;

2. If 𝛤 ⊨ A and 𝛤 ⊨ A → B then 𝛤 ⊨ B;

3. If 𝛤 is satisfiable then every finite subset of 𝛤 is also satisfiable;

4. Monotony: if 𝛤 ⊆ 𝛥 and 𝛤 ⊨ A then also 𝛥 ⊨ A;

5. Transitivity: if 𝛤 ⊨ A and 𝛥 ∪ {A} ⊨ B then 𝛤 ∪ 𝛥 ⊨ B.

Proof. Exercise. □

Proposition 1.16. 𝛤 ⊨ A if and only if 𝛤 ∪ {¬A} is unsatisfiable.

Proof. Exercise. □

Theorem 1.17 (Semantic Deduction Theorem). 𝛤 ⊨ A → B


if and only if 𝛤 ∪ {A} ⊨ B.

Proof. Exercise. □

Problems
Problem 1.1. Prove Proposition 1.5

Problem 1.2. Prove Proposition 1.6

Problem 1.3. Give a mathematically rigorous definition of


A[B/p] by induction.

Problem 1.4. Prove Proposition 1.13

Problem 1.5. Prove Proposition 1.15

Problem 1.6. Prove Proposition 1.16

Problem 1.7. Prove Theorem 1.17


CHAPTER 2

Axiomatic
Derivations
2.1 Introduction
Logics commonly have both a semantics and a derivation system.
The semantics concerns concepts such as truth, satisfiability, va-
lidity, and entailment. The purpose of derivation systems is to
provide a purely syntactic method of establishing entailment and
validity. They are purely syntactic in the sense that a derivation
in such a system is a finite syntactic object, usually a sequence
(or other finite arrangement) of sentences or formulas. Good
derivation systems have the property that any given sequence or
arrangement of sentences or formulas can be verified mechani-
cally to be “correct.”
The simplest (and historically first) derivation systems for
first-order logic were axiomatic. A sequence of formulas counts
as a derivation in such a system if each individual formula in it
is either among a fixed set of “axioms” or follows from formulas
coming before it in the sequence by one of a fixed number of “in-
ference rules”—and it can be mechanically verified if a formula
is an axiom and whether it follows correctly from other formulas
by one of the inference rules. Axiomatic derivation systems are
easy to describe—and also easy to handle meta-theoretically—


but derivations in them are hard to read and understand, and


are also hard to produce.
Other derivation systems have been developed with the aim
of making it easier to construct derivations or easier to under-
stand derivations once they are complete. Examples are natural
deduction, truth trees, also known as tableaux proofs, and the se-
quent calculus. Some derivation systems are designed especially
with mechanization in mind, e.g., the resolution method is easy
to implement in software (but its derivations are essentially im-
possible to understand). Most of these other derivation systems
represent derivations as trees of formulas rather than sequences.
This makes it easier to see which parts of a derivation depend on
which other parts.
So for a given logic, such as first-order logic, the different
derivation systems will give different explications of what it is for
a sentence to be a theorem and what it means for a sentence to be
derivable from some others. However that is done (via axiomatic
derivations, natural deductions, sequent derivations, truth trees,
resolution refutations), we want these relations to match the se-
mantic notions of validity and entailment. Let’s write ⊢ A for “A is
a theorem” and “𝛤 ⊢ A” for “A is derivable from 𝛤.” However
⊢ is defined, we want it to match up with ⊨, that is:

1. ⊢ A if and only if ⊨ A

2. 𝛤 ⊢ A if and only if 𝛤 ⊨ A

The “only if” direction of the above is called soundness. A deriva-


tion system is sound if derivability guarantees entailment (or va-
lidity). Every decent derivation system has to be sound; unsound
derivation systems are not useful at all. After all, the entire pur-
pose of a derivation is to provide a syntactic guarantee of validity
or entailment. We’ll prove soundness for the derivation systems
we present.
The converse “if” direction is also important: it is called com-
pleteness. A complete derivation system is strong enough to show

that A is a theorem whenever A is valid, and that 𝛤 ⊢ A when-


ever 𝛤 ⊨ A. Completeness is harder to establish, and some logics
have no complete derivation systems. First-order logic does. Kurt
Gödel was the first one to prove completeness for a derivation
system of first-order logic in his 1929 dissertation.
Another concept that is connected to derivation systems is
that of consistency. A set of sentences is called inconsistent if any-
thing whatsoever can be derived from it, and consistent other-
wise. Inconsistency is the syntactic counterpart to unsatisfiablity:
like unsatisfiable sets, inconsistent sets of sentences do not make
good theories, they are defective in a fundamental way. Consis-
tent sets of sentences may not be true or useful, but at least they
pass that minimal threshold of logical usefulness. For different
derivation systems the specific definition of consistency of sets of
sentences might differ, but like ⊢, we want consistency to coincide
with its semantic counterpart, satisfiability. We want it to always
be the case that 𝛤 is consistent if and only if it is satisfiable. Here,
the “if” direction amounts to completeness (consistency guaran-
tees satisfiability), and the “only if” direction amounts to sound-
ness (satisfiability guarantees consistency). In fact, for classical
first-order logic, the two versions of soundness and completeness
are equivalent.

2.2 Axiomatic Derivations


Axiomatic derivations are the oldest and simplest logical deriva-
tion systems. Their derivations are simply sequences of sentences.
A sequence of sentences counts as a correct derivation if every
sentence A in it satisfies one of the following conditions:

1. A is an axiom, or

2. A is an element of a given set 𝛤 of sentences, or

3. A is justified by a rule of inference.



To be an axiom, A has to have the form of one of a number of fixed


sentence schemas. There are many sets of axiom schemas that
provide a satisfactory (sound and complete) derivation system for
first-order logic. Some are organized according to the connectives
they govern, e.g., the schemas

A → (B → A) B → (B ∨ C ) (B ∧ C ) → B

are common axioms that govern →, ∨ and ∧. Some axiom sys-


tems aim at a minimal number of axioms. Depending on the
connectives that are taken as primitives, it is even possible to
find axiom systems that consist of a single axiom.
A rule of inference is a conditional statement that gives a
sufficient condition for a sentence in a derivation to be justified.
Modus ponens is one very common such rule: it says that if A
and A → B are already justified, then B is justified. This means
that a line in a derivation containing the sentence B is justified,
provided that both A and A → B (for some sentence A) appear
in the derivation before B.
The ⊢ relation based on axiomatic derivations is defined as
follows: 𝛤 ⊢ A iff there is a derivation with the sentence A as
its last formula (and 𝛤 is taken as the set of sentences in that
derivation which are justified by (2) above). A is a theorem if A
has a derivation where 𝛤 is empty, i.e., every sentence in the
derivation is justified either by (1) or (3).
a derivation that shows that ⊢ A → (B → (B ∨ A)):
1. B → (B ∨ A)
2. (B → (B ∨ A)) → (A → (B → (B ∨ A)))
3. A → (B → (B ∨ A))

The sentence on line 1 is of the form of the axiom A → (A ∨ B)


(with the roles of A and B reversed). The sentence on line 2 is of
the form of the axiom A → (B →A). Thus, both lines are justified.
Line 3 is justified by modus ponens: if we abbreviate it as D, then
line 2 has the form C → D, where C is B → (B ∨ A), i.e., line 1.

A set 𝛤 is inconsistent if 𝛤 ⊢ ⊥. A complete axiom system


will also prove that ⊥ → A for any A, and so if 𝛤 is inconsistent,
then 𝛤 ⊢ A for any A.
Systems of axiomatic derivations for logic were first given by
Gottlob Frege in his 1879 Begriffsschrift, which for this reason is
often considered the first work of modern logic. They were per-
fected in Alfred North Whitehead and Bertrand Russell’s Prin-
cipia Mathematica and by David Hilbert and his students in the
1920s. They are thus often called “Frege systems” or “Hilbert
systems.” They are very versatile in that it is often easy to find
an axiomatic system for a logic. Because derivations have a very
simple structure and only one or two inference rules, it is also rel-
atively easy to prove things about them. However, they are very
hard to use in practice, i.e., it is difficult to find and write proofs.

2.3 Rules and Derivations


Axiomatic derivations are perhaps the simplest derivation system
for logic. A derivation is just a sequence of formulas. To count
as a derivation, every formula in the sequence must either be an
instance of an axiom, or must follow from one or more formulas
that precede it in the sequence by a rule of inference. A derivation
derives its last formula.

Definition 2.1 (Derivability). If 𝛤 is a set of formulas of L


then a derivation from 𝛤 is a finite sequence A1 , . . . , An of formulas
where for each i ≤ n one of the following holds:

1. Ai ∈ 𝛤; or

2. Ai is an axiom; or

3. Ai follows from some A j (and Ak ) with j < i (and k < i )


by a rule of inference.

What counts as a correct derivation depends on which infer-


ence rules we allow (and of course what we take to be axioms).

And an inference rule is an if-then statement that tells us that,


under certain conditions, a step Ai in a derivation is a correct
inference step.

Definition 2.2 (Rule of inference). A rule of inference gives a


sufficient condition for what counts as a correct inference step in
a derivation from 𝛤.

For instance, since any one-element sequence A with A ∈ 𝛤


trivially counts as a derivation, the following might be a very
simple rule of inference:

If A ∈ 𝛤, then A is always a correct inference step in


any derivation from 𝛤.

Similarly, if A is one of the axioms, then A by itself is a derivation,


and so this is also a rule of inference:

If A is an axiom, then A is a correct inference step.

It gets more interesting if the rule of inference appeals to formulas


that appear before the step considered. The following rule is
called modus ponens:

If B → A and B occur higher up in the derivation,


then A is a correct inference step.

If this is the only rule of inference, then our definition of deriva-


tion above amounts to this: A1 , . . . , An is a derivation iff for each
i ≤ n one of the following holds:

1. Ai ∈ 𝛤; or

2. Ai is an axiom; or

3. for some j < i , A j is B → Ai , and for some k < i , Ak is B.



The last clause says that Ai follows from A j (B) and Ak (B → Ai )


by modus ponens. If we can go from 1 to n, and each time we
find that the formula Ai is either in 𝛤, an axiom, or one that a
rule of inference tells us is a correct inference step, then the entire
sequence counts as a correct derivation.
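
This characterization is mechanical: to check a purported
derivation we go through it line by line and verify one of the
three conditions for each formula. A sketch in Haskell, over the
hypothetical Formula type from chapter 1; the test isAxiom for
being an instance of an axiom schema is assumed as given (a sketch
of one such test appears in the next section):

    -- Check that a list of formulas is a derivation from gamma, with
    -- modus ponens as the only rule of inference (Definition 2.1).
    isDerivation :: (Formula -> Bool) -> [Formula] -> [Formula] -> Bool
    isDerivation isAxiom gamma = go []
      where
        go _       []         = True
        go earlier (a : rest) =
          ( a `elem` gamma
            || isAxiom a
            || any (\b -> Cond b a `elem` earlier) earlier )   -- modus ponens
          && go (earlier ++ [a]) rest

Then 𝛤 ⊢ A holds iff some sequence ending in A passes this check;
finding such a sequence is, of course, the hard part.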

Definition 2.3 (Derivability). A formula A is derivable from 𝛤,


written 𝛤 ⊢ A, if there is a derivation from 𝛤 ending in A.

Definition 2.4 (Theorems). A formula A is a theorem if there


is a derivation of A from the empty set. We write ⊢ A if A is a
theorem and ⊬ A if it is not.

2.4 Axiom and Rules for the Propositional


Connectives
Definition 2.5 (Axioms). The set Ax0 of axioms for the
propositional connectives comprises all formulas of the following
forms:

(A ∧ B) → A (2.1)
(A ∧ B) → B (2.2)
A → (B → (A ∧ B)) (2.3)
A → (A ∨ B) (2.4)
A → (B ∨ A) (2.5)
(A → C ) → ((B → C ) → ((A ∨ B) → C )) (2.6)
A → (B → A) (2.7)
(A → (B → C )) → ((A → B) → (A → C )) (2.8)
(A → B) → ((A → ¬B) → ¬A) (2.9)
¬A → (A → B) (2.10)
⊤ (2.11)

⊥→A (2.12)
(A → ⊥) → ¬A (2.13)
¬¬A → A (2.14)

Definition 2.6 (Modus ponens). If B and B →A already occur


in a derivation, then A is a correct inference step.

We’ll abbreviate the rule modus ponens as “mp.”
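
Checking whether a given formula is an instance of one of these
schemas is a purely syntactic matter of pattern matching. For
example (a sketch of ours, over the hypothetical Formula type from
chapter 1), a test for instances of eq. (2.7) could look like this;
a complete isAxiom test would combine such checks for all the
schemas of Definition 2.5:

    -- Is the formula an instance A → (B → A) of eq. (2.7)?
    isInstance27 :: Formula -> Bool
    isInstance27 (Cond a (Cond _ a')) = a == a'
    isInstance27 _                    = False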

2.5 Examples of Derivations


Example 2.7. Suppose we want to prove (¬D ∨ E) → (D → E).
Clearly, this is not an instance of any of our axioms, so we have
to use the mp rule to derive it. Our only rule is MP, which given
A and A → B allows us to justify B. One strategy would be to use
eq. (2.6) with A being ¬D, B being E, and C being D → E, i.e.,
the instance

(¬D → (D → E)) → ((E → (D → E)) → ((¬D ∨ E) → (D → E))).

Why? Two applications of MP yield the last part, which is what


we want. And we easily see that ¬D → (D → E) is an instance of
eq. (2.10), and E → (D → E) is an instance of eq. (2.7). So our
derivation is:
1. ¬D → (D → E) eq. (2.10)
2. (¬D → (D → E)) →
((E → (D → E)) → ((¬D ∨ E) → (D → E))) eq. (2.6)
3. (E → (D → E)) → ((¬D ∨ E) → (D → E)) 1, 2, mp
4. E → (D → E) eq. (2.7)
5. (¬D ∨ E) → (D → E) 3, 4, mp

Example 2.8. Let’s try to find a derivation of D →D. It is not an


instance of an axiom, so we have to use mp to derive it. eq. (2.7)
is an axiom of the form A → B to which we could apply mp. To

be useful, of course, the B which mp would justify as a correct


step in this case would have to be D → D, since this is what we
want to derive. That means A would also have to be D, i.e., we
might look at this instance of eq. (2.7):

D → (D → D)

In order to apply mp, we would also need to justify the corre-


sponding second premise, namely A. But in our case, that would
be D, and we won’t be able to derive D by itself. So we need a
different strategy.
The other axiom involving just → is eq. (2.8), i.e.,

(A → (B → C )) → ((A → B) → (A → C ))

We could get to the last nested conditional by applying mp twice.


Again, that would mean that we want an instance of eq. (2.8)
where A → C is D → D, the formula we are aiming for. Then of
course, A and C are both D. How should we pick B so that both
A → (B → C ) and A → B, i.e., in our case D → (B → D) and
D → B, are also derivable? Well, the first of these is already an
instance of eq. (2.7), whatever we decide B to be. And D → B
would be another instance of eq. (2.7) if B were (D → D). So,
our derivation is:
1. D → ((D → D) → D) eq. (2.7)
2. (D → ((D → D) → D)) →
((D → (D → D)) → (D → D)) eq. (2.8)
3. (D → (D → D)) → (D → D) 1, 2, mp
4. D → (D → D) eq. (2.7)
5. D →D 3, 4, mp

Example 2.9. Sometimes we want to show that there is a deriva-


tion of some formula from some other formulas 𝛤. For instance,
let’s show that we can derive A → C from 𝛤 = {A → B,B → C }.

1. A→B Hyp
2. B →C Hyp
3. (B → C ) → (A → (B → C )) eq. (2.7)
4. A → (B → C ) 2, 3, mp
5. (A → (B → C )) →
((A → B) → (A → C )) eq. (2.8)
6. ((A → B) → (A → C )) 4, 5, mp
7. A →C 1, 6, mp

The lines labelled “Hyp” (for “hypothesis”) indicate that the for-
mula on that line is an element of 𝛤.

Proposition 2.10. If 𝛤 ⊢ A → B and 𝛤 ⊢ B → C , then 𝛤 ⊢ A → C

Proof. Suppose 𝛤 ⊢ A →B and 𝛤 ⊢ B →C . Then there is a deriva-


tion of A → B from 𝛤; and a derivation of B → C from 𝛤 as well.
Combine these into a single derivation by concatenating them.
Now add lines 3–7 of the derivation in the preceding example.
This is a derivation of A → C —which is the last line of the new
derivation—from 𝛤. Note that the justifications of lines 4 and 7
remain valid if the reference to line number 2 is replaced by ref-
erence to the last line of the derivation of B → C , and reference
to line number 1 by reference to the last line of the derivation
of A → B. □

2.6 Proof-Theoretic Notions


Just as we’ve defined a number of important semantic notions
(tautology, entailment, satisfiabilty), we now define correspond-
ing proof-theoretic notions. These are not defined by appeal to satis-
faction of sentences in structures, but by appeal to the derivability
or non-derivability of certain formulas. It was an important dis-
covery that these notions coincide. That they do is the content
of the soundness and completeness theorems.

Definition 2.11 (Derivability). A formula A is derivable from


𝛤, written 𝛤 ⊢ A, if there is a derivation from 𝛤 ending in A.

Definition 2.12 (Theorems). A formula A is a theorem if there


is a derivation of A from the empty set. We write ⊢ A if A is a
theorem and ⊬ A if it is not.

Definition 2.13 (Consistency). A set 𝛤 of formulas is consistent


if and only if 𝛤 ⊬ ⊥; it is inconsistent otherwise.

Proposition 2.14 (Reflexivity). If A ∈ 𝛤, then 𝛤 ⊢ A.

Proof. The formula A by itself is a derivation of A from 𝛤. □

Proposition 2.15 (Monotony). If 𝛤 ⊆ 𝛥 and 𝛤 ⊢ A, then 𝛥 ⊢ A.

Proof. Any derivation of A from 𝛤 is also a derivation of A


from 𝛥. □

Proposition 2.16 (Transitivity). If 𝛤 ⊢ A and {A} ∪ 𝛥 ⊢ B, then


𝛤 ∪ 𝛥 ⊢ B.

Proof. Suppose {A} ∪ 𝛥 ⊢ B. Then there is a derivation B 1 , . . . ,


Bl = B from {A} ∪ 𝛥. Some of the steps in that derivation will be
correct because of a rule which refers to a prior line Bi = A. By
hypothesis, there is a derivation of A from 𝛤, i.e., a derivation A1 ,
. . . , Ak = A where every Ai is an axiom, an element of 𝛤, or
correct by a rule of inference. Now consider the sequence

A1 , . . . ,Ak = A,B 1 , . . . ,Bl = B .

This is a correct derivation of B from 𝛤 ∪ 𝛥 since every Bi = A


is now justified by the same rule which justifies Ak = A. □

Note that this means that in particular if 𝛤 ⊢ A and A ⊢ B,


then 𝛤 ⊢ B. It follows also that if A1 , . . . ,An ⊢ B and 𝛤 ⊢ Ai for
each i , then 𝛤 ⊢ B.

Proposition 2.17. 𝛤 is inconsistent iff 𝛤 ⊢ A for every A.

Proof. Exercise. □

Proposition 2.18 (Compactness). 1. If 𝛤 ⊢ A then there is a


finite subset 𝛤0 ⊆ 𝛤 such that 𝛤0 ⊢ A.

2. If every finite subset of 𝛤 is consistent, then 𝛤 is consistent.

Proof. 1. If 𝛤 ⊢ A, then there is a finite sequence of formulas


A1 , . . . , An so that A ≡ An and each Ai is either a logical
axiom, an element of 𝛤 or follows from previous formulas
by modus ponens. Take 𝛤0 to be those Ai which are in 𝛤.
Then the derivation is likewise a derivation from 𝛤0 , and
so 𝛤0 ⊢ A.

2. This is the contrapositive of (1) for the special case A ≡ ⊥. □


2.7 The Deduction Theorem


As we’ve seen, giving derivations in an axiomatic system is cum-
bersome, and derivations may be hard to find. Rather than actu-
ally write out long lists of formulas, it is generally easier to argue
that such derivations exist, by making use of a few simple results.
We’ve already established three such results: Proposition 2.14
says we can always assert that 𝛤 ⊢ A when we know that A ∈ 𝛤.
Proposition 2.15 says that if 𝛤 ⊢ A then also 𝛤 ∪ {B } ⊢ A. And
Proposition 2.16 implies that if 𝛤 ⊢ A and A ⊢ B, then 𝛤 ⊢ B.
Here’s another simple result, a “meta”-version of modus ponens:

Proposition 2.19. If 𝛤 ⊢ A and 𝛤 ⊢ A → B, then 𝛤 ⊢ B.

Proof. We have that {A,A → B } ⊢ B:

1. A Hyp.
2. A→B Hyp.
3. B 1, 2, MP

By Proposition 2.16, 𝛤 ⊢ B. □

The most important result we’ll use in this context is the de-
duction theorem:

Theorem 2.20 (Deduction Theorem). 𝛤 ∪ {A} ⊢ B if and only


if 𝛤 ⊢ A → B.

Proof. The “if” direction is immediate. If 𝛤 ⊢ A → B then also


𝛤 ∪ {A} ⊢ A → B by Proposition 2.15. Also, 𝛤 ∪ {A} ⊢ A by
Proposition 2.14. So, by Proposition 2.19, 𝛤 ∪ {A} ⊢ B.
For the “only if” direction, we proceed by induction on the
length of the derivation of B from 𝛤 ∪ {A}.
For the induction basis, we prove the claim for every deriva-
tion of length 1. A derivation of B from 𝛤 ∪ {A} of length 1
consists of B by itself; and if it is correct B is either ∈ 𝛤 ∪ {A}
or is an axiom. If B ∈ 𝛤 or is an axiom, then 𝛤 ⊢ B. We also
have that 𝛤 ⊢ B → (A → B) by eq. (2.7), and Proposition 2.19
gives 𝛤 ⊢ A → B. If B ∈ {A} then 𝛤 ⊢ A → B because then last
sentence A → B is the same as A → A, and we have derived that
in Example 2.8.
For the inductive step, suppose a derivation of B from 𝛤 ∪{A}
ends with a step B which is justified by modus ponens. (If it
is not justified by modus ponens, B ∈ 𝛤, B ≡ A, or B is an
axiom, and the same reasoning as in the induction basis applies.)
Then some previous steps in the derivation are C → B and C , for
some formula C , i.e., 𝛤 ∪ {A} ⊢ C → B and 𝛤 ∪ {A} ⊢ C , and

the respective derivations are shorter, so the inductive hypothesis


applies to them. We thus have both:

𝛤 ⊢ A → (C → B);
𝛤 ⊢ A → C.

But also

𝛤 ⊢ (A → (C → B)) → ((A → C ) → (A → B)),

by eq. (2.8), and two applications of Proposition 2.19 give 𝛤 ⊢


A → B, as required. □

Notice how eq. (2.7) and eq. (2.8) were chosen precisely so
that the Deduction Theorem would hold.
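
To see the kind of work the theorem saves, here is a small
illustration of ours (not from the text). The derivation in the
proof of Proposition 2.19 shows that {A, A → B } ⊢ B. Applying
Theorem 2.20 once gives

{A} ⊢ (A → B) → B,

and applying it a second time gives

⊢ A → ((A → B) → B),

so we obtain a theorem without ever writing out its (considerably
longer) axiomatic derivation.
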
The following are some useful facts about derivability, which
we leave as exercises.

Proposition 2.21. 1. ⊢ (A → B) → ((B → C ) → (A → C ));

2. If 𝛤 ∪ {¬A} ⊢ ¬B then 𝛤 ∪ {B } ⊢ A (Contraposition);

3. {A, ¬A} ⊢ B (Ex Falso Quodlibet, Explosion);

4. {¬¬A} ⊢ A (Double Negation Elimination);

5. If 𝛤 ⊢ ¬¬A then 𝛤 ⊢ A;

Problems
Problem 2.1. Show that the following hold by exhibiting deriva-
tions from the axioms:

1. (A ∧ B) → (B ∧ A)

2. ((A ∧ B) → C ) → (A → (B → C ))

3. ¬(A ∨ B) → ¬A

Problem 2.2. Prove Proposition 2.17.

Problem 2.3. Prove Proposition 2.21


CHAPTER 3

Sequent Calculus
3.1 The Sequent Calculus
While many derivation systems operate with arrangements of sen-
tences, the sequent calculus operates with sequents. A sequent is
an expression of the form

A1 , . . . ,Am ⇒ B 1 , . . . ,Bn ,

that is, a pair of sequences of sentences, separated by the sequent


symbol ⇒. Either sequence may be empty. A derivation in the se-
quent calculus is a tree of sequents, where the topmost sequents
are of a special form (they are called “initial sequents” or “ax-
ioms”) and every other sequent follows from the sequents imme-
diately above it by one of the rules of inference. The rules of in-
ference either manipulate the sentences in the sequents (adding,
removing, or rearranging them on either the left or the right), or
they introduce a complex formula in the conclusion of the rule.
For instance, the ∧L rule allows the inference from A, 𝛤 ⇒ 𝛥 to
A ∧ B, 𝛤 ⇒ 𝛥, and the →R allows the inference from A, 𝛤 ⇒ 𝛥,B
to 𝛤 ⇒ 𝛥,A → B, for any 𝛤, 𝛥, A, and B. (In particular, 𝛤 and 𝛥
may be empty.)


The ⊢ relation based on the sequent calculus is defined as


follows: 𝛤 ⊢ A iff there is some sequence 𝛤0 such that every A in
𝛤0 is in 𝛤 and there is a derivation with the sequent 𝛤0 ⇒ A at its
root. A is a theorem in the sequent calculus if the sequent ⇒ A
has a derivation. For instance, here is a derivation that shows
that ⊢ (A ∧ B) → A:
A ⇒ A
∧L
A∧B ⇒ A
→R
⇒ (A ∧ B) → A

A set 𝛤 is inconsistent in the sequent calculus if there is


a derivation of 𝛤0 ⇒ (where every A ∈ 𝛤0 is in 𝛤 and the right
side of the sequent is empty). Using the rule WR, any sentence
can be derived from an inconsistent set.
The sequent calculus was invented in the 1930s by Gerhard
Gentzen. Because of its systematic and symmetric design, it is
a very useful formalism for developing a theory of derivations.
It is relatively easy to find derivations in the sequent calculus,
but these derivations are often hard to read and their connection
to proofs is sometimes not easy to see. It has proved to be a
very elegant approach to derivation systems, however, and many
logics have sequent calculus systems.

3.2 Rules and Derivations


For the following, let 𝛤, 𝛥, 𝛱 , 𝛬 represent finite sequences of sen-
tences.

Definition 3.1 (Sequent). A sequent is an expression of the form

𝛤⇒𝛥

where 𝛤 and 𝛥 are finite (possibly empty) sequences of sentences


of the language L. 𝛤 is called the antecedent, while 𝛥 is the succe-
dent.

The intuitive idea behind a sequent is: if all of the sen-


tences in the antecedent hold, then at least one of the sen-
tences in the succedent holds. That is, if 𝛤 = ⟨A1 , . . . ,Am ⟩ and
𝛥 = ⟨B 1 , . . . ,Bn ⟩, then 𝛤 ⇒ 𝛥 holds iff

(A1 ∧ · · · ∧ Am ) → (B 1 ∨ · · · ∨ Bn )

holds. There are two special cases: where 𝛤 is empty and when
𝛥 is empty. When 𝛤 is empty, i.e., m = 0, ⇒ 𝛥 holds iff B 1 ∨· · ·∨
Bn holds. When 𝛥 is empty, i.e., n = 0, 𝛤 ⇒ holds iff ¬(A1 ∧
· · · ∧ Am ) does. We say a sequent is valid iff the corresponding
sentence is valid.
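To make this reading concrete, here are a few instances: the sequent A,B ⇒ C ,D is valid iff (A ∧ B) → (C ∨ D) is; the sequent ⇒ A, ¬A corresponds to A ∨ ¬A; and the sequent A, ¬A ⇒ corresponds to ¬(A ∧ ¬A), so the latter two sequents are valid.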
If 𝛤 is a sequence of sentences, we write 𝛤,A for the result
of appending A to the right end of 𝛤 (and A, 𝛤 for the result of
appending A to the left end of 𝛤). If 𝛥 is a sequence of sentences
also, then 𝛤, 𝛥 is the concatenation of the two sequences.

Definition 3.2 (Initial Sequent). An initial sequent is a sequent


of one of the following forms:

1. A ⇒ A

2. ⊥ ⇒

for any sentence A in the language.

Derivations in the sequent calculus are certain trees of se-


quents, where the topmost sequents are initial sequents, and if
a sequent stands below one or two other sequents, it must fol-
low correctly by a rule of inference. The rules for LK are divided
into two main types: logical rules and structural rules. The logical
rules are named for the main operator of the sentence contain-
ing A and/or B in the lower sequent. Each one comes in two
versions, one for inferring a sequent with the sentence contain-
ing the logical operator on the left, and one with the sentence on
the right.

3.3 Propositional Rules


Rules for ¬
𝛤 ⇒ 𝛥,A
¬L
¬A, 𝛤 ⇒ 𝛥

A, 𝛤 ⇒ 𝛥
¬R
𝛤 ⇒ 𝛥, ¬A

Rules for ∧
A, 𝛤 ⇒ 𝛥
∧L
A ∧ B, 𝛤 ⇒ 𝛥

B, 𝛤 ⇒ 𝛥
∧L
A ∧ B, 𝛤 ⇒ 𝛥

𝛤 ⇒ 𝛥,A        𝛤 ⇒ 𝛥,B
∧R
𝛤 ⇒ 𝛥,A ∧ B

Rules for ∨
A, 𝛤 ⇒ 𝛥        B, 𝛤 ⇒ 𝛥
∨L
A ∨ B, 𝛤 ⇒ 𝛥

𝛤 ⇒ 𝛥,A
∨R
𝛤 ⇒ 𝛥,A ∨ B

𝛤 ⇒ 𝛥,B
∨R
𝛤 ⇒ 𝛥,A ∨ B

Rules for →
𝛤 ⇒ 𝛥,A        B, 𝛱 ⇒ 𝛬
→L
A → B, 𝛤, 𝛱 ⇒ 𝛥, 𝛬

A, 𝛤 ⇒ 𝛥,B
→R
𝛤 ⇒ 𝛥,A → B

3.4 Structural Rules


We also need a few rules that allow us to rearrange sentences in
the left and right side of a sequent. Since the logical rules require
that the sentences in the premise which the rule acts upon stand
either to the far left or to the far right, we need an “exchange”
rule that allows us to move sentences to the right position. It’s
also important sometimes to be able to combine two identical
sentences into one, and to add a sentence on either side.

Weakening

𝛤 ⇒ 𝛥 𝛤 ⇒ 𝛥
WL WR
A, 𝛤 ⇒ 𝛥 𝛤 ⇒ 𝛥,A

Contraction
A,A, 𝛤 ⇒ 𝛥 𝛤 ⇒ 𝛥,A,A
CL CR
A, 𝛤 ⇒ 𝛥 𝛤 ⇒ 𝛥,A

Exchange

𝛤,A,B, 𝛱 ⇒ 𝛥 𝛤 ⇒ 𝛥,A,B, 𝛬
XL XR
𝛤,B,A, 𝛱 ⇒ 𝛥 𝛤 ⇒ 𝛥,B,A, 𝛬

A series of weakening, contraction, and exchange inferences


will often be indicated by double inference lines.
The following rule, called “cut,” is not strictly speaking nec-
essary, but makes it a lot easier to reuse and combine deriva-
tions.

𝛤 ⇒ 𝛥,A A, 𝛱 ⇒ 𝛬
Cut
𝛤, 𝛱 ⇒ 𝛥, 𝛬

3.5 Derivations
We’ve said what an initial sequent looks like, and we’ve given
the rules of inference. Derivations in the sequent calculus are
inductively generated from these: each derivation either is an
initial sequent on its own, or consists of one or two derivations
followed by an inference.

Definition 3.3 (LK derivation). An LK-derivation of a se-


quent S is a tree of sequents satisfying the following conditions:

1. The topmost sequents of the tree are initial sequents.

2. The bottommost sequent of the tree is S .

3. Every sequent in the tree except S is a premise of a correct


application of an inference rule whose conclusion stands
directly below that sequent in the tree.

We then say that S is the end-sequent of the derivation and that S


is derivable in LK (or LK-derivable).

Example 3.4. Every initial sequent, e.g., C ⇒ C is a derivation.


We can obtain a new derivation from this by applying, say, the
WL rule,
𝛤 ⇒ 𝛥
WL
A, 𝛤 ⇒ 𝛥

The rule, however, is meant to be general: we can replace the A


in the rule with any sentence, e.g., also with D. If the premise
matches our initial sequent C ⇒ C , that means that both 𝛤 and
𝛥 are just C , and the conclusion would then be D,C ⇒ C . So,
the following is a derivation:
C ⇒C
WL
D,C ⇒ C

We can now apply another rule, say XL, which allows us to switch
two sentences on the left. So, the following is also a correct
derivation:
C ⇒C
WL
D,C ⇒ C
XL
C ,D ⇒ C

In this application of the rule, which was given as


𝛤,A,B, 𝛱 ⇒ 𝛥
XL
𝛤,B,A, 𝛱 ⇒ 𝛥,

both 𝛤 and 𝛱 were empty, 𝛥 is C , and the roles of A and B are


played by D and C , respectively. In much the same way, we also
see that
D ⇒ D
WL
C ,D ⇒ D

is a derivation. Now we can take these two derivations, and com-


bine them using ∧R. That rule was
𝛤 ⇒ 𝛥,A 𝛤 ⇒ 𝛥,B
∧R
𝛤 ⇒ 𝛥,A ∧ B

In our case, the premises must match the last sequents of the
derivations ending in the premises. That means that 𝛤 is C ,D, 𝛥
is empty, A is C and B is D. So the conclusion, if the inference
should be correct, is C ,D ⇒ C ∧ D.
C ⇒C
WL
D,C ⇒ C D ⇒ D
XL WL
C ,D ⇒ C C ,D ⇒ D
∧R
C ,D ⇒ C ∧ D

Of course, we can also reverse the premises, then A would be D


and B would be C .
C ⇒C
WL
D ⇒ D D,C ⇒ C
WL XL
C ,D ⇒ D C ,D ⇒ C
∧R
C ,D ⇒ D ∧ C

3.6 Examples of Derivations


Example 3.5. Give an LK-derivation for the sequent A ∧B ⇒ A.
We begin by writing the desired end-sequent at the bottom of
the derivation.

A∧B ⇒ A

Next, we need to figure out what kind of inference could have


a lower sequent of this form. This could be a structural rule,
but it is a good idea to start by looking for a logical rule. The
only logical connective occurring in the lower sequent is ∧, so
we’re looking for an ∧ rule, and since the ∧ symbol occurs in the
antecedent, we’re looking at the ∧L rule.
∧L
A∧B ⇒ A

There are two options for what could have been the upper sequent
of the ∧L inference: we could have an upper sequent of A ⇒ A,
or of B ⇒ A. Clearly, A ⇒ A is an initial sequent (which is a
good thing), while B ⇒ A is not derivable in general. We fill in
the upper sequent:
A ⇒ A
∧L
A∧B ⇒ A

We now have a correct LK-derivation of the sequent A ∧ B ⇒ A.

Example 3.6. Give an LK-derivation for the sequent ¬A ∨ B ⇒


A → B.
Begin by writing the desired end-sequent at the bottom of the
derivation.

¬A ∨ B ⇒ A → B

To find a logical rule that could give us this end-sequent, we look


at the logical connectives in the end-sequent: ¬, ∨, and →. We
only care at the moment about ∨ and → because they are main
operators of sentences in the end-sequent, while ¬ is inside the
scope of another connective, so we will take care of it later. Our
options for logical rules for the final inference are therefore the
∨L rule and the →R rule. We could pick either rule, really, but
let’s pick the →R rule (if for no reason other than it allows us
to put off splitting into two branches). According to the form of
→R inferences which can yield the lower sequent, this must look
like:

A, ¬A ∨ B ⇒ B
→R
¬A ∨ B ⇒ A → B

If we move ¬A ∨ B to the outside of the antecedent, we can


apply the ∨L rule. According to the schema, this must split into
two upper sequents as follows:

¬A,A ⇒ B B,A ⇒ B
∨L
¬A ∨ B,A ⇒ B
XR
A, ¬A ∨ B ⇒ B
→R
¬A ∨ B ⇒ A→B

Remember that we are trying to wind our way up to initial se-


quents; we seem to be pretty close! The right branch is just one
weakening and one exchange away from an initial sequent and
then it is done:
B ⇒ B
WL
A,B ⇒ B
XL
¬A,A ⇒ B B,A ⇒ B
∨L
¬A ∨ B,A ⇒ B
XR
A, ¬A ∨ B ⇒ B
→R
¬A ∨ B ⇒ A → B

Now looking at the left branch, the only logical connective


in any sentence is the ¬ symbol in the antecedent sentences, so
we’re looking at an instance of the ¬L rule.
B ⇒ B
WL
A ⇒ B,A A,B ⇒ B
¬L XL
¬A,A ⇒ B B,A ⇒ B
∨L
¬A ∨ B,A ⇒ B
XR
A, ¬A ∨ B ⇒ B
→R
¬A ∨ B ⇒ A→B

Similarly to how we finished off the right branch, we are just


one weakening and one exchange away from finishing off this
left branch as well.

A ⇒ A
WR
A ⇒ A,B B ⇒ B
XR WL
A ⇒ B,A A,B ⇒ B
¬L XL
¬A,A ⇒ B B,A ⇒ B
∨L
¬A ∨ B,A ⇒ B
XR
A, ¬A ∨ B ⇒ B
→R
¬A ∨ B ⇒ A→B

Example 3.7. Give an LK-derivation of the sequent ¬A ∨ ¬B ⇒


¬(A ∧ B)
Using the techniques from above, we start by writing the de-
sired end-sequent at the bottom.

¬A ∨ ¬B ⇒ ¬(A ∧ B)

The available main connectives of sentences in the end-sequent


are the ∨ symbol and the ¬ symbol. It would work to apply either
the ∨L or the ¬R rule here, but we start with the ¬R rule because
it avoids splitting up into two branches for a moment:

A ∧ B, ¬A ∨ ¬B ⇒
¬R
¬A ∨ ¬B ⇒ ¬(A ∧ B)

Now we have a choice of whether to look at the ∧L or the ∨L


rule. Let’s see what happens when we apply the ∧L rule: we have
a choice to start with either the sequent A, ¬A ∨ ¬B ⇒ or the
sequent B, ¬A ∨ ¬B ⇒ . Since the derivation is symmetric with
regard to A and B, let's go with the former:

A, ¬A ∨ ¬B ⇒
∧L
A ∧ B, ¬A ∨ ¬B ⇒
¬R
¬A ∨ ¬B ⇒ ¬(A ∧ B)

Continuing to fill in the derivation, we see that we run into a


problem:

?
A ⇒ A A ⇒ B
¬L ¬L
¬A,A ⇒ ¬B,A ⇒
∨L
¬A ∨ ¬B,A ⇒
XL
A, ¬A ∨ ¬B ⇒
∧L
A ∧ B, ¬A ∨ ¬B ⇒
¬R
¬A ∨ ¬B ⇒ ¬(A ∧ B)

The top of the right branch cannot be reduced any further, and
it cannot be brought by way of structural inferences to an initial
sequent, so this is not the right path to take. So clearly, it was a
mistake to apply the ∧L rule above. Going back to what we had
before and carrying out the ∨L rule instead, we get

¬A,A ∧ B ⇒ ¬B,A ∧ B ⇒
∨L
¬A ∨ ¬B,A ∧ B ⇒
XL
A ∧ B, ¬A ∨ ¬B ⇒
¬R
¬A ∨ ¬B ⇒ ¬(A ∧ B)

Completing each branch as we’ve done before, we get


A ⇒ A B ⇒ B
∧L ∧L
A∧B ⇒ A A∧B ⇒ B
¬L ¬L
¬A,A ∧ B ⇒ ¬B,A ∧ B ⇒
∨L
¬A ∨ ¬B,A ∧ B ⇒
XL
A ∧ B, ¬A ∨ ¬B ⇒
¬R
¬A ∨ ¬B ⇒ ¬(A ∧ B)

(We could have carried out the ∧ rules lower than the ¬ rules in
these steps and still obtained a correct derivation).

Example 3.8. So far we haven’t used the contraction rule, but


it is sometimes required. Here’s an example where that happens.
Suppose we want to prove ⇒ A ∨ ¬A. Applying ∨R backwards
would give us one of these two derivations:

A ⇒
¬R
⇒ A ⇒ ¬A
∨R ∨R
⇒ A ∨ ¬A ⇒ A ∨ ¬A

Neither of these of course ends in an initial sequent. The trick


is to realize that the contraction rule allows us to combine two
copies of a sentence into one—and when we’re searching for a
proof, i.e., going from bottom to top, we can keep a copy of
A ∨ ¬A in the premise, e.g.,

⇒ A ∨ ¬A,A
∨R
⇒ A ∨ ¬A,A ∨ ¬A
CR
⇒ A ∨ ¬A

Now we can apply ∨R a second time, and also get ¬A, which
leads to a complete derivation.
A ⇒ A
¬R
⇒ A, ¬A
∨R
⇒ A,A ∨ ¬A
XR
⇒ A ∨ ¬A,A
∨R
⇒ A ∨ ¬A,A ∨ ¬A
CR
⇒ A ∨ ¬A

3.7 Proof-Theoretic Notions


Just as we’ve defined a number of important semantic notions
(validity, entailment, satisfiabilty), we now define corresponding
proof-theoretic notions. These are not defined by appeal to satisfac-
tion of sentences in structures, but by appeal to the derivability
or non-derivability of certain sequents. It was an important dis-
covery that these notions coincide. That they do is the content
of the soundness and completeness theorem.

Definition 3.9 (Theorems). A sentence A is a theorem if there


is a derivation in LK of the sequent ⇒ A. We write ⊢ A if A is
a theorem and ⊬ A if it is not.

Definition 3.10 (Derivability). A sentence A is derivable from a


set of sentences 𝛤, 𝛤 ⊢ A, iff there is a finite subset 𝛤0 ⊆ 𝛤 and a
sequence 𝛤0′ of the sentences in 𝛤0 such that LK derives 𝛤0′ ⇒ A.
If A is not derivable from 𝛤 we write 𝛤 ⊬ A.

Because of the contraction, weakening, and exchange rules,


the order and number of sentences in 𝛤0′ does not matter: if a
sequent 𝛤0′ ⇒ A is derivable, then so is 𝛤0′′ ⇒ A for any 𝛤0′′
that contains the same sentences as 𝛤0′. For instance, if 𝛤0 =
{B,C } then both 𝛤0′ = ⟨B,B,C ⟩ and 𝛤0′′ = ⟨C ,C ,B⟩ are sequences
containing just the sentences in 𝛤0 . If a sequent containing one
is derivable, so is the other, e.g.:

B,B,C ⇒ A
CL
B,C ⇒ A
XL
C ,B ⇒ A
WL
C ,C ,B ⇒ A

From now on we’ll say that if 𝛤0 is a finite set of sentences then


𝛤0 ⇒ A is any sequent where the antecedent is a sequence of
sentences in 𝛤0 and tacitly include contractions, exchanges, and
weakenings if necessary.

Definition 3.11 (Consistency). A set of sentences 𝛤 is incon-


sistent iff there is a finite subset 𝛤0 ⊆ 𝛤 such that LK derives
𝛤0 ⇒ . If 𝛤 is not inconsistent, i.e., if for every finite 𝛤0 ⊆ 𝛤,
LK does not derive 𝛤0 ⇒ , we say it is consistent.

Proposition 3.12 (Reflexivity). If A ∈ 𝛤, then 𝛤 ⊢ A.

Proof. The initial sequent A ⇒ A is derivable, and {A} ⊆ 𝛤. □



Proposition 3.13 (Monotony). If 𝛤 ⊆ 𝛥 and 𝛤 ⊢ A, then 𝛥 ⊢ A.

Proof. Suppose 𝛤 ⊢ A, i.e., there is a finite 𝛤0 ⊆ 𝛤 such that


𝛤0 ⇒ A is derivable. Since 𝛤 ⊆ 𝛥, then 𝛤0 is also a finite subset
of 𝛥. The derivation of 𝛤0 ⇒ A thus also shows 𝛥 ⊢ A. □

Proposition 3.14 (Transitivity). If 𝛤 ⊢ A and {A} ∪ 𝛥 ⊢ B, then


𝛤 ∪ 𝛥 ⊢ B.

Proof. If 𝛤 ⊢ A, there is a finite 𝛤0 ⊆ 𝛤 and a derivation 𝜋0 of


𝛤0 ⇒ A. If {A} ∪ 𝛥 ⊢ B, then for some finite subset 𝛥0 ⊆ 𝛥,
there is a derivation 𝜋1 of A, 𝛥0 ⇒ B. Consider the following
derivation:

𝜋0 𝜋1

𝛤0 ⇒ A A, 𝛥0 ⇒ B
Cut
𝛤0 , 𝛥0 ⇒ B

Since 𝛤0 ∪ 𝛥0 ⊆ 𝛤 ∪ 𝛥, this shows 𝛤 ∪ 𝛥 ⊢ B. □

Note that this means that in particular if 𝛤 ⊢ A and A ⊢ B,


then 𝛤 ⊢ B. It follows also that if A1 , . . . ,An ⊢ B and 𝛤 ⊢ Ai for
each i , then 𝛤 ⊢ B.

Proposition 3.15. 𝛤 is inconsistent iff 𝛤 ⊢ A for every sentence A.

Proof. Exercise. □

Proposition 3.16 (Compactness). 1. If 𝛤 ⊢ A then there is a


finite subset 𝛤0 ⊆ 𝛤 such that 𝛤0 ⊢ A.

2. If every finite subset of 𝛤 is consistent, then 𝛤 is consistent.

Proof. 1. If 𝛤 ⊢ A, then there is a finite subset 𝛤0 ⊆ 𝛤 such


that the sequent 𝛤0 ⇒ A has a derivation. Consequently,
𝛤0 ⊢ A.

2. If 𝛤 is inconsistent, there is a finite subset 𝛤0 ⊆ 𝛤 such that


LK derives 𝛤0 ⇒ . But then 𝛤0 is a finite subset of 𝛤 that
is inconsistent. □

Problems
Problem 3.1. Give derivations of the following sequents:

1. ⇒ ¬(A → B) → (A ∧ ¬B)

2. (A ∧ B) → C ⇒ (A → C ) ∨ (B → C )

Problem 3.2. Prove Proposition 3.15.


PART II

Does everything have to be true or false?

CHAPTER 4

Syntax and Semantics
4.1 Introduction
In classical logic, we deal with formulas that are built from propo-
sitional variables using the propositional connectives ¬, ∧, ∨, →,
and ↔. When we define a semantics for classical logic, we do so
using the two truth values T and F. We interpret propositional
variables in a valuation v, which assigns these truth values T, F
to the propositional variables. Any valuation then determines a
truth value v(A) for any formula A, and a formula A is satisfied in
a valuation v, v ⊨ A, iff v(A) = T.
Many-valued logics are generalizations of classical two-valued
logic by allowing more truth values than just T and F. So in
many-valued logic, a valuation v is a function assigning to every
propositional variable p one of a range of possible truth values.
We’ll generally call the set of allowed truth values V . Classical
logic is a many-valued logic where V = {T, F}, and the truth
value v(A) is computed using the familiar characteristic truth
tables for the connectives.
Once we add additional truth values, we have more than one
natural option for how to compute v(A) for the connectives we
read as “and,” “or,” “not,” and “if—then.” So a many-valued


logic is determined not just by the set of truth values, but also
by the truth functions we decide to use for each connective. Once
these are selected for a many-valued logic L, however, the truth
value vL (A) is uniquely determined by the valuation, just like in
classical logic. Many-valued logics, like classical logic, are truth
functional.
With these semantic building blocks in hand, we can go on to
define the analogs of the semantic concepts of tautology, entail-
ment, and satisfiability. In classical logic, a formula is a tautology
if its truth value v(A) = T for any v. In many-valued logic, we
have to generalize this a bit as well. First of all, there is no re-
quirement that the set of truth values V contains T. For instance,
some many-valued logics use numbers, such as all rational num-
bers between 0 and 1 as their set of truth values. In such a case,
1 usually plays the role of T. In other logics, not just one but sev-
eral truth values do. So, we require that every many-valued logic
have a set V + of designated values. We can then say that a formula
is satisfied in a valuation v, v ⊨L A, iff vL (A) ∈ V + . A formula A
is a tautology of the logic, ⊨L A, iff v(A) ∈ V + for any v. And,
finally, we say that A is entailed by a set of formulas, 𝛤 ⊨L A, if
every valuation that satisfies all the formulas in 𝛤 also satisfies A.

4.2 Languages and Connectives


Classical propositional logic, and many other logics, use a fixed
supply of propositional constants and connectives. For instance, we
use the following as primitives:

1. The propositional constant for falsity ⊥.

2. The logical connectives: ¬ (negation), ∧ (conjunction), ∨


(disjunction), → (conditional)

In addition to the primitive connectives above, we also use sym-


bols defined as abbreviations, such as ↔ (biconditional) and ⊤
(truth).

The same connectives are used in many-valued logics as well.


However, it is often useful to include different versions of, say,
conjunction, in the same logic, and that would require different
symbols to keep the versions separate. Some many-valued logics
also include connectives that have no equivalent in classical logic.
So, we’ll be a bit more general than usual.

Definition 4.1. A propositional language consists of a set L of con-


nectives. Each connective ★ has an arity; a connective of arity n is
said to be n-place. Connectives of arity 0 are also called constants;
connectives of arity 1 are called unary, and connectives of arity 2,
binary.

Example 4.2. The standard language of propositional logic L0


consists of the following connectives (with associated arities):
⊥ (0) ¬ (1), ∧ (2), ∨ (2), → (2). Most logics we consider will
use this language. Some logics by tradition an convention use
different symbols for some connectives. For instance, in product
logic, the conjunction symbol is often ⊙ instead of ∧. Sometimes
it is convenient to add a new operator, e.g., the determinateness
operator △ (1-place).

4.3 Formulas

Definition 4.3 (Formula). The set Frm(L) of formulas of a


propositional language L is defined inductively as follows:

1. Every propositional variable pi is an atomic formula.

2. Every 0-place connective (propositional constant) of L is


an atomic formula.

3. If ★ is an n-place connective of L, and A1 , . . . , An are for-


mulas, then ★(A1 , . . . ,An ) is a formula.

4. Nothing else is a formula.

If ★ is 1-place, then ★(A1 ) will often be written simply as ★A1 . If


★ is 2-place ★(A1 ,A2 ) will often be written as (A1 ★ A2 ).

As usual, we will often silently leave out the outermost paren-


theses.

Example 4.4. In the standard language L0 , p1 → (p1 ∧ ¬p2 ) is


a formula. In the language of product logic, it would be written
instead as p1 →(p1 ⊙¬p2 ). If we add the 1-place △ to the language,
we would also have formulas such as △(p1 ∧ p2 ) → (△p1 ∧ △p2 ).

4.4 Matrices
A many-valued logic is defined by its language, its set of truth
values V , a subset of designated truth values, and truth functions
for its connectives. Together, these elements are called a matrix.

Definition 4.5 (Matrix). A matrix for the logic L consists of:

1. a set of connectives making up a language L;

2. a set V ≠ ∅ of truth values;

3. a set V + ⊆ V of designated truth values;

4. for each n-place connective ★ in L, a truth function ˜★ : V n → V . If n = 0, then ˜★ is just an element of V .

Example 4.6. The matrix for classical logic C consists of:

1. The standard propositional language L0 with ⊥, ¬, ∧, ∨,


→.

2. The set of truth values V = {T, F}.

3. T is the only designated value, i.e., V + = {T}.



 ˜¬           ˜∧   T  F       ˜∨   T  F       ˜→   T  F
 T   F         T   T  F        T   T  T        T   T  F
 F   T         F   F  F        F   T  F        F   T  T

Figure 4.1: Truth functions for classical logic C.

4. For ⊥, we have ˜⊥ = F. The other truth functions are given


by the usual truth tables (see Figure 4.1).
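For readers who like to experiment, a matrix like that of Example 4.6 is easily represented as data. The following is a minimal Python sketch, not part of the text's formal apparatus; all variable names are our own choices.

    # The matrix for classical logic C of Example 4.6, as Python data.
    V = {"T", "F"}                 # the set of truth values
    DESIGNATED = {"T"}             # V+, the designated values
    TRUTH_FUNCTIONS = {
        "⊥": "F",                  # a 0-place connective is just an element of V
        "¬": lambda x: "F" if x == "T" else "T",
        "∧": lambda x, y: "T" if x == "T" and y == "T" else "F",
        "∨": lambda x, y: "T" if x == "T" or y == "T" else "F",
        "→": lambda x, y: "T" if x == "F" or y == "T" else "F",
    }

Any other finite matrix can be encoded in the same way by changing the set of values, the designated values, and the truth functions.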

4.5 Valuations and Satisfaction

Definition 4.7 (Valuations). Let V be a set of truth values. A


valuation for L into V is a function v assigning an element of V
to the propositional variables of the language, i.e., v : At0 → V .

Definition 4.8. Given a valuation v into the set of truth val-


ues V of a many-valued logic L, define the evaluation function
v : Frm(L) → V inductively by:

1. v(pn ) = v(pn );

2. If ★ is a 0-place connective, then v(★) = ˜★L ;

3. If ★ is an n-place connective, then

   v(★(A1 , . . . ,An )) = ˜★L (v(A1 ), . . . , v(An )).

Definition 4.9 (Satisfaction). The formula A is satisfied by


a valuation v, v ⊨L A, iff vL (A) ∈ V + , where V + is the set of
designated truth values of L.

We write v ⊭L A to mean “not v ⊨L A.” If 𝛤 is a set of


formulas, v ⊨L 𝛤 iff v ⊨L A for every A ∈ 𝛤.

4.6 Semantic Notions


Suppose a many-valued logic L is given by a matrix. Then we
can define the usual semantic notions for L.

Definition 4.10. 1. A formula A is satisfiable if for some v,


v ⊨ A; it is unsatisfiable if for no v, v ⊨ A;

2. A formula A is a tautology if v ⊨ A for all valuations v ;

3. If 𝛤 is a set of formulas, 𝛤 ⊨ A (“𝛤 entails A”) if and only


if v ⊨ A for every valuation v for which v ⊨ 𝛤.

4. If 𝛤 is a set of formulas, 𝛤 is satisfiable if there is a valua-


tion v for which v ⊨ 𝛤, and 𝛤 is unsatisfiable otherwise.
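When V is finite, all of these notions can be checked by brute force, since a formula only contains finitely many propositional variables and so only finitely many valuations are relevant. The following Python sketch makes this concrete; it is only an illustration of Definitions 4.8, 4.9 and 4.10 for classical logic C, not an efficient decision procedure, and the representation of formulas as nested tuples as well as all names are our own.

    from itertools import product

    V = ["T", "F"]
    DESIGNATED = {"T"}
    FNS = {
        "¬": lambda x: "F" if x == "T" else "T",
        "∧": lambda x, y: "T" if x == "T" and y == "T" else "F",
        "∨": lambda x, y: "T" if x == "T" or y == "T" else "F",
        "→": lambda x, y: "T" if x == "F" or y == "T" else "F",
    }

    def value(formula, v):
        # The evaluation function of Definition 4.8.
        if isinstance(formula, str):          # a propositional variable
            return v[formula]
        connective, *args = formula
        return FNS[connective](*(value(a, v) for a in args))

    def atoms(formula):
        # The propositional variables occurring in a formula.
        if isinstance(formula, str):
            return {formula}
        return set().union(*(atoms(a) for a in formula[1:]))

    def valuations(variables):
        # All assignments of values in V to the given variables.
        for row in product(V, repeat=len(variables)):
            yield dict(zip(variables, row))

    def is_tautology(formula):
        # Designated under every valuation (Definition 4.10).
        return all(value(formula, v) in DESIGNATED
                   for v in valuations(sorted(atoms(formula))))

    def entails(gamma, formula):
        # Every valuation satisfying gamma satisfies formula (Definition 4.10).
        variables = sorted(set(atoms(formula)).union(*map(atoms, gamma)))
        return all(value(formula, v) in DESIGNATED
                   for v in valuations(variables)
                   if all(value(b, v) in DESIGNATED for b in gamma))

    # p ∨ ¬p is a classical tautology, and modus ponens holds:
    print(is_tautology(("∨", "p", ("¬", "p"))))        # True
    print(entails([("→", "p", "q"), "p"], "q"))        # True

Replacing V, DESIGNATED, and FNS by the data of any other finite matrix gives the corresponding semantic notions for that logic.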

We have some of the same facts for these notions as we do


for the case of classical logic:

Proposition 4.11. 1. A is a tautology if and only if ∅ ⊨ A;

2. If 𝛤 is satisfiable then every finite subset of 𝛤 is also satisfiable;

3. Monotony: if 𝛤 ⊆ 𝛥 and 𝛤 ⊨ A then also 𝛥 ⊨ A;

4. Transitivity: if 𝛤 ⊨ A and 𝛥 ∪ {A} ⊨ B then 𝛤 ∪ 𝛥 ⊨ B;

Proof. Exercise. □
In classical logic we can connect entailment and the condi-
tional. For instance, we have the validity of modus ponens: If 𝛤 ⊨ A
and 𝛤 ⊨ A → B then 𝛤 ⊨ B. Another important relationship be-
tween ⊨ and → in classical logic is the semantic deduction theo-
rem: 𝛤 ⊨ A → B if and only if 𝛤 ∪ {A} ⊨ B. These results do not
always hold in many-valued logics. Whether they do depends on
the truth function ˜→.

4.7 Many-valued logics as sublogics of C


The usual many-valued logics are all defined using matrices in
which the value of a truth-function for arguments in {T, F} agrees
with the classical truth functions. Specifically, in these logics,
if x ∈ {T, F}, then ˜¬L (x) = ˜¬C (x), and for ★ any one of ∧, ∨,
→, if x, y ∈ {T, F}, then ˜★L (x, y) = ˜★C (x, y). In other words, the
truth functions for ¬, ∧, ∨, → restricted to {T, F} are exactly the
classical truth functions.
Proposition 4.12. Suppose that a many-valued logic L contains the
connectives ¬, ∧, ∨, → in its language, T, F ∈ V , and its truth func-
tions satisfy:

1. ˜¬L (x) = ˜¬C (x) if x = T or x = F;

2. ˜∧L (x, y) = ˜∧C (x, y),

3. ˜∨L (x, y) = ˜∨C (x, y),

4. ˜→L (x, y) = ˜→C (x, y), if x, y ∈ {T, F}.

Then, for any valuation v into V such that v(p) ∈ {T, F}, vL (A) =
vC (A).

Proof. By induction on A.
1. If A ≡ p is atomic, we have vL (A) = v(p) = vC (A).
2. If A ≡ ¬B, we have
vL (A) = ˜¬L (vL (B))      by Definition 4.8
       = ˜¬L (vC (B))      by inductive hypothesis
       = ˜¬C (vC (B))      by assumption (1), since vC (B) ∈ {T, F}
       = vC (A)            by Definition 4.8.

3. If A ≡ (B ∧ C ), we have
vL (A) = ˜∧L (vL (B), vL (C ))     by Definition 4.8
       = ˜∧L (vC (B), vC (C ))     by inductive hypothesis
       = ˜∧C (vC (B), vC (C ))     by assumption (2), since vC (B), vC (C ) ∈ {T, F}
       = vC (A)                    by Definition 4.8.

The cases where A ≡ (B ∨ C ) and A ≡ (B → C ) are similar. □

Corollary 4.13. If a many-valued logic satisfies the conditions of


Proposition 4.12, T ∈ V + and F ∉ V + , then ⊨L ⊆ ⊨C , i.e., if 𝛤 ⊨L B
then 𝛤 ⊨C B. In particular, every tautology of L is also a classical
tautology.

Proof. We prove the contrapositive. Suppose 𝛤 ⊭C B. Then there


is some valuation v : At0 → {T, F} such that vC (A) = T for all
A ∈ 𝛤 and vC (B) = F. Since T, F ∈ V , the valuation v is also
a valuation for L. By Proposition 4.12, vL (A) = T for all A ∈ 𝛤
and vL (B) = F. Since T ∈ V + and F ∉ V + that means v ⊨L 𝛤 and
v ⊭L B, i.e., 𝛤 ⊭L B. □

Problems
Problem 4.1. Prove Proposition 4.11
CHAPTER 5

Three-valued Logics
5.1 Introduction
If we just add one more value U to T and F, we get a three-
valued logic. Even though there is only one more truth value, the
possibilities for defining the truth-functions for ¬, ∧, ∨, and →
are quite numerous. Then a logic might use any combination of
these truth functions, and you also have a choice of making only
T designated, or both T and U.
We present here a selection of the most well-known three-
valued logics, their motivations, and some of their properties.

5.2 Łukasiewicz logic


One of the first published, worked out proposals for a many-
valued logic is due to the Polish philosopher Jan Łukasiewicz in
1921. Łukasiewicz was motivated by Aristotle’s sea battle prob-
lem: It seems that, today, the sentence “There will be a sea battle
tomorrow” is neither true nor false: its truth value is not yet set-
tled. Łukasiewicz proposed to introduce a third truth value for
such “future contingent” sentences.


I can assume without contradiction that my presence


in Warsaw at a certain moment of next year, e.g., at
noon on 21 December, is at the present time deter-
mined neither positively nor negatively. Hence it is
possible, but not necessary, that I shall be present
in Warsaw at the given time. On this assumption
the proposition “I shall be in Warsaw at noon on 21
December of next year,” can at the present time be
neither true nor false. For if it were true now, my
future presence in Warsaw would have to be neces-
sary, which is contradictory to the assumption. If it
were false now, on the other hand, my future pres-
ence in Warsaw would have to be impossible, which
is also contradictory to the assumption. Therefore
the proposition considered is at the moment neither
true nor false and must possess a third value, differ-
ent from “0” or falsity and “1” or truth. This value
we can designate by “1/2.” It represents “the possible,”
and joins “the true” and “the false” as a third value.

We will use U for Łukasiewicz’s third truth value.1


The truth functions for the connectives ¬, ∧, and ∨ are easy to
determine on this interpretation: the negation of a future contin-
gent sentence is also a future contingent sentence, so ˜︁
¬(U) = U.
If one conjunct of a conjunction is undetermined and the other is
true, the conjunction is also undetermined—after all, depending
on how the future contingent conjunct turns out, the conjunction
might turn out to be true, and it might turn out to be false. So

˜∧(T, U) = ˜∧(U, T) = U.

If the other conjunct is false, however, it cannot turn out true, so

˜∧(F, U) = ˜∧(U, F) = F.
1 Łukasiewicz here uses “possible” in a way that is uncommon today, namely
to mean possible but not necessary.

The other values (if the arguments are settled truth values, T or
F) are as in classical logic.
For the conditional, the situation is a little trickier. Suppose
q is a future contingent statement. If p is false, then p → q will be
true, regardless of how q turns out, so we should set ˜→(F, U) = T.
And if p is true, then q →p will be true, regardless of what q turns
out to be, so ˜→(U, T) = T. If p is true, then p → q might turn out
to be true or false, so ˜→(T, U) = U. Similarly, if p is false, then
q → p might turn out to be true or false, so ˜→(U, F) = U. This
leaves the case where p and q are both future contingents. On
the basis of the motivation, we should really assign U in this case.
However, this would make A → A not a tautology. Łukasiewicz
had no trouble giving up A ∨ ¬A and ¬(A ∧ ¬A), but balked at
giving up A → A. So he stipulated ˜→(U, U) = T.

Definition 5.1. Three-valued Łukasiewicz logic is defined using


the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.

3. T is the only designated value, i.e., V + = {T}.

4. Truth functions are given by the following tables:

 ˜¬             ˜∧Ł3   T  U  F
 T   F           T     T  U  F
 U   U           U     U  U  F
 F   T           F     F  F  F

 ˜∨Ł3   T  U  F        ˜→Ł3   T  U  F
  T     T  T  T          T    T  U  F
  U     T  U  U          U    T  T  U
  F     T  U  F          F    T  T  T

As can easily be seen, any formula A containing only ¬, ∧,



and ∨ will take the truth value U if all its propositional variables
are assigned U. So for instance, the classical tautologies p ∨ ¬p
and ¬(p ∧¬p) are not tautologies in Ł 3 , since v(A) = U whenever
v(p) = U.
On valuations where v(p) = T or F, v(A) will coincide with
its classical truth value.

Proposition 5.2. If v(p) ∈ {T, F} for all p in A, then vŁ3 (A) =


vC (A).

Many classical tautologies are also tautologies in Ł 3 , e.g.,


¬p → (p → q ). Just like in classical logic, we can use truth ta-
bles to verify this:

p q ¬ p → (p → q)
T T F T T T T T
T U F T T T U U
T F F T T T F F
U T U U T U T T
U U U U T U T U
U F U U T U U F
F T T F T F T T
F U T F T F T U
F F T F T F T F
One might therefore perhaps think that although not all clas-
sical tautologies are tautologies in Ł 3 , they should at least take
either the value T or the value U on every valuation. This is not
the case. A counterexample is given by

¬(p → ¬p) ∨ ¬(¬p → p)

which is F if p is U.
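Both of these claims are easy to check mechanically. The following throwaway Python script is our own encoding of the Ł3 tables of Definition 5.1; it evaluates p ∨ ¬p at every valuation and the formula above at v(p) = U.

    # Our own encoding of the Ł3 truth tables of Definition 5.1.
    NEG  = {"T": "F", "U": "U", "F": "T"}
    DISJ = {("T", "T"): "T", ("T", "U"): "T", ("T", "F"): "T",
            ("U", "T"): "T", ("U", "U"): "U", ("U", "F"): "U",
            ("F", "T"): "T", ("F", "U"): "U", ("F", "F"): "F"}
    COND = {("T", "T"): "T", ("T", "U"): "U", ("T", "F"): "F",
            ("U", "T"): "T", ("U", "U"): "T", ("U", "F"): "U",
            ("F", "T"): "T", ("F", "U"): "T", ("F", "F"): "T"}

    # p ∨ ¬p is not a tautology of Ł3: it takes the value U when v(p) = U.
    for p in ["T", "U", "F"]:
        print(p, DISJ[(p, NEG[p])])              # T T / U U / F T

    # ¬(p → ¬p) ∨ ¬(¬p → p) takes the value F when v(p) = U.
    p = "U"
    left  = NEG[COND[(p, NEG[p])]]               # ¬(p → ¬p)
    right = NEG[COND[(NEG[p], p)]]               # ¬(¬p → p)
    print(DISJ[(left, right)])                   # F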
Łukasiewicz hoped to build a logic of possibility on the basis
of his three-valued system, by introducing a one-place connec-
tive ◇A (for “A is possible”) and a corresponding □A (for “A is
necessary”):


 ˜◇           ˜□
 T   T        T   T
 U   T        U   F
 F   F        F   F
In other words, p is possible iff it is not already settled as false;
and p is necessary iff it is already settled as true.
However, the shortcomings of this proposed modal logic soon
became evident: However things turn out, p ∧ ¬p can never turn
out to be true. So even if it is not now settled (and therefore unde-
termined), it should count as impossible, i.e., ¬◇(p ∧ ¬p) should
be a tautology. However, if v(p) = U, then v(¬◇(p ∧ ¬p)) = F.
Although Łukasiewicz was correct that two truth values will not
be enough to accommodate modal distinctions such as possibility
and necessity, introducing a third truth value is also not enough.

5.3 Kleene logics


Stephen Kleene introduced two three-valued logics motivated by
a logic in which truth values are thought of as the outcomes of com-
putational procedures: a procedure may yield T or F, but it may
also fail to terminate. In that case the corresponding truth value
is undefined, represented by the truth value U.
To compute the negation of a proposition A, you would first
compute the value of A, and then return the opposite of the re-
sult. If the computation of A does not terminate, then the entire
procedure does not either: so the negation of U is U.
To compute a conjunction A ∧ B, there are two options: one
can first compute A, then B, and then the result would be T if
the outcome of both is T, and F otherwise. If either computation
fails to halt, the entire procedure does as well. So in this case,
the if one conjunct is undefined, the conjunction is as well. The
same goes for disjunction.
However, if we can evaluate A and B in parallel, we can do
better. Then, if one of the two procedures halts and returns F, we

can stop, as the answer must be false. So in that case a conjunc-


tion with one false conjunct is false, even if the other conjunct is
undefined. Similarly, when computing a disjunction in parallel,
we can stop once the procedure for one of the two disjuncts has
returned true: then the disjunction must be true. So in this case
we can know what the outcome of a compound claim is, even if
one of the components is undefined. On this interpretation, we
might read U as “unknown” rather than “undefined.”
The two interpretations give rise to Kleene’s strong and weak
logic. The conditional is defined as equivalent to ¬A ∨ B.

Definition 5.3. Strong Kleene logic Ks is defined using the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.

3. T is the only designated value, i.e., V + = {T}.

4. Truth functions are given by the following tables:

 ˜¬             ˜∧Ks   T  U  F
 T   F           T     T  U  F
 U   U           U     U  U  F
 F   T           F     F  F  F

 ˜∨Ks   T  U  F        ˜→Ks   T  U  F
  T     T  T  T          T    T  U  F
  U     T  U  U          U    T  U  U
  F     T  U  F          F    T  T  T

Definition 5.4. Weak Kleene logic Kw is defined using the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.



3. T is the only designated value, i.e., V + = {T}.

4. Truth functions are given by the following tables:

 ˜¬             ˜∧Kw   T  U  F
 T   F           T     T  U  F
 U   U           U     U  U  U
 F   T           F     F  U  F

 ˜∨Kw   T  U  F        ˜→Kw   T  U  F
  T     T  U  T          T    T  U  F
  U     U  U  U          U    U  U  U
  F     T  U  F          F    T  U  T

Proposition 5.5. Ks and Kw have no tautologies.

Proof. If v(p) = U for all propositional variables p, then any for-


mula A will have truth value v(A) = U, since

˜¬(U) = ˜∨(U, U) = ˜∧(U, U) = ˜→(U, U) = U

in both logics. As U ∉ V + for either Ks or Kw, on this valuation,


A will not be designated. □

Although both weak and strong Kleene logic have no tautolo-


gies, they have non-trivial consequence relations.
Dmitry Bochvar interpreted U as “meaningless” and at-
tempted to use it to solve paradoxes such as the Liar paradox by
stipulating that paradoxical sentences take the value U. He in-
troduced a logic which is essentially weak Kleene logic extended
by additional connectives, two of which are “external negation”
and the “is undefined” operator:

 ˜∼           ˜+
 T   F        T   F
 U   T        U   T
 F   T        F   F

5.4 Gödel logics


Kurt Gödel introduced a sequence of n-valued logics that each
contain all formulas valid in intuitionistic logic, and are con-
tained in classical logic. Here is the first interesting one:

Definition 5.6. 3-valued Gödel logic G3 is defined using the matrix:

1. The standard propositional language L0 with ⊥, ¬, ∧, ∨,


→.

2. The set of truth values V = {T, U, F}.

3. T is the only designated value, i.e., V + = {T}.

4. For ⊥, we have ˜⊥ = F. Truth functions for the remaining


connectives are given by the following tables:

 ˜¬G            ˜∧G    T  U  F
 T   F           T     T  U  F
 U   F           U     U  U  F
 F   T           F     F  F  F

 ˜∨G    T  U  F        ˜→G    T  U  F
  T     T  T  T          T    T  U  F
  U     T  U  U          U    T  T  F
  F     T  U  F          F    T  T  T

You’ll notice that the truth tables for ∧ and ∨ are the same as
in Łukasiewicz and strong Kleene logic, but the truth tables for
¬ and → differ for each. In Gödel logic, ˜¬(U) = F. In contrast
to Łukasiewicz logic and Kleene logic, ˜→(U, F) = F; in contrast
to Kleene logic (but as in Łukasiewicz logic), ˜→(U, U) = T.
As the connection to intuitionistic logic alluded to above sug-
gests, G3 is close to intuitionistic logic. All intuitionistic truths
are tautologies in G3 , and many classical tautologies that are not
valid intuitionistically also fail to be tautologies in G3 . For in-

stance, the following are not tautologies:

p ∨ ¬p
¬¬p → p
((p → q ) → p) → p
(p → q ) → (¬p ∨ q )
¬(p ∧ q ) → (¬p ∨ ¬q )

However, not every tautology of G3 is also intuitionistically valid,


e.g., (p → q ) ∨ (q → p).

5.5 Designating not just T


So far the logics we’ve seen all had the set of designated truth
values V + = {T}, i.e., something counts as true iff its truth value
is T. But one might also count something as true if it’s just not F.
Then one would get a logic by stipulating in the matrix, e.g., that
V + = {T, U}.

Definition 5.7. The logic of paradox LP is defined using the ma-


trix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V = {T, U, F}.

3. T and U are designated, i.e., V + = {T, U}.

4. Truth functions are the same as in strong Kleene logic.

Definition 5.8. Halldén’s logic of nonsense Hal is defined using


the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →


and a 1-place connective +.

2. The set of truth values V = {T, U, F}.



3. T and U are designated, i.e., V + = {T, U}.

4. Truth functions are the same as weak Kleene logic, plus the
“is meaningless” operator:

 ˜+
 T   F
 U   T
 F   F

By contrast to the Kleene logics with which they share truth


tables, these do have tautologies.

Proposition 5.9. The tautologies of LP are the same as the tautolo-


gies of classical propositional logic.

Proof. By Proposition 4.12, if ⊨LP A then ⊨C A. To show the


reverse, we show that if there is a valuation v : At0 → {F, T, U}
such that vKs (A) = F then there is a valuation v′ : At0 → {F, T}
such that v′C (A) = F. This establishes the result for LP, since
Ks and LP have the same characteristic truth functions, and F
is the only truth value of LP that is not designated (that is the
only difference between LP and Ks). Thus, if ⊭LP A, for some
valuation v, vLP (A) = vKs (A) = F. By the claim we’re proving,
v′C (A) = F, i.e., ⊭C A.
To establish the claim, we first define v′ as
v′(p) = T   if v(p) ∈ {T, U}
v′(p) = F   otherwise

We now show by induction on A that (a) if vKs (A) = F then


v′C (A) = F, and (b) if vKs (A) = T then v′C (A) = T.

1. Induction basis: A ≡ p. By Definition 4.8, vKs (A) = v(p) =


v′C (A), which implies both (a) and (b).
For the induction step, consider the cases:

2. A ≡ ¬B.
a) Suppose vKs (¬B) = F. By the definition of ˜︁ ¬Ks ,
vKs (B) = T. By inductive hypothesis, case (b), we
get v′C (B) = T, so v′C (¬B) = F.
b) Suppose vKs (¬B) = T. By the definition of ˜︁ ¬Ks ,
vKs (B) = F. By inductive hypothesis, case (a), we
get v′C (B) = F, so v′C (¬B) = T.
3. A ≡ (B ∧ C ).
a) Suppose vKs (B ∧ C ) = F. By the definition of ˜︁ ∧Ks ,
vKs (B) = F or vKs (C ) = F. By inductive hypothesis,
case (a), we get v′C (B) = F or v′C (C ) = F, so v′C (B ∧
C ) = F.
b) Suppose vKs (B ∧ C ) = T. By the definition of ˜︁ ∧Ks ,
vKs (B) = T and vKs (C ) = T. By inductive hypothe-
sis, case (b), we get v′C (B) = T and v′C (C ) = T, so
v′C (B ∧ C ) = T.
The other two cases are similar, and left as exercises. Alter-
natively, the proof above establishes the result for all formulas
only containing ¬ and ∧. One may now appeal to the facts that
in both Ks and C, for any v, v(B ∨ C ) = v(¬(¬B ∧ ¬C )) and
v(B → C ) = v(¬(B ∧ ¬C )). □

Although they have the same tautologies as classical logic,


their consequence relations are different. LP, for instance, is
paraconsistent in that ¬p, p ⊭ q , and so the principle of explo-
sion ¬A,A ⊨ B does not hold in general. (It holds for some cases
of A and B, e.g., if B is a tautology.)
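A concrete counterexample to explosion in LP can be written down in a few lines. The sketch below is our own encoding; only negation is needed. It exhibits a valuation in which p and ¬p are both designated while q is not.

    # Strong Kleene negation; in LP, T and U are the designated values.
    NEG = {"T": "F", "U": "U", "F": "T"}
    DESIGNATED = {"T", "U"}

    # A counterexample to explosion ¬p, p ⊨ q in LP: take v(p) = U, v(q) = F.
    v = {"p": "U", "q": "F"}
    premises_designated = v["p"] in DESIGNATED and NEG[v["p"]] in DESIGNATED
    conclusion_designated = v["q"] in DESIGNATED
    print(premises_designated, conclusion_designated)   # True False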
What if you make U designated in Ł 3 ?

Definition 5.10. The logic 3-valued R-Mingle RM 3 is defined us-


ing the matrix:

1. The standard propositional language L0 with ⊥, ¬, ∧, ∨,



→.

2. The set of truth values V = {T, U, F}.

3. T and U are designated, i.e., V + = {T, U}.

4. Truth functions are the same as Łukasiewicz logic Ł 3 .

Different truth tables can sometimes generate the same logic


(entailment relation) just by changing the designated values. E.g.,
this happens if in Gödel logic we take V + = {T, U} instead of {T}.

Proposition 5.11. The matrix with V = {F, U, T}, V + = {T, U},


and the truth functions of 3-valued Gödel logic defines classical logic.

Proof. Exercise. □

Problems
Problem 5.1. Suppose we define v(A ↔ B) = v((A → B) ∧ (B →
A)) in Ł 3 . What truth table would ↔ have?

Problem 5.2. Show that the following are tautologies in Ł 3 :

1. p → (q → p)

2. ¬(p ∧ q ) ↔ (¬p ∨ ¬q )

3. ¬(p ∨ q ) ↔ (¬p ∧ ¬q )

(In (2) and (3), take A ↔ B as an abbreviation for (A → B) ∧


(B → A), or refer to your solution to problem 5.1.)

Problem 5.3. Show that the following classical tautologies are


not tautologies in Ł 3 :

1. (¬p ∧ p) → q

2. ((p → q ) → p) → p

3. (p → (p → q )) → (p → q )

Problem 5.4. Which of the following relations hold in


Łukasiewicz logic? Give a truth table for each.

1. p, p → q ⊨ q

2. ¬¬p ⊨ p

3. p ∧ q ⊨ p

4. p ⊨ p ∧ p

5. p ⊨ p ∨ q

Problem 5.5. Show that □p ↔ ¬◇¬p and ◇p ↔ ¬□¬p are tau-


tologies in Ł 3 , extended with the truth tables for □ and ◇.

Problem 5.6. Which of the following relations hold in (a) strong


and (b) weak Kleene logic? Give a truth table for each.

1. p, p → q ⊨ q

2. p ∨ q , ¬p ⊨ q

3. p ∧ q ⊨ p

4. p ⊨ p ∧ p

5. p ⊨ p ∨ q

Problem 5.7. Can you define ∼ in Bochvar’s logic in terms of


¬ and +, i.e., find a formula with only the propositional vari-
able p and not involving ∼ which always takes the same truth
value as ∼p? Give a truth table to show you’re right.

Problem 5.8. Give a truth table to show that (p → q ) ∨ (q → p)


is a tautology of G3 .

Problem 5.9. Give truth tables that show that the following are
not tautologies of G3 :

(p → q ) → (¬p ∨ q )
¬(p ∧ q ) → (¬p ∨ ¬q )
((p → q ) → p) → p

Problem 5.10. Which of the following relations hold in Gödel


logic? Give a truth table for each.

1. p, p → q ⊨ q

2. p ∨ q , ¬p ⊨ q

3. p ∧ q ⊨ p

4. p ⊨ p ∧ p

5. p ⊨ p ∨ q

Problem 5.11. Complete the proof of Proposition 5.9, i.e., estab-


lish (a) and (b) for the cases where A ≡ (B ∨C ) and A ≡ (B →C ).

Problem 5.12. Prove that every classical tautology is a tautology


in Hal.

Problem 5.13. Which of the following relations hold in (a) LP


and in (b) Hal? Give a truth table for each.

1. p, p → q ⊨ q

2. ¬q , p → q ⊨ ¬p

3. p ∨ q , ¬p ⊨ q

4. ¬p, p ⊨ q

5. p ⊨ p ∨ q

6. p → q ,q → r ⊨ p → r

Problem 5.14. Which of the following relations hold in RM 3 ?

1. p, p → q ⊨ q

2. p ∨ q , ¬p ⊨ q

3. ¬p, p ⊨ q

4. p ⊨ p ∨ q

Problem 5.15. Prove Proposition 5.11 by showing that for the


logic L defined just like Gödel logic but with V + = {T, U}, if
𝛤 ⊭L B then 𝛤 ⊭C B. Use the ideas of Proposition 5.9, except
instead of proving properties (a) and (b), show that vG (A) = F
iff v′C (A) = F (and hence that vG (A) ∈ {T, U} iff v′C (A) = T).
Explain why this establishes the proposition.
CHAPTER 6

Sequent Calculus
6.1 Introduction
The sequent calculus for classical logic is an efficient and simple
derivation system. If a many-valued logic is defined by a matrix
with finitely many truth values, i.e., V is finite, it is possible to
provide a sequent calculus for it. The idea for how to do this
comes from considering the meanings of sequents and the form
of inference rules in the classical case.
Now recall that a sequent
A1 , . . . ,Am ⇒ B 1 , . . . ,Bn

can be interpreted as the formula

(A1 ∧ · · · ∧ Am ) → (B 1 ∨ · · · ∨ Bn )
In other words, a valuation v satisfies a sequent 𝛤 ⇒ 𝛥 iff either
v(A) = F for some A ∈ 𝛤 or v(A) = T for some A ∈ 𝛥. On
this interpretation, initial sequents A ⇒ A are always satisfied,
because either v(A) = T or v(A) = F.
Here are the inference rules for the conditional in LK, with
side formulas 𝛤, 𝛥 left out:


⇒ A        B ⇒
→L
A → B ⇒

A ⇒ B
→R
⇒ A → B

If we apply the above semantic interpretation of a sequent,


we can read the →L rule as saying that if v(A) = T and v(B) = F,
then v(A → B) = F. Similarly, the →R rule says that if either
v(A) = F or v(B) = T, then v(A → B) = T. And in fact, these
conditionals are actually biconditionals. In the case of the ∧L
and ∨R rules in their standard formulation, the corresponding
conditionals would not be biconditionals. But there are alterna-
tive versions of these rules where they are:

A,B, 𝛤 ⇒ 𝛥
∧L
A ∧ B, 𝛤 ⇒ 𝛥

𝛤 ⇒ 𝛥,A,B
∨R
𝛤 ⇒ 𝛥,A ∨ B

This basic idea, applied to an n-valued logic, then results in a


sequent calculus with n instead of two places, one for each truth
value. For a three-valued logic with V = {F, U, T}, a sequent is
an expression 𝛤 | 𝛱 | 𝛥. It is satisfied in a valuation v iff either
v(A) = F for some A ∈ 𝛤 or v(A) = T for some A ∈ 𝛥 or v(A) = U
for some A ∈ 𝛱 . Consequently, initial sequents A | A | A are
always satisfied.
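For instance, the 3-sided sequent A | A | B is satisfied in a valuation v iff v(A) = F, v(A) = U, or v(B) = T; in other words, it is satisfied iff v(B) = T whenever v(A) = T. In a logic in which T is the only designated value, such a sequent thus plays the role that the two-sided sequent A ⇒ B plays classically.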

6.2 Rules and Derivations


For the following, let 𝛤, 𝛥, 𝛱 , 𝛬 represent finite sequences of sen-
tences.

Definition 6.1 (Sequent). An n-sided sequent is an expression of


the form
𝛤1 | . . . | 𝛤n
where each 𝛤i is a finite (possibly empty) sequence of sentences
of the language L.

Definition 6.2 (Initial Sequent). An n-sided initial sequent is an


n-sided sequent of the form A | . . . | A for any sentence A in the
language.
If the language contains a 0-place connective ★, i.e., a propo-
sitional constant, then we also take the sequent . . . | ★ | . . . where
★ appears in the space for the truth value associated with ★ ˜︁ ∈ V ,
and is empty otherwise.

For each connective of an n-valued logic L, there is a logi-


cal rule for each truth value that this connective can take in L.
Derivations in an n-sided sequent calculus for L are trees of se-
quents, where the topmost sequents are initial sequents, and if a
sequent stands below one or more other sequents, it must follow
correctly by a rule of inference for the connectives of L.

Definition 6.3 (Theorems). A sentence A is a theorem of an n-


valued logic L if there is a derivation of the n-sequent containing
A in each position corresponding to a designated truth value of L.
We write ⊢L A if A is a theorem and ⊬L A if it is not.

Definition 6.4 (Derivability). A sentence A is derivable from a


set of sentences 𝛤 in an n-valued logic L, 𝛤 ⊢L A, iff there is a
finite subset 𝛤0 ⊆ 𝛤 and a sequence 𝛤0′ of the sentences in 𝛤0 such
that the following sequent has a derivation:

𝛬1 | . . . | 𝛬n

where 𝛬i is A if position i corresponds to a designated truth value,


and 𝛤0′ otherwise. If A is not derivable from 𝛤 we write 𝛤 ⊬ A.

For instance, 3-valued Łukasiewicz logic has a 3-sided sequent


calculus. In a 3-sided sequent 𝛤 | 𝛱 | 𝛥, 𝛤 corresponds to F,
𝛥 to T, and 𝛱 to U. Axioms are A | A | A. Since only T is
designated, 𝛤 ⊢Ł3 A iff the sequent 𝛤 | 𝛤 | A has a derivation. (If
U were also designated, we would need a derivation of 𝛤 | A | A.)

6.3 Structural Rules


The structural rules for n-sided sequent calculus operate as in the
classical case, except for each position i .

𝛤1 | . . . | 𝛤i | . . . | 𝛤n
Wi
𝛤1 | . . . | A, 𝛤i | . . . | 𝛤n
𝛤1 | . . . | A,A, 𝛤i | . . . | 𝛤n
Ci
𝛤1 | . . . | A, 𝛤i | . . . | 𝛤n
𝛤1 | . . . | 𝛤i ,A,B, 𝛤i′ | . . . | 𝛤n
Xi
𝛤1 | . . . | 𝛤i ,B,A, 𝛤i′ | . . . | 𝛤n

A series of weakening, contraction, and exchange inferences


will often be indicated by double inference lines.
The Cut rule comes in several forms, one for every combina-
tion of distinct positions in the sequent i ≠ j :

𝛤1 | . . . | A, 𝛤i | . . . | 𝛤n 𝛥1 | . . . | A, 𝛥 j | . . . | 𝛥n
Cuti , j
𝛤1 , 𝛥1 | . . . | 𝛤n , 𝛥n

6.4 Propositional Rules for Selected Logics


The inference rules for a connective in an n-sided sequent calcu-
lus only depend on the characteristic truth function for the con-
nective. Thus, if some connective is defined by the same truth
function in different logics, these n-sided sequent rules for the
connective are the same in those logics.

Rules for ¬
The following rules for ¬ apply to Łukasiewicz and Kleene logics,
and their variants.

𝛤 | 𝛱 | 𝛥,A
¬F
¬A, 𝛤 | 𝛱 | 𝛥
𝛤 | A, 𝛱 | 𝛥
¬U
𝛤 | ¬A, 𝛱 | 𝛥
A, 𝛤 | 𝛱 | 𝛥
¬T
𝛤 | 𝛱 | 𝛥, ¬A

The following rules for ¬ apply to Gödel logic.

𝛤 | A, 𝛱 | 𝛥,A
¬G F
¬A, 𝛤 | 𝛱 | 𝛥

A, 𝛤 | 𝛱 | 𝛥
¬G T
𝛤 | 𝛱 | 𝛥, ¬A

(In Gödel logic, ¬A can never take the value U, so there is no


rule for the middle position.)

Rules for ∧
These are the rules for ∧ in Łukasiewicz, strong Kleene, and
Gödel logic.

A,B, 𝛤 | 𝛱 | 𝛥
∧F
A ∧ B, 𝛤 | 𝛱 | 𝛥
𝛤 | A, 𝛱 | A, 𝛥 𝛤 | B, 𝛱 | B, 𝛥 𝛤 | A,B, 𝛱 | 𝛥
∧U
𝛤 | A ∧ B, 𝛱 | 𝛥
𝛤 | 𝛱 | 𝛥,A 𝛤 | 𝛱 | 𝛥,B
∧T
𝛤 | 𝛱 | 𝛥,A ∧ B

Rules for ∨
These are the rules for ∨ in Łukasiewicz, strong Kleene, and
Gödel logic.

A, 𝛤 | 𝛱 | 𝛥 B, 𝛤 | 𝛱 | 𝛥
∨F
A ∨ B, 𝛤 | 𝛱 | 𝛥
A, 𝛤 | A, 𝛱 | 𝛥 B, 𝛤 | B, 𝛱 | 𝛥 𝛤 | A,B, 𝛱 | 𝛥
∨U
𝛤 | A ∨ B, 𝛱 | 𝛥
𝛤 | 𝛱 | 𝛥,A,B
∨T
𝛤 | 𝛱 | 𝛥,A ∨ B

Rules for →
These are the rules for → in Łukasiewicz logic.

𝛤 | 𝛱 | 𝛥,A B, 𝛤 | 𝛱 | 𝛥
→Ł3 F
A → B, 𝛤 | 𝛱 | 𝛥
𝛤 | A,B, 𝛱 | 𝛥 B, 𝛤 | 𝛱 | 𝛥,A
→Ł 3 U
𝛤 | A → B, 𝛱 | 𝛥
A, 𝛤 | B, 𝛱 | 𝛥,B A, 𝛤 | A, 𝛱 | 𝛥,B
→Ł 3 T
𝛤 | 𝛱 | 𝛥,A → B

These are the rules for → in strong Kleene logic.

𝛤 | 𝛱 | 𝛥,A B, 𝛤 | 𝛱 | 𝛥
→Ks F
A → B, 𝛤 | 𝛱 | 𝛥
B, 𝛤 | B, 𝛱 | 𝛥 𝛤 | A,B, 𝛱 | 𝛥 𝛤 | A, 𝛱 | 𝛥,A
→Ks U
𝛤 | A → B, 𝛱 | 𝛥
A, 𝛤 | 𝛱 | 𝛥,B
→Ks T
𝛤 | 𝛱 | 𝛥,A → B

These are the rules for → in Gödel logic.



𝛤 | A, 𝛱 | 𝛥,A B, 𝛤 | 𝛱 | 𝛥
→G 3 F
A → B, 𝛤 | 𝛱 | 𝛥
𝛤 | B, 𝛱 | 𝛥 𝛤 | 𝛱 | 𝛥,A
→G 3 U
𝛤 | A → B, 𝛱 | 𝛥
A, 𝛤 | B, 𝛱 | 𝛥,B A, 𝛤 | A, 𝛱 | 𝛥,B
→G 3 T
𝛤 | 𝛱 | 𝛥,A → B
Figure 6.1: Example derivation in Ł 3 : a derivation in the 3-sided sequent calculus with end-sequent A → B,A | A → B,A | B. (The two-dimensional proof tree does not reproduce legibly in this format and is not shown.)
CHAPTER 7

Infinite-valued Logics
7.1 Introduction
The number of truth values of a matrix need not be finite. An
obvious choice for a set of infinitely many truth values is the set
of rational numbers between 0 and 1, V∞ = [0, 1] ∩ Q, i.e.,
V∞ = {n/m : n,m ∈ N and n ≤ m}.
When considering this infinite truth value set, it is often useful to
also consider the subsets
Vm = {n/(m − 1) : n ∈ N and n ≤ m − 1}
For instance, V5 is the set with 5 evenly spaced truth values,

V5 = {0, 1/4, 1/2, 3/4, 1}.
In logics based on these truth value sets, usually only 1 is des-
ignated, i.e., V + = {1}. In other words, we let 1 play the role of
(absolute) truth, 0 as absolute falsity, but formulas may take any
intermediate value in V .


One can also consider the set V [0,1] = [0, 1] of all real num-
bers between 0 and 1, or other infinite subsets of [0, 1], however.
Logics with this truth value set are often called fuzzy.

7.2 Łukasiewicz logic

Definition 7.1. Infinite-valued Łukasiewicz logic Ł ∞ is defined


using the matrix:

1. The standard propositional language L0 with ¬, ∧, ∨, →.

2. The set of truth values V∞ .

3. 1 is the only designated value, i.e., V + = {1}.

4. Truth functions are given by the following functions:

˜¬Ł (x) = 1 − x
˜∧Ł (x, y) = min(x, y)
˜∨Ł (x, y) = max(x, y)
˜→Ł (x, y) = min(1, 1 − (x − y)), i.e., 1 if x ≤ y, and 1 − (x − y) otherwise.

m-valued Łukasiewicz logic is defined the same, except V = Vm .
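The truth functions of Definition 7.1 are ordinary arithmetical operations, so individual values are easy to compute exactly. The following Python sketch uses exact rational arithmetic; the function names are our own shorthand, and the printed values are only spot checks at a single valuation, not proofs of tautologyhood.

    from fractions import Fraction

    # The Ł∞ truth functions of Definition 7.1 (our own shorthand names).
    def neg(x):     return 1 - x
    def conj(x, y): return min(x, y)
    def disj(x, y): return max(x, y)
    def cond(x, y): return min(1, 1 - (x - y))

    x, y = Fraction(3, 4), Fraction(1, 2)
    print(cond(x, y))                    # 3/4, since 1 - (3/4 - 1/2) = 3/4
    print(cond(y, x))                    # 1, since 1/2 <= 3/4
    print(disj(cond(x, y), cond(y, x)))  # 1, an instance of (p → q) ∨ (q → p)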

Proposition 7.2. The logic Ł3 defined by Definition 5.1 is the same


as Ł3 defined by Definition 7.1.

Proof. This can be seen by comparing the truth tables for the con-
nectives given in Definition 5.1 with the truth tables determined
by the equations in Definition 7.1:

 ˜¬                  ˜∧Ł3    1   1/2   0
 1     0              1      1   1/2   0
 1/2  1/2            1/2    1/2  1/2   0
 0     1              0      0    0    0

 ˜∨Ł3    1   1/2   0        ˜→Ł3    1   1/2   0
  1      1    1    1          1     1   1/2   0
 1/2     1   1/2  1/2        1/2    1    1   1/2
  0      1   1/2   0          0     1    1    1

Proposition 7.3. If 𝛤 ⊨Ł∞ B then 𝛤 ⊨Łm B for all m ≥ 2.

Proof. Exercise. □

In fact, the converse holds as well.


Infinite-valued Łukasiewicz logic is the most popular fuzzy
logic. In the fuzzy logic literature, the conditional is often defined
as ¬A ∨ B. The result would be an infinite-valued strong Kleene
logic.

7.3 Gödel logics

Definition 7.4. Infinite-valued Gödel logic G∞ is defined using


the matrix:

1. The standard propositional language L0 with ⊥, ¬, ∧, ∨,


→.

2. The set of truth values V∞ .

3. 1 is the only designated value, i.e., V + = {1}.

4. Truth functions are given by the following functions:

˜⊥G = 0
˜¬G (x) = 1 if x = 0, and 0 otherwise
˜∧G (x, y) = min(x, y)
˜∨G (x, y) = max(x, y)
˜→G (x, y) = 1 if x ≤ y, and y otherwise.

m-valued Gödel logic is defined the same, except V = Vm .

Proposition 7.5. The logic G3 defined by Definition 5.6 is the same


as G3 defined by Definition 7.4.

Proof. This can be seen by comparing the truth tables for the con-
nectives given in Definition 5.6 with the truth tables determined
by the equations in Definition 7.4:

 ˜¬G3                ˜∧G     1   1/2   0
 1     0              1      1   1/2   0
 1/2   0             1/2    1/2  1/2   0
 0     1              0      0    0    0

 ˜∨G     1   1/2   0        ˜→G     1   1/2   0
  1      1    1    1          1     1   1/2   0
 1/2     1   1/2  1/2        1/2    1    1    0
  0      1   1/2   0          0     1    1    1

Proposition 7.6. If 𝛤 ⊨G∞ B then 𝛤 ⊨Gm B for all m ≥ 2.

Proof. Exercise. □

In fact, the converse holds as well.



Like G3 , G∞ has all intuitionistically valid formulas as tau-


tologies, and the same examples of non-tautologies are non-
tautologies of G∞ :

p ∨ ¬p
¬¬p → p
((p → q ) → p) → p
(p → q ) → (¬p ∨ q )
¬(p ∧ q ) → (¬p ∨ ¬q )

The example of an intuitionistically invalid formula that is nev-


ertheless a tautology of G3 , (p → q ) ∨ (q → p), is also a tautology
in G∞ . In fact, G∞ can be characterized as intuitionistic logic
to which the schema (A → B) ∨ (B → A) is added. This was
shown by Michael Dummett, and so G∞ is often referred to as
Gödel-Dummett logic LC.
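Again, individual values in G∞ are easy to compute. The sketch below uses our own shorthand names and checks single valuations only, so it proves nothing by itself; it shows ¬¬p → p falling short of 1 at v(p) = 1/2, while the checked instance of (p → q) ∨ (q → p) comes out 1.

    from fractions import Fraction

    # The G∞ truth functions of Definition 7.4 (our own shorthand names).
    def neg(x):     return Fraction(1) if x == 0 else Fraction(0)
    def disj(x, y): return max(x, y)
    def cond(x, y): return Fraction(1) if x <= y else y

    p, q = Fraction(1, 2), Fraction(1, 4)
    print(cond(neg(neg(p)), p))          # 1/2: this instance of ¬¬p → p is not designated
    print(disj(cond(p, q), cond(q, p)))  # 1: an instance of (p → q) ∨ (q → p)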

Problems
Problem 7.1. Prove Proposition 7.3.

Problem 7.2. Show that (p → q ) ∨ (q → p) is a tautology of Ł ∞ .

Problem 7.3. Prove Proposition 7.6.

Problem 7.4. Show that (p → q ) ∨ (q → p) is a tautology of G∞ .


PART III

But isn’t
truth
relative (to a
world)?

CHAPTER 8

Syntax and Semantics
8.1 Introduction
Modal logic deals with modal propositions and the entailment re-
lations among them. Examples of modal propositions are the
following:

1. It is necessary that 2 + 2 = 4.

2. It is necessarily possible that it will rain tomorrow.

3. If it is necessarily possible that A then it is possible that A.

Possibility and necessity are not the only modalities: other unary
connectives are also classified as modalities, for instance, “it
ought to be the case that A,” “It will be the case that A,” “Dana
knows that A,” or “Dana believes that A.”
Modal logic makes its first appearance in Aristotle’s De Inter-
pretatione: he was the first to notice that necessity implies possi-
bility, but not vice versa; that possibility and necessity are inter-
definable; that if A ∧ B is possibly true then A is possibly true
and B is possibly true, but not conversely; and that if A → B is
necessary, then if A is necessary, so is B.


The first modern approach to modal logic was the work of


C. I. Lewis, culminating with Lewis and Langford, Symbolic Logic
(1932). Lewis & Langford were unhappy with the representation
of implication by means of the material conditional: A → B is
a poor substitute for “A implies B.” Instead, they proposed to
characterize implication as “Necessarily, if A then B,” symbolized
as A ⥽ B. In trying to sort out the different properties, Lewis
identified five different modal systems, S1, . . . , S4, S5, the last
two of which are still in use.
The approach of Lewis and Langford was purely syntactical:
they identified reasonable axioms and rules and investigated what
was provable with those means. A semantic approach remained
elusive for a long time, until a first attempt was made by Rudolf
Carnap in Meaning and Necessity (1947) using the notion of a state
description, i.e., a collection of atomic sentences (those that are
“true” in that state description). After lifting the truth definition
to arbitrary sentences A, Carnap defines A to be necessarily true
if it is true in all state descriptions. Carnap’s approach could
not handle iterated modalities, in that sentences of the form “Pos-
sibly necessarily . . . possibly A” always reduce to the innermost
modality.
The major breakthrough in modal semantics came with Saul
Kripke’s article “A Completeness Theorem in Modal Logic” ( JSL
1959). Kripke based his work on Leibniz’s idea that a statement
is necessarily true if it is true “at all possible worlds.” This idea,
though, suffers from the same drawbacks as Carnap’s, in that the
truth of statement at a world w (or a state description s ) does not
depend on w at all. So Kripke assumed that worlds are related
by an accessibility relation R, and that a statement of the form
“Necessarily A” is true at a world w if and only if A is true at all
worlds w ′ accessible from w. Semantics that provide some version
of this approach are called Kripke semantics and made possible
the tumultuous development of modal logics (in the plural).
When interpreted by the Kripke semantics, modal logic shows
us what relational structures look like “from the inside.” A rela-
tional structure is just a set equipped with a binary relation (for

instance, the set of students in the class ordered by their social


security number is a relational structure). But in fact relational
structures come in all sorts of domains: besides relative possibil-
ity of states of the world, we can have epistemic states of some
agent related by epistemic possibility, or states of a dynamical
system with their state transitions, etc. Modal logic can be used
to model all of these: the first gives us ordinary, alethic, modal
logic; the others give us epistemic logic, dynamic logic, etc.
We focus on one particular angle, known to modal logicians
as “correspondence theory.” One of the most significant early
discoveries of Kripke’s is that many properties of the accessibil-
ity relation R (whether it is transitive, symmetric, etc.) can be
characterized in the modal language itself by means of appropri-
ate “modal schemas.” Modal logicians say, for instance, that the
reflexivity of R “corresponds” to the schema “If necessarily A,
then A”. We explore mainly the correspondence theory of a num-
ber of classical systems of modal logic (e.g., S4 and S5) obtained
by a combination of the schemas D, T, B, 4, and 5.

8.2 The Language of Basic Modal Logic

Definition 8.1. The basic language of modal logic contains

1. The propositional constant for falsity ⊥.

2. A countably infinite set of propositional variables: p0 , p1 ,


p2 , . . .

3. The propositional connectives: ¬ (negation), ∧ (conjunc-


tion), ∨ (disjunction), → (conditional).

4. The modal operator □.

5. The modal operator ◇.



Definition 8.2. Formulas of the basic modal language are induc-


tively defined as follows:

1. ⊥ is an atomic formula.

2. Every propositional variable pi is an (atomic) formula.

3. If A is a formula, then ¬A is a formula.

4. If A and B are formulas, then (A ∧ B) is a formula.

5. If A and B are formulas, then (A ∨ B) is a formula.

6. If A and B are formulas, then (A → B) is a formula.

7. If A is a formula, then □A is a formula.

8. If A is a formula, then ◇A is a formula.

9. Nothing else is a formula.

Definition 8.3. Formulas constructed using the defined opera-


tors are to be understood as follows:

1. ⊤ abbreviates ¬⊥.

2. A ↔ B abbreviates (A → B) ∧ (B → A).

If a formula A does not contain □ or ◇, we say it is modal-free.

8.3 Simultaneous Substitution


An instance of a formula A is the result of replacing all occurrences
of a propositional variable in A by some other formula. We will
refer to instances of formulas often, both when discussing validity
and when discussing derivability. It therefore is useful to define
the notion precisely.

Definition 8.4. Where A is a modal formula all of whose propo-


sitional variables are among p 1 , . . . , p n , and D 1 , . . . , D n are also
modal formulas, we define A[D 1 /p 1 , . . . ,D n /p n ] as the result of
simultaneously substituting each D i for pi in A. Formally, this is
a definition by induction on A:

1. A ≡ ⊥: A[D 1 /p 1 , . . . ,D n /p n ] is ⊥.

2. A ≡ q : A[D 1 /p 1 , . . . ,D n /p n ] is q , provided q ̸≡ pi for i = 1,


. . . , n.

3. A ≡ pi : A[D 1 /p 1 , . . . ,D n /p n ] is D i .

4. A ≡ ¬B: A[D 1 /p 1 , . . . ,D n /p n ] is ¬B [D 1 /p 1 , . . . ,D n /p n ].

5. A ≡ (B ∧ C ): A[D 1 /p 1 , . . . ,D n /p n ] is

(B [D 1 /p 1 , . . . ,D n /p n ] ∧ C [D 1 /p 1 , . . . ,D n /p n ]).

6. A ≡ (B ∨ C ): A[D 1 /p 1 , . . . ,D n /p n ] is

(B [D 1 /p 1 , . . . ,D n /p n ] ∨ C [D 1 /p 1 , . . . ,D n /p n ]).

7. A ≡ (B → C ): A[D 1 /p 1 , . . . ,D n /p n ] is

(B [D 1 /p 1 , . . . ,D n /p n ] → C [D 1 /p 1 , . . . ,D n /p n ]).

8. A ≡ (B ↔ C ): A[D 1 /p 1 , . . . ,D n /p n ] is

(B [D 1 /p 1 , . . . ,D n /p n ] ↔ C [D 1 /p 1 , . . . ,D n /p n ]).

9. A ≡ □B: A[D 1 /p 1 , . . . ,D n /p n ] is □B [D 1 /p 1 , . . . ,D n /p n ].

10. A ≡ ◇B: A[D 1 /p 1 , . . . ,D n /p n ] is ◇B [D 1 /p 1 , . . . ,D n /p n ].

The formula A[D 1 /p 1 , . . . ,D n /p n ] is called a substitution instance


of A.

Example 8.5. Suppose A is p 1 → □(p 1 ∧ p 2 ), D 1 is ◇(p 2 → p 3 )


and D 2 is ¬□p 1 . Then A[D 1 /p 1 ,D 2 /p 2 ] is
◇(p 2 → p 3 ) → □(◇(p 2 → p 3 ) ∧ ¬□p 1 )

while A[D 2 /p 1 ,D 1 /p 2 ] is

¬□p 1 → □(¬□p 1 ∧ ◇(p 2 → p 3 ))

Note that simultaneous substitution is in general not the same as


iterated substitution, e.g., compare A[D 1 /p 1 ,D 2 /p 2 ] above with
(A[D 1 /p 1 ]) [D 2 /p 2 ], which is:

◇(p 2 → p 3 ) → □(◇(p 2 → p 3 ) ∧ p 2 ) [¬□p 1 /p 2 ], i.e.,


◇(¬□p 1 → p 3 ) → □(◇(¬□p 1 → p 3 ) ∧ ¬□p 1 )

and with (A[D 2 /p 2 ]) [D 1 /p 1 ]:

p 1 → □(p 1 ∧ ¬□p 1 ) [◇(p 2 → p 3 )/p 1 ], i.e.,


◇(p 2 → p 3 ) → □(◇(p 2 → p 3 ) ∧ ¬□◇(p 2 → p 3 )).
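
The inductive clauses of Definition 8.4 translate directly into a
recursive procedure. Here is a minimal sketch in Python (ours); the
nested-tuple representation of formulas, e.g., ('box', ('var', 'p1'))
for □p 1, is our own choice for illustration.

    # A sketch of simultaneous substitution on formulas represented as
    # nested tuples.
    def subst(A, m):
        """Simultaneously replace each variable p in A by m[p] (if present)."""
        op = A[0]
        if op == 'var':
            return m.get(A[1], A)          # clause for p_i, and for other q
        if op == 'bot':
            return A
        if op in ('not', 'box', 'dia'):
            return (op, subst(A[1], m))
        # binary connectives: 'and', 'or', '->', '<->'
        return (op, subst(A[1], m), subst(A[2], m))

    # Example 8.5: A is p1 -> [](p1 & p2), D1 is <>(p2 -> p3), D2 is ~[]p1.
    A  = ('->', ('var', 'p1'), ('box', ('and', ('var', 'p1'), ('var', 'p2'))))
    D1 = ('dia', ('->', ('var', 'p2'), ('var', 'p3')))
    D2 = ('not', ('box', ('var', 'p1')))

    simultaneous = subst(A, {'p1': D1, 'p2': D2})
    iterated     = subst(subst(A, {'p1': D1}), {'p2': D2})
    print(simultaneous == iterated)   # False: iterating is not the same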

8.4 Relational Models


The basic concept of semantics for normal modal logics is that of
a relational model. It consists of a set of worlds, which are related
by a binary “accessibility relation,” together with an assignment
which determines which propositional variables count as “true”
at which worlds.

Definition 8.6. A model for the basic modal language is a triple


M = ⟨W,R,V ⟩, where

1. W is a nonempty set of “worlds,”

2. R is a binary accessibility relation on W , and

3. V is a function assigning to each propositional variable p


a set V (p) of possible worlds.

[Figure omitted: three worlds, w 1 (p, ¬q ) with arrows to w 2 (p, q )
and to w 3 (¬p, ¬q ).]

Figure 8.1: A simple model.

When Rww ′ holds, we say that w ′ is accessible from w. When


w ∈ V (p) we say p is true at w.

The great advantage of relational semantics is that mod-


els can be represented by means of simple diagrams, such as
the one in Figure 8.1. Worlds are represented by nodes, and
world w ′ is accessible from w precisely when there is an arrow
from w to w ′. Moreover, we label a node (world) by p when
w ∈ V (p), and otherwise by ¬p. Figure 8.1 represents the model
with W = {w 1 ,w 2 ,w 3 }, R = {⟨w 1 ,w 2 ⟩, ⟨w 1 ,w 3 ⟩}, V (p) = {w 1 ,w 2 },
and V (q ) = {w 2 }.

8.5 Truth at a World


Every modal model determines which modal formulas count as
true at which worlds in it. The relation “model M makes for-
mula A true at world w” is the basic notion of relational seman-
tics. The relation is defined inductively and coincides with the
usual characterization using truth tables for the non-modal oper-
ators.

Definition 8.7. Truth of a formula A at w in a model M, in symbols:


M,w ⊩ A, is defined inductively as follows:

1. A ≡ ⊥: Never M,w ⊩ ⊥.

2. M,w ⊩ p iff w ∈ V (p).

3. A ≡ ¬B: M,w ⊩ A iff M,w ⊮ B.

4. A ≡ (B ∧ C ): M,w ⊩ A iff M,w ⊩ B and M,w ⊩ C .

5. A ≡ (B ∨C ): M,w ⊩ A iff M,w ⊩ B or M,w ⊩ C (or both).

6. A ≡ (B → C ): M,w ⊩ A iff M,w ⊮ B or M,w ⊩ C .

7. A ≡ □B: M,w ⊩ A iff M,w ′ ⊩ B for all w ′ ∈ W with Rww ′.

8. A ≡ ◇B: M,w ⊩ A iff M,w ′ ⊩ B for at least one w ′ ∈ W


with Rww ′.

Note that by clause (7), a formula □B is true at w whenever


there are no w ′ with Rww ′. In such a case □B is vacuously true
at w. Also, □B may be satisfied at w even if B is not. The truth
of B at w does not guarantee the truth of ◇B at w. This holds,
however, if Rww, e.g., if R is reflexive. If there is no w ′ such that
Rww ′, then M,w ⊮ ◇A, for any A.
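
Definition 8.7 is a straightforward recursion, and it can be
instructive to let a program do the bookkeeping. The sketch below is
ours; it evaluates formulas, in the tuple representation used in the
substitution sketch above, in the model of Figure 8.1.

    # Truth at a world (Definition 8.7), sketched for the model of Figure 8.1.
    W = {'w1', 'w2', 'w3'}
    R = {('w1', 'w2'), ('w1', 'w3')}
    V = {'p': {'w1', 'w2'}, 'q': {'w2'}}
    M = (W, R, V)

    def holds(M, A, w):
        W, R, V = M
        op = A[0]
        if op == 'bot':
            return False
        if op == 'var':
            return w in V[A[1]]
        if op == 'not':
            return not holds(M, A[1], w)
        if op == 'and':
            return holds(M, A[1], w) and holds(M, A[2], w)
        if op == 'or':
            return holds(M, A[1], w) or holds(M, A[2], w)
        if op == '->':
            return (not holds(M, A[1], w)) or holds(M, A[2], w)
        if op == 'box':
            return all(holds(M, A[1], v) for v in W if (w, v) in R)
        if op == 'dia':
            return any(holds(M, A[1], v) for v in W if (w, v) in R)

    p, q = ('var', 'p'), ('var', 'q')
    print(holds(M, ('dia', q), 'w1'))         # True: q holds at the accessible w2
    print(holds(M, ('box', q), 'w1'))         # False: q fails at the accessible w3
    print(holds(M, ('box', ('bot',)), 'w3'))  # True: vacuously, w3 has no arrows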

Proposition 8.8. 1. M,w ⊩ □A iff M,w ⊩ ¬◇¬A.

2. M,w ⊩ ◇A iff M,w ⊩ ¬□¬A.

Proof. 1. M,w ⊩ ¬◇¬A iff M,w ⊮ ◇¬A by definition of M,w ⊩.


M,w ⊩ ◇¬A iff for some w ′ with Rww ′, M,w ′ ⊩ ¬A.
Hence, M,w ⊮ ◇¬A iff for all w ′ with Rww ′, M,w ′ ⊮ ¬A.
We also have M,w ′ ⊮ ¬A iff M,w ′ ⊩ A. Together we have
M,w ⊩ ¬◇¬A iff for all w ′ with Rww ′, M,w ′ ⊩ A. Again
by definition of M,w ⊩, that is the case iff M,w ⊩ □A.
2. Exercise. □

8.6 Truth in a Model


Sometimes we are interested in which formulas are true at every
world in a given model. Let’s introduce a notation for this.

Definition 8.9. A formula A is true in a model M = ⟨W,R,V ⟩,


written M ⊩ A, if and only if M,w ⊩ A for every w ∈ W .

Proposition 8.10. 1. If M ⊩ A then M ⊮ ¬A, but not vice-versa.

2. If M ⊩ A → B then M ⊩ A only if M ⊩ B, but not vice-versa.

Proof. 1. If M ⊩ A then A is true at all worlds in W , and since


W ≠ ∅, it can’t be that M ⊩ ¬A, or else A would have to be
both true and false at some world.
On the other hand, if M ⊮ ¬A then A is true at some world
w ∈ W . It does not follow that M,w ⊩ A for every w ∈ W .
For instance, in the model of Figure 8.1, M ⊮ ¬p, and also
M ⊮ p.

2. Assume M ⊩ A → B and M ⊩ A; to show M ⊩ B let w ∈ W


be an arbitrary world. Then M,w ⊩ A → B and M,w ⊩ A,
so M,w ⊩ B, and since w was arbitrary, M ⊩ B.
To show that the converse fails, we need to find a model
M such that M ⊩ A only if M ⊩ B, but M ⊮ A → B.
Consider again the model of Figure 8.1: M ⊮ p and hence
(vacuously) M ⊩ p only if M ⊩ q . However, M ⊮ p → q , as
p is true but q false at w 1 . □
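
Continuing the sketch from section 8.5 (it assumes the model M and the
function holds defined there), we can check the claims about the model
of Figure 8.1 made in this proof.

    # Truth in a model (Definition 8.9): true at every world.
    def true_in_model(M, A):
        W, R, V = M
        return all(holds(M, A, w) for w in W)

    print(true_in_model(M, p))               # False: p fails at w3
    print(true_in_model(M, ('not', p)))      # False: p holds at w1
    print(true_in_model(M, ('->', p, q)))    # False: p true, q false at w1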

8.7 Validity
Formulas that are true in all models, i.e., true at every world in
every model, are particularly interesting. They represent those
modal propositions which are true regardless of how □ and ◇ are

interpreted, as long as the interpretation is “normal” in the sense


that it is generated by some accessibility relation on possible
worlds. We call such formulas valid. For instance, □(p ∧ q ) → □p
is valid. Some formulas one might expect to be valid on the basis
of the alethic interpretation of □, such as □p → p, are not valid,
however. Part of the interest of relational models is that different
interpretations of □ and ◇ can be captured by different kinds of
accessibility relations. This suggests that we should define valid-
ity not just relative to all models, but relative to all models of a
certain kind. It will turn out, e.g., that □p → p is true in all mod-
els where every world is accessible from itself, i.e., R is reflexive.
Defining validity relative to classes of models enables us to for-
mulate this succinctly: □p → p is valid in the class of reflexive
models.

Definition 8.11. A formula A is valid in a class C of models if


it is true in every model in C (i.e., true at every world in every
model in C). If A is valid in C, we write C ⊨ A, and we write ⊨ A
if A is valid in the class of all models.

Proposition 8.12. If A is valid in C it is also valid in each class


C′ ⊆ C.

Proposition 8.13. If A is valid, then so is □A.

Proof. Assume ⊨ A. To show ⊨ □A let M = ⟨W,R,V ⟩ be a model


and w ∈ W . If Rww ′ then M,w ′ ⊩ A, since A is valid, and so
also M,w ⊩ □A. Since M and w were arbitrary, ⊨ □A. □

8.8 Tautological Instances


A modal-free formula is a tautology if it is true under every truth-
value assignment. Clearly, every tautology is true at every world
in every model. But for formulas involving □ and ◇, the notion

of tautology is not defined. Is it the case, e.g., that □p ∨ ¬□p—


an instance of the principle of excluded middle—is valid? The
notion of a tautological instance helps: a formula that is a substi-
tution instance of a (non-modal) tautology. It is not surprising,
but still requires proof, that every tautological instance is valid.

Definition 8.14. A modal formula B is a tautological instance


if and only if there is a modal-free tautology A with proposi-
tional variables p 1 , . . . , p n and formulas D 1 , . . . , D n such that
B ≡ A[D 1 /p 1 , . . . ,D n /p n ].

Lemma 8.15. Suppose A is a modal-free formula whose propositional


variables are p 1 , . . . , p n , and let D 1 , . . . , D n be modal formulas. Then
for any assignment v, any model M = ⟨W,R,V ⟩, and any w ∈ W such
that v(pi ) = T if and only if M,w ⊩ D i we have that v ⊨ A if and
only if M,w ⊩ A[D 1 /p 1 , . . . ,D n /p n ].

Proof. By induction on A.

1. A ≡ ⊥: Both v ⊭ ⊥ and M,w ⊮ ⊥.

2. A ≡ pi :

v ⊨ pi ⇔ v(pi ) = T
by definition of v ⊨ pi
⇔ M,w ⊩ D i
by assumption
⇔ M,w ⊩ pi [D 1 /p 1 , . . . ,D n /p n ]
since pi [D 1 /p 1 , . . . ,D n /p n ] ≡ D i .

3. A ≡ ¬B:

v ⊨ ¬B ⇔ v ⊭ B
by definition of v ⊨;
⇔ M,w ⊮ B [D 1 /p 1 , . . . ,D n /p n ]

by induction hypothesis
⇔ M,w ⊩ ¬B [D 1 /p 1 , . . . ,D n /p n ]
by definition of v ⊨.

4. A ≡ (B ∧ C ):

v ⊨ B ∧ C ⇔ v ⊨ B and v ⊨ C
by definition of v ⊨
⇔ M,w ⊩ B [D 1 /p 1 , . . . ,D n /p n ] and
M,w ⊩ C [D 1 /p 1 , . . . ,D n /p n ]
by induction hypothesis
⇔ M,w ⊩ (B ∧ C ) [D 1 /p 1 , . . . ,D n /p n ]
by definition of M,w ⊩.

5. A ≡ (B ∨ C ):

v ⊨ B ∨ C ⇔ v ⊨ B or v ⊨ C
by definition of v ⊨;
⇔ M,w ⊩ B [D 1 /p 1 , . . . ,D n /p n ] or
M,w ⊩ C [D 1 /p 1 , . . . ,D n /p n ]
by induction hypothesis
⇔ M,w ⊩ (B ∨ C ) [D 1 /p 1 , . . . ,D n /p n ]
by definition of M,w ⊩.

6. A ≡ (B → C ):

v ⊨ B → C ⇔ v ⊭ B or v ⊨ C
by definition of v ⊨
⇔ M,w ⊮ B [D 1 /p 1 , . . . ,D n /p n ] or
M,w ⊩ C [D 1 /p 1 , . . . ,D n /p n ]
by induction hypothesis
⇔ M,w ⊩ (B → C ) [D 1 /p 1 , . . . ,D n /p n ]

by definition of M,w ⊩.

Proposition 8.16. All tautological instances are valid.

Proof. Contrapositively, suppose A is such that M,w ⊮


A[D 1 /p 1 , . . . ,D n /p n ], for some model M and world w. Define
an assignment v such that v(pi ) = T if and only if M,w ⊩ D i
(and v assigns arbitrary values to q ∉ {p 1 , . . . , p n }). Then by
Lemma 8.15, v ⊭ A, so A is not a tautology. □

8.9 Schemas and Validity

Definition 8.17. A schema is a set of formulas comprising all and


only the substitution instances of some modal formula C , i.e.,

{B : ∃D 1 , . . . , ∃D n (B = C [D 1 /p 1 , . . . ,D n /p n ])}.

The formula C is called the characteristic formula of the schema,


and it is unique up to a renaming of the propositional variables.
A formula A is an instance of a schema if it is a member of the
set.

It is convenient to denote a schema by the meta-linguistic


expression obtained by substituting ‘A’, ‘B’, . . . , for the atomic
components of C . So, for instance, the following denote schemas:
‘A’, ‘A→□A’, ‘A→(B →A)’. They correspond to the characteristic
formulas p, p → □p, p → (q → p). The schema ‘A’ denotes the
set of all formulas.

Definition 8.18. A schema is true in a model if and only if all of


its instances are; and a schema is valid if and only if it is true in
every model.

Proposition 8.19. The following schema K is valid

□(A → B) → (□A → □B). (K)

Proof. We need to show that all instances of the schema are true
at every world in every model. So let M = ⟨W,R,V ⟩ and w ∈ W
be arbitrary. To show that a conditional is true at a world we
assume the antecedent is true and show that the consequent is true as
well. In this case, let M,w ⊩ □(A → B) and M,w ⊩ □A. We
need to show M,w ⊩ □B. So let w ′ be arbitrary such that Rww ′.
Then by the first assumption M,w ′ ⊩ A → B and by the second
assumption M,w ′ ⊩ A. It follows that M,w ′ ⊩ B. Since w ′ was
arbitrary, M,w ⊩ □B. □

Proposition 8.20. The following schema dual is valid

◇A ↔ ¬□¬A. (dual)

Proof. Exercise. □

Proposition 8.21. If A and A → B are true at a world in a model


then so is B. Hence, the valid formulas are closed under modus ponens.

Proposition 8.22. A formula A is valid iff all its substitution in-


stances are. In other words, a schema is valid iff its characteristic for-
mula is.

Proof. The “if” direction is obvious, since A is a substitution in-


stance of itself.
To prove the “only if” direction, we show the follow-
ing: Suppose M = ⟨W,R,V ⟩ is a modal model, and B ≡
A[D 1 /p 1 , . . . ,D n /p n ] is a substitution instance of A. Define M ′ =
⟨W,R,V ′⟩ by V ′ (pi ) = {w : M,w ⊩ D i }. Then M,w ⊩ B iff
M ′,w ⊩ A, for any w ∈ W . (We leave the proof as an exercise.)

Valid Schemas Invalid Schemas


□(A → B) → (◇A → ◇B) □(A ∨ B) → (□A ∨ □B)
◇(A → B) → (□A → ◇B) (◇A ∧ ◇B) → ◇(A ∧ B)
□(A ∧ B) ↔ (□A ∧ □B) A → □A
□A → □(B → A) □◇A → B
¬◇A → □(A → B) □□A → □A
◇(A ∨ B) ↔ (◇A ∨ ◇B) □◇A → ◇□A.
Table 8.1: Valid and (or?) invalid schemas.

Now suppose that A was valid, but some substitution instance


B of A was not valid. Then for some M = ⟨W,R,V ⟩ and some
w ∈ W , M,w ⊮ B. But then M ′,w ⊮ A by the claim, and A is not
valid, a contradiction. □

Note, however, that it is not true that a schema is true in a


model iff its characteristic formula is. Of course, the “only if”
direction holds: if every instance of A is true in M, A itself is
true in M. But it may happen that A is true in M but some
instance of A is false at some world in M. For a very simple
counterexample consider p in a model with only one world w
and V (p) = {w }, so that p is true at w. But ⊥ is an instance of p,
and not true at w.

8.10 Entailment
With the definition of truth at a world, we can define an entail-
ment relation between formulas. A formula B entails A iff, when-
ever B is true, A is true as well. Here, “whenever” means both
“whichever model we consider” as well as “whichever world in
that model we consider.”

Definition 8.23. If 𝛤 is a set of formulas and A a formula, then


𝛤 entails A, in symbols: 𝛤 ⊨ A, if and only if for every model

[Figure omitted: world w 1 (¬p) with arrows to w 2 (p) and to w 3 (p).]

Figure 8.2: Counterexample to p → ◇p ⊨ □p → p.

M = ⟨W,R,V ⟩ and world w ∈ W , if M,w ⊩ B for every B ∈ 𝛤,


then M,w ⊩ A. If 𝛤 contains a single formula B, then we write
B ⊨ A.

Example 8.24. To show that a formula entails another, we have


to reason about all models, using the definition of M,w ⊩. For
instance, to show p → ◇p ⊨ □¬p → ¬p, we might argue as fol-
lows: Consider a model M = ⟨W,R,V ⟩ and w ∈ W , and suppose
M,w ⊩ p → ◇p. We have to show that M,w ⊩ □¬p → ¬p. Sup-
pose not. Then M,w ⊩ □¬p and M,w ⊮ ¬p. Since M,w ⊮ ¬p,
M,w ⊩ p. By assumption, M,w ⊩ p → ◇p, hence M,w ⊩ ◇p. By
definition of M,w ⊩ ◇p, there is some w ′ with Rww ′ such that
M,w ′ ⊩ p. Since also M,w ⊩ □¬p, M,w ′ ⊩ ¬p, a contradiction.
To show that a formula B does not entail another A, we have
to give a counterexample, i.e., a model M = ⟨W,R,V ⟩ where we
show that at some world w ∈ W , M,w ⊩ B but M,w ⊮ A. Let’s
show that p → ◇p ⊭ □p → p. Consider the model in Figure 8.2.
We have M,w 1 ⊩ ◇p and hence M,w 1 ⊩ p → ◇p. However, since
M,w 1 ⊩ □p but M,w 1 ⊮ p, we have M,w 1 ⊮ □p → p.
Often very simple counterexamples suffice. The model M ′ =
⟨W ′,R ′,V ′⟩ with W ′ = {w }, R ′ = ∅, and V ′ (p) = ∅ is also a

counterexample: Since M ′,w ⊮ p, M ′,w ⊩ p → ◇p. As no worlds


are accessible from w, we have M ′,w ⊩ □p, and so M ′,w ⊮ □p →
p.
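
One can also check such counterexamples mechanically. The following
continues the evaluation sketch from section 8.5 (the function holds
and the tuple representation are taken from there); it verifies both
the countermodel of Figure 8.2 and the simpler one-world model M ′
just described.

    # Counterexample check for p -> <>p |= []p -> p (using holds from above).
    M2 = ({'w1', 'w2', 'w3'},
          {('w1', 'w2'), ('w1', 'w3')},
          {'p': {'w2', 'w3'}})
    premise    = ('->', ('var', 'p'), ('dia', ('var', 'p')))
    conclusion = ('->', ('box', ('var', 'p')), ('var', 'p'))
    print(holds(M2, premise, 'w1'), holds(M2, conclusion, 'w1'))  # True False

    # The one-world model with no arrows and p false everywhere:
    M3 = ({'w'}, set(), {'p': set()})
    print(holds(M3, premise, 'w'), holds(M3, conclusion, 'w'))    # True False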

Problems
Problem 8.1. Consider the model of Figure 8.1. Which of the
following hold?

1. M,w 1 ⊩ q ;

2. M,w 3 ⊩ ¬q ;

3. M,w 1 ⊩ p ∨ q ;

4. M,w 1 ⊩ □(p ∨ q );

5. M,w 3 ⊩ □q ;

6. M,w 3 ⊩ □⊥;

7. M,w 1 ⊩ ◇q ;

8. M,w 1 ⊩ □q ;

9. M,w 1 ⊩ ¬□□¬q .

Problem 8.2. Complete the proof of Proposition 8.8.

Problem 8.3. Let M = ⟨W,R,V ⟩ be a model, and suppose


w 1 ,w 2 ∈ W are such that:

1. w 1 ∈ V (p) if and only if w 2 ∈ V (p); and

2. for all w ∈ W : Rw 1w if and only if Rw 2w.

Using induction on formulas, show that for all formulas A:


M,w 1 ⊩ A if and only if M,w 2 ⊩ A.

Problem 8.4. Let M = ⟨W,R,V ⟩. Show that M,w ⊩ ¬◇A if and


only if M,w ⊩ □¬A.

Problem 8.5. Consider the following model M for the language


comprising p 1 , p 2 , p 3 as the only propositional variables:

[Figure omitted: a model M with three worlds, w 1 (p 1 , ¬p 2 , ¬p 3 ),
w 2 (p 1 , p 2 , ¬p 3 ), and w 3 (p 1 , p 2 , p 3 ), related by the
accessibility arrows shown in the original diagram.]

Are the following formulas and schemas true in the model M,


i.e., true at every world in M? Explain.

1. p → ◇p (for p atomic);

2. A → ◇A (for A arbitrary);

3. □p → p (for p atomic);

4. ¬p → ◇□p (for p atomic);

5. ◇□A (for A arbitrary);

6. □◇p (for p atomic).

Problem 8.6. Show that the following are valid:

1. ⊨ □p → □(q → p);

2. ⊨ □¬⊥;

3. ⊨ □p → (□q → □p).

Problem 8.7. Show that A →□A is valid in the class C of models


M = ⟨W,R,V ⟩ where W = {w }. Similarly, show that B → □A and
◇A → B are valid in the class of models M = ⟨W,R,V ⟩ where
R = ∅.

Problem 8.8. Prove Proposition 8.20.



Problem 8.9. Prove the claim in the “only if” part of the proof
of Proposition 8.22. (Hint: use induction on A.)

Problem 8.10. Show that none of the following formulas are


valid:

D: □p → ◇p;

T: □p → p;

B: p → □◇p;

4: □p → □□p;

5: ◇p → □◇p.

Problem 8.11. Prove that the schemas in the first column of Ta-
ble 8.1 are valid and those in the second column are not valid.

Problem 8.12. Decide whether the following schemas are valid


or invalid:

1. (◇A → □B) → (□A → □B);

2. ◇(A → B) ∨ □(B → A).

Problem 8.13. For each of the following schemas find a model


M such that every instance of the formula is true in M:

1. p → ◇◇p;

2. ◇p → □p.

Problem 8.14. Show that □(A ∧ B) ⊨ □A.

Problem 8.15. Show that □(p → q ) ⊭ p → □q and p → □q ⊭


□(p → q ).
CHAPTER 9

Axiomatic Derivations
9.1 Introduction
We have a semantics for the basic modal language in terms of
modal models, and a notion of a formula being valid—true at
all worlds in all models—or valid with respect to some class of
models or frames—true at all worlds in all models in the class, or
based on the frame. Logic usually connects such semantic charac-
terizations of validity with a proof-theoretic notion of derivability.
The aim is to define a notion of derivability in some system such
that a formula is derivable iff it is valid.
The simplest and historically oldest derivation systems are
so-called Hilbert-type or axiomatic derivation systems. Hilbert-
type derivation systems for many modal logics are relatively easy
to construct: they are simple as objects of metatheoretical study
(e.g., to prove soundness and completeness). However, they are
much harder to use to prove formulas in than, say, natural deduc-
tion systems.
In Hilbert-type derivation systems, a derivation of a formula is
a sequence of formulas leading from certain axioms, via a handful
of inference rules, to the formula in question. Since we want the
derivation system to match the semantics, we have to guarantee


that the set of derivable formulas are true in all models (or true in
all models in which all axioms are true). We’ll first isolate some
properties of modal logics that are necessary for this to work: the
“normal” modal logics. For normal modal logics, there are only
two inference rules that need to be assumed: modus ponens and
necessitation. As axioms we take all (substitution instances) of
tautologies, and, depending on the modal logic we deal with, a
number of modal axioms. Even if we are just interested in the
class of all models, we must also count all substitution instances
of K and Dual as axioms. This alone generates the minimal nor-
mal modal logic K.

Definition 9.1. The rule of modus ponens is the inference schema


    A    A → B
    ----------- mp
         B

We say a formula B follows from formulas A, C by modus ponens


iff C ≡ A → B.

Definition 9.2. The rule of necessitation is the inference schema


     A
    ---- nec
    □A

We say the formula B follows from the formulas A by necessitation


iff B ≡ □A.

Definition 9.3. A derivation from a set of axioms 𝛴 is a sequence


of formulas B 1 , B 2 , . . . , Bn , where each Bi is either

1. a substitution instance of a tautology, or

2. a substitution instance of a formula in 𝛴 , or

3. follows from two formulas B j , Bk with j , k < i by modus



ponens, or

4. follows from a formula B j with j < i by necessitation.

If there is such a derivation with Bn ≡ A, we say that A is derivable


from 𝛴 , in symbols 𝛴 ⊢ A.

With this definition, it will turn out that the set of derivable
formulas forms a normal modal logic, and that any derivable for-
mula is true in every model in which every axiom is true. This
property of derivations is called soundness. The converse, com-
pleteness, is harder to prove.
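
Definition 9.3 is mechanical enough to check by machine. Below is a
rough sketch in Python (our own, and deliberately partial: lines
justified as tautologies or axiom instances are simply taken on trust)
that verifies the mp and nec steps of a derivation given as a list of
lines; it is run on the first K-proof of the next section
(Proposition 9.4), with A = p and B = q.

    # Checking mp and nec steps in a Hilbert-style derivation (a sketch).
    # Each line is (formula, rule, referenced_line_indices).
    def check(lines):
        for i, (formula, rule, refs) in enumerate(lines):
            if rule in ('taut', 'ax'):
                continue            # taken on trust in this sketch
            if rule == 'mp':
                a = lines[refs[0]][0]
                c = lines[refs[1]][0]
                assert c == ('->', a, formula), f'bad mp at line {i}'
            elif rule == 'nec':
                assert formula == ('box', lines[refs[0]][0]), f'bad nec at line {i}'
        return True

    p, q = ('var', 'p'), ('var', 'q')
    l1 = ('->', p, ('->', q, p))                                 # tautology
    l2 = ('box', l1)                                             # nec, 1
    l3 = ('->', l2, ('->', ('box', p), ('box', ('->', q, p))))   # instance of K
    l4 = ('->', ('box', p), ('box', ('->', q, p)))               # mp, 2, 3
    print(check([(l1, 'taut', ()), (l2, 'nec', (0,)),
                 (l3, 'ax', ()), (l4, 'mp', (1, 2))]))           # True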

9.2 Proofs in K
In order to practice proofs in the smallest modal system, we show
the valid formulas on the left-hand side of Table 8.1 can all be
given K-proofs.

Proposition 9.4. K ⊢ □A → □(B → A)

Proof.

1. A → (B → A) taut
2. □(A → (B → A)) nec, 1
3. □(A → (B → A)) → (□A → □(B → A)) K
4. □A → □(B → A) mp, 2, 3 □

Proposition 9.5. K ⊢ □(A ∧ B) → (□A ∧ □B)

Proof.

1. (A ∧ B) → A taut
2. □((A ∧ B) → A) nec
3. □((A ∧ B) → A) → (□(A ∧ B) → □A) K
4. □(A ∧ B) → □A mp, 2, 3
5. (A ∧ B) → B taut
6. □((A ∧ B) → B) nec
7. □((A ∧ B) → B) → (□(A ∧ B) → □B) K
8. □(A ∧ B) → □B mp, 6, 7
9. (□(A ∧ B) → □A) →
((□(A ∧ B) → □B) →
(□(A ∧ B) → (□A ∧ □B))) taut
10. (□(A ∧ B) → □B) →
(□(A ∧ B) → (□A ∧ □B)) mp, 4, 9
11. □(A ∧ B) → (□A ∧ □B) mp, 8, 10.
Note that the formula on line 9 is an instance of the tautology
(p → q ) → ((p → r ) → (p → (q ∧ r ))). □

Proposition 9.6. K ⊢ (□A ∧ □B) → □(A ∧ B)

Proof.
1. A → (B → (A ∧ B)) taut
2. □(A → (B → (A ∧ B))) nec, 1
3. □(A → (B → (A ∧ B))) → (□A → □(B → (A ∧ B))) K
4. □A → □(B → (A ∧ B)) mp, 2, 3
5. □(B → (A ∧ B)) → (□B → □(A ∧ B)) K
6. (□A → □(B → (A ∧ B))) →
((□(B → (A ∧ B)) → (□B → □(A ∧ B))) →
(□A → (□B → □(A ∧ B)))) taut
7. (□(B → (A ∧ B)) → (□B → □(A ∧ B))) →
(□A → (□B → □(A ∧ B))) mp, 4, 6
8. □A → (□B → □(A ∧ B)) mp, 5, 7
9. (□A → (□B → □(A ∧ B))) →
((□A ∧ □B) → □(A ∧ B)) taut
10. (□A ∧ □B) → □(A ∧ B) mp, 8, 9

The formulas on lines 6 and 9 are instances of the tautologies

(p → q ) → ((q → r ) → (p → r ))
(p → (q → r )) → ((p ∧ q ) → r ) □

Proposition 9.7. K ⊢ ¬□p → ◇¬p

Proof.

1. ◇¬p ↔ ¬□¬¬p dual


2. (◇¬p ↔ ¬□¬¬p) →
(¬□¬¬p → ◇¬p) taut
3. ¬□¬¬p → ◇¬p mp, 1, 2
4. ¬¬p → p taut
5. □(¬¬p → p) nec, 4
6. □(¬¬p → p) → (□¬¬p → □p) K
7. (□¬¬p → □p) mp, 5, 6
8. (□¬¬p → □p) → (¬□p → ¬□¬¬p) taut
9. ¬□p → ¬□¬¬p mp, 7, 8
10. (¬□p → ¬□¬¬p) →
((¬□¬¬p → ◇¬p) → (¬□p → ◇¬p)) taut
11. (¬□¬¬p → ◇¬p) → (¬□p → ◇¬p) mp, 9, 10
12. ¬□p → ◇¬p mp, 3, 11

The formulas on lines 8 and 10 are instances of the tautologies

(p → q ) → (¬q → ¬p)
(p → q ) → ((q → r ) → (p → r )). □

9.3 Derived Rules


Finding and writing derivations is obviously difficult, cumber-
some, and repetitive. For instance, very often we want to pass
from A → B to □A → □B, i.e., apply rule rk. That requires an
application of nec, then recording the proper instance of K, then

applying mp. Passing from A → B and B → C to A → C requires


recording the (long) tautological instance
(A → B) → ((B → C ) → (A → C ))
and applying mp twice. Often we want to replace a sub-formula
by a formula we know to be equivalent, e.g., ◇A by ¬□¬A, or
¬¬A by A. So rather than write out the actual derivation, it is
more convenient to simply record why the intermediate steps are
derivable. For this purpose, let us collect some facts about deriv-
ability.

Proposition 9.8. If K ⊢ A1 , . . . , K ⊢ An , and B follows from A1 ,


. . . , An by propositional logic, then K ⊢ B.

Proof. If B follows from A1 , . . . , An by propositional logic, then


A1 → (A2 → · · · (An → B) . . . )
is a tautological instance. Applying mp n times gives a derivation
of B. □

We will indicate use of this proposition by pl.

Proposition 9.9. If K ⊢ A1 → (A2 → · · · (An−1 → An ) . . . ) then


K ⊢ □A1 → (□A2 → · · · (□An−1 → □An ) . . . ).

Proof. By induction on n, just as in the proof of Proposition 12.3.□

We will indicate use of this proposition by rk. Let’s illustrate


how these results help establishing derivability results more eas-
ily.

Proposition 9.10. K ⊢ (□A ∧ □B) → □(A ∧ B)

Proof.
1. K ⊢ A → (B → (A ∧ B)) taut
2. K ⊢ □A → (□B → □(A ∧ B))) rk, 1
3. K ⊢ (□A ∧ □B) → □(A ∧ B) pl, 2 □

Proposition 9.11. If K ⊢ A↔B and K ⊢ C [A/q ] then K ⊢ C [B/q ]

Proof. Exercise. □

This proposition comes in handy especially when we want


to convert ◇ into □ (or vice versa), or remove double nega-
tions inside a formula. In what follows, we will mark applica-
tions of Proposition 9.11 by “A for B” whenever we replace a
sub-formula B by the equivalent formula A, i.e., pass from a
formula C (B) to C (A). In other words, “A for B” abbreviates:
⊢ C (B)
⊢ A ↔ B
⊢ C (A) by Proposition 9.11
For instance:
Proposition 9.12. K ⊢ ¬□p → ◇¬p

Proof.
1. K ⊢ ◇¬p ↔ ¬□¬¬p dual
2. K ⊢ ¬□¬¬p → ◇¬p pl, 1
3. K ⊢ ¬□p → ◇¬p p for ¬¬p □

In the above derivation, the final step “p for ¬¬p” is short for
K ⊢ ¬□¬¬p → ◇¬p
K ⊢ ¬¬p ↔ p taut
K ⊢ ¬□p → ◇¬p by Proposition 9.11
The roles of C (q ), A, and B in Proposition 9.11 are played here,
respectively, by ¬□q → ◇¬p, ¬¬p, and p.
When a formula contains a sub-formula ¬◇A, we can replace
it by □¬A using Proposition 9.11, since K ⊢ ¬◇A ↔ □¬A. We’ll
indicate this and similar replacements simply by “□¬ for ¬◇.”
The following proposition justifies that we can establish deriv-
ability results schematically. E.g., the previous proposition does
not just establish that K ⊢ ¬□p → ◇¬p, but K ⊢ ¬□A → ◇¬A for
arbitrary A.

Proposition 9.13. If A is a substitution instance of B and K ⊢ B,


then K ⊢ A.

Proof. It is tedious but routine to verify (by induction on the


length of the derivation of B) that applying a substitution to
an entire derivation also results in a correct derivation. Specif-
ically, substitution instances of tautological instances are them-
selves tautological instances, substitution instances of instances
of dual and K are themselves instances of dual and K, and appli-
cations of mp and nec remain correct when substituting formulas
for propositional variables in both premise(s) and conclusion. □

9.4 More Proofs in K


Let’s see some more examples of derivability in K, now using the
simplified method introduced in section 9.3.

Proposition 9.14. K ⊢ □(A → B) → (◇A → ◇B)

Proof.

1. K ⊢ (A → B) → (¬B → ¬A) pl
2. K ⊢ □(A → B) → (□¬B → □¬A) rk, 1
3. K ⊢ (□¬B → □¬A) → (¬□¬A → ¬□¬B) taut
4. K ⊢ □(A → B) → (¬□¬A → ¬□¬B) pl, 2, 3
5. K ⊢ □(A → B) → (◇A → ◇B) ◇ for ¬□¬. □

Proposition 9.15. K ⊢ □A → (◇(A → B) → ◇B)

Proof.

1. K ⊢ A → (¬B → ¬(A → B)) taut


2. K ⊢ □A → (□¬B → □¬(A → B)) rk, 1
3. K ⊢ □A → (¬□¬(A → B) → ¬□¬B) pl, 2
4. K ⊢ □A → (◇(A → B) → ◇B) ◇ for ¬□¬. □

Proposition 9.16. K ⊢ (◇A ∨ ◇B) → ◇(A ∨ B)

Proof.

1. K ⊢ ¬(A ∨ B) → ¬A taut
2. K ⊢ □¬(A ∨ B) → □¬A rk, 1
3. K ⊢ ¬□¬A → ¬□¬(A ∨ B) pl, 2
4. K ⊢ ◇A → ◇(A ∨ B) ◇ for ¬□¬
5. K ⊢ ◇B → ◇(A ∨ B) similarly
6. K ⊢ (◇A ∨ ◇B) → ◇(A ∨ B) pl, 4, 5. □

Proposition 9.17. K ⊢ ◇(A ∨ B) → (◇A ∨ ◇B)

Proof.

1. K ⊢ ¬A → (¬B → ¬(A ∨ B)) taut
2. K ⊢ □¬A → (□¬B → □¬(A ∨ B)) rk, 1
3. K ⊢ □¬A → (¬□¬(A ∨ B) → ¬□¬B) pl, 2
4. K ⊢ ¬□¬(A ∨ B) → (□¬A → ¬□¬B) pl, 3
5. K ⊢ ¬□¬(A ∨ B) → (¬¬□¬B → ¬□¬A) pl, 4
6. K ⊢ ◇(A ∨ B) → (¬◇B → ◇A) ◇ for ¬□¬
7. K ⊢ ◇(A ∨ B) → (◇B ∨ ◇A) pl, 6. □

Problems
Problem 9.1. Find derivations in K for the following formulas:

1. □¬p → □(p → q )

2. (□p ∨ □q ) → □(p ∨ q )

3. ◇p → ◇(p ∨ q )

Problem 9.2. Prove Proposition 9.11 by proving, by induction


on the complexity of C , that if K ⊢ A ↔ B then K ⊢ C [A/q ] ↔
C [B/q ].

Problem 9.3. Show that the following derivability claims hold:

1. K ⊢ ◇¬⊥ → (□A → ◇A);

2. K ⊢ □(A ∨ B) → (◇A ∨ □B);

3. K ⊢ (◇A → □B) → □(A → B).


CHAPTER 10

Modal Tableaux
10.1 Introduction
Tableaux are certain (downward-branching) trees of signed for-
mulas, i.e., pairs consisting of a truth value sign (T or F) and a
sentence
T A or F A.
A tableau begins with a number of assumptions. Each further
signed formula is generated by applying one of the inference
rules. Some inference rules add one or more signed formulas
to a tip of the tree; others add two new tips, resulting in two
branches. Rules result in signed formulas where the formula is
less complex than that of the signed formula to which it was ap-
plied. When a branch contains both T A and F A, we say the
branch is closed. If every branch in a tableau is closed, the entire
tableau is closed. A closed tableau constitutes a derivation that
shows that the set of signed formulas which were used to begin
the tableau are unsatisfiable. This can be used to define a ⊢ rela-
tion: 𝛤 ⊢ A iff there is some finite set 𝛤0 = {B 1 , . . . ,Bn } ⊆ 𝛤 such
that there is a closed tableau for the assumptions

{F A, T B 1 , . . . , T Bn }.


For modal logics, we have to both extend the notion of signed


formula and add rules that cover □ and ◇. In addition to a
sign (T or F), formulas in modal tableaux also have prefixes 𝜎.
The prefixes are non-empty sequences of positive integers, i.e.,
𝜎 ∈ (Z+ ) ∗ \ {𝛬}. We write such prefixes without the sur-
rounding ⟨ ⟩, and separate the individual elements by .’s instead
of ,’s. If 𝜎 is a prefix, then 𝜎.n is 𝜎 ⌢ ⟨n⟩; e.g., if 𝜎 = 1.2.1, then
𝜎.3 is 1.2.1.3. So for instance,

1.2 T □A → A

is a prefixed signed formula (or just a prefixed formula for short).


Intuitively, the prefix names a world in a model that might
satisfy the formulas on a branch of a tableau, and if 𝜎 names
some world, then 𝜎.n names a world accessible from (the world
named by) 𝜎.
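
If one wants to implement prefixed tableaux, a natural representation
(ours, chosen only for illustration) of prefixes is as Python tuples
of positive integers; 𝜎.n is then just 𝜎 with n appended.

    # Prefixes as tuples; sigma.n is concatenation with (n,).
    def extend(sigma, n):
        return sigma + (n,)

    sigma = (1, 2, 1)
    print(extend(sigma, 3))                              # (1, 2, 1, 3)
    print('.'.join(str(k) for k in extend(sigma, 3)))    # 1.2.1.3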

10.2 Rules for K


The rules for the regular propositional connectives are the same
as for regular propositional signed tableaux, just with prefixes
added. In each case, the rule applied to a signed formula 𝜎 S A
produces new formulas that are also prefixed by 𝜎. This should
be intuitively clear: e.g., if A ∧ B is true at (a world named by) 𝜎,
then A and B are true at 𝜎 (and not at any other world). We
collect the propositional rules in Table 10.1.
The closure condition is the same as for ordinary tableaux,
although we require that not just the formulas but also the prefixes
must match. So a branch is closed if it contains both

𝜎TA and 𝜎FA

for some prefix 𝜎 and formula A.


The rule for setting up assumptions is also as for ordinary
tableaux, except that we always use the prefix 1 for assumptions.
(It does not matter which prefix we use, as long as it’s the same

    𝜎 T ¬A              𝜎 F ¬A
    ------- ¬T          ------- ¬F
    𝜎 F A               𝜎 T A

    𝜎 T A ∧ B           𝜎 F A ∧ B
    ---------- ∧T       --------------- ∧F
    𝜎 T A               𝜎 F A  |  𝜎 F B
    𝜎 T B

    𝜎 T A ∨ B           𝜎 F A ∨ B
    --------------- ∨T  ---------- ∨F
    𝜎 T A  |  𝜎 T B     𝜎 F A
                        𝜎 F B

    𝜎 T A → B           𝜎 F A → B
    --------------- →T  ---------- →F
    𝜎 F A  |  𝜎 T B     𝜎 T A
                        𝜎 F B

Table 10.1: Prefixed tableau rules for the propositional connectives

for all assumptions.) So, e.g., we say that

B 1 , . . . ,Bn ⊢ A

iff there is a closed tableau for the assumptions

1 T B 1 , . . . , 1 T Bn , 1 F A.

For the modal operators □ and ◇, the prefix of the conclusion


of the rule applied to a formula with prefix 𝜎 is 𝜎.n. However,
which n is allowed depends on whether the sign is T or F.
The T□ rule extends a branch containing 𝜎 T □A by 𝜎.n T A.
Similarly, the F◇ rule extends a branch containing 𝜎 F ◇A by
𝜎.n F A. They can only be applied for a prefix 𝜎.n which already
occurs on the branch in which it is applied. Let’s call such a prefix
“used” (on the branch).
The F□ rule extends a branch containing 𝜎 F □A by 𝜎.n F A.
Similarly, the T◇ rule extends a branch containing 𝜎 T ◇A by

    𝜎 T □A              𝜎 F □A
    --------- □T        --------- □F
    𝜎.n T A             𝜎.n F A

    𝜎.n is used         𝜎.n is new

    𝜎 T ◇A              𝜎 F ◇A
    --------- ◇T        --------- ◇F
    𝜎.n T A             𝜎.n F A

    𝜎.n is new          𝜎.n is used

Table 10.2: The modal rules for K.

𝜎.n T A. These rules, however, can only be applied for a pre-


fix 𝜎.n which does not already occur on the branch in which it is
applied. We call such prefixes “new” (to the branch).
The rules are given in Table 10.2.
The restriction that the prefix for □T must be used is necessary,
as otherwise we would count the following as a closed tableau:

1. 1T □A Assumption
2. 1F ◇A Assumption
3. 1.1 T A □T 1
4. 1.1 F A ◇F 2

But □A ⊭ ◇A, so our proof system would be unsound. Like-


wise, ◇A ⊭ □A, but without the restriction that the prefix for □F
must be new, this would be a closed tableau:

1. 1T ◇A Assumption
2. 1F □A Assumption
3. 1.1 T A ◇T 1
4. 1.1 F A □F 2

10.3 Tableaux for K


Example 10.1. We give a closed tableau that shows ⊢ (□A ∧
□B) → □(A ∧ B).

1. 1 F (□A ∧ □B) → □(A ∧ B) Assumption


2. 1 T □A ∧ □B →F 1
3. 1 F □(A ∧ B) →F 1
4. 1 T □A ∧T 2
5. 1 T □B ∧T 2
6. 1.1 F A ∧ B □F 3

7. 1.1 F A 1.1 F B ∧F 6
8. 1.1 T A 1.1 T B □T 4; □T 5
⊗ ⊗

Example 10.2. We give a closed tableau that shows ⊢ ◇(A ∨


B) → (◇A ∨ ◇B):

1. 1 F ◇(A ∨ B) → (◇A ∨ ◇B) Assumption


2. 1 T ◇(A ∨ B) →F 1
3. 1 F ◇A ∨ ◇B →F 1
4. 1 F ◇A ∨F 3
5. 1 F ◇B ∨F 3
6. 1.1 T A ∨ B ◇T 2

7. 1.1 T A 1.1 T B ∨T 6
8. 1.1 F A 1.1 F B ◇F 4; ◇F 5
⊗ ⊗

10.4 Soundness for K


In order to show that prefixed tableaux are sound, we have to
show that if
1 T B1 , . . . , 1 T Bn , 1 F A
has a closed tableau then B 1 , . . . ,Bn ⊨ A. It is easier to prove
the contrapositive: if for some M and world w, M,w ⊩ Bi for all
i = 1, . . . , n but M,w ⊩ A, then no tableau can close. Such a
countermodel shows that the initial assumptions of the tableau
are satisfiable. The strategy of the proof is to show that whenever
all the prefixed formulas on a tableau branch are satisfiable, any
application of a rule results in at least one extended branch that
is also satisfiable. Since closed branches are unsatisfiable, any
tableau for a satisfiable set of prefixed formulas must have at least
one open branch.
In order to apply this strategy in the modal case, we have to
extend our definition of “satisfiable” to modal models and pre-
fixes. With that in hand, however, the proof is straightforward.

Definition 10.3. Let P be some set of prefixes, i.e., P ⊆ (Z+ ) ∗ \


{𝛬} and let M be a model. A function f : P → W is an inter-
pretation of P in M if, whenever 𝜎 and 𝜎.n are both in P , then

R f (𝜎) f (𝜎.n).
Relative to an interpretation of prefixes P we can define:

1. M satisfies 𝜎 T A iff M, f (𝜎) ⊩ A.

2. M satisfies 𝜎 F A iff M, f (𝜎) ⊮ A.

Definition 10.4. Let 𝛤 be a set of prefixed formulas, and let


P (𝛤) be the set of prefixes that occur in it. If f is an interpretation
of P (𝛤) in M, we say that M satisfies 𝛤 with respect to f , M, f ⊩
𝛤, if M satisfies every prefixed formula in 𝛤 with respect to f . 𝛤
is satisfiable iff there is a model M and interpretation f of P (𝛤)
such that M, f ⊩ 𝛤.

Proposition 10.5. If 𝛤 contains both 𝜎 T A and 𝜎 F A, for some for-


mula A and prefix 𝜎, then 𝛤 is unsatisfiable.

Proof. There cannot be a model M and interpretation f of P (𝛤)


such that both M, f (𝜎) ⊩ A and M, f (𝜎) ⊮ A. □

Theorem 10.6 (Soundness). If 𝛤 has a closed tableau, 𝛤 is unsat-


isfiable.

Proof. We call a branch of a tableau satisfiable iff the set of signed


formulas on it is satisfiable, and let’s call a tableau satisfiable if it
contains at least one satisfiable branch.
We show the following: Extending a satisfiable tableau by one
of the rules of inference always results in a satisfiable tableau.
This will prove the theorem: any closed tableau results by apply-
ing rules of inference to the tableau consisting only of assump-
tions from 𝛤. So if 𝛤 were satisfiable, any tableau for it would
be satisfiable. A closed tableau, however, is clearly not satisfi-
able, since all its branches are closed and closed branches are
unsatisfiable.

Suppose we have a satisfiable tableau, i.e., a tableau with at


least one satisfiable branch. Applying a rule of inference either
adds signed formulas to a branch, or splits a branch in two. If
the tableau has a satisfiable branch which is not extended by the
rule application in question, it remains a satisfiable branch in
the extended tableau, so the extended tableau is satisfiable. So
we only have to consider the case where a rule is applied to a
satisfiable branch.
Let 𝛤 be the set of signed formulas on that branch, and let
𝜎 S A ∈ 𝛤 be the signed formula to which the rule is applied. If
the rule does not result in a split branch, we have to show that the
extended branch, i.e., 𝛤 together with the conclusions of the rule,
is still satisfiable. If the rule results in split branch, we have to
show that at least one of the two resulting branches is satisfiable.
First, we consider the possible inferences with only one premise.

1. The branch is expanded by applying ¬T to 𝜎 T ¬B ∈ 𝛤.


Then the extended branch contains the signed formulas
𝛤 ∪ {𝜎 F B }. Suppose M, f ⊩ 𝛤. In particular, M, f (𝜎) ⊩
¬B. Thus, M, f (𝜎) ⊮ B, i.e., M satisfies 𝜎 F B with respect
to f .

2. The branch is expanded by applying ¬F to 𝜎 F ¬B ∈ 𝛤:


Exercise.

3. The branch is expanded by applying ∧T to 𝜎 T B ∧ C ∈ 𝛤,


which results in two new signed formulas on the branch:
𝜎 T B and 𝜎 T C . Suppose M, f ⊩ 𝛤, in particular
M, f (𝜎) ⊩ B ∧ C . Then M, f (𝜎) ⊩ B and M, f (𝜎) ⊩ C .
This means that M satisfies both 𝜎 T B and 𝜎 T C with re-
spect to f .

4. The branch is expanded by applying ∨F to F B ∨ C ∈ 𝛤:


Exercise.

5. The branch is expanded by applying →F to 𝜎 F B →


C ∈ 𝛤: This results in two new signed formulas on the

branch: 𝜎 T B and 𝜎 F C . Suppose M, f ⊩ 𝛤, in particular


M, f (𝜎) ⊮ B → C . Then M, f (𝜎) ⊩ B and M, f (𝜎) ⊮ C .
This means that M, f satisfies both 𝜎 T B and 𝜎 F C .
6. The branch is expanded by applying □T to 𝜎 T □B ∈ 𝛤:
This results in a new signed formula 𝜎.n T B on the branch,
for some 𝜎.n ∈ P (𝛤) (since 𝜎.n must be used). Suppose
M, f ⊩ 𝛤, in particular, M, f (𝜎) ⊩ □B. Since f is an in-
terpretation of prefixes and both 𝜎, 𝜎.n ∈ P (𝛤), we know
that R f (𝜎) f (𝜎.n). Hence, M, f (𝜎.n) ⊩ B, i.e., M, f satis-
fies 𝜎.n T B.
7. The branch is expanded by applying □F to 𝜎 F □B ∈ 𝛤:
This results in a new signed formula 𝜎.n F A, where 𝜎.n is
a new prefix on the branch, i.e., 𝜎.n ∉ P (𝛤). Since 𝛤 is
satisfiable, there is a M and interpretation f of P (𝛤) such
that M, f ⊨ 𝛤, in particular M, f (𝜎) ⊮ □B. We have to
show that 𝛤 ∪ {𝜎.n F B } is satisfiable. To do this, we define
an interpretation of P (𝛤) ∪ {𝜎.n} as follows:
Since M, f (𝜎) ⊮ □B, there is a w ∈ W such that R f (𝜎)w
and M,w ⊮ B. Let f ′ be like f , except that f ′ (𝜎.n) = w.
Since f ′ (𝜎) = f (𝜎) and R f (𝜎)w, we have R f ′ (𝜎) f ′ (𝜎.n),
so f ′ is an interpretation of P (𝛤) ∪ {𝜎.n}. Obviously
M, f ′ (𝜎.n) ⊮ B. Since f (𝜎 ′) = f ′ (𝜎 ′) for all prefixes
𝜎 ′ ∈ P (𝛤), M, f ′ ⊩ 𝛤. So, M, f ′ satisfies 𝛤 ∪ {𝜎.n F B }.
Now let’s consider the possible inferences with two premises.
1. The branch is expanded by applying ∧F to 𝜎 F B ∧ C ∈ 𝛤,
which results in two branches, a left one continuing through
𝜎 F B and a right one through 𝜎 F C . Suppose M, f ⊩ 𝛤,
in particular M, f (𝜎) ⊮ B ∧ C . Then M, f (𝜎) ⊮ B or
M, f (𝜎) ⊮ C . In the former case, M, f satisfies 𝜎 F B, i.e.,
the left branch is satisfiable. In the latter, M, f satisfies
𝜎 F C , i.e., the right branch is satisfiable.
2. The branch is expanded by applying ∨T to 𝜎 T B ∨ C ∈ 𝛤:
Exercise.

3. The branch is expanded by applying →T to 𝜎 T B →C ∈ 𝛤:


Exercise. □

Corollary 10.7. If 𝛤 ⊢ A then 𝛤 ⊨ A.

Proof. If 𝛤 ⊢ A then for some B 1 , . . . , Bn ∈ 𝛤, 𝛥 =


{1 F A, 1 T B 1 , . . . , 1 T Bn } has a closed tableau. We want to show
that 𝛤 ⊨ A. Suppose not, so for some M and w, M,w ⊩ Bi for
i = 1, . . . , n, but M,w ⊮ A. Let f (1) = w; then f is an interpre-
tation of P ( 𝛥) into M, and M satisfies 𝛥 with respect to f . But
by Theorem 10.6, 𝛥 is unsatisfiable since it has a closed tableau,
a contradiction. So we must have 𝛤 ⊢ A after all. □

Corollary 10.8. If ⊢ A then A is true in all models.

10.5 Rules for Other Accessibility


Relations
In order to deal with logics determined by special accessibility
relations, we consider the additional rules in Table 10.3.
Adding these rules results in systems that are sound and com-
plete for the logics given in Table 10.4.

Example 10.9. We give a closed tableau that shows S5 ⊢ □A → □◇A.

1. 1 F □A → □◇A Assumption
2. 1 T □A →F 1
3. 1 F □◇A →F 1
4. 1.1 F ◇A □F 3
5. 1 F ◇A 4r◇ 4
6. 1.1 F A ◇F 5
7. 1.1 T A □T 2


    𝜎 T □A         𝜎 F ◇A
    -------- T□    -------- T◇
    𝜎 T A          𝜎 F A

    𝜎 T □A         𝜎 F ◇A
    -------- D□    -------- D◇
    𝜎 T ◇A         𝜎 F □A

    𝜎.n T □A       𝜎.n F ◇A
    --------- B□   --------- B◇
    𝜎 T A          𝜎 F A

    𝜎 T □A         𝜎 F ◇A
    --------- 4□   --------- 4◇
    𝜎.n T □A       𝜎.n F ◇A

    𝜎.n is used    𝜎.n is used

    𝜎.n T □A       𝜎.n F ◇A
    --------- 4r□  --------- 4r◇
    𝜎 T □A         𝜎 F ◇A

Table 10.3: More modal rules.

10.6 Soundness for Additional Rules


We say a rule is sound for a class of models if, whenever a branch
in a tableau is satisfiable in a model from that class, the branch
resulting from applying the rule is also satisfiable in a model from
that class.

Proposition 10.10. T□ and T◇ are sound for reflexive models.

Proof. 1. The branch is expanded by applying T□ to 𝜎 T □B ∈


𝛤: This results in a new signed formula 𝜎 T B on the
branch. Suppose M, f ⊩ 𝛤, in particular, M, f (𝜎) ⊩ □B.

    Logic        R is . . .      Rules
    T = KT       reflexive       T□, T◇
    D = KD       serial          D□, D◇
    K4           transitive      4□, 4◇
    B = KTB      reflexive,      T□, T◇,
                 symmetric       B□, B◇
    S4 = KT4     reflexive,      T□, T◇,
                 transitive      4□, 4◇
    S5 = KT4B    reflexive,      T□, T◇,
                 transitive,     4□, 4◇,
                 euclidean       4r□, 4r◇

Table 10.4: Tableau rules for various modal logics.

Since R is reflexive, we know that R f (𝜎) f (𝜎). Hence,


M, f (𝜎) ⊩ B, i.e., M, f satisfies 𝜎 T B.

2. The branch is expanded by applying T◇ to 𝜎 F ◇B ∈ 𝛤:


Exercise. □

Proposition 10.11. D□ and D◇ are sound for serial models.

Proof. 1. The branch is expanded by applying D□ to 𝜎 T □B ∈


𝛤: This results in a new signed formula 𝜎 T ◇B on the
branch. Suppose M, f ⊩ 𝛤, in particular, M, f (𝜎) ⊩ □B.
Since R is serial, there is a w ∈ W such that R f (𝜎)w. Then
M,w ⊩ B, and hence M, f (𝜎) ⊩ ◇B. So, M, f satisfies
𝜎 T ◇B.

2. The branch is expanded by applying D◇ to 𝜎 F ◇B ∈ 𝛤:


Exercise. □

Proposition 10.12. B□ and B◇ are sound for symmetric models.

Proof. 1. The branch is expanded by applying B□ to


𝜎.n T □B ∈ 𝛤: This results in a new signed formula 𝜎 T B
on the branch. Suppose M, f ⊩ 𝛤, in particular,
M, f (𝜎.n) ⊩ □B. Since f is an interpretation of prefixes
on the branch into M, we know that R f (𝜎) f (𝜎.n). Since
R is symmetric, R f (𝜎.n) f (𝜎). Since M, f (𝜎.n) ⊩ □B,
M, f (𝜎) ⊩ B. Hence, M, f satisfies 𝜎 T B.
2. The branch is expanded by applying B◇ to 𝜎.n F ◇B ∈ 𝛤:
Exercise. □

Proposition 10.13. 4□ and 4◇ are sound for transitive models.

Proof. 1. The branch is expanded by applying 4□ to 𝜎 T □B ∈


𝛤: This results in a new signed formula 𝜎.n T □B on the
branch. Suppose M, f ⊩ 𝛤, in particular, M, f (𝜎) ⊩ □B.
Since f is an interpretation of prefixes on the branch into M
and 𝜎.n must be used, we know that R f (𝜎) f (𝜎.n). Now
let w be any world such that R f (𝜎.n)w. Since R is tran-
sitive, R f (𝜎)w. Since M, f (𝜎) ⊩ □B, M,w ⊩ B. Hence,
M, f (𝜎.n) ⊩ □B, and M, f satisfies 𝜎.n T □B.
2. The branch is expanded by applying 4◇ to 𝜎 F ◇B ∈ 𝛤:
Exercise. □

Proposition 10.14. 4r□ and 4r◇ are sound for euclidean models.

Proof. 1. The branch is expanded by applying 4r□ to


𝜎.n T □B ∈ 𝛤: This results in a new signed formula 𝜎 T □B
on the branch. Suppose M, f ⊩ 𝛤, in particular,
M, f (𝜎.n) ⊩ □B. Since f is an interpretation of prefixes on
the branch into M, we know that R f (𝜎) f (𝜎.n). Now let
w be any world such that R f (𝜎)w. Since R is euclidean,
R f (𝜎.n)w. Since M, f (𝜎.n) ⊩ □B, M,w ⊩ B. Hence,
M, f (𝜎) ⊩ □B, and M, f satisfies 𝜎 T □B.

2. The branch is expanded by applying 4r◇ to 𝜎.n F ◇B ∈ 𝛤:


Exercise. □

Corollary 10.15. The tableau systems given in Table 10.4 are sound
for the respective classes of models.

10.7 Simple Tableaux for S5


S5 is sound and complete with respect to the class of universal
models, i.e., models where every world is accessible from every
world. In universal models the accessibility relation doesn’t mat-
ter: “there is a world w where M,w ⊩ A” is true if and only if
there is such a w that’s accessible from u. So in S5, we can define
models as simply a set of worlds and a valuation V . This suggests
that we should be able to simplify the tableau rules as well. In the
general case, we take as prefixes sequences of positive integers, so
that we can keep track of which such prefixes name worlds which
are accessible from others: 𝜎.n names a world accessible from 𝜎.
But in S5 any world is accessible from any world, so there is no
need to so keep track. Instead, we can use positive integers as
prefixes. The simplified rules are given in Table 10.5.

Example 10.16. We give a simplified closed tableau that shows


S5 ⊢ 5, i.e., ◇A → □◇A.

1. 1 F ◇A → □◇A Assumption
2. 1 T ◇A →F 1
3. 1 F □◇A →F 1
4. 2 F ◇A □F 3
5. 3T A ◇T 2
6. 3F A ◇F 4


    n T □A        n F □A
    ------- □T    ------- □F
    m T A         m F A

    m is used     m is new

    n T ◇A        n F ◇A
    ------- ◇T    ------- ◇F
    m T A         m F A

    m is new      m is used

Table 10.5: Simplified rules for S5.

10.8 Completeness for K


To show that the method of tableaux is complete, we have to show
that whenever there is no closed tableau to show 𝛤 ⊢ A, then 𝛤 ⊭
A, i.e., there is a countermodel. But “there is no closed tableau”
means that every way we could try to construct one has to fail
to close. The trick is to see that if every such way fails to close,
then a specific, systematic and exhaustive way also fails to close.
And this systematic and exhaustive way would close if a closed
tableau exists. The single tableau will contain, among its open
branches, all the information required to define a countermodel.
The countermodel given by an open branch in this tableau will
contain all the prefixes used on that branch as the worlds,
and a propositional variable p is true at 𝜎 iff 𝜎 T p occurs on the
branch.

Definition 10.17. A branch in a tableau is called complete if,


whenever it contains a prefixed formula 𝜎 S A to which a rule

can be applied, it also contains

1. the prefixed formulas that are the corresponding conclu-


sions of the rule, in the case of propositional stacking rules;

2. one of the corresponding conclusion formulas in the case


of propositional branching rules;

3. at least one possible conclusion in the case of modal rules


that require a new prefix;

4. the corresponding conclusion for every prefix occurring on


the branch in the case of modal rules that require a used
prefix.

For instance, a complete branch contains 𝜎 T B and 𝜎 T C
whenever it contains 𝜎 T B ∧C . If it contains 𝜎 T B ∨C it contains at
least one of 𝜎 T B and 𝜎 T C . If it contains 𝜎 F □B it also contains
𝜎.n F B for at least one n. And whenever it contains 𝜎 T □B it also
contains 𝜎.n T B for every n such that 𝜎.n is used on the branch.

Proposition 10.18. Every finite 𝛤 has a tableau in which every


branch is complete.

Proof. Consider an open branch in a tableau for 𝛤. There are


finitely many prefixed formulas in the branch to which a rule
could be applied. In some fixed order (say, top to bottom), for
each of these prefixed formulas for which the conditions (1)–(4)
do not already hold, apply the rules that can be applied to it to
extend the branch. In some cases this will result in branching;
apply the rule at the tip of each resulting branch for all remain-
ing prefixed formulas. Since the number of prefixed formulas is
finite, and the number of used prefixes on the branch is finite,
this procedure eventually results in (possibly many) branches ex-
tending the original branch. Apply the procedure to each, and
repeat. Eventually, by construction, every branch is complete. □

Theorem 10.19 (Completeness). If 𝛤 has no closed tableau, 𝛤 is


satisfiable.

Proof. By the proposition, 𝛤 has a tableau in which every branch


is complete. Since it has no closed tableau, it thus has a tableau
in which at least one branch is open and complete. Let 𝛥 be
the set of prefixed formulas on the branch, and P ( 𝛥) the set of
prefixes occurring in it.
We define a model M( 𝛥) = ⟨P ( 𝛥),R,V ⟩ where the worlds are
the prefixes occurring in 𝛥, the accessibility relation is given by:

R𝜎𝜎 ′ iff 𝜎 ′ = 𝜎.n for some n

and
V (p) = {𝜎 : 𝜎 T p ∈ 𝛥}.
We show by induction on A that if 𝜎 T A ∈ 𝛥 then M( 𝛥), 𝜎 ⊩ A,
and if 𝜎 F A ∈ 𝛥 then M( 𝛥), 𝜎 ⊮ A.

1. A ≡ p: If 𝜎 T A ∈ 𝛥 then 𝜎 ∈ V (p) (by definition of V )


and so M( 𝛥), 𝜎 ⊩ A.
If 𝜎 F A ∈ 𝛥 then 𝜎 T A ∉ 𝛥, since the branch would other-
wise be closed. So 𝜎 ∉ V (p) and thus M( 𝛥), 𝜎 ⊮ A.

2. A ≡ ¬B: If 𝜎 T A ∈ 𝛥, then 𝜎 F B ∈ 𝛥 since the branch is


complete. By induction hypothesis, M( 𝛥), 𝜎 ⊮ B and thus
M( 𝛥), 𝜎 ⊩ A.
If 𝜎 F A ∈ 𝛥, then 𝜎 T B ∈ 𝛥 since the branch is complete.
By induction hypothesis, M( 𝛥), 𝜎 ⊩ B and thus M( 𝛥), 𝜎 ⊮
A.

3. A ≡ B ∧ C : Exercise.

4. A ≡ B ∨ C : If 𝜎 T A ∈ 𝛥, then either 𝜎 T B ∈ 𝛥 or 𝜎 T C ∈
𝛥 since the branch is complete. By induction hypothesis,
either M( 𝛥), 𝜎 ⊩ B or M( 𝛥), 𝜎 ⊩ C . Thus M( 𝛥), 𝜎 ⊩ A.

If 𝜎 F A ∈ 𝛥, then both 𝜎 F B ∈ 𝛥 and 𝜎 F C ∈ 𝛥 since


the branch is complete. By induction hypothesis, both
M( 𝛥), 𝜎 ⊮ B and M( 𝛥), 𝜎 ⊮ C . Thus M( 𝛥), 𝜎 ⊮ A.

5. A ≡ B → C : Exercise.

6. A ≡ □B: If 𝜎 T A ∈ 𝛥, then, since the branch is complete,


𝜎.n T B ∈ 𝛥 for every 𝜎.n used on the branch, i.e., for
every 𝜎 ′ ∈ P ( 𝛥) such that R𝜎𝜎 ′. By induction hypothesis,
M( 𝛥), 𝜎 ′ ⊩ B for every 𝜎 ′ such that R𝜎𝜎 ′. Therefore,
M( 𝛥), 𝜎 ⊩ A.
If 𝜎 F A ∈ 𝛥, then for some 𝜎.n, 𝜎.n F B ∈ 𝛥 since the
branch is complete. By induction hypothesis, M( 𝛥), 𝜎.n ⊮
B. Since R𝜎(𝜎.n), there is a 𝜎 ′ such that M( 𝛥), 𝜎 ′ ⊮ B.
Thus M( 𝛥), 𝜎 ⊮ A.

7. A ≡ ◇B: Exercise.

Since 𝛤 ⊆ 𝛥, M( 𝛥) ⊩ 𝛤. □

Corollary 10.20. If 𝛤 ⊨ A then 𝛤 ⊢ A.

Corollary 10.21. If A is true in all models, then ⊢ A.

10.9 Countermodels from Tableaux


The proof of the completeness theorem doesn’t just show that if
⊨ A then ⊢ A, it also gives us a method for constructing coun-
termodels to A if ⊭ A. In the case of K, this method constitutes
a decision procedure. For suppose ⊭ A. Then the proof of Propo-
sition 10.18 gives a method for constructing a complete tableau.
The method in fact always terminates. The propositional rules
for K only add prefixed formulas of lower complexity, i.e., each
propositional rule need only be applied once on a branch for any
signed formula 𝜎 S A. New prefixes are only generated by the □F

and ◇T rules, and also only have to be applied once (and produce
a single new prefix). □T and ◇F have to be applied potentially
multiple times, but only once per prefix, and only finitely many
new prefixes are generated. So the construction either results in
a closed branch or a complete branch after finitely many stages.
Once a tableau with an open complete branch is constructed,
the proof of Theorem 10.19 gives us an explicit model that satisfies
the original set of prefixed formulas. So not only is it the case
that if 𝛤 ⊨ A, then a closed tableau exists and 𝛤 ⊢ A; if we look for
the closed tableau systematically and instead end up with a complete
open tableau, we not only know that 𝛤 ⊭ A but can actually
construct a countermodel.

Example 10.22. We know that ⊬ □(p ∨ q ) → (□p ∨ □q ). The


construction of a tableau begins with:

1. 1 F □(p ∨ q ) → (□p ∨ □q ) ✓ Assumption


2. 1 T □(p ∨ q ) →F 1
3. 1 F □p ∨ □q ✓ →F 1
4. 1 F □p ✓ ∨F 3
5. 1 F □q ✓ ∨F 3
6. 1.1 F p ✓ □F 4
7. 1.2 F q ✓ □F 5

The tableau is of course not finished yet. In the next step, we


consider the only line without a checkmark: the prefixed formula
1 T □(p ∨q ) on line 2. The construction of the closed tableau says
to apply the □T rule for every prefix used on the branch, i.e., for
both 1.1 and 1.2:

1. 1 F □(p ∨ q ) → (□p ∨ □q ) ✓ Assumption


2. 1 T □(p ∨ q ) →F 1
3. 1 F □p ∨ □q ✓ →F 1
4. 1 F □p ✓ ∨F 3
5. 1 F □q ✓ ∨F 3
6. 1.1 F p ✓ □F 4
7. 1.2 F q ✓ □F 5
8. 1.1 T p ∨ q □T 2
9. 1.2 T p ∨ q □T 2

Now lines 2, 8, and 9 don't have checkmarks. But no new prefix


has been added, so we apply ∨T to lines 8 and 9, on all resulting
branches (as long as they don’t close):

1. 1 F □(p ∨ q ) → (□p ∨ □q ) ✓ Assumption


2. 1 T □(p ∨ q ) →F 1
3. 1 F □p ∨ □q ✓ →F 1
4. 1 F □p ✓ ∨F 3
5. 1 F □q ✓ ∨F 3
6. 1.1 F p ✓ □F 4
7. 1.2 F q ✓ □F 5
8. 1.1 T p ∨ q ✓ □T 2
9. 1.2 T p ∨ q ✓ □T 2

10. 1.1 T p ✓ 1.1 T q ✓ ∨T 8

11. ⊗ 1.2 T p ✓ 1.2 T q ✓ ∨T 9


There is one remaining open branch, and it is complete. From


it we define the model with worlds W = {1, 1.1, 1.2} (the only
prefixes appearing on the open branch), the accessibility relation
R = {⟨1, 1.1⟩, ⟨1, 1.2⟩}, and the assignment V (p) = {1.2} (because
line 11 contains 1.2 T p) and V (q ) = {1.1} (because line 10 contains
1.1 T q ). The model is pictured in Figure 10.1, and you can verify
that it is a countermodel to □(p ∨ q ) → (□p ∨ □q ).

Figure 10.1: A countermodel to □(p ∨ q ) → (□p ∨ □q ). (World 1, where
p and q are both false, has access to world 1.1, where q is true and p
is false, and to world 1.2, where p is true and q is false.)
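If you want to double-check a countermodel like this without evaluating
every formula by hand, a short program suffices. The following Python
sketch is only an informal aid: the tuple encoding of formulas and the
function name holds are illustrative choices, not notation used elsewhere
in this book. It encodes the model read off the open branch and confirms
that the formula fails at world 1.

# Countermodel read off the open branch of Example 10.22.
W = ["1", "1.1", "1.2"]                      # prefixes on the open branch
R = {("1", "1.1"), ("1", "1.2")}             # accessibility from the prefixes
V = {"p": {"1.2"}, "q": {"1.1"}}             # from 1.2 T p and 1.1 T q

def holds(fml, w):
    """Evaluate a formula (a nested tuple) at world w in the model (W, R, V)."""
    op = fml[0]
    if op == "var":
        return w in V[fml[1]]
    if op == "not":
        return not holds(fml[1], w)
    if op == "and":
        return holds(fml[1], w) and holds(fml[2], w)
    if op == "or":
        return holds(fml[1], w) or holds(fml[2], w)
    if op == "->":
        return (not holds(fml[1], w)) or holds(fml[2], w)
    if op == "box":
        return all(holds(fml[1], v) for (u, v) in R if u == w)
    if op == "dia":
        return any(holds(fml[1], v) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")

p, q = ("var", "p"), ("var", "q")
A = ("->", ("box", ("or", p, q)), ("or", ("box", p), ("box", q)))
print(holds(A, "1"))   # False: □(p ∨ q) → (□p ∨ □q) fails at world 1

The same brute-force evaluation works for any finite model, so it can
also be used to check answers to the problems below.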

Problems
Problem 10.1. Find closed tableaux in K for the following for-
mulas:

1. □¬p → □(p → q )

2. (□p ∨ □q ) → □(p ∨ q )

3. ◇p → ◇(p ∨ q )

4. □(p ∧ q ) → □p

Problem 10.2. Complete the proof of Theorem 10.6.

Problem 10.3. Give closed tableaux that show the following:

1. KT5 ⊢ B;

2. KT5 ⊢ 4;

3. KDB4 ⊢ T;

4. KB4 ⊢ 5;

5. KB5 ⊢ 4;

6. KT ⊢ D.

Problem 10.4. Complete the proof of Proposition 10.10

Problem 10.5. Complete the proof of Proposition 10.11

Problem 10.6. Complete the proof of Proposition 10.12

Problem 10.7. Complete the proof of Proposition 10.13

Problem 10.8. Complete the proof of Proposition 10.14

Problem 10.9. Complete the proof of Theorem 10.19.


PART IV

Is this really necessary?
CHAPTER 11

Frame Definability
11.1 Introduction
One question that interests modal logicians is the relationship be-
tween the accessibility relation and the truth of certain formulas
in models with that accessibility relation. For instance, suppose
the accessibility relation is reflexive, i.e., for every w ∈ W , Rww.
In other words, every world is accessible from itself. That means
that when □A is true at a world w, w itself is among the accessible
worlds at which A must therefore be true. So, if the accessibility
relation R of M is reflexive, then whatever world w and formula
A we take, □A → A will be true there (in other words, the schema
□p → p and all its substitution instances are true in M).
The converse, however, is false. It’s not the case, e.g., that if
□p → p is true in M, then R is reflexive. For we can easily find
a non-reflexive model M where □p → p is true at all worlds: take
the model with a single world w, not accessible from itself, but
with w ∈ V (p). By picking the truth value of p suitably, we can
make □A → A true in a model that is not reflexive.
The solution is to remove the variable assignment V from the
equation. If we require that □p → p is true at all worlds in M,
regardless of which worlds are in V (p), then it is necessary that


R is reflexive. For in any non-reflexive model, there will be at


least one world w such that not Rww. If we set V (p) = W \ {w },
then p will be true at all worlds other than w, and so at all worlds
accessible from w (since w is guaranteed not to be accessible
from w, and w is the only world where p is false). On the other
hand, p is false at w, so □p → p is false at w.
This suggests that we should introduce a notation for model
structures without a valuation: we call these frames. A frame
F is simply a pair ⟨W,R⟩ consisting of a set of worlds with an
accessibility relation. Every model ⟨W,R,V ⟩ is then, as we say,
based on the frame ⟨W,R⟩. Conversely, a frame determines the
class of models based on it; and a class of frames determines the
class of models which are based on any frame in the class. And
we can define F ⊨ A, the notion of a formula being valid in a
frame as: M ⊩ A for all M based on F.
With this notation, we can establish correspondence relations
between formulas and classes of frames: e.g., F ⊨ □p → p if, and
only if, F is reflexive.

11.2 Properties of Accessibility Relations


Many modal formulas turn out to be characteristic of simple, and
even familiar, properties of the accessibility relation. In one direc-
tion, that means that any model that has a given property makes
a corresponding formula (and all its substitution instances) true.
We begin with five classical examples of kinds of accessibility
relations and the formulas the truth of which they guarantee.

Theorem 11.1. Let M = ⟨W,R,V ⟩ be a model. If R has the property


on the left side of Table 11.1, every instance of the formula on the right
side is true in M.

Proof. Here is the case for B: to show that the schema is true in
a model we need to show that all of its instances are true at all
worlds in the model. So let A → □◇A be a given instance of B,
If R is . . . then . . . is true in M:
serial (∀u∃v Ruv ): □p → ◇p (D)
reflexive (∀w Rww): □p → p (T)
symmetric (∀u∀v (Ruv → Rvu)): p → □◇p (B)
transitive (∀u∀v∀w ((Ruv ∧ Rvw) → Ruw)): □p → □□p (4)
euclidean (∀w∀u∀v ((Rwu ∧ Rwv ) → Ruv )): ◇p → □◇p (5)
Table 11.1: Five correspondence facts.

Figure 11.1: The argument from symmetry. (Worlds w and w ′ with
Rww ′ and, by symmetry, Rw ′w; A is true at w, so ◇A is true at w ′,
and hence □◇A is true at w.)

and let w ∈ W be an arbitrary world. Suppose the antecedent A


is true at w, in order to show that □◇A is true at w. So we need to
show that ◇A is true at all w ′ accessible from w. Now, for any w ′
such that Rww ′ we have, using the hypothesis of symmetry, that
also Rw ′w (see Figure 11.1). Since M,w ⊩ A, we have M,w ′ ⊩
◇A. Since w ′ was an arbitrary world such that Rww ′, we have
M,w ⊩ □◇A.
We leave the other cases as exercises. □
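The properties in Table 11.1 are easy to test directly when W is finite.
Here is a small Python sketch for experimenting with the examples in this
chapter; the function names are ad hoc, and a relation is represented
simply as a set of pairs.

from itertools import product

def serial(W, R):
    return all(any((u, v) in R for v in W) for u in W)

def reflexive(W, R):
    return all((w, w) in R for w in W)

def symmetric(W, R):
    return all((v, u) in R for (u, v) in R)

def transitive(W, R):
    return all((u, w) in R for (u, v1) in R for (v2, w) in R if v1 == v2)

def euclidean(W, R):
    return all((u, v) in R for w, u, v in product(W, repeat=3)
               if (w, u) in R and (w, v) in R)

# The relation of Proposition 11.2: u and v see each other but not themselves.
W = {"u", "v"}
R = {("u", "v"), ("v", "u")}
print(reflexive(W, R), symmetric(W, R), euclidean(W, R))  # False True False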

Notice that the converse implications of Theorem 11.1 do not


hold: it’s not true that if a model verifies a schema, then the ac-
cessibility relation of that model has the corresponding property.
In the case of T and reflexive models, it is easy to give an exam-
ple of a model in which T itself fails: let W = {w }, R = ∅, and
V (p) = ∅. Then R is not reflexive, M,w ⊩ □p, and M,w ⊮ p, so T
fails at w. But here we have just a single instance of T that fails
in M; other instances, e.g., □¬p → ¬p, are true. It is harder to
give examples where every


substitution instance of T is true in M and M is not reflexive. But
there are such models, too:

Proposition 11.2. Let M = ⟨W,R,V ⟩ be a model such that W =


{u,v }, where worlds u and v are related by R: i.e., both Ruv and Rvu.
Suppose that for all p: u ∈ V (p) ⇔ v ∈ V (p). Then:

1. For all A: M,u ⊩ A if and only if M,v ⊩ A (use induction on


A).

2. Every instance of T is true in M.

Since M is not reflexive (it is, in fact, irreflexive), the converse of The-
orem 11.1 fails in the case of T (similar arguments can be given for
some, though not all, of the other schemas mentioned in Theorem 11.1).

Although we will focus on the five classical formulas D, T,


B, 4, and 5, we record in Table 11.2 a few more properties of
accessibility relations. The accessibility relation R is partially
functional, if from every world at most one world is accessible. If
it is the case that from every world exactly one world is accessible,
we call it functional. (Thus the functional relations are precisely
those that are both serial and partially functional). They are
called “functional” because the accessibility relation operates like
a (partial) function. A relation is weakly dense if whenever Ruv ,
there is a w “between” u and v . So weakly dense relations are in a
sense the opposite of transitive relations: in a transitive relation,
whenever you can reach v from u by a detour via w, you can
reach v from u directly; in a weakly dense relation, whenever you
can reach v from u directly, you can also reach it by a detour
via some w. A relation is weakly directed if whenever you can
reach worlds u and v from some world w, you can reach a single
world t from both u and v —this is sometimes called the “diamond
property” or “confluence.”
If R is . . . then . . . is true in M:
partially functional (∀w∀u∀v ((Rwu ∧ Rwv ) → u = v )): ◇p → □p
functional (∀w∃v∀u (Rwu ↔ u = v )): ◇p ↔ □p
weakly dense (∀u∀v (Ruv → ∃w (Ruw ∧ Rwv ))): □□p → □p
weakly connected (∀w∀u∀v ((Rwu ∧ Rwv ) → (Ruv ∨ u = v ∨ Rvu))): □((p ∧ □p) → q ) ∨ □((q ∧ □q ) → p) (L)
weakly directed (∀w∀u∀v ((Rwu ∧ Rwv ) → ∃t (Rut ∧ Rvt ))): ◇□p → □◇p (G)
Table 11.2: Five more correspondence facts.

11.3 Frames

Definition 11.3. A frame is a pair F = ⟨W,R⟩ where W is a non-


empty set of worlds and R a binary relation on W . A model M
is based on a frame F = ⟨W,R⟩ if and only if M = ⟨W,R,V ⟩ for
some valuation V .

Definition 11.4. If F is a frame, we say that A is valid in F, F ⊨ A,


if M ⊩ A for every model M based on F.
If F is a class of frames, we say A is valid in F, F ⊨ A, iff
F ⊨ A for every frame F ∈ F.

The reason frames are interesting is that correspondence be-


tween schemas and properties of the accessibility relation R is
at the level of frames, not of models. For instance, although T is
true in all reflexive models, not every model in which T is true
is reflexive. However, it is true that not only is T valid on all
reflexive frames, also every frame in which T is valid is reflexive.

Remark 1. Validity in a class of frames is a special case of the


notion of validity in a class of models: F ⊨ A iff C ⊨ A where C
is the class of all models based on a frame in F.
Obviously, if a formula or a schema is valid, i.e., valid with
respect to the class of all models, it is also valid with respect to
any class F of frames.

11.4 Frame Definability


Even though the converse implications of Theorem 11.1 fail, they
hold if we replace “model” by “frame”: for the properties con-
sidered in Theorem 11.1, it is true that if a formula is valid in a
frame then the accessibility relation of that frame has the corre-
sponding property. So, the formulas considered define the classes
of frames that have the corresponding property.

Definition 11.5. If F is a class of frames, we say A defines F iff


F ⊨ A for all and only frames F ∈ F.

We now proceed to establish the full definability results for


frames.

Theorem 11.6. If the formula on the right side of Table 11.1 is valid
in a frame F, then F has the property on the left side.

Proof. 1. Suppose D is valid in F = ⟨W,R⟩, i.e., F ⊨ □p → ◇p.


Let M = ⟨W,R,V ⟩ be a model based on F, and w ∈ W . We
have to show that there is a v such that Rwv . Suppose not:
then both M,w ⊩ □A and M,w ⊮ ◇A for any A, including p.
But then M,w ⊮ □p → ◇p, contradicting the assumption
that F ⊨ □p → ◇p.

2. Suppose T is valid in F, i.e., F ⊨ □p → p. Let w ∈ W be


an arbitrary world; we need to show Rww. Let u ∈ V (p) if
and only if Rwu (when q is other than p, V (q ) is arbitrary,
say V (q ) = ∅). Let M = ⟨W,R,V ⟩. By construction, for all

u such that Rwu: M,u ⊩ p, and hence M,w ⊩ □p. But by


hypothesis □p → p is true at w, so that M,w ⊩ p, but by
definition of V this is possible only if Rww.

3. We prove the contrapositive: Suppose F is not symmetric,


we show that B, i.e., p →□◇p is not valid in F = ⟨W,R⟩. If F
is not symmetric, there are u, v ∈ W such that Ruv but not
Rvu. Define V such that w ∈ V (p) if and only if not Rvw
(and V is arbitrary otherwise). Let M = ⟨W,R,V ⟩. Now,
by definition of V , M,w ⊩ p for all w such that not Rvw,
in particular, M,u ⊩ p since not Rvu. Also, since Rvw iff
w ∉ V (p), there is no w such that Rvw and M,w ⊩ p, and
hence M,v ⊮ ◇p. Since Ruv , also M,u ⊮ □◇p. It follows
that M,u ⊮ p → □◇p, and so B is not valid in F.

4. Suppose 4 is valid in F = ⟨W,R⟩, i.e., F ⊨ □p →□□p, and let


u, v , w ∈ W be arbitrary worlds such that Ruv and Rvw;
we need to show that Ruw. Define V such that z ∈ V (p)
if and only if Ruz (and V is arbitrary otherwise). Let M =
⟨W,R,V ⟩. By definition of V , M, z ⊩ p for all z such that
Ruz , and hence M,u ⊩ □p. But by hypothesis 4, □p → □□p,
is true at u, so that M,u ⊩ □□p. Since Ruv , it follows that M,v ⊩ □p,
and since Rvw, that M,w ⊩ p. But by definition of V this is possible
only if Ruw, as desired.

5. We proceed contrapositively, assuming that the frame F =


⟨W,R⟩ is not euclidean, and show that it falsifies 5, i.e.,
F ⊭ ◇p → □◇p. Suppose there are worlds u, v , w ∈ W such
that Rwu and Rwv but not Ruv . Define V such that for all
worlds z , z ∈ V (p) if and only if it is not the case that Ruz .
Let M = ⟨W,R,V ⟩. Then by hypothesis M,v ⊩ p and since
Rwv also M,w ⊩ ◇p. However, there is no world y such
that Ruy and M, y ⊩ p so M,u ⊮ ◇p. Since Rwu, it follows
that M,w ⊮ □◇p, so that 5, ◇p → □◇p, fails at w. □

You’ll notice a difference between the proof for D and the


other cases: no mention was made of the valuation V . In effect,

we proved that if M ⊩ D then M is serial. So D defines the class


of serial models, not just frames.

Corollary 11.7. Any model where D is true is serial.

Corollary 11.8. Each formula on the right side of Table 11.1 defines
the class of frames which have the property on the left side.

Proof. In Theorem 11.1, we proved that if a model has the prop-


erty on the left, the formula on the right is true in it. Thus, if
a frame F has the property on the left, the formula on the right
is valid in F. In Theorem 11.6, we proved the converse implica-
tions: if a formula on the right is valid in F, F has the property
on the left. □

Theorem 11.6 also shows that the properties can be com-


bined: for instance if both B and 4 are valid in F then the frame
is both symmetric and transitive, etc. Many important modal
logics are characterized as the set of formulas valid in all frames
that combine some frame properties, and so we can character-
ize them as the set of formulas valid in all frames in which the
corresponding defining formulas are valid. For instance, the clas-
sical system S4 is the set of all formulas valid in all reflexive and
transitive frames, i.e., in all those where both T and 4 are valid.
S5 is the set of all formulas valid in all reflexive, symmetric, and
euclidean frames, i.e., all those where all of T, B, and 5 are valid.
Logical relationships between properties of R in general cor-
respond to relationships between the corresponding defining for-
mulas. For instance, every reflexive relation is serial; hence,
whenever T is valid in a frame, so is D. (Note that this rela-
tionship is not that of entailment. It is not the case that whenever
M,w ⊩ T then M,w ⊩ D.) We record some such relationships.

Proposition 11.9. Let R be a binary relation on a set W ; then:

1. If R is reflexive, then it is serial.



2. If R is symmetric, then it is transitive if and only if it is euclidean.

3. If R is symmetric or euclidean then it is weakly directed (it has


the “diamond property”).

4. If R is euclidean then it is weakly connected.

5. If R is functional then it is serial.

11.5 First-order Definability


We’ve seen that a number of properties of accessibility relations
of frames can be defined by modal formulas. For instance, sym-
metry of frames can be defined by the formula B, p → □◇p.
The conditions we’ve encountered so far can all be expressed
by first-order formulas in a language involving a single two-
place predicate symbol. For instance, symmetry is defined by
∀x ∀y (Q (x, y)→Q (y,x)) in the sense that a first-order structure M
with |M| = W and Q M = R satisfies the preceding formula iff R
is symmetric. This suggests the following definition:

Definition 11.10. A class F of frames is first-order definable if


there is a sentence A in the first-order language with a single two-
place predicate symbol Q such that F = ⟨W,R⟩ ∈ F iff M ⊨ A in
the first-order structure M with |M| = W and Q M = R.

It turns out that the properties and modal formulas that define
them considered so far are exceptional. Not every formula defines
a first-order definable class of frames, and not every first-order
definable class of frames is definable by a modal formula.
A counterexample to the first is given by the Löb formula:

□(□p → p) → □p. (W)

W defines the class of transitive and converse well-founded


frames. A relation is well-founded if there is no infinite sequence

w 1 , w 2 , . . . such that Rw 2w 1 , Rw 3w 2 , . . . . For instance, the rela-


tion < on N is well-founded, whereas the relation < on Z is not. A
relation is converse well-founded iff its converse is well-founded.
So converse well-founded relations are those where there is no
infinite sequence w 1 , w 2 , . . . such that Rw 1w 2 , Rw 2w 3 , . . . .
There is, however, no first-order formula defining transitive
converse well-founded relations. For suppose M ⊨ F iff R = Q M
is transitive converse well-founded. Let An be the formula
(Q (a1 ,a2 ) ∧ · · · ∧ Q (an−1 ,an ))
Now consider the set of formulas
𝛤 = {F,A1 ,A2 , . . . }.
Every finite subset of 𝛤 is satisfiable: Let k be largest such that Ak
is in the subset, |Mk | = {1, . . . ,k }, a_i^{Mk} = i , and Q^{Mk} = <. Since
< on {1, . . . ,k } is transitive and converse well-founded, Mk ⊨ F .
Mk ⊨ Ai by construction, for all i ≤ k . By the Compactness
Theorem for first-order logic, 𝛤 is satisfiable in some structure M.
By hypothesis, since M ⊨ F , the relation Q M is converse well-
founded. But clearly, a_1^M , a_2^M , . . . would form an infinite sequence
of the kind ruled out by converse well-foundedness.
A counterexample to the second claim is given by the prop-
erty of universality: for every u and v , Ruv . Universal frames are
first-order definable by the formula ∀x ∀y Q (x, y). However, no
modal formula is valid in all and only the universal frames. This
is a consequence of a result that is independently interesting: the
formulas valid in universal frames are exactly the same as those
valid in reflexive, symmetric, and transitive frames. There are re-
flexive, symmetric, and transitive frames that are not universal,
hence every formula valid in all universal frames is also valid in
some non-universal frames.

11.6 Equivalence Relations and S5


The modal logic S5 is characterized as the set of formulas valid
on all universal frames, i.e., every world is accessible from every

world, including itself. In such a scenario, □ corresponds to ne-


cessity and ◇ to possibility: □A is true if A is true at every world,
and ◇A is true if A is true at some world. It turns out that S5
can also be characterized as the formulas valid on all reflexive,
symmetric, and transitive frames, i.e., on all equivalence relations.

Definition 11.11. A binary relation R on W is an equivalence


relation if and only if it is reflexive, symmetric and transitive. A
relation R on W is universal if and only if Ruv for all u,v ∈ W .

Since T, B, and 4 characterize the reflexive, symmetric, and


transitive frames, the frames where the accessibility relation is
an equivalence relation are exactly those in which all three for-
mulas are valid. It turns out that the equivalence relations can
also be characterized by other combinations of formulas, since
the conditions with which we’ve defined equivalence relations are
equivalent to combinations of other familiar conditions on R.

Proposition 11.12. The following are equivalent:

1. R is an equivalence relation;

2. R is reflexive and euclidean;

3. R is serial, symmetric, and euclidean;

4. R is serial, symmetric, and transitive.

Proof. Exercise. □
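Since the proof is left as an exercise, it may be reassuring to check the
equivalences mechanically on small examples first. The following Python
sketch is a brute-force test over all 512 relations on a three-element
set; it provides evidence for Proposition 11.12, not a proof of it.

from itertools import chain, combinations, product

W = [0, 1, 2]
pairs = list(product(W, repeat=2))

def subsets(xs):
    return chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))

def serial(R): return all(any((u, v) in R for v in W) for u in W)
def reflexive(R): return all((w, w) in R for w in W)
def symmetric(R): return all((v, u) in R for (u, v) in R)
def transitive(R):
    return all((u, w) in R for (u, x) in R for (y, w) in R if x == y)
def euclidean(R):
    return all((u, v) in R for w, u, v in product(W, repeat=3)
               if (w, u) in R and (w, v) in R)

for R in (set(S) for S in subsets(pairs)):
    conditions = [
        reflexive(R) and symmetric(R) and transitive(R),  # equivalence relation
        reflexive(R) and euclidean(R),
        serial(R) and symmetric(R) and euclidean(R),
        serial(R) and symmetric(R) and transitive(R),
    ]
    # either all four conditions hold of R, or none do
    assert all(conditions) or not any(conditions)
print("conditions (1)-(4) agree for every relation on a 3-element set")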

Proposition 11.12 is the semantic counterpart to Proposi-


tion 12.12, in that it gives an equivalent characterization of the
modal logic of frames over which R is an equivalence relation
(the logic traditionally referred to as S5).
What is the relationship between universal and equivalence
relations? Although every universal relation is an equivalence

relation, clearly not every equivalence relation is universal. How-


ever, the formulas valid on all universal relations are exactly the
same as those valid on all equivalence relations.

Proposition 11.13. Let R be an equivalence relation, and for each


w ∈ W define the equivalence class of w as the set [w] = {w ′ ∈ W :
Rww ′ }. Then:

1. w ∈ [w];

2. R is universal on each equivalence class [w];

3. The collection of equivalence classes partitions W into mutually


exclusive and jointly exhaustive subsets.
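For finite W the equivalence classes, and the partition they form, can be
computed directly. Here is a brief Python sketch checking the claims of
Proposition 11.13; the particular W and R below are made-up examples.

def eq_class(x, W, R):
    return frozenset(v for v in W if (x, v) in R)

W = {"u", "v", "w", "z"}
# an equivalence relation with the two blocks {u, v} and {w, z}:
R = {(a, b) for a in W for b in W
     if {a, b} <= {"u", "v"} or {a, b} <= {"w", "z"}}

classes = {eq_class(x, W, R) for x in W}
print(classes)                                    # the two classes {u, v} and {w, z}
assert all(x in eq_class(x, W, R) for x in W)     # claim 1: w ∈ [w]
assert all((a, b) in R                            # claim 2: R universal on each class
           for c in classes for a in c for b in c)
assert frozenset().union(*classes) == W           # claim 3: jointly exhaustive
assert all(c == d or not (c & d)                  # claim 3: mutually exclusive
           for c in classes for d in classes)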

Proposition 11.14. A formula A is valid in all frames F = ⟨W,R⟩


where R is an equivalence relation, if and only if it is valid in all
frames F = ⟨W,R⟩ where R is universal. Hence, the logic of universal
frames is just S5.

Proof. It’s immediate to verify that a universal relation R on W


is an equivalence. Hence, if A is valid in all frames where R is
an equivalence it is valid in all universal frames. For the other
direction, we argue contrapositively: suppose B is a formula that
fails at a world w in a model M = ⟨W,R,V ⟩ based on a frame
⟨W,R⟩, where R is an equivalence on W . So M,w ⊮ B. Define a
model M ′ = ⟨W ′,R ′,V ′⟩ as follows:

1. W ′ = [w];

2. R ′ is universal on W ′;

3. V ′ (p) = V (p) ∩ W ′.

(So the set W ′ of worlds in M ′ is represented by the shaded


area in Figure 11.2.) It is easy to see that R and R ′ agree on
W ′. Then one can show by induction on formulas that for all
w ′ ∈ W ′: M ′,w ′ ⊩ A if and only if M,w ′ ⊩ A for each A (this
Figure 11.2: A partition of W in equivalence classes. (The figure
shows W divided into the classes [w], [u], [v ], and [z ], with the
class [w] shaded.)

makes sense since W ′ ⊆ W ). In particular, M ′,w ⊮ B, and B


fails in a model based on a universal frame. □

11.7 Second-order Definability


Not every frame property definable by modal formulas is first-
order definable. However, if we allow quantification over one-
place predicates (i.e., monadic second-order quantification), we can
define all modally definable frame properties. The trick is to
exploit a systematic way in which the conditions under which a
modal formula is true at a world are related to first-order formu-
las. This is the so-called standard translation of modal formulas
into first-order formulas in a language containing not just a two-
place predicate symbol Q for the accessibility relation, but also a
one-place predicate symbol Pi for the propositional variables pi
occurring in A.

Definition 11.15. The standard translation STx (A) is inductively


defined as follows:

1. A ≡ ⊥: STx (A) = ⊥.

2. A ≡ pi : STx (A) = Pi (x).



3. A ≡ ¬B: STx (A) = ¬STx (B).

4. A ≡ (B ∧ C ): STx (A) = (STx (B) ∧ STx (C )).

5. A ≡ (B ∨ C ): STx (A) = (STx (B) ∨ STx (C )).

6. A ≡ (B → C ): STx (A) = (STx (B) → STx (C )).

7. A ≡ □B: STx (A) = ∀y (Q (x, y) → STy (B)).

8. A ≡ ◇B: STx (A) = ∃y (Q (x, y) ∧ STy (B)).
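The standard translation is a straightforward recursion, and it can be
written out as a short program. The following Python sketch produces
STx (A) as a string; the tuple encoding of modal formulas and the scheme
for generating fresh first-order variables are just one possible choice.

import itertools

_fresh = itertools.count()

def ST(x, fml):
    op = fml[0]
    if op == "bot":
        return "⊥"
    if op == "var":                                  # p_i becomes P_i(x)
        return f"P{fml[1]}({x})"
    if op == "not":
        return f"¬{ST(x, fml[1])}"
    if op in ("and", "or", "->"):
        sym = {"and": "∧", "or": "∨", "->": "→"}[op]
        return f"({ST(x, fml[1])} {sym} {ST(x, fml[2])})"
    if op == "box":                                  # ∀y(Q(x,y) → ST_y(B))
        y = f"y{next(_fresh)}"
        return f"∀{y} (Q({x},{y}) → {ST(y, fml[1])})"
    if op == "dia":                                  # ∃y(Q(x,y) ∧ ST_y(B))
        y = f"y{next(_fresh)}"
        return f"∃{y} (Q({x},{y}) ∧ {ST(y, fml[1])})"
    raise ValueError(f"unknown operator: {op}")

print(ST("x", ("->", ("box", ("var", 0)), ("var", 0))))
# prints (∀y0 (Q(x,y0) → P0(y0)) → P0(x)), i.e., ST_x(□p0 → p0)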

For instance, STx (□p →p) is ∀y (Q (x, y) →P (y)) →P (x). Any


structure for the language of STx (A) requires a domain, a two-
place relation assigned to Q , and subsets of the domain assigned
to the one-place predicate symbols Pi . In other words, the com-
ponents of such a structure are exactly those of a model for A:
the domain is the set of worlds, the two-place relation assigned
to Q is the accessibility relation, and the subsets assigned to Pi
are just the assignments V (pi ). It won’t surprise that satisfac-
tion of A in a modal model and of STx (A) in the corresponding
structure agree:

Proposition 11.16. Let M = ⟨W,R,V ⟩, let M ′ be the first-order struc-
ture with |M ′ | = W , Q^{M ′} = R, and P_i^{M ′} = V (pi ), and let s (x) = w.
Then

M,w ⊩ A iff M ′,s ⊨ STx (A)

Proof. By induction on A. □

Proposition 11.17. Suppose A is a modal formula and F = ⟨W,R⟩ is



a frame. Let F ′ be the first-order structure with |F ′ | = W and Q^{F ′} = R,
and let A ′ be the second-order formula

∀X 1 . . . ∀Xn ∀x STx (A) [X1 /P 1 , . . . ,Xn /Pn ],



where P 1 , . . . , Pn are all one-place predicate symbols in STx (A). Then

F ⊨ A iff F ′ ⊨ A ′

Proof. F ′ ⊨ A ′ iff for every structure M ′ where P_i^{M ′} ⊆ W for
i = 1, . . . , n, and for every s with s (x) ∈ W , M ′,s ⊨ STx (A). By
Proposition 11.16, that is the case iff for all models M based on F
and every world w ∈ W , M,w ⊩ A, i.e., F ⊨ A. □

Definition 11.18. A class F of frames is second-order definable if


there is a sentence A in the second-order language with a single
two-place predicate symbol P and quantifiers only over monadic
set variables such that F = ⟨W,R⟩ ∈ F iff M ⊨ A in the struc-
ture M with |M| = W and P M = R.

Corollary 11.19. If a class of frames is definable by a formula A, the


corresponding class of accessibility relations is definable by a monadic
second-order sentence.

Proof. The monadic second-order sentence A ′ of the preceding


proof has the required property. □

As an example, consider again the formula □p → p. It de-


fines reflexivity. Reflexivity is of course first-order definable by
the sentence ∀x Q (x,x). But it is also definable by the monadic
second-order sentence

∀X ∀x (∀y (Q (x, y) → X (y)) → X (x)).

This means, of course, that the two sentences are equivalent.


Here’s how you might convince yourself of this directly: First
suppose the second-order sentence is true in a structure M. Since
x and X are universally quantified, the remainder must hold for
any x ∈ W and set X ⊆ W , e.g., the set {z : Rxz } where R = Q M .
So, for any s with s (x) ∈ W and s (X ) = {z : Rxz } we have

M,s ⊨ ∀y (Q (x, y) → X (y)) → X (x). But by the way we’ve picked


s (X ) that means M,s ⊨ ∀y (Q (x, y) → Q (x, y)) → Q (x,x), which
is equivalent to Q (x,x) since the antecedent is valid. Since s (x)
is arbitrary, we have M ⊨ ∀x Q (x,x).
Now suppose that M ⊨ ∀x Q (x,x) and show that M ⊨
∀X ∀x (∀y (Q (x, y) → X (y)) → X (x)). Pick any assignment s ,
and assume M,s ⊨ ∀y (Q (x, y) → X (y)). Let s ′ be the y-variant
of s with s ′ (y) = s (x); we have M,s ′ ⊨ Q (x, y) → X (y), i.e.,
M,s ⊨ Q (x,x) → X (x). Since M ⊨ ∀x Q (x,x), the antecedent
is true, and we have M,s ⊨ X (x), which is what we needed to
show.
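You can also check this equivalence by brute force on small structures:
over a finite domain, the second-order quantifier ∀X just ranges over the
finitely many subsets of the domain. The following Python sketch is a
finite sanity check, not a proof; it verifies that the two sentences agree
on every relation over a three-element domain.

from itertools import chain, combinations, product

W = [0, 1, 2]
pairs = list(product(W, repeat=2))

def subsets(xs):
    return chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))

def first_order(Q):
    # ∀x Q(x,x)
    return all((x, x) in Q for x in W)

def second_order(Q):
    # ∀X ⊆ W ∀x ∈ W: (∀y (Q(x,y) → y ∈ X)) → x ∈ X
    return all(
        x in X or not all(y in X for y in W if (x, y) in Q)
        for X in (set(S) for S in subsets(W))
        for x in W
    )

for Q in (set(S) for S in subsets(pairs)):
    assert first_order(Q) == second_order(Q)
print("the two sentences agree on every structure with a 3-element domain")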
Since some definable classes of frames are not first-order de-
finable, not every monadic second-order sentence of the form A ′
is equivalent to a first-order sentence. There is no effective
method to decide which ones are.

Problems
Problem 11.1. Complete the proof of Theorem 11.1.

Problem 11.2. Prove the claims in Proposition 11.2.

Problem 11.3. Let M = ⟨W,R,V ⟩ be a model. Show that if R


satisfies the left-hand properties of Table 11.2, every instance of
the corresponding right-hand formula is true in M.

Problem 11.4. Show that if the formula on the right side of Ta-
ble 11.2 is valid in a frame F, then F has the property on the left
side. To do this, consider a frame that does not satisfy the prop-
erty on the left, and define a suitable V such that the formula on
the right is false at some world.

Problem 11.5. Prove Proposition 11.9.

Problem 11.6. Prove Proposition 11.12 by showing:

1. If R is symmetric and transitive, it is euclidean.



2. If R is reflexive, it is serial.

3. If R is reflexive and euclidean, it is symmetric.

4. If R is symmetric and euclidean, it is transitive.

5. If R is serial, symmetric, and transitive, it is reflexive.

Explain why this suffices for the proof that the conditions are
equivalent.
CHAPTER 12

More Axiomatic Derivations
12.1 Normal Modal Logics
Not every set of modal formulas can easily be characterized as
those formulas derivable from a set of axioms. We want modal
logics to be well-behaved. First of all, everything we can derive in
classical propositional logic should still be derivable, of course
taking into account that the formulas may now contain also □
and ◇. To this end, we require that a modal logic contain all
tautological instances and be closed under modus ponens.

Definition 12.1. A modal logic is a set 𝛴 of modal formulas


which

1. contains all tautologies, and

2. is closed under substitution, i.e., if A ∈ 𝛴 , and D 1 , . . . , D n


are formulas, then

A[D 1 /p 1 , . . . ,D n /p n ] ∈ 𝛴 ,


3. is closed under modus ponens, i.e., if A and A → B ∈ 𝛴 , then


B ∈ 𝛴.

In order to use the relational semantics for modal logics, we


also have to require that all formulas valid in all modal models are
included. It turns out that this requirement is met as soon as all
instances of K and dual are derivable, and whenever a formula A
is derivable, so is □A. A modal logic that satisfies these conditions
is called normal. (Of course, there are also non-normal modal
logics, but the usual relational models are not adequate for them.)

Definition 12.2. A modal logic 𝛴 is normal if it contains

□(p → q ) → (□p → □q ), (K)


◇p ↔ ¬□¬p (dual)

and is closed under necessitation, i.e., if A ∈ 𝛴 , then □A ∈ 𝛴 .

Observe that while tautological implication is “fine-grained”


enough to preserve truth at a world, the rule nec only preserves
truth in a model (and hence also validity in a frame or in a class
of frames).

Proposition 12.3. Every normal modal logic is closed under rule rk,
A1 → (A2 → · · · (An−1 → An ) · · · )
rk
□A1 → (□A2 → · · · (□An−1 → □An ) · · · ).

Proof. By induction on n: If n = 1, then the rule is just nec, and


every normal modal logic is closed under nec.
Now suppose the result holds for n −1; we show it holds for n.
Assume

A1 → (A2 → · · · (An−1 → An ) · · · ) ∈ 𝛴

By the induction hypothesis, we have

□A1 → (□A2 → · · · □(An−1 → An ) · · · ) ∈ 𝛴



Since 𝛴 is a normal modal logic, it contains all instances of K,


in particular

□(An−1 → An ) → (□An−1 → □An ) ∈ 𝛴

Using modus ponens and suitable tautological instances we get

□A1 → (□A2 → · · · (□An−1 → □An ) · · · ) ∈ 𝛴 . □

Proposition 12.4. Every normal modal logic 𝛴 contains ¬◇⊥.

Proposition 12.5. Let A1 , . . . , An be formulas. Then there is a small-


est normal modal logic 𝛴 containing all instances of A1 , . . . , An .

Proof. Given A1 , . . . , An , define 𝛴 as the intersection of all nor-


mal modal logics containing all instances of A1 , . . . , An . This collec-
tion is non-empty, as Frm(L), the set of all formulas, is such a normal
modal logic; and it is easy to check that the intersection of normal
modal logics containing all instances of A1 , . . . , An is itself a normal
modal logic containing all those instances, and so is the smallest one. □

Definition 12.6. The smallest normal modal logic containing


A1 , . . . , An is called a modal system and denoted by KA1 . . . An .
The smallest normal modal logic is denoted by K.

12.2 Derivations and Modal Systems


We first define what a derivation is for normal modal logics.
Roughly, a derivation is a sequence of formulas in which every
element is either (a substitution instance of) one of a number of
axioms, or follows from previous elements by one of a few infer-
ence rules. For normal modal logics, all instances of tautologies,
K, and dual count as axioms. This results in the modal system K,
the smallest normal modal logic. We may wish to add additional
axioms to obtain other systems, however. The rules are always
modus ponens mp and necessitation nec.

Definition 12.7. Given a modal system KA1 . . . An and a for-


mula B we say that B is derivable in KA1 . . . An , written
KA1 . . . An ⊢ B, if and only if there are formulas C 1 , . . . , C k such
that C k = B and each C i is either a tautological instance, or an
instance of one of K, dual, A1 , . . . , An , or it follows from previous
formulas by means of the rules mp or nec.

The following proposition allows us to show that B ∈ 𝛴 by


exhibiting a 𝛴 -derivation of B.

Proposition 12.8. KA1 . . . An = {B : KA1 . . . An ⊢ B }.

Proof. We use induction on the length of derivations to show that


{B : KA1 . . . An ⊢ B } ⊆ KA1 . . . An .
If the derivation of B has length 1, it contains a single formula.
That formula cannot follow from previous formulas by mp or nec,
so must be a tautological instance, an instance of K, dual, or an
instance of one of A1 , . . . , An . But KA1 . . . An contains these as
well, so B ∈ KA1 . . . An .
If the derivation of B has length > 1, then B may in addition
be obtained by mp or nec from formulas occurring earlier in the
derivation. If B follows from C and C → B (by mp), then
C and C → B ∈ KA1 . . . An by induction hypothesis. But every
modal logic is closed under modus ponens, so B ∈ KA1 . . . An . If
B ≡ □C follows from C by nec, then C ∈ KA1 . . . An by induction
hypothesis. But every normal modal logic is closed under nec,
so B ∈ KA1 . . . An .
The converse inclusion follows by showing that 𝛴 = {B :
KA1 . . . An ⊢ B } is a normal modal logic containing all the in-
stances of A1 , . . . , An , and the observation that KA1 . . . An is, by
definition, the smallest such logic.
1. Every tautology B is a tautological instance, so KA1 . . . An ⊢
B, so 𝛴 contains all tautologies.
2. If KA1 . . . An ⊢ C and KA1 . . . An ⊢ C →B, then KA1 . . . An ⊢
B: Combine the derivation of C with that of C → B, and

add the line B. The last line is justified by mp. So 𝛴 is


closed under modus ponens.

3. If B has a derivation, then every substitution instance of B


also has a derivation: apply the substitution to every for-
mula in the derivation. (Exercise: prove by induction on the
length of derivations that the result is also a correct deriva-
tion). So 𝛴 is closed under uniform substitution. (We have
now established that 𝛴 satisfies all conditions of a modal
logic.)

4. We have KA1 . . . An ⊢ K, so K ∈ 𝛴 .

5. We have KA1 . . . An ⊢ dual, so dual ∈ 𝛴 .

6. If KA1 . . . An ⊢ C , the additional line □C is justified by nec.


Consequently, 𝛴 is closed under nec. Thus, 𝛴 is normal.

12.3 Dual Formulas

Definition 12.9. Each of the formulas T, B, 4, and 5 has a dual,


denoted by a subscripted diamond, as follows:

p → ◇p (T◇ )
◇□p → p (B◇ )
◇◇p → ◇p (4◇ )
◇□p → □p (5◇ )

Each of the above dual formulas is obtained from the corre-


sponding formula by substituting ¬p for p, contraposing, replac-
ing ¬□¬ by ◇, and replacing ¬◇¬ by □. D, i.e., □p → ◇p, is its
own dual in that sense.

12.4 Proofs in Modal Systems


We now come to proofs in systems of modal logic other than K.

Proposition 12.10. The following provability results obtain:

1. KT5 ⊢ B;

2. KT5 ⊢ 4;

3. KDB4 ⊢ T;

4. KB4 ⊢ 5;

5. KB5 ⊢ 4;

6. KT ⊢ D.

Proof. We exhibit proofs for each.

1. KT5 ⊢ B:

1. KT5 ⊢ ◇A → □◇A 5
2. KT5 ⊢ A → ◇A T◇
3. KT5 ⊢ A → □◇A pl, 1, 2.

2. KT5 ⊢ 4:

1. KT5 ⊢ ◇□A → □◇□A 5 with □A for p


2. KT5 ⊢ □A → ◇□A T◇ with □A for p
3. KT5 ⊢ □A → □◇□A pl, 1, 2
4. KT5 ⊢ ◇□A → □A 5◇
5. KT5 ⊢ □◇□A → □□A rk, 4
6. KT5 ⊢ □A → □□A pl, 3, 5.

3. KDB4 ⊢ T:

1. KDB4 ⊢ ◇□A → A B◇
2. KDB4 ⊢ □□A → ◇□A D with □A for p
3. KDB4 ⊢ □□A → A pl, 1, 2
4. KDB4 ⊢ □A → □□A 4
5. KDB4 ⊢ □A → A pl, 3, 4.

4. KB4 ⊢ 5:

1. KB4 ⊢ ◇A → □◇◇A B with ◇A for p


2. KB4 ⊢ ◇◇A → ◇A 4◇
3. KB4 ⊢ □◇◇A → □◇A rk, 2
4. KB4 ⊢ ◇A → □◇A pl, 1, 3.

5. KB5 ⊢ 4:

1. KB5 ⊢ □A → □◇□A B with □A for p


2. KB5 ⊢ ◇□A → □A 5◇
3. KB5 ⊢ □◇□A → □□A rk, 2
4. KB5 ⊢ □A → □□A pl, 1, 3.

6. KT ⊢ D:

1. KT ⊢ □A → A T
2. KT ⊢ A → ◇A T◇
3. KT ⊢ □A → ◇A pl, 1, 2 □

Definition 12.11. Following tradition, we define S4 to be the


system KT4, and S5 the system KTB4.

The following proposition shows that the classical system S5


has several equivalent axiomatizations. This should not be surprising,
as the various combinations of axioms all characterize equiva-
lence relations (see Proposition 11.12).

Proposition 12.12. KTB4 = KT5 = KDB4 = KDB5.

Proof. Exercise. □

12.5 Soundness
A derivation system is called sound if everything that can be de-
rived is valid. When considering modal systems, i.e., derivations
where in addition to K we can use instances of some formulas A1 ,
. . . , An , we want every derivable formula to be true in any model
in which A1 , . . . , An are true.

Theorem 12.13 (Soundness Theorem). If every instance of A1 ,


. . . , An is valid in the classes of models C1 , . . . , Cn , respectively, then
KA1 . . . An ⊢ B implies that B is valid in the class of models C1 ∩
· · · ∩ Cn .

12.6 Showing Systems are Distinct


In section 12.4 we saw how to prove that two systems of modal
logic are in fact the same system. Theorem 12.13 allows us to
show that two modal systems 𝛴 and 𝛴 ′ are distinct, by finding
a formula A that is derivable in 𝛴 ′ but fails in some model of 𝛴 .

Proposition 12.14. KD ⊊ KT

Proof. This is the syntactic counterpart to the semantic fact that


all reflexive relations are serial. To show KD ⊆ KT we need to
see that KD ⊢ B implies KT ⊢ B, which follows from KT ⊢ D,
as shown in Proposition 12.10(6). To show that the inclusion is
proper, by Soundness (Theorem 12.13), it suffices to exhibit a
model of KD where T, i.e., □p → p, fails (an easy task left as an
exercise), for then by Soundness KD ⊬ □p → p. □

Proposition 12.15. KB ≠ K4.

Proof. We construct a symmetric model where some instance of


4 fails; since the instance is obviously derivable in K4 but, by
soundness, not in KB, it will follow that K4 ⊈ KB. Consider the
symmetric model M of
Figure 12.1. Since the model is symmetric, K and B are true in M
(by Proposition 8.19 and Theorem 11.1, respectively). However,
M,w 1 ⊮ □p → □□p. □
Figure 12.1: A symmetric model falsifying an instance of 4. (Two
worlds w 1 and w 2 that access each other; p is false at w 1 and true
at w 2 , so w 1 ⊩ □p but w 2 ⊮ □p, and hence w 1 ⊮ □□p.)
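As a quick check of Figure 12.1, the following Python sketch (an informal
aid, not part of the proof) recomputes at which worlds □p and □□p hold
in that model.

W = ["w1", "w2"]
R = {("w1", "w2"), ("w2", "w1")}
V = {"p": {"w2"}}

def box(worlds, w):
    """True if every world accessible from w is in the given set of worlds."""
    return all(v in worlds for (u, v) in R if u == w)

assert all((v, u) in R for (u, v) in R)            # R is symmetric
box_p = {w for w in W if box(V["p"], w)}           # {w1}
box_box_p = {w for w in W if box(box_p, w)}        # {w2}
print("w1" in box_p, "w1" in box_box_p)            # True False: 4 fails at w1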

Theorem 12.16. KTB ⊬ 4 and KTB ⊬ 5.

Proof. By Theorem 11.1 we know that all instances of T and B


are true in every reflexive model and every symmetric model, respectively. So by
soundness, it suffices to find a reflexive symmetric model contain-
ing a world at which some instance of 4 fails, and similarly for 5.
We use the same model for both claims. Consider the symmetric,
reflexive model in Figure 12.2. Then M,w 1 ⊮ □p → □□p, so 4
fails at w 1 . Similarly, M,w 2 ⊮ ◇¬p → □◇¬p, so the instance of 5
with A = ¬p fails at w 2 . □

Figure 12.2: The model for Theorem 12.16. (A reflexive, symmetric
model with worlds w 1 , w 2 , w 3 ; p is true at w 1 and w 2 and false at
w 3 ; w 1 ⊩ □p but w 1 ⊮ □□p and w 1 ⊮ ◇¬p, while w 2 ⊩ ◇¬p but
w 2 ⊮ □◇¬p.)

Figure 12.3: The model for Theorem 12.17. (A serial, euclidean model
with four worlds; p is true at w 2 and w 3 and false at w 1 and w 4 ;
w 1 ⊩ □p but w 1 ⊮ □□p.)

Theorem 12.17. KD5 ≠ KT4 = S4.

Proof. By Theorem 11.1 we know that all instances of D and 5


are true in all serial euclidean models. So it suffices to find a
serial euclidean model containing a world at which some instance
of 4 fails. Consider the model of Figure 12.3, and notice that
M,w 1 ⊮ □p → □□p. □

12.7 Derivability from a Set of Formulas


In section 12.4 we defined a notion of provability of a formula in
a system 𝛴 . We now extend this notion to provability in 𝛴 from
formulas in a set 𝛤.

Definition 12.18. A formula A is derivable in a system 𝛴 from


a set of formulas 𝛤, written 𝛤 ⊢ 𝛴 A if and only if there are B 1 ,
. . . , Bn ∈ 𝛤 such that 𝛴 ⊢ B 1 → (B 2 → · · · (Bn → A) · · · ).

12.8 Properties of Derivability


Proposition 12.19. Let 𝛴 be a modal system and 𝛤 a set of modal
formulas. The following properties hold:

1. Monotony: If 𝛤 ⊢ 𝛴 A and 𝛤 ⊆ 𝛥 then 𝛥 ⊢ 𝛴 A;

2. Reflexivity: If A ∈ 𝛤 then 𝛤 ⊢ 𝛴 A;

3. Cut: If 𝛤 ⊢ 𝛴 A and 𝛥 ∪ {A} ⊢ 𝛴 B then 𝛤 ∪ 𝛥 ⊢ 𝛴 B;

4. Deduction theorem: 𝛤 ∪ {B } ⊢ 𝛴 A if and only if 𝛤 ⊢ 𝛴 B →A;

5. If 𝛤 ⊢ 𝛴 A1 and . . . and 𝛤 ⊢ 𝛴 An and A1 → (A2 → · · · (An →


B) · · · ) is a tautological instance, then 𝛤 ⊢ 𝛴 B.

The proof is an easy exercise. Part (5) of Proposition 12.19


gives us that, for instance, if 𝛤 ⊢ 𝛴 A ∨ B and 𝛤 ⊢ 𝛴 ¬A, then
𝛤 ⊢ 𝛴 B. Also, in what follows, we write 𝛤,A ⊢ 𝛴 B instead of
𝛤 ∪ {A} ⊢ 𝛴 B.

Definition 12.20. A set 𝛤 is deductively closed relatively to a sys-


tem 𝛴 if and only if 𝛤 ⊢ 𝛴 A implies A ∈ 𝛤.

12.9 Consistency
Consistency is an important property of sets of formulas. A set
of formulas is inconsistent if a contradiction, such as ⊥, is deriv-
able from it; and otherwise consistent. If a set is inconsistent, its
formulas cannot all be true in a model at a world. For the com-
pleteness theorem we prove the converse: every consistent set is
true at a world in a model, namely in the “canonical model.”

Definition 12.21. A set 𝛤 is consistent relatively to a system 𝛴


or, as we will say, 𝛴 -consistent, if and only if 𝛤 ⊬ 𝛴 ⊥.

So for instance, the set {□(p → q ), □p, ¬□q } is consistent rel-


atively to propositional logic, but not K-consistent. Similarly, the
set {◇p, □◇p → q , ¬q } is not K5-consistent.

Proposition 12.22. Let 𝛤 be a set of formulas. Then:

1. 𝛤 is 𝛴 -consistent if and only if there is some formula A such that


𝛤 ⊬ 𝛴 A.

2. 𝛤 ⊢ 𝛴 A if and only if 𝛤 ∪ {¬A} is not 𝛴 -consistent.

3. If 𝛤 is 𝛴 -consistent, then for any formula A, either 𝛤 ∪ {A} is


𝛴 -consistent or 𝛤 ∪ {¬A} is 𝛴 -consistent.

Proof. These facts follow easily using classical propositional logic.


We give the argument for (3). Proceed contrapositively and sup-
pose neither 𝛤 ∪ {A} nor 𝛤 ∪ {¬A} is 𝛴 -consistent. Then by
(2), both 𝛤,A ⊢ 𝛴 ⊥ and 𝛤, ¬A ⊢ 𝛴 ⊥. By the deduction theorem
𝛤 ⊢ 𝛴 A → ⊥ and 𝛤 ⊢ 𝛴 ¬A →⊥. But (A →⊥) → ((¬A →⊥) →⊥) is
a tautological instance, hence by Proposition 12.19(5), 𝛤 ⊢ 𝛴 ⊥.□

Problems
Problem 12.1. Prove Proposition 12.4.

Problem 12.2. Show that for each formula A in Definition 12.9:


K ⊢ A ↔ A◇ .

Problem 12.3. Prove Proposition 12.12.

Problem 12.4. Give an alternative proof of Theorem 12.17 using


a model with 3 worlds.

Problem 12.5. Provide a single reflexive transitive model show-


ing that both KT4 ⊬ B and KT4 ⊬ 5.
CHAPTER 13

Completeness and Canonical Models
13.1 Introduction
If 𝛴 is a modal system, then the soundness theorem establishes
that if 𝛴 ⊢ A, then A is valid in any class C of models in which all
instances of all formulas in 𝛴 are valid. In particular that means
that if K ⊢ A then A is true in all models; if KT ⊢ A then A is
true in all reflexive models; if KD ⊢ A then A is true in all serial
models, etc.
Completeness is the converse of soundness: that K is com-
plete means, for instance, that if a formula A is valid, then K ⊢ A. Prov-
ing completeness is a lot harder to do than proving soundness.
It is useful, first, to consider the contrapositive: K is complete iff
whenever ⊬ A, there is a countermodel, i.e., a model M such that
M ⊮ A. Equivalently (negating A), we could prove that whenever
⊬ ¬A, there is a model of A. In the construction of such a model,
we can use information contained in A. When we find models
for specific formulas we often do the same: e.g., if we want to


find a countermodel to p → □q , we know that it has to contain


a world where p is true and □q is false. And a world where □q
is false means there has to be a world accessible from it where
q is false. And that’s all we need to know: which worlds make
the propositional variables true, and which worlds are accessible
from which worlds.
In the case of proving completeness, however, we don’t have
a specific formula A for which we are constructing a model. We
want to establish that a model exists for every A such that ⊬ 𝛴 ¬A.
This is a minimal requirement, since if ⊢ 𝛴 ¬A, by soundness,
there is no model for A (in which 𝛴 is true). Now note that
⊬ 𝛴 ¬A iff A is 𝛴 -consistent. (Recall that ⊬ 𝛴 ¬A and A ⊬ 𝛴 ⊥
are equivalent.) So our task is to construct a model for every
𝛴 -consistent formula.
The trick we’ll use is to find a 𝛴 -consistent set of formulas
that contains A, but also other formulas which tell us what the
world that makes A true has to look like. Such sets are complete 𝛴 -
consistent sets. It’s not enough to construct a model with a single
world to make A true; it will have to contain multiple worlds and
an accessibility relation. The complete 𝛴 -consistent set contain-
ing A will also contain other formulas of the form □B and ◇C . In
all accessible worlds, B has to be true; in at least one, C has to be
true. In order to accomplish this, we’ll simply take all possible
complete 𝛴 -consistent sets as the basis for the set of worlds. A
tricky part will be to figure out when a complete 𝛴 -consistent set
should count as being accessible from another in our model.
We’ll show that in the model so defined, A is true at a world—
which is also a complete 𝛴 -consistent set—iff A is an element of
that set. If A is 𝛴 -consistent, it will be an element of at least one
complete 𝛴 -consistent set (a fact we’ll prove), and so there will
be a world where A is true. So we will have a single model where
every 𝛴 -consistent formula A is true at some world. This single
model is the canonical model for 𝛴 .

13.2 Complete 𝛴 -Consistent Sets


Suppose 𝛴 is a set of modal formulas—think of them as the
axioms or defining principles of a normal modal logic. A set
𝛤 is 𝛴 -consistent iff 𝛤 ⊬ 𝛴 ⊥, i.e., if there is no derivation
of A1 → (A2 → · · · (An → ⊥) . . . ) from 𝛴 , where each Ai ∈ 𝛤.
We will construct a “canonical” model in which each world is
taken to be a special kind of 𝛴 -consistent set: one which is not
just 𝛴 -consistent, but maximally so, in the sense that it settles the
truth value of every modal formula: for every A, either A ∈ 𝛤 or
¬A ∈ 𝛤:

Definition 13.1. A set 𝛤 is complete 𝛴 -consistent if and only if it


is 𝛴 -consistent and for every A, either A ∈ 𝛤 or ¬A ∈ 𝛤.

Complete 𝛴 -consistent sets 𝛤 have a number of useful prop-


erties. For one, they are deductively closed, i.e., if 𝛤 ⊢ 𝛴 A then
A ∈ 𝛤. This means in particular that every instance of a for-
mula A ∈ 𝛴 is also ∈ 𝛤. Moreover, membership in 𝛤 mirrors the
truth conditions for the propositional connectives. This will be
important when we define the “canonical model.”

Proposition 13.2. Suppose 𝛤 is complete 𝛴 -consistent. Then:

1. 𝛤 is deductively closed in 𝛴 .

2. 𝛴 ⊆ 𝛤.

3. ⊥ ∉ 𝛤

4. ¬A ∈ 𝛤 if and only if A ∉ 𝛤.

5. A ∧ B ∈ 𝛤 iff A ∈ 𝛤 and B ∈ 𝛤

6. A ∨ B ∈ 𝛤 iff A ∈ 𝛤 or B ∈ 𝛤

7. A → B ∈ 𝛤 iff A ∉ 𝛤 or B ∈ 𝛤

Proof. 1. Suppose 𝛤 ⊢ 𝛴 A but A ∉ 𝛤. Then since 𝛤 is complete


𝛴 -consistent, ¬A ∈ 𝛤. This would make 𝛤 inconsistent,
since A, ¬A ⊢ 𝛴 ⊥.

2. If A ∈ 𝛴 then 𝛤 ⊢ 𝛴 A, and A ∈ 𝛤 by deductive closure, i.e.,


case (1).

3. If ⊥ ∈ 𝛤, then 𝛤 ⊢ 𝛴 ⊥, so 𝛤 would be 𝛴 -inconsistent.

4. If ¬A ∈ 𝛤, then by consistency A ∉ 𝛤; and if A ∉ 𝛤 then


A ∈ 𝛤 since 𝛤 is complete 𝛴 -consistent.

5. Exercise.

6. Suppose A ∨ B ∈ 𝛤, and A ∉ 𝛤 and B ∉ 𝛤. Since 𝛤 is com-


plete 𝛴 -consistent, ¬A ∈ 𝛤 and ¬B ∈ 𝛤. Then ¬(A∨B) ∈ 𝛤
since ¬A → (¬B →¬(A ∨B)) is a tautological instance. This
would mean that 𝛤 is 𝛴 -inconsistent, a contradiction.

7. Exercise.

13.3 Lindenbaum’s Lemma


Lindenbaum’s Lemma establishes that every 𝛴 -consistent set of
formulas is contained in at least one complete 𝛴 -consistent set.
Our construction of the canonical model will show that for each
complete 𝛴 -consistent set 𝛥, there is a world in the canonical
model where all and only the formulas in 𝛥 are true. So Linden-
baum’s Lemma guarantees that every 𝛴 -consistent set is true at
some world in the canonical model.

Theorem 13.3 (Lindenbaum’s Lemma). If 𝛤 is 𝛴 -consistent


then there is a complete 𝛴 -consistent set 𝛥 extending 𝛤.

Proof. Let A0 , A1 , . . . be an exhaustive listing of all formulas


of the language (repetitions are allowed). For instance, start by
listing p0 , and at each stage n ≥ 1 list the finitely many formulas

of length n using only variables among p0 , . . . , pn . We define sets


of formulas 𝛥n by induction on n, and we then set 𝛥 = n 𝛥n . We
⋃︁
first put 𝛥0 = 𝛤. Supposing that 𝛥n has been defined, we define
𝛥n+1 by:
{︄
𝛥n ∪ {An }, if 𝛥n ∪ {An } is 𝛴 -consistent;
𝛥n+1 =
𝛥n ∪ {¬An }, otherwise.
Now let 𝛥 = n=0 𝛥n .
⋃︁∞
We have to show that this definition actually yields a set 𝛥
with the required properties, i.e., 𝛤 ⊆ 𝛥 and 𝛥 is complete 𝛴 -
consistent.
It’s obvious that 𝛤 ⊆ 𝛥, since 𝛥0 ⊆ 𝛥 by construction, and
𝛥0 = 𝛤. In fact, 𝛥n ⊆ 𝛥 for all n, since 𝛥 is the union of all 𝛥n .
(Since in each step of the construction, we add a formula to the
set already constructed, 𝛥n ⊆ 𝛥n+1 , so since ⊆ is transitive, 𝛥n ⊆
𝛥m whenever n ≤ m.) At each stage of the construction, we either
add An or ¬An , and every formula appears (at least once) in the
list of all An . So, for every A either A ∈ 𝛥 or ¬A ∈ 𝛥, so 𝛥 is
complete by definition.
Finally, we have to show, that 𝛥 is 𝛴 -consistent. To do this,
we show that (a) if 𝛥 were 𝛴 -inconsistent, then some 𝛥n would
be 𝛴 -inconsistent, and (b) all 𝛥n are 𝛴 -consistent.
So suppose 𝛥 were 𝛴 -inconsistent. Then 𝛥 ⊢ 𝛴 ⊥, i.e., there
are A1 , . . . , Ak ∈ 𝛥 such that 𝛴 ⊢ A1 → (A2 → · · · (Ak → ⊥) . . . ).
Since 𝛥 = ⋃_{n=0}^{∞} 𝛥n , each Ai ∈ 𝛥ni for some ni . Let n be the largest
of these. Since ni ≤ n, 𝛥ni ⊆ 𝛥n . So, all Ai are in 𝛥n . This
would mean 𝛥n ⊢ 𝛴 ⊥, i.e., 𝛥n is 𝛴 -inconsistent.
To show that each 𝛥n is 𝛴 -consistent, we use a simple induc-
tion on n. 𝛥0 = 𝛤, and we assumed 𝛤 was 𝛴 -consistent. So
the claim holds for n = 0. Now suppose it holds for n, i.e., 𝛥n
is 𝛴 -consistent. 𝛥n+1 is either 𝛥n ∪ {An } if that is 𝛴 -consistent,
otherwise it is 𝛥n ∪ {¬An }. In the first case, 𝛥n+1 is clearly 𝛴 -
consistent. However, by Proposition 12.22(3), either 𝛥n ∪ {An }
or 𝛥n ∪ {¬An } is consistent, so 𝛥n+1 is consistent in the other case
as well. □

Corollary 13.4. 𝛤 ⊢ 𝛴 A if and only if A ∈ 𝛥 for each complete 𝛴 -


consistent set 𝛥 extending 𝛤 (including when 𝛤 = ∅, in which case we
get another characterization of the modal system 𝛴 .)

Proof. Suppose 𝛤 ⊢ 𝛴 A, and let 𝛥 be any complete 𝛴 -consistent


set extending 𝛤. If A ∉ 𝛥 then by maximality ¬A ∈ 𝛥 and so
𝛥 ⊢ 𝛴 A (by monotony) and 𝛥 ⊢ 𝛴 ¬A (by reflexivity), and so 𝛥 is
inconsistent. Conversely if 𝛤 ⊬ 𝛴 A, then 𝛤 ∪{¬A} is 𝛴 -consistent,
and by Lindenbaum’s Lemma there is a complete consistent set
𝛥 extending 𝛤 ∪ {¬A}. By consistency, A ∉ 𝛥. □

13.4 Modalities and Complete Consistent


Sets
When we construct a model M 𝛴 whose set of worlds is given by
the complete 𝛴 -consistent sets 𝛥 in some normal modal logic 𝛴 ,
we will also need to define an accessibility relation R 𝛴 between
such “worlds.” We want it to be the case that the accessibility
relation (and the assignment V 𝛴 ) are defined in such a way that
M 𝛴 , 𝛥 ⊩ A iff A ∈ 𝛥. How should we do this?
Once the accessibility relation is defined, the definition of
truth at a world ensures that M 𝛴 , 𝛥 ⊩ □A iff M 𝛴 , 𝛥′ ⊩ A for
all 𝛥′ such that R 𝛴 𝛥𝛥′. The proof that M 𝛴 , 𝛥 ⊩ A iff A ∈ 𝛥
requires that this is true in particular for formulas starting with
a modal operator, i.e., M 𝛴 , 𝛥 ⊩ □A iff □A ∈ 𝛥. Combining this
requirement with the definition of truth at a world for □A yields:

□A ∈ 𝛥 iff A ∈ 𝛥′ for all 𝛥′ with R 𝛴 𝛥𝛥′

Consider the left-to-right direction: it says that if □A ∈ 𝛥, then


A ∈ 𝛥′ for any A and any 𝛥′ with R 𝛴 𝛥𝛥′. If we stipulate that
R 𝛴 𝛥𝛥′ iff A ∈ 𝛥′ for all □A ∈ 𝛥, then this holds. We can write
the condition on the right of the “iff” more compactly as: {A :
□A ∈ 𝛥} ⊆ 𝛥′.

So the question is: does this definition of R 𝛴 in fact guarantee


that □A ∈ 𝛥 iff M 𝛴 , 𝛥 ⊩ □A? Does it also guarantee that ◇A ∈ 𝛥
iff M 𝛴 , 𝛥 ⊩ ◇A? The next few results will establish this.

Definition 13.5. If 𝛤 is a set of formulas, let

□𝛤 = {□B : B ∈ 𝛤 }
◇𝛤 = {◇B : B ∈ 𝛤 }

and

□−1 𝛤 = {B : □B ∈ 𝛤 }
◇−1 𝛤 = {B : ◇B ∈ 𝛤 }

In other words, □𝛤 is 𝛤 with □ in front of every formula in 𝛤;


□−1 𝛤 is all the □’ed formulas of 𝛤 with the initial □’s removed.
This definition is not terribly important on its own, but will sim-
plify the notation considerably.
Note that □□−1 𝛤 ⊆ 𝛤:
□□−1 𝛤 = {□B : □B ∈ 𝛤 }
i.e., it’s just the set of all those formulas of 𝛤 that start with □.
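In the tuple encoding of formulas used in the earlier sketches, these four
operations are one-line set comprehensions. The following Python fragment
is purely illustrative.

G = {("box", ("var", "p")), ("var", "q"), ("dia", ("var", "r"))}

box_G     = {("box", A) for A in G}              # □𝛤
dia_G     = {("dia", A) for A in G}              # ◇𝛤
box_inv_G = {A[1] for A in G if A[0] == "box"}   # □⁻¹𝛤
dia_inv_G = {A[1] for A in G if A[0] == "dia"}   # ◇⁻¹𝛤

print(box_inv_G)   # {('var', 'p')}: the □'ed formulas of 𝛤 with the □ removed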

Lemma 13.6. If 𝛤 ⊢ 𝛴 A then □𝛤 ⊢ 𝛴 □A.

Proof. If 𝛤 ⊢ 𝛴 A then there are B 1 , . . . , Bk ∈ 𝛤 such that 𝛴 ⊢


B 1 → (B 2 → · · · (Bk → A) · · · ). Since 𝛴 is normal, by rule rk,
𝛴 ⊢ □B 1 → (□B 2 → · · · (□Bk → □A) · · · ), where obviously □B 1 ,
. . . , □Bk ∈ □𝛤. Hence, by definition, □𝛤 ⊢ 𝛴 □A. □

Lemma 13.7. If □−1 𝛤 ⊢ 𝛴 A then 𝛤 ⊢ 𝛴 □A.

Proof. Suppose □−1 𝛤 ⊢ 𝛴 A; then by Lemma 13.6, □□−1 𝛤 ⊢ 𝛴 □A.


But since □□−1 𝛤 ⊆ 𝛤, also 𝛤 ⊢ 𝛴 □A by Monotony. □

Proposition 13.8. If 𝛤 is complete 𝛴 -consistent, then □A ∈ 𝛤 if and


only if for every complete 𝛴 -consistent 𝛥 such that □−1 𝛤 ⊆ 𝛥, it holds
that A ∈ 𝛥.
Proof. Suppose 𝛤 is complete 𝛴 -consistent. The “only if” direc-
tion is easy: Suppose □A ∈ 𝛤 and that □−1 𝛤 ⊆ 𝛥. Since □A ∈ 𝛤,
A ∈ □−1 𝛤 ⊆ 𝛥, so A ∈ 𝛥.
For the “if” direction, we prove the contrapositive: Suppose
□A ∉ 𝛤. Since 𝛤 is complete 𝛴 -consistent, it is deductively
closed, and hence 𝛤 ⊬ 𝛴 □A. By Lemma 13.7, □−1 𝛤 ⊬ 𝛴 A. By
Proposition 12.22(2), □−1 𝛤 ∪ {¬A} is 𝛴 -consistent. By Linden-
baum’s Lemma, there is a complete 𝛴 -consistent set 𝛥 such that
□−1 𝛤 ∪ {¬A} ⊆ 𝛥. By consistency, A ∉ 𝛥. □

Lemma 13.9. Suppose 𝛤 and 𝛥 are complete 𝛴 -consistent. Then


□−1 𝛤 ⊆ 𝛥 if and only if ◇𝛥 ⊆ 𝛤.

Proof. “Only if” direction: Assume □−1 𝛤 ⊆ 𝛥 and suppose ◇A ∈


◇𝛥 (i.e., A ∈ 𝛥). In order to show ◇A ∈ 𝛤, it suffices to show
□¬A ∉ 𝛤, for then by maximality, ¬□¬A ∈ 𝛤. Now, if □¬A ∈ 𝛤
then by hypothesis ¬A ∈ 𝛥, against the consistency of 𝛥 (since
A ∈ 𝛥). Hence □¬A ∉ 𝛤, as required.
“If” direction: Assume ◇𝛥 ⊆ 𝛤. We argue contrapositively:
suppose A ∉ 𝛥 in order to show □A ∉ 𝛤. If A ∉ 𝛥 then by
maximality ¬A ∈ 𝛥 and so by hypothesis ◇¬A ∈ 𝛤. But in a
normal modal logic ◇¬A is equivalent to ¬□A, and if the latter
is in 𝛤, by consistency □A ∉ 𝛤, as required. □

Proposition 13.10. If 𝛤 is complete 𝛴 -consistent, then ◇A ∈ 𝛤 if


and only if for some complete 𝛴 -consistent 𝛥 such that ◇𝛥 ⊆ 𝛤, it
holds that A ∈ 𝛥.
Proof. Suppose 𝛤 is complete 𝛴 -consistent. ◇A ∈ 𝛤 iff ¬□¬A ∈ 𝛤
by dual and closure. ¬□¬A ∈ 𝛤 iff □¬A ∉ 𝛤 by Propo-
sition 13.2(4) since 𝛤 is complete 𝛴 -consistent. By Proposi-
tion 13.8, □¬A ∉ 𝛤 iff, for some complete 𝛴 -consistent 𝛥 with

□−1 𝛤 ⊆ 𝛥, ¬A ∉ 𝛥. Now consider any such 𝛥. By Lemma 13.9,


□−1 𝛤 ⊆ 𝛥 iff ◇𝛥 ⊆ 𝛤. Also, ¬A ∉ 𝛥 iff A ∈ 𝛥 by Proposi-
tion 13.2(4). So ◇A ∈ 𝛤 iff, for some complete 𝛴 -consistent 𝛥
with ◇𝛥 ⊆ 𝛤, A ∈ 𝛥. □

13.5 Canonical Models


The canonical model for a modal system 𝛴 is a specific model M 𝛴
in which the worlds are all complete 𝛴 -consistent sets. Its acces-
sibility relation R 𝛴 and valuation V 𝛴 are defined so as to guar-
antee that the formulas true at a world 𝛥 are exactly the formulas
making up 𝛥.

Definition 13.11. Let 𝛴 be a normal modal logic. The canonical


model for 𝛴 is M 𝛴 = ⟨W 𝛴 ,R 𝛴 ,V 𝛴 ⟩, where:

1. W 𝛴 = {𝛥 : 𝛥 is complete 𝛴 -consistent}.

2. R 𝛴 𝛥𝛥′ holds if and only if □−1 𝛥 ⊆ 𝛥′.

3. V 𝛴 (p) = {𝛥 : p ∈ 𝛥}.

13.6 The Truth Lemma


The canonical model M 𝛴 is defined in such a way that M 𝛴 , 𝛥 ⊩ A
iff A ∈ 𝛥. For propositional variables, the definition of V 𝛴 yields
this directly. We have to verify that the equivalence holds for all
formulas, however. We do this by induction. The inductive step
involves proving the equivalence for formulas involving proposi-
tional operators (where we have to use Proposition 13.2) and the
modal operators (where we invoke the results of section 13.4).

Proposition 13.12 (Truth Lemma). For every formula A,


M 𝛴 , 𝛥 ⊩ A if and only if A ∈ 𝛥.

Proof. By induction on A.

1. A ≡ ⊥: M 𝛴 , 𝛥 ⊮ ⊥ by Definition 8.7, and ⊥ ∉ 𝛥 by Propo-


sition 13.2(3).

2. A ≡ p: M 𝛴 , 𝛥 ⊩ p iff 𝛥 ∈ V 𝛴 (p) by Definition 8.7. Also,


𝛥 ∈ V 𝛴 (p) iff p ∈ 𝛥 by definition of V 𝛴 .

3. A ≡ ¬B: M 𝛴 , 𝛥 ⊩ ¬B iff M 𝛴 , 𝛥 ⊮ B (Definition 8.7) iff


B ∉ 𝛥 (by inductive hypothesis) iff ¬B ∈ 𝛥 (by Proposi-
tion 13.2(4)).

4. A ≡ B ∧ C : Exercise.

5. A ≡ B ∨ C : M 𝛴 , 𝛥 ⊩ B ∨ C iff M 𝛴 , 𝛥 ⊩ B or M 𝛴 , 𝛥 ⊩ C (by
Definition 8.7) iff B ∈ 𝛥 or C ∈ 𝛥 (by inductive hypothesis)
iff B ∨ C ∈ 𝛥 (by Proposition 13.2(6)).

6. A ≡ B → C : Exercise.

7. A ≡ □B: First suppose that M 𝛴 , 𝛥 ⊩ □B. By Definition 8.7,


for every 𝛥′ such that R 𝛴 𝛥𝛥′, M 𝛴 , 𝛥′ ⊩ B. By inductive
hypothesis, for every 𝛥′ such that R 𝛴 𝛥𝛥′, B ∈ 𝛥′. By defi-
nition of R 𝛴 , for every 𝛥′ such that □−1 𝛥 ⊆ 𝛥′, B ∈ 𝛥′. By
Proposition 13.8, □B ∈ 𝛥.
Now assume □B ∈ 𝛥. Let 𝛥′ ∈ W 𝛴 be such that R 𝛴 𝛥𝛥′,
i.e., □−1 𝛥 ⊆ 𝛥′. Since □B ∈ 𝛥, B ∈ □−1 𝛥. Consequently,
B ∈ 𝛥′. By inductive hypothesis, M 𝛴 , 𝛥′ ⊩ B. Since 𝛥′ is
arbitrary with R 𝛴 𝛥𝛥′, for all 𝛥′ ∈ W 𝛴 such that R 𝛴 𝛥𝛥′,
M 𝛴 , 𝛥′ ⊩ B. By Definition 8.7, M 𝛴 , 𝛥 ⊩ □B.

8. A ≡ ◇B: Exercise. □

13.7 Determination and Completeness for K
We are now prepared to use the canonical model to establish com-
pleteness. Completeness follows from the fact that the formulas

true in the canonical model for 𝛴 are exactly the 𝛴 -derivable


ones. Models with this property are said to determine 𝛴 .

Definition 13.13. A model M determines a normal modal


logic 𝛴 precisely when M ⊩ A if and only if 𝛴 ⊢ A, for all formu-
las A.

Theorem 13.14 (Determination). M 𝛴 ⊩ A if and only if 𝛴 ⊢ A.

Proof. If M 𝛴 ⊩ A, then for every complete 𝛴 -consistent 𝛥, we


have M 𝛴 , 𝛥 ⊩ A. Hence, by the Truth Lemma, A ∈ 𝛥 for every
complete 𝛴 -consistent 𝛥, whence by Corollary 13.4 (with 𝛤 = ∅),
𝛴 ⊢ A.
Conversely, if 𝛴 ⊢ A then by Proposition 13.2(1), every com-
plete 𝛴 -consistent 𝛥 contains A, and hence by the Truth Lemma,
M 𝛴 , 𝛥 ⊩ A for every 𝛥 ∈ W 𝛴 , i.e., M 𝛴 ⊩ A. □

Since the canonical model for K determines K, we immedi-


ately have completeness of K as a corollary:

Corollary 13.15. The basic modal logic K is complete with respect to


the class of all models, i.e., if ⊨ A then K ⊢ A.

Proof. Contrapositively, if K ⊬ A then by Determination M K ⊮ A


and hence A is not valid. □

For the general case of completeness of a system 𝛴 with re-


spect to a class of models, e.g., of KTB4 with respect to the class
of reflexive, symmetric, transitive models, determination alone
is not enough. We must also show that the canonical model for
the system 𝛴 is a member of the class, which does not follow ob-
viously from the canonical model construction—nor is it always
true!

13.8 Frame Completeness


The completeness theorem for K can be extended to other modal
systems, once we show that the canonical model for a given logic
has the corresponding frame property.

Theorem 13.16. If a normal modal logic 𝛴 contains one of the for-


mulas on the left-hand side of Table 13.1, then the canonical model
for 𝛴 has the corresponding property on the right-hand side.

If 𝛴 contains . . . . . . the canonical model for 𝛴 is:


D: □A → ◇A serial;
T: □A → A reflexive;
B: A → □◇A symmetric;
4: □A → □□A transitive;
5: ◇A → □◇A euclidean.
Table 13.1: Basic correspondence facts.

Proof. We take each of these up in turn.


Suppose 𝛴 contains D, and let 𝛥 ∈ W 𝛴 ; we need to show
that there is a 𝛥′ such that R 𝛴 𝛥𝛥′. It suffices to show that □−1 𝛥 is
𝛴 -consistent, for then by Lindenbaum’s Lemma, there is a com-
plete 𝛴 -consistent set 𝛥′ ⊇ □−1 𝛥, and by definition of R 𝛴 we
have R 𝛴 𝛥𝛥′. So, suppose for contradiction that □−1 𝛥 is not 𝛴 -
consistent, i.e., □−1 𝛥 ⊢ 𝛴 ⊥. By Lemma 13.7, 𝛥 ⊢ 𝛴 □⊥, and since
𝛴 contains D, also 𝛥 ⊢ 𝛴 ◇⊥. But 𝛴 is normal, so 𝛴 ⊢ ¬◇⊥
(Proposition 12.4), whence also 𝛥 ⊢ 𝛴 ¬◇⊥, against the consis-
tency of 𝛥.
Now suppose 𝛴 contains T, and let 𝛥 ∈ W 𝛴 . We want to
show R 𝛴 𝛥𝛥, i.e., □−1 𝛥 ⊆ 𝛥. But if □A ∈ 𝛥 then by T also A ∈ 𝛥,
as desired.
Now suppose 𝛴 contains B, and suppose R 𝛴 𝛥𝛥′ for 𝛥,
𝛥′ ∈ W 𝛴 . We need to show that R 𝛴 𝛥′ 𝛥, i.e., □−1 𝛥′ ⊆ 𝛥. By

Lemma 13.9, this is equivalent to ◇𝛥 ⊆ 𝛥′. So suppose A ∈ 𝛥.


By B, also □◇A ∈ 𝛥. By the hypothesis that R 𝛴 𝛥𝛥′, we have that
□−1 𝛥 ⊆ 𝛥′, and hence ◇A ∈ 𝛥′, as required.

Now suppose 𝛴 contains 4, and suppose R 𝛴 𝛥1 𝛥2 and


R𝛴 𝛥2 𝛥3 . We need to show R 𝛴 𝛥1 𝛥3 . From the hypothesis we have
both □−1 𝛥1 ⊆ 𝛥2 and □−1 𝛥2 ⊆ 𝛥3 . In order to show R 𝛴 𝛥1 𝛥3 it
suffices to show □−1 𝛥1 ⊆ 𝛥3 . So let B ∈ □−1 𝛥1 , i.e., □B ∈ 𝛥1 . By
4, also □□B ∈ 𝛥1 and by hypothesis we get, first, that □B ∈ 𝛥2
and, second, that B ∈ 𝛥3 , as desired.
Now suppose 𝛴 contains 5, suppose R 𝛴 𝛥1 𝛥2 and R 𝛴 𝛥1 𝛥3 .
We need to show R 𝛴 𝛥2 𝛥3 . The first hypothesis gives □−1 𝛥1 ⊆
𝛥2 , and the second hypothesis is equivalent to ◇𝛥3 ⊆ 𝛥1 , by
Lemma 13.9. To show R 𝛴 𝛥2 𝛥3 , by Lemma 13.9, it suffices to
show ◇𝛥3 ⊆ 𝛥2 . So let ◇A ∈ ◇𝛥3 , i.e., A ∈ 𝛥3 . By the second
hypothesis ◇A ∈ 𝛥1 and by 5, □◇A ∈ 𝛥1 as well. But now the
first hypothesis gives ◇A ∈ 𝛥2 , as desired. □

As a corollary we obtain completeness results for a number


of systems. For instance, we know that S5 = KT5 = KTB4 is
complete with respect to the class of all reflexive euclidean mod-
els, which is the same as the class of all reflexive, symmetric and
transitive models.

Theorem 13.17. Let CD , CT , CB , C4 , and C5 be the class of all se-


rial, reflexive, symmetric, transitive, and euclidean models (respectively).
Then for any schemas A1 , . . . , An among D, T, B, 4, and 5, the system
KA1 . . . An is determined by the class of models C = CA1 ∩ · · · ∩ CAn .

Proposition 13.18. Let 𝛴 be a normal modal logic; then:

1. If 𝛴 contains the schema ◇A → □A then the canonical model


for 𝛴 is partially functional.

2. If 𝛴 contains the schema ◇A ↔ □A then the canonical model


for 𝛴 is functional.

3. If 𝛴 contains the schema □□A → □A then the canonical model


for 𝛴 is weakly dense.

(see Table 11.2 for definitions of these frame properties).

Proof. 1. Suppose that 𝛴 contains the schema ◇A → □A, to


show that R 𝛴 is partially functional we need to prove that
for any 𝛥1 , 𝛥2 , 𝛥3 ∈ W 𝛴 , if R 𝛴 𝛥1 𝛥2 and R 𝛴 𝛥1 𝛥3 then 𝛥2 =
𝛥3 . Since R 𝛴 𝛥1 𝛥2 we have □−1 𝛥1 ⊆ 𝛥2 and since R 𝛴 𝛥1 𝛥3
also □−1 𝛥1 ⊆ 𝛥3 . The identity 𝛥2 = 𝛥3 will follow if we can
establish the two inclusions 𝛥2 ⊆ 𝛥3 and 𝛥3 ⊆ 𝛥2 . For the
first inclusion, let A ∈ 𝛥2 ; then ◇A ∈ 𝛥1 , and by the schema
and deductive closure of 𝛥1 also □A ∈ 𝛥1 , whence by the
hypothesis that R 𝛴 𝛥1 𝛥3 , A ∈ 𝛥3 . The second inclusion is
similar.
2. This follows immediately from part (1) and the seriality
proof in Theorem 13.16.
3. Suppose 𝛴 contains the schema □□A → □A and to show
that R 𝛴 is weakly dense, let R 𝛴 𝛥1 𝛥2 . We need to show that
there is a complete 𝛴 -consistent set 𝛥3 such that R 𝛴 𝛥1 𝛥3
and R 𝛴 𝛥3 𝛥2 . Let:

𝛤 = □−1 𝛥1 ∪ ◇𝛥2 .

It suffices to show that 𝛤 is 𝛴 -consistent, for then by Lin-


denbaum’s Lemma it can be extended to a complete 𝛴 -
consistent set 𝛥3 such that □−1 𝛥1 ⊆ 𝛥3 and ◇𝛥2 ⊆ 𝛥3 , i.e.,
R 𝛴 𝛥1 𝛥3 and R 𝛴 𝛥3 𝛥2 (by Lemma 13.9).
Suppose for contradiction that 𝛤 is not consistent. Then
there are formulas □A1 , . . . , □An ∈ 𝛥1 and B 1 , . . . , Bm ∈ 𝛥2
such that

A1 , . . . ,An , ◇B 1 , . . . , ◇Bm ⊢ 𝛴 ⊥.

Since ◇(B 1 ∧ · · · ∧ Bm ) → (◇B 1 ∧ · · · ∧ ◇Bm ) is derivable


in every normal modal logic, we argue as follows, contra-
dicting the consistency of 𝛥2 :

A1 , . . . ,An , ◇B 1 , . . . , ◇Bm ⊢ 𝛴 ⊥
A1 , . . . ,An ⊢ 𝛴 (◇B 1 ∧ · · · ∧ ◇Bm ) → ⊥
      by the deduction theorem, Proposition 12.19(4), and taut
A1 , . . . ,An ⊢ 𝛴 ◇(B 1 ∧ · · · ∧ Bm ) → ⊥
      since 𝛴 is normal
A1 , . . . ,An ⊢ 𝛴 ¬◇(B 1 ∧ · · · ∧ Bm )
      by pl
A1 , . . . ,An ⊢ 𝛴 □¬(B 1 ∧ · · · ∧ Bm )
      □¬ for ¬◇
□A1 , . . . , □An ⊢ 𝛴 □□¬(B 1 ∧ · · · ∧ Bm )
      by Lemma 13.6
□A1 , . . . , □An ⊢ 𝛴 □¬(B 1 ∧ · · · ∧ Bm )
      by schema □□A → □A
𝛥1 ⊢ 𝛴 □¬(B 1 ∧ · · · ∧ Bm )
      by monotony, Proposition 12.19(1)
□¬(B 1 ∧ · · · ∧ Bm ) ∈ 𝛥1
      by deductive closure
¬(B 1 ∧ · · · ∧ Bm ) ∈ 𝛥2
      since R 𝛴 𝛥1 𝛥2 . □

On the strength of these examples, one might think that every


system 𝛴 of modal logic is complete, in the sense that it proves ev-
ery formula which is valid in every frame in which every theorem
of 𝛴 is valid. Unfortunately, there are many systems that are not
complete in this sense.

Problems
Problem 13.1. Complete the proof of Proposition 13.2.

Problem 13.2. Show that if 𝛤 is complete 𝛴 -consistent, then


◇A ∈ 𝛤 if and only if there is a complete 𝛴 -consistent 𝛥 such
that □−1 𝛤 ⊆ 𝛥 and A ∈ 𝛥. Do this without using Lemma 13.9.

Problem 13.3. Complete the proof of Proposition 13.12.


CHAPTER 14

Modal Sequent Calculus

14.1 Introduction
The sequent calculus for propositional logic can be extended by
additional rules that deal with □ and ◇. For instance, for K, we
have LK plus:

  𝛤 ⇒ 𝛥, A
  ────────────── □
  □𝛤 ⇒ ◇𝛥, □A

  A, 𝛤 ⇒ 𝛥
  ────────────── ◇
  ◇A, □𝛤 ⇒ ◇𝛥
For extensions of K, additional rules have to be added as well.
Not every modal logic has such a sequent calculus. Even S5,
which is semantically simple (it can be defined without using ac-
cessibility relations at all) is not known to have a sequent calcu-
lus that results from LK which is complete without the rule Cut.
However, it has a cut-free complete hypersequent calculus.

14.2 Rules for K


The rules for the regular propositional connectives are the same
as for regular sequent calculus LK. Axioms are also the same:
any sequent of the form A ⇒ A counts as an axiom.


For the modal operators □ and ◇, we have the following ad-


ditional rules:
  𝛤 ⇒ 𝛥, A
  ────────────── □
  □𝛤 ⇒ ◇𝛥, □A

  A, 𝛤 ⇒ 𝛥
  ────────────── ◇
  ◇A, □𝛤 ⇒ ◇𝛥
Here, □𝛤 means the sequence of formulas resulting from 𝛤 by
putting □ in front of every formula in 𝛤 and ◇𝛥 is the sequence of
formulas resulting from 𝛥 by putting ◇ in front of every formula
in 𝛥. 𝛤 and 𝛥 may be empty; in that case the corresponding part
□𝛤 and ◇𝛥 of the conclusion sequent is empty as well.
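For instance (an illustrative instance, not one taken from the text), if 𝛤 is B, C and 𝛥 is D, the two rules license the steps

  B, C ⇒ D, A
  ───────────────── □
  □B, □C ⇒ ◇D, □A

  A, B, C ⇒ D
  ───────────────── ◇
  ◇A, □B, □C ⇒ ◇D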
The restriction of adding a □ on the right and ◇ on the left
to a single formula A is necessary. If we were allowed to add □ to
any number of formulas on the right, or to add ◇ to any number of
formulas on the left, we would be able to derive:
  A ⇒ A
  ⇒ A, ¬A              ¬R
  ⇒ □A, □¬A            □∗
  ⇒ □A ∨ □¬A           ∨R

  A ⇒ A
  ¬A, A ⇒              ¬L
  ◇¬A, ◇A ⇒            ◇∗
  ◇A ⇒ ¬◇¬A            ¬R
  ⇒ ◇A → ¬◇¬A          →R
But □A ∨ □¬A and ◇A → ¬◇¬A are not valid in K.
If we allowed side formulas in addition to A in the premise,
and allowed the □ rule to add □ to only A on the right, or allowed
the ◇ rule to add ◇ to only A on the left (but did nothing to the
side formulas), we would be able to derive:
  A ⇒ A
  ⇒ A, ¬A              ¬R
  ⇒ ¬A, A              XR
  ⇒ ¬A, □A             □∗
  ⇒ ¬A ∨ □A            ∨R

  A ⇒ A
  ¬A, A ⇒              ¬L
  ◇¬A, A ⇒             ◇∗
  A ⇒ ¬◇¬A             ¬R
  ⇒ A → ¬◇¬A           →R
But ¬A ∨ □A (which is equivalent to A → □A) and A → ¬◇¬A are
not valid in K.

14.3 Sequent Derivations for K


Example 14.1. We give a sequent calculus derivation that shows
⊢ (□A ∧ □B) → □(A ∧ B).

  A ⇒ A
  B, A ⇒ A
  B ⇒ B
  B, A ⇒ B
  B, A ⇒ A ∧ B                       ∧R, from B, A ⇒ A and B, A ⇒ B
  □B, □A ⇒ □(A ∧ B)                  □
  □A ∧ □B, □A ⇒ □(A ∧ B)             ∧L
  □A, □A ∧ □B ⇒ □(A ∧ B)             XL
  □A ∧ □B, □A ∧ □B ⇒ □(A ∧ B)        ∧L
  □A ∧ □B ⇒ □(A ∧ B)                 CL
  ⇒ (□A ∧ □B) → □(A ∧ B)             →R

Example 14.2. We give a sequent calculus derivation that shows


⊢ ◇(A ∨ B) → (◇A ∨ ◇B).
  A ⇒ A
  A ⇒ A, B
  B ⇒ B
  B ⇒ A, B
  A ∨ B ⇒ A, B                       ∨L, from A ⇒ A, B and B ⇒ A, B
  ◇(A ∨ B) ⇒ ◇A, ◇B                  ◇
  ◇(A ∨ B) ⇒ ◇A, ◇A ∨ ◇B             ∨R
  ◇(A ∨ B) ⇒ ◇A ∨ ◇B, ◇A             XR
  ◇(A ∨ B) ⇒ ◇A ∨ ◇B, ◇A ∨ ◇B        ∨R
  ◇(A ∨ B) ⇒ ◇A ∨ ◇B                 CR
  ⇒ ◇(A ∨ B) → (◇A ∨ ◇B)             →R

Here is a derivation of dual.

  A ⇒ A
  ¬A, A ⇒              ¬L
  ◇¬A, □A ⇒            ◇
  □A ⇒ ¬◇¬A            ¬R
  ⇒ □A → ¬◇¬A          →R

  A ⇒ A
  ⇒ A, ¬A              ¬R
  ⇒ ¬A, A              XR
  ⇒ ◇¬A, □A            □
  ⇒ □A, ◇¬A            XR
  ¬◇¬A ⇒ □A            ¬L
  ⇒ ¬◇¬A → □A          →R

  ⇒ □A ↔ ¬◇¬A          ∧R, from the endsequents of the two derivations above

  A, 𝛤 ⇒ 𝛥
  ─────────── T□
  □A, 𝛤 ⇒ 𝛥

  𝛤 ⇒ 𝛥, A
  ─────────── T◇
  𝛤 ⇒ 𝛥, ◇A

  𝛤 ⇒ 𝛥
  ─────────── D
  □𝛤 ⇒ ◇𝛥

  𝛤, ◇𝛱 ⇒ □𝛥, 𝛬, A
  ───────────────────── B□
  □𝛤, 𝛱 ⇒ 𝛥, ◇𝛬, □A

  A, ◇𝛤, 𝛱 ⇒ □𝛬, 𝛥
  ───────────────────── B◇
  ◇A, 𝛤, □𝛱 ⇒ 𝛬, ◇𝛥

  □𝛤 ⇒ ◇𝛥, A
  ─────────────── 4□
  □𝛤 ⇒ ◇𝛥, □A

  A, □𝛤 ⇒ ◇𝛥
  ─────────────── 4◇
  ◇A, □𝛤 ⇒ ◇𝛥

  □𝛤, ◇𝛱 ⇒ □𝛥, ◇𝛬, A
  ─────────────────────── 5□
  □𝛤, ◇𝛱 ⇒ □𝛥, ◇𝛬, □A

  A, ◇𝛤, □𝛱 ⇒ ◇𝛥, □𝛬
  ─────────────────────── 5◇
  ◇A, ◇𝛤, □𝛱 ⇒ ◇𝛥, □𝛬

Table 14.1: More modal rules.

14.4 Rules for Other Accessibility


Relations
In order to deal with logics determined by special accessibility
relations, we consider the additional rules in Table 14.1.
Adding these rules results in systems that are sound and com-
plete for the logics given in Table 14.2.

Example 14.3. We give a sequent derivation that shows K4 ⊢ 4,


i.e., □A → □□A.
  □A ⇒ □A
  □A ⇒ □□A             4□
  ⇒ □A → □□A           →R

Logic      R is . . .                            Rules
T = KT     reflexive                             □, T□, T◇
D = KD     serial                                □, D
K4         transitive                            □, 4□, 4◇
B = KTB    reflexive, symmetric                  □, T□, T◇, B□, B◇
S4 = KT4   reflexive, transitive                 □, T□, T◇, 4□, 4◇
S5 = KT5   reflexive, transitive, euclidean      □, T□, T◇, 5□, 5◇

Table 14.2: Sequent rules for various modal logics.

Example 14.4. We give a sequent derivation that shows S5 ⊢ 5,


i.e., ◇A → □◇A.
  ◇A ⇒ ◇A
  ◇A ⇒ □◇A             5□
  ⇒ ◇A → □◇A           →R

Example 14.5. The sequent calculus for S5 is not complete with-


out the Cut rule; e.g., ◇□A →A, which is valid in S5, has no proof
without Cut. Here is a derivation using Cut:
  □A ⇒ □A
  ◇□A ⇒ □A             5◇

  A ⇒ A
  □A ⇒ A               T□

  ◇□A ⇒ A              Cut, from ◇□A ⇒ □A and □A ⇒ A
  ⇒ ◇□A → A            →R

Problems
Problem 14.1. Find sequent calculus proofs in K for the follow-
ing formulas:

1. □¬p → □(p → q )

2. (□p ∨ □q ) → □(p ∨ q )

3. ◇p → ◇(p ∨ q )

4. □(p ∧ q ) → □p

Problem 14.2. Give sequent derivations that show the following:

1. KT5 ⊢ B;

2. KT5 ⊢ 4;

3. KDB4 ⊢ T;

4. KB4 ⊢ 5;

5. KB5 ⊢ 4;

6. KT ⊢ D.
PART V

But you can’t tell me what to think!
CHAPTER 15

Epistemic Logics

15.1 Introduction
Just as modal logic deals with modal propositions and the entail-
ment relations among them, epistemic logic deals with epistemic
propositions and the entailment relations among them. Rather
than interpreting the modal operators as representing possibility
and necessity, the unary connectives are interpreted in epistemic
or doxastic ways, to model knowledge and belief. For example,
we might want to express claims like the following:

1. Richard knows that Calgary is in Alberta.

2. Audrey thinks it is possible that a dog is on the couch.

3. Richard knows that Audrey knows that her class is on Tues-


days.

4. Everyone knows that a year has 12 months.

Contemporary epistemic logic is often traced to Jaakko Hintikka’s


Knowledge and Belief, from 1962, and it was written at a time when
possible worlds semantics were becoming increasingly widely used


in logic. In fact, epistemic logics use most of the same seman-


tic tools as other modal logics, but will interpret them differently.
The main change is in what we take the accessibility relations to
represent. In epistemic logics, they represent some form of epistemic
possibility. We’ll see that the epistemic notion that we’re modelling
will affect the constraints that we want to place on the accessibil-
ity relation. And we’ll also see what happens to correspondence
theory when it is given an epistemic interpretation. You’ll no-
tice that the examples above mention two agents: Richard and
Audrey, and the relationship between the things that each one
knows. The epistemic logics we’ll consider will be multi-agent
logics, in which such things can be expressed. In contrast, a
single-agent epistemic logic would only talk about what one indi-
vidual knows or believes.

15.2 The Language of Epistemic Logic

Definition 15.1. Let G be a set of agent-symbols. The basic


language of multi-agent epistemic logic contains

1. The propositional constant for falsity ⊥.

2. A countably infinite set of propositional variables: p0 , p1 ,


p2 , . . .

3. The propositional connectives: ¬ (negation), ∧ (conjunc-


tion), ∨ (disjunction), → (conditional).

4. The knowledge operator Ka where a ∈ G .

If we are only concerned with the knowledge of a single agent


in our system, we can drop the reference to the set G , and indi-
vidual agents. In that case, we only have the basic operator K.

Definition 15.2. Formulas of the epistemic language are induc-



tively defined as follows:

1. ⊥ is an atomic formula.

2. Every propositional variable pi is an (atomic) formula.

3. If A is a formula, then ¬A is a formula.

4. If A and B are formulas, then (A ∧ B) is a formula.

5. If A and B are formulas, then (A ∨ B) is a formula.

6. If A and B are formulas, then (A → B) is a formula.

7. If A is a formula and a ∈ G , then Ka A is a formula.

8. Nothing else is a formula.

If a formula A does not contain Ka , we say it is modal-free.

Definition 15.3. While the K operator is intended to symbolize


individual knowledge, E, often read as “everybody knows,” sym-
bolizes group knowledge. Where G ′ ⊆ G , we define EG ′ A as an
abbreviation for ⋀b ∈G ′ Kb A.

We can also define an even stronger sense of knowledge,


namely common knowledge among a group of agents G . When
a piece of information is common knowledge among a group
of agents, it means that for every combination of agents in that
group, they all know that each other knows that each other knows
. . . ad infinitum. This is significantly stronger than group knowl-
edge, and it is easy to come up with relational models in which
a formula is group knowledge, but not common knowledge. We
will use CG A to symbolize “it is common knowledge among G
that A.”

15.3 Relational Models


The basic semantic concept for epistemic logics is the same as
that of ordinary modal logics. Relational models still consist of
a set of worlds, and an assignment that determines which propo-
sitional variables count as “true” at which worlds. And if we are
only dealing with a single agent, we have a single accessibility re-
lation as usual. However, if we have a multi-agent epistemic logic,
then our single accessibility relation becomes a set of accessibility
relations, one for each a in our set of agent symbols G .
A relational model consists of a set of worlds, which are related
by binary accessibility relations—one for each agent—together
with an assignment which determines which propositional vari-
ables are true at which worlds.

Definition 15.4. A model for the multi-agent epistemic language


is a triple M = ⟨W,R,V ⟩, where

1. W is a nonempty set of “worlds,”

2. For each a ∈ G , R a is a binary accessibility relation on W ,


and

3. V is a function assigning to each propositional variable p


a set V (p) of possible worlds.

When R a ww ′ holds, we say that w ′ is accessible by a from w. When


w ∈ V (p) we say p is true at w.

The mechanics are just like the mechanics for normal modal
logic, just with more accessibility relations added in. For a given
agent, we will generally interpret their accessibility relation as
representing something about their informational states. For ex-
ample, we often treat R a ww ′ as expressing that w ′ is consistent
with a’s information at w. Or to put it another way, at w, they
cannot tell the difference between world w and world w ′.

15.4 Truth at a World


Just as with normal modal logic, every epistemic model deter-
mines which formulas count as true at which worlds in it. We use
the same notation “model M makes formula A true at world w”
for the basic notion of relational semantics. The relation is de-
fined inductively and is identical to the normal modal case for all
non-modal operators.

Definition 15.5. Truth of a formula A at w in a model M, in symbols:


M,w ⊩ A, is defined inductively as follows:

1. A ≡ ⊥: Never M,w ⊩ ⊥.

2. M,w ⊩ p iff w ∈ V (p)

3. A ≡ ¬B: M,w ⊩ A iff M,w ⊮ B.

4. A ≡ (B ∧ C ): M,w ⊩ A iff M,w ⊩ B and M,w ⊩ C .

5. A ≡ (B ∨C ): M,w ⊩ A iff M,w ⊩ B or M,w ⊩ C (or both).

6. A ≡ (B → C ): M,w ⊩ A iff M,w ⊮ B or M,w ⊩ C .

7. A ≡ Ka B: M,w ⊩ A iff M,w ′ ⊩ B for all w ′ ∈ W with


R a ww ′
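To see how the clauses of Definition 15.5 work together, here is a minimal Python sketch of a model checker for the multi-agent epistemic language. The encoding of formulas as nested tuples, the dictionary representation of models, and the sample two-world model are all hypothetical choices made for illustration; none of them come from the text.

```python
# Formulas as nested tuples: ("p", "q") for the variable q, ("not", A),
# ("and", A, B), ("or", A, B), ("->", A, B), ("K", "a", A), ("bot",).

def holds(M, w, A):
    """Return True iff M, w ||- A, following the clauses of Definition 15.5."""
    W, R, V = M["W"], M["R"], M["V"]    # worlds, {agent: set of pairs}, valuation
    op = A[0]
    if op == "bot":
        return False
    if op == "p":
        return w in V[A[1]]
    if op == "not":
        return not holds(M, w, A[1])
    if op == "and":
        return holds(M, w, A[1]) and holds(M, w, A[2])
    if op == "or":
        return holds(M, w, A[1]) or holds(M, w, A[2])
    if op == "->":
        return (not holds(M, w, A[1])) or holds(M, w, A[2])
    if op == "K":                        # K_a B: B holds at every world
        a, B = A[1], A[2]                # accessible for agent a from w
        return all(holds(M, v, B) for v in W if (w, v) in R[a])
    raise ValueError(f"unknown operator: {op}")

# A hypothetical two-world model: agent a cannot tell w1 and w2 apart.
M = {
    "W": {"w1", "w2"},
    "R": {"a": {("w1", "w1"), ("w1", "w2"), ("w2", "w1"), ("w2", "w2")}},
    "V": {"q": {"w1"}},
}
print(holds(M, "w1", ("p", "q")))              # True: q is true at w1
print(holds(M, "w1", ("K", "a", ("p", "q"))))  # False: a cannot rule out w2
```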

Here’s where we need to think about restrictions on our acces-


sibility relations, though. After all, by clause (7), a formula Ka B is
true at w whenever there are no w ′ with R a ww ′. This is the same
clause as in normal modal logic; when a world has no successors,
all □-formulas are vacuously true there. This seems extremely
counterintuitive if we think about K as representing knowledge.
After all, we tend to think that there are no circumstances under
which an agent might know both A and ¬A at the same time.
One solution is to ensure that our accessibility relation in
epistemic logic will always be reflexive. This roughly corresponds
to the idea that the actual world is consistent with an agent’s

[Figure 15.1: A simple epistemic model. Three worlds: w 1 (p true, q false), w 2 (p and q true), and w 3 (p and q false). Each world carries reflexive arrows for a and b; w 1 and w 2 are linked by an a-arrow, and w 1 and w 3 by a b-arrow.]

information. In fact, epistemic logics typically use S5, but others


might use weaker systems depending on what exactly they want
the Ka relation to represent.
Now that we have given our basic definition of truth at a
world, the other semantic concepts from normal modal logic,
such as modal validity and entailment, simply carry over, ap-
plied to this new way of thinking about the interpretation for the
modal operators.
We are now also in a position to give truth conditions for the
common knowledge operator CG . Recall from appendix B.6 that
the transitive closure R + of a relation R is defined as
R + = ⋃n ∈N R n ,

where

R 0 = R and
R n+1 = {⟨x, z ⟩ : ∃y (R n xy ∧ Ryz )}.

If R is . . .                                      then . . . is true in M:
(any R)                                            K (p → q ) → ( Kp → Kq )      (Closure)
reflexive: ∀wRww                                   Kp → p                        (Veridicality)
transitive: ∀u∀v∀w ((Ruv ∧ Rvw) → Ruw)             Kp → KKp                      (Positive Introspection)
euclidean: ∀w∀u∀v ((Rwu ∧ Rwv ) → Ruv )            ¬Kp → K¬Kp                    (Negative Introspection)

Table 15.1: Four epistemic principles.

Then, where G is a group of agents, we define RG = (⋃b ∈G Rb ) + to
be the transitive closure of the union of all agents’ accessibility
relations.

Definition 15.6. If G ′ ⊆ G , we let M,w ⊩ CG ′ A iff for every w ′


such that RG ′ww ′, M,w ′ ⊩ A.
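The transitive closure construction can also be phrased computationally. The Python sketch below (hypothetical helper functions, building on the holds evaluator sketched after Definition 15.5) computes RG for a finite model and uses it to check the group operators EG and CG.

```python
def transitive_closure(R):
    """Transitive closure R+ of a binary relation given as a set of pairs."""
    closure = set(R)
    while True:
        step = {(x, z) for (x, y) in closure for (y2, z) in closure if y == y2}
        if step <= closure:
            return closure
        closure |= step

def group_relation(M, group):
    """R_G: the transitive closure of the union of the agents' relations."""
    return transitive_closure(set().union(*(M["R"][a] for a in group)))

def everybody_knows(M, w, group, B):
    """E_G B at w: every agent in the group knows B (Definition 15.3)."""
    return all(holds(M, w, ("K", a, B)) for a in group)   # holds: see above

def common_knowledge(M, w, group, B):
    """C_G B at w (Definition 15.6): B holds at every R_G-reachable world."""
    RG = group_relation(M, group)
    return all(holds(M, v, B) for v in M["W"] if (w, v) in RG)
```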

15.5 Accessibility Relations and Epistemic


Principles
Given what we already know about frame correspondence in nor-
mal modal logics, we might want to see what the characteristic
formulas look like given epistemic interpretations. We have al-
ready said that epistemic logics are typically interpreted in S5.
So let’s take a look at how various epistemic principles are rep-
resented, and consider how they correspond to various frame
conditions.
Recall from normal modal logic, that different modal formu-
las characterized different properties of accessibility relations.
This table picks out a few that correspond to particular epistemic
principles.
Veridicality, corresponding to the T axiom, is often treated as
the most uncontroversial of these principles, as it represents the

claim that if a formula is known, then it must be true. Closure,


as well as Positive and Negative Introspection are much more
contested.
Closure, corresponding to the K axiom, represents the idea
that an agent’s knowledge is closed under implication. This might
seem plausible to us in some cases. For instance, I might know
that if I am in Victoria, then I am on Vancouver Island. Barring
odd skeptical scenarios, I do know that I am in Victoria, and
this should also suggest that I know I am on Vancouver Island.
So in this case, the logical closure of my knowledge might seem
relatively intuitive. On the other hand, we do not always think
through the consequences of our knowledge, and so this might
lead to less intuitive results in other cases.
Positive Introspection, sometimes known as the KK-principle,
is sometimes articulated as the statement that if I know some-
thing, then I know that I know. It is the epistemic counterpart
of the 4 axiom. Correspondingly, negative introspection is articu-
lated as the statement that if I don’t know something, then I know
that I don’t know it, which is the counterpart of the 5 axiom. Both
of these seem to admit of relatively ordinary counterexamples, in
which I am unsure whether or not I know something that I do in
fact know.

15.6 Bisimulations
One remaining question that we might have about the expressive
power of our epistemic language has to do with the relationship
between models and the formulas that hold in them. We have
seen from our frame correspondence results that when certain
formulas are valid in a frame, they will also ensure that those
frames satisfy certain properties. But does our modal language,
for example, allow us to distinguish between a world at which
there is a reflexive arrow, and an infinite chain of worlds, each
of which leads to the next? That is, is there any formula A that
might hold at only one of these two worlds?

Bisimulation is a relationship that we can define between rela-


tional models to say that they have effectively the same structure.
And as we will see, it captures exactly the kind of equivalence
between models that our epistemic language is able to express.

Definition 15.7 (Bisimulation). Let M 1 = ⟨W1 ,R 1 ,V1 ⟩ and


M2 = ⟨W2 ,R 2 ,V2 ⟩ be two relational models. And let R ⊆ W1 ×W2
be a binary relation. We say that R is a bisimulation when for
every ⟨w 1 ,w 2 ⟩ ∈ R, we have:

1. w 1 ∈ V1 (p) iff w 2 ∈ V2 (p) for all propositional variables p.

2. For all agents a ∈ G and worlds v 1 ∈ W1 , if R 1a w 1v 1 then


there is some v 2 ∈ W2 such that R 2a w 2v 2 , and ⟨v 1 ,v 2 ⟩ ∈ R.

3. For all agents a ∈ G and worlds v 2 ∈ W2 , if R 2a w 2v 2 then


there is some v 1 ∈ W1 such that R 1a w 1v 1 , and ⟨v 1 ,v 2 ⟩ ∈ R.

When there is a bisimulation between M1 and M2 that links


worlds w 1 and w 2 , we can also write ⟨M1 ,w 1 ⟩ - ⟨M2 ,w 2 ⟩, and
call ⟨M 1 ,w 1 ⟩ and ⟨M2 ,w 2 ⟩ bisimilar.

The different clauses in the bisimulation relation ensure dif-


ferent things. Clause 1 ensures that bisimilar worlds will satisfy
the same modal-free formulas, since it ensures agreement on all
propositional variables. The other two clauses, sometimes re-
ferred to as “forth” and “back,” respectively, ensure that the ac-
cessibility relations will have the same structure.
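For finite models the three clauses of Definition 15.7 can be checked mechanically. Here is a Python sketch, assuming the same hypothetical model encoding as in the earlier sketches; Z is the candidate relation, given as a set of pairs of worlds.

```python
def is_bisimulation(M1, M2, Z, agents):
    """Check the atomic, forth, and back clauses of Definition 15.7 for Z."""
    for (w1, w2) in Z:
        # Clause 1: w1 and w2 agree on every propositional variable.
        for p in set(M1["V"]) | set(M2["V"]):
            if (w1 in M1["V"].get(p, set())) != (w2 in M2["V"].get(p, set())):
                return False
        for a in agents:
            # Clause 2 ("forth"): every a-successor of w1 has a Z-partner in M2.
            for v1 in M1["W"]:
                if (w1, v1) in M1["R"][a]:
                    if not any((w2, v2) in M2["R"][a] and (v1, v2) in Z
                               for v2 in M2["W"]):
                        return False
            # Clause 3 ("back"): every a-successor of w2 has a Z-partner in M1.
            for v2 in M2["W"]:
                if (w2, v2) in M2["R"][a]:
                    if not any((w1, v1) in M1["R"][a] and (v1, v2) in Z
                               for v1 in M1["W"]):
                        return False
    return True
```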

Theorem 15.8. If ⟨M1 ,w 1 ⟩ - ⟨M2 ,w 2 ⟩, then for every formula A,


we have that M1 ,w 1 ⊩ A iff M2 ,w 2 ⊩ A.

Even though the two models pictured in Figure 15.2 aren’t


quite the same as each other, there is a bisimulation linking
worlds w 1 and v 1 . This bisimulation will also link both w 2 and
w 3 to v 2 , with the idea being that there is nothing expressible
in our modal language that can really distinguish between them.

[Figure 15.2: Two bisimilar models. On the left, a model with worlds w 1 , w 2 , and w 3 ; on the right, a model with worlds v 1 and v 2 . All accessibility arrows are labelled a.]

The situation would be different if w 2 and w 3 satisfied different


propositional variables, however.

15.7 Public Announcement Logic


Dynamic epistemic logics allow us to represent the ways in which
agents’ knowledge changes over time, or as they gain new in-
formation. Many of these represent changes in knowledge using
informational events or updates. The most basic kind of update is
a public announcement in which some formula is truthfully an-
nounced and all of the agents witness this taking place together.
To do this, we expand the language as follows:

Definition 15.9. Let G be a set of agent-symbols. The basic lan-


guage of multi-agent epistemic logic with public announcements
contains

1. The propositional constant for falsity ⊥.

2. A countably infinite set of propositional variables: p0 , p1 ,



p2 , . . .

3. The propositional connectives: ¬ (negation), ∧ (conjunc-


tion), ∨ (disjunction), → (conditional)

4. The knowledge operator Ka where a ∈ G .

5. The public announcement operator [B] where B is a for-


mula.

The public announcement operator functions as a box opera-


tor, and our inductive definition of the language is given accord-
ingly:

Definition 15.10. Formulas of the epistemic language are induc-


tively defined as follows:

1. ⊥ is an atomic formula.

2. Every propositional variable pi is an (atomic) formula.

3. If A is a formula, then ¬A is a formula.

4. If A and B are formulas, then (A ∧ B) is a formula.

5. If A and B are formulas, then (A ∨ B) is a formula.

6. If A and B are formulas, then (A → B) is a formula.

7. If A is a formula and a ∈ G , then Ka A is a formula.

8. If A and B are formulas, then [A]B is a formula.

9. Nothing else is a formula.

The intended reading of the formula [A]B is “After A is truth-


fully announced, B holds.” It will sometimes also be useful to
talk about common knowledge in the context of public announce-
ments, so the language may also include the common knowledge
operator CG A.

15.8 Semantics of Public Announcement Logic
Relational models for public announcement logics are the same
as they were in epistemic logics. However, the semantics for the
public announcement operator are something new.

Definition 15.11. Truth of a formula A at w in a model M = ⟨W,R,V ⟩,


in symbols: M,w ⊩ A, is defined inductively as follows:

1. A ≡ ⊥: Never M,w ⊩ ⊥.

2. M,w ⊩ p iff w ∈ V (p)

3. A ≡ ¬B: M,w ⊩ A iff M,w ⊮ B.

4. A ≡ (B ∧ C ): M,w ⊩ A iff M,w ⊩ B and M,w ⊩ C .

5. A ≡ (B ∨C ): M,w ⊩ A iff M,w ⊩ B or M,w ⊩ C (or both).

6. A ≡ (B → C ): M,w ⊩ A iff M,w ⊮ B or M,w ⊩ C .

7. A ≡ Ka B: M,w ⊩ A iff M,w ′ ⊩ B for all w ′ ∈ W with


R a ww ′

8. A ≡ [B]C : M,w ⊩ A iff M,w ⊩ B implies M | B,w ⊩ C


Where M | B = ⟨W ′,R ′,V ′⟩ is defined as follows:

a) W ′ = {u ∈ W : M,u ⊩ B }. So the worlds of M | B are


the worlds in M at which B holds.
b) R a′ = R a ∩ (W ′ × W ′). Each agent’s accessibility re-
lation is simply restricted to the worlds that remain
in W ′.
c) V ′ (p) = {u ∈ W ′ : u ∈ V (p)}. Similarly, the proposi-
tional valuations at worlds remain the same, represent-
ing the idea that informational events will not change
the truth value of propositional variables.
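For finite models, the restricted model M | B of clause (8) is easy to compute. The Python sketch below again assumes the hypothetical encodings and the holds evaluator from the earlier sketches.

```python
def update(M, B):
    """M | B: restrict M to the worlds at which B holds (clause 8)."""
    W2 = {u for u in M["W"] if holds(M, u, B)}
    R2 = {a: {(u, v) for (u, v) in Ra if u in W2 and v in W2}
          for a, Ra in M["R"].items()}
    V2 = {p: Vp & W2 for p, Vp in M["V"].items()}
    return {"W": W2, "R": R2, "V": V2}

def announcement(M, w, B, C):
    """[B]C at w: if B is true at w, then C holds at w in the updated model."""
    return (not holds(M, w, B)) or holds(update(M, B), w, C)
```

Evaluating the example of Figure 15.3, discussed below, amounts to calling announcement with that model, the world w 1 , the announced formula p, and the consequent Kb p.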

[Figure 15.3: Before and after the public announcement of p. In M (left), w 1 makes p true and q false, w 2 makes both false, and w 3 makes both true; w 1 is linked to w 2 by a b-arrow and to w 3 by an a-arrow, and each world has reflexive arrows for a and b. In the updated model M | p (right), only the p-worlds w 1′ and w 3′ remain.]

What is distinctive, then, about public announcement logics,


is that the truth of a formula at M can sometimes only be decided
by referring to a model other than M itself.
Notice also that our semantics treats the announcement op-
erator as a □ operator, and so if a formula A cannot be truthfully
announced at a world, then [A]B will hold there trivially, just as
all □ formulas hold at endpoints.
We can see the public announcement of a formula as shrink-
ing a model, or restricting it to the worlds at which the formula
was true. Figure 15.3 gives an example of the effects of publicly
announcing p. One notable thing about that model is that agent b
learns that p as a result of the announcement, while agent a does
not (since a already knew that p was true).
More formally, we have M,w 1 ⊩ ¬Kb p but M | p,w 1′ ⊩ Kb p.
This implies that M,w 1 ⊩ [p] Kb p. But we have some even
stronger claims that we can make about the result of the an-
nouncement. In fact, it is the case that M,w 1 ⊩ [p] C {a,b } p. In
other words, after p is announced, it becomes common knowledge.

We might wonder, though, whether this holds in the general


case, and whether a truthful announcement of A will always result
in A becoming common knowledge. It may be surprising that the
answer is no. And in fact, it is possible to truthfully announce
formulas that will no longer be true once they are announced.
For example, consider the effects of announcing p ∧ ¬Kb p at w 1
in Figure 15.3. In fact, M | p and M | (p ∧ ¬Kb p) are the same
model. However, as we have already noted, M | p,w 1′ ⊩ Kb p.
Therefore, M | (p ∧ ¬Kb p),w 1′ ⊩ ¬(p ∧ ¬Kb p), so this is a formula
that becomes false once it has been announced.
PART VI

Is this going to go on forever?
CHAPTER 16

Temporal Logics

16.1 Introduction
Temporal logics deal with claims about things that will or have
been the case. Arthur Prior is credited as the originator of tem-
poral logic, which he called tense logic. Our treatment of tempo-
ral logic here will largely follow Prior’s original modal treatment
of introducing temporal operators into the basic framework of
propositional logic, which treats claims as generally lacking in
tense.
For example, in propositional logic, I might talk about a dog,
Beezie, who sometimes sits and sometimes doesn’t sit, as dogs
are wont to do. It would be contradictory in classical logic to
claim that Beezie is sitting and also that Beezie is not sitting. But
obviously both can be true, just not at the same time; adding
temporal operators to the language can allow us to express that
claim relatively easily. The addition of temporal operators also
allows us to account for the validity of inferences like the one
from “Beezie will get a treat or a ball" to “Beezie will get a treat
or Beezie will get a ball."
However, a lot of philosophical issues arise with temporal
logic that might lead us to adopt one framework of temporal

199

logic over another. For example, a future contingent is a state-


ment about the future that is neither necessary nor impossible.
If we say “Richard will go to the grocery store tomorrow," we are
expressing a claim about something that has not yet happened,
and whose truth value is contestable. In fact, it is contestable
whether that claim can even be assigned a truth value in the first
place. If we are strict determinists, then perhaps we can be com-
fortable with the idea that this sentence is in fact true or false,
even before the event in question is supposed to take place—it
just may be that we do not know its truth value yet. In contrast,
we might believe in a genuinely open future, in which the truth
values of future contingents are undetermined.
As it turns out, a lot of these commitments about the struc-
ture and nature of time are built in to our choices of models
and frameworks of temporal logics. For example, we might ask
ourselves whether we should construct models in which time is
linear, branching or even circular. We might have to make de-
cisions about whether our temporal models will have beginning
and end points, and whether time is to be represented using dis-
crete instants or as a continuum.

16.2 Semantics for Temporal Logic

Definition 16.1. The basic language of temporal logic contains

1. The propositional constant for falsity ⊥.

2. A countably infinite set of propositional variables: p0 , p1 ,


p2 , . . .

3. The propositional connectives: ¬ (negation), ∧ (conjunc-


tion), ∨ (disjunction), → (conditional).

4. Past operators P and H.

5. Future operators F and G.



Later on, we will discuss the potential addition of other kinds


of modal operators.

Definition 16.2. Formulas of the temporal language are induc-


tively defined as follows:

1. ⊥ is an atomic formula.

2. Every propositional variable pi is an (atomic) formula.

3. If A is a formula, then ¬A is a formula.

4. If A and B are formulas, then (A ∧ B) is a formula.

5. If A and B are formulas, then (A ∨ B) is a formula.

6. If A and B are formulas, then (A → B) is a formula.

7. If A is a formula, then PA, HA, F A, GA are all formulas.

8. Nothing else is a formula.

The semantics of temporal logics are given in terms of rela-


tional models, as with other kinds of intensional logics.

Definition 16.3. A model for the temporal language is a triple M =


⟨T , ≺,V ⟩, where

1. T is a nonempty set, interpreted as points in time.

2. ≺ is a binary relation on T .

3. V is a function assigning to each propositional variable p


a set V (p) of points in time.

When t ≺ t ′ holds, we say that t precedes t ′. When t ∈ V (p) we


say p is true at t .

For now, you will notice that we do not impose any conditions
on our precedence relation ≺. This means that at present, there

are no restrictions on the structure of our temporal models, so


we could have models in which time is linear, branching, circular,
or has any structure whatsoever.
Just as with normal modal logic, every temporal model deter-
mines which formulas count as true at which points in it. We use
the same notation “model M makes formula A true at point t ”
for the basic notion of relational semantics. The relation is de-
fined inductively and is identical to the normal modal case for all
non-modal operators.

Definition 16.4. Truth of a formula A at t in a model M, in symbols:


M,t ⊩ A, is defined inductively as follows:

1. A ≡ ⊥: Never M,t ⊩ ⊥.

2. M,t ⊩ p iff t ∈ V (p)

3. A ≡ ¬B: M,t ⊩ A iff M,t ⊮ B.

4. A ≡ (B ∧ C ): M,t ⊩ A iff M,t ⊩ B and M,t ⊩ C .

5. A ≡ (B ∨ C ): M,t ⊩ A iff M,t ⊩ B or M,t ⊩ C (or both).

6. A ≡ (B → C ): M,t ⊩ A iff M,t ⊮ B or M,t ⊩ C .

7. A ≡ PB: M,t ⊩ A iff M,t ′ ⊩ B for some t ′ ∈ T with t ′ ≺ t

8. A ≡ HB: M,t ⊩ A iff M,t ′ ⊩ B for every t ′ ∈ T with t ′ ≺ t

9. A ≡ FB: M,t ⊩ A iff M,t ′ ⊩ B for some t ′ ∈ T with t ≺ t ′

10. A ≡ GB: M,t ⊩ A iff M,t ′ ⊩ B for every t ′ ∈ T with t ≺ t ′
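Definition 16.4 likewise lends itself to a small model checker for finite temporal models. The Python sketch below uses a hypothetical encoding in which the precedence relation is stored as a set of pairs under the key "prec"; none of these names come from the text.

```python
def t_holds(M, t, A):
    """Return True iff M, t ||- A, following the clauses of Definition 16.4."""
    T, prec, V = M["T"], M["prec"], M["V"]   # times, precedence pairs, valuation
    op = A[0]
    if op == "bot":
        return False
    if op == "p":
        return t in V[A[1]]
    if op == "not":
        return not t_holds(M, t, A[1])
    if op == "and":
        return t_holds(M, t, A[1]) and t_holds(M, t, A[2])
    if op == "or":
        return t_holds(M, t, A[1]) or t_holds(M, t, A[2])
    if op == "->":
        return (not t_holds(M, t, A[1])) or t_holds(M, t, A[2])
    if op == "P":    # at some earlier point
        return any(t_holds(M, s, A[1]) for s in T if (s, t) in prec)
    if op == "H":    # at every earlier point
        return all(t_holds(M, s, A[1]) for s in T if (s, t) in prec)
    if op == "F":    # at some later point
        return any(t_holds(M, s, A[1]) for s in T if (t, s) in prec)
    if op == "G":    # at every later point
        return all(t_holds(M, s, A[1]) for s in T if (t, s) in prec)
    raise ValueError(f"unknown operator: {op}")
```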

Based on the semantics, you might be able to see that the


operators P and H are duals, as well as the operators F and G,
such that we could define HA as ¬P¬A, and the same with G
and F.

If ≺ is . . .                                          then . . . is true in M:
transitive: ∀u∀v∀w ((u ≺ v ∧ v ≺ w) → u ≺ w)           FFp → Fp
linear: ∀w∀v (w ≺ v ∨ w = v ∨ v ≺ w)                   ( FPp ∨ PFp) → ( Pp ∨ p ∨ Fp)
dense: ∀w∀v (w ≺ v → ∃u (w ≺ u ∧ u ≺ v ))              Fp → FFp
unbounded (past): ∀w∃v (v ≺ w)                         Hp → Pp
unbounded (future): ∀w∃v (w ≺ v )                      Gp → Fp

Table 16.1: Some temporal frame correspondence properties.

16.3 Properties of Temporal Frames


Given that our temporal models do not impose any conditions
on the relation ≺, the only one of our familiar axioms that holds
in all models is K , or its analogues KG and K H :

G (p → q ) → ( Gp → Gq ) (KG )
H (p → q ) → ( Hp → Hq ) (KH )

However, if we want our models to impose stricter conditions


on how time is represented, for instance by ensuring that ≺ is
a linear order, then we will end up with other validities in our
models.
Several of the properties from Table 16.1 might seem like de-
sirable features for a model that is intended to represent time.
However, it is worth noting that, even though we can impose
whichever conditions we like on the ≺ relation, not all conditions
correspond to formulas that can be expressed in the language
of temporal logic. For example, irreflexivity, or the idea that

∀w¬(w ≺ w), does not have a corresponding formula in tem-


poral logic.

16.4 Additional Operators for Temporal Logic
In addition to the unary operators for past and future, temporal
logics also sometimes include binary operators S and U, intended
to symbolize “since” and “until”. This means adding S and U into
the language of temporal logic and adding the following clause
into the definition of a temporal formula:

If A and B are formulas, then ( SAB) and ( UAB) are both


formulas.

The semantics for these operators are then given as follows:

Definition 16.5. Truth of a formula A at t in a model M:

1. A ≡ SBC : M,t ⊩ A iff M,t ′ ⊩ B for some t ′ ∈ T with


t ′ ≺ t , and for all s with t ′ ≺ s ≺ t , M,s ⊩ C

2. A ≡ UBC : M,t ⊩ A iff M,t ′ ⊩ B for some t ′ ∈ T with


t ≺ t ′, and for all s with t ≺ s ≺ t ′, M,s ⊩ C

The intuitive reading of SBC is “Since B was the case, C has


been the case.” And the intuitive reading of UBC is “Until B will
be the case, C will be the case.”
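Under the same hypothetical encoding as the temporal sketch above, the two clauses of Definition 16.5 can be written as follows; t_holds is assumed to be an evaluator for the rest of the language, passed in as a parameter.

```python
def t_holds_SU(M, t, A, t_holds):
    """The S and U clauses of Definition 16.5, on top of an evaluator t_holds."""
    T, prec = M["T"], M["prec"]
    op, B, C = A
    if op == "S":   # B held at some earlier u, with C at every point strictly between
        return any(t_holds(M, u, B) and
                   all(t_holds(M, s, C)
                       for s in T if (u, s) in prec and (s, t) in prec)
                   for u in T if (u, t) in prec)
    if op == "U":   # B will hold at some later u, with C at every point strictly between
        return any(t_holds(M, u, B) and
                   all(t_holds(M, s, C)
                       for s in T if (t, s) in prec and (s, u) in prec)
                   for u in T if (t, u) in prec)
    raise ValueError(f"expected S or U, got: {op}")
```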

16.5 Possible Histories


The relational models of temporal logic that we have been us-
ing are extremely flexible, since we do not have to place any re-
strictions on the accessibility relation. This means that temporal
models can branch in the past and in the future, but we might
want to consider a more “modal” conception of branching, in

which we consider sequences of events as possible histories. This


does not necessarily require changing our language, though we
might also add our “ordinary” modal operators □ and ◇, and
we could also consider adding epistemic accessibility relations to
represent changes in agents’ knowledge over time.

Definition 16.6. A possible histories model for the temporal lan-


guage is a triple M = ⟨T ,C ,V ⟩, where

1. T is a nonempty set, interpreted as states in time.

2. C is a set of computational paths, or possible histories of a


system. In other words, C is a set of sequences 𝜎 of states
s 1 , s 2 , s 3 , . . . , where every si ∈ T .

3. V is a function assigning to each propositional variable p


a set V (p) of points in time.

To make things simpler, we will also generally assume that when


a history is in C , then so are all of its suffixes. For example, if
s 1 , s2 , s3 is a sequence in C , then so are s 2 , s3 and s 3 . Also, when
two states si and s j appear in a sequence 𝜎, we say that si ≺𝜎 s j
when i < j . When t ∈ V (p) we say p is true at t .

The one relevant change is that when we evaluate the truth


of a formula at a point in time t in a model M, we do so relative
to a history 𝜎, in which t appears as a state. We do not need
to change any of the semantics for propositional variables or for
truth-functional connectives, though. All of those are exactly as
they were in Definition 16.4, since none of those will make refer-
ence to 𝜎. However, we now redefine our future operator F and
add our ◇ operator with respect to these histories.

Definition 16.7. Truth of a formula A at t , 𝜎 in M, in symbols:


M,t , 𝜎 ⊩ A:

1. A ≡ FB: M,t , 𝜎 ⊩ A iff M,t ′, 𝜎 ⊩ B for some t ′ ∈ T such



that t ≺𝜎 t ′.

2. A ≡ ◇B: M,t , 𝜎 ⊩ A iff M,t , 𝜎 ′ ⊩ B for some 𝜎 ′ ∈ C in


which t occurs.

Other temporal and modal operators can be defined similarly.


However, we can now represent claims that combine tense and
modality. For example, we might symbolize “p will not occur, but
it might have occurred” using the formula ¬Fp ∧ ◇ Fp. This would
hold at a point and a history at which p does not become true
at a successor state, but there is an alternative history at which p
will become true.
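As a rough sketch, and assuming histories are encoded as Python lists of states (so that ≺𝜎 is given by list position) and that M["C"] collects the histories, the two history-relative clauses of Definition 16.7 might look as follows; eval_at stands for the full evaluator and is a hypothetical name.

```python
def f_holds(M, t, sigma, B, eval_at):
    """F B at t relative to sigma: B at some strictly later state of sigma."""
    i = sigma.index(t)                   # assumes t occurs (once) in sigma
    return any(eval_at(M, s, sigma, B) for s in sigma[i + 1:])

def diamond_holds(M, t, sigma, B, eval_at):
    """<> B at t, sigma: B holds at t relative to some history in C containing t."""
    return any(eval_at(M, t, tau, B) for tau in M["C"] if t in tau)
```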
PART VII

What if things were different?
CHAPTER 17

Introduction
17.1 The Material Conditional
In its simplest form in English, a conditional is a sentence of the
form “If . . . then . . . ,” where the . . . are themselves sentences,
such as “If the butler did it, then the gardener is innocent.” In
introductory logic courses, we learn to symbolize conditionals us-
ing the → connective: symbolize the parts indicated by . . . , e.g.,
by formulas A and B, and the entire conditional is symbolized by
A → B.
The connective → is truth-functional, i.e., the truth value—T
or F—of A → B is determined by the truth values of A and B:
A → B is true iff A is false or B is true, and false otherwise.
Relative to a truth value assignment v, we define v ⊨ A → B iff
v ⊭ A or v ⊨ B. The connective → with this semantics is called
the material conditional.
This definition results in a number of elementary logical facts.
First of all, the deduction theorem holds for the material condi-
tional:

If 𝛤,A ⊨ B then 𝛤 ⊨ A → B (17.1)

It is truth-functional: A → B and ¬A ∨ B are equivalent:

A → B ⊨ ¬A ∨ B (17.2)
¬A ∨ B ⊨ A → B (17.3)

208

A material conditional is entailed by its consequent and by the


negation of its antecedent:

B ⊨A→B (17.4)
¬A ⊨ A → B (17.5)

A false material conditional is equivalent to the conjunction of its


antecedent and the negation of its consequent: if A → B is false,
A ∧ ¬B is true, and vice versa:

¬(A → B) ⊨ A ∧ ¬B (17.6)
A ∧ ¬B ⊨ ¬(A → B) (17.7)

The material conditional supports modus ponens:

A,A → B ⊨ B (17.8)

The material conditional agglomerates:

A → B,A → C ⊨ A → (B ∧ C ) (17.9)

We can always strengthen the antecedent, i.e., the conditional is


monotonic:

A → B ⊨ (A ∧ C ) → B (17.10)

The material conditional is transitive, i.e., the chain rule is valid:

A → B,B → C ⊨ A → C (17.11)

The material conditional is equivalent to its contrapositive:

A → B ⊨ ¬B → ¬A (17.12)
¬B → ¬A ⊨ A → B (17.13)

These are all useful and unproblematic inferences in mathe-


matical reasoning. However, the philosophical and linguistic liter-
ature is replete with purported counterexamples to the equivalent
inferences in non-mathematical contexts. These suggest that the
material conditional → is not—or at least not always—the ap-
propriate connective to use when symbolizing English “if . . . then
. . . ” statements.
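Because → is truth-functional, each of these facts can be checked by brute force over truth value assignments. The short Python script below (not part of the text) verifies a representative few of them.

```python
from itertools import product

def impl(a, b):
    """Material conditional: A -> B is false only when A is true and B is false."""
    return (not a) or b

def entails(premises, conclusion, n=3):
    """Check entailment by running through all assignments to n variables."""
    return all(conclusion(*v) for v in product([True, False], repeat=n)
               if all(p(*v) for p in premises))

# (17.8) modus ponens: A, A -> B |= B
print(entails([lambda a, b, c: a, lambda a, b, c: impl(a, b)],
              lambda a, b, c: b))                        # True
# (17.10) monotonicity: A -> B |= (A & C) -> B
print(entails([lambda a, b, c: impl(a, b)],
              lambda a, b, c: impl(a and c, b)))         # True
# (17.12) contraposition: A -> B |= ~B -> ~A
print(entails([lambda a, b, c: impl(a, b)],
              lambda a, b, c: impl(not b, not a)))       # True
```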

17.2 Paradoxes of the Material Conditional


One of the first to criticize the use of A →B as a way to symbolize
“if . . . then . . . ” statements of English was C. I. Lewis. Lewis was
criticizing the use of the material conditional in Whitehead and
Russell’s Principia Mathematica, who pronounced → as “implies.”
Lewis rightly complained that if → meant “implies,” then any
false proposition p implies that p implies q , since p → (p → q ) is
true if p is false, and that any true proposition q implies that p
implies q , since q → (p → q ) is true if q is true.
Logicians of course know that implication, i.e., logical entail-
ment, is not a connective but a relation between formulas or state-
ments. So we should just not read → as “implies” to avoid confu-
sion.1 As long as we don’t, the particular worry that Lewis had
simply does not arise: p does not “imply” q even if we think of
p as standing for a false English sentence. To determine if p ⊨ q
we must consider all valuations, and p ⊭ q even when we use p
to symbolize a sentence which happens to be false.
But there is still something odd about “if . . . then. . . ” state-
ments such as Lewis’s

If the moon is made of green cheese, then 2 + 2 = 4.

and about the inferences


1 Reading “→” as “implies” is still widely practised by mathematicians and
computer scientists, although philosophers try to avoid the confusions Lewis
highlighted by pronouncing it as “only if.”

The moon is not made of green cheese. Therefore, if


the moon is made of green cheese, then 2 + 2 = 4.
2 + 2 = 4. Therefore, if the moon is made of green
cheese, then 2 + 2 = 4.

Yet, if “if . . . then . . . ” were just →, the sentence would be un-


problematically true, and the inferences unproblematically valid.
Another example concerns the tautology (A → B) ∨ (B → A).
This would suggest that if you take two indicative sentences S and
T from the newspaper at random, the sentence “If S then T , or
if T then S ” should be true.

17.3 The Strict Conditional


Lewis introduced the strict conditional ⥽ and argued that it, not
the material conditional, corresponds to implication. In alethic
modal logic, A ⥽ B can be defined as □(A → B). A strict con-
ditional is thus true (at a world) iff the corresponding material
conditional is necessary.
How does the strict conditional fare vis-a-vis the paradoxes
of the material conditional? A strict conditional with a false an-
tecedent, or one with a true consequent, may be true or it may
be false. Moreover, (A ⥽ B) ∨ (B ⥽ A) is not valid. The strict
conditional A ⥽ B is also not equivalent to ¬A ∨ B, so it is not
truth functional.
We have:

A ⥽ B ⊨ ¬A ∨ B but: (17.14)
¬A ∨ B ⊭ A ⥽ B (17.15)
B ⊭A⥽B (17.16)
¬A ⊭ A ⥽ B (17.17)
¬(A → B) ⊭ A ∧ ¬B but: (17.18)
A ∧ ¬B ⊨ ¬(A ⥽ B) (17.19)

However, the strict conditional still supports modus ponens:

A,A ⥽ B ⊨ B (17.20)

The strict conditional agglomerates:

A ⥽ B,A ⥽ C ⊨ A ⥽ (B ∧ C ) (17.21)

Antecedent strengthening holds for the strict conditional:

A ⥽ B ⊨ (A ∧ C ) ⥽ B (17.22)

The strict conditional is also transitive:

A ⥽ B,B ⥽ C ⊨ A ⥽ C (17.23)

Finally, the strict conditional is equivalent to its contrapositive:

A ⥽ B ⊨ ¬B ⥽ ¬A (17.24)
¬B ⥽ ¬A ⊨ A ⥽ B (17.25)
However, the strict conditional still has its own “paradoxes.”
Just as a material conditional with a false antecedent or a true
consequent is true, a strict conditional with a necessarily false an-
tecedent or a necessarily true consequent is true. Moreover, any
true strict conditional is necessarily true, and any false strict con-
ditional is necessarily false. In other words, we have
□¬A ⊨ A ⥽ B (17.26)
□B ⊨ A ⥽ B (17.27)
A ⥽ B ⊨ □(A ⥽ B) (17.28)
¬(A ⥽ B) ⊨ □¬(A ⥽ B) (17.29)
These are not problems if you think of ⥽ as “implies.” Logical
entailment relationships are, after all, mathematical facts and so
can’t be contingent. But they do raise issues if you want to use
⥽ as a logical connective that is supposed to capture “if . . . then
. . . ,” especially the last two. For surely there are “if . . . then . . . ”
statements that are contingently true or contingently false—in
fact, they generally are neither necessary nor impossible.

17.4 Counterfactuals
A very common and important form of “if . . . then . . . ” construc-
tions in English are built using the past subjunctive form of to
be: “if it were the case that . . . then it would be the case that . . . ”
Because usually the antecedent of such a conditional is false, i.e.,
counter to fact, they are called counterfactual conditionals (and
because they use the subjunctive form of to be, also subjunctive
conditionals. They are distinguished from indicative conditionals
which take the form of “if it is the case that . . . then it is the
case that . . . ” Counterfactual and indicative conditionals differ
in truth conditions. Consider Adams’s famous example:

If Oswald didn’t kill Kennedy, someone else did.


If Oswald hadn’t killed Kennedy, someone else would
have.

The first is indicative, the second counterfactual. The first is


clearly true: we know JFK was killed by someone, and if that
someone wasn’t (contrary to the Warren Report) Lee Harvey Os-
wald, then someone else killed JFK. The second one says some-
thing different. It claims that if Oswald hadn’t killed Kennedy,
i.e., if the Dallas shooting had been avoided or had been unsuc-
cessful, history would have subsequently unfolded in such a way
that another assassination would have been successful. In order
for it to be true, it would have to be the case that powerful forces
had conspired to ensure JFK’s death (as many JFK conspiracy
theorists believe).
It is a live debate whether the indicative conditional is cor-
rectly captured by the material conditional, in particular, whether
the paradoxes of the material conditional can be “explained” in
a way that is compatible with it giving the truth conditions for
English indicative conditionals. By contrast, it is uncontrover-
sial that counterfactual conditionals cannot be symbolized cor-
rectly by the material conditionals. That is clear because, even
though generally the antecedents of counterfactuals are false, not

all counterfactuals with false antecedents are true—for instance,


if you believe the Warren Report, and there was no conspiracy
to assassinate JFK, then Adams’s counterfactual conditional is an
example.
Counterfactual conditionals play an important role in causal
reasoning: a prime example of the use of counterfactuals is to ex-
press causal relationships. E.g., striking a match causes it to light,
and you can express this by saying “if this match were struck,
it would light.” Material, and generally indicative conditionals,
cannot be used to express this: “the match is struck → the match
lights” is true if the match is never struck, regardless of what
would happen if it were. Even worse, “the match is struck → the
match turns into a bouquet of flowers” is also true if it is never
struck, but the match would certainly not turn into a bouquet of
flowers if it were struck.
It is still debated what exactly the correct logic of counter-
factuals is. An influential analysis of counterfactuals was given
by Stalnaker and Lewis. According to them, a counterfactual “if
it were the case that S then it would be the case that T ” is true iff
T is true in the counterfactual situation (“possible world”) that
is closest to the way the actual world is and where S is true. This
is called an “ontic” analysis, since it makes reference to an ontol-
ogy of possible worlds. Other analyses make use of conditional
probabilities or theories of belief revision. There is a proliferation
of different proposed logics of counterfactuals. There isn’t even
a single Lewis-Stalnaker logic of counterfactuals: even though
Stalnaker and Lewis proposed accounts along similar lines with
reference to closest possible worlds, the assumptions they made
result in different valid inferences.

Problems
Problem 17.1. Give S5-counterexamples to the entailment rela-
tions which do not hold for the strict conditional, i.e., for:

1. ¬p ⊭ □(p → q )

2. q ⊭ □(p → q )

3. ¬□(p → q ) ⊭ p ∧ ¬q

4. ⊭ □(p → q ) ∨ □(q → p)

Problem 17.2. Show that the valid entailment relations hold for
the strict conditional by giving S5-proofs of:

1. □(A → B) ⊨ ¬A ∨ B

2. A ∧ ¬B ⊨ ¬□(A → B)

3. A, □(A → B) ⊨ B

4. □(A → B), □(A → C ) ⊨ □(A → (B ∧ C ))

5. □(A → B) ⊨ □((A ∧ C ) → B)

6. □(A → B), □(B → C ) ⊨ □(A → C )

7. □(A → B) ⊨ □(¬B → ¬A)

8. □(¬B → ¬A) ⊨ □(A → B)

Problem 17.3. Give proofs in S5 of:

1. □¬B ⊨ A ⥽ B

2. A ⥽ B ⊨ □(A ⥽ B)

3. ¬(A ⥽ B) ⊨ □¬(A ⥽ B)

Use the definition of ⥽ to do so.


CHAPTER 18

Minimal Change Semantics
18.1 Introduction
Stalnaker and Lewis proposed accounts of counterfactual condi-
tionals such as “If the match were struck, it would light.” Their
accounts were proposals for how to properly understand the truth
conditions for such sentences. The idea behind both proposals is
this: to evaluate whether a counterfactual conditional is true, we
have to consider those possible worlds which are minimally dif-
ferent from the way the world actually is to make the antecedent
true. If the consequent is true in these possible worlds, then the
counterfactual is true. For instance, suppose I hold a match and
a matchbook in my hand. In the actual world I only look at them
and ponder what would happen if I were to strike the match. The
minimal change from the actual world where I strike the match
is that where I decide to act and strike the match. It is minimal
in that nothing else changes: I don’t also jump in the air, striking
the match doesn’t also light my hair on fire, I don’t suddenly lose


all strength in my fingers, I am not simultaneously doused with


water in a SuperSoaker ambush, etc. In that alternative possibil-
ity, the match lights. Hence, it’s true that if I were to strike the
match, it would light.
This intuitive account can be paired with formal semantics
for logics of counterfactuals. Lewis introduced the symbol “□→”
for the counterfactual while Stalnaker used the symbol “>”. We’ll
use □→, and add it as a binary connective to propositional logic.
So, we have, in addition to formulas of the form A → B also
formulas of the form A □→ B. The formal semantics, like the
relational semantics for modal logic, is based on models in which
formulas are evaluated at worlds, and the satisfaction condition
defining M,w ⊩ A □→ B is given in terms of M,w ′ ⊩ A and
M,w ′ ⊩ B for some (other) worlds w ′. Which w ′? Intuitively,
the one(s) closest to w for which it holds that M,w ′ ⊩ A. This
requires that a relation of “closeness” has to be included in the
model as well.
Lewis introduced an instructive way of representing counter-
factual situations graphically. Each possible world is at the center
of a set of nested spheres containing other worlds—we draw these
spheres as concentric circles. The worlds between two spheres are
equally close to the world at the center as each other, those con-
tained in a nested sphere are closer, and those in a surrounding
sphere further away.

[Diagram: a system of nested spheres centered on w; the shaded (gray) region marks the closest A-worlds.]

The closest A-worlds are those worlds w ′ where A is satisfied


which lie in the smallest sphere around the center world w (the

gray area). Intuitively, A □→ B is satisfied at w if B is true at all


closest A-worlds.

18.2 Sphere Models


One way of providing a formal semantics for counterfactuals is
to turn Lewis’s informal account into a mathematical structure.
The spheres around a world w then are sets of worlds. Since the
spheres are nested, the sets of worlds around w have to be linearly
ordered by the subset relation.

Definition 18.1. A sphere model is a triple M = ⟨W,O ,V ⟩ where


W is a non-empty set of worlds, V : At0 → ℘(W ) is a valua-
tion, and O : W → ℘(℘(W )) assigns to each world w a system of
spheres O w . For each w, O w is a set of sets of worlds, and must
satisfy:

1. O w is centered on w: {w } ∈ O w .

2. O w is nested: whenever S 1 , S 2 ∈ O w , S 1 ⊆ S 2 or S 2 ⊆ S 1 , i.e.,


O w is linearly ordered by ⊆.

3. O w is closed under non-empty unions.

4. O w is closed under non-empty intersections.

The intuition behind O w is that the worlds “around” w are


stratified according to how far away they are from w. The inner-
most sphere is just w by itself, i.e., the set {w }: w is closer to w
than the worlds in any other sphere. If S ⊊ S ′, then the worlds in
S ′ \ S are further away from w than the worlds in S : S ′ \ S is the
“layer” between S and the worlds outside of S ′. In particular,
we have to think of the spheres as containing all the worlds within
their outer surface; they are not just the individual layers.
The diagram in Figure 18.1 corresponds to the sphere model
with W = {w,w 1 , . . . ,w 7 }, V (p) = {w 5 ,w 6 ,w 7 }. The innermost
sphere S 1 = {w }. The closest worlds to w are w 1 ,w 2 ,w 3 , so the

Figure 18.1: Diagram of a sphere model


next larger sphere is S 2 = {w,w 1 ,w 2 ,w 3 }. The worlds further out


are w 4 , w 5 , w 6 , so the outermost sphere is S 3 = {w,w 1 , . . . ,w 6 }.
The system of spheres around w is O w = {S 1 ,S 2 ,S 3 }. The
world w 7 is not in any sphere around w. The closest worlds in
which p is true are w 5 and w 6 , and so the smallest p-admitting
sphere is S 3 .
To define satisfaction of a formula A at world w in a sphere
model M, M,w ⊩ A, we expand the definition for modal formulas
to include a clause for B □→ C :

Definition 18.2. M,w ⊩ B □→ C iff either

1. For all u ∈ O w , M,u ⊮ B, or


⋃︁

2. For some S ∈ O w ,

a) M,u ⊩ B for some u ∈ S , and


b) for all v ∈ S , either M,v ⊮ B or M,v ⊩ C .

According to this definition, M,w ⊩ B □→ C iff either the


antecedent B is false everywhere in the spheres around w, or
there is a sphere S where B is true, and the material conditional
B → C is true at all worlds in that “B-admitting” sphere. Note

Figure 18.2: Non-vacuously true counterfactual


that we didn’t require in the definition that S is the innermost B-


admitting sphere, contrary to what one might expect from the
intuitive explanation. But if the condition in (2) is satisfied for
some sphere S , then it is also satisfied for every B-admitting sphere
contained in S , and hence in particular for the innermost B-admitting
sphere (if there is one).
Note also that the definition of sphere models does not re-
quire that there is an innermost B-admitting sphere: we may
have an infinite sequence S 1 ⊋ S 2 ⊋ · · · ⊋ {w } of B-admitting
spheres, and hence no innermost B-admitting sphere. In that
case, M,w ⊩ B □→ C iff B → C holds throughout the spheres S i ,
S i +1 , . . . , for some i .
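
For finite models, Definition 18.2 can also be checked mechanically. The following Python sketch is our own illustration, not part of the official semantics; it assumes the system of spheres around w is given as a list of sets of worlds, and that the formulas B and C are given as predicates on worlds.

def boxarrow_holds(spheres, holds_B, holds_C):
    # spheres: the system O_w of spheres around w, as a list of sets of worlds.
    # holds_B(u), holds_C(u): decide whether B (resp. C) is satisfied at world u.
    # Clause (1): B fails at every world in every sphere around w (vacuous truth).
    if all(not holds_B(u) for S in spheres for u in S):
        return True
    # Clause (2): some sphere contains a B-world, and B → C holds throughout it.
    for S in spheres:
        if any(holds_B(u) for u in S) and all(not holds_B(v) or holds_C(v) for v in S):
            return True
    return False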

18.3 Truth and Falsity of Counterfactuals


A counterfactual A □→ B is (non-vacuously) true if the closest
A-worlds are all B-worlds, as depicted in Figure 18.2. A counter-
factual is also true at w if the system of spheres around w has no
A-admitting spheres at all. In that case it is vacuously true (see
Figure 18.3).
It can be false in two ways. One way is if the closest A-worlds
are not all B-worlds, but some of them are. In this case, A □→ ¬B
is also false (see Figure 18.4). If the closest A-worlds do not
overlap with the B-worlds at all, then A □→ B is false. But in this case

Figure 18.3: Vacuously true counterfactual

Figure 18.4: False counterfactual, false opposite


all the closest A-worlds are ¬B-worlds, and so A □→ ¬B is true


(see Figure 18.5).
In contrast to the strict conditional, counterfactuals may be
contingent. Consider the sphere model in Figure 18.6. The A-
worlds closest to u are all B-worlds, so M,u ⊩ A □→ B. But there
are A-worlds closest to v which are not B-worlds, so M,v ⊮ A□→B.

18.4 Antecedent Strengthening


“Strengthening the antecedent” refers to the inference A → C ⊨
(A ∧ B) → C . It is valid for the material conditional, but invalid

Figure 18.5: False counterfactual, true opposite

Figure 18.6: Contingent counterfactual


for counterfactuals. Suppose it is true that if I were to strike this


match, it would light. (That means, there is nothing wrong with
the match or the matchbook surface, I will not break the match,
etc.) But it is not true that if I were to strike this match in outer
space, it would light. So the following inference is invalid:

If the match were struck, it would light.


Therefore, if the match were struck in outer space, it
would light.

The Lewis-Stalnaker account of conditionals explains this:



Figure 18.7: Counterexample to antecedent strengthening


the closest world where I strike the match and do so in outer
space is much further removed from the actual world than the
closest world where I strike the match is. So although it’s true that
the match lights in the latter, it is not in the former. And that is
as it should be.

Example 18.3. The sphere semantics invalidates the infer-


ence, i.e., we have p □→ r ⊭ (p ∧ q ) □→ r . Consider
the model M = ⟨W,O ,V ⟩ where W = {w,w 1 ,w 2 }, O w =
{{w }, {w,w 1 }, {w,w 1 ,w 2 }}, V (p) = {w 1 ,w 2 }, V (q ) = {w 2 }, and
V (r ) = {w 1 }. There is a p-admitting sphere S = {w,w 1 } and
p → r is true at all worlds in it, so M,w ⊩ p □→ r . There is also a
(p ∧ q )-admitting sphere S ′ = {w,w 1 ,w 2 } but M,w 2 ⊮ (p ∧ q ) →r ,
so M,w ⊮ (p ∧ q ) □→ r (see Figure 18.7).
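
Using the boxarrow_holds sketch from section 18.2 (our own illustration, not part of the text's official apparatus), the model of Example 18.3 can be checked directly:

spheres_w = [{"w"}, {"w", "w1"}, {"w", "w1", "w2"}]   # O_w from Example 18.3
p = lambda u: u in {"w1", "w2"}
q = lambda u: u in {"w2"}
r = lambda u: u in {"w1"}

print(boxarrow_holds(spheres_w, p, r))                        # True:  M,w ⊩ p □→ r
print(boxarrow_holds(spheres_w, lambda u: p(u) and q(u), r))  # False: M,w ⊮ (p ∧ q) □→ r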

18.5 Transitivity
For the material conditional, the chain rule holds: A →B,B →C ⊨
A → C . In other words, the material conditional is transitive. Is
the same true for counterfactuals? Consider the following exam-
ple due to Stalnaker.

If J. Edgar Hoover had been born a Russian, he would


have been a Communist.
If J. Edgar Hoover were a Communist, he would be a traitor.
Therefore, if J. Edgar Hoover had been born a Russian,
he would have been a traitor.

If Hoover had been born (at the same time he actually did), not
in the United States, but in Russia, he would have grown up in
the Soviet Union and become a Communist (let’s assume). So
the first premise is true. Likewise, the second premise, consid-
ered in isolation is true. The conclusion, however, is false: in all
likelihood, Hoover would have been a fervent Communist if he
had been born in the USSR, and not been a traitor (to his coun-
try). The intuitive assignment of truth values is borne out by the
Stalnaker-Lewis account. The closest possible world to ours with
the only change being Hoover’s place of birth is the one where
Hoover grows up to be a good citizen of the USSR. This is the
closest possible world where the antecedent of the first premise
and of the conclusion is true, and in that world Hoover is a loyal
member of the Communist party, and so not a traitor. To eval-
uate the second premise, we have to look at a different world,
however: the closest world where Hoover is a Communist, which
is one where he was born in the United States, turned Communist, and thus
became a traitor.1

Example 18.4. The sphere semantics invalidates the infer-


ence, i.e., we have p □→ q ,q □→ r ⊭ p □→ r . Consider
the model M = ⟨W,O ,V ⟩ where W = {w,w 1 ,w 2 }, O w =
{{w }, {w,w 1 }, {w,w 1 ,w 2 }}, V (p) = {w 2 }, V (q ) = {w 1 ,w 2 }, and
V (r ) = {w 1 }. There is a p-admitting sphere S = {w,w 1 ,w 2 } and
p → q is true at all worlds in it, so M,w ⊩ p □→ q . There is also
1 Of course, to appreciate the force of the example we have to take on
board some metaphysical and political assumptions, e.g., that it is possible
that Hoover could have been born to Russian parents, or that Communists in
the US of the 1950s were traitors to their country.

a q -admitting sphere S ′ = {w,w 1 } and q → r is true at all
worlds in it, so M,w ⊩ q □→ r . However, the only p-admitting sphere
is {w,w 1 ,w 2 }, and it contains a world, namely w 2 , where M,w 2 ⊮ p → r ,
so M,w ⊮ p □→ r .

18.6 Contraposition
Material and strict conditionals are equivalent to their contra-
positives. Counterfactuals are not. Here is an example due to
Kratzer:

If Goethe hadn’t died in 1832, he would (still) be dead


now.
If Goethe weren’t dead now, he would have died in
1832.

The first sentence is true: humans don’t live hundreds of years.


The second is clearly false: if Goethe weren’t dead now, he would
be still alive, and so couldn’t have died in 1832.

Example 18.5. The sphere semantics invalidates contraposi-


tion, i.e., we have p □→ q ⊭ ¬q □→ ¬p. Think of p as “Goethe
didn’t die in 1832” and q as “Goethe is dead now.” We can cap-
ture this in a model M = ⟨W,O ,V ⟩ with W = {w,w 1 ,w 2 }, O w =
{{w }, {w,w 1 }, {w,w 1 ,w 2 }}, V (p) = {w 1 ,w 2 } and V (q ) = {w,w 1 }.
So w is the actual world where Goethe died in 1832 and is still
dead; w 1 is the (close) world where Goethe died in, say, 1833,
and is still dead; and w 2 is a (remote) world where Goethe is still
alive. There is a p-admitting sphere S = {w,w 1 } and p →q is true
at all worlds in it, so M,w ⊩ p □→ q . However, the ¬q -admitting
sphere {w,w 1 ,w 2 } contains a world, namely w 2 , where q is false
and p is true, so M,w 2 ⊮ ¬q → ¬p, and hence M,w ⊮ ¬q □→ ¬p.

Problems
Problem 18.1. Find a convincing, intuitive example for the fail-
ure of transitivity of counterfactuals.

Figure 18.8: Counterexample to contraposition


Problem 18.2. Draw the sphere diagram corresponding to the


counterexample in Example 18.4.

Problem 18.3. In Example 18.4, world w 2 is where Hoover is


born in Russia, is a communist, and not a traitor, and w 1 is the
world where Hoover is born in the US, is a communist, and a
traitor. In this model, w 1 is closer to w than w 2 is. Is this neces-
sary? Can you give a counterexample that does not assume that
Hoover’s being born in Russia is a more remote possibility than
him being a Communist?
PART VIII

How can it be true if you can’t prove it?

CHAPTER 19

Introduction
19.1 Constructive Reasoning
In contrast to extensions of classical logic by modal operators or
second-order quantifiers, intuitionistic logic is “non-classical” in
that it restricts classical logic. Classical logic is non-constructive in
various ways. Intuitionistic logic is intended to capture a more
“constructive” kind of reasoning characteristic of a kind of con-
structive mathematics. The following examples may serve to il-
lustrate some of the underlying motivations.
Suppose someone claimed that they had determined a natu-
ral number n with the property that if n is even, the Riemann
hypothesis is true, and if n is odd, the Riemann hypothesis is
false. Great news! Whether the Riemann hypothesis is true or
not is one of the big open questions of mathematics, and they
seem to have reduced the problem to one of calculation, that is,
to the determination of whether a specific number is even or not.
What is the magic value of n? They describe it as follows: n is
the natural number that is equal to 2 if the Riemann hypothesis
is true, and 3 otherwise.
Angrily, you demand your money back. From a classical point
of view, the description above does in fact determine a unique
value of n; but what you really want is a value of n that is given
explicitly.
To take another, perhaps less contrived example, consider


the following question. We know that it is possible to raise an


irrational number to a rational power, and get a rational result.
For example, (√2)² = 2. What is less clear is whether or not it is
possible to raise an irrational number to an irrational power, and
get a rational result. The following theorem answers this in the
affirmative:

Theorem 19.1. There are irrational numbers a and b such that a b


is rational.
Proof. Consider √2^√2 . If this is rational, we are done: we can let
a = b = √2. Otherwise, it is irrational. Then we have

(√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2,

which is rational. So, in this case, let a be √2^√2 , and let b be √2. □

Does this constitute a valid proof? Most mathematicians feel


that it does. But again, there is something a little bit unsatisfying
here: we have proved the existence of a pair of real numbers
with a certain property, without being able to say which pair of
numbers it is. It is possible to prove the same result, but in such
a way that the pair a, b is given in the proof: take a = √3 and
b = log_3 4. Then

a^b = (√3)^(log_3 4) = 3^((1/2)·log_3 4) = (3^(log_3 4))^(1/2) = 4^(1/2) = 2,

since 3^(log_3 x) = x.
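
As a quick numerical sanity check (an illustration only, not a proof), the explicit witnesses can be evaluated in a few lines of Python:

import math

a = math.sqrt(3)
b = math.log(4, 3)   # log base 3 of 4
print(a ** b)        # prints a value equal to 2, up to floating-point rounding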
Intuitionistic logic is designed to capture a kind of reasoning
where moves like the one in the first proof are disallowed. Proving
the existence of an x satisfying A(x) means that you have to give a
specific x, and a proof that it satisfies A, like in the second proof.
Proving that A or B holds requires that you can prove one or the
other.
Formally speaking, intuitionistic logic is what you get if you
restrict a derivation system for classical logic in a certain way.

From the mathematical point of view, these are just formal deduc-
tive systems, but, as already noted, they are intended to capture
a kind of mathematical reasoning. One can take this to be the
kind of reasoning that is justified on a certain philosophical view
of mathematics (such as Brouwer’s intuitionism); one can take it
to be a kind of mathematical reasoning which is more “concrete”
and satisfying (along the lines of Bishop’s constructivism); and
one can argue about whether or not the formal description cap-
tures the informal motivation. But whatever philosophical posi-
tions we may hold, we can study intuitionistic logic as a formally
presented logic; and for whatever reasons, many mathematical
logicians find it interesting to do so.

19.2 Syntax of Intuitionistic Logic


The syntax of intuitionistic logic is the same as that for proposi-
tional logic. In classical propositional logic it is possible to define
connectives by others, e.g., one can define A → B by ¬A ∨ B, or
A ∨ B by ¬(¬A ∧ ¬B). Thus, presentations of classical logic often
introduce some connectives as abbreviations for these definitions.
This is not so in intuitionistic logic, with two exceptions: ¬A can
be—and often is—defined as an abbreviation for A →⊥. Then, of
course, ⊥ must not itself be defined! Also, A ↔ B can be defined,
as in classical logic, as (A → B) ∧ (B → A).
Formulas of propositional intuitionistic logic are built up from
propositional variables and the propositional constant ⊥ using log-
ical connectives. We have:

1. A countably infinite set At0 of propositional variables p0 ,


p1 , . . .

2. The propositional constant for falsity ⊥.

3. The logical connectives: ∧ (conjunction), ∨ (disjunction),


→ (conditional)

4. Punctuation marks: (, ), and the comma.



Definition 19.2 (Formula). The set Frm(L0 ) of formulas of


propositional intuitionistic logic is defined inductively as follows:

1. ⊥ is an atomic formula.

2. Every propositional variable pi is an atomic formula.

3. If A and B are formulas, then (A ∧ B) is a formula.

4. If A and B are formulas, then (A ∨ B) is a formula.

5. If A and B are formulas, then (A → B) is a formula.

6. Nothing else is a formula.

In addition to the primitive connectives introduced above, we


also use the following defined symbols: ¬ (negation) and ↔ (bi-
conditional). Formulas constructed using the defined operators
are to be understood as follows:

1. ¬A abbreviates A → ⊥.

2. A ↔ B abbreviates (A → B) ∧ (B → A).

Although ¬ is officially treated as an abbreviation, we will


sometimes give explicit rules and clauses in definitions for ¬ as
if it were primitive. This is mostly so we can state practice prob-
lems.

19.3 The Brouwer-Heyting-Kolmogorov


Interpretation
There is an informal constructive interpretation of the intuitionist
connectives, usually known as the Brouwer-Heyting-Kolmogorov
interpretation. It uses the notion of a “construction,” which you
may think of as a constructive proof. (We don’t use “proof” in
the BHK interpretation so as not to get confused with the notion
of a derivation in a formal derivation system.) Based on this

intuitive notion, the BHK interpretation explains the meanings


of the intuitionistic connectives.

1. We assume that we know what constitutes a construction


of an atomic statement.

2. A construction of A1 ∧ A2 is a pair ⟨M1 ,M2 ⟩ where M1 is a


construction of A1 and M2 is a construction of A2 .

3. A construction of A1 ∨ A2 is a pair ⟨s ,M ⟩ where s is 1 and


M is a construction of A1 , or s is 2 and M is a construction
of A2 .

4. A construction of A → B is a function that converts a con-


struction of A into a construction of B.

5. There is no construction for ⊥ (absurdity).

6. ¬A is defined as synonym for A→⊥. That is, a construction


of ¬A is a function converting a construction of A into a
construction of ⊥.

Example 19.3. Take ¬⊥ for example. A construction of it is a


function which, given any construction of ⊥ as input, provides a
construction of ⊥ as output. Obviously, the identity function Id
is such a construction: given a construction M of ⊥, Id(M ) = M
yields a construction of ⊥.

Generally speaking, ¬A means “A construction of A is impos-


sible”.

Example 19.4. Let us prove A → ¬¬A for any proposition A,


which is A → ((A → ⊥) → ⊥). The construction should be a
function f that, given a construction M of A, returns a construc-
tion f (M ) of (A → ⊥) → ⊥. Here is how f constructs the con-
struction of (A → ⊥) → ⊥: We have to define a function g which,
when given a construction h of A → ⊥ as input, outputs a con-
struction of ⊥. We can define g as follows: apply the input h

to the construction M of A (that we received earlier). Since the


output h (M ) of h is a construction of ⊥, f (M ) (h) = h (M ) is a
construction of ⊥ if M is a construction of A.

Example 19.5. Let us give a construction for ¬(A ∧ ¬A), i.e.,


(A ∧ (A → ⊥)) → ⊥. This is a function f which, given as input
a construction M of A ∧ (A → ⊥), yields a construction of ⊥. A
construction of a conjunction B 1 ∧ B 2 is a pair ⟨N1 ,N 2 ⟩ where N 1
is a construction of B 1 and N 2 is a construction of B 2 . We can
define functions p 1 and p 2 which recover from a construction of
B 1 ∧ B 2 the constructions of B 1 and B 2 , respectively:

p 1 (⟨N 1 ,N 2 ⟩) = N1
p 2 (⟨N 1 ,N 2 ⟩) = N2

Here is what f does: First it applies p 1 to its input M . That yields


a construction of A. Then it applies p 2 to M , yielding a construc-
tion of A → ⊥. Such a construction, in turn, is a function p 2 (M )
which, if given as input a construction of A, yields a construc-
tion of ⊥. In other words, if we apply p 2 (M ) to p 1 (M ), we get a
construction of ⊥. Thus, we can define f (M ) = p 2 (M ) (p 1 (M )).

Example 19.6. Let us give a construction of ((A ∧ B) → C ) →


(A → (B → C )), i.e., a function f which turns a construction g
of (A ∧ B) → C into a construction of (A → (B → C )). The
construction g is itself a function (from constructions of A ∧ B
to constructions of C ). And the output f (g ) is a function h g
from constructions of A to functions from constructions of B to
constructions of C .
Ok, this is confusing. We have to construct a certain function
h g , which will be the output of f for input g . The input of h g is
a construction M of A. The output of h g (M ) should be a func-
tion kM from constructions N of B to constructions of C . Let
k g ,M (N ) = g (⟨M ,N ⟩). Remember that ⟨M ,N ⟩ is a construction
of A ∧ B. So k g ,M is a construction of B → C : it maps construc-
tions N of B to constructions of C . Now let h g (M ) = k g ,M . That’s

a function that maps constructions M of A to constructions k g ,M


of B → C . Now let f (g ) = h g . That’s a function that maps con-
structions g of (A ∧ B) → C to constructions of A → (B → C ).
Whew!
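
The constructions of Examples 19.4 to 19.6 really are just functions, and can be written down in any language with first-class functions. Here is a sketch in Python (the variable names are ours and follow the examples; pairs play the role of constructions of conjunctions):

# Example 19.4: a construction of A → ((A → ⊥) → ⊥)
f_19_4 = lambda M: (lambda h: h(M))

# Example 19.5: a construction of (A ∧ (A → ⊥)) → ⊥
p1 = lambda pair: pair[0]   # recover a construction of the first conjunct
p2 = lambda pair: pair[1]   # recover a construction of the second conjunct
f_19_5 = lambda M: p2(M)(p1(M))

# Example 19.6: a construction of ((A ∧ B) → C) → (A → (B → C)), i.e., currying
f_19_6 = lambda g: (lambda M: (lambda N: g((M, N))))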

The statement A ∨ ¬A is called the Law of Excluded Mid-


dle. We can prove it for some specific A (e.g., ⊥ ∨ ¬⊥), but not
in general. This is because the intuitionistic disjunction requires
a construction of one of the disjuncts, but there are statements
which currently can neither be proved nor refuted (say, Gold-
bach’s conjecture). However, you can’t refute the law of excluded
middle either: that is, ¬¬(A ∨ ¬A) holds.

Example 19.7. To prove ¬¬(A ∨ ¬A), we need a function f that


transforms a construction of ¬(A∨¬A), i.e., of (A∨ (A→⊥)) →⊥,
into a construction of ⊥. In other words, we need a function f
such that f (g ) is a construction of ⊥ if g is a construction of
¬(A ∨ ¬A).
Suppose g is a construction of ¬(A ∨ ¬A), i.e., a function that
transforms a construction of A ∨ ¬A into a construction of ⊥. A
construction of A ∨ ¬A is a pair ⟨s ,M ⟩ where either s = 1 and
M is a construction of A, or s = 2 and M is a construction of
¬A. Let h1 be the function mapping a construction M1 of A to a
construction of A ∨ ¬A: it maps M1 to ⟨1,M1 ⟩. And let h 2 be the
function mapping a construction M2 of ¬A to a construction of
A ∨ ¬A: it maps M 2 to ⟨2,M2 ⟩.
Let k be g ◦ h1 : it is a function which, if given a construction
of A, returns a construction of ⊥, i.e., it is a construction of A →
⊥, i.e., of ¬A. Now let l be g ◦ h2 . It is a function which, given a
construction of ¬A, provides a construction of ⊥. Since k is a
construction of ¬A, l (k ) is a construction of ⊥.
Together, what we’ve done is describe how we can turn a con-
struction g of ¬(A∨¬A) into a construction of ⊥, i.e., the function
f mapping a construction g of ¬(A∨¬A) to the construction l (k )
of ⊥ is a construction of ¬¬(A ∨ ¬A).

As you can see, using the BHK interpretation to show the


intuitionistic validity of formulas quickly becomes cumbersome
and confusing. Luckily, there are better derivation systems for
intuitionistic logic, and more precise semantic interpretations.

19.4 Natural Deduction


Natural deduction without the ⊥C rules is a standard derivation
system for intuitionistic logic. We repeat the rules here and indi-
cate the motivation using the BHK interpretation. In each case,
we can think of a rule which allows us to conclude that if the
premises have constructions, so does the conclusion.
Since natural deduction derivations may have undischarged as-
sumptions, we should consider such a derivation, say, of A from
undischarged assumptions 𝛤, as a function that turns construc-
tions of all B ∈ 𝛤 into a construction of A. If there is a derivation
of A from no undischarged assumptions, then there is a construc-
tion of A in the sense of the BHK interpretation. For the purpose
of the discussion, however, we’ll suppress the 𝛤 when not needed.
An assumption A by itself is a derivation of A from the undis-
charged assumption A. This agrees with the BHK-interpretation:
the identity function on constructions turns any construction of A
into a construction of A.

Conjunction

 A    B               A ∧ B             A ∧ B
------- ∧Intro       ------- ∧Elim     ------- ∧Elim
 A ∧ B                  A                 B

Suppose we have constructions N 1 , N 2 of A1 and A2 , respec-


tively. Then we also have a construction of A1 ∧ A2 , namely the pair
⟨N 1 ,N 2 ⟩.

A construction of A1 ∧ A2 on the BHK interpretation is a pair


⟨N 1 ,N 2 ⟩. So assume we have such a pair. Then we also have a
construction of each conjunct: N 1 is a construction of A1 and N 2
is a construction of A2 .

Conditional
 [A]u
  ⋮
  B                       A → B    A
------- →Intro, u        ------------- →Elim
 A → B                         B

If we have a derivation of B from undischarged assumption A,


then there is a function f that turns constructions of A into con-
structions of B. That same function is a construction of A → B.
So, if the premise of →Intro has a construction conditional on a
construction of A, the conclusion A → B has a construction.
On the other hand, suppose there are constructions N of A
and f of A → B. A construction of A → B is a function that turns
constructions of A into constructions of B. So, f (N ) is a con-
struction of B, i.e., the conclusion of →Elim has a construction.

Disjunction

   A                  B                        [A]n   [B]n
------- ∨Intro     ------- ∨Intro               ⋮      ⋮
 A ∨ B              A ∨ B             A ∨ B     C      C
                                     ----------------------- ∨Elim, n
                                                C

If we have a construction Ni of Ai we can turn it into a con-


struction ⟨i ,Ni ⟩ of A1 ∨A2 . On the other hand, suppose we have a
construction of A1 ∨ A2 , i.e., a pair ⟨i ,Ni ⟩ where Ni is a construc-
tion of Ai , and also functions f1 , f2 , which turn constructions of

A1 , A2 , respectively, into constructions of C . Then fi (Ni ) is a


construction of C , the conclusion of ∨Elim.

Absurdity

  ⊥
 --- ⊥I
  A

If we have a derivation of ⊥ from undischarged assump-


tions B 1 , . . . , Bn , then there is a function f (M1 , . . . ,Mn ) that turns
constructions of B 1 , . . . , Bn into a construction of ⊥. Since ⊥ has
no construction, there cannot be any constructions of all of B 1 ,
. . . , Bn either. Hence, f also has the property that if M1 , . . . , Mn
are constructions of B 1 , . . . , Bn , respectively, then f (M1 , . . . ,Mn )
is a construction of A.

Rules for ¬
Since ¬A is defined as A → ⊥, we strictly speaking do not need
rules for ¬. But if we did, this is what they’d look like:

 [A]n
  ⋮                      ¬A    A
  ⊥                     ---------- ¬Elim
 ----- ¬Intro, n            ⊥
  ¬A

Examples of Derivations
1. ⊢ A → (¬A → ⊥), i.e., ⊢ A → ((A → ⊥) → ⊥)

  [A]2    [A → ⊥]1
 ------------------- →Elim
          ⊥
 ------------------- →Intro, 1
     (A → ⊥) → ⊥
 ------------------- →Intro, 2
  A → ((A → ⊥) → ⊥)

2. ⊢ ((A ∧ B) → C ) → (A → (B → C ))

                       [A]2   [B]1
                      ------------- ∧Intro
  [(A ∧ B) → C]3         A ∧ B
 ----------------------------------- →Elim
                 C
 ----------------------------------- →Intro, 1
              B → C
 ----------------------------------- →Intro, 2
           A → (B → C)
 ----------------------------------- →Intro, 3
  ((A ∧ B) → C) → (A → (B → C))

3. ⊢ ¬(A ∧ ¬A), i.e., ⊢ (A ∧ (A → ⊥)) → ⊥

 [A ∧ (A → ⊥)]1           [A ∧ (A → ⊥)]1
 --------------- ∧Elim    --------------- ∧Elim
      A → ⊥                      A
 ------------------------------------------ →Elim
                     ⊥
 ------------------------------------------ →Intro, 1
            (A ∧ (A → ⊥)) → ⊥

4. ⊢ ¬¬(A ∨ ¬A), i.e., ⊢ ((A ∨ (A → ⊥)) → ⊥) → ⊥

                                               [A]1
                                          ------------- ∨Intro
                 [(A ∨ (A → ⊥)) → ⊥]2      A ∨ (A → ⊥)
                 -------------------------------------- →Elim
                                  ⊥
                              --------- →Intro, 1
                                A → ⊥
                             ------------- ∨Intro
  [(A ∨ (A → ⊥)) → ⊥]2        A ∨ (A → ⊥)
 ------------------------------------------ →Elim
                    ⊥
 ------------------------------------------ →Intro, 2
         ((A ∨ (A → ⊥)) → ⊥) → ⊥

Proposition 19.8. If 𝛤 ⊢ A in intuitionistic logic, 𝛤 ⊢ A in classical


logic. In particular, if A is an intuitionistic theorem, it is also a classical
theorem.
Proof. Every natural deduction rule is also a rule in classical nat-
ural deduction, so every derivation in intuitionistic logic is also
a derivation in classical logic. □

19.5 Axiomatic Derivations


Axiomatic derivations for intuitionistic propositional logic are
the conceptually simplest, and historically first, derivation sys-
tems. They work just as in classical propositional logic.

Definition 19.9 (Derivability). If 𝛤 is a set of formulas of L


then a derivation from 𝛤 is a finite sequence A1 , . . . , An of formulas
where for each i ≤ n one of the following holds:

1. Ai ∈ 𝛤; or

2. Ai is an axiom; or

3. Ai follows from some A j and Ak with j < i and k < i by


modus ponens, i.e., Ak ≡ A j → Ai .
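
Definition 19.9 suggests a simple mechanical check. The sketch below is our own illustration: formulas are represented as nested Python tuples, with ("imp", A, B) standing for A → B, and is_axiom is an assumed test for instances of the axioms given in Definition 19.10.

def is_derivation(seq, premises, is_axiom):
    # seq: a finite sequence (list) of formulas A_1, ..., A_n
    for i, A in enumerate(seq):
        if A in premises or is_axiom(A):
            continue
        # modus ponens: some earlier A_j and A_k with A_k = (A_j → A)
        if any(seq[k] == ("imp", seq[j], A) for j in range(i) for k in range(i)):
            continue
        return False
    return True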

Definition 19.10 (Axioms). The set Ax0 of axioms for intuitionistic
propositional logic consists of all formulas of the following forms:

(A ∧ B) → A (19.1)
(A ∧ B) → B (19.2)
A → (B → (A ∧ B)) (19.3)
A → (A ∨ B) (19.4)
A → (B ∨ A) (19.5)
(A → C ) → ((B → C ) → ((A ∨ B) → C )) (19.6)
A → (B → A) (19.7)
(A → (B → C )) → ((A → B) → (A → C )) (19.8)
⊥→A (19.9)

Definition 19.11 (Derivability). A formula A is derivable from


𝛤, written 𝛤 ⊢ A, if there is a derivation from 𝛤 ending in A.

Definition 19.12 (Theorems). A formula A is a theorem if there



is a derivation of A from the empty set. We write ⊢ A if A is a


theorem and ⊬ A if it is not.

Proposition 19.13. If 𝛤 ⊢ A in intuitionistic logic, 𝛤 ⊢ A in clas-


sical logic. In particular, if A is an intuitionistic theorem, it is also a
classical theorem.

Proof. Every intuitionistic axiom is also a classical axiom, so ev-


ery derivation in intuitionistic logic is also a derivation in classi-
cal logic. □

Problems
Problem 19.1. Give derivations in intutionistic logic of the fol-
lowing.

1. (¬A ∨ B) → (A → B)

2. ¬¬¬A → ¬A

3. ¬¬(A ∧ B) ↔ (¬¬A ∧ ¬¬B)

4. ¬(A ∨ B) ↔ (¬A ∧ ¬B)

5. (¬A ∨ ¬B) → ¬(A ∧ B)

6. ¬¬(A ∧ B) → (¬¬A ∨ ¬¬B)


CHAPTER 20

Semantics
20.1 Introduction
No logic is satisfactorily described without a semantics, and in-
tuitionistic logic is no exception. Whereas for classical logic, the
semantics based on valuations is canonical, there are several com-
peting semantics for intuitionistic logic. None of them are com-
pletely satisfactory in the sense that they give an intuitionistically
acceptable account of the meanings of the connectives.
The semantics based on relational models, similar to the se-
mantics for modal logics, is perhaps the most popular one. In
this semantics, propositional variables are assigned to worlds,
and these worlds are related by an accessibility relation. That re-
lation is always a partial order, i.e., it is reflexive, antisymmetric,
and transitive.
Intuitively, you might think of these worlds as states of knowl-
edge or “evidentiary situations.” A state w ′ is accessible from w
iff, for all we know, w ′ is a possible (future) state of knowledge,
i.e., one that is compatible with what’s known at w. Once a propo-
sition is known, it can’t become un-known, i.e., whenever A is
known at w and Rww ′, A is known at w ′ as well. So “knowledge”
is monotonic with respect to the accessibility relation.
If we define “A is known” as in epistemic logic as “true in all
epistemic alternatives,” then A∧B is known at w if in all epistemic
alternatives, both A and B are known. But since knowledge is


monotonic and R is reflexive, that means that A ∧ B is known


at w iff A and B are known at w. For the same reason, A ∨ B
is known at w iff at least one of them is known. So for ∧ and
∨, the truth conditions of the connectives coincide with those in
classical logic.
The truth conditions for the conditional, however, differ from
classical logic. A → B is known at w iff at no w ′ with Rww ′, A is
known without B also being known. This is not the same as the
condition that A is unknown or B is known at w. For if we know
neither A nor B at w, there might be a future epistemic state w ′
with Rww ′ such that at w ′, A is known without also coming to
know B.
We know ¬A only if there is no possible future epistemic state
in which we know A. Here the idea is that if A were knowable,
then in some possible future epistemic state A becomes known.
Since we can’t know ⊥, in that future epistemic state, we would
know A but not know ⊥.
On this interpretation the principle of excluded middle fails.
For there are some A which we don’t yet know, but which we might
come to know. For such an A, both A and ¬A are unknown, so
A ∨ ¬A is not known. But we do know, e.g., that ¬(A ∧ ¬A). For
no future state in which we know both A and ¬A is possible, and
we know this independently of whether or not we know A or ¬A.
Relational models are not the only available semantics for
intuitionistic logic. The topological semantics is another: here
propositions are interpreted as open sets in a topological space,
and the connectives are interpreted as operations on these sets
(e.g., ∧ corresponds to intersection).

20.2 Relational models


In order to give a precise semantics for intuitionistic proposi-
tional logic, we have to give a definition of what counts as a model
relative to which we can evaluate formulas. On the basis of such
a definition it is then also possible to define semantic notions

such as validity and entailment. One such semantics is given by


relational models.

Definition 20.1. A relational model for intuitionistic proposi-


tional logic is a triple M = ⟨W,R,V ⟩, where

1. W is a non-empty set,

2. R is a partial order (i.e., a reflexive, antisymmetric, and


transitive binary relation) on W , and

3. V is a function assigning to each propositional variable p


a subset of W , such that

4. V is monotone with respect to R, i.e., if w ∈ V (p) and


Rww ′, then w ′ ∈ V (p).

Definition 20.2. We define the notion of A being true at w in M,


M,w ⊩ A, inductively as follows:

1. A ≡ p: M,w ⊩ A iff w ∈ V (p).

2. A ≡ ⊥: not M,w ⊩ A.

3. A ≡ ¬B: M,w ⊩ A iff for no w ′ such that Rww ′, M,w ′ ⊩ B.

4. A ≡ B ∧ C : M,w ⊩ A iff M,w ⊩ B and M,w ⊩ C .

5. A ≡ B ∨ C : M,w ⊩ A iff M,w ⊩ B or M,w ⊩ C (or both).

6. A ≡ B → C : M,w ⊩ A iff for every w ′ such that Rww ′, not


M,w ′ ⊩ B or M,w ′ ⊩ C (or both).

We write M,w ⊮ A if not M,w ⊩ A. If 𝛤 is a set of formulas,


M,w ⊩ 𝛤 means M,w ⊩ B for all B ∈ 𝛤.
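
For finite relational models, Definition 20.2 can be implemented directly. The following Python sketch (the representation and names are ours) checks, for instance, that p ∨ ¬p fails at the root of a two-world model, while ¬(p ∧ ¬p) holds there.

# Formulas as tuples: ("var", "p"), ("bot",), ("and", A, B), ("or", A, B),
# ("imp", A, B), and ("not", A) as an abbreviation for A → ⊥.

def forces(W, R, V, w, A):
    op = A[0]
    if op == "bot":
        return False
    if op == "var":
        return w in V[A[1]]
    if op == "and":
        return forces(W, R, V, w, A[1]) and forces(W, R, V, w, A[2])
    if op == "or":
        return forces(W, R, V, w, A[1]) or forces(W, R, V, w, A[2])
    if op == "imp":
        return all(not forces(W, R, V, v, A[1]) or forces(W, R, V, v, A[2])
                   for v in W if (w, v) in R)
    if op == "not":
        return all(not forces(W, R, V, v, A[1]) for v in W if (w, v) in R)

W = {"w", "v"}
R = {("w", "w"), ("v", "v"), ("w", "v")}   # a partial order: w ≤ v
V = {"p": {"v"}}                           # p becomes true only at v (monotone)

p = ("var", "p")
print(forces(W, R, V, "w", ("or", p, ("not", p))))            # False
print(forces(W, R, V, "w", ("not", ("and", p, ("not", p)))))  # True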

Proposition 20.3. Truth at worlds is monotonic with respect to R,


i.e., if M,w ⊩ A and Rww ′, then M,w ′ ⊩ A.

Proof. Exercise. □

20.3 Semantic Notions

Definition 20.4. We say A is true in the model M = ⟨W,R,V ⟩,


M ⊩ A, iff M,w ⊩ A for all w ∈ W . A is valid, ⊨ A, iff it is true
in all models. We say a set of formulas 𝛤 entails A, 𝛤 ⊨ A, iff for
every model M and every w such that M,w ⊩ 𝛤, M,w ⊩ A.

Proposition 20.5. 1. If M,w ⊩ 𝛤 and 𝛤 ⊨ A, then M,w ⊩ A.

2. If M ⊩ 𝛤 and 𝛤 ⊨ A, then M ⊩ A.

Proof. 1. Immediate from Definition 20.4: 𝛤 ⊨ A means that
M,w ⊩ A whenever M,w ⊩ 𝛤.

2. Suppose M ⊩ 𝛤, i.e., M,u ⊩ 𝛤 for every u ∈ W . Since
𝛤 ⊨ A, by (1) we have M,u ⊩ A for every u ∈ W , i.e., M ⊩ A. □

Definition 20.6. Suppose M is a relational model and w ∈ W .


The restriction Mw = ⟨Ww ,Rw ,Vw ⟩ of M to w is given by:

Ww = {u ∈ W : Rwu },
Rw = R ∩ (Ww ) 2 , and
Vw (p) = V (p) ∩ Ww .

Proposition 20.7. M,w ⊩ A iff Mw ⊩ A.



Proposition 20.8. Suppose for every model M such that M ⊩ 𝛤,


M ⊩ A. Then 𝛤 ⊨ A.

Proof. Suppose that M,w ⊩ 𝛤. By Proposition 20.7 applied


to every B ∈ 𝛤, we have Mw ⊩ 𝛤. By the assumption, we have
Mw ⊩ A. By Proposition 20.7 again, we get M,w ⊩ A. □

20.4 Topological Semantics


Another way to provide a semantics for intuitionistic logic is us-
ing the mathematical concept of a topology.

Definition 20.9. Let X be a set. A topology on X is a set O ⊆


℘(X ) that satisfies the properties below. The elements of O are
called the open sets of the topology. The set X together with O is
called a topological space.

1. The empty set and the entire space are open: ∅, X ∈ O.

2. Open sets are closed under finite intersections: if U , V ∈ O


then U ∩ V ∈ O

3. Open sets are closed under arbitrary unions: if Ui ∈ O for
all i ∈ I , then ⋃{Ui : i ∈ I } ∈ O.

We may write X for a topology if the collection of open sets


can be inferred from the context; note that, still, only after X is
endowed with open sets can it be called a topology.

Definition 20.10. A topological model of intuitionistic proposi-


tional logic is a triple X = ⟨X , O,V ⟩ where O is a topology on X
and V is a function assigning an open set in O to each proposi-
tional variable.
Given a topological model X, we can define ⟦A⟧X inductively
as follows:

1. ⟦⊥⟧X = ∅

2. ⟦p⟧X = V (p)

3. ⟦A ∧ B⟧X = ⟦A⟧X ∩ ⟦B⟧X

4. ⟦A ∨ B⟧X = ⟦A⟧X ∪ ⟦B⟧X

5. ⟦A → B⟧X = Int((X \ ⟦A⟧X ) ∪ ⟦B⟧X )

Here, Int(V ) is the function that maps a set V ⊆ X to its interior,
that is, the union of all open sets it contains. In other words,

Int(V ) = ⋃{U : U ⊆ V and U ∈ O}.

Note that the interior of any set is always open, since it is a
union of open sets. Thus, ⟦A⟧X is always an open set.
Although topological semantics is highly abstract, there are
ways to think about it that might motivate it. Suppose that the
elements, or “points,” of X are points at which statements can be
evaluated. The set of all points where A is true is the proposition
expressed by A. Not every set of points is a potential proposition;
only the elements of O are. A ⊨ B iff B is true at every point at
which A is true, i.e., ⟦A⟧X ⊆ ⟦B⟧X , for all X. The absurd state-
ment ⊥ is never true, so ⟦⊥⟧X = ∅. How must the propositions
expressed by B ∧ C , B ∨ C , and B → C be related to those ex-
pressed by B and C for the intuitionistically valid laws to hold,
i.e., so that A ⊢ B iff ⟦A⟧X ⊆ ⟦B⟧X ? ⊥ ⊢ A for any A, and ∅ is
the only set that is a subset of every U . Since B ∧ C ⊢ B, we need
⟦B ∧ C⟧X ⊆ ⟦B⟧X , and similarly ⟦B ∧ C⟧X ⊆ ⟦C⟧X . The largest
set satisfying W ⊆ U and W ⊆ V is U ∩ V . Conversely, B ⊢ B ∨ C
and C ⊢ B ∨ C , and so ⟦B⟧X ⊆ ⟦B ∨ C⟧X and ⟦C⟧X ⊆ ⟦B ∨ C⟧X .
The smallest set W such that U ⊆ W and V ⊆ W is U ∪ V . The
definition for → is tricky: A → B expresses the weakest proposition
that, combined with A, entails B. That A → B combined with A
entails B is clear from (A → B) ∧ A ⊢ B. So ⟦A → B⟧X should be
the greatest open set such that ⟦A → B⟧X ∩ ⟦A⟧X ⊆ ⟦B⟧X , leading
to our definition.
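
In a finite topological space the clauses above can be computed directly. Here is a small Python sketch of our own: the open sets of a toy topology on X = {0, 1, 2} are listed by hand, and interiors are computed as unions of open subsets.

X = frozenset({0, 1, 2})
opens = [frozenset(), frozenset({0}), frozenset({0, 1}), X]   # a topology on X

def interior(S):
    # Union of all open sets contained in S.
    result = set()
    for U in opens:
        if U <= S:
            result |= U
    return frozenset(result)

def val_imp(A_val, B_val):
    # [[A → B]] = Int((X \ [[A]]) ∪ [[B]])
    return interior((X - A_val) | B_val)

p_val = frozenset({0})                    # V(p) must be an open set
not_p_val = val_imp(p_val, frozenset())   # [[¬p]] = Int(X \ [[p]]) = ∅ here
print(p_val | not_p_val)                  # frozenset({0}): p ∨ ¬p is not true everywhere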

Problems
Problem 20.1. Show that according to Definition 20.2, M,w ⊩
¬A iff M,w ⊩ A → ⊥.

Problem 20.2. Prove Proposition 20.3.

Problem 20.3. Prove Proposition 20.7.


PART IX

Wait, hear me out: what if it’s both true and false?

CHAPTER 21

Paraconsistent logics
To p and to ¬p, that is the question.

PART X

Appendices

APPENDIX A

Sets
A.1 Extensionality
A set is a collection of objects, considered as a single object. The
objects making up the set are called elements or members of the
set. If x is an element of a set a, we write x ∈ a; if not, we write
x ∉ a. The set which has no elements is called the empty set and
denoted “∅”.
It does not matter how we specify the set, or how we order
its elements, or indeed how many times we count its elements.
All that matters are what its elements are. We codify this in the
following principle.

Definition A.1 (Extensionality). If A and B are sets, then A =


B iff every element of A is also an element of B, and vice versa.

Extensionality licenses some notation. In general, when we


have some objects a1 , . . . , an , then {a1 , . . . ,an } is the set whose
elements are a1 , . . . ,an . We emphasise the word “the”, since ex-
tensionality tells us that there can be only one such set. Indeed,
extensionality also licenses the following:

{a,a,b } = {a,b } = {b,a}.


This delivers on the point that, when we consider sets, we don’t


care about the order of their elements, or how many times they
are specified.

Example A.2. Whenever you have a bunch of objects, you can


collect them together in a set. The set of Richard’s siblings, for
instance, is a set that contains one person, and we could write it as
S = {Ruth}. The set of positive integers less than 4 is {1, 2, 3}, but
it can also be written as {3, 2, 1} or even as {1, 2, 1, 2, 3}. These are
all the same set, by extensionality. For every element of {1, 2, 3}
is also an element of {3, 2, 1} (and of {1, 2, 1, 2, 3}), and vice versa.

Frequently we’ll specify a set by some property that its ele-


ments share. We’ll use the following shorthand notation for that:
{x : 𝜑(x)}, where the 𝜑(x) stands for the property that x has to
have in order to be counted among the elements of the set.

Example A.3. In our example, we could have specified S also


as
S = {x : x is a sibling of Richard}.

Example A.4. A number is called perfect iff it is equal to the


sum of its proper divisors (i.e., numbers that evenly divide it but
aren’t identical to the number). For instance, 6 is perfect because
its proper divisors are 1, 2, and 3, and 6 = 1 + 2 + 3. In fact, 6
is the only positive integer less than 10 that is perfect. So, using
extensionality, we can say:

{6} = {x : x is perfect and 0 ≤ x ≤ 10}

We read the notation on the right as “the set of x’s such that x
is perfect and 0 ≤ x ≤ 10”. The identity here confirms that,
when we consider sets, we don’t care about how they are spec-
ified. And, more generally, extensionality guarantees that there
is always only one set of x’s such that 𝜑(x). So, extensionality
justifies calling {x : 𝜑(x)} the set of x’s such that 𝜑(x).

Extensionality gives us a way for showing that sets are iden-


tical: to show that A = B, show that whenever x ∈ A then also
x ∈ B, and whenever y ∈ B then also y ∈ A.

A.2 Subsets and Power Sets


We will often want to compare sets. And one obvious kind of
comparison one might make is as follows: everything in one set is
in the other too. This situation is sufficiently important for us to
introduce some new notation.

Definition A.5 (Subset). If every element of a set A is also


an element of B, then we say that A is a subset of B, and write
A ⊆ B. If A is not a subset of B we write A ⊈ B. If A ⊆ B but
A ≠ B, we write A ⊊ B and say that A is a proper subset of B.

Example A.6. Every set is a subset of itself, and ∅ is a subset of


every set. The set of even numbers is a subset of the set of natural
numbers. Also, {a,b } ⊆ {a,b,c }. But {a,b,e } is not a subset of
{a,b,c }.

Example A.7. The number 2 is an element of the set of integers,


whereas the set of even numbers is a subset of the set of integers.
However, a set may happen to both be an element and a subset
of some other set, e.g., {0} ∈ {0, {0}} and also {0} ⊆ {0, {0}}.

Extensionality gives a criterion of identity for sets: A = B


iff every element of A is also an element of B and vice versa.
The definition of “subset” defines A ⊆ B precisely as the first
half of this criterion: every element of A is also an element of B.
Of course the definition also applies if we switch A and B: that
is, B ⊆ A iff every element of B is also an element of A. And
that, in turn, is exactly the “vice versa” part of extensionality. In
other words, extensionality entails that sets are equal iff they are
subsets of one another.

Proposition A.8. A = B iff both A ⊆ B and B ⊆ A.

Now is also a good opportunity to introduce some further


bits of helpful notation. In defining when A is a subset of B
we said that “every element of A is . . . ,” and filled the “. . . ” with
“an element of B”. But this is such a common shape of expression
that it will be helpful to introduce some formal notation for it.

Definition A.9. (∀x ∈ A)𝜑 abbreviates ∀x (x ∈ A→𝜑). Similarly,


(∃x ∈ A)𝜑 abbreviates ∃x (x ∈ A ∧ 𝜑).

Using this notation, we can say that A ⊆ B iff (∀x ∈ A)x ∈ B.


Now we move on to considering a certain kind of set: the set
of all subsets of a given set.

Definition A.10 (Power Set). The set consisting of all subsets


of a set A is called the power set of A, written ℘(A).

℘(A) = {B : B ⊆ A}

Example A.11. What are all the possible subsets of {a,b,c }?


They are: ∅, {a}, {b }, {c }, {a,b }, {a,c }, {b,c }, {a,b,c }. The set
of all these subsets is ℘({a,b,c }):

℘({a,b,c }) = {∅, {a}, {b }, {c }, {a,b }, {b,c }, {a,c }, {a,b,c }}
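
For a finite set, the power set can also be enumerated mechanically, e.g., with Python's standard itertools module (a small illustration, not part of the official definitions):

from itertools import chain, combinations

def power_set(A):
    elems = list(A)
    subsets = chain.from_iterable(combinations(elems, k) for k in range(len(elems) + 1))
    return [set(s) for s in subsets]

print(len(power_set({"a", "b", "c"})))   # 8 subsets, as in Example A.11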

A.3 Some Important Sets


Example A.12. We will mostly be dealing with sets whose el-
ements are mathematical objects. Four such sets are important
enough to have specific names:

N = {0, 1, 2, 3, . . .}
the set of natural numbers
Z = {. . . , −2, −1, 0, 1, 2, . . .}

the set of integers


Q = {m/n : m,n ∈ Z and n ≠ 0}
the set of rationals
R = (−∞, ∞)
the set of real numbers (the continuum)

These are all infinite sets, that is, they each have infinitely many
elements.
As we move through these sets, we are adding more numbers
to our stock. Indeed, it should be clear that N ⊆ Z ⊆ Q ⊆ R:
after all, every natural number is an integer; every integer is a
rational; and every rational is a real. Equally, it should be clear
that N ⊊ Z ⊊ Q, since −1 is an integer but not a natural number,
and 1/2 is rational but not integer. It is less obvious that Q ⊊ R,
i.e., that there are some real numbers which are not rational.
We’ll sometimes also use the set of positive integers Z+ =
{1, 2, 3, . . . } and the set containing just the first two natural num-
bers B = {0, 1}.

Example A.13 (Strings). Another interesting example is the


set A∗ of finite strings over an alphabet A: any finite sequence
of elements of A is a string over A. We include the empty string 𝛬
among the strings over A, for every alphabet A. For instance,

B∗ = {𝛬, 0, 1, 00, 01, 10, 11,


000, 001, 010, 011, 100, 101, 110, 111, 0000, . . .}.

If x = x 1 . . . x n ∈ A∗ is a string consisting of n “letters” from A,


then we say the length of the string is n and write len(x) = n.

Example A.14 (Infinite sequences). For any set A we may


also consider the set A𝜔 of infinite sequences of elements of A.
An infinite sequence a 1 a2 a3 a 4 . . . consists of a one-way infinite
list of objects, each one of which is an element of A.
Figure A.1: The union A ∪ B of two sets is the set of elements of A together
with those of B.

A.4 Unions and Intersections


In appendix A.1, we introduced definitions of sets by abstraction,
i.e., definitions of the form {x : 𝜑(x)}. Here, we invoke some
property 𝜑, and this property can mention sets we’ve already
defined. So for instance, if A and B are sets, the set {x : x ∈
A∨x ∈ B } consists of all those objects which are elements of either
A or B, i.e., it’s the set that combines the elements of A and B.
We can visualize this as in Figure A.1, where the highlighted area
indicates the elements of the two sets A and B together.
This operation on sets—combining them—is very useful and
common, and so we give it a formal name and a symbol.

Definition A.15 (Union). The union of two sets A and B, writ-


ten A ∪ B, is the set of all things which are elements of A, B, or
both.
A ∪ B = {x : x ∈ A ∨ x ∈ B }

Example A.16. Since the multiplicity of elements doesn’t mat-


ter, the union of two sets which have an element in common con-
tains that element only once, e.g., {a,b,c }∪{a, 0, 1} = {a,b,c , 0, 1}.
The union of a set and one of its subsets is just the bigger set:
{a,b,c } ∪ {a} = {a,b,c }.

Figure A.2: The intersection A ∩ B of two sets is the set of elements they have
in common.

The union of a set with the empty set is identical to the set:
{a,b,c } ∪ ∅ = {a,b,c }.

We can also consider a “dual” operation to union. This is the


operation that forms the set of all elements that are elements of A
and are also elements of B. This operation is called intersection,
and can be depicted as in Figure A.2.

Definition A.17 (Intersection). The intersection of two sets A


and B, written A ∩ B, is the set of all things which are elements
of both A and B.

A ∩ B = {x : x ∈ A ∧ x ∈ B }

Two sets are called disjoint if their intersection is empty. This


means they have no elements in common.

Example A.18. If two sets have no elements in common, their


intersection is empty: {a,b,c } ∩ {0, 1} = ∅.
If two sets do have elements in common, their intersection is
the set of all those: {a,b,c } ∩ {a,b,d } = {a,b }.
The intersection of a set with one of its subsets is just the
smaller set: {a,b,c } ∩ {a,b } = {a,b }.

The intersection of any set with the empty set is empty:


{a,b,c } ∩ ∅ = ∅.

We can also form the union or intersection of more than two


sets. An elegant way of dealing with this in general is the follow-
ing: suppose you collect all the sets you want to form the union
(or intersection) of into a single set. Then we can define the union
of all our original sets as the set of all objects which belong to at
least one element of the set, and the intersection as the set of all
objects which belong to every element of the set.

Definition A.19. If A is a set of sets, then ⋃A is the set of
elements of elements of A:

⋃A = {x : x belongs to an element of A}, i.e.,
   = {x : there is a B ∈ A so that x ∈ B }

Definition A.20. If A is a set of sets, then ⋂A is the set of
objects which all elements of A have in common:

⋂A = {x : x belongs to every element of A}, i.e.,
   = {x : for all B ∈ A, x ∈ B }

Example A.21. Suppose A = {{a,b }, {a,d ,e }, {a,d }}. Then
⋃A = {a,b,d ,e } and ⋂A = {a}.

We could also do the same for a sequence of sets A1 , A2 , . . .


⋃i Ai = {x : x belongs to one of the Ai }
⋂i Ai = {x : x belongs to every Ai }.

When we have an index of sets, i.e., some set I such that


we are considering Ai for each i ∈ I , we may also use these

Figure A.3: The difference A \ B of two sets is the set of those elements of A
which are not also elements of B.

abbreviations:
⋃i ∈I Ai = ⋃{Ai : i ∈ I }
⋂i ∈I Ai = ⋂{Ai : i ∈ I }
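
In Python, unions and intersections of a finite family of sets can be computed much as in Example A.21 (a small illustration):

family = [{"a", "b"}, {"a", "d", "e"}, {"a", "d"}]
print(set().union(*family))        # {'a', 'b', 'd', 'e'}
print(set.intersection(*family))   # {'a'}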

Finally, we may want to think about the set of all elements


in A which are not in B. We can depict this as in Figure A.3.

Definition A.22 (Difference). The set difference A \ B is the set


of all elements of A which are not also elements of B, i.e.,

A \ B = {x : x ∈ A and x ∉ B }.

A.5 Pairs, Tuples, Cartesian Products


It follows from extensionality that sets have no order to their
elements. So if we want to represent order, we use ordered pairs
⟨x, y⟩. In an unordered pair {x, y }, the order does not matter:
{x, y } = {y,x }. In an ordered pair, it does: if x ≠ y, then ⟨x, y⟩ ≠
⟨y,x⟩.
How should we think about ordered pairs in set theory? Cru-
cially, we want to preserve the idea that ordered pairs are iden-
tical iff they share the same first element and share the same

second element, i.e.:

⟨a,b⟩ = ⟨c ,d ⟩ iff both a = c and b = d .

We can define ordered pairs in set theory using the Wiener-


Kuratowski definition.

Definition A.23 (Ordered pair). ⟨a,b⟩ = {{a}, {a,b }}.

Having fixed a definition of an ordered pair, we can use it


to define further sets. For example, sometimes we also want or-
dered sequences of more than two objects, e.g., triples ⟨x, y, z ⟩,
quadruples ⟨x, y, z ,u⟩, and so on. We can think of triples as spe-
cial ordered pairs, where the first element is itself an ordered pair:
⟨x, y, z ⟩ is ⟨⟨x, y⟩, z ⟩. The same is true for quadruples: ⟨x, y, z ,u⟩
is ⟨⟨⟨x, y⟩, z ⟩,u⟩, and so on. In general, we talk of ordered n-tuples
⟨x 1 , . . . ,x n ⟩.
Certain sets of ordered pairs, or other ordered n-tuples, will
be useful.

Definition A.24 (Cartesian product). Given sets A and B,


their Cartesian product A × B is defined by

A × B = {⟨x, y⟩ : x ∈ A and y ∈ B }.

Example A.25. If A = {0, 1}, and B = {1,a,b }, then their prod-


uct is

A × B = {⟨0, 1⟩, ⟨0,a⟩, ⟨0,b⟩, ⟨1, 1⟩, ⟨1,a⟩, ⟨1,b⟩}.

Example A.26. If A is a set, the product of A with itself, A × A,


is also written A2 . It is the set of all pairs ⟨x, y⟩ with x, y ∈ A. The
set of all triples ⟨x, y, z ⟩ is A3 , and so on. We can give a recursive
definition:

A1 = A
Ak +1 = Ak × A

Proposition A.27. If A has n elements and B has m elements, then


A × B has n · m elements.

Proof. For every element x in A, there are m elements of the form


⟨x, y⟩ ∈ A × B. Let Bx = {⟨x, y⟩ : y ∈ B }. Since whenever x 1 ≠ x 2 ,
⟨x 1 , y⟩ ≠ ⟨x 2 , y⟩, Bx 1 ∩ Bx 2 = ∅. But if A = {x 1 , . . . ,x n }, then
A × B = Bx 1 ∪ · · · ∪ Bxn , and so has n · m elements.
To visualize this, arrange the elements of A × B in a grid:

Bx 1 = {⟨x 1 , y 1 ⟩ ⟨x 1 , y 2 ⟩ . . . ⟨x 1 , y m ⟩}
Bx 2 = {⟨x 2 , y 1 ⟩ ⟨x 2 , y 2 ⟩ . . . ⟨x 2 , y m ⟩}
⋮
Bxn = {⟨x n , y 1 ⟩ ⟨x n , y 2 ⟩ . . . ⟨x n , y m ⟩}

Since the x i are all different, and the y j are all different, no two of
the pairs in this grid are the same, and there are n · m of them.□
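
The counting argument of Proposition A.27 is easy to check on small examples (an illustration only):

A = {"x1", "x2", "x3"}
B = {"y1", "y2"}
product = {(x, y) for x in A for y in B}
print(len(product) == len(A) * len(B))   # True: 3 · 2 = 6 pairs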

Example A.28. If A is a set, a word over A is any sequence of


elements of A. A sequence can be thought of as an n-tuple of ele-
ments of A. For instance, if A = {a,b,c }, then the sequence “bac ”
can be thought of as the triple ⟨b,a,c ⟩. Words, i.e., sequences of
symbols, are of crucial importance in computer science. By con-
vention, we count elements of A as sequences of length 1, and ∅
as the sequence of length 0. The set of all words over A then is

A∗ = {∅} ∪ A ∪ A2 ∪ A3 ∪ . . .

A.6 Russell’s Paradox


Extensionality licenses the notation {x : 𝜑(x)}, for the set of x’s
such that 𝜑(x). However, all that extensionality really licenses is
the following thought. If there is a set whose members are all
and only the 𝜑’s, then there is only one such set. Otherwise put:
having fixed some 𝜑, the set {x : 𝜑(x)} is unique, if it exists.
But this conditional is important! Crucially, not every prop-
erty lends itself to comprehension. That is, some properties do not

define sets. If they all did, then we would run into outright contra-
dictions. The most famous example of this is Russell’s Paradox.
Sets may be elements of other sets—for instance, the power
set of a set A is made up of sets. And so it makes sense to ask or
investigate whether a set is an element of another set. Can a set
be a member of itself? Nothing about the idea of a set seems to
rule this out. For instance, if all sets form a collection of objects,
one might think that they can be collected into a single set—the
set of all sets. And it, being a set, would be an element of the set
of all sets.
Russell’s Paradox arises when we consider the property of not
having itself as an element, of being non-self-membered. What if we
suppose that there is a set of all sets that do not have themselves
as an element? Does

R = {x : x ∉ x }

exist? It turns out that we can prove that it does not.

Theorem A.29 (Russell’s Paradox). There is no set R = {x : x ∉


x }.

Proof. For reductio, suppose that R = {x : x ∉ x } exists. Then R ∈


R iff R ∉ R, by the definition of R. But this is a contradiction. □

Let’s run through the proof that no set R of non-self-


membered sets can exist more slowly. If R exists, it makes sense
to ask if R ∈ R or not—it must be either ∈ R or ∉ R. Suppose
the former is true, i.e., R ∈ R. R was defined as the set of all
sets that are not elements of themselves, and so if R ∈ R, then R
does not have this defining property of R. But only sets that have
this property are in R, hence, R cannot be an element of R, i.e.,
R ∉ R. But R can’t both be and not be an element of R, so we
have a contradiction.
Since the assumption that R ∈ R leads to a contradiction, we
have R ∉ R. But this also leads to a contradiction! For if R ∉ R, it
does have the defining property of R, and so would be an element

of R just like all the other non-self-membered sets. And again, it


can’t both not be and be an element of R.
How do we set up a set theory which avoids falling into Rus-
sell’s Paradox, i.e., which avoids making the inconsistent claim that
R = {x : x ∉ x } exists? Well, we would need to lay down axioms
which give us very precise conditions for stating when sets exist
(and when they don’t).
The set theory sketched in this chapter doesn’t do this. It’s
genuinely naïve. It tells you only that sets obey extensionality and
that, if you have some sets, you can form their union, intersection,
etc. It is possible to develop set theory more rigorously than
this.

Problems
Problem A.1. Prove that there is at most one empty set, i.e.,
show that if A and B are sets without elements, then A = B.

Problem A.2. List all subsets of {a,b,c ,d }.

Problem A.3. Show that if A has n elements, then ℘(A) has 2n


elements.

Problem A.4. Prove that if A ⊆ B, then A ∪ B = B.

Problem A.5. Prove rigorously that if A ⊆ B, then A ∩ B = A.

Problem A.6. Show that if A is a set and A ∈ B, then A ⊆ ⋃B.

Problem A.7. Prove that if A ⊊ B, then B \ A ≠ ∅.

Problem A.8. Using Definition A.23, prove that ⟨a,b⟩ = ⟨c ,d ⟩


iff both a = c and b = d .

Problem A.9. List all elements of {1, 2, 3}3 .

Problem A.10. Show, by induction on k , that for all k ≥ 1, if A


has n elements, then Ak has n k elements.
APPENDIX B

Relations
B.1 Relations as Sets
In appendix A.3, we mentioned some important sets: N, Z, Q, R.
You will no doubt remember some interesting relations between
the elements of some of these sets. For instance, each of these sets
has a completely standard order relation on it. There is also the
relation is identical with that every object bears to itself and to no
other thing. There are many more interesting relations that we’ll
encounter, and even more possible relations. Before we review
them, though, we will start by pointing out that we can look at
relations as a special sort of set.
For this, recall two things from appendix A.5. First, recall
the notion of an ordered pair: given a and b, we can form ⟨a,b⟩.
Importantly, the order of elements does matter here. So if a ≠ b
then ⟨a,b⟩ ≠ ⟨b,a⟩. (Contrast this with unordered pairs, i.e., 2-
element sets, where {a,b } = {b,a}.) Second, recall the notion of
a Cartesian product: if A and B are sets, then we can form A × B,
the set of all pairs ⟨x, y⟩ with x ∈ A and y ∈ B. In particular,
A2 = A × A is the set of all ordered pairs from A.
Now we will consider a particular relation on a set: the <-
relation on the set N of natural numbers. Consider the set of all
pairs of numbers ⟨n,m⟩ where n < m, i.e.,

R = {⟨n,m⟩ : n,m ∈ N and n < m}.


There is a close connection between n being less than m, and the


pair ⟨n,m⟩ being a member of R, namely:

n < m iff ⟨n,m⟩ ∈ R.

Indeed, without any loss of information, we can consider the set


R to be the <-relation on N.
In the same way we can construct a subset of N2 for any rela-
tion between numbers. Conversely, given any set of pairs of num-
bers S ⊆ N2 , there is a corresponding relation between numbers,
namely, the relationship n bears to m if and only if ⟨n,m⟩ ∈ S .
This justifies the following definition:

Definition B.1 (Binary relation). A binary relation on a set A


is a subset of A2 . If R ⊆ A2 is a binary relation on A and x, y ∈ A,
we sometimes write Rxy (or xRy) for ⟨x, y⟩ ∈ R.

Example B.2. The set N2 of pairs of natural numbers can be


listed in a 2-dimensional matrix like this:
⟨0, 0⟩ ⟨0, 1⟩ ⟨0, 2⟩ ⟨0, 3⟩ ...
⟨1, 0⟩ ⟨1, 1⟩ ⟨1, 2⟩ ⟨1, 3⟩ ...
⟨2, 0⟩ ⟨2, 1⟩ ⟨2, 2⟩ ⟨2, 3⟩ ...
⟨3, 0⟩ ⟨3, 1⟩ ⟨3, 2⟩ ⟨3, 3⟩ ...
⋮        ⋮        ⋮        ⋮        ⋱

We have put the diagonal, here, in bold, since the subset of N2


consisting of the pairs lying on the diagonal, i.e.,

{⟨0, 0⟩, ⟨1, 1⟩, ⟨2, 2⟩, . . . },

is the identity relation on N. (Since the identity relation is popular,


let’s define IdA = {⟨x,x⟩ : x ∈ A} for any set A.) The subset of
all pairs lying above the diagonal, i.e.,

L = {⟨0, 1⟩, ⟨0, 2⟩, . . . , ⟨1, 2⟩, ⟨1, 3⟩, . . . , ⟨2, 3⟩, ⟨2, 4⟩, . . .},

is the less than relation, i.e., Lnm iff n < m. The subset of pairs
below the diagonal, i.e.,

G = {⟨1, 0⟩, ⟨2, 0⟩, ⟨2, 1⟩, ⟨3, 0⟩, ⟨3, 1⟩, ⟨3, 2⟩, . . . },

is the greater than relation, i.e., Gnm iff n > m. The union of L
with the identity relation IdN, which we might call K = L ∪ IdN, is the less than or equal to
relation: Knm iff n ≤ m. Similarly, H = G ∪ IdN is the greater than
or equal to relation. These relations L, G, K, and H are special
kinds of relations called orders. L and G have the property that
no number bears L or G to itself (i.e., for all n, neither Lnn nor
G nn). Relations with this property are called irreflexive, and, if
they also happen to be orders, they are called strict orders.

Although orders and identity are important and natural re-


lations, it should be emphasized that according to our defini-
tion any subset of A2 is a relation on A, regardless of how un-
natural or contrived it seems. In particular, ∅ is a relation on
any set (the empty relation, which no pair of elements bears),
and A2 itself is a relation on A as well (one which every pair
bears), called the universal relation. But also something like
E = {⟨n,m⟩ : n > 5 or m × n ≥ 34} counts as a relation.
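If you like to experiment, relations on finite sets can be represented directly as sets of ordered pairs. The following Python sketch is not part of the formal development and only illustrates the idea; the set A and the bound 5 are arbitrary choices.

```python
# A binary relation on a set A is just a set of ordered pairs from A.
A = set(range(5))  # {0, 1, 2, 3, 4}, a finite stand-in for an initial segment of N

less_than = {(n, m) for n in A for m in A if n < m}   # the <-relation, restricted to A
identity  = {(x, x) for x in A}                       # the diagonal Id_A
universal = {(x, y) for x in A for y in A}            # A x A, the universal relation

# n < m iff the pair (n, m) is an element of the relation:
print((1, 3) in less_than)    # True
print((3, 1) in less_than)    # False
print(identity <= universal)  # True: every relation on A is a subset of A x A
```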

B.2 Special Properties of Relations


Some kinds of relations turn out to be so common that they have
been given special names. For instance, ≤ and ⊆ both relate their
respective domains (say, N in the case of ≤ and ℘(A) in the case
of ⊆) in similar ways. To get at exactly how these relations are
similar, and how they differ, we categorize them according to
some special properties that relations can have. It turns out that
(combinations of) some of these special properties are especially
important: orders and equivalence relations.

Definition B.3 (Reflexivity). A relation R ⊆ A2 is reflexive iff,


for every x ∈ A, Rxx.

Definition B.4 (Transitivity). A relation R ⊆ A2 is transitive


iff, whenever Rxy and Ryz , then also Rxz .

Definition B.5 (Symmetry). A relation R ⊆ A2 is symmetric iff,


whenever Rxy, then also Ryx.

Definition B.6 (Anti-symmetry). A relation R ⊆ A2 is anti-


symmetric iff, whenever both Rxy and Ryx, then x = y (or, in
other words: if x ≠ y then either ¬Rxy or ¬Ryx).

In a symmetric relation, Rxy and Ryx always hold together,


or neither holds. In an anti-symmetric relation, the only way for
Rxy and Ryx to hold together is if x = y. Note that this does not
require that Rxy and Ryx hold when x = y, only that it isn’t ruled
out. So an anti-symmetric relation can be reflexive, but it is not
the case that every anti-symmetric relation is reflexive. Also note
that being anti-symmetric and merely not being symmetric are
different conditions. In fact, a relation can be both symmetric
and anti-symmetric at the same time (e.g., the identity relation
is).

Definition B.7 (Connectivity). A relation R ⊆ A2 is connected


if for all x, y ∈ A, if x ≠ y, then either Rxy or Ryx.

Definition B.8 (Irreflexivity). A relation R ⊆ A2 is called ir-


reflexive if, for all x ∈ A, not Rxx.

Definition B.9 (Asymmetry). A relation R ⊆ A2 is called asym-


metric if for no pair x, y ∈ A we have both Rxy and Ryx.

Note that if A ≠ ∅, then no irreflexive relation on A is reflex-


ive and every asymmetric relation on A is also anti-symmetric.
However, there are R ⊆ A2 that are not reflexive and also not
irreflexive, and there are anti-symmetric relations that are not
asymmetric.
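For finite relations, each of these properties can be checked mechanically. The Python sketch below is only an illustration of Definitions B.3–B.9 (the set A and the test relation are arbitrary choices); each function simply restates the corresponding definition as a check over pairs.

```python
from itertools import product

def is_reflexive(R, A):
    return all((x, x) in R for x in A)

def is_irreflexive(R, A):
    return all((x, x) not in R for x in A)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_asymmetric(R):
    return all((y, x) not in R for (x, y) in R)

def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)

def is_transitive(R):
    return all((x, z) in R for (x, y) in R for (u, z) in R if y == u)

def is_connected(R, A):
    return all((x, y) in R or (y, x) in R for x, y in product(A, A) if x != y)

A = {0, 1, 2, 3}
less_than = {(x, y) for x, y in product(A, A) if x < y}
print(is_irreflexive(less_than, A), is_asymmetric(less_than),
      is_transitive(less_than), is_connected(less_than, A))   # True True True True
```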

B.3 Equivalence Relations


The identity relation on a set is reflexive, symmetric, and transi-
tive. Relations R that have all three of these properties are very
common.

Definition B.10 (Equivalence relation). A relation R ⊆ A2


that is reflexive, symmetric, and transitive is called an equivalence
relation. Elements x and y of A are said to be R-equivalent if Rxy.

Equivalence relations give rise to the notion of an equivalence


class. An equivalence relation “chunks up” the domain into differ-
ent partitions. Within each partition, all the objects are related
to one another; and no objects from different partitions relate
to one another. Sometimes, it’s helpful just to talk about these
partitions directly. To that end, we introduce a definition:

Definition B.11. Let R ⊆ A2 be an equivalence relation. For


each x ∈ A, the equivalence class of x in A is the set [x] R = {y ∈
A : Rxy }. The quotient of A under R is A/R = {[x] R : x ∈ A}, i.e.,
the set of these equivalence classes.

The next result vindicates the definition of an equivalence


class, in proving that the equivalence classes are indeed the par-
titions of A:

Proposition B.12. If R ⊆ A2 is an equivalence relation, then Rxy


iff [x] R = [y] R .

Proof. For the left-to-right direction, suppose Rxy, and let z ∈


[x] R . By definition, then, Rxz . Since R is an equivalence relation,
Ryz . (Spelling this out: as Rxy and R is symmetric we have
Ryx, and as Rxz and R is transitive we have Ryz .) So z ∈ [y] R .
Generalising, [x] R ⊆ [y] R . But exactly similarly, [y] R ⊆ [x] R . So
[x] R = [y] R , by extensionality.
For the right-to-left direction, suppose [x] R = [y] R . Since R is
reflexive, Ryy, so y ∈ [y] R . Thus also y ∈ [x] R by the assumption
that [x] R = [y] R . So Rxy. □

Example B.13. A nice example of equivalence relations comes


from modular arithmetic. For any a, b, and n ∈ N, say that
a ≡n b iff a and b leave the same remainder when divided by n.
(Somewhat more symbolically: a ≡n b iff (∃k ∈ Z) a − b = kn.) Now,
≡n is an equivalence relation for any n, and for any n ≥ 1 there are
exactly n distinct equivalence classes generated by ≡n ; that is, N/≡n
has n elements. These are: the set of numbers divisible by n without
remainder, i.e., [0] ≡n ; the set of numbers that leave remainder 1 when
divided by n, i.e., [1] ≡n ; . . . ; and the set of numbers that leave
remainder n − 1, i.e., [n − 1] ≡n .
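A finite experiment can make this concrete (it illustrates, but of course does not prove, the general claims of the example). In the Python sketch below, the modulus 3 and the finite sample of N are arbitrary illustrative choices.

```python
n = 3
sample = range(12)   # a finite sample of N, for illustration only

def equiv(a, b):
    # a ≡n b iff a and b leave the same remainder when divided by n
    return a % n == b % n

def eq_class(x):
    # the equivalence class [x], restricted to the sample
    return frozenset(y for y in sample if equiv(x, y))

quotient = {eq_class(x) for x in sample}   # the quotient, restricted to the sample
print(len(quotient))                       # 3: one class per remainder 0, 1, 2
for cls in sorted(sorted(c) for c in quotient):
    print(cls)   # [0, 3, 6, 9], [1, 4, 7, 10], [2, 5, 8, 11]
```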

B.4 Orders
Many of our comparisons involve describing some objects as be-
ing “less than”, “equal to”, or “greater than” other objects, in a
certain respect. These involve order relations. But there are differ-
ent kinds of order relations. For instance, some require that any
two objects be comparable, others don’t. Some include identity
(like ≤) and some exclude it (like <). It will help us to have a
taxonomy here.

Definition B.14 (Preorder). A relation which is both reflexive


and transitive is called a preorder.

Definition B.15 (Partial order). A preorder which is also anti-


symmetric is called a partial order.

Definition B.16 (Linear order). A partial order which is also


connected is called a total order or linear order.

Example B.17. Every linear order is also a partial order, and


every partial order is also a preorder, but the converses don’t
hold. The universal relation on A is a preorder, since it is reflexive
and transitive. But, if A has more than one element, the universal
relation is not anti-symmetric, and so not a partial order.

Example B.18. Consider the no longer than relation ≼ on B∗ : x ≼


y iff len(x) ≤ len(y). This is a preorder (reflexive and transitive),
and even connected, but not a partial order, since it is not anti-
symmetric. For instance, 01 ≼ 10 and 10 ≼ 01, but 01 ≠ 10.

Example B.19. An important partial order is the relation ⊆ on a


set of sets. This is not in general a linear order, since if a ≠ b and
we consider ℘({a,b }) = {∅, {a}, {b }, {a,b }}, we see that {a} ⊈ {b }
and {a} ≠ {b } and {b } ⊈ {a}.

Example B.20. The relation of divisibility without remainder


gives us a partial order which isn’t a linear order. For integers n,
m, we write n | m to mean n (evenly) divides m, i.e., iff there is
some integer k so that m = kn. On N, this is a partial order, but
not a linear order: for instance, 2 ∤ 3 and also 3 ∤ 2. Considered
as a relation on Z, divisibility is only a preorder since it is not
anti-symmetric: 1 | −1 and −1 | 1 but 1 ≠ −1.

Definition B.21 (Strict order). A strict order is a relation which


is irreflexive, asymmetric, and transitive.

Definition B.22 (Strict linear order). A strict order which is


also connected is called a strict linear order.

Example B.23. ≤ is the linear order corresponding to the strict


linear order <. ⊆ is the partial order corresponding to the strict
order ⊊.

Definition B.24 (Total order). A strict order which is also con-


nected is called a total order. This is also sometimes called a strict
linear order.

Any strict order R on A can be turned into a partial order by


adding the diagonal IdA , i.e., adding all the pairs ⟨x,x⟩. (This
is called the reflexive closure of R.) Conversely, starting from a
partial order, one can get a strict order by removing IdA . These
next two results make this precise.

Proposition B.25. If R is a strict order on A, then R + = R ∪ IdA is


a partial order. Moreover, if R is total, then R + is a linear order.

Proof. Suppose R is a strict order, i.e., R ⊆ A2 and R is irreflexive,


asymmetric, and transitive. Let R + = R ∪ IdA . We have to show
that R + is reflexive, antisymmetric, and transitive.
R + is clearly reflexive, since ⟨x,x⟩ ∈ IdA ⊆ R + for all x ∈ A.
To show R + is antisymmetric, suppose for reductio that R + xy
and R + yx but x ≠ y. Since ⟨x, y⟩ ∈ R ∪ IdA , but ⟨x, y⟩ ∉ IdA , we
must have ⟨x, y⟩ ∈ R, i.e., Rxy. Similarly, Ryx. But this contra-
dicts the assumption that R is asymmetric.
To establish transitivity, suppose that R + xy and R + yz . If both
⟨x, y⟩ ∈ R and ⟨y, z ⟩ ∈ R, then ⟨x, z ⟩ ∈ R since R is transitive.
Otherwise, either ⟨x, y⟩ ∈ IdA , i.e., x = y, or ⟨y, z ⟩ ∈ IdA , i.e.,

y = z . In the first case, we have that R + yz by assumption, x = y,


hence R + xz . Similarly in the second case. In either case, R + xz ,
thus, R + is also transitive.
Concerning the “moreover” clause, suppose R is a total order,
i.e., that R is connected. So for all x ≠ y, either Rxy or Ryx, i.e.,
either ⟨x, y⟩ ∈ R or ⟨y,x⟩ ∈ R. Since R ⊆ R + , this remains true of
R + , so R + is connected as well. □

Proposition B.26. If R is a partial order on X , then R − = R \ IdX


is a strict order. Moreover, if R is linear, then R − is total.

Proof. This is left as an exercise. □

Example B.27. ≤ is the linear order corresponding to the total


order <. ⊆ is the partial order corresponding to the strict order ⊊.
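On finite examples, passing between a strict order and the corresponding partial order is literally just adding or removing the diagonal. The following Python sketch only illustrates Propositions B.25 and B.26 on one small example (the set A is an arbitrary choice); it is not a substitute for the proofs.

```python
A = {1, 2, 3, 4}

strict   = {(x, y) for x in A for y in A if x < y}   # a strict (indeed total) order on A
diagonal = {(x, x) for x in A}                       # Id_A

partial = strict | diagonal    # R ∪ Id_A, cf. Proposition B.25
back    = partial - diagonal   # R \ Id_A, cf. Proposition B.26

print(partial == {(x, y) for x in A for y in A if x <= y})   # True: this is <= on A
print(back == strict)                                        # True: we get < back
```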

The following simple result establishes that total orders satisfy an extensionality-like property:

Proposition B.28. If < totally orders A, then:

(∀a,b ∈ A)((∀x ∈ A) (x < a ↔ x < b) → a = b)

Proof. Suppose (∀x ∈ A) (x < a ↔ x < b). If a < b, then a < a,


contradicting the fact that < is irreflexive; so a ≮ b. Exactly
similarly, b ≮ a. So a = b, as < is connected. □

B.5 Graphs
A graph is a diagram in which points—called “nodes” or “ver-
tices” (plural of “vertex”)—are connected by edges. Graphs are
a ubiquitous tool in discrete mathematics and in computer sci-
ence. They are incredibly useful for representing, and visualizing,
relationships and structures, from concrete things like networks
of various kinds to abstract structures such as the possible out-
comes of decisions. There are many different kinds of graphs in

the literature which differ, e.g., according to whether the edges


are directed or not, have labels or not, whether there can be edges
from a node to the same node, multiple edges between the same
nodes, etc. Directed graphs have a special connection to relations.

Definition B.29 (Directed graph). A directed graph G = ⟨V,E⟩


is a set of vertices V and a set of edges E ⊆ V 2 .

According to our definition, a graph just is a set together with


a relation on that set. Of course, when talking about graphs, it’s
only natural to expect that they are graphically represented: we
can draw a graph by connecting two vertices v 1 and v 2 by an
arrow iff ⟨v 1 ,v 2 ⟩ ∈ E. The only difference between a relation by
itself and a graph is that a graph specifies the set of vertices, i.e., a
graph may have isolated vertices. The important point, however,
is that every relation R on a set X can be seen as a directed graph
⟨X ,R⟩, and conversely, a directed graph ⟨V,E⟩ can be seen as a
relation E ⊆ V 2 with the set V explicitly specified.

Example B.30. The graph ⟨V,E⟩ with V = {1, 2, 3, 4} and E =


{⟨1, 1⟩, ⟨1, 2⟩, ⟨1, 3⟩, ⟨2, 3⟩} looks like this:

[Diagram: vertices 1, 2, 3, and 4, with arrows 1→1, 1→2, 1→3, and 2→3; vertex 4 is isolated.]

This is a different graph than ⟨V ′,E⟩ with V ′ = {1, 2, 3}, which


looks like this:

[Diagram: the same arrows 1→1, 1→2, 1→3, and 2→3, but on vertices 1, 2, and 3 only, so there is no isolated vertex.]
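In code, the identification of directed graphs with relations is immediate: a graph is a pair of a vertex set and an edge relation, and the only extra information is which vertices exist. The Python sketch below is merely an illustration using the two graphs of this example.

```python
# A directed graph is a pair (V, E) with E a subset of V x V.
V  = {1, 2, 3, 4}
V2 = {1, 2, 3}
E  = {(1, 1), (1, 2), (1, 3), (2, 3)}

G1 = (V, E)    # the first graph: vertex 4 is isolated
G2 = (V2, E)   # the second graph: same edges, different vertex set

def isolated_vertices(graph):
    vertices, edges = graph
    touched = {x for edge in edges for x in edge}
    return vertices - touched

print(isolated_vertices(G1))   # {4}
print(isolated_vertices(G2))   # set(): no isolated vertices
print(G1 == G2)                # False: the vertex set matters
```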

B.6 Operations on Relations


It is often useful to modify or combine relations. In Proposi-
tion B.25, we considered the union of relations, which is just the
union of two relations considered as sets of pairs. Similarly, in
Proposition B.26, we considered the relative difference of rela-
tions. Here are some other operations we can perform on rela-
tions.

Definition B.31. Let R, S be relations, and A be any set.


The inverse of R is R −1 = {⟨y,x⟩ : ⟨x, y⟩ ∈ R}.
The relative product of R and S is (R | S ) = {⟨x, z ⟩ : ∃y (Rxy ∧
S yz )}.
The restriction of R to A is R↾A = R ∩ A2 .
The application of R to A is R [A] = {y : (∃x ∈ A)Rxy }.

Example B.32. Let S ⊆ Z2 be the successor relation on Z, i.e.,


S = {⟨x, y⟩ ∈ Z2 : x + 1 = y }, so that S xy iff x + 1 = y.
S −1 is the predecessor relation on Z, i.e., {⟨x, y⟩ ∈ Z2 : x − 1 =
y }.
S | S is {⟨x, y⟩ ∈ Z2 : x + 2 = y }
S ↾N is the successor relation on N.
S [{1, 2, 3}] is {2, 3, 4}.
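On finite relations, the operations of Definition B.31 are one-liners. The Python sketch below is illustrative only; the finite sample of Z is an arbitrary choice made so that the relations are finite sets.

```python
def inverse(R):
    return {(y, x) for (x, y) in R}

def rel_product(R, S):                    # R | S
    return {(x, z) for (x, y) in R for (u, z) in S if y == u}

def restriction(R, A):                    # R restricted to A
    return {(x, y) for (x, y) in R if x in A and y in A}

def image(R, A):                          # R[A], the application of R to A
    return {y for (x, y) in R if x in A}

sample = range(-3, 8)                     # a finite sample of Z, for illustration
S = {(x, x + 1) for x in sample}          # the successor relation on the sample

print((2, 1) in inverse(S))               # True: the predecessor relation
print((1, 3) in rel_product(S, S))        # True: S | S relates x to x + 2
print(restriction(S, set(range(4))))      # successor restricted to {0, 1, 2, 3}
print(image(S, {1, 2, 3}))                # {2, 3, 4}
```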

Definition B.33 (Transitive closure). Let R ⊆ A2 be a binary


relation.
The transitive closure of R is R⁺ = ⋃{Rⁿ : n ∈ N and n ≥ 1}, where we recursively define R¹ = R and Rⁿ⁺¹ = Rⁿ | R.
The reflexive transitive closure of R is R∗ = R⁺ ∪ IdA .

Example B.34. Take the successor relation S ⊆ Z². S²xy iff
x + 2 = y, S³xy iff x + 3 = y, etc. So S⁺xy iff x + n = y for some
n ≥ 1. In other words, S⁺xy iff x < y, and S∗xy iff x ≤ y.
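For a finite relation, the transitive closure can be computed by iterating the relative product until nothing new is added. The Python sketch below is only an illustration (and only terminates because the relation is finite); the cut-off at six elements is an arbitrary choice.

```python
def transitive_closure(R):
    """R+ for a finite relation R: add pairs from R^n | R until nothing new appears."""
    closure = set(R)
    while True:
        new_pairs = {(x, z) for (x, y) in closure for (u, z) in R if y == u}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

A = set(range(6))
S = {(x, x + 1) for x in A if x + 1 in A}       # successor, cut off at 5

S_plus = transitive_closure(S)
S_star = S_plus | {(x, x) for x in A}           # the reflexive transitive closure

print(S_plus == {(x, y) for x in A for y in A if x < y})     # True
print(S_star == {(x, y) for x in A for y in A if x <= y})    # True
```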

Problems
Problem B.1. List the elements of the relation ⊆ on the set
℘({a,b,c }).

Problem B.2. Give examples of relations that are (a) reflex-


ive and symmetric but not transitive, (b) reflexive and anti-
symmetric, (c) anti-symmetric, transitive, but not reflexive, and
(d) reflexive, symmetric, and transitive. Do not use relations on
numbers or sets.

Problem B.3. Show that ≡n is an equivalence relation, for any


n ∈ N with n ≥ 1, and that N/≡n has exactly n members.

Problem B.4. Give a proof of Proposition B.26.

Problem B.5. Consider the less-than-or-equal-to relation ≤ on


the set {1, 2, 3, 4} as a graph and draw the corresponding dia-
gram.

Problem B.6. Show that the transitive closure of R is in fact


transitive.
APPENDIX C

Proofs
C.1 Introduction
Based on your experiences in introductory logic, you might be
comfortable with a derivation system—probably a natural de-
duction or Fitch style derivation system, or perhaps a proof-tree
system. You probably remember doing proofs in these systems,
either proving a formula or showing that a given argument is valid.
In order to do this, you applied the rules of the system until you
got the desired end result. In reasoning about logic, we also prove
things, but in most cases we are not using a derivation system. In
fact, most of the proofs we consider are done in English (perhaps,
with some symbolic language thrown in) rather than entirely in
the language of first-order logic. When constructing such proofs,
you might at first be at a loss—how do I prove something without
a derivation system? How do I start? How do I know if my proof
is correct?
Before attempting a proof, it’s important to know what a proof
is and how to construct one. As implied by the name, a proof is
meant to show that something is true. You might think of this in
terms of a dialogue—someone asks you if something is true, say,
if every prime other than two is an odd number. To answer “yes”
is not enough; they might want to know why. In this case, you’d
give them a proof.
In everyday discourse, it might be enough to gesture at an


answer, or give an incomplete answer. In logic and mathematics,


however, we want rigorous proof—we want to show that some-
thing is true beyond any doubt. This means that every step in our
proof must be justified, and the justification must be cogent (i.e.,
the assumption you’re using is actually assumed in the statement
of the theorem you’re proving, the definitions you apply must be
correctly applied, the justifications appealed to must be correct
inferences, etc.).
Usually, we’re proving some statement. We call the statements
we’re proving by various names: propositions, theorems, lemmas,
or corollaries. A proposition is a basic proof-worthy statement:
important enough to record, but perhaps not particularly deep
nor applied often. A theorem is a significant, important proposi-
tion. Its proof often is broken into several steps, and sometimes
it is named after the person who first proved it (e.g., Cantor’s
Theorem, the Löwenheim-Skolem theorem) or after the fact it
concerns (e.g., the completeness theorem). A lemma is a propo-
sition or theorem that is used in the proof of a more impor-
tant result. Confusingly, sometimes lemmas are important results
in themselves, and also named after the person who introduced
them (e.g., Zorn’s Lemma). A corollary is a result that easily
follows from another one.
A statement to be proved often contains some assumption
that clarifies about which kinds of things we’re proving some-
thing. It might begin with “Let A be a formula of the form B →C ”
or “Suppose 𝛤 ⊢ A” or something of the sort. These are hypothe-
ses of the proposition, theorem, or lemma, and you may assume
these to be true in your proof. They restrict what we’re proving
about, and also introduce some names for the objects we’re talk-
ing about. For instance, if your proposition begins with “Let A be
a formula of the form B → C ,” you’re proving something about
all formulas of a certain sort only (namely, conditionals), and it’s
understood that B →C is an arbitrary conditional that your proof
will talk about.

C.2 Starting a Proof


But where do you even start?
You’ve been given something to prove, so this should be the
last thing that is mentioned in the proof (you can, obviously, an-
nounce that you’re going to prove it at the beginning, but you don’t
want to use it as an assumption). Write what you are trying to
prove at the bottom of a fresh sheet of paper—this way you don’t
lose sight of your goal.
Next, you may have some assumptions that you are able to use
(this will be made clearer when we talk about the type of proof you
are doing in the next section). Write these at the top of the page
and make sure to flag that they are assumptions (i.e., if you are
assuming p, write “assume that p,” or “suppose that p”). Finally,
there might be some definitions in the question that you need
to know. You might be told to use a specific definition, or there
might be various definitions in the assumptions or conclusion
that you are working towards. Write these down and ensure that you
understand what they mean.
How you set up your proof will also be dependent upon the
form of the question. The next section provides details on how
to set up your proof based on the type of sentence.

C.3 Using Definitions


We mentioned that you must be familiar with all definitions that
may be used in the proof, and that you can properly apply them.
This is a really important point, and it is worth looking at in
a bit more detail. Definitions are used to abbreviate properties
and relations so we can talk about them more succinctly. The
introduced abbreviation is called the definiendum, and what it ab-
breviates is the definiens. In proofs, we often have to go back to
how the definiendum was introduced, because we have to exploit
the logical structure of the definiens (the long version of which
the defined term is the abbreviation) to get through our proof. By

unpacking definitions, you’re ensuring that you’re getting to the


heart of where the logical action is.
We’ll start with an example. Suppose you want to prove the
following:

Proposition C.1. For any sets A and B, A ∪ B = B ∪ A.

In order to even start the proof, we need to know what it


means for two sets to be identical; i.e., we need to know what
the “=” in that equation means for sets. Sets are defined to be
identical whenever they have the same elements. So the definition
we have to unpack is:

Definition C.2. Sets A and B are identical, A = B, iff every ele-


ment of A is an element of B, and vice versa.

This definition uses A and B as placeholders for arbitrary sets.


What it defines—the definiendum—is the expression “A = B” by
giving the condition under which A = B is true. This condition—
“every element of A is an element of B, and vice versa”—is the
definiens.1 The definition specifies that A = B is true if, and only
if (we abbreviate this to “iff”) the condition holds.
When you apply the definition, you have to match the A and
B in the definition to the case you’re dealing with. In our case, it
means that in order for A ∪ B = B ∪ A to be true, each z ∈ A ∪ B
must also be in B ∪A, and vice versa. The expression A ∪B in the
proposition plays the role of A in the definition, and B ∪ A that
of B. Since A and B are used both in the definition and in the
statement of the proposition we’re proving, but in different uses,
you have to be careful to make sure you don’t mix up the two.
For instance, it would be a mistake to think that you could prove
the proposition by showing that every element of A is an element
1 In this particular case—and very confusingly!—when A = B, the sets A
and B are just one and the same set, even though we use different letters for it
on the left and the right side. But the ways in which that set is picked out may
be different, and that makes the definition non-trivial.

of B, and vice versa—that would show that A = B, not that A∪B =


B ∪ A. (Also, since A and B may be any two sets, you won’t get
very far, because if nothing is assumed about A and B they may
well be different sets.)
Within the proof we are dealing with set-theoretic notions
such as union, and so we must also know the meanings of the
symbol ∪ in order to understand how the proof should pro-
ceed. And sometimes, unpacking the definition gives rise to
further definitions to unpack. For instance, A ∪ B is defined as
{z : z ∈ A or z ∈ B }. So if you want to prove that x ∈ A ∪ B,
unpacking the definition of ∪ tells you that you have to prove
x ∈ {z : z ∈ A or z ∈ B }. Now you also have to remember that
x ∈ {z : . . . z . . .} iff . . . x . . . . So, further unpacking the definition
of the {z : . . . z . . .} notation, what you have to show is: x ∈ A or
x ∈ B. So, “every element of A ∪ B is also an element of B ∪ A”
really means: “for every x, if x ∈ A or x ∈ B, then x ∈ B or
x ∈ A.” If we fully unpack the definitions in the proposition, we
see that what we have to show is this:

Proposition C.3. For any sets A and B: (a) for every x, if x ∈ A or


x ∈ B, then x ∈ B or x ∈ A, and (b) for every x, if x ∈ B or x ∈ A,
then x ∈ A or x ∈ B.

What’s important is that unpacking definitions is a necessary


part of constructing a proof. Properly doing it is sometimes diffi-
cult: you must be careful to distinguish and match the variables
in the definition and the terms in the claim you’re proving. In
order to be successful, you must know what the question is ask-
ing and what all the terms used in the question mean—you will
often need to unpack more than one definition. In simple proofs
such as the ones below, the solution follows almost immediately
from the definitions themselves. Of course, it won’t always be this
simple.

C.4 Inference Patterns


Proofs are composed of individual inferences. When we make an
inference, we typically indicate that by using a word like “so,”
“thus,” or “therefore.” The inference often relies on one or two
facts we already have available in our proof—it may be something
we have assumed, or something that we’ve concluded by an in-
ference already. To be clear, we may label these things, and in
the inference we indicate what other statements we’re using in the
inference. An inference will often also contain an explanation of
why our new conclusion follows from the things that come before
it. There are some common patterns of inference that are used
very often in proofs; we’ll go through some below. Some patterns
of inference, like proofs by induction, are more involved (and will
be discussed later).
We’ve already discussed one pattern of inference: unpack-
ing, or applying, a definition. When we unpack a definition, we
just restate something that involves the definiendum by using the
definiens. For instance, suppose that we have already established
in the course of a proof that D = E (a). Then we may apply the
definition of = for sets and infer: “Thus, by definition from (a),
every element of D is an element of E and vice versa.”
Somewhat confusingly, we often do not write the justification
of an inference when we actually make it, but before. Suppose
we haven’t already proved that D = E, but we want to. If D = E
is the conclusion we aim for, then we can restate this aim also
by applying the definition: to prove D = E we have to prove
that every element of D is an element of E and vice versa. So
our proof will have the form: (a) prove that every element of D
is an element of E; (b) every element of E is an element of D;
(c) therefore, from (a) and (b) by definition of =, D = E. But
we would usually not write it this way. Instead we might write
something like,

We want to show D = E. By definition of =, this


amounts to showing that every element of D is an el-

ement of E and vice versa.


(a) . . . (a proof that every element of D is an element
of E) . . .
(b) . . . (a proof that every element of E is an element
of D) . . .

Using a Conjunction
Perhaps the simplest inference pattern is that of drawing as con-
clusion one of the conjuncts of a conjunction. In other words:
if we have assumed or already proved that p and q , then we’re
entitled to infer that p (and also that q ). This is such a basic
inference that it is often not mentioned. For instance, once we’ve
unpacked the definition of D = E we’ve established that every
element of D is an element of E and vice versa. From this we can
conclude that every element of E is an element of D (that’s the
“vice versa” part).

Proving a Conjunction
Sometimes what you’ll be asked to prove will have the form of a
conjunction; you will be asked to “prove p and q .” In this case,
you simply have to do two things: prove p, and then prove q . You
could divide your proof into two sections, and for clarity, label
them. When you’re making your first notes, you might write “(1)
Prove p” at the top of the page, and “(2) Prove q ” in the middle of
the page. (Of course, you might not be explicitly asked to prove
a conjunction but find that your proof requires that you prove a
conjunction. For instance, if you’re asked to prove that D = E
you will find that, after unpacking the definition of =, you have to
prove: every element of D is an element of E and every element
of E is an element of D).

Proving a Disjunction
When what you are proving takes the form of a disjunction (i.e., it
is a statement of the form “p or q ”), it is enough to show that one
of the disjuncts is true. However, it basically never happens that
either disjunct just follows from the assumptions of your theorem.
More often, the assumptions of your theorem are themselves dis-
junctive, or you’re showing that all things of a certain kind have
one of two properties, but some of the things have the one and
others have the other property. This is where proof by cases is
useful (see below).

Conditional Proof
Many theorems you will encounter are in conditional form (i.e.,
show that if p holds, then q is also true). These cases are nice and
easy to set up—simply assume the antecedent of the conditional
(in this case, p) and prove the conclusion q from it. So if your
theorem reads, “If p then q ,” you start your proof with “assume
p” and at the end you should have proved q .
Conditionals may be stated in different ways. So instead of “If
p then q ,” a theorem may state that “p only if q ,” “q if p,” or “q ,
provided p.” These all mean the same and require assuming p
and proving q from that assumption. Recall that a biconditional
(“p if and only if (iff) q ”) is really two conditionals put together:
if p then q , and if q then p. All you have to do, then, is two
instances of conditional proof: one for the first conditional and
another one for the second. Sometimes, however, it is possible
to prove an “iff” statement by chaining together a bunch of other
“iff” statements so that you start with “p” and end with “q ”—but
in that case you have to make sure that each step really is an “iff.”

Universal Claims
Using a universal claim is simple: if something is true for any-
thing, it’s true for each particular thing. So if, say, the hypothesis
of your proof is A ⊆ B, that means (unpacking the definition

of ⊆), that, for every x ∈ A, x ∈ B. Thus, if you already know


that z ∈ A, you can conclude z ∈ B.
Proving a universal claim may seem a little bit tricky. Usually
these statements take the following form: “If x has P , then it
has Q ” or “All P s are Q s.” Of course, it might not fit this form
perfectly, and it takes a bit of practice to figure out what you’re
asked to prove exactly. But: we often have to prove that all objects
with some property have a certain other property.
The way to prove a universal claim is to introduce names
or variables, for the things that have the one property and then
show that they also have the other property. We might put this
by saying that to prove something for all P s you have to prove
it for an arbitrary P . And the name introduced is a name for an
arbitrary P . We typically use single letters as these names for
arbitrary things, and the letters usually follow conventions: e.g.,
we use n for natural numbers, A for formulas, A for sets, f for
functions, etc.
The trick is to maintain generality throughout the proof. You
start by assuming that an arbitrary object (“x”) has the prop-
erty P , and show (based only on definitions or what you are al-
lowed to assume) that x has the property Q . Because you have
not stipulated what x is specifically, other than that it has the property
P, you can assert that every P has the property Q. In
short, x is a stand-in for all things with property P .

Proposition C.4. For all sets A and B, A ⊆ A ∪ B.

Proof. Let A and B be arbitrary sets. We want to show that A ⊆


A ∪ B. By definition of ⊆, this amounts to: for every x, if x ∈ A
then x ∈ A ∪ B. So let x ∈ A be an arbitrary element of A. We
have to show that x ∈ A ∪ B. Since x ∈ A, x ∈ A or x ∈ B. Thus,
x ∈ {x : x ∈ A ∨ x ∈ B }. But that, by definition of ∪, means
x ∈ A ∪ B. □

Proof by Cases
Suppose you have a disjunction as an assumption or as an already
established conclusion—you have assumed or proved that p or q
is true. You want to prove r . You do this in two steps: first you
assume that p is true, and prove r , then you assume that q is true
and prove r again. This works because we assume or know that
one of the two alternatives holds. The two steps establish that
either one is sufficient for the truth of r . (If both are true, we
have not one but two reasons for why r is true. It is not neces-
sary to separately prove that r is true assuming both p and q .)
To indicate what we’re doing, we announce that we “distinguish
cases.” For instance, suppose we know that x ∈ B ∪ C . B ∪ C is
defined as {x : x ∈ B or x ∈ C }. In other words, by definition,
x ∈ B or x ∈ C . We would prove that x ∈ A from this by first
assuming that x ∈ B, and proving x ∈ A from this assumption,
and then assume x ∈ C , and again prove x ∈ A from this. You
would write “We distinguish cases” under the assumption, then
“Case (1): x ∈ B” underneath, and “Case (2): x ∈ C ” halfway
down the page. Then you’d proceed to fill in the top half and the
bottom half of the page.
Proof by cases is especially useful if what you’re proving is
itself disjunctive. Here’s a simple example:

Proposition C.5. Suppose B ⊆ D and C ⊆ E. Then B ∪C ⊆ D ∪E.

Proof. Assume (a) that B ⊆ D and (b) C ⊆ E. By definition, any


x ∈ B is also ∈ D (c) and any x ∈ C is also ∈ E (d). To show that
B ∪ C ⊆ D ∪ E, we have to show that if x ∈ B ∪ C then x ∈ D ∪ E
(by definition of ⊆). x ∈ B ∪ C iff x ∈ B or x ∈ C (by definition
of ∪). Similarly, x ∈ D ∪ E iff x ∈ D or x ∈ E. So, we have to
show: for any x, if x ∈ B or x ∈ C , then x ∈ D or x ∈ E.

So far we’ve only unpacked definitions! We’ve refor-


mulated our proposition without ⊆ and ∪ and are left
with trying to prove a universal conditional claim. By
what we’ve discussed above, this is done by assuming

that x is something about which we assume the “if”


part is true, and we’ll go on to show that the “then”
part is true as well. In other words, we’ll assume that
x ∈ B or x ∈ C and show that x ∈ D or x ∈ E.2

Suppose that x ∈ B or x ∈ C . We have to show that x ∈ D or


x ∈ E. We distinguish cases.
Case 1: x ∈ B. By (c), x ∈ D. Thus, x ∈ D or x ∈ E. (Here
we’ve made the inference discussed in the preceding subsection!)
Case 2: x ∈ C . By (d), x ∈ E. Thus, x ∈ D or x ∈ E. □

Proving an Existence Claim


When asked to prove an existence claim, the question will usually
be of the form “prove that there is an x such that . . . x . . . ”, i.e.,
that some object has the property described by “. . . x . . . ”. In
this case you’ll have to identify a suitable object and show that it has
the required property. This sounds straightforward, but a proof
of this kind can be tricky. Typically it involves constructing or
defining an object and proving that the object so defined has the
required property. Finding the right object may be hard, proving
that it has the required property may be hard, and sometimes it’s
even tricky to show that you’ve succeeded in defining an object
at all!
Generally, you’d write this out by specifying the object, e.g.,
“let x be . . . ” (where . . . specifies which object you have in mind),
possibly proving that . . . in fact describes an object that exists,
and then go on to show that x has the property Q . Here’s a simple
example.

Proposition C.6. Suppose that x ∈ B. Then there is an A such that


A ⊆ B and A ≠ ∅.

Proof. Assume x ∈ B. Let A = {x }.


2 Thisparagraph just explains what we’re doing—it’s not part of the proof,
and you don’t have to go into all this detail when you write down your own
proofs.

Here we’ve defined the set A by enumerating its ele-


ments. Since we assume that x is an object, and we
can always form a set by enumerating its elements,
we don’t have to show that we’ve succeeded in defin-
ing a set A here. However, we still have to show that
A has the properties required by the proposition. The
proof isn’t complete without that!

Since x ∈ A, A ≠ ∅.

This relies on the definition of A as {x } and the ob-


vious facts that x ∈ {x } and x ∉ ∅.

Since x is the only element of {x }, and x ∈ B, every element of A


is also an element of B. By definition of ⊆, A ⊆ B. □

Using Existence Claims


Suppose you know that some existence claim is true (you’ve
proved it, or it’s a hypothesis you can use), say, “for some x,
x ∈ A” or “there is an x ∈ A.” If you want to use it in your proof,
you can just pretend that you have a name for one of the things
which your hypothesis says exist. Since A contains at least one
thing, there are things to which that name might refer. You might
of course not be able to pick one out or describe it further (other
than that it is ∈ A). But for the purpose of the proof, you can
pretend that you have picked it out and give a name to it. It’s
important to pick a name that you haven’t already used (or that
appears in your hypotheses), otherwise things can go wrong. In
your proof, you indicate this by going from “for some x, x ∈ A”
to “Let a ∈ A.” Now you can reason about a, use some other hy-
potheses, etc., until you come to a conclusion, p. If p no longer
mentions a, p is independent of the assumption that a ∈ A, and
you’ve shown that it follows just from the assumption “for some
x, x ∈ A.”

Proposition C.7. If A ≠ ∅, then A ∪ B ≠ ∅.

Proof. Suppose A ≠ ∅. So for some x, x ∈ A.

Here we first just restated the hypothesis of the propo-


sition. This hypothesis, i.e., A ≠ ∅, hides an existen-
tial claim, which you get to only by unpacking a few
definitions. The definition of = tells us that A = ∅ iff
every x ∈ A is also ∈ ∅ and every x ∈ ∅ is also ∈ A.
Negating both sides, we get: A ≠ ∅ iff either some
x ∈ A is ∉ ∅ or some x ∈ ∅ is ∉ A. Since nothing is
∈ ∅, the second disjunct can never be true, and “x ∈ A
and x ∉ ∅” reduces to just x ∈ A. So A ≠ ∅ iff for some
x, x ∈ A. That’s an existence claim. Now we use that
existence claim by introducing a name for one of the
elements of A:

Let a ∈ A.

Now we’ve introduced a name for one of the things ∈


A. We’ll continue to argue about a, but we’ll be care-
ful to only assume that a ∈ A and nothing else:

Since a ∈ A, a ∈ A∪B, by definition of ∪. So for some x, x ∈ A∪B,


i.e., A ∪ B ≠ ∅.

In that last step, we went from “a ∈ A ∪ B” to “for


some x, x ∈ A ∪B.” That doesn’t mention a anymore,
so we know that “for some x, x ∈ A ∪ B” follows
from “for some x, x ∈ A” alone. But that means that
A ∪ B ≠ ∅. □

It’s maybe good practice to keep bound variables like “x” sep-
arate from hypothetical names like a, like we did. In practice,
however, we often don’t and just use x, like so:

Suppose A ≠ ∅, i.e., there is an x ∈ A. By definition


of ∪, x ∈ A ∪ B. So A ∪ B ≠ ∅.

However, when you do this, you have to be extra careful that


you use different x’s and y’s for different existential claims. For
instance, the following is not a correct proof of “If A ≠ ∅ and
B ≠ ∅ then A ∩ B ≠ ∅” (which is not true).

Suppose A ≠ ∅ and B ≠ ∅. So for some x, x ∈ A


and also for some x, x ∈ B. Since x ∈ A and x ∈ B,
x ∈ A ∩ B, by definition of ∩. So A ∩ B ≠ ∅.

Can you spot where the incorrect step occurs and explain why
the result does not hold?

C.5 An Example
Our first example is the following simple fact about unions and in-
tersections of sets. It will illustrate unpacking definitions, proofs
of conjunctions, of universal claims, and proof by cases.

Proposition C.8. For any sets A, B, and C , A ∪ (B ∩ C ) = (A ∪


B) ∩ (A ∪ C )

Let’s prove it!

Proof. We want to show that for any sets A, B, and C , A∪(B ∩C ) =


(A ∪ B) ∩ (A ∪ C )

First we unpack the definition of “=” in the statement


of the proposition. Recall that proving sets identical
means showing that the sets have the same elements.
That is, all elements of A ∪ (B ∩ C ) are also elements
of (A ∪ B) ∩ (A ∪C ), and vice versa. The “vice versa”
means that also every element of (A ∪ B) ∩ (A ∪ C )
must be an element of A∪(B ∩C ). So in unpacking the
definition, we see that we have to prove a conjunction.
Let’s record this:

By definition, A ∪ (B ∩ C ) = (A ∪ B) ∩ (A ∪ C ) iff every element


of A ∪ (B ∩ C ) is also an element of (A ∪ B) ∩ (A ∪ C ), and every
element of (A ∪ B) ∩ (A ∪ C ) is an element of A ∪ (B ∩ C ).

Since this is a conjunction, we must prove each con-


junct separately. Let’s start with the first: let’s prove
that every element of A ∪ (B ∩ C ) is also an element
of (A ∪ B) ∩ (A ∪ C ).
This is a universal claim, and so we consider an ar-
bitrary element of A ∪ (B ∩ C ) and show that it must
also be an element of (A ∪ B) ∩ (A ∪ C ). We’ll pick a
variable to call this arbitrary element by, say, z . Our
proof continues:

First, we prove that every element of A∪(B ∩C ) is also an element


of (A ∪ B) ∩ (A ∪ C ). Let z ∈ A ∪ (B ∩ C ). We have to show that
z ∈ (A ∪ B) ∩ (A ∪ C ).

Now it is time to unpack the definition of ∪ and ∩.


For instance, the definition of ∪ is: A ∪ B = {z :
z ∈ A or z ∈ B }. When we apply the definition to
“A ∪ (B ∩ C ),” the role of the “B” in the definition
is now played by “B ∩ C ,” so A ∪ (B ∩ C ) = {z :
z ∈ A or z ∈ B ∩ C }. So our assumption that z ∈
A ∪ (B ∩ C ) amounts to: z ∈ {z : z ∈ A or z ∈ B ∩ C }.
And z ∈ {z : . . . z . . .} iff . . . z . . . , i.e., in this case,
z ∈ A or z ∈ B ∩ C .

By the definition of ∪, either z ∈ A or z ∈ B ∩ C .

Since this is a disjunction, it will be useful to apply


proof by cases. We take the two cases, and show that
in each one, the conclusion we’re aiming for (namely,
“z ∈ (A ∪ B) ∩ (A ∪ C )”) obtains.

Case 1: Suppose that z ∈ A.



There’s not much more to work from based on our


assumptions. So let’s look at what we have to work
with in the conclusion. We want to show that z ∈
(A ∪ B) ∩ (A ∪ C ). Based on the definition of ∩, if
we want to show that z ∈ (A ∪ B) ∩ (A ∪ C ), we have
to show that it’s in both (A ∪ B) and (A ∪ C ). But
z ∈ A ∪ B iff z ∈ A or z ∈ B, and we already have
(as the assumption of case 1) that z ∈ A. By the
same reasoning—switching C for B—z ∈ A ∪C . This
argument went in the reverse direction, so let’s record
our reasoning in the direction needed in our proof.

Since z ∈ A, z ∈ A or z ∈ B, and hence, by definition of ∪, z ∈ A∪


B. Similarly, z ∈ A ∪C . But this means that z ∈ (A ∪ B) ∩ (A ∪C ),
by definition of ∩.

This completes the first case of the proof by cases.


Now we want to derive the conclusion in the second
case, where z ∈ B ∩ C .

Case 2: Suppose that z ∈ B ∩ C .

Again, we are working with the intersection of two


sets. Let’s apply the definition of ∩:

Since z ∈ B ∩ C , z must be an element of both B and C , by


definition of ∩.

It’s time to look at our conclusion again. We have to


show that z is in both (A∪B) and (A∪C ). And again,
the solution is immediate.

Since z ∈ B, z ∈ (A ∪ B). Since z ∈ C , also z ∈ (A ∪ C ). So,


z ∈ (A ∪ B) ∩ (A ∪ C ).

Here we applied the definitions of ∪ and ∩ again,


but since we’ve already recalled those definitions, and
already showed that if z is in one of two sets it is in

their union, we don’t have to be as explicit in what


we’ve done.
We’ve completed the second case of the proof by
cases, so now we can assert our first conclusion.

So, if z ∈ A ∪ (B ∩ C ) then z ∈ (A ∪ B) ∩ (A ∪ C ).

Now we just want to show the other direction, that


every element of (A ∪ B) ∩ (A ∪ C ) is an element of
A ∪ (B ∩ C ). As before, we prove this universal claim
by assuming we have an arbitrary element of the first
set and show it must be in the second set. Let’s state
what we’re about to do.

Now, assume that z ∈ (A ∪ B) ∩ (A ∪ C ). We want to show that


z ∈ A ∪ (B ∩ C ).

We are now working from the hypothesis that z ∈


(A ∪ B) ∩ (A ∪ C ). It hopefully isn’t too confusing
that we’re using the same z here as in the first part
of the proof. When we finished that part, all the as-
sumptions we’ve made there are no longer in effect,
so now we can make new assumptions about what z
is. If that is confusing to you, just replace z with a
different variable in what follows.
We know that z is in both A ∪ B and A ∪ C , by defini-
tion of ∩. And by the definition of ∪, we can further
unpack this to: either z ∈ A or z ∈ B, and also either
z ∈ A or z ∈ C . This looks like a proof by cases
again—except the “and” makes it confusing. You
might think that this amounts to there being three
possibilities: z is either in A, B or C . But that would
be a mistake. We have to be careful, so let’s consider
each disjunction in turn.

By definition of ∩, z ∈ A ∪ B and z ∈ A ∪ C . By definition of ∪,


z ∈ A or z ∈ B. We distinguish cases.

Since we’re focusing on the first disjunction, we


haven’t gotten our second disjunction (from unpack-
ing A ∪ C ) yet. In fact, we don’t need it yet. The
first case is z ∈ A, and an element of a set is also
an element of the union of that set with any other. So
case 1 is easy:

Case 1: Suppose that z ∈ A. It follows that z ∈ A ∪ (B ∩ C ).

Now for the second case, z ∈ B. Here we’ll unpack


the second ∪ and do another proof-by-cases:

Case 2: Suppose that z ∈ B. Since z ∈ A ∪ C , either z ∈ A or


z ∈ C . We distinguish cases further:
Case 2a: z ∈ A. Then, again, z ∈ A ∪ (B ∩ C ).

Ok, this was a bit weird. We didn’t actually need the


assumption that z ∈ B for this case, but that’s ok.

Case 2b: z ∈ C . Then z ∈ B and z ∈ C , so z ∈ B ∩ C , and


consequently, z ∈ A ∪ (B ∩ C ).

This concludes both proofs-by-cases and so we’re


done with the second half.

So, if z ∈ (A ∪ B) ∩ (A ∪ C ) then z ∈ A ∪ (B ∩ C ). □
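A remark in passing: before (or while) writing a proof like this, it can be worth testing the claimed identity on small examples. A counterexample tells you right away that something has gone wrong, while the absence of one proves nothing; it is a sanity check, not a proof. Here is a small Python sketch of such a check, using all subsets of an arbitrarily chosen three-element universe.

```python
from itertools import combinations

def subsets(s):
    s = list(s)
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

universe = {0, 1, 2}
ok = all(A | (B & C) == (A | B) & (A | C)      # A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
         for A in subsets(universe)
         for B in subsets(universe)
         for C in subsets(universe))
print(ok)   # True: no counterexample among subsets of {0, 1, 2}
```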

C.6 Another Example


Proposition C.9. If A ⊆ C , then A ∪ (C \ A) = C .

Proof. Suppose that A ⊆ C . We want to show that A ∪ (C \A) = C .

We begin by observing that this is a conditional state-


ment. It is tacitly universally quantified: the proposi-
tion holds for all sets A and C . So A and C are vari-
ables for arbitrary sets. To prove such a statement,
we assume the antecedent and prove the consequent.

We continue by using the assumption that A ⊆ C .


Let’s unpack the definition of ⊆: the assumption
means that all elements of A are also elements of C .
Let’s write this down—it’s an important fact that we’ll
use throughout the proof.

By the definition of ⊆, since A ⊆ C , for all z , if z ∈ A, then z ∈ C .

We’ve unpacked all the definitions that are given to


us in the assumption. Now we can move onto the
conclusion. We want to show that A ∪ (C \ A) = C ,
and so we set up a proof similarly to the last example:
we show that every element of A ∪ (C \ A) is also
an element of C and, conversely, every element of C
is an element of A ∪ (C \ A). We can shorten this to:
A ∪ (C \ A) ⊆ C and C ⊆ A ∪ (C \ A). (Here we’re
doing the opposite of unpacking a definition, but it
makes the proof a bit easier to read.) Since this is a
conjunction, we have to prove both parts. To show the
first part, i.e., that every element of A ∪ (C \A) is also
an element of C , we assume that z ∈ A ∪ (C \ A) for
an arbitrary z and show that z ∈ C . By the definition
of ∪, we can conclude that z ∈ A or z ∈ C \ A from
z ∈ A ∪ (C \ A). You should now be getting the hang
of this.

A ∪ (C \ A) = C iff A ∪ (C \ A) ⊆ C and C ⊆ A ∪ (C \ A). First


we prove that A ∪ (C \ A) ⊆ C . Let z ∈ A ∪ (C \ A). So, either
z ∈ A or z ∈ (C \ A).

We’ve arrived at a disjunction, and from it we want


to prove that z ∈ C . We do this using proof by cases.

Case 1: z ∈ A. Since for all z , if z ∈ A, z ∈ C , we have that z ∈ C .

Here we’ve used the fact recorded earlier which fol-


lowed from the hypothesis of the proposition that
A ⊆ C . The first case is complete, and we turn to

the second case, z ∈ (C \ A). Recall that C \ A de-


notes the difference of the two sets, i.e., the set of all
elements of C which are not elements of A. But any
element of C not in A is in particular an element of C .

Case 2: z ∈ (C \ A). This means that z ∈ C and z ∉ A. So, in


particular, z ∈ C .

Great, we’ve proved the first direction. Now for the


second direction. Here we prove that C ⊆ A ∪ (C \A).
So we assume that z ∈ C and prove that z ∈ A ∪ (C \
A).

Now let z ∈ C . We want to show that z ∈ A or z ∈ C \ A.

Since all elements of A are also elements of C , and


C \A is the set of all things that are elements of C but
not A, it follows that z is either in A or in C \ A. This
may be a bit unclear if you don’t already know why
the result is true. It would be better to prove it step-
by-step. It will help to use a simple fact which we can
state without proof: z ∈ A or z ∉ A. This is called the
“principle of excluded middle:” for any statement p,
either p is true or its negation is true. (Here, p is the
statement that z ∈ A.) Since this is a disjunction, we
can again use proof-by-cases.

Either z ∈ A or z ∉ A. In the former case, z ∈ A ∪ (C \ A). In the


latter case, z ∈ C and z ∉ A, so z ∈ C \A. But then z ∈ A ∪ (C \A).

Our proof is complete: we have shown that A ∪ (C \


A) = C . □

C.7 Proof by Contradiction


In the first instance, proof by contradiction is an inference pat-
tern that is used to prove negative claims. Suppose you want to

show that some claim p is false, i.e., you want to show ¬p. The
most promising strategy is to (a) suppose that p is true, and (b)
show that this assumption leads to something you know to be
false. “Something known to be false” may be a result that con-
flicts with—contradicts—p itself, or some other hypothesis of the
overall claim you are considering. For instance, a proof of “if q
then ¬p” involves assuming that q is true and proving ¬p from
it. If you prove ¬p by contradiction, that means assuming p in
addition to q . If you can prove ¬q from p, you have shown that
the assumption p leads to something that contradicts your other
assumption q , since q and ¬q cannot both be true. Of course,
you have to use other inference patterns in your proof of the con-
tradiction, as well as unpacking definitions. Let’s consider an
example.

Proposition C.10. If A ⊆ B and B = ∅, then A has no elements.

Proof. Suppose A ⊆ B and B = ∅. We want to show that A has


no elements.

Since this is a conditional claim, we assume the an-


tecedent and want to prove the consequent. The con-
sequent is: A has no elements. We can make that a bit
more explicit: it’s not the case that there is an x ∈ A.

A has no elements iff it’s not the case that there is an x such that
x ∈ A.

So we’ve determined that what we want to prove is


really a negative claim ¬p, namely: it’s not the case
that there is an x ∈ A. To use proof by contradic-
tion, we have to assume the corresponding positive
claim p, i.e., there is an x ∈ A, and prove a contra-
diction from it. We indicate that we’re doing a proof
by contradiction by writing “by way of contradiction,
assume” or even just “suppose not,” and then state
the assumption p.

Suppose not: there is an x ∈ A.

This is now the new assumption we’ll use to obtain a


contradiction. We have two more assumptions: that
A ⊆ B and that B = ∅. The first gives us that x ∈ B:

Since A ⊆ B, x ∈ B.

But since B = ∅, every element of B (e.g., x) must


also be an element of ∅.

Since B = ∅, x ∈ ∅. This is a contradiction, since by definition ∅


has no elements.

This already completes the proof: we’ve arrived at


what we need (a contradiction) from the assumptions
we’ve set up, and this means that the assumptions
can’t all be true. Since the first two assumptions (A ⊆
B and B = ∅) are not contested, it must be the last
assumption introduced (there is an x ∈ A) that must
be false. But if we want to be thorough, we can spell
this out.

Thus, our assumption that there is an x ∈ A must be false, hence,


A has no elements by proof by contradiction. □

Every positive claim is trivially equivalent to a negative claim:


p iff ¬¬p. So proofs by contradiction can also be used to establish
positive claims “indirectly,” as follows: To prove p, read it as the
negative claim ¬¬p. If we can prove a contradiction from ¬p,
we’ve established ¬¬p by proof by contradiction, and hence p.
In the last example, we aimed to prove a negative claim,
namely that A has no elements, and so the assumption we made
for the purpose of proof by contradiction (i.e., that there is an
x ∈ A) was a positive claim. It gave us something to work with,
namely the hypothetical x ∈ A about which we continued to rea-
son until we got to x ∈ ∅.

When proving a positive claim indirectly, the assumption


you’d make for the purpose of proof by contradiction would be
negative. But very often you can easily reformulate a positive
claim as a negative claim, and a negative claim as a positive
claim. Our previous proof would have been essentially the same
had we proved “A = ∅” instead of the negative consequent “A
has no elements.” (By definition of =, “A = ∅” is a general claim,
since it unpacks to “every element of A is an element of ∅ and
vice versa”.) But it is easily seen to be equivalent to the negative
claim “not: there is an x ∈ A.”
So it is sometimes easier to work with ¬p as an assumption
than it is to prove p directly. Even when a direct proof is just
as simple or even simpler (as in the next example), some people
prefer to proceed indirectly. If the double negation confuses you,
think of a proof by contradiction of some claim as a proof of a
contradiction from the opposite claim. So, a proof by contradic-
tion of ¬p is a proof of a contradiction from the assumption p; and
proof by contradiction of p is a proof of a contradiction from ¬p.

Proposition C.11. A ⊆ A ∪ B.

Proof. We want to show that A ⊆ A ∪ B.

On the face of it, this is a positive claim: every x ∈ A


is also in A ∪ B. The negation of that is: some x ∈
A is ∉ A ∪ B. So we can prove the claim indirectly
by assuming this negated claim, and showing that it
leads to a contradiction.

Suppose not, i.e., A ⊈ A ∪ B.

We have a definition of A ⊆ A ∪ B: every x ∈ A is


also ∈ A ∪ B. To understand what A ⊈ A ∪ B means,
we have to use some elementary logical manipulation
on the unpacked definition: it’s false that every x ∈ A
is also ∈ A ∪ B iff there is some x ∈ A that is ∉ A ∪ B.
(This is a place where you want to be very careful:

many students’ attempted proofs by contradiction fail


because they analyze the negation of a claim like “all
As are Bs” incorrectly.) In other words, A ⊈ A ∪ B iff
there is an x such that x ∈ A and x ∉ A ∪ B. From
then on, it’s easy.

So, there is an x ∈ A such that x ∉ A ∪ B. By definition of ∪,


x ∈ A ∪ B iff x ∈ A or x ∈ B. Since x ∈ A, we have x ∈ A ∪ B.
This contradicts the assumption that x ∉ A ∪ B. □

Proposition C.12. If A ⊆ B and B ⊆ C then A ⊆ C .

Proof. Suppose A ⊆ B and B ⊆ C . We want to show A ⊆ C .

Let’s proceed indirectly: we assume the negation of


what we want to establish.

Suppose not, i.e., A ⊈ C .

As before, we reason that A ⊈ C iff not every x ∈ A


is also ∈ C , i.e., some x ∈ A is ∉ C . Don’t worry,
with practice you won’t have to think hard anymore
to unpack negations like this.

In other words, there is an x such that x ∈ A and x ∉ C .

Now we can use this to get to our contradiction. Of


course, we’ll have to use the other two assumptions
to do it.

Since A ⊆ B, x ∈ B. Since B ⊆ C , x ∈ C . But this contradicts


x ∉ C. □

Proposition C.13. If A ∪ B = A ∩ B then A = B.

Proof. Suppose A ∪ B = A ∩ B. We want to show that A = B.

The beginning is now routine:

Assume, by way of contradiction, that A ≠ B.

Our assumption for the proof by contradiction is that


A ≠ B. Since A = B iff A ⊆ B and B ⊆ A, we get that
A ≠ B iff A ⊈ B or B ⊈ A. (Note how important it is
to be careful when manipulating negations!) To prove
a contradiction from this disjunction, we use a proof
by cases and show that in each case, a contradiction
follows.

A ≠ B iff A ⊈ B or B ⊈ A. We distinguish cases.

In the first case, we assume A ⊈ B, i.e., for some x,


x ∈ A but ∉ B. A ∩ B is defined as those elements
that A and B have in common, so if something isn’t
in one of them, it’s not in the intersection. A ∪ B is
A together with B, so anything in either is also in the
union. This tells us that x ∈ A ∪ B but x ∉ A ∩ B, and
hence that A ∩ B ≠ A ∪ B.

Case 1: A ⊈ B. Then for some x, x ∈ A but x ∉ B. Since


x ∉ B, x ∉ A ∩ B. Since x ∈ A, x ∈ A ∪ B. So, A ∩ B ≠ A ∪ B,
contradicting the assumption that A ∩ B = A ∪ B.
Case 2: B ⊈ A. Then for some y, y ∈ B but y ∉ A. As before,
we have y ∈ A ∪ B but y ∉ A ∩ B, and so A ∩ B ≠ A ∪ B, again
contradicting A ∩ B = A ∪ B. □

C.8 Reading Proofs


Proofs you find in textbooks and articles very seldom give all the
details we have so far included in our examples. Authors often

do not draw attention to when they distinguish cases, when they


give an indirect proof, or don’t mention that they use a definition.
So when you read a proof in a textbook, you will often have to
fill in those details for yourself in order to understand the proof.
Doing this is also good practice to get the hang of the various
moves you have to make in a proof. Let’s look at an example.

Proposition C.14 (Absorption). For all sets A, B,

A ∩ (A ∪ B) = A

Proof. If z ∈ A ∩ (A ∪ B), then z ∈ A, so A ∩ (A ∪ B) ⊆ A.


Now suppose z ∈ A. Then also z ∈ A ∪ B, and therefore also
z ∈ A ∩ (A ∪ B). □

The preceding proof of the absorption law is very condensed.


There is no mention of any definitions used, no “we have to prove
that” before we prove it, etc. Let’s unpack it. The proposition
proved is a general claim about any sets A and B, and when the
proof mentions A or B, these are variables for arbitrary sets. The
general claim the proof establishes is what's required to prove
identity of sets, i.e., that every element of the left side of the
identity is an element of the right and vice versa.

“If z ∈ A ∩ (A ∪ B), then z ∈ A, so A ∩ (A ∪ B) ⊆ A.”

This is the first half of the proof of the identity: it establishes


that if an arbitrary z is an element of the left side, it is also
an element of the right, i.e., A ∩ (A ∪ B) ⊆ A. Assume that
z ∈ A ∩ (A ∪ B). Since z is an element of the intersection of two
sets iff it is an element of both sets, we can conclude that z ∈ A
and also z ∈ A ∪ B. In particular, z ∈ A, which is what we wanted
to show. Since that’s all that has to be done for the first half, we
know that the rest of the proof must be a proof of the second half,
i.e., a proof that A ⊆ A ∩ (A ∪ B).

“Now suppose z ∈ A. Then also z ∈ A ∪ B, and


therefore also z ∈ A ∩ (A ∪ B).”

We start by assuming that z ∈ A, since we are showing that,


for any z , if z ∈ A then z ∈ A∩(A∪B). To show that z ∈ A∩(A∪B),
we have to show (by definition of “∩”) that (i) z ∈ A and also (ii)
z ∈ A ∪ B. Here (i) is just our assumption, so there is nothing
further to prove, and that’s why the proof does not mention it
again. For (ii), recall that z is an element of a union of sets
iff it is an element of at least one of those sets. Since z ∈ A,
and A ∪ B is the union of A and B, this is the case here. So
z ∈ A ∪ B. We’ve shown both (i) z ∈ A and (ii) z ∈ A ∪ B, hence,
by definition of “∩,” z ∈ A ∩ (A ∪ B). The proof doesn’t mention
those definitions; it’s assumed the reader has already internalized
them. If you haven’t, you’ll have to go back and remind yourself
what they are. Then you’ll also have to recognize why it follows
from z ∈ A that z ∈ A ∪ B, and from z ∈ A and z ∈ A ∪ B that
z ∈ A ∩ (A ∪ B).
Here’s another version of the proof above, with everything
made explicit:

Proof. [By definition of = for sets, to prove A ∩ (A ∪ B) = A we have to show
(a) A ∩ (A ∪ B) ⊆ A and (b) A ⊆ A ∩ (A ∪ B). (a): By definition
of ⊆, we have to show that if z ∈ A ∩ (A ∪ B), then z ∈ A.] If
z ∈ A ∩ (A ∪ B), then z ∈ A [since by definition of ∩, z ∈ A ∩ (A ∪ B)
iff z ∈ A and z ∈ A ∪ B], so A ∩ (A ∪ B) ⊆ A. [(b): By definition
of ⊆, we have to show that if z ∈ A, then z ∈ A ∩ (A ∪ B).] Now
suppose [(1)] z ∈ A. Then also [(2)] z ∈ A ∪ B [since by (1) z ∈ A
or z ∈ B, which by definition of ∪ means z ∈ A ∪ B], and therefore
also z ∈ A ∩ (A ∪ B) [since the definition of ∩ requires that z ∈ A,
i.e., (1), and z ∈ A ∪ B, i.e., (2)]. □
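
The absorption law, too, can be tried out on concrete sets. Here is a short Python sketch (an illustration only; it checks a handful of instances, whereas the proof covers all sets):

    # Check the absorption law A ∩ (A ∪ B) = A on some sample sets.
    samples = [
        (set(), set()),
        ({1, 2}, {2, 3}),
        ({1}, set()),
        ({1, 2, 3}, {4, 5}),
    ]
    for A, B in samples:
        assert A & (A | B) == A   # & is intersection, | is union
    print("Absorption holds in all sample cases.")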

C.9 I Can’t Do It!


We all get to a point where we feel like giving up. But you can do
it. Your instructor and teaching assistant, as well as your fellow
students, can help. Ask them for help! Here are a few tips to help
you avoid a crisis, and what to do if you feel like giving up.

To make sure you can solve problems successfully, do the fol-


lowing:

1. Start as far in advance as possible. We get busy throughout
the semester and many of us struggle with procrastination, so
one of the best things you can do is to start your homework
assignments early. That way, if you're stuck, you have time
to look for a solution (that isn't crying).

2. Talk to your classmates. You are not alone. Others in the


class may also struggle—but they may struggle with differ-
ent things. Talking it out with your peers can give you
a different perspective on the problem that might lead to
a breakthrough. Of course, don’t just copy their solution:
ask them for a hint, or explain where you get stuck and ask
them for the next step. And when you do get it, recipro-
cate. Helping someone else along, and explaining things
will help you understand better, too.

3. Ask for help. You have many resources available to you—


your instructor and teaching assistant are there for you
and want you to succeed. They should be able to help
you work out a problem and identify where in the process
you’re struggling.

4. Take a break. If you’re stuck, it might be because you’ve been


staring at the problem for too long. Take a short break,
have a cup of tea, or work on a different problem for a
while, then return to the problem with a fresh mind. Sleep
on it.

Notice how these strategies require that you’ve started to work


on the proof well in advance? If you’ve started the proof at 2am
the day before it’s due, these might not be so helpful.
This might sound like doom and gloom, but solving a proof
is a challenge that pays off in the end. Some people do this as
a career—so there must be something to enjoy about it. Like

basically everything, solving problems and doing proofs is some-


thing that requires practice. You might see classmates who find
this easy: they’ve probably just had lots of practice already. Try
not to give in too easily.
If you do run out of time (or patience) on a particular prob-
lem: that’s ok. It doesn’t mean you’re stupid or that you will never
get it. Find out (from your instructor or another student) how it
is done, and identify where you went wrong or got stuck, so you
can avoid doing that the next time you encounter a similar issue.
Then try to do it without looking at the solution. And next time,
start (and ask for help) earlier.

C.10 Other Resources


There are many books on how to do proofs in mathematics which
may be useful. Check out How to Read and do Proofs: An Intro-
duction to Mathematical Thought Processes (Solow, 2013) and How
to Prove It: A Structured Approach (Velleman, 2019) in particular.
The Book of Proof (Hammack, 2013) and Mathematical Reasoning
(Sandstrum, 2019) are books on proof that are freely available
online. Philosophers might find More Precisely: The Math you need
to do Philosophy (Steinhart, 2018) to be a good primer on mathe-
matical reasoning.
There are also various shorter guides to proofs available on
the internet; e.g., “Introduction to Mathematical Arguments”
(Hutchings, 2003) and “How to write proofs” (Cheng, 2004).

Motivational Videos
Feel like you have no motivation to do your homework? Feeling
down? These videos might help!

• https://www.youtube.com/watch?v=ZXsQAXx_ao0

• https://www.youtube.com/watch?v=BQ4yd2W50No

• https://www.youtube.com/watch?v=StTqXEQ2l-Y

Problems
Problem C.1. Suppose you are asked to prove that A ∩ B ≠ ∅.
Unpack all the definitions occurring here, i.e., restate this in a way
that does not mention “∩”, “=”, or “∅”.

Problem C.2. Prove indirectly that A ∩ B ⊆ A.

Problem C.3. Expand the following proof of A ∪ (A ∩ B) = A,


where you mention all the inference patterns used, why each step
follows from assumptions or claims established before it, and
where we have to appeal to which definitions.

Proof. If z ∈ A ∪ (A ∩ B) then z ∈ A or z ∈ A ∩ B. If z ∈ A ∩ B,
z ∈ A. Any z ∈ A is also ∈ A ∪ (A ∩ B). □
APPENDIX D

Induction
D.1 Introduction
Induction is an important proof technique which is used, in dif-
ferent forms, in almost all areas of logic, theoretical computer
science, and mathematics. It is needed to prove many of the re-
sults in logic.
Induction is often contrasted with deduction, and character-
ized as the inference from the particular to the general. For in-
stance, if we observe many green emeralds, and nothing that we
would call an emerald that’s not green, we might conclude that
all emeralds are green. This is an inductive inference, in that it
proceeds from many particular cases (this emerald is green, that
emerald is green, etc.) to a general claim (all emeralds are green).
Mathematical induction is also an inference that concludes a gen-
eral claim, but it is of a very different kind than this “simple in-
duction.”
Very roughly, an inductive proof in mathematics concludes
that all mathematical objects of a certain sort have a certain prop-
erty. In the simplest case, the mathematical objects an inductive
proof is concerned with are natural numbers. In that case an in-
ductive proof is used to establish that all natural numbers have
some property, and it does this by showing that

1. 0 has the property, and


2. whenever a number k has the property, so does k + 1.

Induction on natural numbers can then also often be used to


prove general claims about mathematical objects that can be assigned
numbers. For instance, finite sets each have a finite number n of
elements, and if we can use induction to show that every num-
ber n has the property “all finite sets of size n are . . . ” then we
will have shown something about all finite sets.
Induction can also be generalized to mathematical objects
that are inductively defined. For instance, expressions of a formal
language such as those of first-order logic are defined inductively.
Structural induction is a way to prove results about all such expres-
sions. Structural induction, in particular, is very useful—and
widely used—in logic.

D.2 Induction on N
In its simplest form, induction is a technique used to prove results
for all natural numbers. It uses the fact that by starting from 0
and repeatedly adding 1 we eventually reach every natural num-
ber. So to prove that something is true for every number, we can
(1) establish that it is true for 0 and (2) show that whenever it is
true for a number n, it is also true for the next number n +1. If we
abbreviate “number n has property P ” by P (n) (and “number k
has property P ” by P (k ), etc.), then a proof by induction that
P (n) for all n ∈ N consists of:

1. a proof of P (0), and

2. a proof that, for any k , if P (k ) then P (k + 1).

To make this crystal clear, suppose we have both (1) and (2).
Then (1) tells us that P (0) is true. If we also have (2), we know
in particular that if P (0) then P (0 + 1), i.e., P (1). This follows
from the general statement “for any k , if P (k ) then P (k + 1)” by
putting 0 for k . So by modus ponens, we have that P (1). From (2)
again, now taking 1 for k, we have: if P (1) then P (2). Since we’ve

just established P (1), by modus ponens, we have P (2). And so


on. For any number n, after doing this n times, we eventually
arrive at P (n). So (1) and (2) together establish P (n) for any
n ∈ N.
Let’s look at an example. Suppose we want to find out how
many different sums we can throw with n dice. Although it might
seem silly, let’s start with 0 dice. If you have no dice there’s only
one possible sum you can “throw”: no dots at all, which sums
to 0. So the number of different possible throws is 1. If you have
only one die, i.e., n = 1, there are six possible values, 1 through 6.
With two dice, we can throw any sum from 2 through 12, that’s 11
possibilities. With three dice, we can throw any number from 3 to
18, i.e., 16 different possibilities. 1, 6, 11, 16: looks like a pattern:
maybe the answer is 5n + 1? Of course, 5n + 1 is the maximum
possible, because there are only 5n + 1 numbers between n, the
lowest value you can throw with n dice (all 1’s) and 6n, the highest
you can throw (all 6’s).

Theorem D.1. With n dice one can throw all 5n + 1 possible values
between n and 6n.

Proof. Let P (n) be the claim: “It is possible to throw any number
between n and 6n using n dice.” To use induction, we prove:

1. The induction basis P (1), i.e., with just one die, you can
throw any number between 1 and 6.

2. The induction step, for all k , if P (k ) then P (k + 1).

(1) is proved by inspecting a 6-sided die. It has 6 sides,
and every number between 1 and 6 shows up on one of the sides.
So it is possible to throw any number between 1 and 6 using a
single die.
To prove (2), we assume the antecedent of the conditional,
i.e., P (k ). This assumption is called the inductive hypothesis. We
use it to prove P (k + 1). The hard part is to find a way of thinking
about the possible values of a throw of k + 1 dice in terms of the

possible values of throws of k dice plus the value of the extra
(k + 1)-st die—this is what we have to do, though, if we want to use
the inductive hypothesis.
The inductive hypothesis says we can get any number between
k and 6k using k dice. If we throw a 1 with our (k + 1)-st die, this
adds 1 to the total. So we can throw any value between k + 1 and
6k + 1 by throwing k dice and then rolling a 1 with the (k + 1)-st
die. What’s left? The values 6k + 2 through 6k + 6. We can get
these by rolling k 6s and then a number between 2 and 6 with
our (k + 1)-st die. Together, this means that with k + 1 dice we
can throw any of the numbers between k + 1 and 6(k + 1), i.e.,
we’ve proved P (k + 1) using the assumption P (k ), the inductive
hypothesis. □
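
For small numbers of dice you can also verify Theorem D.1 by brute force. Here is a Python sketch (just a sanity check on small cases; the induction proof is what covers every n):

    # Enumerate all rolls of n dice and collect the possible sums.
    from itertools import product

    for n in range(1, 5):
        sums = {sum(roll) for roll in product(range(1, 7), repeat=n)}
        assert sums == set(range(n, 6 * n + 1))   # every value from n to 6n
        assert len(sums) == 5 * n + 1
    print("Checked n = 1 to 4.")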

Very often we use induction when we want to prove something


about a series of objects (numbers, sets, etc.) that is itself defined
“inductively,” i.e., by defining the (n+1)-st object in terms of the n-
th. For instance, we can define the sum sn of the natural numbers
up to n by

s0 = 0
sn+1 = sn + (n + 1)

This definition gives:

s0 = 0,
s1 = s0 + 1 = 1,
s2 = s1 + 2 = 1 + 2 = 3,
s3 = s2 + 3 = 1 + 2 + 3 = 6, etc.

Now we can prove, by induction, that sn = n (n + 1)/2.

Proposition D.2. sn = n (n + 1)/2.

Proof. We have to prove (1) that s 0 = 0 · (0 + 1)/2 and (2) if


sk = k (k + 1)/2 then sk +1 = (k + 1) (k + 2)/2. (1) is obvious. To

prove (2), we assume the inductive hypothesis: sk = k (k + 1)/2.


Using it, we have to show that sk +1 = (k + 1) (k + 2)/2.
What is sk +1 ? By the definition, sk +1 = sk + (k + 1). By in-
ductive hypothesis, sk = k (k + 1)/2. We can substitute this into
the previous equation, and then just need a bit of arithmetic of
fractions:
sk+1 = k (k + 1)/2 + (k + 1)
     = k (k + 1)/2 + 2(k + 1)/2
     = (k (k + 1) + 2(k + 1))/2
     = (k + 2) (k + 1)/2. □
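
It is also easy to compare the inductive definition of sn with the closed form on a computer. The following Python sketch (an illustration, not a proof) computes sn by repeated addition, exactly as the definition prescribes, and checks it against n(n + 1)/2:

    def s(n):
        # s0 = 0, s_{n+1} = s_n + (n + 1)
        total = 0
        for i in range(1, n + 1):
            total += i
        return total

    for n in range(20):
        assert s(n) == n * (n + 1) // 2
    print("Closed form matches for n = 0 to 19.")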
The important lesson here is that if you’re proving something
about some inductively defined sequence an , induction is the ob-
vious way to go. And even if it isn’t (as in the case of the possibil-
ities of dice throws), you can use induction if you can somehow
relate the case for k + 1 to the case for k .

D.3 Strong Induction


In the principle of induction discussed above, we prove P (0) and
also if P (k ), then P (k + 1). In the second part, we assume that
P (k ) is true and use this assumption to prove P (k + 1). Equiva-
lently, of course, we could assume P (k − 1) and use it to prove
P (k )—the important part is that we be able to carry out the in-
ference from any number to its successor; that we can prove the
claim in question for any number under the assumption it holds
for its predecessor.
There is a variant of the principle of induction in which we
don’t just assume that the claim holds for the predecessor k − 1
of k , but for all numbers smaller than k , and use this assump-
tion to establish the claim for k . This also gives us the claim
P (n) for all n ∈ N. For once we have established P (0), we have

thereby established that P holds for all numbers less than 1. And
if we know that if P (l ) for all l < k , then P (k ), we know this
in particular for k = 1. So we can conclude P (1). With this we
have proved P (0) and P (1), i.e., P (l ) for all l < 2, and since we
have also the conditional, if P (l ) for all l < 2, then P (2), we can
conclude P (2), and so on.
In fact, if we can establish the general conditional “for all k ,
if P (l ) for all l < k , then P (k ),” we do not have to establish P (0)
anymore, since it follows from it. For remember that a general
claim like “for all l < k , P (l )” is true if there are no l < k . This
is a case of vacuous quantification: “all As are Bs” is true if there
are no As, ∀x (A(x) → B (x)) is true if no x satisfies A(x). In this
case, the formalized version would be “∀l (l < k → P (l ))”—and
that is true if there are no l < k . And if k = 0 that’s exactly the
case: no l < 0, hence “for all l < 0, P (l )” is true, whatever P is.
A proof of “if P (l ) for all l < k , then P (k )” thus automatically
establishes P (0).
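
Vacuous quantification can feel strange at first. If it helps, note that programming languages treat “for all” over an empty collection the same way; here is a tiny Python illustration (our own example):

    def P(l):
        return False              # even an always-false property

    # "For all l < 0, P(l)": there are no such l, so the claim is true.
    print(all(P(l) for l in range(0)))   # prints True, since range(0) is empty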
This variant is useful if establishing the claim for k can’t be
made to just rely on the claim for k − 1 but may require the
assumption that it is true for one or more l < k .

D.4 Inductive Definitions


In logic we very often define kinds of objects inductively, i.e., by
specifying rules for what counts as an object of the kind to be
defined which explain how to get new objects of that kind from
old objects of that kind. For instance, we often define special
kinds of sequences of symbols, such as the terms and formulas of
a language, by induction. For a simple example, consider strings
consisting of letters a, b, c, d, the symbol ◦, and brackets [ and
], such as “[[c◦d] [”, “[a[]◦]”, “a” or “[[a◦b] ◦d]”. You probably
feel that there’s something “wrong” with the first two strings: the
brackets don’t “balance” at all in the first, and you might feel that
the “◦” should “connect” expressions that themselves make sense.
The third and fourth string look better: for every “[” there’s a

closing “]” (if there are any at all), and for any ◦ we can find “nice”
expressions on either side, surrounded by a pair of brackets.
We would like to precisely specify what counts as a “nice
term.” First of all, every letter by itself is nice. Anything that’s
not just a letter by itself should be of the form “[t ◦ s ]” where s
and t are themselves nice. Conversely, if t and s are nice, then we
can form a new nice term by putting a ◦ between them and
surrounding them with a pair of brackets. We might use these operations
to define the set of nice terms. This is an inductive definition.

Definition D.3 (Nice terms). The set of nice terms is induc-


tively defined as follows:

1. Any letter a, b, c, d is a nice term.

2. If s 1 and s 2 are nice terms, then so is [s 1 ◦ s 2 ].

3. Nothing else is a nice term.

This definition tells us that something counts as a nice term iff


it can be constructed according to the two conditions (1) and (2)
in some finite number of steps. In the first step, we construct all
nice terms just consisting of letters by themselves, i.e.,

a, b, c, d

In the second step, we apply (2) to the terms we’ve constructed.


We’ll get
[a ◦ a], [a ◦ b], [b ◦ a], . . . , [d ◦ d]
for all combinations of two letters. In the third step, we apply
(2) again, to any two nice terms we’ve constructed so far. We get
new nice terms such as [a ◦ [a ◦ a]]—where s1 is a from step 1 and s2
is [a ◦ a] from step 2—and [[b ◦ c] ◦ [d ◦ b]] constructed out of the
two terms [b ◦ c] and [d ◦ b] from step 2. And so on. Clause (3)
rules out that anything not constructed in this way sneaks into
the set of nice terms.
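
The stage-by-stage construction can also be carried out mechanically. Here is a short Python sketch (an illustration of the idea only; the string representation is just our choice):

    # Build nice terms in stages: start with the letters, then repeatedly
    # apply clause (2), which forms [s1 ◦ s2] from terms already built.
    def next_stage(terms):
        new = set(terms)
        for s1 in terms:
            for s2 in terms:
                new.add("[" + s1 + "◦" + s2 + "]")
        return new

    stage = {"a", "b", "c", "d"}      # step 1: the letters
    stage = next_stage(stage)         # step 2: adds [a◦a], [a◦b], ..., [d◦d]
    stage = next_stage(stage)         # step 3: adds terms like [a◦[a◦a]]
    print(len(stage), "nice terms built after three steps")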

Note that we have not yet proved that every sequence of sym-
bols that “feels” nice is nice according to this definition. However,
it should be clear that everything we can construct does in fact
“feel nice”: brackets are balanced, and ◦ connects parts that are
themselves nice.
The key feature of inductive definitions is that if you want
to prove something about all nice terms, the definition tells you
which cases you must consider. For instance, if you are told that
t is a nice term, the inductive definition tells you what t can look
like: t can be a letter, or it can be [s 1 ◦ s 2 ] for some pair of
nice terms s 1 and s2 . Because of clause (3), those are the only
possibilities.
When proving claims about all of an inductively defined set,
the strong form of induction becomes particularly important. For
instance, suppose we want to prove that for every nice term of
length n, the number of [ in it is < n/2. This can be seen as a
claim about all n: for every n, the number of [ in any nice term
of length n is < n/2.

Proposition D.4. For any n, the number of [ in a nice term of


length n is < n/2.

Proof. To prove this result by (strong) induction, we have to show


that the following conditional claim is true:
If for every l < k, any nice term of length l has < l/2
[’s, then any nice term of length k has < k/2 [’s.
To show this conditional, assume that its antecedent is true, i.e.,
assume that for any l < k , nice terms of length l contain < l /2
[’s. We call this assumption the inductive hypothesis. We want
to show the same is true for nice terms of length k .
So suppose t is a nice term of length k . Because nice terms
are inductively defined, we have two cases: (1) t is a letter by
itself, or (2) t is [s 1 ◦ s 2 ] for some nice terms s 1 and s 2 .
1. t is a letter. Then k = 1, and the number of [ in t is 0.
Since 0 < 1/2, the claim holds.

2. t is [s 1 ◦ s 2 ] for some nice terms s1 and s2 . Let’s let l1 be the


length of s 1 and l2 be the length of s 2 . Then the length k of
t is l1 + l2 + 3 (the lengths of s 1 and s 2 plus three symbols
[, ◦, ]). Since l1 + l2 + 3 is always greater than l1 , l1 < k .
Similarly, l2 < k. That means that the induction hypothesis
applies to the terms s 1 and s 2 : the number m 1 of [ in s 1 is
< l1 /2, and the number m2 of [ in s 2 is < l2 /2.
The number of [ in t is the number of [ in s 1 , plus the
number of [ in s 2 , plus 1, i.e., it is m 1 + m 2 + 1. Since
m1 < l1 /2 and m 2 < l2 /2 we have:

m1 + m2 + 1 < l1/2 + l2/2 + 1 = (l1 + l2 + 2)/2 < (l1 + l2 + 3)/2 = k/2.

In each case, we’ve shown that the number of [ in t is < k /2 (on


the basis of the inductive hypothesis). By strong induction, the
proposition follows. □
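
You can convince yourself of the bound on a few examples by simply counting symbols; the short Python check below (an illustration only, using some of the nice terms mentioned above) does exactly that:

    # Spot-check Proposition D.4: a nice term of length n has < n/2 ['s.
    for t in ["a", "[a◦b]", "[[a◦b]◦d]", "[[b◦c]◦[d◦b]]", "[a◦[a◦a]]"]:
        assert t.count("[") < len(t) / 2
    print("All sample nice terms satisfy the bound.")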

D.5 Structural Induction


So far we have used induction to establish results about all natural
numbers. But a corresponding principle can be used directly to
prove results about all elements of an inductively defined set.
This is often called structural induction, because it depends on the
structure of the inductively defined objects.
Generally, an inductive definition is given by (a) a list of “ini-
tial” elements of the set and (b) a list of operations which produce
new elements of the set from old ones. In the case of nice terms,
for instance, the initial objects are the letters. We only have one
operation:

o(s1, s2) = [s1 ◦ s2]

You can even think of the natural numbers N themselves as being


given by an inductive definition: the initial object is 0, and the
operation is the successor function x + 1.

In order to prove something about all elements of an induc-


tively defined set, i.e., that every element of the set has a prop-
erty P , we must:

1. Prove that the initial objects have P

2. Prove that for each operation o, if the arguments have P ,


so does the result.

For instance, in order to prove something about all nice terms,


we would prove that it is true about all letters, and that it is true
about [s 1 ◦ s 2 ] provided it is true of s 1 and s 2 individually.

Proposition D.5. The number of [ equals the number of ] in any nice


term t .

Proof. We use structural induction. Nice terms are inductively


defined, with letters as initial objects and the operations o for
constructing new nice terms out of old ones.

1. The claim is true for every letter, since the number of [ in


a letter by itself is 0 and the number of ] in it is also 0.

2. Suppose the number of [ in s1 equals the number of ], and


the same is true for s 2 . The number of [ in o (s1 ,s 2 ), i.e., in
[s 1 ◦ s 2 ], is the sum of the number of [ in s 1 and s 2 . The
number of ] in o (s 1 ,s 2 ) is the sum of the number of ] in s 1
and s 2 . Thus, the number of [ in o (s 1 ,s 2 ) equals the number
of ] in o (s 1 ,s2 ). □

Let’s give another proof by structural induction: a proper


initial segment of a string t of symbols is any string s that agrees
with t symbol by symbol, read from the left, but t is longer. So,
e.g., [a ◦ is a proper initial segment of [a ◦ b], but neither are
[b ◦ (they disagree at the second symbol) nor [a ◦ b] (they are
the same length).

Proposition D.6. Every proper initial segment of a nice term t has


more [’s than ]’s.

Proof. By induction on t :
1. t is a letter by itself: Then t has no proper initial segments.

2. t = [s 1 ◦ s 2 ] for some nice terms s1 and s 2 . If r is a proper


initial segment of t , there are a number of possibilities:

a) r is just [: Then r has one more [ than it does ].


b) r is [r 1 where r 1 is a proper initial segment of s1 : Since
s 1 is a nice term, by induction hypothesis, r 1 has more
[ than ] and the same is true for [r 1 .
c) r is [s 1 or [s 1 ◦ : By the previous result, the number
of [ and ] in s1 are equal; so the number of [ in [s 1 or
[s 1 ◦ is one more than the number of ].
d) r is [s 1 ◦ r 2 where r 2 is a proper initial segment of s 2 :
By induction hypothesis, r 2 contains more [ than ]. By
the previous result, the number of [ and of ] in s 1 are
equal. So the number of [ in [s 1 ◦ r 2 is greater than
the number of ].
e) r is [s 1 ◦ s 2 : By the previous result, the number of [
and ] in s1 are equal, and the same for s 2 . So there is
one more [ in [s 1 ◦ s 2 than there are ]. □
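
Both of the last two propositions are easy to test on examples. The following Python sketch (again just a spot check, not a proof) verifies, for one sample nice term, that the brackets balance and that every proper initial segment has more [’s than ]’s:

    t = "[[b◦c]◦[d◦b]]"
    assert t.count("[") == t.count("]")          # Proposition D.5
    for i in range(1, len(t)):                   # every proper initial segment
        prefix = t[:i]
        assert prefix.count("[") > prefix.count("]")   # Proposition D.6
    print("Checked", t)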

D.6 Relations and Functions


When we have defined a set of objects (such as the natural num-
bers or the nice terms) inductively, we can also define relations on
these objects by induction. For instance, consider the following
idea: a nice term t1 is a subterm of a nice term t2 if it occurs as
a part of it. Let’s use a symbol for it: t1 ⊑ t2 . Every nice term
is a subterm of itself, of course: t ⊑ t . We can give an inductive
definition of this relation as follows:

Definition D.7. The relation of a nice term t1 being a subterm


of t2 , t1 ⊑ t2 , is defined by induction on t2 as follows:

1. If t2 is a letter, then t1 ⊑ t2 iff t1 = t2 .

2. If t2 is [s 1 ◦ s 2 ], then t1 ⊑ t2 iff t1 = t2 , t1 ⊑ s 1 , or t1 ⊑ s 2 .

This definition, for instance, will tell us that a ⊑ [b ◦ a]. For


(2) says that a ⊑ [b ◦ a] iff a = [b ◦ a], or a ⊑ b, or a ⊑ a. The
first two are false: a clearly isn’t identical to [b ◦ a], and by (1),
a ⊑ b iff a = b, which is also false. However, also by (1), a ⊑ a iff
a = a, which is true.
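
If we represent nice terms not as strings but as nested pairs, so that the way a term is built up is given to us rather than read off, the inductive definition of ⊑ turns directly into a recursive function. Here is a Python sketch of that idea (the pair representation is our own device for this illustration):

    # A letter is a one-character string; [s1 ◦ s2] is the pair (s1, s2).
    # The two clauses of Definition D.7 become the two branches below.
    def subterm(t1, t2):
        if isinstance(t2, str):                  # t2 is a letter
            return t1 == t2
        s1, s2 = t2                              # t2 = [s1 ◦ s2]
        return t1 == t2 or subterm(t1, s1) or subterm(t1, s2)

    print(subterm("a", ("b", "a")))              # True:  a ⊑ [b◦a]
    print(subterm(("b", "a"), ("b", "a")))       # True:  every term is a subterm of itself
    print(subterm("c", ("b", "a")))              # False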
It’s important to note that the success of this definition de-
pends on a fact that we haven’t proved yet: every nice term t is
either a letter by itself, or there are uniquely determined nice terms
s 1 and s 2 such that t = [s 1 ◦ s 2 ]. “Uniquely determined” here
means that if t = [s 1 ◦ s 2 ] it isn’t also = [r 1 ◦ r 2 ] with s 1 ≠ r 1 or
s 2 ≠ r 2 . If this were the case, then clause (2) may come in conflict
with itself: reading t2 as [s 1 ◦ s2 ] we might get t1 ⊑ t2 , but if we
read t2 as [r 1 ◦ r 2 ] we might get not t1 ⊑ t2 . Before we prove that
this can’t happen, let’s look at an example where it can happen.

Definition D.8. Define bracketless terms inductively by

1. Every letter is a bracketless term.

2. If s1 and s 2 are bracketless terms, then s 1 ◦s 2 is a bracketless


term.

3. Nothing else is a bracketless term.

Bracketless terms are, e.g., a, b◦d, b◦a◦b. Now if we defined


“subterm” for bracketless terms the way we did above, the second
clause would read

If t2 = s1 ◦ s2 , then t1 ⊑ t2 iff t1 = t2 , t1 ⊑ s 1 , or t1 ⊑ s2 .

Now b ◦ a ◦ b is of the form s 1 ◦ s2 with

s 1 = b and s 2 = a ◦ b.

It is also of the form r 1 ◦ r 2 with

r 1 = b ◦ a and r 2 = b.

Now is a ◦ b a subterm of b ◦ a ◦ b? The answer is yes if we go by


the first reading, and no if we go by the second.
The property that the way a nice term is built up from other
nice terms is unique is called unique readability. Since inductive
definitions of relations for such inductively defined objects are
important, we have to prove that it holds.

Proposition D.9. Suppose t is a nice term. Then either t is a letter


by itself, or there are uniquely determined nice terms s 1 , s 2 such that t =
[s1 ◦ s 2 ].

Proof. If t is a letter by itself, the condition is satisfied. So assume


t isn’t a letter by itself. We can tell from the inductive definition
that then t must be of the form [s 1 ◦ s 2 ] for some nice terms s 1
and s 2 . It remains to show that these are uniquely determined,
i.e., if t = [r 1 ◦ r 2 ], then s 1 = r 1 and s 2 = r 2 .
So suppose t = [s 1 ◦ s 2 ] and also t = [r 1 ◦ r 2 ] for nice terms s 1 ,
s 2 , r 1 , r 2 . We have to show that s 1 = r 1 and s 2 = r 2 . First, s 1 and r 1
must be identical, for otherwise one is a proper initial segment of
the other. But by Proposition D.6, that is impossible if s 1 and r 1
are both nice terms. But if s1 = r 1 , then clearly also s 2 = r 2 . □

We can also define functions inductively: e.g., we can define


the function f that maps any nice term to the maximum depth
of nested [. . . ] in it as follows:

Definition D.10. The depth of a nice term, f (t), is defined inductively as follows:

f (t) = 0 if t is a letter,
f (t) = max(f (s1), f (s2)) + 1 if t = [s1 ◦ s2].

For instance,

f ([a ◦ b]) = max(f (a), f (b)) + 1 = max(0, 0) + 1 = 1, and
f ([[a ◦ b] ◦ c]) = max(f ([a ◦ b]), f (c)) + 1 = max(1, 0) + 1 = 2.
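
Using the same pair representation of nice terms as in the earlier sketch, the depth function can be written as a recursive Python function (again only an illustration of Definition D.10, with the representation chosen by us):

    def depth(t):
        if isinstance(t, str):                   # t is a letter
            return 0
        s1, s2 = t                               # t = [s1 ◦ s2]
        return max(depth(s1), depth(s2)) + 1

    print(depth(("a", "b")))                     # 1, i.e., f([a◦b])
    print(depth((("a", "b"), "c")))              # 2, i.e., f([[a◦b]◦c])

Because a pair can be taken apart into its two components in only one way, the recursion always yields a unique value.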
Here, of course, we assume that s1 and s2 are nice terms, and
make use of the fact that every nice term is either a letter or of
the form [s 1 ◦ s 2 ]. It is again important that it can be of this form
in only one way. To see why, consider again the bracketless terms
we defined earlier. The corresponding “definition” would be:
g (t) = 0 if t is a letter,
g (t) = max(g (s1), g (s2)) + 1 if t = s1 ◦ s2.

Now consider the bracketless term a ◦ b ◦ c ◦ d. It can be read in


more than one way, e.g., as s 1 ◦ s 2 with
s1 = a and s 2 = b ◦ c ◦ d,

or as r 1 ◦ r 2 with

r 1 = a ◦ b and r 2 = c ◦ d.
Calculating g according to the first way of reading it would give
g (s1 ◦ s2) = max(g (a), g (b ◦ c ◦ d)) + 1 = max(0, 2) + 1 = 3

while according to the other reading we get

g (r1 ◦ r2) = max(g (a ◦ b), g (c ◦ d)) + 1 = max(1, 1) + 1 = 2

But a function must always yield a unique value; so our “defini-


tion” of g doesn’t define a function at all.

Problems
Problem D.1. Define the set of supernice terms by

1. Any letter a, b, c, d is a supernice term.

2. If s is a supernice term, then so is [s ].

3. If s 1 and s 2 are supernice terms, then so is [s 1 ◦ s 2 ].

4. Nothing else is a supernice term.

Show that the number of [ in a supernice term t of length n is


≤ n/2 + 1.

Problem D.2. Prove by structural induction that no nice term


starts with ].

Problem D.3. Give an inductive definition of the function l ,


where l (t ) is the number of symbols in the nice term t .

Problem D.4. Prove by structural induction on nice terms t that


f (t ) < l (t ) (where l (t ) is the number of symbols in t and f (t ) is
the depth of t as defined in Definition D.10).
APPENDIX E

The Greek
Alphabet
Alpha 𝛼 A Nu 𝜈 N
Beta 𝛽 B Xi 𝜉 𝛯
Gamma 𝛾 𝛤 Omicron o O
Delta 𝛿 𝛥 Pi 𝜋 𝛱
Epsilon 𝜀 E Rho 𝜌 P
Zeta 𝜁 Z Sigma 𝜎 𝛴
Eta 𝜂 H Tau 𝜏 T
Theta 𝜃 𝛩 Upsilon 𝜐 𝛶
Iota 𝜄 I Phi 𝜑 𝛷
Kappa 𝜅 K Chi 𝜒 X
Lambda 𝜆 𝛬 Psi 𝜓 𝛹
Mu 𝜇 M Omega 𝜔 𝛺

Bibliography
Cheng, Eugenia. 2004. How to write proofs: A quick guide. URL http://eugeniacheng.com/wp-content/uploads/2017/02/cheng-proofguide.pdf.

Hammack, Richard. 2013. Book of Proof. Richmond, VA: Virginia Commonwealth University. URL http://www.people.vcu.edu/~rhammack/BookOfProof/BookOfProof.pdf.

Hutchings, Michael. 2003. Introduction to mathematical arguments. URL https://math.berkeley.edu/~hutching/teach/proofs.pdf.

Sandstrum, Ted. 2019. Mathematical Reasoning: Writing and Proof. Allendale, MI: Grand Valley State University. URL https://scholarworks.gvsu.edu/books/7/.

Solow, Daniel. 2013. How to Read and Do Proofs. Hoboken, NJ: Wiley.

Steinhart, Eric. 2018. More Precisely: The Math You Need to Do Philosophy. Peterborough, ON: Broadview, 2nd ed.

Velleman, Daniel J. 2019. How to Prove It: A Structured Approach. Cambridge: Cambridge University Press, 3rd ed.

About the Open
Logic Project
The Open Logic Text is an open-source, collaborative textbook of
formal meta-logic and formal methods, starting at an intermedi-
ate level (i.e., after an introductory formal logic course). Though
aimed at a non-mathematical audience (in particular, students of
philosophy and computer science), it is rigorous.
Coverage of some topics currently included may not yet be
complete, and many sections still require substantial revision.
We plan to expand the text to cover more topics in the future.
We also plan to add features to the text, such as a glossary, a
list of further reading, historical notes, pictures, better explana-
tions, sections explaining the relevance of results to philosophy,
computer science, and mathematics, and more problems and ex-
amples. If you find an error, or have a suggestion, please let the
project team know.
The project operates in the spirit of open source. Not only
is the text freely available, we provide the LaTeX source un-
der the Creative Commons Attribution license, which gives any-
one the right to download, use, modify, re-arrange, convert, and
re-distribute our work, as long as they give appropriate credit.
Please see the Open Logic Project website at openlogicproject.org
for additional information.
