An Introduction to Bayesian Inference, Methods and Computation
Nick Heard
Imperial College London
London, UK
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The aim of writing this text was to provide a fast, accessible introduction to Bayesian
statistical inference. The content is directed at postgraduate students with a
background in a numerate discipline, including some experience in basic probability
theory and statistical estimation. The text accompanies a module of the same name,
Bayesian Methods and Computation, which forms part of the online Master of
Machine Learning and Data Science degree programme at Imperial College London.
Starting from an introduction to the fundamentals of subjective probability, the
course quickly advances to modelling principles, computational approaches and then
advanced modelling techniques. Whilst this rapid development necessitates a light
treatment of some advanced theoretical concepts, the benefit is to fast track the reader
to an exciting wealth of modelling possibilities whilst still providing a key grounding
in the fundamental principles.
To make possible this rapid transition from basic principles to advanced modelling,
the text makes extensive use of the probabilistic programming language Stan, which
is the product of a worldwide initiative to make Bayesian inference on user-defined
statistical models more accessible. Stan is written in C++, meaning it is
computationally fast and can be run in parallel, but the interface is modular and simple. The
future of applied Bayesian inference arguably relies on the broadening development
of such software platforms.
Chapter 1 introduces the core ideas of Bayesian reasoning: decision-making under
uncertainty, specifying subjective probabilities and utility functions, and identifying
optimal decisions as those which maximise expected utility. Prediction and
estimation, the two core tasks in statistical inference, are shown to be special cases of this
broader decision-making framework. The application-driven reader may choose to
skip this chapter, although philosophically it sets the foundation for everything that
follows.
Chapter 2 presents representation theorems which justify the prior × likelihood
formulation synonymous with Bayesian probability models. Simply believing that
unknown variables are exchangeable, meaning probability beliefs are invariant to
relabelling of the variables, is sufficient to guarantee that construction must hold.
The prior distribution distinguishes Bayesian inference from frequentist statistical
methods, and several approaches to specifying prior distributions are discussed. The
1.1.1 Subjectivism
In the seminal work of de Finetti (see the English translation of de Finetti 2017),
the central idea for the Bayesian paradigm is to address decision-making in the
face of uncertainty from a subjective viewpoint. Given the same set of uncertain
circumstances, two decision-makers could differ in the following ways:
• How desirable different potential outcomes might seem to them.
• How likely they consider the various outcomes to be.
• How they feel their actions might affect the eventual outcome.
The Bayesian decision-making paradigm is most easily viewed through the lens of
an individual making choices (“decisions”) in the face of (personal) uncertainty. For
this reason, certain illustrative elements of this section will be purposefully written
in the first person.
This decision-theoretic view of the Bayesian paradigm represents a mathematical
ideal of how a coherent, non-self-contradictory individual should aspire to behave.
This is a non-trivial requirement, made easier with various mathematical formalisms
which will be introduced in the modelling sections of this text. Whilst these
formalisms might not exactly match my beliefs for specific decision problems, the aim
is to present sufficiently many classes of models that one of them might adequately
reflect my opinions up to some acceptable level of approximation.
Coherence is also the most that will be expected from a decision-maker; there
will be no requirement for me to choose in any sense the right decisions from any
perspective other than my own at that time. Everything within the paradigm is
subjective, even apparently absolute concepts such as truth. Statements of certainty such
as “The true value of the parameter is x” should be considered shorthand for “It is my
understanding that the true value of the parameter is x”. This might seem pedantic,
There are numerous sources of individual uncertainty which can complicate decision-
making. These could include:
• Events which have not yet happened, but might happen some time in the future
• Events which have happened which I have not yet learnt about
• Facts which may yet be undiscovered, such as the truth of some mathematical
conjecture
• Facts which may have been discovered elsewhere, but remain unknown to me
• Facts which I have partially or completely forgotten
In the Bayesian paradigm, these and other sources of uncertainty are treated equally.
If there are matters on which I am unsure, then these uncertainties must be
acknowledged and incorporated into a rational decision process. Whether or not I perhaps
should know them is immaterial.
Example 1.1 If rolling a die, I might understandably assume that the outcome
will be in Ω = {⚀, ⚁, ⚂, ⚃, ⚄, ⚅}. Alternatively, I could take a more conservative
viewpoint and extend the space of outcomes to include some unintended or
potentially unforeseen outcomes; for example, Ω = {Dice roll does not take place,
No valid outcome, ⚀, ⚁, ⚂, ⚃, ⚄, ⚅}.
Neither viewpoint in Example 1.1 could irrefutably be said to be right or wrong.
But if I am making a decision which I consider to be affected by the future outcome of
the intended dice roll, I would possibly adopt different positions according to which
set of possible outcomes I chose to focus on. The only requirement for Ω is that it
should contain every outcome I currently conceive to be possible and meaningful to
the decision problem under consideration.
Definition 1.2 (Decision problem) Following Bernardo and Smith (1994), a decision
problem will be composed of three elements:
1. An action a, to be chosen from a set A of possible actions.
2. An uncertain outcome ω, thought to lie within a set Ω of envisaged possible
outcomes.
3. An identifiable consequence, assumed to lie within a set C of possible
consequences, resulting from the combination of both the action taken and the ensuing
outcome which occurs.
Axioms 1 $C$ will be totally ordered, meaning there exists an ordering relation $\leq_C$
on $C$ such that for any pair of consequences $c_1, c_2 \in C$, necessarily $c_1 \leq_C c_2$ or
$c_2 \leq_C c_1$.
If both $c_1 \leq_C c_2$ and $c_2 \leq_C c_1$, then we write $c_1 =_C c_2$. This provides definitions
of (subjective) preference and indifference between consequences.
Remark 1.1 Crucially, the ordering $\leq_C$ is assumed to be subjective; my perceived
ordering of the different consequences must be allowed to differ from that of other
decision-makers.
Definition 1.3 (Preferences on consequences) Suppose $c_1, c_2 \in C$. If $c_1 \leq_C c_2$ and
$c_1 \neq_C c_2$, then $c_2$ is said to be a preferable consequence to $c_1$, written $c_1 <_C c_2$. If
$c_1 =_C c_2$, then I am indifferent between the two consequences.
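The total ordering, indifference and strict preference described above can be illustrated with a minimal Python sketch. The consequences and their numerical ranks below are invented purely for illustration; any other decision-maker could supply a different ranking, and the function names are hypothetical.

```python
from itertools import product

# Hypothetical consequences with an assumed personal ranking (higher is better).
# Two consequences sharing a rank are ones I am indifferent between.
rank = {"lose 10": 0, "no change": 1, "free coffee": 1, "win 10": 2}


def weakly_prefers(c1, c2):
    """True when c1 <=_C c2, i.e. c2 is at least as desirable to me as c1."""
    return rank[c1] <= rank[c2]


def indifferent(c1, c2):
    """c1 =_C c2: the ordering holds in both directions."""
    return weakly_prefers(c1, c2) and weakly_prefers(c2, c1)


def strictly_prefers(c1, c2):
    """c1 <_C c2: c1 <=_C c2 holds, but indifference does not."""
    return weakly_prefers(c1, c2) and not indifferent(c1, c2)


def is_total(consequences):
    """Check that every pair of consequences is comparable in some direction."""
    return all(
        weakly_prefers(a, b) or weakly_prefers(b, a)
        for a, b in product(consequences, repeat=2)
    )


C = list(rank)
total = is_total(C)
```

Ranking by a single number makes totality automatic; the check is included only to mirror the axiom, and it would become informative if `weakly_prefers` were replaced by a hand-specified relation.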
Definition 1.4 (Action) An action defines a function which maps outcomes to
consequences. For simplicity of presentation, until Section 1.5.1 the actions in $A$ will
be assumed to be discrete, meaning that each can be represented by a generic form
$a = \{(E_1, c_1), (E_2, c_2), \ldots\}$, where $c_1, c_2, \ldots \in C$, and $E_1, E_2, \ldots$ are referred to as
fundamental events which form a partition of $\Omega$, meaning $\Omega = \cup_i E_i$ and $E_i \cap E_j = \emptyset$
for $i \neq j$. For example, if I take action $a$, then I anticipate that any outcome
$\omega \in E_1$ would lead to consequence $c_1$, and so on.
Remark 1.2 When actions are identified, in this way, by the perceived consequences
they will lead to under different outcomes, they are subjective.
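As a rough sketch of Definition 1.4, a discrete action can be stored as a list of (fundamental event, consequence) pairs and checked for the partition property. The die-roll outcome space, the betting action and the helper names below are all assumptions made for illustration.

```python
# Hypothetical outcome space for a single die roll.
OMEGA = frozenset(range(1, 7))

# An invented action a = {(E_1, c_1), (E_2, c_2)}: a bet that the roll is even.
bet_on_even = [
    (frozenset({2, 4, 6}), "win 10"),
    (frozenset({1, 3, 5}), "lose 10"),
]


def is_partition(action, omega):
    """Check that the fundamental events cover omega and are pairwise disjoint."""
    events = [event for event, _ in action]
    union = frozenset().union(*events)
    # Events are pairwise disjoint exactly when their sizes sum to the size of their union.
    pairwise_disjoint = sum(len(event) for event in events) == len(union)
    return union == omega and pairwise_disjoint


def consequence(action, outcome):
    """Return the consequence I anticipate if this outcome occurs under the action."""
    for event, c in action:
        if outcome in event:
            return c
    raise ValueError(f"outcome {outcome!r} is not covered by the action")
```

For instance, `consequence(bet_on_even, 4)` looks up the event containing the outcome 4 and returns the consequence attached to it.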
Consider two actions
$a = \{(E_1, c_1), (E_2, c_2), \ldots\}$,
$a' = \{(E'_1, c'_1), (E'_2, c'_2), \ldots\}$.
The overall desirability of each action will depend entirely on the uncertainty
surrounding the fundamental events $E_1, E_2, \ldots$ and $E'_1, E'_2, \ldots$ and the desirability of
the corresponding consequences $c_1, c_2, \ldots$ and $c'_1, c'_2, \ldots$. This can be exploited in
two ways, which will be developed in later sections:
1. If I innately prefer action $a$ to $a'$, then this preference can be used to quantify my
beliefs about the uncertainty surrounding the fundamental events characterising
each action. This will form the basis for eliciting subjective probabilities (see
Sect. 1.3).
2. Reversing the same argument, once I have elicited my probabilities for certain
events then these can be used to obtain preferences between corresponding actions
through the principle of maximising expected utility (see Sect. 1.4.1).
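The second route can be previewed with a small Python sketch, ahead of the formal treatment in Sect. 1.4.1. The probabilities, utilities and actions below are invented, and the sketch assumes that the expected utility of a discrete action is the sum over its fundamental events of the probability of the event multiplied by the utility of the associated consequence.

```python
from fractions import Fraction

# Hypothetical die-roll outcome space with assumed subjective probabilities.
OMEGA = range(1, 7)
prob = {w: Fraction(1, 6) for w in OMEGA}

# Assumed utilities for each consequence; here a loss weighs more than an equal gain.
utility = {"win 10": 10, "lose 10": -12, "no change": 0}

# Invented discrete actions, each a list of (fundamental event, consequence) pairs.
actions = {
    "bet on even": [({2, 4, 6}, "win 10"), ({1, 3, 5}, "lose 10")],
    "free bet on six": [({6}, "win 10"), ({1, 2, 3, 4, 5}, "no change")],
    "no bet": [(set(OMEGA), "no change")],
}


def expected_utility(action):
    """Sum of P(E_i) * u(c_i) over the fundamental events of the action."""
    return sum(
        sum(prob[w] for w in event) * utility[c]
        for event, c in action
    )


# The preferred action is the one with the largest expected utility.
best = max(actions, key=lambda name: expected_utility(actions[name]))
```

Exact fractions are used rather than floating-point numbers so that comparisons between expected utilities are not affected by rounding.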
Remark 1.3 The two actions {(F, c1 ), (F, c2 )} and {(E, c1 ), (E, c2 )} only differ in
the consequences anticipated from any ω ∈ E ∩ F; that is, the event E ∩ F would
lead to a consequence of c1 under the first action and c2 under the second.