WEBVTT

00:00:01.400 --> 00:00:04.860
We end this lecture sequence
by stepping back to discuss

00:00:04.860 --> 00:00:09.020
what probability theory really
is and what exactly is the

00:00:09.020 --> 00:00:11.990
meaning of the word
probability.

00:00:11.990 --> 00:00:14.990
In the most narrow view,
probability theory is just a

00:00:14.990 --> 00:00:16.590
branch of mathematics.

00:00:16.590 --> 00:00:18.590
We start with some axioms.

00:00:18.590 --> 00:00:22.080
We consider models that satisfy
these axioms, and we

00:00:22.080 --> 00:00:24.410
establish some consequences,
which are the

00:00:24.410 --> 00:00:27.110
theorems of this theory.

00:00:27.110 --> 00:00:30.730
You could do all that without
ever asking the question of

00:00:30.730 --> 00:00:34.170
what the word "probability"
really means.

00:00:34.170 --> 00:00:37.730
Yet, one of the theorems of
probability theory, that we

00:00:37.730 --> 00:00:41.900
will see later in this class,
is that probabilities can be

00:00:41.900 --> 00:00:46.250
interpreted as frequencies,
very loosely speaking.

00:00:46.250 --> 00:00:50.120
If I have a fair coin, and I
toss it infinitely many times,

00:00:50.120 --> 00:00:52.580
then the fraction of
heads that I will

00:00:52.580 --> 00:00:55.170
observe will be one half.

00:00:55.170 --> 00:00:58.880
In this sense, the probability
of an event, A, can be

00:00:58.880 --> 00:01:03.160
interpreted as the frequency
with which event A will occur

00:01:03.160 --> 00:01:07.880
in an infinite number of
repetitions of the experiment.

00:01:07.880 --> 00:01:10.780
But is this all there is?

00:01:10.780 --> 00:01:13.090
If we're dealing with coin
tosses, it makes sense to

00:01:13.090 --> 00:01:15.410
think of probabilities
as frequencies.

00:01:15.410 --> 00:01:20.230
But consider a statement such
as the "current president of

00:01:20.230 --> 00:01:23.620
my country will be reelected
in the next election with

00:01:23.620 --> 00:01:26.390
probability 0.7".

00:01:26.390 --> 00:01:30.180
It's hard to think of this
number, 0.7, as a frequency.

00:01:30.180 --> 00:01:33.020
It does not make sense to
think of infinitely many

00:01:33.020 --> 00:01:35.789
repetitions of the
next election.

00:01:35.789 --> 00:01:39.750
In cases like this, and in many
others, it is better to

00:01:39.750 --> 00:01:43.210
think of probabilities
as just some way of

00:01:43.210 --> 00:01:45.300
describing our beliefs.

00:01:45.300 --> 00:01:48.920
And if you're a betting person,
probabilities can be

00:01:48.920 --> 00:01:52.460
thought of as some numerical
guidance into what kinds of

00:01:52.460 --> 00:01:56.780
bets you might be
willing to make.

00:01:56.780 --> 00:02:01.440
But now if we think of
probabilities as beliefs, you

00:02:01.440 --> 00:02:04.590
can run into the argument
that, well, beliefs are

00:02:04.590 --> 00:02:05.820
subjective.

00:02:05.820 --> 00:02:09.860
Isn't probability theory
supposed to be an objective

00:02:09.860 --> 00:02:12.270
part of math and science?

00:02:12.270 --> 00:02:16.260
Is probability theory just an
exercise in subjectivity?

00:02:16.260 --> 00:02:18.540
Well, not quite.

00:02:18.540 --> 00:02:20.250
There's more to it.

00:02:20.250 --> 00:02:24.210
Probability, at the minimum,
gives us some rules for

00:02:24.210 --> 00:02:29.310
thinking systematically about
uncertain situations.

00:02:29.310 --> 00:02:32.450
And if it happens that our
probability model, our

00:02:32.450 --> 00:02:36.380
subjective beliefs, have some
relation with the real world,

00:02:36.380 --> 00:02:39.910
then probability theory can be
a very useful tool for making

00:02:39.910 --> 00:02:45.000
predictions and decisions that
apply to the real world.

00:02:45.000 --> 00:02:48.750
Now, whether your predictions
and decisions will be any good

00:02:48.750 --> 00:02:53.120
will depend on whether you
have chosen a good model.

00:02:53.120 --> 00:02:55.640
Have you chosen a model that's
provides a good enough

00:02:55.640 --> 00:02:59.079
representation of
the real world?

00:02:59.079 --> 00:03:01.660
How do you make sure that
this is the case?

00:03:01.660 --> 00:03:05.020
There's a whole field, the field
of statistics, whose

00:03:05.020 --> 00:03:09.270
purpose is to complement
probability theory by using

00:03:09.270 --> 00:03:12.750
data to come up with
good models.

00:03:12.750 --> 00:03:17.340
And so we have the following
diagram that summarizes the

00:03:17.340 --> 00:03:20.200
relation between the real
world, statistics, and

00:03:20.200 --> 00:03:21.250
probability.

00:03:21.250 --> 00:03:23.920
The real world generates data.

00:03:23.920 --> 00:03:27.230
The field of statistics and
inference uses these data to

00:03:27.230 --> 00:03:29.680
come up with probabilistic
models.

00:03:29.680 --> 00:03:32.720
Once we have a probabilistic
model, we use probability

00:03:32.720 --> 00:03:36.390
theory and the analysis tools
that it provides to us.

00:03:36.390 --> 00:03:40.360
And the results that we get
from this analysis lead to

00:03:40.360 --> 00:03:42.650
predictions and decisions
about the real world.