WEBVTT
00:00:03.300 --> 00:00:05.900
Putting together a probabilistic
model--
00:00:05.900 --> 00:00:09.060
that is, a model of a random
phenomenon or a random
00:00:09.060 --> 00:00:10.300
experiment--
00:00:10.300 --> 00:00:12.360
involves two steps.
00:00:12.360 --> 00:00:16.020
First step, we describe the
possible outcomes of the
00:00:16.020 --> 00:00:18.930
phenomenon or experiment
of interest.
00:00:18.930 --> 00:00:22.940
Second step, we describe our
beliefs about the likelihood
00:00:22.940 --> 00:00:25.660
of the different possible
outcomes by specifying a
00:00:25.660 --> 00:00:27.570
probability law.
00:00:27.570 --> 00:00:31.650
Here, we start by just talking
about the first step, namely,
00:00:31.650 --> 00:00:33.830
the description of the possible
outcomes of the
00:00:33.830 --> 00:00:34.950
experiment.
00:00:34.950 --> 00:00:36.760
So we carry out an experiment.
00:00:36.760 --> 00:00:38.540
For example, we flip a coin.
00:00:38.540 --> 00:00:41.400
Or maybe we flip five coins
simultaneously.
00:00:41.400 --> 00:00:43.740
Or maybe we roll a die.
00:00:43.740 --> 00:00:48.390
Whatever that experiment is,
it has a number of possible
00:00:48.390 --> 00:00:51.970
outcomes, and we start
by making a list of
00:00:51.970 --> 00:00:53.830
the possible outcomes--
00:00:53.830 --> 00:00:57.490
or, a better word, instead of
the word "list", is to use the
00:00:57.490 --> 00:01:01.520
word "set", which has a more
formal mathematical meaning.
00:01:01.520 --> 00:01:05.400
So we create a set
that we usually
00:01:05.400 --> 00:01:09.850
denote by capital omega.
00:01:09.850 --> 00:01:14.570
That set is called the sample
space and is the set of all
00:01:14.570 --> 00:01:17.565
possible outcomes of
our experiment.
00:01:21.289 --> 00:01:24.750
The elements of that set
should have certain
00:01:24.750 --> 00:01:25.980
properties.
00:01:25.980 --> 00:01:29.590
Namely, the elements should
be mutually exclusive and
00:01:29.590 --> 00:01:31.620
collectively exhaustive.
00:01:31.620 --> 00:01:32.860
What does that mean?
00:01:32.860 --> 00:01:36.289
Mutually exclusive means that,
if at the end of the
00:01:36.289 --> 00:01:41.160
experiment, I tell you that this
outcome happened, then it
00:01:41.160 --> 00:01:44.970
should not be possible that this
outcome also happened.
00:01:44.970 --> 00:01:47.759
At the end of the experiment,
there can only be one of the
00:01:47.759 --> 00:01:50.200
outcomes that has happened.
00:01:50.200 --> 00:01:53.420
Being collectively exhaustive
means something else-- that,
00:01:53.420 --> 00:01:57.330
together, all of these elements
of the set exhaust
00:01:57.330 --> 00:01:59.400
all the possibilities.
00:01:59.400 --> 00:02:03.400
So no matter what, at the end,
you will be able to point to
00:02:03.400 --> 00:02:07.860
one of the outcomes and say,
that's the one that occurred.
00:02:07.860 --> 00:02:09.009
To summarize--
00:02:09.009 --> 00:02:12.340
this set should be such that, at
the end of the experiment,
00:02:12.340 --> 00:02:17.270
you should be always able to
point to one, and exactly one,
00:02:17.270 --> 00:02:20.660
of the possible outcomes and say
that this is the outcome
00:02:20.660 --> 00:02:21.910
that occurred.
00:02:23.700 --> 00:02:28.870
Physically different outcomes
should be distinguished in the
00:02:28.870 --> 00:02:33.530
sample space and correspond
to distinct points.
00:02:33.530 --> 00:02:35.760
But when we say physically
different
00:02:35.760 --> 00:02:37.840
outcomes, what do we mean?
00:02:37.840 --> 00:02:41.910
We really mean different in
all relevant aspects but
00:02:41.910 --> 00:02:45.880
perhaps not different in
irrelevant aspects.
00:02:45.880 --> 00:02:50.180
Let's make more precise what I
mean by that by looking at a
00:02:50.180 --> 00:02:53.600
very simple, and maybe
silly, example,
00:02:53.600 --> 00:02:54.625
which is the following.
00:02:54.625 --> 00:02:58.360
Suppose that you flip a coin
and you see whether it
00:02:58.360 --> 00:03:01.980
resulted in heads or tails.
00:03:01.980 --> 00:03:05.890
So you have a perfectly
legitimate sample space for
00:03:05.890 --> 00:03:09.470
this experiment which consists
of just two points--
00:03:09.470 --> 00:03:11.200
heads and tails.
00:03:11.200 --> 00:03:16.750
Together these two outcomes
exhaust all possibilities.
00:03:16.750 --> 00:03:19.380
And the two outcomes are
mutually exclusive.
00:03:19.380 --> 00:03:22.200
So this is a very legitimate
sample space for this
00:03:22.200 --> 00:03:23.760
experiment.
00:03:23.760 --> 00:03:26.620
Now suppose that while you were
flipping the coin, you
00:03:26.620 --> 00:03:30.110
also looked outside the window
to check the weather.
00:03:30.110 --> 00:03:37.140
And then you could say that my
sample space is really, heads,
00:03:37.140 --> 00:03:38.410
and it's raining.
00:03:40.970 --> 00:03:44.965
Another possible outcome
is heads and no rain.
00:03:48.780 --> 00:03:55.640
Another possible outcome is
tails, and it's raining, and,
00:03:55.640 --> 00:03:59.975
finally, another possible
outcome is tails and no rain.
00:04:05.490 --> 00:04:11.560
This set, consisting of four
elements, is also a perfectly
00:04:11.560 --> 00:04:14.315
legitimate sample space
for the experiment
00:04:14.315 --> 00:04:16.140
of flipping a coin.
00:04:16.140 --> 00:04:19.060
The elements of this sample
space are mutually exclusive
00:04:19.060 --> 00:04:20.390
and collectively exhaustive.
00:04:20.390 --> 00:04:24.830
Exactly one of these outcomes
is going to be true, or will
00:04:24.830 --> 00:04:27.640
have materialized, at the
end of the experiment.
00:04:27.640 --> 00:04:30.440
So which sample space
is the correct one?
00:04:30.440 --> 00:04:32.950
This sample space, the
second one, involves
00:04:32.950 --> 00:04:34.970
some irrelevant details.
00:04:34.970 --> 00:04:40.090
So the preferred sample space
for describing the flipping of
00:04:40.090 --> 00:04:44.260
a coin, the preferred sample
space is the simpler one, the
00:04:44.260 --> 00:04:48.340
first one, which is sort of at
the right granularity, given
00:04:48.340 --> 00:04:50.640
what we're interested in.
00:04:50.640 --> 00:04:54.010
But ultimately, the question
of which one is the right
00:04:54.010 --> 00:04:56.760
sample space depends
on what kind of
00:04:56.760 --> 00:04:58.840
questions you want to answer.
00:04:58.840 --> 00:05:02.280
For example, if you have a
theory that the weather
00:05:02.280 --> 00:05:07.020
affects the behavior of coins,
then, in order to play with
00:05:07.020 --> 00:05:11.960
that theory, or maybe check it
out, and so on, then, in such
00:05:11.960 --> 00:05:15.810
a case, you might want to work
with the second sample space.
00:05:15.810 --> 00:05:19.070
This is a common feature
in all of science.
00:05:19.070 --> 00:05:22.670
Whenever you put together a
model, you need to decide how
00:05:22.670 --> 00:05:25.080
detailed you want your
model to be.
00:05:25.080 --> 00:05:28.870
And the right level of detail is
the one that captures those
00:05:28.870 --> 00:05:32.500
aspects that are relevant
and of interest to you.