WEBVTT
00:00:02.660 --> 00:00:05.830
The last discrete random
variable that we will discuss
00:00:05.830 --> 00:00:08.730
is the so-called geometric
random variable.
00:00:08.730 --> 00:00:12.060
It shows up in the context of
the following experiment.
00:00:12.060 --> 00:00:16.970
We have a coin and we toss it
infinitely many times and
00:00:16.970 --> 00:00:18.260
independently.
00:00:18.260 --> 00:00:21.420
And at each coin toss we have a
fixed probability of heads,
00:00:21.420 --> 00:00:23.120
which is some given number, p.
00:00:23.120 --> 00:00:28.310
This is a parameter that
specifies the experiment.
00:00:28.310 --> 00:00:31.270
When we say that the infinitely
many tosses are
00:00:31.270 --> 00:00:36.690
independent, what we mean in a
mathematical and formal sense
00:00:36.690 --> 00:00:41.030
is that any finite subset of
those tosses are independent
00:00:41.030 --> 00:00:42.270
of each other.
00:00:42.270 --> 00:00:44.850
I'm only making this comment
because we introduced a
00:00:44.850 --> 00:00:48.340
definition of independence of
finitely many events, but had
00:00:48.340 --> 00:00:51.570
never defined the notion of
independence of infinitely
00:00:51.570 --> 00:00:53.910
many events.
00:00:53.910 --> 00:00:56.530
The sample space for this
experiment is the set of
00:00:56.530 --> 00:00:58.980
infinite sequences of
heads and tails.
00:00:58.980 --> 00:01:01.640
So a typical outcome
of this experiment
00:01:01.640 --> 00:01:03.340
might look like this.
00:01:03.340 --> 00:01:07.990
It's a sequence of heads and
tails in some arbitrary order.
00:01:07.990 --> 00:01:10.570
And of course, it's an infinite
sequence, so it
00:01:10.570 --> 00:01:11.940
continues forever.
00:01:11.940 --> 00:01:13.539
But I'm only showing
you here the
00:01:13.539 --> 00:01:16.100
beginning of that sequence.
00:01:16.100 --> 00:01:19.550
We're interested in the
following random variable, X,
00:01:19.550 --> 00:01:23.160
which is the number of tosses
until the first heads.
00:01:23.160 --> 00:01:27.170
So if our sequence looked like
this, our random variable
00:01:27.170 --> 00:01:30.950
would be taking a value of 5.
00:01:30.950 --> 00:01:34.100
A random variable of this kind
appears in many applications
00:01:34.100 --> 00:01:36.120
and many real world contexts.
00:01:36.120 --> 00:01:39.390
In general, it models situations
where we're waiting
00:01:39.390 --> 00:01:41.770
for something to happen.
00:01:41.770 --> 00:01:46.710
Suppose that we keep carrying
out trials, where each
00:01:46.710 --> 00:01:50.110
trial can result either
in success or failure.
00:01:50.110 --> 00:01:53.550
And we're counting the number
of trials it takes until a
00:01:53.550 --> 00:01:57.670
success is observed for
the first time.
00:01:57.670 --> 00:02:01.370
Now, these trials could be
experiments of some kind,
00:02:01.370 --> 00:02:05.470
could be processes of some kind,
or they could be whether
00:02:05.470 --> 00:02:10.020
a customer shows up in a store
in a particular second or not.
00:02:10.020 --> 00:02:12.940
So there are many diverse
interpretations of the words
00:02:12.940 --> 00:02:17.820
trial and of the word success
that would allow us to apply
00:02:17.820 --> 00:02:21.720
this particular model to
a given situation.
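As a small illustration of the waiting-time experiment just described (a sketch of my own, not part of the lecture; the function name and use of Python's random module are my choices), we can simulate independent Bernoulli(p) trials and count the trials until the first success:

```python
import random

def time_to_first_success(p, rng=random):
    """Repeat independent Bernoulli(p) trials and return the number
    of trials up to and including the first success."""
    k = 1
    while rng.random() >= p:  # this trial failed (tails); try again
        k += 1
    return k

# Averaging many simulated waiting times should come close to 1/p,
# the mean of the geometric distribution.
random.seed(0)
samples = [time_to_first_success(1 / 3) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 3 when p = 1/3
```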
00:02:21.720 --> 00:02:25.010
Now, let us move to the
calculation of the PMF of this
00:02:25.010 --> 00:02:26.300
random variable.
00:02:26.300 --> 00:02:30.280
By definition, what we need to
calculate is the probability
00:02:30.280 --> 00:02:32.530
that the random variable
takes on a
00:02:32.530 --> 00:02:35.160
particular numerical value.
00:02:35.160 --> 00:02:38.690
What does it mean for
X to be equal to k?
00:02:38.690 --> 00:02:43.250
What it means is that the first
heads was observed in
00:02:43.250 --> 00:02:47.690
the k-th trial, which means
that the first k minus 1
00:02:47.690 --> 00:02:53.440
trials were tails, and then were
followed by heads in the
00:02:53.440 --> 00:02:56.530
k-th trial.
00:02:56.530 --> 00:03:00.400
This is an event that only
concerns the first k trials,
00:03:00.400 --> 00:03:04.030
and the probability of this
event can be calculated using
00:03:04.030 --> 00:03:07.400
the fact that different coin
tosses or different trials are
00:03:07.400 --> 00:03:08.100
independent.
00:03:08.100 --> 00:03:12.620
It is the probability of tails
in the first coin toss times
00:03:12.620 --> 00:03:15.840
the probability of tails in the
second coin toss, and so
00:03:15.840 --> 00:03:18.000
on, k minus 1 times.
00:03:18.000 --> 00:03:21.600
So we get an exponent here
of k minus 1 times the
00:03:21.600 --> 00:03:25.120
probability of heads in
the k-th coin toss.
00:03:25.120 --> 00:03:28.380
So this is the form of the PMF
of this particular random
00:03:28.380 --> 00:03:31.630
variable, and that formula
applies for the possible
00:03:31.630 --> 00:03:36.310
values of k, which are the
positive integers.
00:03:36.310 --> 00:03:39.790
Because the time of the
first head can only
00:03:39.790 --> 00:03:41.829
be a positive integer.
00:03:41.829 --> 00:03:45.430
And any positive integer is
possible, so our random
00:03:45.430 --> 00:03:50.770
variable takes values in a
discrete but infinite set.
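The PMF just derived, p_X(k) = (1 - p)^(k-1) * p for k = 1, 2, 3, ..., can be written as a one-line helper (a sketch of my own; the function name is not from the lecture):

```python
def geometric_pmf(k, p):
    """P(X = k) = (1 - p)**(k - 1) * p for positive integers k."""
    if k < 1:
        return 0.0  # the first heads cannot occur before trial 1
    return (1 - p) ** (k - 1) * p

# k = 5 with a fair coin: four tails followed by heads.
print(geometric_pmf(5, 0.5))  # 0.5**5 = 0.03125
```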
00:03:50.770 --> 00:03:54.670
The geometric PMF has a
shape of this type.
00:03:54.670 --> 00:04:00.070
Here we see the plot for the
case where p is equal to 1/3.
00:04:00.070 --> 00:04:03.670
The probability that the first
head shows up in the first
00:04:03.670 --> 00:04:07.490
trial is equal to p, that's
the probability of heads.
00:04:07.490 --> 00:04:11.130
The probability that it shows up
in the next trial, that is, that the
00:04:11.130 --> 00:04:14.760
first heads appears in the second
trial, is the
00:04:14.760 --> 00:04:20.529
probability that we had heads
following a tail.
00:04:20.529 --> 00:04:23.070
So we have the probability of
a tail and then times the
00:04:23.070 --> 00:04:24.580
probability of a head.
00:04:24.580 --> 00:04:28.220
And then each time that we move
to a further entry, we
00:04:28.220 --> 00:04:34.110
multiply by a further
factor of 1 minus p.
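The shape described here, where each successive PMF value is the previous one multiplied by 1 - p, can be checked numerically (a small sketch of my own, using p = 1/3 as in the plot):

```python
p = 1 / 3
# PMF values for k = 1, ..., 7.
pmf = [(1 - p) ** (k - 1) * p for k in range(1, 8)]
# Ratio of each entry to the one before it.
ratios = [pmf[i + 1] / pmf[i] for i in range(len(pmf) - 1)]
print(pmf[0])   # the value at k = 1 is p itself
print(ratios)   # every ratio equals 1 - p = 2/3
```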
00:04:34.110 --> 00:04:38.420
Finally, one little
technical remark.
00:04:38.420 --> 00:04:42.080
There's a possible and rather
annoying outcome of this
00:04:42.080 --> 00:04:46.640
experiment, which would be that
we observe a sequence of
00:04:46.640 --> 00:04:50.210
tails forever and no heads.
00:04:50.210 --> 00:04:53.450
In that case, our random
variable is not well-defined,
00:04:53.450 --> 00:04:56.480
because there is no first
heads to consider.
00:04:56.480 --> 00:05:00.190
You might say that in this case
our random variable takes
00:05:00.190 --> 00:05:04.150
a value of infinity, but we
would rather not have to deal
00:05:04.150 --> 00:05:07.410
with random variables that
could be infinite.
00:05:07.410 --> 00:05:11.760
Fortunately, it turns out that
this particular event has 0
00:05:11.760 --> 00:05:16.890
probability of occurring, which
I will now try to show.
00:05:16.890 --> 00:05:20.980
So this is the event that
we always see tails.
00:05:20.980 --> 00:05:25.630
Let us compare it with the event
where we see tails in
00:05:25.630 --> 00:05:27.450
the first k trials.
00:05:30.860 --> 00:05:35.344
How do these two
events relate?
00:05:35.344 --> 00:05:40.990
If we have always tails, then
we will have tails in the
00:05:40.990 --> 00:05:42.720
first k trials.
00:05:42.720 --> 00:05:46.730
So this event implies
that event.
00:05:46.730 --> 00:05:50.540
This event is smaller
than that event.
00:05:50.540 --> 00:05:54.140
So the probability of this event
is less than or equal to
00:05:54.140 --> 00:05:57.620
the probability of that
second event.
00:05:57.620 --> 00:06:00.240
And the probability of that
second event is 1
00:06:00.240 --> 00:06:01.750
minus p to the k.
00:06:04.800 --> 00:06:09.700
Now, this is true no matter
what k we choose.
00:06:09.700 --> 00:06:14.720
And by taking k arbitrarily
large, this number here
00:06:14.720 --> 00:06:16.530
becomes arbitrarily small.
00:06:19.310 --> 00:06:21.790
Why does it become arbitrarily
small?
00:06:21.790 --> 00:06:25.880
Well, we're assuming that p is
positive, so 1 minus p is a
00:06:25.880 --> 00:06:27.590
number less than 1.
00:06:27.590 --> 00:06:31.040
And when we multiply a number
strictly less than 1 by itself
00:06:31.040 --> 00:06:34.659
over and over, we get
arbitrarily small numbers.
00:06:34.659 --> 00:06:39.409
So the probability of never
seeing a head is less than or
00:06:39.409 --> 00:06:43.130
equal to an arbitrarily
small positive number.
00:06:43.130 --> 00:06:49.040
So the only possibility for this
is that it is equal to 0.
00:06:49.040 --> 00:06:53.010
So the probability of not ever
seeing any heads is equal to
00:06:53.010 --> 00:06:56.350
0, and this means that
we can ignore
00:06:56.350 --> 00:07:00.840
this particular outcome.
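The bound used in this argument, that the probability of no heads in the first k tosses is (1 - p)^k, can be tabulated for a few values of k (a quick numeric check of my own, again with p = 1/3):

```python
p = 1 / 3
# P(tails in each of the first k tosses) = (1 - p)**k.
bounds = {k: (1 - p) ** k for k in (10, 50, 100)}
print(bounds)  # the bound shrinks toward 0 as k grows
```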
00:07:00.840 --> 00:07:05.340
And as a side consequence
of this, the sum of the
00:07:05.340 --> 00:07:09.650
probabilities of the different
possible values of k is going
00:07:09.650 --> 00:07:13.830
to be equal to 1, because we're
certain that the random
00:07:13.830 --> 00:07:17.260
variable is going to take
a finite value.
00:07:17.260 --> 00:07:19.530
And so when we sum probabilities
of all the
00:07:19.530 --> 00:07:22.460
possible finite values,
that sum will have
00:07:22.460 --> 00:07:23.800
to be equal to 1.
00:07:23.800 --> 00:07:26.910
And indeed, you can use the
formula for the geometric
00:07:26.910 --> 00:07:31.300
series to verify that the
sum of these numbers here,
00:07:31.300 --> 00:07:35.080
when you add over all values
of k, is equal to 1.
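The geometric-series check mentioned here can be sketched numerically (my own snippet; a long partial sum stands in for the infinite sum):

```python
p = 1 / 3
# Partial sum of (1 - p)**(k - 1) * p for k = 1, ..., 500.
# The closed form is 1 - (1 - p)**500, which is indistinguishable
# from 1 in floating point.
partial = sum((1 - p) ** (k - 1) * p for k in range(1, 501))
print(partial)  # essentially 1
```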