1
00:00:00,650 --> 00:00:03,990
In this segment, we introduce
the concept of continuous
2
00:00:03,990 --> 00:00:07,230
random variables and their
characterization in terms of
3
00:00:07,230 --> 00:00:11,100
probability density functions,
or PDFs for short.
4
00:00:11,100 --> 00:00:13,810
Let us first go back to discrete
random variables.
5
00:00:13,810 --> 00:00:15,800
A discrete random variable
takes values
6
00:00:15,800 --> 00:00:17,130
in a discrete set.
7
00:00:17,130 --> 00:00:20,430
There is a total of one unit of
probability assigned to the
8
00:00:20,430 --> 00:00:21,780
possible values.
9
00:00:21,780 --> 00:00:26,080
And the PMF tells us exactly how
much of this probability
10
00:00:26,080 --> 00:00:28,100
is assigned to each value.
11
00:00:28,100 --> 00:00:33,590
So we can think of the bars in
the PMF as point masses with
12
00:00:33,590 --> 00:00:37,000
positive weight that
sit on top of each
13
00:00:37,000 --> 00:00:38,870
possible numerical value.
14
00:00:38,870 --> 00:00:41,440
And we can calculate the
probability that the random
15
00:00:41,440 --> 00:00:45,720
variable falls inside an
interval by adding all the
16
00:00:45,720 --> 00:00:49,270
masses that sit on top
of that interval.
17
00:00:49,270 --> 00:00:52,890
So for example, if we're looking
at the interval from a
18
00:00:52,890 --> 00:00:57,730
to b, the probability of this
interval is equal to the sum
19
00:00:57,730 --> 00:01:01,530
of the probabilities of these
three masses that fall inside
20
00:01:01,530 --> 00:01:02,870
this interval.
21
00:01:02,870 --> 00:01:05,690
On the other hand, a continuous
random variable
22
00:01:05,690 --> 00:01:08,860
will be taking values over
a continuous range--
23
00:01:08,860 --> 00:01:12,800
for example, the real line or an
interval on the real line.
24
00:01:12,800 --> 00:01:18,000
In this case, we still have one
total unit of probability
25
00:01:18,000 --> 00:01:21,350
mass that is assigned to the
possible values of the random
26
00:01:21,350 --> 00:01:25,390
variable, except that this unit
of mass is spread all
27
00:01:25,390 --> 00:01:26,860
over the real line.
28
00:01:26,860 --> 00:01:29,990
But it is not spread in
a uniform manner.
29
00:01:29,990 --> 00:01:32,210
Some parts of the real
line have more
30
00:01:32,210 --> 00:01:33,880
mass per unit length.
31
00:01:33,880 --> 00:01:35,670
Some have less.
32
00:01:35,670 --> 00:01:39,900
How much mass exactly is sitting
on top of each part of
33
00:01:39,900 --> 00:01:43,090
the real line is described by
the probability density
34
00:01:43,090 --> 00:01:47,380
function, this function plotted
here, which we denote
35
00:01:47,380 --> 00:01:48,840
with this notation.
36
00:01:48,840 --> 00:01:52,090
The letter f will always
indicate that we are dealing
37
00:01:52,090 --> 00:01:53,440
with a PDF.
38
00:01:53,440 --> 00:01:57,310
And the subscript will indicate
which random variable
39
00:01:57,310 --> 00:02:00,190
we're talking about.
40
00:02:00,190 --> 00:02:03,210
We use the probability density
function to calculate the
41
00:02:03,210 --> 00:02:07,020
probability that X lies in
a certain interval--
42
00:02:07,020 --> 00:02:10,560
let's say the interval
from a to b.
43
00:02:10,560 --> 00:02:16,079
And we calculate it by finding
the area under the PDF that
44
00:02:16,079 --> 00:02:19,900
sits on top of that interval.
45
00:02:19,900 --> 00:02:23,300
So this area here, the shaded
area, is the probability that
46
00:02:23,300 --> 00:02:26,100
X stakes values in
this interval.
47
00:02:26,100 --> 00:02:28,990
Think of probability
as snow fall.
48
00:02:28,990 --> 00:02:31,770
There is one pound of snow
that has fallen on
49
00:02:31,770 --> 00:02:34,500
top of the real line.
50
00:02:34,500 --> 00:02:39,160
The PDF tells us the height of
the snow accumulated over a
51
00:02:39,160 --> 00:02:41,329
particular point.
52
00:02:41,329 --> 00:02:47,190
We then find the weight of the
overall amount of snow sitting
53
00:02:47,190 --> 00:02:51,079
on top of an interval by
calculating the area under
54
00:02:51,079 --> 00:02:52,400
this curve.
55
00:02:52,400 --> 00:02:56,750
Of course, mathematically, area
under the curve is just
56
00:02:56,750 --> 00:02:57,720
an integral.
57
00:02:57,720 --> 00:03:01,320
So the probability that X takes
values in this interval
58
00:03:01,320 --> 00:03:07,790
is the integral of the PDF over
this particular interval.
59
00:03:07,790 --> 00:03:10,320
What properties should
the PDF have?
60
00:03:10,320 --> 00:03:14,780
By analogy with the discrete
case, a PDF must be
61
00:03:14,780 --> 00:03:18,060
non-negative, because we do
not want to get negative
62
00:03:18,060 --> 00:03:20,290
probabilities.
63
00:03:20,290 --> 00:03:24,870
In the discrete case, the sum
of the PMF entries has to be
64
00:03:24,870 --> 00:03:26,680
equal to 1.
65
00:03:26,680 --> 00:03:31,240
In the continuous case, X is
certain to lie in the interval
66
00:03:31,240 --> 00:03:34,200
between minus infinity
and plus infinity.
67
00:03:34,200 --> 00:03:39,260
So letting a be minus infinity
and b plus infinity, we should
68
00:03:39,260 --> 00:03:41,680
get a probability of 1.
69
00:03:41,680 --> 00:03:46,180
So the total area under the PDF,
when we integrate over
70
00:03:46,180 --> 00:03:49,780
the entire real line, should
be equal to 1.
71
00:03:49,780 --> 00:03:55,140
These two conditions are all
that we need in order to have
72
00:03:55,140 --> 00:03:58,300
a legitimate PDF.
73
00:03:58,300 --> 00:04:01,280
We can now give a formal
definition of what a
74
00:04:01,280 --> 00:04:03,770
continuous random variable is.
75
00:04:03,770 --> 00:04:08,270
A continuous random variable
is a random variable whose
76
00:04:08,270 --> 00:04:14,380
probabilities can be described
by a PDF according to a
77
00:04:14,380 --> 00:04:17,209
formula of this type.
78
00:04:17,209 --> 00:04:19,899
An important point--
79
00:04:19,899 --> 00:04:22,550
the fact that a random variable
takes values on a
80
00:04:22,550 --> 00:04:26,980
continuous set is not enough
to make it what we call a
81
00:04:26,980 --> 00:04:29,180
continuous random variable.
82
00:04:29,180 --> 00:04:30,910
For a continuous random
variable, we're
83
00:04:30,910 --> 00:04:32,730
asking for a bit more--
84
00:04:32,730 --> 00:04:36,630
that it can be described by a
PDF, that a formula of this
85
00:04:36,630 --> 00:04:37,880
type is valid.
86
00:04:40,760 --> 00:04:44,240
Now, once we have the
probabilities of intervals as
87
00:04:44,240 --> 00:04:48,380
given by a PDF, we can use of
additivity to calculate the
88
00:04:48,380 --> 00:04:51,340
probabilities of more
complicated sets.
89
00:04:51,340 --> 00:04:54,520
For example, if you're
interested in the probability
90
00:04:54,520 --> 00:05:06,050
that X lies between 1 and 3 or
that X lies between 4 and 5--
91
00:05:06,050 --> 00:05:11,250
so this is the probability that
X falls in a region that
92
00:05:11,250 --> 00:05:14,940
consists of two disjoint
intervals.
93
00:05:14,940 --> 00:05:19,790
We find the probability of the
union of these two intervals,
94
00:05:19,790 --> 00:05:24,020
by additivity, by adding the
probabilities of the two
95
00:05:24,020 --> 00:05:29,480
intervals, since these intervals
are disjoint.
96
00:05:29,480 --> 00:05:35,840
And then we can use the PDF to
calculate the probabilities of
97
00:05:35,840 --> 00:05:39,380
each one of these intervals
according to this formula.
98
00:05:42,570 --> 00:05:45,780
At this point, you may be
wondering what happened to the
99
00:05:45,780 --> 00:05:49,120
sample space in all
this discussion.
100
00:05:49,120 --> 00:05:53,070
Well, there is still an
underlying sample space
101
00:05:53,070 --> 00:05:54,320
lurking in the background.
102
00:05:57,800 --> 00:06:01,860
And different outcomes in the
sample space result in
103
00:06:01,860 --> 00:06:05,670
different numerical
values for the
104
00:06:05,670 --> 00:06:07,135
random variable of interest.
105
00:06:09,790 --> 00:06:13,650
And when we talk about the
probability that X takes
106
00:06:13,650 --> 00:06:18,050
values between some numbers a
and b, what we really mean is
107
00:06:18,050 --> 00:06:23,780
the probability of those
outcomes for which the
108
00:06:23,780 --> 00:06:27,430
resulting value of
X lies inside
109
00:06:27,430 --> 00:06:29,800
this particular interval.
110
00:06:29,800 --> 00:06:31,720
So that's what probability
means.
111
00:06:31,720 --> 00:06:35,659
On the other hand, once we have
a PDF in our hands, we
112
00:06:35,659 --> 00:06:39,290
can completely forget about the
underlying sample space.
113
00:06:39,290 --> 00:06:42,890
And we can carry out any
calculations we may be
114
00:06:42,890 --> 00:06:46,680
interested in by just working
with the PDF.
115
00:06:46,680 --> 00:06:50,600
So as we move on in this course,
the sample space will
116
00:06:50,600 --> 00:06:52,400
be moved offstage.
117
00:06:52,400 --> 00:06:54,470
There will be less and
less mention of it.
118
00:06:54,470 --> 00:06:58,159
And we will be working just
directly with PDFs or with
119
00:06:58,159 --> 00:07:03,270
PMFs if we are dealing with
discrete random variables.
120
00:07:03,270 --> 00:07:06,470
Let us now build a little bit
on our understanding of what
121
00:07:06,470 --> 00:07:08,580
PDFs really are by looking at
122
00:07:08,580 --> 00:07:11,110
probabilities of small intervals.
123
00:07:11,110 --> 00:07:16,660
Let us look at an interval that
starts at some a and goes
124
00:07:16,660 --> 00:07:20,340
up to some number
a plus delta.
125
00:07:20,340 --> 00:07:23,410
So here, delta is a
positive number.
126
00:07:23,410 --> 00:07:25,270
But we're interested
in the case where
127
00:07:25,270 --> 00:07:28,150
delta is very small.
128
00:07:28,150 --> 00:07:33,750
Let us look at the probability
that X falls in this interval.
129
00:07:33,750 --> 00:07:40,640
The probability that X lies
inside this interval is the
130
00:07:40,640 --> 00:07:42,390
area of this region.
131
00:07:45,650 --> 00:07:50,630
On the other hand, as long as
f does not change too much
132
00:07:50,630 --> 00:07:53,590
over this little interval, which
will be the case if we
133
00:07:53,590 --> 00:07:59,520
have a continuous density f,
then we can approximate the
134
00:07:59,520 --> 00:08:04,670
area we have of this region by
the area of a rectangle where
135
00:08:04,670 --> 00:08:08,130
we keep the height constant.
136
00:08:08,130 --> 00:08:14,800
The area of this rectangle is
equal to the height, which is
137
00:08:14,800 --> 00:08:21,360
the value of the PDF at the
point a, times the base of the
138
00:08:21,360 --> 00:08:24,360
rectangle, which is
equal to delta.
139
00:08:24,360 --> 00:08:27,560
So this gives us an
interpretation of PDFs in
140
00:08:27,560 --> 00:08:30,940
terms of probabilities
of small intervals.
141
00:08:30,940 --> 00:08:34,370
If we take this factor of delta
and send it to the other
142
00:08:34,370 --> 00:08:37,960
side in this approximate
equality, we see that the
143
00:08:37,960 --> 00:08:43,308
value of the PDF can be
interpreted as probability per
144
00:08:43,308 --> 00:08:44,760
unit length.
145
00:08:44,760 --> 00:08:47,350
So PDFs are not probabilities.
146
00:08:47,350 --> 00:08:48,750
They are densities.
147
00:08:48,750 --> 00:08:52,770
Their units are probability
per unit length.
148
00:08:52,770 --> 00:08:57,710
Now, if the probability per unit
length is finite and the
149
00:08:57,710 --> 00:09:04,410
length delta is sent to 0, we
will get 0 probability.
150
00:09:04,410 --> 00:09:09,690
More formally, if we look at
this integral and we let b to
151
00:09:09,690 --> 00:09:16,200
be the same as a, then we obtain
the probability that X
152
00:09:16,200 --> 00:09:19,150
is equal to a.
153
00:09:19,150 --> 00:09:21,620
And on that side, we
get an integral
154
00:09:21,620 --> 00:09:24,176
over a 0 length interval.
155
00:09:24,176 --> 00:09:26,800
And that integral is
going to be 0.
156
00:09:26,800 --> 00:09:30,960
So we obtain that the
probability that X takes a
157
00:09:30,960 --> 00:09:35,900
value equal to a specific,
particular point--
158
00:09:35,900 --> 00:09:39,220
that probability is going
to be equal to 0.
159
00:09:39,220 --> 00:09:42,520
So for a continuous random
variable, any particular point
160
00:09:42,520 --> 00:09:44,710
has 0 probability.
161
00:09:44,710 --> 00:09:50,280
Yet somehow, collectively, the
infinitely many points in an
162
00:09:50,280 --> 00:09:55,070
interval together will have
positive probablility.
163
00:09:55,070 --> 00:09:56,650
Is this a puzzle?
164
00:09:56,650 --> 00:09:57,260
Not really.
165
00:09:57,260 --> 00:09:59,890
That's exactly what happens,
also, with the
166
00:09:59,890 --> 00:10:02,050
ordinary notion of length.
167
00:10:02,050 --> 00:10:06,590
Single points have 0 length, yet
by putting together lots
168
00:10:06,590 --> 00:10:14,030
of points, we can create a set
that has a positive length.
169
00:10:14,030 --> 00:10:17,680
And a final consequence of the
fact that individual points
170
00:10:17,680 --> 00:10:20,650
have 0 length.
171
00:10:20,650 --> 00:10:24,380
Using the additivity axiom,
the probability that our
172
00:10:24,380 --> 00:10:29,680
random variable takes values
inside an interval is equal to
173
00:10:29,680 --> 00:10:33,150
the probability that our random
variable takes a value
174
00:10:33,150 --> 00:10:37,780
of a plus the probability that
our random variable takes a
175
00:10:37,780 --> 00:10:42,470
value of b plus the probability
that our random
176
00:10:42,470 --> 00:10:47,390
variable is strictly
between a and b.
177
00:10:47,390 --> 00:10:54,030
According to our discussion,
this term is equal to 0.
178
00:10:54,030 --> 00:10:56,230
And this term is equal to 0.
179
00:10:56,230 --> 00:10:59,070
And so we conclude that the
probability of a closed
180
00:10:59,070 --> 00:11:00,650
interval is the same as the
181
00:11:00,650 --> 00:11:03,100
probability of an open interval.
182
00:11:03,100 --> 00:11:05,340
When calculating probabilities,
it does not
183
00:11:05,340 --> 00:11:08,300
matter whether we include
the endpoints or not.