WEBVTT
00:00:00.040 --> 00:00:02.460
The following content is
provided under a Creative
00:00:02.460 --> 00:00:03.870
Commons license.
00:00:03.870 --> 00:00:06.910
Your support will help MIT
OpenCourseWare continue to
00:00:06.910 --> 00:00:10.560
offer high-quality educational
resources for free.
00:00:10.560 --> 00:00:13.460
To make a donation or view
additional materials from
00:00:13.460 --> 00:00:19.290
hundreds of MIT courses, visit
MIT OpenCourseWare at
00:00:19.290 --> 00:00:20.540
ocw.mit.edu.
00:00:22.640 --> 00:00:22.990
JOHN TSITSIKLIS: OK.
00:00:22.990 --> 00:00:24.020
We can start.
00:00:24.020 --> 00:00:26.540
Good morning.
00:00:26.540 --> 00:00:29.600
So we're going to start
now a new unit.
00:00:29.600 --> 00:00:32.200
For the next couple of lectures,
we will be talking
00:00:32.200 --> 00:00:34.560
about continuous random
variables.
00:00:34.560 --> 00:00:36.520
So this is new material
which is not going
00:00:36.520 --> 00:00:37.400
to be in the quiz.
00:00:37.400 --> 00:00:41.170
You are going to have a long
break next week without any
00:00:41.170 --> 00:00:45.230
lecture, just a quiz and
recitation and tutorial.
00:00:45.230 --> 00:00:48.500
So what's going to happen
in this new unit?
00:00:48.500 --> 00:00:52.760
Basically, we want to do
everything that we did for
00:00:52.760 --> 00:00:56.610
discrete random variables,
reintroduce the same sort of
00:00:56.610 --> 00:00:59.510
concepts but see how they apply
and how they need to be
00:00:59.510 --> 00:01:02.840
modified in order to talk about
random variables that
00:01:02.840 --> 00:01:04.700
take continuous values.
00:01:04.700 --> 00:01:06.610
At some level, it's
all the same.
00:01:06.610 --> 00:01:10.340
At some level, it's quite a bit
harder because when things
00:01:10.340 --> 00:01:12.490
are continuous, calculus
comes in.
00:01:12.490 --> 00:01:14.770
So the calculations that you
have to do on the side
00:01:14.770 --> 00:01:17.760
sometimes need a little
bit more thinking.
00:01:17.760 --> 00:01:20.300
In terms of new concepts,
there's not going to be a
00:01:20.300 --> 00:01:24.200
whole lot today, some analogs
of things we have done.
00:01:24.200 --> 00:01:27.110
We're going to introduce the
concept of cumulative
00:01:27.110 --> 00:01:29.950
distribution functions, which
allows us to deal with
00:01:29.950 --> 00:01:32.750
discrete and continuous
random variables, all
00:01:32.750 --> 00:01:34.560
of them in one shot.
00:01:34.560 --> 00:01:37.890
And finally, introduce a famous
kind of continuous
00:01:37.890 --> 00:01:41.900
random variable, the normal
random variable.
00:01:41.900 --> 00:01:43.970
OK, so what's the story?
00:01:43.970 --> 00:01:46.970
Continuous random variables are
random variables that take
00:01:46.970 --> 00:01:50.350
values over the continuum.
00:01:50.350 --> 00:01:53.470
So the numerical value of the
random variable can be any
00:01:53.470 --> 00:01:55.240
real number.
00:01:55.240 --> 00:01:58.600
They don't take values just
in a discrete set.
00:01:58.600 --> 00:02:00.660
So we have our sample space.
00:02:00.660 --> 00:02:02.020
The experiment happens.
00:02:02.020 --> 00:02:05.730
We get some omega, a sample
point in the sample space.
00:02:05.730 --> 00:02:10.070
And once that point is
determined, it determines the
00:02:10.070 --> 00:02:12.500
numerical value of the
random variable.
00:02:12.500 --> 00:02:15.370
Remember, random variables are
functions on the sample space.
00:02:15.370 --> 00:02:17.020
You pick a sample point.
00:02:17.020 --> 00:02:19.690
This determines the numerical
value of the random variable.
00:02:19.690 --> 00:02:23.500
So that numerical value is going
to be some real number
00:02:23.500 --> 00:02:26.010
on that line.
00:02:26.010 --> 00:02:28.550
Now we want to say something
about the distribution of the
00:02:28.550 --> 00:02:29.290
random variable.
00:02:29.290 --> 00:02:31.970
We want to say which values are
more likely than others to
00:02:31.970 --> 00:02:34.060
occur in a certain sense.
00:02:34.060 --> 00:02:36.910
For example, you may be
interested in a particular
00:02:36.910 --> 00:02:40.090
event, the event that the random
variable takes values
00:02:40.090 --> 00:02:42.360
in the interval from a to b.
00:02:42.360 --> 00:02:43.950
And we want to say something
about the
00:02:43.950 --> 00:02:45.820
probability of that event.
00:02:45.820 --> 00:02:48.510
In principle, how
is this done?
00:02:48.510 --> 00:02:52.010
You go back to the sample space,
and you find all those
00:02:52.010 --> 00:02:56.790
outcomes for which the value of
the random variable happens
00:02:56.790 --> 00:02:58.500
to be in that interval.
00:02:58.500 --> 00:03:01.870
The probability that the random
variable falls here is
00:03:01.870 --> 00:03:06.070
the same as the probability of
all outcomes that make the
00:03:06.070 --> 00:03:08.290
random variable
fall in there.
00:03:08.290 --> 00:03:11.190
So in principle, you can work on
the original sample space,
00:03:11.190 --> 00:03:14.750
find the probability of this
event, and you would be done.
00:03:14.750 --> 00:03:18.810
But similar to what happened in
chapter 2, we want to kind
00:03:18.810 --> 00:03:22.910
of push the sample space in the
background and just work
00:03:22.910 --> 00:03:26.890
directly on the real
axis and talk about
00:03:26.890 --> 00:03:28.640
probabilities up here.
00:03:28.640 --> 00:03:32.430
So we want now a way to specify
probabilities, how
00:03:32.430 --> 00:03:38.340
they are bunched together, or
arranged, along the real line.
00:03:38.340 --> 00:03:40.980
So what did we do for discrete
random variables?
00:03:40.980 --> 00:03:44.100
We introduced PMFs, probability
mass functions.
00:03:44.100 --> 00:03:47.100
And the way that we described
the random variable was by
00:03:47.100 --> 00:03:50.300
saying this point has so much
mass on top of it, that point
00:03:50.300 --> 00:03:52.790
has so much mass on top
of it, and so on.
00:03:52.790 --> 00:03:57.610
And so we assigned a total
amount of 1 unit of
00:03:57.610 --> 00:03:58.670
probability.
00:03:58.670 --> 00:04:01.810
We assigned it to different
masses, which we put at
00:04:01.810 --> 00:04:04.870
different points on
the real axis.
00:04:04.870 --> 00:04:08.070
So that's what you do if
somebody gives you a pound of
00:04:08.070 --> 00:04:11.910
discrete stuff, a pound of
mass in little chunks.
00:04:11.910 --> 00:04:15.300
And you place those chunks
at a few points.
00:04:15.300 --> 00:04:20.890
Now, in the continuous case,
this total unit of probability
00:04:20.890 --> 00:04:25.440
mass does not sit just on
discrete points but is spread
00:04:25.440 --> 00:04:28.140
all over the real axis.
00:04:28.140 --> 00:04:31.280
So now we're going to have a
unit of mass that spreads on
00:04:31.280 --> 00:04:32.510
top of the real axis.
00:04:32.510 --> 00:04:36.020
How do we describe masses that
are continuously spread?
00:04:36.020 --> 00:04:39.680
The way we describe them is
by specifying densities.
00:04:39.680 --> 00:04:43.800
That is, how thick is the mass
that's sitting here?
00:04:43.800 --> 00:04:46.210
How dense is the mass that's
sitting there?
00:04:46.210 --> 00:04:48.260
So that's exactly what
we're going to do.
00:04:48.260 --> 00:04:50.930
We're going to introduce the
concept of a probability
00:04:50.930 --> 00:04:55.340
density function that tells us
how probabilities accumulate
00:04:55.340 --> 00:04:59.270
at different parts
of the real axis.
00:05:03.780 --> 00:05:07.870
So here's an example or a
picture of a possible
00:05:07.870 --> 00:05:10.210
probability density function.
00:05:10.210 --> 00:05:13.210
What does that density function
kind of convey
00:05:13.210 --> 00:05:14.290
intuitively?
00:05:14.290 --> 00:05:17.510
Well, that these x's
are relatively
00:05:17.510 --> 00:05:19.160
less likely to occur.
00:05:19.160 --> 00:05:22.120
Those x's are somewhat more
likely to occur because the
00:05:22.120 --> 00:05:24.930
density is higher.
00:05:24.930 --> 00:05:27.950
Now, for a more formal
definition, we're going to say
00:05:27.950 --> 00:05:35.620
that a random variable X is said
to be continuous if it
00:05:35.620 --> 00:05:38.560
can be described by a
density function in
00:05:38.560 --> 00:05:40.780
the following sense.
00:05:40.780 --> 00:05:42.910
We have a density function.
00:05:42.910 --> 00:05:47.830
And we calculate probabilities
of falling inside an interval
00:05:47.830 --> 00:05:52.580
by finding the area under
the curve that sits
00:05:52.580 --> 00:05:54.940
on top of that interval.
00:05:54.940 --> 00:05:57.800
So that's sort of the defining
relation for
00:05:57.800 --> 00:05:59.190
continuous random variables.
00:05:59.190 --> 00:06:00.860
It's an implicit definition.
00:06:00.860 --> 00:06:03.870
And it tells us a random
variable is continuous if we
00:06:03.870 --> 00:06:06.560
can calculate probabilities
this way.
00:06:06.560 --> 00:06:09.520
So the probability of falling
in this interval is the area
00:06:09.520 --> 00:06:10.500
under this curve.
00:06:10.500 --> 00:06:14.950
Mathematically, it's the
integral of the density over
00:06:14.950 --> 00:06:17.020
this particular interval.
00:06:17.020 --> 00:06:20.410
If the density happens to be
constant over that interval,
00:06:20.410 --> 00:06:23.610
the area under the curve would
be the length of the interval
00:06:23.610 --> 00:06:26.440
times the height of
the density, which
00:06:26.440 --> 00:06:28.170
sort of makes sense.
00:06:28.170 --> 00:06:32.020
Now, because the density is not
constant but it kind of
00:06:32.020 --> 00:06:35.720
moves around, what you need is
to write down an integral.
00:06:35.720 --> 00:06:39.100
Now, this formula is very much
analogous to what you would do
00:06:39.100 --> 00:06:41.030
for discrete random variables.
00:06:41.030 --> 00:06:44.140
For a discrete random variable,
how do you calculate
00:06:44.140 --> 00:06:45.610
this probability?
00:06:45.610 --> 00:06:48.800
You look at all x's
in this interval.
00:06:48.800 --> 00:06:54.060
And you add the probability mass
function over that range.
00:06:54.060 --> 00:06:59.660
So just for comparison, this
would be the formula for the
00:06:59.660 --> 00:07:01.590
discrete case--
00:07:01.590 --> 00:07:05.620
the sum over all x's in the
interval from a to b of the
00:07:05.620 --> 00:07:09.420
probability mass function.
00:07:09.420 --> 00:07:12.650
And there is a syntactic analogy
that's happening here
00:07:12.650 --> 00:07:16.160
and which will be a persistent
theme when we deal with
00:07:16.160 --> 00:07:18.920
continuous random variables.
00:07:18.920 --> 00:07:22.620
Sums get replaced
by integrals.
00:07:22.620 --> 00:07:24.110
In the discrete case, you add.
00:07:24.110 --> 00:07:26.920
In the continuous case,
you integrate.
00:07:26.920 --> 00:07:31.600
Mass functions get replaced
by density functions.
00:07:31.600 --> 00:07:35.500
So you can take pretty much any
formula from the discrete
00:07:35.500 --> 00:07:40.020
case and translate it to a
continuous analog of that
00:07:40.020 --> 00:07:41.480
formula, as we're
going to see.
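NOTE The sum-versus-integral analogy can be sketched in a few lines of Python. This is only an illustration: the fair-die PMF, the density f(x) = 2x on [0, 1], and both function names are assumptions chosen for the example, not anything shown in the lecture.

```python
def pmf_interval_probability(pmf, a, b):
    """Discrete case: add the PMF over all x with a <= x <= b."""
    return sum(p for x, p in pmf.items() if a <= x <= b)


def pdf_interval_probability(pdf, a, b, steps=10_000):
    """Continuous case: integrate the PDF from a to b (midpoint Riemann sum)."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width


# Hypothetical examples, not taken from the lecture.
die_pmf = {k: 1 / 6 for k in range(1, 7)}  # fair six-sided die


def triangle_pdf(x):
    """Assumed density f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


p_discrete = pmf_interval_probability(die_pmf, 2, 4)                # sum over faces 2, 3, 4
p_continuous = pdf_interval_probability(triangle_pdf, 0.25, 0.75)   # area under the curve
```

The two functions have the same shape: one adds masses over the points in the interval, the other adds up thin slices of area.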
00:07:43.990 --> 00:07:45.240
OK.
00:07:47.250 --> 00:07:50.040
So let's take this
now as our model.
00:07:50.040 --> 00:07:53.220
What is the probability that
the random variable takes a
00:07:53.220 --> 00:07:58.440
specific value if we have a
continuous random variable?
00:07:58.440 --> 00:08:00.200
Well, this would be the case.
00:08:00.200 --> 00:08:02.880
It's a case of a trivial
interval, where the two end
00:08:02.880 --> 00:08:04.660
points coincide.
00:08:04.660 --> 00:08:07.670
So it would be the integral
from a to itself.
00:08:07.670 --> 00:08:10.520
So you're integrating just
over a single point.
00:08:10.520 --> 00:08:12.790
Now, when you integrate over
a single point, the
00:08:12.790 --> 00:08:14.600
integral is just 0.
00:08:14.600 --> 00:08:17.980
The area under the curve, if
you're only looking at a
00:08:17.980 --> 00:08:19.560
single point, it's 0.
00:08:19.560 --> 00:08:22.670
So a big property of continuous
random variables is that any
00:08:22.670 --> 00:08:26.940
individual point has
0 probability.
00:08:26.940 --> 00:08:30.740
In particular, when you look at
the value of the density,
00:08:30.740 --> 00:08:35.299
the density does not tell you
the probability of that point.
00:08:35.299 --> 00:08:37.860
The point itself has
0 probability.
00:08:37.860 --> 00:08:42.409
So the density tells you
something a little different.
00:08:42.409 --> 00:08:44.645
We are going to see shortly
what that is.
00:08:47.390 --> 00:08:52.070
Before we get there,
can the density be
00:08:52.070 --> 00:08:54.410
an arbitrary function?
00:08:54.410 --> 00:08:56.160
Almost, but not quite.
00:08:56.160 --> 00:08:57.650
There are two things
that we want.
00:08:57.650 --> 00:09:00.310
First, since densities
are used to calculate
00:09:00.310 --> 00:09:02.690
probabilities, and since
probabilities must be
00:09:02.690 --> 00:09:06.840
non-negative, the density should
also be non-negative.
00:09:06.840 --> 00:09:10.960
Otherwise you would be getting
negative probabilities, which
00:09:10.960 --> 00:09:13.360
is not a good thing.
00:09:13.360 --> 00:09:16.930
So that's a basic property
that any density function
00:09:16.930 --> 00:09:18.640
should obey.
00:09:18.640 --> 00:09:21.970
The second property that we
need is that the overall
00:09:21.970 --> 00:09:25.210
probability of the entire real
line should be equal to 1.
00:09:25.210 --> 00:09:27.980
So if you ask me, what is the
probability that X falls
00:09:27.980 --> 00:09:30.760
between minus infinity and plus
infinity, well, we are
00:09:30.760 --> 00:09:33.590
sure that X is going to
fall in that range.
00:09:33.590 --> 00:09:37.400
So the probability of that
event should be 1.
00:09:37.400 --> 00:09:40.480
So the probability of being
between minus infinity and
00:09:40.480 --> 00:09:43.600
plus infinity should be 1, which
means that the integral
00:09:43.600 --> 00:09:46.410
from minus infinity to plus
infinity should be 1.
00:09:46.410 --> 00:09:50.460
So that just tells us that
there's 1 unit of total
00:09:50.460 --> 00:09:54.690
probability that's being
spread over our space.
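NOTE Written out, the two requirements on a density look as follows. The symbol f_X for the density of X is an assumed, conventional notation; the spoken lecture does not name it.

```latex
\[
  f_X(x) \ge 0 \quad \text{for all } x,
  \qquad
  \int_{-\infty}^{\infty} f_X(x)\,dx = 1 .
\]
```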
00:09:54.690 --> 00:09:59.000
Now, what's the best way to
think intuitively about what
00:09:59.000 --> 00:10:01.480
the density function does?
00:10:01.480 --> 00:10:06.470
The interpretation that I find
most natural for
00:10:06.470 --> 00:10:10.300
conveying the meaning of a
density is to look at
00:10:10.300 --> 00:10:13.220
probabilities of small
intervals.
00:10:13.220 --> 00:10:18.850
So let us take an x somewhere
here and then x plus delta
00:10:18.850 --> 00:10:20.230
just next to it.
00:10:20.230 --> 00:10:23.050
So delta is a small number.
00:10:23.050 --> 00:10:26.460
And let's look at the
probability of the event that
00:10:26.460 --> 00:10:29.750
we get a value in that range.
00:10:29.750 --> 00:10:32.220
For continuous random variables,
the way we find the
00:10:32.220 --> 00:10:35.270
probability of falling in that
range is by integrating the
00:10:35.270 --> 00:10:37.550
density over that range.
00:10:37.550 --> 00:10:41.610
So we're drawing this picture.
00:10:41.610 --> 00:10:46.060
And we want to take the
area under this curve.
00:10:46.060 --> 00:10:50.760
Now, what happens if delta
is a fairly small number?
00:10:50.760 --> 00:10:55.030
If delta is pretty small, our
density is not going to change
00:10:55.030 --> 00:10:57.040
much over that range.
00:10:57.040 --> 00:10:59.330
So you can pretend that
the density is
00:10:59.330 --> 00:11:01.230
approximately constant.
00:11:01.230 --> 00:11:04.550
And so to find the area under
the curve, you just take the
00:11:04.550 --> 00:11:07.760
base times the height.
00:11:07.760 --> 00:11:10.630
And it doesn't matter where
exactly you take the height in
00:11:10.630 --> 00:11:13.140
that interval, because the
density doesn't change very
00:11:13.140 --> 00:11:15.370
much over that interval.
00:11:15.370 --> 00:11:19.760
And so the integral becomes just
base times the height.
00:11:19.760 --> 00:11:24.020
So for small intervals, the
probability of a small
00:11:24.020 --> 00:11:30.170
interval is approximately
the density times delta.
00:11:30.170 --> 00:11:32.340
So densities essentially
give us
00:11:32.340 --> 00:11:34.670
probabilities of small intervals.
00:11:34.670 --> 00:11:38.100
And if you want to think about
it a little differently, you
00:11:38.100 --> 00:11:41.020
can take that delta from
here and send it to
00:11:41.020 --> 00:11:43.960
the denominator there.
00:11:43.960 --> 00:11:48.880
And what this tells you
is that the density is
00:11:48.880 --> 00:11:55.270
probability per unit length for
intervals of small length.
00:11:55.270 --> 00:11:59.860
So the units of density are
probability per unit length.
00:11:59.860 --> 00:12:01.420
Densities are not
probabilities.
00:12:01.420 --> 00:12:04.430
They are rates at which
probabilities accumulate,
00:12:04.430 --> 00:12:06.780
probabilities per unit length.
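NOTE A small numerical sketch of the "density times delta" approximation. The density f(x) = 2x on [0, 1] and the helper names are assumptions made for this example only.

```python
def density(x):
    """Hypothetical PDF (not from the lecture): f(x) = 2x on [0, 1], else 0."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def exact_probability(x, delta):
    """P(x <= X <= x + delta) for that PDF, valid when 0 <= x and x + delta <= 1.

    Integrating 2t from x to x + delta gives (x + delta)**2 - x**2.
    """
    return (x + delta) ** 2 - x ** 2


x = 0.4
for delta in (0.1, 0.01, 0.001):
    exact = exact_probability(x, delta)
    approx = density(x) * delta  # base times height
    # exact / delta is "probability per unit length"; it approaches f(x) as delta shrinks.
    print(f"delta={delta}: exact={exact:.6f}, approx={approx:.6f}, "
          f"per unit length={exact / delta:.4f}")
```

As delta shrinks, the exact probability and the base-times-height estimate agree more and more closely, and the ratio of probability to length settles at the density value.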
00:12:06.780 --> 00:12:09.780
And since densities are not
probabilities, they don't have
00:12:09.780 --> 00:12:11.960
to be less than 1.
00:12:11.960 --> 00:12:14.730
Ordinary probabilities must
always be at most 1.
00:12:14.730 --> 00:12:18.000
But density is a different
kind of thing.
00:12:18.000 --> 00:12:20.530
It can get pretty big
in some places.
00:12:20.530 --> 00:12:23.680
It can even sort of blow
up in some places.
00:12:23.680 --> 00:12:27.620
As long as the total area under
the curve is 1, other
00:12:27.620 --> 00:12:32.830
than that, the curve can do
anything that it wants.
00:12:32.830 --> 00:12:35.930
Now, the density prescribes
for us the
00:12:35.930 --> 00:12:41.620
probability of intervals.
00:12:41.620 --> 00:12:44.710
Sometimes we may want to find
the probability of more
00:12:44.710 --> 00:12:46.540
general sets.
00:12:46.540 --> 00:12:47.780
How would we do that?
00:12:47.780 --> 00:12:51.580
Well, for nice sets, you will
just integrate the density
00:12:51.580 --> 00:12:54.260
over that nice set.
00:12:54.260 --> 00:12:56.640
I'm not quite defining
what "nice" means.
00:12:56.640 --> 00:12:59.140
That's a pretty technical
topic in the theory of
00:12:59.140 --> 00:13:00.160
probability.
00:13:00.160 --> 00:13:04.530
But for our purposes, usually we
will take B to be something
00:13:04.530 --> 00:13:06.500
like a union of intervals.
00:13:06.500 --> 00:13:10.200
So how do you find the
probability of falling in the
00:13:10.200 --> 00:13:11.690
union of two intervals?
00:13:11.690 --> 00:13:14.180
Well, you find the probability
of falling in that interval
00:13:14.180 --> 00:13:16.240
plus the probability of falling
in that interval.
00:13:16.240 --> 00:13:19.150
So it's the integral over this
interval plus the integral
00:13:19.150 --> 00:13:20.500
over that interval.
00:13:20.500 --> 00:13:24.370
And you think of this as just
integrating over the union of
00:13:24.370 --> 00:13:25.730
the two intervals.
00:13:25.730 --> 00:13:28.580
So once you can calculate
probabilities of intervals,
00:13:28.580 --> 00:13:30.590
then usually you are in
business, and you can
00:13:30.590 --> 00:13:34.000
calculate anything else
you might want.
00:13:34.000 --> 00:13:36.330
So the probability density
function is a complete
00:13:36.330 --> 00:13:39.530
description of any statistical
information we might be
00:13:39.530 --> 00:13:44.425
interested in for a continuous
random variable.
00:13:44.425 --> 00:13:44.880
OK.
00:13:44.880 --> 00:13:47.330
So now we can start walking
through the concepts and the
00:13:47.330 --> 00:13:51.730
definitions that we have for
discrete random variables and
00:13:51.730 --> 00:13:54.230
translate them to the
continuous case.
00:13:54.230 --> 00:13:58.960
The first big concept is the
concept of the expectation.
00:13:58.960 --> 00:14:01.680
One can start with a
mathematical definition.
00:14:01.680 --> 00:14:04.810
And here we put down
a definition by
00:14:04.810 --> 00:14:07.730
just translating notation.
00:14:07.730 --> 00:14:11.160
Wherever we have a sum in
the discrete case, we
00:14:11.160 --> 00:14:13.060
now write an integral.
00:14:13.060 --> 00:14:16.310
And wherever we had the
probability mass function, we
00:14:16.310 --> 00:14:20.570
now throw in the probability
density function.
00:14:20.570 --> 00:14:22.010
This formula--
00:14:22.010 --> 00:14:24.200
you may have seen it in
freshman physics--
00:14:24.200 --> 00:14:28.190
basically, it again gives you
the center of gravity of the
00:14:28.190 --> 00:14:31.150
picture that you have when
you have the density.
00:14:31.150 --> 00:14:36.460
It's the center of gravity of
the object sitting underneath
00:14:36.460 --> 00:14:38.220
the probability density
function.
00:14:38.220 --> 00:14:40.900
So that interpretation
still applies.
00:14:40.900 --> 00:14:44.120
It's also true that our
conceptual interpretation of
00:14:44.120 --> 00:14:47.820
what an expectation means is
also valid in this case.
00:14:47.820 --> 00:14:51.770
That is, if you repeat an
experiment a zillion times,
00:14:51.770 --> 00:14:54.100
each time drawing an independent
sample of your
00:14:54.100 --> 00:14:58.500
random variable X, in the long
run, the average that you are
00:14:58.500 --> 00:15:01.860
going to get should be
the expectation.
00:15:01.860 --> 00:15:04.740
One can reason in a hand-waving
way, sort of
00:15:04.740 --> 00:15:07.440
intuitively, the way we did it
for the case of discrete
00:15:07.440 --> 00:15:08.770
random variables.
00:15:08.770 --> 00:15:11.940
But this is also a theorem
of some sort.
00:15:11.940 --> 00:15:15.300
It's a limit theorem that we're
going to visit later on
00:15:15.300 --> 00:15:17.530
in this class.
00:15:17.530 --> 00:15:20.700
Having defined the expectation
and having claimed that the
00:15:20.700 --> 00:15:23.100
interpretation of the
expectation is the same as
00:15:23.100 --> 00:15:26.810
before, then we can start taking
just any formula you've
00:15:26.810 --> 00:15:28.580
seen before and just
translate it.
00:15:28.580 --> 00:15:31.200
So for example, to find the
expected value of a function
00:15:31.200 --> 00:15:35.430
of a continuous random variable,
you do not have to
00:15:35.430 --> 00:15:39.130
find the PDF or PMF of g(X).
00:15:39.130 --> 00:15:43.040
You can just work directly with
the original distribution
00:15:43.040 --> 00:15:44.990
of the random variable
capital X.
00:15:44.990 --> 00:15:48.570
And this formula is the same
as for the discrete case.
00:15:48.570 --> 00:15:50.880
Sums get replaced
by integrals.
00:15:50.880 --> 00:15:54.340
And PMFs get replaced by PDFs.
00:15:54.340 --> 00:15:57.050
And in particular, the variance
of a random variable
00:15:57.050 --> 00:15:59.080
is defined again the same way.
00:15:59.080 --> 00:16:03.390
The variance is the expected
value, the average of the
00:16:03.390 --> 00:16:07.920
squared distance of X
from the mean.
00:16:07.920 --> 00:16:10.690
So it's the expected value for
a random variable that takes
00:16:10.690 --> 00:16:12.500
these numerical values.
00:16:12.500 --> 00:16:17.250
And same formula as before,
an integral and f instead of
00:16:17.250 --> 00:16:19.420
a summation and p.
00:16:19.420 --> 00:16:23.090
And the formulas that we have
derived or formulas that you
00:16:23.090 --> 00:16:26.260
have seen for the discrete case,
they all go through in the
00:16:26.260 --> 00:16:27.090
continuous case.
00:16:27.090 --> 00:16:31.990
So for example, the useful
relation for variances, which
00:16:31.990 --> 00:16:37.410
is this one, remains true.
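NOTE The formulas referred to in this passage are shown on slides that the captions do not reproduce. In the assumed, conventional notation f_X for the density of X, they read as follows. The last line is presumably the "useful relation for variances" mentioned above; that identification is an assumption.

```latex
\begin{align*}
  E[X] &= \int_{-\infty}^{\infty} x \, f_X(x)\,dx, \\
  E[g(X)] &= \int_{-\infty}^{\infty} g(x) \, f_X(x)\,dx, \\
  \operatorname{var}(X) &= E\big[(X - E[X])^2\big]
      = \int_{-\infty}^{\infty} (x - E[X])^2 \, f_X(x)\,dx, \\
  \operatorname{var}(X) &= E[X^2] - (E[X])^2 .
\end{align*}
```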
00:16:37.410 --> 00:16:37.850
All right.
00:16:37.850 --> 00:16:39.790
So time for an example.
00:16:39.790 --> 00:16:43.500
The simplest example of a
continuous random variable
00:16:43.500 --> 00:16:45.170
that there is, is the so-called
00:16:45.170 --> 00:16:48.670
uniform random variable.
00:16:48.670 --> 00:16:51.940
So the uniform random variable
is described by a density
00:16:51.940 --> 00:16:55.540
which is 0 except over
an interval.
00:16:55.540 --> 00:16:58.360
And over that interval,
it is constant.
00:16:58.360 --> 00:17:00.190
What is it meant to convey?
00:17:00.190 --> 00:17:04.829
It's trying to convey the idea
that all x's in this range are
00:17:04.829 --> 00:17:06.540
equally likely.
00:17:06.540 --> 00:17:08.390
Well, that doesn't
say very much.
00:17:08.390 --> 00:17:11.170
Any individual x has
0 probability.
00:17:11.170 --> 00:17:13.460
So it's conveying a little
more than that.
00:17:13.460 --> 00:17:18.000
What it is saying is that if I
take an interval of a given
00:17:18.000 --> 00:17:22.089
length delta, and I take another
interval of the same
00:17:22.089 --> 00:17:26.290
length, delta, under the uniform
distribution, these
00:17:26.290 --> 00:17:29.290
two intervals are going to have
the same probability.
00:17:29.290 --> 00:17:34.670
So being uniform means that
intervals of same length have
00:17:34.670 --> 00:17:35.720
the same probability.
00:17:35.720 --> 00:17:40.390
So no interval is more likely
than any other to occur.
00:17:40.390 --> 00:17:44.200
And in that sense, it conveys
the idea of sort of complete
00:17:44.200 --> 00:17:45.100
randomness.
00:17:45.100 --> 00:17:48.430
Any little interval in our range
is equally likely as any
00:17:48.430 --> 00:17:49.830
other little interval.
00:17:49.830 --> 00:17:50.260
All right.
00:17:50.260 --> 00:17:53.870
So what's the formula
for this density?
00:17:53.870 --> 00:17:55.280
I only told you the range.
00:17:55.280 --> 00:17:57.490
What's the height?
00:17:57.490 --> 00:18:00.340
Well, the area under the density
must be equal to 1.
00:18:00.340 --> 00:18:02.700
Total probability
is equal to 1.
00:18:02.700 --> 00:18:07.100
And so the height, inescapably,
is going to be 1
00:18:07.100 --> 00:18:09.480
over (b minus a).
00:18:09.480 --> 00:18:14.880
That's the height that makes
the density integrate to 1.
00:18:14.880 --> 00:18:16.610
So that's the formula.
00:18:16.610 --> 00:18:21.240
And if you don't want to lose
one point in your exam, you
00:18:21.240 --> 00:18:25.946
have to say that it's
also 0, otherwise.
00:18:25.946 --> 00:18:27.794
OK.
00:18:27.794 --> 00:18:28.260
All right?
00:18:28.260 --> 00:18:31.760
That's sort of the
complete answer.
00:18:31.760 --> 00:18:35.590
How about the expected value
of this random variable?
00:18:35.590 --> 00:18:36.060
OK.
00:18:36.060 --> 00:18:39.730
You can find the expected value
in two different ways.
00:18:39.730 --> 00:18:42.400
One is to start with
the definition.
00:18:42.400 --> 00:18:45.220
And so you integrate
over the range of
00:18:45.220 --> 00:18:47.185
interest times the density.
00:18:50.350 --> 00:18:55.460
And you figure out what that
integral is going to be.
00:18:55.460 --> 00:18:57.800
Or you can be a little
more clever.
00:18:57.800 --> 00:19:01.290
Since the center-of-gravity
interpretation is still true,
00:19:01.290 --> 00:19:03.890
it must be the center of gravity
of this picture.
00:19:03.890 --> 00:19:06.680
And the center of gravity is,
of course, the midpoint.
00:19:06.680 --> 00:19:11.740
Whenever you have symmetry,
the mean is always the
00:19:11.740 --> 00:19:20.630
midpoint of the diagram that
gives you the PDF.
00:19:20.630 --> 00:19:22.180
OK.
00:19:22.180 --> 00:19:24.870
So that's the expected
value of X.
00:19:24.870 --> 00:19:27.990
Finally, regarding the variance,
well, there you will
00:19:27.990 --> 00:19:30.240
have to do a little
bit of calculus.
00:19:30.240 --> 00:19:33.460
We can write down
the definition.
00:19:33.460 --> 00:19:35.930
So it's an integral
instead of a sum.
00:19:35.930 --> 00:19:40.590
A typical value of the random
variable minus the expected
00:19:40.590 --> 00:19:44.280
value, squared, times
the density.
00:19:44.280 --> 00:19:45.650
And we integrate.
00:19:45.650 --> 00:19:48.820
You do this integral, and you
find it's (b minus a) squared
00:19:48.820 --> 00:19:52.660
over that number, which
happens to be 12.
00:19:52.660 --> 00:19:56.140
Maybe more interesting is the
standard deviation itself.
00:19:59.140 --> 00:20:02.760
And you see that the standard
deviation is proportional to
00:20:02.760 --> 00:20:05.280
the width of that interval.
00:20:05.280 --> 00:20:07.850
This agrees with our intuition,
that the standard
00:20:07.850 --> 00:20:12.730
deviation is meant to capture a
sense of how spread out our
00:20:12.730 --> 00:20:14.000
distribution is.
00:20:14.000 --> 00:20:17.370
And the standard deviation has
the same units as the random
00:20:17.370 --> 00:20:19.040
variable itself.
00:20:19.040 --> 00:20:22.860
So it's sort of good to-- you
can interpret it in a
00:20:22.860 --> 00:20:27.180
reasonable way based
on that picture.
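NOTE The uniform results above (height 1 over (b minus a), mean at the midpoint, variance (b minus a) squared over 12) can be checked numerically. This is a minimal sketch: the endpoints a = 2 and b = 5, the function names, and the use of a midpoint Riemann sum are all choices made for illustration.

```python
import math


def uniform_pdf(x, a, b):
    """Density of a uniform random variable on [a, b]: 1/(b - a) inside, 0 otherwise."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def integrate(g, lo, hi, steps=10_000):
    """Midpoint Riemann sum of g over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * width) for i in range(steps)) * width


a, b = 2.0, 5.0  # hypothetical endpoints chosen for illustration

total = integrate(lambda x: uniform_pdf(x, a, b), a, b)               # should be close to 1
mean = integrate(lambda x: x * uniform_pdf(x, a, b), a, b)            # definition of E[X]
variance = integrate(lambda x: (x - mean) ** 2 * uniform_pdf(x, a, b), a, b)
std = math.sqrt(variance)                                             # proportional to b - a
```

With these endpoints the computed values can be compared against (a + b) / 2, (b - a)**2 / 12, and (b - a) / math.sqrt(12). Note also that uniform_pdf(0.1, 0.0, 0.5) returns 2.0, a density value above 1, which is allowed because a density is not a probability.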
00:20:27.180 --> 00:20:30.890
OK, yes.
00:20:30.890 --> 00:20:38.280
Now, let's go up one level and
think about the following.
00:20:38.280 --> 00:20:41.740
So we have formulas for the
discrete case, formulas for
00:20:41.740 --> 00:20:42.690
the continuous case.
00:20:42.690 --> 00:20:44.420
So you can write them
side by side.
00:20:44.420 --> 00:20:47.100
One has sums, the other
has integrals.
00:20:47.100 --> 00:20:49.450
Suppose you want to make an
argument and say that
00:20:49.450 --> 00:20:52.160
something is true for every
random variable.
00:20:52.160 --> 00:20:55.770
You would essentially need to
do two separate proofs, for
00:20:55.770 --> 00:20:57.510
discrete and for continuous.
00:20:57.510 --> 00:21:00.400
Is there some way of dealing
with all random variables
00:21:00.400 --> 00:21:05.130
at once, in one shot, using
a sort of uniform notation?
00:21:05.130 --> 00:21:07.990
Is there a unifying concept?
00:21:07.990 --> 00:21:10.170
Luckily, there is one.
00:21:10.170 --> 00:21:12.400
It's the notion of the
cumulative distribution
00:21:12.400 --> 00:21:13.850
function of a random variable.
00:21:16.400 --> 00:21:20.730
And it's a concept that applies
equally well to
00:21:20.730 --> 00:21:22.890
discrete and continuous
random variables.
00:21:22.890 --> 00:21:26.210
So it's an object that we can
use to describe distributions
00:21:26.210 --> 00:21:29.340
in both cases, using just
one piece of notation.
00:21:32.070 --> 00:21:33.600
So what's the definition?
00:21:33.600 --> 00:21:36.290
It's the probability that the
random variable takes values
00:21:36.290 --> 00:21:39.030
less than or equal to a certain
number, little x.
00:21:39.030 --> 00:21:41.440
So you go to the diagram, and
you see what's the probability
00:21:41.440 --> 00:21:44.060
that I'm falling to
the left of this.
00:21:44.060 --> 00:21:47.680
And you specify those
probabilities for all x's.
00:21:47.680 --> 00:21:51.400
In the continuous case, you
calculate those probabilities
00:21:51.400 --> 00:21:53.090
using the integral formula.
00:21:53.090 --> 00:21:55.730
So you integrate from
here up to x.
00:21:55.730 --> 00:21:58.850
In the discrete case, to find
the probability to the left of
00:21:58.850 --> 00:22:02.790
some point, you go here, and
you add probabilities again
00:22:02.790 --> 00:22:03.980
from the left.
00:22:03.980 --> 00:22:06.770
So the way that the cumulative
distribution function is
00:22:06.770 --> 00:22:10.010
calculated is a little different
in the continuous
00:22:10.010 --> 00:22:10.850
and discrete case.
00:22:10.850 --> 00:22:11.990
In one case you integrate.
00:22:11.990 --> 00:22:13.440
In the other, you sum.
00:22:13.440 --> 00:22:18.340
But leaving aside how it's being
calculated, what the
00:22:18.340 --> 00:22:22.530
concept is, it's the same
concept in both cases.
00:22:22.530 --> 00:22:25.810
So let's see what the shape of
the cumulative distribution
00:22:25.810 --> 00:22:28.360
function would be in
the two cases.
00:22:28.360 --> 00:22:34.100
So here what we want is to
record for every little x the
00:22:34.100 --> 00:22:36.760
probability of falling
to the left of x.
00:22:36.760 --> 00:22:38.240
So let's start here.
00:22:38.240 --> 00:22:41.580
Probability of falling to
the left of here is 0--
00:22:41.580 --> 00:22:43.550
0, 0, 0.
00:22:43.550 --> 00:22:47.280
Once we get here and we start
moving to the right, the
00:22:47.280 --> 00:22:51.750
probability of falling to the
left of here is the area of
00:22:51.750 --> 00:22:53.610
this little rectangle.
00:22:53.610 --> 00:22:57.590
And the area of that little
rectangle increases linearly
00:22:57.590 --> 00:22:59.290
as I keep moving.
00:22:59.290 --> 00:23:03.780
So accordingly, the CDF
increases linearly until I get
00:23:03.780 --> 00:23:04.870
to that point.
00:23:04.870 --> 00:23:08.670
At that point, what's
the value of my CDF?
00:23:08.670 --> 00:23:09.020
1.
00:23:09.020 --> 00:23:11.400
I have accumulated all the
probability there is.
00:23:11.400 --> 00:23:13.180
I have integrated it.
00:23:13.180 --> 00:23:15.890
This total area has
to be equal to 1.
00:23:15.890 --> 00:23:18.780
So it reaches 1, and then
there's no more probability to
00:23:18.780 --> 00:23:20.040
be accumulated.
00:23:20.040 --> 00:23:23.170
It just stays at 1.
00:23:23.170 --> 00:23:28.050
So the value here
is equal to 1.
00:23:28.050 --> 00:23:30.270
OK.
00:23:30.270 --> 00:23:36.716
How would you find the density
if somebody gave you the CDF?
00:23:36.716 --> 00:23:39.570
The CDF is the integral
of the density.
00:23:39.570 --> 00:23:43.820
Therefore, the density is the
derivative of the CDF.
00:23:43.820 --> 00:23:46.190
So you look at this picture
and take the derivative.
00:23:46.190 --> 00:23:48.580
Derivative is 0 here, 0 here.
00:23:48.580 --> 00:23:51.330
And it's a constant
up there, which
00:23:51.330 --> 00:23:53.120
corresponds to that constant.
00:23:53.120 --> 00:23:56.900
So more generally, and an
important thing to know, is
00:23:56.900 --> 00:24:04.250
that the derivative of the CDF
is equal to the density--
00:24:10.210 --> 00:24:14.170
almost, with a little
bit of an exception.
00:24:14.170 --> 00:24:15.800
What's the exception?
00:24:15.800 --> 00:24:19.200
At those places where the CDF
does not have a derivative--
00:24:19.200 --> 00:24:21.520
here where it has a corner--
00:24:21.520 --> 00:24:23.720
the derivative is undefined.
00:24:23.720 --> 00:24:26.030
And in some sense, the
density is also
00:24:26.030 --> 00:24:27.460
ambiguous at that point.
00:24:27.460 --> 00:24:31.860
Is my density at the endpoint,
is it 0 or is it 1?
00:24:31.860 --> 00:24:33.330
It doesn't really matter.
00:24:33.330 --> 00:24:36.670
If you change the density at
just a single point, it's not
00:24:36.670 --> 00:24:39.000
going to affect the
value of any
00:24:39.000 --> 00:24:41.530
integral you ever calculate.
00:24:41.530 --> 00:24:44.900
So the value of the density at
the endpoint, you can leave it
00:24:44.900 --> 00:24:47.390
as being ambiguous, or
you can specify it.
00:24:47.390 --> 00:24:49.130
It doesn't matter.
00:24:49.130 --> 00:24:53.590
So at all places where the
CDF has a derivative,
00:24:53.590 --> 00:24:54.970
this will be true.
00:24:54.970 --> 00:24:58.470
At those places where you have
corners, which do show up
00:24:58.470 --> 00:25:01.740
sometimes, well, you
don't really care.
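A minimal sketch of the relationship just described, assuming the picture shows a uniform density on the interval from 0 to 1 (the endpoints are an assumption; the transcript does not state them). The function names are hypothetical. It compares a numerical derivative of the CDF with the density at a few points away from the corners.

```python
def uniform_pdf(x, a=0.0, b=1.0):
    """Density of a uniform random variable on [a, b].

    The value chosen at the endpoints is arbitrary, as noted in the lecture.
    """
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a=0.0, b=1.0):
    """P(X <= x): 0 to the left of a, linear on [a, b], then 1."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def numerical_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Away from the corners, the slope of the CDF matches the density.
for x in (-0.5, 0.25, 0.75, 1.5):
    print(x, uniform_cdf(x), numerical_derivative(uniform_cdf, x), uniform_pdf(x))
```

At a corner such as x = 0, the central difference returns about 0.5, which is neither 0 nor 1. That is the ambiguity at the endpoint, and it does not affect any integral.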
00:25:01.740 --> 00:25:03.640
How about the discrete case?
00:25:03.640 --> 00:25:07.450
In the discrete case, the CDF
has a more peculiar shape.
00:25:07.450 --> 00:25:08.870
So let's do the calculation.
00:25:08.870 --> 00:25:10.440
We want to find the
probability of being
00:25:10.440 --> 00:25:11.920
to the left of here.
00:25:11.920 --> 00:25:13.970
That probability is 0, 0, 0.
00:25:13.970 --> 00:25:16.170
Once we cross that point, the
probability of being to the
00:25:16.170 --> 00:25:19.140
left of here is 1/6.
00:25:19.140 --> 00:25:22.030
So as soon as we cross the
point 1, we get the
00:25:22.030 --> 00:25:25.740
probability of 1/6, which means
that the size of the
00:25:25.740 --> 00:25:29.230
jump that we have here is 1/6.
00:25:29.230 --> 00:25:31.020
Now, question.
00:25:31.020 --> 00:25:35.175
At this point 1, which is the
correct value of the CDF?
00:25:35.175 --> 00:25:39.090
Is it 0, or is it 1/6?
00:25:39.090 --> 00:25:40.560
It's 1/6 because--
00:25:40.560 --> 00:25:42.540
you need to look carefully
at the definitions, the
00:25:42.540 --> 00:25:46.180
probability of capital X being less
than or equal to little x.
00:25:46.180 --> 00:25:49.230
If I take little x to be 1,
it's the probability that
00:25:49.230 --> 00:25:51.900
capital X is less than
or equal to 1.
00:25:51.900 --> 00:25:55.730
So it includes the event
that capital X is equal to 1.
00:25:55.730 --> 00:25:58.130
So it includes this
probability here.
00:25:58.130 --> 00:26:02.710
So at jump points, the correct
value of the CDF is going to
00:26:02.710 --> 00:26:04.650
be this one.
00:26:04.650 --> 00:26:08.130
And now as I trace, x is
going to the right.
00:26:08.130 --> 00:26:12.750
As soon as I cross this point,
I have added another 3/6
00:26:12.750 --> 00:26:14.180
probability.
00:26:14.180 --> 00:26:20.350
So that 3/6 causes a
jump to the CDF.
00:26:20.350 --> 00:26:23.280
And that determines
the new value.
00:26:23.280 --> 00:26:27.860
And finally, once I cross
the last point, I get
00:26:27.860 --> 00:26:31.631
another jump of 2/6.
00:26:31.631 --> 00:26:35.900
A general moral from these two
examples and these pictures.
00:26:35.900 --> 00:26:39.270
CDFs are well defined
in both cases.
00:26:39.270 --> 00:26:42.490
For the case of continuous
random variables, the CDF will
00:26:42.490 --> 00:26:45.000
be a continuous function.
00:26:45.000 --> 00:26:46.330
It starts from 0.
00:26:46.330 --> 00:26:49.760
It eventually goes to 1
and goes smoothly--
00:26:49.760 --> 00:26:54.100
well, continuously from smaller
to higher values.
00:26:54.100 --> 00:26:55.200
It can only go up.
00:26:55.200 --> 00:26:58.300
It cannot go down since we're
accumulating more and more
00:26:58.300 --> 00:27:00.230
probability as we are
going to the right.
00:27:00.230 --> 00:27:03.160
In the discrete case, again
it starts from 0,
00:27:03.160 --> 00:27:04.610
and it goes to 1.
00:27:04.610 --> 00:27:07.740
But it does it in a
staircase manner.
00:27:07.740 --> 00:27:13.050
And you get a jump at each place
where the PMF assigns a
00:27:13.050 --> 00:27:14.660
positive mass.
00:27:14.660 --> 00:27:19.560
So jumps in the CDF are
associated with point masses
00:27:19.560 --> 00:27:20.330
in our distribution.
00:27:20.330 --> 00:27:23.570
In the continuous case, we don't
have any point masses,
00:27:23.570 --> 00:27:25.470
so we do not have any
jumps either.
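The staircase CDF can be sketched as follows. The masses 1/6, 3/6, and 2/6 come from the example, and the first mass sits at the point 1. The locations 2 and 3 for the other two masses are an assumption made for illustration, and `discrete_cdf` is a hypothetical helper name.

```python
from fractions import Fraction

# Masses from the example. Only the location of the first mass (x = 1) is
# stated in the lecture; the locations 2 and 3 are assumed for illustration.
pmf = {1: Fraction(1, 6), 2: Fraction(3, 6), 3: Fraction(2, 6)}


def discrete_cdf(x, pmf):
    """P(X <= x): add up the masses at all points less than or equal to x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


# The CDF is flat between mass points and jumps by the mass at each point.
for x in (0.5, 1, 1.5, 2, 2.5, 3, 4):
    print(x, discrete_cdf(x, pmf))
```

Because the comparison is `value <= x`, the CDF at a jump point already includes the mass sitting there, so the value at the point 1 is 1/6 rather than 0.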
00:27:30.390 --> 00:27:33.300
Now, besides saving
us notation--
00:27:33.300 --> 00:27:36.020
we don't have to deal
with discrete
00:27:36.020 --> 00:27:39.000
and continuous twice--
00:27:39.000 --> 00:27:43.240
CDFs give us actually a little
more flexibility.
00:27:43.240 --> 00:27:46.840
Not all random variables are
continuous or discrete.
00:27:46.840 --> 00:27:49.790
You can cook up random variables
that are kind of
00:27:49.790 --> 00:27:53.410
neither or a mixture
of the two.
00:27:53.410 --> 00:27:59.540
An example would be, let's
say you play a game.
00:27:59.540 --> 00:28:03.620
And with a certain probability,
you get a certain
00:28:03.620 --> 00:28:05.690
number of dollars
in your hands.
00:28:05.690 --> 00:28:07.000
So you flip a coin.
00:28:07.000 --> 00:28:14.120
And with probability 1/2, you
get a reward of 1/2 dollars.
00:28:14.120 --> 00:28:18.430
And with probability 1/2, you
are led to a dark room where
00:28:18.430 --> 00:28:20.580
you spin a wheel of fortune.
00:28:20.580 --> 00:28:23.410
And that wheel of fortune gives
you a random reward
00:28:23.410 --> 00:28:25.610
between 0 and 1.
00:28:25.610 --> 00:28:28.600
So any of these outcomes
is possible.
00:28:28.600 --> 00:28:31.100
And the amount that you're
going to get,
00:28:31.100 --> 00:28:33.930
let's say, is uniform.
00:28:33.930 --> 00:28:35.640
So you flip a coin.
00:28:35.640 --> 00:28:38.360
And depending on the outcome of
the coin, either you get a
00:28:38.360 --> 00:28:43.530
certain value or you get a
value that ranges over a
00:28:43.530 --> 00:28:45.360
continuous interval.
00:28:45.360 --> 00:28:48.380
So what kind of random
variable is it?
00:28:48.380 --> 00:28:50.280
Is it continuous?
00:28:50.280 --> 00:28:54.100
Well, continuous random
variables assign 0 probability
00:28:54.100 --> 00:28:56.180
to individual points.
00:28:56.180 --> 00:28:58.020
Is it the case here?
00:28:58.020 --> 00:29:00.680
No, because you have positive
probability of
00:29:00.680 --> 00:29:04.740
obtaining 1/2 dollar.
00:29:04.740 --> 00:29:07.040
So our random variable
is not continuous.
00:29:07.040 --> 00:29:08.220
Is it discrete?
00:29:08.220 --> 00:29:11.600
It's not discrete, because our
random variable can take
00:29:11.600 --> 00:29:14.260
values also over a
continuous range.
00:29:14.260 --> 00:29:16.780
So we call such a random
variable a
00:29:16.780 --> 00:29:19.380
mixed random variable.
00:29:19.380 --> 00:29:27.200
If you were to draw its
distribution very loosely,
00:29:27.200 --> 00:29:33.740
probably you would want to draw
a picture like this one,
00:29:33.740 --> 00:29:36.710
which kind of conveys the
idea of what's going on.
00:29:36.710 --> 00:29:39.690
So just think of this as a
drawing of masses that are
00:29:39.690 --> 00:29:41.840
sitting over a table.
00:29:41.840 --> 00:29:47.940
We place an object that weighs
half a pound, but it's an
00:29:47.940 --> 00:29:50.230
object that takes zero space.
00:29:50.230 --> 00:29:53.720
So half a pound is just sitting
on top of that point.
00:29:53.720 --> 00:29:57.980
And we take another half-pound
of probability and spread it
00:29:57.980 --> 00:30:00.740
uniformly over that interval.
00:30:00.740 --> 00:30:04.820
So this is like a piece that
comes from mass functions.
00:30:04.820 --> 00:30:08.060
And that's a piece that looks
more like a density function.
00:30:08.060 --> 00:30:10.920
And we just throw them together
in the picture.
00:30:10.920 --> 00:30:13.150
I'm not trying to associate
any formal
00:30:13.150 --> 00:30:14.310
meaning with this picture.
00:30:14.310 --> 00:30:18.410
It's just a schematic of how
probabilities are distributed,
00:30:18.410 --> 00:30:20.860
help us visualize
what's going on.
00:30:20.860 --> 00:30:26.080
Now, if you have taken classes
on systems and all of that,
00:30:26.080 --> 00:30:29.890
you may have seen the concept
of an impulse function.
00:30:29.890 --> 00:30:33.630
And you may start saying that,
oh, I should treat this
00:30:33.630 --> 00:30:36.190
mathematically as a so-called
impulse function.
00:30:36.190 --> 00:30:39.400
But we do not need this for our
purposes in this class.
00:30:39.400 --> 00:30:43.860
Just think of this as a nice
picture that conveys what's
00:30:43.860 --> 00:30:46.200
going on in this particular
case.
00:30:46.200 --> 00:30:51.740
So now, what would the CDF
look like in this case?
00:30:51.740 --> 00:30:55.550
The CDF is always well defined,
no matter what kind
00:30:55.550 --> 00:30:57.220
of random variable you have.
00:30:57.220 --> 00:30:59.540
So the fact that it's not
continuous, it's not discrete
00:30:59.540 --> 00:31:01.870
shouldn't be a problem as
long as we can calculate
00:31:01.870 --> 00:31:04.120
probabilities of this kind.
00:31:04.120 --> 00:31:07.600
So the probability of falling
to the left here is 0.
00:31:07.600 --> 00:31:10.850
Once I start crossing there, the
probability of falling to
00:31:10.850 --> 00:31:13.890
the left of a point increases
linearly with
00:31:13.890 --> 00:31:15.610
how far I have gone.
00:31:15.610 --> 00:31:17.900
So we get this linear
increase.
00:31:17.900 --> 00:31:21.250
But as soon as I cross that
point, I accumulate another
00:31:21.250 --> 00:31:24.220
1/2 unit of probability
instantly.
00:31:24.220 --> 00:31:27.860
And once I accumulate that 1/2
unit, it means that my CDF is
00:31:27.860 --> 00:31:30.320
going to have a jump of 1/2.
00:31:30.320 --> 00:31:33.780
And then afterwards, I still
keep accumulating probability
00:31:33.780 --> 00:31:36.760
at a fixed rate, the rate
being the density.
00:31:36.760 --> 00:31:39.640
And I keep accumulating, again,
at a linear rate until
00:31:39.640 --> 00:31:42.160
I settle to 1.
00:31:42.160 --> 00:31:46.240
So this is a CDF that has
certain pieces where it
00:31:46.240 --> 00:31:48.060
increases continuously.
00:31:48.060 --> 00:31:50.280
And that corresponds to the
continuous part of our
00:31:50.280 --> 00:31:51.390
random variable.
00:31:51.390 --> 00:31:55.090
And it also has some places
where it has discrete jumps.
00:31:55.090 --> 00:31:57.500
And those discrete jumps
correspond to places in which
00:31:57.500 --> 00:32:00.990
we have placed a
positive mass.
00:32:00.990 --> 00:32:01.780
And by the--
00:32:01.780 --> 00:32:03.750
OK, yeah.
00:32:03.750 --> 00:32:06.580
So this little 0 shouldn't
be there.
00:32:06.580 --> 00:32:08.040
So let's cross it out.
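A minimal sketch of the CDF for the coin-and-wheel game: with probability 1/2 the reward is exactly 1/2, and with probability 1/2 it is uniform between 0 and 1. The function name `mixed_cdf` is hypothetical.

```python
def mixed_cdf(x):
    """P(reward <= x) for the mixed random variable in the game example."""
    if x < 0:
        return 0.0
    if x >= 1:
        return 1.0
    # Half a unit of probability is spread uniformly over [0, 1],
    # so it accumulates at a constant rate of 1/2.
    continuous_part = 0.5 * x
    # The other half sits as a point mass at x = 1/2 and is picked up
    # all at once when x reaches 1/2.
    point_mass = 0.5 if x >= 0.5 else 0.0
    return continuous_part + point_mass


for x in (-0.1, 0.0, 0.25, 0.49, 0.5, 0.75, 1.0, 1.2):
    print(x, mixed_cdf(x))
```

The printed values rise linearly with slope 1/2, jump by 1/2 at the point 1/2, continue with the same slope, and settle at 1.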
00:32:10.980 --> 00:32:11.780
All right.
00:32:11.780 --> 00:32:15.830
So finally, we're going to take
the remaining time and
00:32:15.830 --> 00:32:17.610
introduce our new friend.
00:32:17.610 --> 00:32:23.080
It's going to be the Gaussian
or normal distribution.
00:32:23.080 --> 00:32:27.690
So it's the most important
distribution there is in all
00:32:27.690 --> 00:32:28.940
of probability theory.
00:32:28.940 --> 00:32:31.230
It plays a very
central role.
00:32:31.230 --> 00:32:34.340
It shows up all over
the place.
00:32:34.340 --> 00:32:37.870
We'll see later in the
class in more detail
00:32:37.870 --> 00:32:39.450
why it shows up.
00:32:39.450 --> 00:32:42.115
But the quick preview
is the following.
00:32:42.115 --> 00:32:46.220
If you have a phenomenon in
which you measure a certain
00:32:46.220 --> 00:32:50.970
quantity, but that quantity is
made up of lots and lots of
00:32:50.970 --> 00:32:52.820
random contributions--
00:32:52.820 --> 00:32:55.870
so your random variable is
actually the sum of lots and
00:32:55.870 --> 00:32:59.570
lots of independent little
random variables--
00:32:59.570 --> 00:33:04.290
then invariably, no matter
what kind of distribution the
00:33:04.290 --> 00:33:08.260
little random variables have,
their sum will turn out to
00:33:08.260 --> 00:33:11.500
have approximately a normal
distribution.
00:33:11.500 --> 00:33:14.490
So this makes the normal
distribution arise very
00:33:14.490 --> 00:33:16.680
naturally in lots and
lots of contexts.
00:33:16.680 --> 00:33:21.210
Whenever you have noise that's
comprised of lots of different
00:33:21.210 --> 00:33:26.310
independent pieces of noise,
then the end result will be a
00:33:26.310 --> 00:33:28.650
random variable that's normal.
00:33:28.650 --> 00:33:31.250
So we are going to come back
to that topic later.
00:33:31.250 --> 00:33:34.620
But that's the preview comment,
basically to argue
00:33:34.620 --> 00:33:37.430
that it's an important one.
00:33:37.430 --> 00:33:37.690
OK.
00:33:37.690 --> 00:33:38.810
And there's a special case.
00:33:38.810 --> 00:33:41.030
If you are dealing with a
binomial distribution, which
00:33:41.030 --> 00:33:44.610
is the sum of lots of Bernoulli
random variables,
00:33:44.610 --> 00:33:47.200
again you would expect that
the binomial would start
00:33:47.200 --> 00:33:51.170
looking like a normal if you
have many, many-- a large
00:33:51.170 --> 00:33:53.150
number of coin flips.
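A rough numerical illustration of the binomial case, not a proof. The values n = 100 and p = 0.5 are assumptions chosen for the sketch. It compares the binomial PMF with a normal density that has the same mean, np, and the same standard deviation, the square root of np(1 - p), using the general normal formula that the lecture builds a few minutes later.

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))


n, p = 100, 0.5                      # assumed values for illustration
mu = n * p                           # mean of the binomial
sigma = math.sqrt(n * p * (1 - p))   # standard deviation of the binomial

for k in (40, 45, 50, 55, 60):
    print(k, round(binomial_pmf(k, n, p), 4), round(normal_pdf(k, mu, sigma), 4))
```

For these values the two columns agree to within about 0.001, which is the sense in which the binomial starts looking like a normal.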
00:33:53.150 --> 00:33:53.530
All right.
00:33:53.530 --> 00:33:56.560
So what's the math
involved here?
00:33:56.560 --> 00:34:02.370
Let's parse the formula for
the density of the normal.
00:34:02.370 --> 00:34:07.110
What we start with is the
function X squared over 2.
00:34:07.110 --> 00:34:09.750
And if you are to plot X
squared over 2, it's a
00:34:09.750 --> 00:34:12.840
parabola, and it has
this shape --
00:34:12.840 --> 00:34:14.860
X squared over 2.
00:34:14.860 --> 00:34:16.790
Then what do we do?
00:34:16.790 --> 00:34:20.210
We take the negative exponential
of this.
00:34:20.210 --> 00:34:24.600
So when X squared over
2 is 0, then negative
00:34:24.600 --> 00:34:28.980
exponential is 1.
00:34:28.980 --> 00:34:32.739
When X squared over 2 increases,
the negative
00:34:32.739 --> 00:34:37.130
exponential of that falls off,
and it falls off pretty fast.
00:34:37.130 --> 00:34:39.630
So as this goes up, the
formula for the
00:34:39.630 --> 00:34:41.150
density goes down.
00:34:41.150 --> 00:34:45.060
And because exponentials are
pretty strong in how quickly
00:34:45.060 --> 00:34:49.530
they fall off, this means that
the tails of this distribution
00:34:49.530 --> 00:34:53.370
actually do go down
pretty fast.
00:34:53.370 --> 00:34:53.659
OK.
00:34:53.659 --> 00:34:57.800
So that explains the shape
of the normal PDF.
00:34:57.800 --> 00:35:02.340
How about this factor 1
over square root 2 pi?
00:35:02.340 --> 00:35:05.540
Where does this come from?
00:35:05.540 --> 00:35:08.760
Well, the integral has
to be equal to 1.
00:35:08.760 --> 00:35:14.620
So you have to go and do your
calculus exercise and find the
00:35:14.620 --> 00:35:18.350
integral of this e to the minus X
squared over 2 function and
00:35:18.350 --> 00:35:22.240
then figure out, what constant
do I need to put in front so
00:35:22.240 --> 00:35:24.250
that the integral
is equal to 1?
00:35:24.250 --> 00:35:26.820
How do you evaluate
that integral?
00:35:26.820 --> 00:35:30.760
Either you go to Mathematica
or Wolfram Alpha or
00:35:30.760 --> 00:35:33.340
whatever, and it tells
you what it is.
00:35:33.340 --> 00:35:37.260
Or it's a very beautiful
calculus exercise that you may
00:35:37.260 --> 00:35:39.050
have seen at some point.
00:35:39.050 --> 00:35:42.190
You throw in another exponential
of this kind, you
00:35:42.190 --> 00:35:46.520
bring in polar coordinates, and
somehow the answer comes
00:35:46.520 --> 00:35:48.010
beautifully out there.
00:35:48.010 --> 00:35:51.910
But in any case, this is the
constant that you need to make
00:35:51.910 --> 00:35:56.070
it integrate to 1 and to be
a legitimate density.
00:35:56.070 --> 00:35:58.550
We call this the standard
normal.
00:35:58.550 --> 00:36:02.280
And for the standard normal,
what is the expected value?
00:36:02.280 --> 00:36:05.780
Well, there's symmetry, so
it's equal to 0.
00:36:05.780 --> 00:36:07.490
What is the variance?
00:36:07.490 --> 00:36:09.740
Well, here there's
no shortcut.
00:36:09.740 --> 00:36:12.490
You have to do another
calculus exercise.
00:36:12.490 --> 00:36:17.080
And you find that the variance
is equal to 1.
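The calculus exercises can be checked numerically. This is a sketch using a simple midpoint rule, not the polar-coordinates derivation. Truncating the integrals at plus and minus 10 is an assumption, justified because the tails beyond that range are negligible. All function names are hypothetical.

```python
import math


def integrate(f, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


def bell(x):
    """The unnormalized bell shape e^(-x^2 / 2)."""
    return math.exp(-x * x / 2)


def standard_normal_pdf(x):
    """The bell shape divided by the constant sqrt(2 pi)."""
    return bell(x) / math.sqrt(2 * math.pi)


# Assumption: the range [-10, 10] stands in for the whole real line.
area = integrate(bell, -10, 10)
total = integrate(standard_normal_pdf, -10, 10)
mean = integrate(lambda x: x * standard_normal_pdf(x), -10, 10)
variance = integrate(lambda x: x * x * standard_normal_pdf(x), -10, 10)

print(area, math.sqrt(2 * math.pi))  # the area under the bell vs sqrt(2 pi)
print(total, mean, variance)         # close to 1, 0, and 1
```

The first line shows why the constant in front has to be 1 over the square root of 2 pi: that is what the unnormalized bell integrates to.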
00:36:17.080 --> 00:36:17.750
OK.
00:36:17.750 --> 00:36:21.720
So this is a normal that's
centered around 0.
00:36:21.720 --> 00:36:24.990
How about other types of normals
that are centered at
00:36:24.990 --> 00:36:26.760
different places?
00:36:26.760 --> 00:36:29.730
So we can do the same
kind of thing.
00:36:29.730 --> 00:36:34.080
Instead of centering it at 0,
we can take some place where
00:36:34.080 --> 00:36:39.640
we want to center it, write down
a quadratic such as (X
00:36:39.640 --> 00:36:44.050
minus mu) squared, and then
take the negative
00:36:44.050 --> 00:36:45.940
exponential of that.
00:36:45.940 --> 00:36:53.790
And that gives us a normal
density that's centered at mu.
00:36:53.790 --> 00:37:01.190
Now, I may wish to control
the width of my density.
00:37:01.190 --> 00:37:04.820
To control the width of my
density, equivalently I can
00:37:04.820 --> 00:37:07.720
control the width
of my parabola.
00:37:07.720 --> 00:37:15.430
If my parabola is narrower, if
my parabola looks like this,
00:37:15.430 --> 00:37:17.990
what's going to happen
to the density?
00:37:17.990 --> 00:37:20.550
It's going to fall
off much faster.
00:37:26.620 --> 00:37:26.920
OK.
00:37:26.920 --> 00:37:31.150
How do I make my parabola
narrower or wider?
00:37:31.150 --> 00:37:35.300
I do it by putting in a
constant down here.
00:37:35.300 --> 00:37:39.890
So by putting a sigma here, this
stretches or widens my
00:37:39.890 --> 00:37:42.840
parabola by a factor of sigma.
00:37:42.840 --> 00:37:43.540
Let's see.
00:37:43.540 --> 00:37:44.780
Which way does it go?
00:37:44.780 --> 00:37:49.330
If sigma is very small,
this is a big number.
00:37:49.330 --> 00:37:55.080
My parabola goes up quickly,
which means my normal falls
00:37:55.080 --> 00:37:56.730
off very fast.
00:37:56.730 --> 00:38:02.630
So small sigma corresponds
to a narrower density.
00:38:02.630 --> 00:38:08.870
And so it, therefore, should be
intuitive that the standard
00:38:08.870 --> 00:38:11.520
deviation is proportional
to sigma.
00:38:11.520 --> 00:38:13.380
Because that's the amount
by which you
00:38:13.380 --> 00:38:15.080
are scaling the picture.
00:38:15.080 --> 00:38:17.320
And indeed, the standard
deviation is sigma.
00:38:17.320 --> 00:38:21.470
And so the variance
is sigma squared.
00:38:21.470 --> 00:38:26.590
So all that we have done here
to create a general normal
00:38:26.590 --> 00:38:31.180
with a given mean and variance
is to take this picture, shift
00:38:31.180 --> 00:38:35.600
it in space so that the mean
sits at mu instead of 0, and
00:38:35.600 --> 00:38:38.880
then scale it by a
factor of sigma.
00:38:38.880 --> 00:38:41.130
This gives us a normal
with a given
00:38:41.130 --> 00:38:42.560
mean and a given variance.
00:38:42.560 --> 00:38:47.670
And the formula for
it is this one.
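The formula on the slide is not captured in the transcript. For reference, the standard density of a normal random variable with mean mu and variance sigma squared, which matches the construction described above (shift the parabola to mu, scale it by sigma, take the negative exponential, and normalize), is:

```latex
f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}
         \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad -\infty < x < \infty .
```

Setting mu = 0 and sigma = 1 recovers the standard normal density, with the constant 1 over the square root of 2 pi in front.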
00:38:47.670 --> 00:38:48.810
All right.
00:38:48.810 --> 00:38:52.230
Now, normal random variables
have some wonderful
00:38:52.230 --> 00:38:54.160
properties.
00:38:54.160 --> 00:39:00.190
And one of them is that they
behave nicely when you take
00:39:00.190 --> 00:39:02.740
linear functions of them.
00:39:02.740 --> 00:39:07.190
So let's fix some constants
a and b, suppose that X is
00:39:07.190 --> 00:39:13.840
normal, and look at this
linear function Y.
00:39:13.840 --> 00:39:17.340
What is the expected
value of Y?
00:39:17.340 --> 00:39:19.220
Here we don't need
anything special.
00:39:19.220 --> 00:39:22.920
We know that the expected value
of a linear function is
00:39:22.920 --> 00:39:26.690
the linear function of
the expectation.
00:39:26.690 --> 00:39:30.570
So the expected value is this.
00:39:30.570 --> 00:39:33.230
How about the variance?
00:39:33.230 --> 00:39:36.430
We know that the variance of a
linear function doesn't care
00:39:36.430 --> 00:39:37.910
about the constant term.
00:39:37.910 --> 00:39:40.880
But the variance gets multiplied
by a squared.
00:39:40.880 --> 00:39:46.880
So we get this variance, where
sigma squared is the
00:39:46.880 --> 00:39:49.070
variance of the original
normal.
00:39:49.070 --> 00:39:53.730
So have we used so far the
property that X is normal?
00:39:53.730 --> 00:39:55.170
No, we haven't.
00:39:55.170 --> 00:39:59.650
This calculation here is true
in general when you take a
00:39:59.650 --> 00:40:02.650
linear function of a
random variable.
00:40:02.650 --> 00:40:08.730
But if X is normal, we get the
other additional fact that Y
00:40:08.730 --> 00:40:10.930
is also going to be normal.
00:40:10.930 --> 00:40:14.300
So that's the nontrivial
part of the fact that
00:40:14.300 --> 00:40:16.070
I'm claiming here.
00:40:16.070 --> 00:40:19.700
So linear functions of normal
random variables are
00:40:19.700 --> 00:40:23.020
themselves normal.
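A small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which supports multiplying a normal by a constant and adding a constant. The numbers mu = 2, sigma = 4, a = 3, and b = -1 are assumptions for illustration. Note that the library builds in the fact that Y is normal, so this illustrates the claim and checks the mean and variance formulas; it does not prove the claim.

```python
from statistics import NormalDist

x = NormalDist(mu=2, sigma=4)   # assumed example: mean 2, variance 16
a, b = 3, -1                    # assumed constants for Y = aX + b

y = x * a + b                   # NormalDist arithmetic returns another NormalDist

print(y.mean, a * x.mean + b)            # E[Y] = a E[X] + b
print(y.variance, a**2 * x.variance)     # var(Y) = a^2 var(X)

# Since a > 0, the event {Y <= 8} is the same as {X <= (8 - b) / a} = {X <= 3},
# so the two CDF values should agree.
print(y.cdf(8), x.cdf((8 - b) / a))
```

The constant term b moves the mean but leaves the variance alone, while the factor a scales the variance by a squared.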
00:40:23.020 --> 00:40:26.680
How do we convince ourselves
about it?
00:40:26.680 --> 00:40:27.080
OK.
00:40:27.080 --> 00:40:31.390
It's something that we will do
formally in about two or three
00:40:31.390 --> 00:40:33.390
lectures from today.
00:40:33.390 --> 00:40:35.310
So we're going to prove it.
00:40:35.310 --> 00:40:39.770
But if you think about it
intuitively, normal means this
00:40:39.770 --> 00:40:42.070
particular bell-shaped curve.
00:40:42.070 --> 00:40:45.550
And that bell-shaped curve could
be sitting anywhere and
00:40:45.550 --> 00:40:47.910
could be scaled in any way.
00:40:47.910 --> 00:40:51.190
So you start with a
bell-shaped curve.
00:40:51.190 --> 00:40:55.370
If you take X, which is bell
shaped, and you multiply it by
00:40:55.370 --> 00:40:57.500
a constant, what does that do?
00:40:57.500 --> 00:41:01.260
Multiplying by a constant is
just like scaling the axis or
00:41:01.260 --> 00:41:03.750
changing the units with which
you're measuring it.
00:41:03.750 --> 00:41:08.880
So it will take a bell shape
and spread it or narrow it.
00:41:08.880 --> 00:41:10.850
But it will still
be a bell shape.
00:41:10.850 --> 00:41:13.440
And then when you add the
constant, you just take that
00:41:13.440 --> 00:41:16.260
bell and move it elsewhere.
00:41:16.260 --> 00:41:19.970
So under linear transformations,
bell shapes
00:41:19.970 --> 00:41:23.360
will remain bell shapes, just
sitting at a different place
00:41:23.360 --> 00:41:25.090
and with a different width.
00:41:25.090 --> 00:41:30.490
And that's sort of the intuition
of why normals remain normals
00:41:30.490 --> 00:41:32.096
under this kind of
transformation.
00:41:35.100 --> 00:41:36.770
So why is this useful?
00:41:36.770 --> 00:41:37.960
Well, OK.
00:41:37.960 --> 00:41:39.890
We have a formula
for the density.
00:41:39.890 --> 00:41:43.750
But usually we want to calculate
probabilities.
00:41:43.750 --> 00:41:45.760
How will you calculate
probabilities?
00:41:45.760 --> 00:41:48.670
If I ask you, what's the
probability that the normal is
00:41:48.670 --> 00:41:51.380
less than 3, how
do you find it?
00:41:51.380 --> 00:41:54.830
You need to integrate the
density from minus
00:41:54.830 --> 00:41:57.300
infinity up to 3.
00:41:57.300 --> 00:42:03.230
Unfortunately, the integral of
the expression that shows up
00:42:03.230 --> 00:42:06.720
that you would have to
calculate, an integral of this
00:42:06.720 --> 00:42:12.690
kind from, let's say, minus
infinity to some number, is
00:42:12.690 --> 00:42:16.270
something that's not known
in closed form.
00:42:16.270 --> 00:42:23.490
So if you're looking for a
closed-form formula for this--
00:42:23.490 --> 00:42:25.040
X bar--
00:42:25.040 --> 00:42:27.890
if you're looking for a
closed-form formula that gives
00:42:27.890 --> 00:42:32.010
you the value of this integral
as a function of X bar, you're
00:42:32.010 --> 00:42:34.460
not going to find it.
00:42:34.460 --> 00:42:36.150
So what can we do?
00:42:36.150 --> 00:42:38.790
Well, since it's a useful
integral, we can
00:42:38.790 --> 00:42:40.880
just tabulate it.
00:42:40.880 --> 00:42:46.070
Calculate it once and for all,
for all values of X bar up to
00:42:46.070 --> 00:42:50.440
some precision, and have
that table, and use it.
00:42:50.440 --> 00:42:53.010
That's what one does.
00:42:53.010 --> 00:42:54.885
OK, but now there is a catch.
00:42:54.885 --> 00:42:59.600
Are we going to write down a
table for every conceivable
00:42:59.600 --> 00:43:01.870
type of normal distribution--
00:43:01.870 --> 00:43:05.115
that is, for every possible
mean and every variance?
00:43:05.115 --> 00:43:07.400
I guess that would be
a pretty long table.
00:43:07.400 --> 00:43:09.540
You don't want to do that.
00:43:09.540 --> 00:43:12.820
Fortunately, it's enough to
have a table with the
00:43:12.820 --> 00:43:17.590
numerical values only for
the standard normal.
00:43:17.590 --> 00:43:20.880
And once you have those, you can
use them in a clever way
00:43:20.880 --> 00:43:24.000
to calculate probabilities for
the more general case.
00:43:24.000 --> 00:43:26.090
So let's see how this is done.
00:43:26.090 --> 00:43:30.610
So our starting point is that
someone has graciously
00:43:30.610 --> 00:43:36.520
calculated for us the values
of the CDF, the cumulative
00:43:36.520 --> 00:43:40.350
distribution function, that is
the probability of falling
00:43:40.350 --> 00:43:44.120
below a certain point for
the standard normal
00:43:44.120 --> 00:43:46.610
and at various places.
00:43:46.610 --> 00:43:48.770
How do we read this table?
00:43:48.770 --> 00:43:55.840
The probability that X is
less than, let's say,
00:43:55.840 --> 00:43:59.170
0.63 is this number.
00:43:59.170 --> 00:44:04.610
This number, 0.7357, is the
probability that the standard
00:44:04.610 --> 00:44:08.070
normal is below 0.63.
00:44:08.070 --> 00:44:11.377
So the table refers to
the standard normal.
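A sketch of how such a table could be generated today. It relies on the error function `math.erf` from the Python standard library and the identity Phi(z) = (1 + erf(z / sqrt 2)) / 2 for the standard normal CDF. The row layout, one row per first decimal and one column per second decimal, is an assumption about how the printed table is organized, and the function names are hypothetical.

```python
import math


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal Z, computed with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def table_row(first_decimal):
    """Table entries for z = first_decimal + 0.00, 0.01, ..., 0.09, to 4 places."""
    return [round(standard_normal_cdf(first_decimal + j / 100), 4) for j in range(10)]


row = table_row(0.6)
print(row)
print(row[3])  # the entry for z = 0.63
```

The entry for 0.63 should reproduce the 0.7357 read off the table in the lecture.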
00:44:15.990 --> 00:44:19.600
But someone, let's say, gives
us some other numbers and
00:44:19.600 --> 00:44:22.140
tells us we're dealing with a
normal with a certain mean and
00:44:22.140 --> 00:44:23.530
a certain variance.
00:44:23.530 --> 00:44:26.555
And we want to calculate the
probability that the value of
00:44:26.555 --> 00:44:28.740
that random variable is less
than or equal to 3.
00:44:28.740 --> 00:44:30.470
How are we going to do it?
00:44:30.470 --> 00:44:36.210
Well, there's a standard trick,
which is so-called
00:44:36.210 --> 00:44:39.480
standardizing a random
variable.
00:44:39.480 --> 00:44:41.350
Standardizing a random variable
00:44:41.350 --> 00:44:43.080
stands for the following.
00:44:43.080 --> 00:44:44.490
You look at the random
variable, and you
00:44:44.490 --> 00:44:46.280
subtract the mean.
00:44:46.280 --> 00:44:50.690
This makes it a random
variable with 0 mean.
00:44:50.690 --> 00:44:54.270
And then if I divide by the
standard deviation, what
00:44:54.270 --> 00:44:58.220
happens to the variance of
this random variable?
00:44:58.220 --> 00:45:03.860
Dividing by sigma divides the
variance by sigma squared.
00:45:03.860 --> 00:45:07.300
The original variance of
X was sigma squared.
00:45:07.300 --> 00:45:11.740
So when I divide by sigma, I
end up with unit variance.
00:45:11.740 --> 00:45:14.920
So after I do this
transformation, I get a random
00:45:14.920 --> 00:45:19.190
variable that has 0 mean
and unit variance.
00:45:19.190 --> 00:45:20.650
It is also normal.
00:45:20.650 --> 00:45:23.580
Why is it normal?
00:45:23.580 --> 00:45:28.890
Because this expression is a
linear function of the X that
00:45:28.890 --> 00:45:30.120
I started with.
00:45:30.120 --> 00:45:32.700
It's a linear function of a
normal random variable.
00:45:32.700 --> 00:45:34.620
Therefore, it is normal.
00:45:34.620 --> 00:45:37.090
And it is a standard normal.
00:45:37.090 --> 00:45:41.460
So by taking a general normal
random variable and doing this
00:45:41.460 --> 00:45:47.200
standardization, you end up
with a standard normal to
00:45:47.200 --> 00:45:49.580
which you can then
apply the table.
00:45:52.100 --> 00:45:56.180
Sometimes one calls this
the normalized score.
00:45:56.180 --> 00:45:59.100
If you're thinking about test
results, how would you
00:45:59.100 --> 00:46:00.780
interpret this number?
00:46:00.780 --> 00:46:05.440
It tells you how many standard
deviations are you
00:46:05.440 --> 00:46:07.900
away from the mean.
00:46:07.900 --> 00:46:10.470
This is how much you are
away from the mean.
00:46:10.470 --> 00:46:13.080
And you count it in terms
of how many standard
00:46:13.080 --> 00:46:14.390
deviations it is.
00:46:14.390 --> 00:46:19.680
So this number being equal to 3
tells you that X happens to
00:46:19.680 --> 00:46:23.160
be 3 standard deviations
above the mean.
00:46:23.160 --> 00:46:26.030
And I guess if you're looking
at your quiz scores, very
00:46:26.030 --> 00:46:30.690
often that's the kind of number
that you think about.
00:46:30.690 --> 00:46:32.130
So it's a useful quantity.
00:46:32.130 --> 00:46:35.120
But it's also useful for doing
the calculation we're now
00:46:35.120 --> 00:46:36.050
going to do.
00:46:36.050 --> 00:46:40.910
So suppose that X has a mean of
2 and a variance of 16, so
00:46:40.910 --> 00:46:43.600
a standard deviation of 4.
00:46:43.600 --> 00:46:46.030
And we're going to calculate the
probability of this event.
00:46:46.030 --> 00:46:49.900
This event is described in terms
of this X that has ugly
00:46:49.900 --> 00:46:51.530
means and variances.
00:46:51.530 --> 00:46:55.390
But we can take this event
and rewrite it as
00:46:55.390 --> 00:46:57.070
an equivalent event.
00:46:57.070 --> 00:47:01.470
X less than 3 is the same as
X minus 2 being less than 3
00:47:01.470 --> 00:47:06.410
minus 2, which is the same as
this ratio being less than
00:47:06.410 --> 00:47:08.440
that ratio.
00:47:08.440 --> 00:47:11.460
So I'm subtracting from both
sides of the inequality the
00:47:11.460 --> 00:47:14.170
mean and then dividing by
the standard deviation.
00:47:14.170 --> 00:47:16.190
This event is the same
as that event.
00:47:16.190 --> 00:47:19.430
Why do we like this
better than that?
00:47:19.430 --> 00:47:23.670
We like it because this is the
standardized, or normalized,
00:47:23.670 --> 00:47:28.660
version of X. We know that
this is standard normal.
00:47:28.660 --> 00:47:30.650
And so we're asking the
question, what's the
00:47:30.650 --> 00:47:34.130
probability that the standard
normal is less than this
00:47:34.130 --> 00:47:37.300
number, which is 1/4?
00:47:37.300 --> 00:47:45.380
So that's the key property, that
this is normal (0, 1).
00:47:45.380 --> 00:47:48.470
And so we can look up now with
the table and ask for the
00:47:48.470 --> 00:47:51.010
probability that the standard
normal random variable
00:47:51.010 --> 00:47:53.170
is less than 0.25.
00:47:53.170 --> 00:47:55.130
Where is that going to be?
00:47:55.130 --> 00:48:01.390
0.2, 0.25, it's here.
00:48:01.390 --> 00:48:09.600
So the answer is 0.5987.
00:48:09.600 --> 00:48:15.570
So I guess this is just a drill
that you could learn in
00:48:15.570 --> 00:48:16.190
high school.
00:48:16.190 --> 00:48:18.990
You didn't have to come here
to learn about it.
00:48:18.990 --> 00:48:22.030
But it's a drill that's very
useful because we will be
00:48:22.030 --> 00:48:24.060
calculating normal probabilities
all the time.
00:48:24.060 --> 00:48:27.300
So make sure you know how to
use the table and how to
00:48:27.300 --> 00:48:30.350
massage a general normal
random variable into a
00:48:30.350 --> 00:48:33.380
standard normal random
variable.
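
As a rough check on the drill above, here is a minimal sketch that standardizes X and evaluates the standard normal CDF numerically instead of reading the table. The function names are made up for illustration, and using the error function from Python's math module in place of the table is an assumption of this sketch, not something from the lecture.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """CDF of a normal (0, 1) random variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_prob_less_than(x: float, mean: float, variance: float) -> float:
    """P(X < x) for a normal X, found by standardizing and using the N(0, 1) CDF."""
    sigma = math.sqrt(variance)   # standard deviation
    z = (x - mean) / sigma        # subtract the mean, divide by the standard deviation
    return standard_normal_cdf(z)


# The example from the lecture: mean 2, variance 16, event X < 3.
z_score = (3 - 2) / math.sqrt(16)
prob = normal_prob_less_than(3, mean=2, variance=16)
print(f"z = {z_score:.2f}, P(X < 3) = {prob:.4f}")
```

The standardized value comes out to 0.25, which is the number you would look up in the table.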
00:48:33.380 --> 00:48:33.790
OK.
00:48:33.790 --> 00:48:37.450
So just one more minute to look
at the big picture and
00:48:37.450 --> 00:48:40.940
take stock of what we
have done so far
00:48:40.940 --> 00:48:42.970
and where we're going.
00:48:42.970 --> 00:48:47.840
Chapter 2 was this part of the
picture, where we dealt with
00:48:47.840 --> 00:48:50.460
discrete random variables.
00:48:50.460 --> 00:48:54.590
And this time, today, we
started talking about
00:48:54.590 --> 00:48:56.410
continuous random variables.
00:48:56.410 --> 00:49:00.305
And we introduced the density
function, which is the analog
00:49:00.305 --> 00:49:03.160
of the probability
mass function.
00:49:03.160 --> 00:49:05.790
We have the concepts
of expectation and
00:49:05.790 --> 00:49:07.090
variance and CDF.
00:49:07.090 --> 00:49:10.290
And this kind of notation
applies to both discrete and
00:49:10.290 --> 00:49:11.720
continuous cases.
00:49:11.720 --> 00:49:17.310
They are calculated the same way
in both cases except that
00:49:17.310 --> 00:49:19.770
in the discrete case,
you use sums.
00:49:19.770 --> 00:49:22.740
In the continuous case,
you use integrals.
00:49:22.740 --> 00:49:25.320
So on that side, you
have integrals.
00:49:25.320 --> 00:49:27.500
In this case, you have sums.
00:49:27.500 --> 00:49:30.200
In this case, you always have
f's in your formulas.
00:49:30.200 --> 00:49:33.500
In this case, you always have
p's in your formulas.
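
To make the parallel concrete, here are the two sets of formulas side by side, written in the usual notation with p_X for the probability mass function and f_X for the density. The exact symbols on the slide are assumed here, since the transcript does not show them.

```latex
\begin{align*}
\text{Discrete:}   \quad & E[X] = \sum_x x\, p_X(x),
  & \operatorname{var}(X) &= \sum_x \bigl(x - E[X]\bigr)^2 p_X(x),
  & F_X(x) &= \sum_{k \le x} p_X(k), \\
\text{Continuous:} \quad & E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\, dx,
  & \operatorname{var}(X) &= \int_{-\infty}^{\infty} \bigl(x - E[X]\bigr)^2 f_X(x)\, dx,
  & F_X(x) &= \int_{-\infty}^{x} f_X(t)\, dt.
\end{align*}
```

In both cases the CDF has the same meaning: F_X(x) is the probability that X is less than or equal to x.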
00:49:33.500 --> 00:49:37.890
So what's left
for us to do is to look at
00:49:37.890 --> 00:49:42.460
these two concepts, joint
probability mass functions and
00:49:42.460 --> 00:49:47.410
conditional mass functions, and
figure out what would be
00:49:47.410 --> 00:49:51.080
the equivalent concepts on
the continuous side.
00:49:51.080 --> 00:49:55.240
So we will need some notion of
a joint density when we're
00:49:55.240 --> 00:49:57.510
dealing with multiple
random variables.
00:49:57.510 --> 00:50:00.310
And we will also need the
concept of conditional
00:50:00.310 --> 00:50:03.430
density, again for the case of
continuous random variables.
00:50:03.430 --> 00:50:07.840
The intuition and the meaning
of these objects is going to
00:50:07.840 --> 00:50:14.120
be exactly the same as here,
only a little subtler because
00:50:14.120 --> 00:50:16.000
densities are not
probabilities.
00:50:16.000 --> 00:50:18.630
They're rates at which
probabilities accumulate.
00:50:18.630 --> 00:50:22.030
So that adds a little bit of
potential confusion here,
00:50:22.030 --> 00:50:24.680
which, hopefully, we will fully
resolve in the next
00:50:24.680 --> 00:50:26.490
couple of sections.
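
One way to see what it means for a density to be a rate is the small sketch below. It compares the probability of a short interval with the density times the length of that interval, using the standard normal as an example. The point x = 0.25 and the interval length delta = 0.01 are arbitrary choices for illustration, and the function names are hypothetical.

```python
import math


def standard_normal_pdf(x: float) -> float:
    """Density of a normal (0, 1) random variable."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def standard_normal_cdf(x: float) -> float:
    """CDF of a normal (0, 1) random variable, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


x, delta = 0.25, 0.01  # arbitrary point and small interval length

# Exact probability of landing in [x, x + delta].
exact = standard_normal_cdf(x + delta) - standard_normal_cdf(x)

# Density times length: probability accumulated at rate f(x) over a length delta.
approx = standard_normal_pdf(x) * delta

print(f"exact probability : {exact:.6f}")
print(f"density * delta   : {approx:.6f}")
```

The two numbers agree closely for small delta, which is the sense in which the density is a probability per unit length and not a probability by itself.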
00:50:26.490 --> 00:50:27.310
All right.
00:50:27.310 --> 00:50:28.560
Thank you.