1
00:00:00,000 --> 00:00:00,040
2
00:00:00,040 --> 00:00:02,460
The following content is
provided under a Creative
3
00:00:02,460 --> 00:00:03,870
Commons license.
4
00:00:03,870 --> 00:00:06,910
Your support will help MIT
OpenCourseWare continue to
5
00:00:06,910 --> 00:00:08,700
offer high-quality, educational
6
00:00:08,700 --> 00:00:10,560
resources for free.
7
00:00:10,560 --> 00:00:13,460
To make a donation or view
additional materials from
8
00:00:13,460 --> 00:00:19,290
hundreds of MIT courses, visit
MIT OpenCourseWare at
9
00:00:19,290 --> 00:00:20,540
ocw.mit.edu.
10
00:00:20,540 --> 00:00:22,245
11
00:00:22,245 --> 00:00:23,550
PROFESSOR: Good morning.
12
00:00:23,550 --> 00:00:25,900
So today we're going
to continue the
13
00:00:25,900 --> 00:00:27,930
subject from last time.
14
00:00:27,930 --> 00:00:31,010
So we're going to talk about
derived distributions a little
15
00:00:31,010 --> 00:00:34,685
more, how to derive the
distribution of a function of
16
00:00:34,685 --> 00:00:36,510
a random variable.
17
00:00:36,510 --> 00:00:40,440
So last time we discussed a
couple of examples in which we
18
00:00:40,440 --> 00:00:43,570
had a function of a
single variable.
19
00:00:43,570 --> 00:00:46,330
And we found the distribution
of Y, if we're told the
20
00:00:46,330 --> 00:00:47,970
distribution of X.
21
00:00:47,970 --> 00:00:51,030
So today we're going to do an
example where we deal with the
22
00:00:51,030 --> 00:00:53,600
function of two random
variables.
23
00:00:53,600 --> 00:00:56,460
And then we're going to consider
the most interesting
24
00:00:56,460 --> 00:01:00,470
example of this kind, in which
we have a random variable of
25
00:01:00,470 --> 00:01:03,800
the form W, which is
the sum of two
26
00:01:03,800 --> 00:01:05,830
independent, random variables.
27
00:01:05,830 --> 00:01:08,210
That's a case that shows
up quite often.
28
00:01:08,210 --> 00:01:10,850
And so we want to see what
exactly happens in this
29
00:01:10,850 --> 00:01:12,780
particular case.
30
00:01:12,780 --> 00:01:14,620
Just one comment that
I should make.
31
00:01:14,620 --> 00:01:18,010
The material that we're covering
now, chapter four, is
32
00:01:18,010 --> 00:01:21,940
sort of conceptually a little
more difficult than what we
33
00:01:21,940 --> 00:01:23,600
have been doing before.
34
00:01:23,600 --> 00:01:26,230
So I would definitely encourage
you to read the text
35
00:01:26,230 --> 00:01:29,690
before you jump and try
to do the problems in
36
00:01:29,690 --> 00:01:32,300
your problem sets.
37
00:01:32,300 --> 00:01:40,270
OK, so let's start with our
example, in which we're given
38
00:01:40,270 --> 00:01:41,940
two random variables.
39
00:01:41,940 --> 00:01:43,450
They're jointly continuous.
40
00:01:43,450 --> 00:01:45,870
And their distribution
is pretty simple.
41
00:01:45,870 --> 00:01:48,920
They're uniform on
the unit square.
42
00:01:48,920 --> 00:01:52,440
In particular, each one of the
random variables is uniform on
43
00:01:52,440 --> 00:01:54,050
the unit interval.
44
00:01:54,050 --> 00:01:57,160
And the two random variables
are independent.
45
00:01:57,160 --> 00:02:00,820
What we're going to find is the
distribution of the ratio
46
00:02:00,820 --> 00:02:03,120
of the two random variables.
47
00:02:03,120 --> 00:02:07,170
How do we go about it? Well,
the same cookbook procedure
48
00:02:07,170 --> 00:02:10,020
that we used last time
for the case of a
49
00:02:10,020 --> 00:02:13,750
single random variable.
50
00:02:13,750 --> 00:02:17,100
The cookbook procedure that
we used for this case also
51
00:02:17,100 --> 00:02:20,240
applies to the case where you
have a function of multiple
52
00:02:20,240 --> 00:02:21,420
random variables.
53
00:02:21,420 --> 00:02:23,740
So what was the cookbook
procedure?
54
00:02:23,740 --> 00:02:27,030
The first step is to find the
cumulative distribution
55
00:02:27,030 --> 00:02:30,830
function of the random variable
of interest and then
56
00:02:30,830 --> 00:02:36,260
take the derivative in order
to find the density.
57
00:02:36,260 --> 00:02:39,770
So let's find the cumulative.
58
00:02:39,770 --> 00:02:43,800
So, by definition, the
cumulative is the probability
59
00:02:43,800 --> 00:02:47,940
that the random variable is
less than or equal to the
60
00:02:47,940 --> 00:02:49,880
argument of the cumulative.
61
00:02:49,880 --> 00:02:53,010
So if we write this event in
terms of the random variable
62
00:02:53,010 --> 00:02:58,470
of interest, this is the
probability that our random
63
00:02:58,470 --> 00:03:01,650
variable is less than
or equal to z.
64
00:03:01,650 --> 00:03:04,920
So what is that?
65
00:03:04,920 --> 00:03:09,453
OK, so the ratio is going to be
less than or equal to z, if
66
00:03:09,453 --> 00:03:14,920
and only if the pair, (x,y),
happens to fall below the line
67
00:03:14,920 --> 00:03:17,280
that has a slope z.
68
00:03:17,280 --> 00:03:20,800
OK, so we draw a line
that has a slope z.
69
00:03:20,800 --> 00:03:23,880
The ratio is less than this
number, if and only if we get
70
00:03:23,880 --> 00:03:27,700
the pair of x and y that falls
inside this triangle.
71
00:03:27,700 --> 00:03:29,430
So we're talking about
the probability of
72
00:03:29,430 --> 00:03:30,880
this particular event.
73
00:03:30,880 --> 00:03:37,170
Since this line has a slope of
z, the height at this point is
74
00:03:37,170 --> 00:03:38,650
equal to z.
75
00:03:38,650 --> 00:03:40,980
And so we can find the
probability of this event.
76
00:03:40,980 --> 00:03:43,260
It's just the area
of this triangle.
77
00:03:43,260 --> 00:03:47,100
And so the area is 1
times z times 1/2.
78
00:03:47,100 --> 00:03:48,760
And we get the answer, z/2.
79
00:03:48,760 --> 00:03:52,190
80
00:03:52,190 --> 00:03:56,220
Now, is this answer
always correct?
81
00:03:56,220 --> 00:04:00,300
Now, this answer is going to be
correct only if the slope
82
00:04:00,300 --> 00:04:05,080
happens to be such that we get
a picture of this kind.
83
00:04:05,080 --> 00:04:07,380
So when do we get a picture
of this kind?
84
00:04:07,380 --> 00:04:09,460
When the slope is less than 1.
85
00:04:09,460 --> 00:04:13,110
If I consider a different slope,
a number, little z --
86
00:04:13,110 --> 00:04:15,730
that happens to be a slope
of that kind --
87
00:04:15,730 --> 00:04:17,670
then the picture changes.
88
00:04:17,670 --> 00:04:20,579
And in that case, we
get a picture of
89
00:04:20,579 --> 00:04:24,330
this kind, let's say.
90
00:04:24,330 --> 00:04:28,495
So this is a line here
of slope z, again.
91
00:04:28,495 --> 00:04:31,030
92
00:04:31,030 --> 00:04:35,790
And this is the second case in
which our number, little z, is
93
00:04:35,790 --> 00:04:37,960
bigger than 1.
94
00:04:37,960 --> 00:04:39,740
So how do we proceed?
95
00:04:39,740 --> 00:04:43,690
Once more, the cumulative is the
probability that the ratio
96
00:04:43,690 --> 00:04:46,060
is less than or equal
to that number.
97
00:04:46,060 --> 00:04:50,650
So it's the probability that
we fall below the red line.
98
00:04:50,650 --> 00:04:56,590
So we're talking about the
event, about this event.
99
00:04:56,590 --> 00:04:59,450
So to find the probability of
this event, we need to find
100
00:04:59,450 --> 00:05:02,300
the area of this red shape.
101
00:05:02,300 --> 00:05:06,310
And one way of finding this area
is to consider the whole
102
00:05:06,310 --> 00:05:09,560
area and subtract the area
of this triangle.
103
00:05:09,560 --> 00:05:11,360
So let's do it this way.
104
00:05:11,360 --> 00:05:15,000
It's going to be 1 minus the
area of the triangle.
105
00:05:15,000 --> 00:05:16,750
Now, what's the area
of the triangle?
106
00:05:16,750 --> 00:05:24,420
It's 1/2 times this side, which
is 1 times this side.
107
00:05:24,420 --> 00:05:28,090
How big is that side?
108
00:05:28,090 --> 00:05:37,130
Well, the slope is z, and
z is the ratio y over x.
109
00:05:37,130 --> 00:05:39,050
So if y over x--
110
00:05:39,050 --> 00:05:46,560
at this point we have
y/x = z and y = 1.
111
00:05:46,560 --> 00:05:49,370
This means that z is 1/x.
112
00:05:49,370 --> 00:05:55,080
So the coordinate of
this point is 1/z.
113
00:05:55,080 --> 00:05:56,970
And this means that
we're going to--
114
00:05:56,970 --> 00:06:04,390
1/z. So here we get the
factor of 1/z.
115
00:06:04,390 --> 00:06:07,300
116
00:06:07,300 --> 00:06:09,440
And we're basically done.
117
00:06:09,440 --> 00:06:12,630
I guess if you want to have a
complete answer, you should
118
00:06:12,630 --> 00:06:16,770
also give the formula
for z less than 0.
119
00:06:16,770 --> 00:06:19,510
What is the cumulative when
z is less than 0, the
120
00:06:19,510 --> 00:06:22,870
probability that you get the
ratio that's negative?
121
00:06:22,870 --> 00:06:25,140
Well, since our random variables
are positive,
122
00:06:25,140 --> 00:06:27,890
there's no way that you can
get a negative ratio.
123
00:06:27,890 --> 00:06:31,900
So the cumulative down
there is equal to 0.
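Collecting the three cases gives the cumulative in one place. The symbol F_Z is an assumption here; the lecture only calls it "the cumulative". The third line is 1 minus the triangle area, 1/2 times 1 times 1/z.

```latex
F_Z(z) = P\!\left(\tfrac{Y}{X} \le z\right) =
\begin{cases}
0, & z < 0,\\[2pt]
\dfrac{z}{2}, & 0 \le z \le 1,\\[6pt]
1 - \dfrac{1}{2z}, & z > 1.
\end{cases}
```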
124
00:06:31,900 --> 00:06:34,870
So we can plot the cumulative.
125
00:06:34,870 --> 00:06:37,965
And we can take its derivative
in order to find the density.
126
00:06:37,965 --> 00:06:45,980
127
00:06:45,980 --> 00:06:49,720
So the cumulative that
we got starts at 0,
128
00:06:49,720 --> 00:06:52,000
when z's are negative.
129
00:06:52,000 --> 00:06:59,750
Then it starts going up
in proportion to z, at
130
00:06:59,750 --> 00:07:01,000
the slope of 1/2.
131
00:07:01,000 --> 00:07:03,520
132
00:07:03,520 --> 00:07:05,770
So this takes us up to z equal to 1.
133
00:07:05,770 --> 00:07:08,980
134
00:07:08,980 --> 00:07:14,480
And then it starts increasing
towards 1,
135
00:07:14,480 --> 00:07:15,790
according to this function.
136
00:07:15,790 --> 00:07:18,780
When you let z go to infinity,
the cumulative is
137
00:07:18,780 --> 00:07:20,330
going to go to 1.
138
00:07:20,330 --> 00:07:25,210
And it has a shape of, more
or less, this kind.
139
00:07:25,210 --> 00:07:29,035
So now to get the density, we
just take the derivative.
140
00:07:29,035 --> 00:07:36,790
141
00:07:36,790 --> 00:07:40,560
And the density is, of
course, 0 down here.
142
00:07:40,560 --> 00:07:43,950
Up here the derivative
is just 1/2.
143
00:07:43,950 --> 00:07:48,750
144
00:07:48,750 --> 00:07:52,470
And beyond that point we need to
take the derivative of this
145
00:07:52,470 --> 00:07:53,480
expression.
146
00:07:53,480 --> 00:07:58,700
And the derivative is going to
be 1/2 times 1 over z-squared.
147
00:07:58,700 --> 00:08:00,990
So it's going to be a
shape of this kind.
148
00:08:00,990 --> 00:08:09,440
149
00:08:09,440 --> 00:08:11,730
And we're done.
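A quick simulation can serve as a sanity check on the derived cumulative. This is only a sketch: the function names `ratio_cdf` and `empirical_cdf`, the sample size, and the seed are all assumptions made for illustration, and the simulated value will only be close to the formula, not equal to it.

```python
import random


def ratio_cdf(z):
    """Cumulative of Z = Y/X from the lecture, for X, Y independent uniform on the unit interval."""
    if z < 0:
        return 0.0
    if z <= 1:
        return z / 2          # area of the triangle with base 1 and height z
    return 1 - 1 / (2 * z)    # unit square minus the triangle with sides 1 and 1/z


def empirical_cdf(z, n=200_000, seed=0):
    """Fraction of simulated (x, y) pairs whose ratio y/x is at most z."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = 1.0 - rng.random()  # lies in (0, 1], so the division below is safe
        y = rng.random()
        if y / x <= z:
            hits += 1
    return hits / n


if __name__ == "__main__":
    for z in (0.5, 1.0, 2.0):
        print(z, ratio_cdf(z), empirical_cdf(z))
```

Differentiating `ratio_cdf` by hand gives the density from the lecture: 1/2 between 0 and 1, and 1/(2z-squared) beyond 1.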
150
00:08:11,730 --> 00:08:14,820
So you see that problems
involving functions of
151
00:08:14,820 --> 00:08:19,300
multiple random variables are
no harder than problems that
152
00:08:19,300 --> 00:08:22,320
deal with the function of
a single random variable.
153
00:08:22,320 --> 00:08:25,070
The general procedure is,
again, exactly the same.
154
00:08:25,070 --> 00:08:28,470
You first find the cumulative,
and then you differentiate.
155
00:08:28,470 --> 00:08:31,540
The only extra difficulty will
be that when you calculate the
156
00:08:31,540 --> 00:08:34,570
cumulative, you need to find
the probability of an event
157
00:08:34,570 --> 00:08:37,020
that involves multiple
random variables.
158
00:08:37,020 --> 00:08:40,730
And sometimes this could be
a little harder to do.
159
00:08:40,730 --> 00:08:44,910
By the way, since we dealt
with this example, just a
160
00:08:44,910 --> 00:08:45,920
couple of questions.
161
00:08:45,920 --> 00:08:49,280
What do you think is going to
be the expected value of the
162
00:08:49,280 --> 00:08:52,720
random variable Z?
163
00:08:52,720 --> 00:08:55,960
Let's see, the expected value
of the random variable Z is
164
00:08:55,960 --> 00:09:01,120
going to be the integral
of z times the density.
165
00:09:01,120 --> 00:09:10,950
And the density is equal to 1/2
for z going from 0 to 1.
166
00:09:10,950 --> 00:09:12,050
And then there's another
167
00:09:12,050 --> 00:09:14,770
contribution from 1 to infinity.
168
00:09:14,770 --> 00:09:17,260
There the density is
1/(2z-squared).
169
00:09:17,260 --> 00:09:19,880
170
00:09:19,880 --> 00:09:24,630
And we get the z, since we're
dealing with expectations, dz.
171
00:09:24,630 --> 00:09:25,880
So what is this integral?
172
00:09:25,880 --> 00:09:29,602
173
00:09:29,602 --> 00:09:35,070
Well, if you look here, you're
integrating 1/z, all the way
174
00:09:35,070 --> 00:09:36,420
to infinity.
175
00:09:36,420 --> 00:09:41,550
1/z has an integral, which
is the logarithm of z.
176
00:09:41,550 --> 00:09:44,660
And since the logarithm goes to
infinity, this means that
177
00:09:44,660 --> 00:09:47,770
this integral is
also infinite.
178
00:09:47,770 --> 00:09:53,640
So the expectation of the random
variable Z is actually
179
00:09:53,640 --> 00:09:55,310
infinite in this example.
180
00:09:55,310 --> 00:09:57,130
There's nothing wrong
with this.
181
00:09:57,130 --> 00:10:00,500
Lots of random variables have
infinite expectations.
182
00:10:00,500 --> 00:10:06,980
If the tail of the density falls
kind of slowly, as the
183
00:10:06,980 --> 00:10:10,650
argument goes to infinity, then
it may well turn out that
184
00:10:10,650 --> 00:10:12,430
you get an infinite integral.
185
00:10:12,430 --> 00:10:15,770
So that's just how
things often are.
186
00:10:15,770 --> 00:10:19,060
Nothing strange about it.
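Written out, the calculation looks as follows. The symbol f_Z for the density of Z is an assumption; the two pieces of the density are the ones derived above.

```latex
E[Z] = \int_0^1 z \cdot \frac{1}{2}\,dz + \int_1^\infty z \cdot \frac{1}{2z^2}\,dz
     = \frac{1}{4} + \frac{1}{2}\int_1^\infty \frac{dz}{z}
     = \frac{1}{4} + \frac{1}{2}\,\lim_{b\to\infty} \ln b
     = \infty .
```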
187
00:10:19,060 --> 00:10:22,700
And now, since we are still in
this example, let me ask
188
00:10:22,700 --> 00:10:25,680
another question.
189
00:10:25,680 --> 00:10:30,110
Can we reason on the average?
Would it be true that
190
00:10:30,110 --> 00:10:31,960
the expected value of Z --
191
00:10:31,960 --> 00:10:36,710
remember that Z is the ratio
Y/X -- could it be that the
192
00:10:36,710 --> 00:10:39,850
expected value of Z
is this number?
193
00:10:39,850 --> 00:10:43,380
194
00:10:43,380 --> 00:10:48,460
Or could it be that it's
equal to this number?
195
00:10:48,460 --> 00:10:53,670
196
00:10:53,670 --> 00:10:57,345
Or could it be that it's
none of the above?
197
00:10:57,345 --> 00:11:01,140
198
00:11:01,140 --> 00:11:06,295
OK, so how many people think
this is correct?
199
00:11:06,295 --> 00:11:12,500
200
00:11:12,500 --> 00:11:14,130
Small number.
201
00:11:14,130 --> 00:11:15,625
How many people think
this is correct?
202
00:11:15,625 --> 00:11:18,290
203
00:11:18,290 --> 00:11:21,480
Slightly bigger, but still
a small number.
204
00:11:21,480 --> 00:11:24,660
And how many people think
this is correct?
205
00:11:24,660 --> 00:11:26,090
OK, that's--
206
00:11:26,090 --> 00:11:28,890
this one wins the vote.
207
00:11:28,890 --> 00:11:32,570
OK, let's see.
208
00:11:32,570 --> 00:11:37,360
This one is not correct, just
because there's no reason it
209
00:11:37,360 --> 00:11:39,100
should be correct.
210
00:11:39,100 --> 00:11:44,420
So, in general, you cannot
reason on the average.
211
00:11:44,420 --> 00:11:48,460
The expected value of a function
is not the same as
212
00:11:48,460 --> 00:11:50,950
the same function of the
expected values.
213
00:11:50,950 --> 00:11:53,740
This is only true if you're
dealing with linear functions
214
00:11:53,740 --> 00:11:54,950
of random variables.
215
00:11:54,950 --> 00:11:56,340
So this is not--
216
00:11:56,340 --> 00:11:59,400
this turns out to
not be correct.
217
00:11:59,400 --> 00:12:00,790
How about this one?
218
00:12:00,790 --> 00:12:05,820
Well, X and Y are independent,
by assumption.
219
00:12:05,820 --> 00:12:10,910
So 1/X and Y are also
independent.
220
00:12:10,910 --> 00:12:14,000
221
00:12:14,000 --> 00:12:14,710
Why is this?
222
00:12:14,710 --> 00:12:17,150
Independence means that one
random variable does not
223
00:12:17,150 --> 00:12:19,670
convey any information
about the other.
224
00:12:19,670 --> 00:12:24,100
So Y doesn't give you any
information about X. So Y
225
00:12:24,100 --> 00:12:27,970
doesn't give you any information
about 1/X. Or to
226
00:12:27,970 --> 00:12:30,140
put it differently, if two
random variables are
227
00:12:30,140 --> 00:12:36,170
independent, functions of each
one of those random variables
228
00:12:36,170 --> 00:12:37,990
are also independent.
229
00:12:37,990 --> 00:12:41,360
If X is independent from
Y, then g(X) is
230
00:12:41,360 --> 00:12:43,700
independent of h(Y).
231
00:12:43,700 --> 00:12:45,280
So this applies to this case.
232
00:12:45,280 --> 00:12:47,780
These two random variables
are independent.
233
00:12:47,780 --> 00:12:50,350
And since they are independent,
this means that
234
00:12:50,350 --> 00:12:55,070
the expected value of their
product is equal to the
235
00:12:55,070 --> 00:12:57,670
product of the expected
values.
236
00:12:57,670 --> 00:13:02,950
So this relation actually
is true.
237
00:13:02,950 --> 00:13:05,632
And therefore, this
is not true.
238
00:13:05,632 --> 00:13:06,882
OK.
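The comparison can be checked exactly on a small discrete stand-in. This is a sketch under assumptions that are not in the lecture: X and Y are independent and each uniform on the values 1, 2, 3 (chosen so that every expectation is finite, unlike the uniform case above, where the expected value of 1/X is infinite), and the helper name `expectation` is hypothetical. It also assumes that the two candidate answers on the board were E[Y]/E[X] and E[Y] times E[1/X], which is what the spoken explanation suggests.

```python
from fractions import Fraction
from itertools import product

values = [1, 2, 3]               # assumed support of both X and Y
p = Fraction(1, len(values))     # each value is equally likely


def expectation(f):
    """E[f(X, Y)] for independent X and Y, each uniform on `values`."""
    return sum(f(x, y) * p * p for x, y in product(values, values))


e_ratio = expectation(lambda x, y: Fraction(y, x))   # E[Y/X]
e_y = expectation(lambda x, y: Fraction(y))          # E[Y]
e_x = expectation(lambda x, y: Fraction(x))          # E[X]
e_inv_x = expectation(lambda x, y: Fraction(1, x))   # E[1/X]

if __name__ == "__main__":
    print("E[Y/X]        =", e_ratio)
    print("E[Y] * E[1/X] =", e_y * e_inv_x)
    print("E[Y] / E[X]   =", e_y / e_x)
```

In this stand-in the product rule reproduces E[Y/X] exactly, while the ratio of the expected values gives a different number.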
239
00:13:06,882 --> 00:13:14,690
240
00:13:14,690 --> 00:13:17,630
Now, let's move on.
241
00:13:17,630 --> 00:13:22,420
We have this general procedure
of finding the derived
242
00:13:22,420 --> 00:13:26,770
distribution by going through
the cumulative.
243
00:13:26,770 --> 00:13:30,000
Are there some cases where
we can have a shortcut?
244
00:13:30,000 --> 00:13:34,040
Turns out that there is a
special case or a special
245
00:13:34,040 --> 00:13:38,340
structure in which we can get
directly from densities to
246
00:13:38,340 --> 00:13:42,240
densities using directly
just a formula.
247
00:13:42,240 --> 00:13:44,590
And in that case, we don't
have to go through the
248
00:13:44,590 --> 00:13:45,750
cumulative.
249
00:13:45,750 --> 00:13:48,580
And this case is also
interesting, because it gives
250
00:13:48,580 --> 00:13:52,810
us some insight about how one
density changes to a different
251
00:13:52,810 --> 00:13:56,430
density and what affects the
shape of those densities.
252
00:13:56,430 --> 00:14:00,430
So the case where things are easy
is when the transformation
253
00:14:00,430 --> 00:14:03,300
from one random variable to
the other is a strictly
254
00:14:03,300 --> 00:14:04,660
monotonic one.
255
00:14:04,660 --> 00:14:10,630
So there's a one-to-one relation
between x's and y's.
256
00:14:10,630 --> 00:14:14,790
Here we can reason directly in
terms of densities by thinking
257
00:14:14,790 --> 00:14:17,980
in terms of probabilities
of small intervals.
258
00:14:17,980 --> 00:14:23,370
So let's look at the small
interval on the x-axis, like
259
00:14:23,370 --> 00:14:26,540
this one, when X ranges from--
260
00:14:26,540 --> 00:14:30,390
where capital X ranges
from a small x to a
261
00:14:30,390 --> 00:14:31,760
small x plus delta.
262
00:14:31,760 --> 00:14:36,480
So this is a small interval
of length delta.
263
00:14:36,480 --> 00:14:40,060
Whenever X happens to fall in
this interval, the random
264
00:14:40,060 --> 00:14:42,720
variable Y is going
to fall in a
265
00:14:42,720 --> 00:14:45,800
corresponding interval up there.
266
00:14:45,800 --> 00:14:50,840
So up there we have a
corresponding interval.
267
00:14:50,840 --> 00:14:55,890
And these two intervals, the
red and the blue interval--
268
00:14:55,890 --> 00:14:57,670
this is the blue interval.
269
00:14:57,670 --> 00:15:01,120
And that's the red interval.
270
00:15:01,120 --> 00:15:05,180
These two intervals should have
the same probability.
271
00:15:05,180 --> 00:15:08,120
They're exactly the
same event.
272
00:15:08,120 --> 00:15:13,530
When X falls here, g(X) happens
to fall in there.
273
00:15:13,530 --> 00:15:16,610
So we can sort of say that the
probability of this little
274
00:15:16,610 --> 00:15:18,270
interval is the same
as the probability
275
00:15:18,270 --> 00:15:20,050
of that little interval.
276
00:15:20,050 --> 00:15:22,870
And we know that probabilities
of little intervals have
277
00:15:22,870 --> 00:15:25,560
something to do with
densities.
278
00:15:25,560 --> 00:15:28,260
So what is the probability
of this little interval?
279
00:15:28,260 --> 00:15:32,490
It's the density of the random
variable X, at this point,
280
00:15:32,490 --> 00:15:35,750
times the length of
the interval.
281
00:15:35,750 --> 00:15:38,990
How about the probability
of that interval?
282
00:15:38,990 --> 00:15:45,070
It's going to be the density of
the random variable Y times
283
00:15:45,070 --> 00:15:48,180
the length of that
little interval.
284
00:15:48,180 --> 00:15:50,310
Now, this interval
has length delta.
285
00:15:50,310 --> 00:15:51,760
Does that mean that
this interval
286
00:15:51,760 --> 00:15:53,710
also has length delta?
287
00:15:53,710 --> 00:15:55,440
Well, not necessarily.
288
00:15:55,440 --> 00:15:58,040
The length of this interval has
something to do with the
289
00:15:58,040 --> 00:16:01,830
slope of your function g.
290
00:16:01,830 --> 00:16:05,650
So slope is dy by dx.
291
00:16:05,650 --> 00:16:09,700
Is how much-- the slope tells
you how big is the y interval
292
00:16:09,700 --> 00:16:13,820
when you take an interval
x of a certain length.
293
00:16:13,820 --> 00:16:17,180
So the slope is what multiplies
the length of this
294
00:16:17,180 --> 00:16:20,430
interval to give you the length
of that interval.
295
00:16:20,430 --> 00:16:25,150
So the length of this interval
is delta times the slope of
296
00:16:25,150 --> 00:16:26,400
your function.
297
00:16:26,400 --> 00:16:28,870
298
00:16:28,870 --> 00:16:35,400
So the length of the interval
is delta times the slope of
299
00:16:35,400 --> 00:16:37,450
the function, approximately.
300
00:16:37,450 --> 00:16:41,320
So the probability of this
interval is going to be the
301
00:16:41,320 --> 00:16:46,790
density of Y times the length
of the interval that we are
302
00:16:46,790 --> 00:16:47,940
considering.
303
00:16:47,940 --> 00:16:52,280
So this gives us a relation
between the density of X,
304
00:16:52,280 --> 00:16:57,080
evaluated at this point, to the
density of Y, evaluated at
305
00:16:57,080 --> 00:16:58,140
that point.
306
00:16:58,140 --> 00:17:00,350
The two densities are
closely related.
307
00:17:00,350 --> 00:17:05,130
If these x's are very likely
to occur, then this is big,
308
00:17:05,130 --> 00:17:08,359
which means that that density
will also be big.
309
00:17:08,359 --> 00:17:11,550
If these x's are very likely to
occur, then those y's are
310
00:17:11,550 --> 00:17:13,530
also very likely to occur.
311
00:17:13,530 --> 00:17:16,109
But there's also another
factor that comes in.
312
00:17:16,109 --> 00:17:18,660
And that's the slope
of the function at
313
00:17:18,660 --> 00:17:21,109
this particular point.
314
00:17:21,109 --> 00:17:24,500
So we have this relation between
the two densities.
315
00:17:24,500 --> 00:17:28,130
Now, in interpreting this
equation, you need to make
316
00:17:28,130 --> 00:17:30,900
sure what's the relation between
the two variables.
317
00:17:30,900 --> 00:17:34,670
I have both little x's
and little y's.
318
00:17:34,670 --> 00:17:39,330
Well, this formula is true for
an (x,y) pair that is
319
00:17:39,330 --> 00:17:42,300
related according to this
particular function.
320
00:17:42,300 --> 00:17:48,000
So if I fix an x and consider
the corresponding y, then the
321
00:17:48,000 --> 00:17:52,480
densities at those x's and
corresponding y's will be
322
00:17:52,480 --> 00:17:54,420
related by that formula.
323
00:17:54,420 --> 00:17:57,650
Now, in the end, you want to
come up with a formula that
324
00:17:57,650 --> 00:18:01,520
just gives you the density
of Y as a function of y.
325
00:18:01,520 --> 00:18:03,110
And that means that you need to
326
00:18:03,110 --> 00:18:06,040
eliminate x from the picture.
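In symbols, the small-interval argument for a strictly increasing g reads as follows. The notation f_X, f_Y, and g inverse is an assumption; the lecture points at the board instead of naming these.

```latex
f_X(x)\,\delta \;\approx\; f_Y(y)\,\delta\,\frac{dg}{dx}(x), \qquad y = g(x),
\qquad\Longrightarrow\qquad
f_Y(y) \;=\; \frac{f_X(x)}{\dfrac{dg}{dx}(x)}\Bigg|_{x = g^{-1}(y)} .
```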
327
00:18:06,040 --> 00:18:11,140
So let's see how that would
go in an example.
328
00:18:11,140 --> 00:18:17,640
So suppose that we're dealing
with the function y equal to x
329
00:18:17,640 --> 00:18:21,930
cubed, in which case our
function, g(x), is the
330
00:18:21,930 --> 00:18:23,180
function x cubed.
331
00:18:23,180 --> 00:18:26,090
332
00:18:26,090 --> 00:18:31,980
And if x cubed is equal to a
little y, If we have a pair of
333
00:18:31,980 --> 00:18:38,350
x's and y's that are related
this way, then this means that
334
00:18:38,350 --> 00:18:41,600
x is going to be the
cube root of y.
335
00:18:41,600 --> 00:18:46,550
So this is the formula that
takes us back from y's to x's.
336
00:18:46,550 --> 00:18:52,940
This is the direct function from
x, how to construct y.
337
00:18:52,940 --> 00:18:55,470
This is essentially the inverse
function that tells
338
00:18:55,470 --> 00:18:59,460
us, from a given y what is
the corresponding x.
339
00:18:59,460 --> 00:19:04,650
Now, if we write this formula,
it tells us that the density
340
00:19:04,650 --> 00:19:08,270
at the particular x is going
to be the density at the
341
00:19:08,270 --> 00:19:12,390
corresponding y times the slope
of the function at the
342
00:19:12,390 --> 00:19:14,770
particular x that we
are considering.
343
00:19:14,770 --> 00:19:17,150
The slope of the function
is 3x squared.
344
00:19:17,150 --> 00:19:20,870
345
00:19:20,870 --> 00:19:26,590
Now, we want to end up with a
formula for the density of Y.
346
00:19:26,590 --> 00:19:29,510
So I'm going to take this
factor, send it
347
00:19:29,510 --> 00:19:31,410
to the other side.
348
00:19:31,410 --> 00:19:35,300
But since I want it to be a
function of y, I want to
349
00:19:35,300 --> 00:19:36,920
eliminate the x's.
350
00:19:36,920 --> 00:19:39,590
And I'm going to eliminate
the x's using this
351
00:19:39,590 --> 00:19:41,290
correspondence here.
352
00:19:41,290 --> 00:19:44,440
So I'm going to get
the density of X
353
00:19:44,440 --> 00:19:47,830
evaluated at y to the 1/3.
354
00:19:47,830 --> 00:19:50,404
And then this factor in the
denominator, it's 1/(3y to the
355
00:19:50,404 --> 00:19:51,654
power 2/3).
356
00:19:51,654 --> 00:19:55,710
357
00:19:55,710 --> 00:19:59,540
So we end up finally with the
formula for the density of the
358
00:19:59,540 --> 00:20:02,900
random variable Y.
359
00:20:02,900 --> 00:20:06,900
And this is the same answer that
you would get if you go
360
00:20:06,900 --> 00:20:10,030
through this exercise using the
cumulative distribution
361
00:20:10,030 --> 00:20:11,540
function method.
362
00:20:11,540 --> 00:20:13,160
You end up getting
the same answer.
363
00:20:13,160 --> 00:20:15,205
But here we sort of
get it directly.
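The agreement between the two methods can be checked numerically. This sketch assumes, purely for illustration, that X is uniform on the unit interval (the lecture leaves the density of X general), and the function names are hypothetical. Under that assumption the cumulative of Y is y to the 1/3, and its numerical derivative should match the density formula.

```python
def density_x(x):
    """Assumed density of X: uniform on (0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0


def density_y(y):
    """Density of Y = X**3 from the formula: f_X(y**(1/3)) / (3 * y**(2/3)), for y > 0."""
    x = y ** (1.0 / 3.0)                  # inverse function: the x that maps to this y
    return density_x(x) / (3.0 * y ** (2.0 / 3.0))


def cdf_y(y):
    """Cumulative route for uniform X: P(X**3 <= y) = y**(1/3) for y in (0, 1)."""
    return y ** (1.0 / 3.0)


def numerical_derivative(f, y, h=1e-6):
    """Central-difference approximation to f'(y)."""
    return (f(y + h) - f(y - h)) / (2.0 * h)


if __name__ == "__main__":
    for y in (0.1, 0.5, 0.9):
        print(y, density_y(y), numerical_derivative(cdf_y, y))
```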
364
00:20:15,205 --> 00:20:19,700
365
00:20:19,700 --> 00:20:24,570
Just to get a little more
insight as to why
366
00:20:24,570 --> 00:20:25,830
the slope comes in--
367
00:20:25,830 --> 00:20:29,960
368
00:20:29,960 --> 00:20:35,070
suppose that we have a function
like this one.
369
00:20:35,070 --> 00:20:38,020
370
00:20:38,020 --> 00:20:45,110
So the function is sort of flat,
then moves quickly, and
371
00:20:45,110 --> 00:20:49,160
then becomes flat again.
372
00:20:49,160 --> 00:20:50,720
What should be --
373
00:20:50,720 --> 00:20:55,140
and suppose that X has some kind
of reasonable density,
374
00:20:55,140 --> 00:20:57,180
some kind of flat density.
375
00:20:57,180 --> 00:21:01,640
Suppose that X is a pretty
uniform random variable.
376
00:21:01,640 --> 00:21:04,770
What's going to happen to
the random variable Y?
377
00:21:04,770 --> 00:21:06,920
What kind of distribution
should it have?
378
00:21:06,920 --> 00:21:14,670
379
00:21:14,670 --> 00:21:19,220
What are the typical values
of the random variable Y?
380
00:21:19,220 --> 00:21:26,960
Either x falls here, and y is
a very small number, or--
381
00:21:26,960 --> 00:21:30,100
let's take that number here
to be -- let's say 2 --
382
00:21:30,100 --> 00:21:37,290
or x falls in this range, and
y takes a value close to 2.
383
00:21:37,290 --> 00:21:40,210
And there's a small chance that
x's will be somewhere in
384
00:21:40,210 --> 00:21:44,350
the middle, in which case y
takes intermediate values.
385
00:21:44,350 --> 00:21:46,390
So what kind of shape do
you expect for the
386
00:21:46,390 --> 00:21:48,060
distribution of Y?
387
00:21:48,060 --> 00:21:51,900
There's going to be a fair
amount of probability that Y
388
00:21:51,900 --> 00:21:55,510
takes values close to 0.
389
00:21:55,510 --> 00:21:58,480
There's a small probability
that Y takes
390
00:21:58,480 --> 00:22:00,130
intermediate values.
391
00:22:00,130 --> 00:22:03,870
That corresponds to the case
where x falls in here.
392
00:22:03,870 --> 00:22:05,480
That's not a lot
of probability.
393
00:22:05,480 --> 00:22:11,280
So the probability that Y takes
values between 0 and 2,
394
00:22:11,280 --> 00:22:12,760
that's kind of small.
395
00:22:12,760 --> 00:22:16,860
But then there's a lot of x's
that produce y's that are
396
00:22:16,860 --> 00:22:18,410
close to 2.
397
00:22:18,410 --> 00:22:22,110
So there's a significant
probability that Y would take
398
00:22:22,110 --> 00:22:25,470
values that are close to 2.
399
00:22:25,470 --> 00:22:26,370
So you--
400
00:22:26,370 --> 00:22:31,300
the density of Y would have
a shape of this kind.
401
00:22:31,300 --> 00:22:35,280
By looking at this picture, you
can tell that it's most
402
00:22:35,280 --> 00:22:39,630
likely that either x will fall
here or x will fall there.
403
00:22:39,630 --> 00:22:44,110
So the g(x) is most likely
to be close to 0 or
404
00:22:44,110 --> 00:22:46,290
to be close to 2.
405
00:22:46,290 --> 00:22:51,420
So since y is most likely to be
close to 0 or close to 2, most
406
00:22:51,420 --> 00:22:53,850
of the probability
of y is here.
407
00:22:53,850 --> 00:22:54,570
And there's a small
408
00:22:54,570 --> 00:22:56,810
probability of being in between.
409
00:22:56,810 --> 00:23:02,330
Notice that the y's that get a
lot of probability are those
410
00:23:02,330 --> 00:23:07,490
y's associated with flat
regions of your g function.
411
00:23:07,490 --> 00:23:11,510
When the g function is flat,
that gives you big densities
412
00:23:11,510 --> 00:23:12,500
for Y.
413
00:23:12,500 --> 00:23:16,480
So the density of Y is inversely
proportional to the
414
00:23:16,480 --> 00:23:18,350
slope of the function.
415
00:23:18,350 --> 00:23:20,140
And that's what you
get from here.
416
00:23:20,140 --> 00:23:22,780
The density of Y is--
417
00:23:22,780 --> 00:23:25,430
send that term to the other
side-- is inversely
418
00:23:25,430 --> 00:23:28,550
proportional to the slope of
the function that you're
419
00:23:28,550 --> 00:23:29,800
dealing with.
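The flat, steep, flat picture can be imitated with a made-up function. Everything specific here is an assumption: the function `g` (a shifted hyperbolic tangent rising from about 0 to about 2), its steepness, the cutoffs 0.2 and 1.8, and the evenly spaced grid that stands in for a uniform X on the unit interval.

```python
import math


def g(x, steepness=40.0):
    """Hypothetical function: flat near 0, a quick rise around x = 0.5, flat near 2."""
    return 1.0 + math.tanh(steepness * (x - 0.5))


def probability_shares(n=100_000):
    """Share of evenly spaced x's in (0, 1) whose y = g(x) lands low, in between, or high."""
    low = mid = high = 0
    for i in range(n):
        x = (i + 0.5) / n        # grid point standing in for a uniform X
        y = g(x)
        if y < 0.2:
            low += 1
        elif y > 1.8:
            high += 1
        else:
            mid += 1
    return low / n, mid / n, high / n


if __name__ == "__main__":
    print(probability_shares())
```

Most of the x's sit on the two flat stretches, so most of the probability of Y piles up near 0 and near 2, with only a small share in between.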
420
00:23:29,800 --> 00:23:32,755
421
00:23:32,755 --> 00:23:36,730
OK, so this formula works nicely
for the case where the
422
00:23:36,730 --> 00:23:38,470
function is one-to-one.
423
00:23:38,470 --> 00:23:42,610
So we can have a unique
association between x's and
424
00:23:42,610 --> 00:23:47,500
y's and through an inverse
function, from y's to x's.
425
00:23:47,500 --> 00:23:50,030
It works for the monotonically
increasing case.
426
00:23:50,030 --> 00:23:53,660
It also works for the
monotonically decreasing case.
427
00:23:53,660 --> 00:23:56,120
In the monotonically decreasing
case, the only
428
00:23:56,120 --> 00:23:59,050
change that you need to make is to
take the absolute value of
429
00:23:59,050 --> 00:24:01,275
the slope, instead of
the slope itself.
430
00:24:01,275 --> 00:24:16,340
431
00:24:16,340 --> 00:24:22,480
OK, now, here's another example
or a special case.
432
00:24:22,480 --> 00:24:27,520
Let's talk about the most
interesting case that involves
433
00:24:27,520 --> 00:24:29,740
a function of two random
variables.
434
00:24:29,740 --> 00:24:34,460
And this is the case where we
have two independent, random
435
00:24:34,460 --> 00:24:38,190
variables, and we want to
find the distribution of
436
00:24:38,190 --> 00:24:40,150
the sum of the two.
437
00:24:40,150 --> 00:24:42,300
We're really interested in
the continuous case.
438
00:24:42,300 --> 00:24:45,540
But as a warm-up, it's useful
to look at the discrete case
439
00:24:45,540 --> 00:24:48,510
first of discrete random
variables.
440
00:24:48,510 --> 00:24:52,740
Let's say we want to find the
probability that the sum of X
441
00:24:52,740 --> 00:24:55,890
and Y is equal to a
particular number.
442
00:24:55,890 --> 00:24:58,570
And to illustrate this,
let's take that number
443
00:24:58,570 --> 00:25:00,010
to be equal to 3.
444
00:25:00,010 --> 00:25:02,380
What's the probability that
the sum of the two random
445
00:25:02,380 --> 00:25:04,700
variables is equal to 3?
446
00:25:04,700 --> 00:25:07,640
To find the probability that
the sum is equal to 3, you
447
00:25:07,640 --> 00:25:11,570
consider all possible ways that
you can get the sum of 3.
448
00:25:11,570 --> 00:25:14,760
And the different ways are the
points in this picture.
449
00:25:14,760 --> 00:25:18,100
And they correspond to a line
that goes this way.
450
00:25:18,100 --> 00:25:21,620
So the probability that the
sum is equal to a certain
451
00:25:21,620 --> 00:25:24,550
number is the probability
that --
452
00:25:24,550 --> 00:25:26,340
is the sum of the
probabilities of
453
00:25:26,340 --> 00:25:27,950
all of those points.
454
00:25:27,950 --> 00:25:31,190
What is a typical point
in this picture?
455
00:25:31,190 --> 00:25:34,470
In a typical point, the
random variable X
456
00:25:34,470 --> 00:25:36,490
takes a certain value.
457
00:25:36,490 --> 00:25:41,480
And Y takes the value that's
needed so that the sum is
458
00:25:41,480 --> 00:25:47,650
equal to W. Any combination of
an x with a w minus x, any
459
00:25:47,650 --> 00:25:51,110
such combination gives
you a sum of w.
460
00:25:51,110 --> 00:25:54,950
So the probability that the sum
is w is the sum over all
461
00:25:54,950 --> 00:25:56,120
possible x's.
462
00:25:56,120 --> 00:25:59,420
That's over all these points of
the probability that we get
463
00:25:59,420 --> 00:26:01,050
a certain x.
464
00:26:01,050 --> 00:26:05,630
Let's say x equals 2 times the
corresponding probability that
465
00:26:05,630 --> 00:26:08,570
random variable Y takes
the value 1.
466
00:26:08,570 --> 00:26:11,710
And why am I multiplying
probabilities here?
467
00:26:11,710 --> 00:26:14,070
That's where we use the
assumption that the two random
468
00:26:14,070 --> 00:26:16,170
variables are independent.
469
00:26:16,170 --> 00:26:19,610
So the probability that X takes
a certain value and Y
470
00:26:19,610 --> 00:26:22,870
takes the complementary value,
that probability is the
471
00:26:22,870 --> 00:26:26,120
product of two probabilities
because of independence.
472
00:26:26,120 --> 00:26:29,890
And when we write that into our
usual PMF notation, it's a
473
00:26:29,890 --> 00:26:31,510
formula of this kind.
474
00:26:31,510 --> 00:26:35,500
So this formula is called
the convolution formula.
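Written out in the usual PMF notation, the formula being referred to has the following shape (the symbols p_X, p_Y, p_W for the three PMFs are assumed notation):

```latex
% Discrete convolution: PMF of W = X + Y for independent X and Y.
\[
  p_W(w) \;=\; \mathbf{P}(X + Y = w)
         \;=\; \sum_{x} \mathbf{P}(X = x)\,\mathbf{P}(Y = w - x)
         \;=\; \sum_{x} p_X(x)\, p_Y(w - x).
\]
```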
475
00:26:35,500 --> 00:26:42,030
It's an operation that takes
one PMF and another PMF--
476
00:26:42,030 --> 00:26:44,580
we're given the PMF's
of X and Y --
477
00:26:44,580 --> 00:26:47,640
and produces a new PMF.
478
00:26:47,640 --> 00:26:50,350
So think of this formula as
giving you a transformation.
479
00:26:50,350 --> 00:26:53,570
You take two PMF's, you do
something with them, and you
480
00:26:53,570 --> 00:26:56,190
obtain a new PMF.
481
00:26:56,190 --> 00:26:59,710
This procedure, what this
formula does is --
482
00:26:59,710 --> 00:27:04,490
nicely illustrated sort
of mechanically.
483
00:27:04,490 --> 00:27:08,640
So let me show you a picture
here and illustrate how the
484
00:27:08,640 --> 00:27:13,310
mechanics go, in general.
485
00:27:13,310 --> 00:27:16,790
So you don't have these slides,
but let's just reason
486
00:27:16,790 --> 00:27:18,040
through it.
487
00:27:18,040 --> 00:27:22,220
So suppose that you are
given the PMF of X,
488
00:27:22,220 --> 00:27:23,110
and it has this shape.
489
00:27:23,110 --> 00:27:26,000
You're given the PMF of
Y. It has this shape.
490
00:27:26,000 --> 00:27:28,790
And somehow we are going
to do this calculation.
491
00:27:28,790 --> 00:27:31,940
Now, we need to do this
calculation for every value of
492
00:27:31,940 --> 00:27:37,190
W, in order to get the PMF of
W. Let's start by doing the
493
00:27:37,190 --> 00:27:40,200
calculation just for one case.
494
00:27:40,200 --> 00:27:43,870
Suppose that w is equal to 0, in
which case we need to find
495
00:27:43,870 --> 00:27:46,835
the sum over x of Px(x) times Py(-x).
496
00:27:46,835 --> 00:27:50,790
497
00:27:50,790 --> 00:27:53,780
How do you do this calculation
graphically?
498
00:27:53,780 --> 00:27:59,550
It involves the PMF of X. But it
involves the PMF of Y, with
499
00:27:59,550 --> 00:28:02,120
the argument reversed.
500
00:28:02,120 --> 00:28:04,770
So how do we plot this?
501
00:28:04,770 --> 00:28:07,940
Well, in order to reverse the
argument, what you need is to
502
00:28:07,940 --> 00:28:11,230
take this PMF and flip it.
503
00:28:11,230 --> 00:28:13,850
So that's where it's handy
to have a pair of
504
00:28:13,850 --> 00:28:16,110
scissors with you.
505
00:28:16,110 --> 00:28:20,800
So you cut this down.
506
00:28:20,800 --> 00:28:26,360
And so now you take the PMF
of the random variable Y
507
00:28:26,360 --> 00:28:28,620
and just flip it.
508
00:28:28,620 --> 00:28:33,830
So what you see here is this
function where the argument is
509
00:28:33,830 --> 00:28:35,020
being reversed.
510
00:28:35,020 --> 00:28:36,260
And then what do we do?
511
00:28:36,260 --> 00:28:39,080
We cross-multiply
the two plots.
512
00:28:39,080 --> 00:28:41,070
Any entry here gets multiplied
with the
513
00:28:41,070 --> 00:28:43,110
corresponding entry there.
514
00:28:43,110 --> 00:28:46,550
And we consider all those
products and add them up.
515
00:28:46,550 --> 00:28:50,000
In this particular case, the
flipped PMF doesn't have any
516
00:28:50,000 --> 00:28:53,850
overlap with the PMF of X. So
we're going to get an answer
517
00:28:53,850 --> 00:28:56,190
that's equal to 0.
518
00:28:56,190 --> 00:29:03,320
So for w equal to 0, Pw is
going to be equal to 0, in
519
00:29:03,320 --> 00:29:05,210
this particular plot.
520
00:29:05,210 --> 00:29:08,760
Now if we have a different
value of w --
521
00:29:08,760 --> 00:29:09,930
oops.
522
00:29:09,930 --> 00:29:14,670
If we have a different value
of the argument w, then we
523
00:29:14,670 --> 00:29:20,530
have here the PMF of Y that's
flipped and shifted by an
524
00:29:20,530 --> 00:29:22,350
amount of w.
525
00:29:22,350 --> 00:29:25,930
So the correct picture of what
you do is to take this and
526
00:29:25,930 --> 00:29:30,250
displace it by a certain
amount of w.
527
00:29:30,250 --> 00:29:33,430
So here, how much
did I shift it?
528
00:29:33,430 --> 00:29:40,640
I shifted it until one
falls just below 4.
529
00:29:40,640 --> 00:29:44,832
So I have shifted by a
total amount of 5.
530
00:29:44,832 --> 00:29:50,680
So 0 falls under 5, whereas
0 initially was under 0.
531
00:29:50,680 --> 00:29:53,170
So I'm shifting it by 5 units.
532
00:29:53,170 --> 00:29:56,320
And I'm now going to
cross-multiply and add.
533
00:29:56,320 --> 00:29:58,220
Does this give us
the correct--
534
00:29:58,220 --> 00:30:00,180
does it do the correct thing?
535
00:30:00,180 --> 00:30:03,700
Yes, because a typical term will
be the probability that
536
00:30:03,700 --> 00:30:06,920
this random variable is 3 times
the probability that
537
00:30:06,920 --> 00:30:09,090
this random variable is 2.
538
00:30:09,090 --> 00:30:12,500
That's a particular way that
you can get a sum of 5.
539
00:30:12,500 --> 00:30:16,100
If you see here, the way that
things are aligned, it gives
540
00:30:16,100 --> 00:30:19,500
you all the different ways that
you can get the sum of 5.
541
00:30:19,500 --> 00:30:23,756
You can get the sum of 5 by
having 1 + 4, or 2 + 3, or 3 +
542
00:30:23,756 --> 00:30:26,140
2, or 4 + 1.
543
00:30:26,140 --> 00:30:28,280
You need to add the
probabilities of all those
544
00:30:28,280 --> 00:30:29,340
combinations.
545
00:30:29,340 --> 00:30:32,180
So you take this times that.
546
00:30:32,180 --> 00:30:34,230
That's one product term.
547
00:30:34,230 --> 00:30:38,220
Then this times 0,
this times that.
548
00:30:38,220 --> 00:30:39,480
And so 1--
549
00:30:39,480 --> 00:30:40,560
you cross--
550
00:30:40,560 --> 00:30:44,710
you find all the products of the
corresponding terms, and
551
00:30:44,710 --> 00:30:46,190
you add them together.
552
00:30:46,190 --> 00:30:50,140
So it's a kind of handy
mechanical procedure for doing
553
00:30:50,140 --> 00:30:53,520
this calculation, especially
when the PMF's are given to
554
00:30:53,520 --> 00:30:55,850
you in terms of a picture.
555
00:30:55,850 --> 00:31:00,000
So the summary of these
mechanics, which is just what we
556
00:31:00,000 --> 00:31:03,530
did, is that you put the PMF's
on top of each other.
557
00:31:03,530 --> 00:31:06,260
You take the PMF of
Y. You flip it.
558
00:31:06,260 --> 00:31:10,160
And for any particular w that
you're interested in, you take
559
00:31:10,160 --> 00:31:14,070
this flipped PMF and shift
it by an amount of w.
560
00:31:14,070 --> 00:31:17,120
Given this particular shift for
a particular value of w,
561
00:31:17,120 --> 00:31:21,020
you cross-multiply terms and
then accumulate them or add
562
00:31:21,020 --> 00:31:23,280
them together.
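The flip, shift, cross-multiply, and add procedure can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own example: the two PMFs below (each uniform on 1 through 4) are hypothetical, since the PMFs on the slide are not given in the transcript, and the function names are made up here.

```python
from collections import defaultdict


def convolve_pmfs(p_x, p_y):
    """Return the PMF of W = X + Y for independent X and Y.

    p_x and p_y are dicts mapping integer values to probabilities.
    """
    p_w = defaultdict(float)
    for x, px in p_x.items():
        for y, py in p_y.items():
            # Independence: P(X = x and Y = y) = p_x(x) * p_y(y).
            p_w[x + y] += px * py
    return dict(p_w)


def pmf_at(p_x, p_y, w):
    """Flip-and-shift view for one w: sum over x of p_x(x) * p_y(w - x)."""
    # p_y.get(w - x, 0.0) reads the flipped PMF of Y, shifted by w.
    return sum(px * p_y.get(w - x, 0.0) for x, px in p_x.items())


# Hypothetical PMFs: each variable is uniform on {1, 2, 3, 4}.
p_x = {k: 0.25 for k in range(1, 5)}
p_y = {k: 0.25 for k in range(1, 5)}

p_w = convolve_pmfs(p_x, p_y)
print(pmf_at(p_x, p_y, 0))  # no overlap after flipping, so this is 0
print(pmf_at(p_x, p_y, 5))  # the four ways 1+4, 2+3, 3+2, 4+1
```

With these assumed PMFs, w = 0 gives no overlap, and w = 5 collects the four aligned product terms, mirroring the two cases discussed above.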
563
00:31:23,280 --> 00:31:26,620
What would you expect to happen
in the continuous case?
564
00:31:26,620 --> 00:31:28,600
Well, the story is familiar.
565
00:31:28,600 --> 00:31:32,520
In the continuous case, pretty
much, almost always things
566
00:31:32,520 --> 00:31:34,730
work out the same way,
except that we
567
00:31:34,730 --> 00:31:37,260
replace PMF's by PDF's.
568
00:31:37,260 --> 00:31:42,930
And we replace sums
by integrals.
569
00:31:42,930 --> 00:31:47,430
So there shouldn't be any
surprise here that you get a
570
00:31:47,430 --> 00:31:49,680
formula of this kind.
571
00:31:49,680 --> 00:31:54,030
The density of W can be obtained
from the density of X
572
00:31:54,030 --> 00:31:58,740
and the density of Y by
calculating this integral.
573
00:31:58,740 --> 00:32:03,440
Essentially, what this integral
does is it fixes a
574
00:32:03,440 --> 00:32:05,130
particular w of interest.
575
00:32:05,130 --> 00:32:07,870
We're interested in the
probability that the random
576
00:32:07,870 --> 00:32:13,160
variable, capital W, takes a
value equal to little w or
577
00:32:13,160 --> 00:32:14,820
values close to it.
578
00:32:14,820 --> 00:32:17,240
So this corresponds to the
event, which is this
579
00:32:17,240 --> 00:32:21,120
particular line on the
two-dimensional space.
580
00:32:21,120 --> 00:32:24,140
So we need to sort
of add
581
00:32:24,140 --> 00:32:25,990
probabilities along that line.
582
00:32:25,990 --> 00:32:28,620
But since the setting is
continuous, we will not add
583
00:32:28,620 --> 00:32:29,220
probabilities.
584
00:32:29,220 --> 00:32:31,120
We're going to integrate.
585
00:32:31,120 --> 00:32:35,430
And for any typical point in
this picture, the probability
586
00:32:35,430 --> 00:32:39,330
of obtaining an outcome in this
neighborhood is the--
587
00:32:39,330 --> 00:32:43,460
has something to do with the
density of that particular x
588
00:32:43,460 --> 00:32:47,190
and the density of the
particular y that would
589
00:32:47,190 --> 00:32:50,750
complement x, in order
to form a sum of w.
590
00:32:50,750 --> 00:32:55,640
So this integral that we have
here is really an integral
591
00:32:55,640 --> 00:32:59,382
over this particular line.
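For reference, the continuous formula has the same shape as the discrete one, with PDFs in place of PMFs and an integral in place of the sum (f_X, f_Y, f_W are assumed notation for the three densities):

```latex
% Continuous convolution: PDF of W = X + Y for independent X and Y.
\[
  f_W(w) \;=\; \int_{-\infty}^{\infty} f_X(x)\, f_Y(w - x)\, dx .
\]
```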
592
00:32:59,382 --> 00:33:02,440
OK, so I'm going to
skip the formal
593
00:33:02,440 --> 00:33:04,010
derivation of this result.
594
00:33:04,010 --> 00:33:06,830
There's a couple of derivations
in the text.
595
00:33:06,830 --> 00:33:10,330
And the one which is outlined
here is yet a third
596
00:33:10,330 --> 00:33:11,500
derivation.
597
00:33:11,500 --> 00:33:14,300
But the easiest way to make
sense of this formula is to
598
00:33:14,300 --> 00:33:18,270
consider what happens in
the discrete case.
599
00:33:18,270 --> 00:33:22,010
So for the rest of the lecture
we're going to consider a few
600
00:33:22,010 --> 00:33:27,280
extra, more miscellaneous
topics, a few remarks, and a
601
00:33:27,280 --> 00:33:29,100
few more definitions.
602
00:33:29,100 --> 00:33:31,740
So let's change--
603
00:33:31,740 --> 00:33:35,325
flip a page and consider
the next mini topic.
604
00:33:35,325 --> 00:33:38,670
605
00:33:38,670 --> 00:33:41,370
There's not going to be anything
deep here, but just
606
00:33:41,370 --> 00:33:44,550
something that's worth
being familiar with.
607
00:33:44,550 --> 00:33:47,570
If you have two independent,
normal random variables with
608
00:33:47,570 --> 00:33:50,920
certain parameters, the question
is, what does the
609
00:33:50,920 --> 00:33:55,160
joint PDF look like?
610
00:33:55,160 --> 00:33:58,970
So if they're independent, by
definition the joint PDF is
611
00:33:58,970 --> 00:34:01,760
the product of the
individual PDF's.
612
00:34:01,760 --> 00:34:04,840
And the PDF's, each one
of them involves an
613
00:34:04,840 --> 00:34:07,030
exponential of something.
614
00:34:07,030 --> 00:34:11,290
The product of two exponentials
is the
615
00:34:11,290 --> 00:34:13,389
exponential of the sum.
616
00:34:13,389 --> 00:34:15,400
So you just add the exponents.
617
00:34:15,400 --> 00:34:18,320
So this is the formula
for the joint PDF.
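Multiplying the two normal densities and adding the exponents gives a joint PDF of the following form. The parameter names (means mu_x, mu_y and standard deviations sigma_x, sigma_y) are assumed notation here.

```latex
% Joint PDF of independent normals X ~ N(mu_x, sigma_x^2), Y ~ N(mu_y, sigma_y^2).
\[
  f_{X,Y}(x, y) \;=\; f_X(x)\, f_Y(y)
  \;=\; \frac{1}{2\pi \sigma_x \sigma_y}
        \exp\!\left\{ -\frac{(x - \mu_x)^2}{2\sigma_x^2}
                      -\frac{(y - \mu_y)^2}{2\sigma_y^2} \right\}.
\]
```

The joint PDF is constant exactly where the exponent is constant, which is what makes the quantity inside the braces the one to look at when asking about contours.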
618
00:34:18,320 --> 00:34:20,790
Now, you look at that formula
and you ask, what
619
00:34:20,790 --> 00:34:23,969
does it look like?
620
00:34:23,969 --> 00:34:27,780
OK, you can understand a
function of two variables by
621
00:34:27,780 --> 00:34:30,530
thinking about the contours
of this function.
622
00:34:30,530 --> 00:34:32,850
Look at the points at
which the function
623
00:34:32,850 --> 00:34:34,389
takes a constant value.
624
00:34:34,389 --> 00:34:34,920
Where is it?
625
00:34:34,920 --> 00:34:37,139
When is it constant?
626
00:34:37,139 --> 00:34:40,150
What's the shape of
the set of points
627
00:34:40,150 --> 00:34:42,239
where this is a constant?
628
00:34:42,239 --> 00:34:46,610
So consider all x's and y's for
which this expression here
629
00:34:46,610 --> 00:34:51,179
is a constant, that this
expression here is a constant.
630
00:34:51,179 --> 00:34:53,250
What kind of shape is this?
631
00:34:53,250 --> 00:34:56,170
This is an ellipse.
632
00:34:56,170 --> 00:35:01,880
And it's an ellipse that's
centered at--
633
00:35:01,880 --> 00:35:06,530
it's centered at mu x, mu y.
634
00:35:06,530 --> 00:35:09,760
These are the means of the
two random variables.
635
00:35:09,760 --> 00:35:13,760
If those sigmas were equal,
that ellipse would
636
00:35:13,760 --> 00:35:16,970
be actually a circle.
637
00:35:16,970 --> 00:35:20,210
And you would get contours
of this kind.
638
00:35:20,210 --> 00:35:23,870
But if, on the other hand, the
sigmas are different, you're
639
00:35:23,870 --> 00:35:29,900
going to get an ellipse that
has contours of this kind.
640
00:35:29,900 --> 00:35:32,930
So if my contours are
of this kind, that
641
00:35:32,930 --> 00:35:35,820
corresponds to what?
642
00:35:35,820 --> 00:35:39,395
Sigma x being bigger than
sigma y or vice versa.
643
00:35:39,395 --> 00:35:42,760
644
00:35:42,760 --> 00:35:47,970
OK, contours of this kind
basically tell you that X is
645
00:35:47,970 --> 00:35:53,610
more likely to be spread out
than Y. So the range of
646
00:35:53,610 --> 00:35:55,600
possible x's is bigger.
647
00:35:55,600 --> 00:36:04,610
And X out here is as likely
as a Y up there.
648
00:36:04,610 --> 00:36:08,920
So big x's have roughly the same
probability as certain
649
00:36:08,920 --> 00:36:10,260
smaller y's.
650
00:36:10,260 --> 00:36:14,520
So in a picture of this kind,
the variance of X is going to
651
00:36:14,520 --> 00:36:17,710
be bigger than the
variance of Y.
652
00:36:17,710 --> 00:36:20,890
So depending on how these
variances compare with each
653
00:36:20,890 --> 00:36:22,520
other, that's going
to determine the
654
00:36:22,520 --> 00:36:24,180
shape of the ellipse.
655
00:36:24,180 --> 00:36:27,470
If the variance of Y were
bigger, then your ellipse
656
00:36:27,470 --> 00:36:28,720
would be the other way.
657
00:36:28,720 --> 00:36:32,400
It would be elongated in
the other dimension.
658
00:36:32,400 --> 00:36:34,150
Just visualize it
a little more.
659
00:36:34,150 --> 00:36:37,120
Let me throw at you a
particular picture.
660
00:36:37,120 --> 00:36:39,820
This is one--
661
00:36:39,820 --> 00:36:43,830
this is a picture of
one special case.
662
00:36:43,830 --> 00:36:46,600
Here, I think, the variances
are equal.
663
00:36:46,600 --> 00:36:48,340
That's the kind of shape
that you get.
664
00:36:48,340 --> 00:36:51,330
It looks like a two-dimensional
bell.
665
00:36:51,330 --> 00:36:54,700
So remember, for normal random
variables, for a single
666
00:36:54,700 --> 00:36:57,960
random variable you get a
PDF that's bell shaped.
667
00:36:57,960 --> 00:37:00,360
That's just a bell-shaped
curve.
668
00:37:00,360 --> 00:37:04,740
In the two-dimensional case, we
get the joint PDF, which is
669
00:37:04,740 --> 00:37:05,960
bell shaped again.
670
00:37:05,960 --> 00:37:09,750
And now it looks more like a
real bell, the way it would be
671
00:37:09,750 --> 00:37:12,550
laid out in ordinary space.
672
00:37:12,550 --> 00:37:15,060
And if you look at the contours
of this function, the
673
00:37:15,060 --> 00:37:18,950
places where the function is
equal, the typical contour
674
00:37:18,950 --> 00:37:21,270
would have this shape here.
675
00:37:21,270 --> 00:37:23,090
And it would be an ellipse.
676
00:37:23,090 --> 00:37:28,320
And in this case, actually, it
will be more like a circle.
677
00:37:28,320 --> 00:37:32,650
So these would be the different
contours for
678
00:37:32,650 --> 00:37:33,900
different--
679
00:37:33,900 --> 00:37:36,820
680
00:37:36,820 --> 00:37:38,520
so the contours are
places where the
681
00:37:38,520 --> 00:37:40,550
joint PDF is a constant.
682
00:37:40,550 --> 00:37:43,200
When you change the value of
that constant, you get the
683
00:37:43,200 --> 00:37:44,620
different contours.
684
00:37:44,620 --> 00:37:50,790
And the PDF is, of course,
centered around the means of
685
00:37:50,790 --> 00:37:52,350
the two random variables.
686
00:37:52,350 --> 00:37:55,970
So in this particular case,
since the bell is centered
687
00:37:55,970 --> 00:38:00,270
around the (0, 0) vector, this
is a plot of a bivariate
688
00:38:00,270 --> 00:38:02,245
normal with 0 means.
689
00:38:02,245 --> 00:38:05,370
690
00:38:05,370 --> 00:38:08,680
OK, there's--
691
00:38:08,680 --> 00:38:14,990
bivariate normals are also
interesting when your bell is
692
00:38:14,990 --> 00:38:17,280
oriented differently in space.
693
00:38:17,280 --> 00:38:21,090
We talked about ellipses that
are this way, ellipses that
694
00:38:21,090 --> 00:38:22,170
are this way.
695
00:38:22,170 --> 00:38:26,800
You could imagine also bells
that you take them, you squash
696
00:38:26,800 --> 00:38:30,200
them somehow, so that they
become narrow in one dimension
697
00:38:30,200 --> 00:38:32,700
and then maybe rotate them.
698
00:38:32,700 --> 00:38:33,720
So if you had--
699
00:38:33,720 --> 00:38:37,120
we're not going to go into this
subject, but if you had a
700
00:38:37,120 --> 00:38:46,580
joint PDF whose contours were
like this, what would that
701
00:38:46,580 --> 00:38:47,450
correspond to?
702
00:38:47,450 --> 00:38:51,220
Would your x's and y's
be independent?
703
00:38:51,220 --> 00:38:51,720
No.
704
00:38:51,720 --> 00:38:54,840
This would indicate that there's
a relation between the
705
00:38:54,840 --> 00:38:55,870
x's and the y's.
706
00:38:55,870 --> 00:38:59,280
That is, when you have bigger
x's, you would expect to also
707
00:38:59,280 --> 00:39:01,370
get bigger y's.
708
00:39:01,370 --> 00:39:04,530
So it would be a case of
dependent normals.
709
00:39:04,530 --> 00:39:09,100
And we're coming back to
this point in a second.
710
00:39:09,100 --> 00:39:13,710
Before we get to that point in
a second that has to do with
711
00:39:13,710 --> 00:39:16,840
the dependencies between the
random variables, let's just
712
00:39:16,840 --> 00:39:18,480
do another digression.
713
00:39:18,480 --> 00:39:23,700
If we have our two normals that
are independent, as we
714
00:39:23,700 --> 00:39:28,570
discussed here, we can go and
apply the formula, the
715
00:39:28,570 --> 00:39:31,770
convolution formula that we
were just discussing.
716
00:39:31,770 --> 00:39:35,250
Suppose you want to find the
distribution of the sum of
717
00:39:35,250 --> 00:39:37,160
these two independent normals.
718
00:39:37,160 --> 00:39:39,100
How do you do this?
719
00:39:39,100 --> 00:39:42,730
There is a closed-form formula
for the density of the sum,
720
00:39:42,730 --> 00:39:44,120
which is this one.
721
00:39:44,120 --> 00:39:47,530
We do have formulas for the
density of X and the density
722
00:39:47,530 --> 00:39:50,840
of Y, because both of them are
normal, random variables.
723
00:39:50,840 --> 00:39:54,820
So you need to calculate this
particular integral here.
724
00:39:54,820 --> 00:39:57,300
It's an integral with
respect to x.
725
00:39:57,300 --> 00:39:59,770
And you have to calculate
this integral for any
726
00:39:59,770 --> 00:40:03,190
given value of w.
727
00:40:03,190 --> 00:40:05,660
So this is an exercise
in integration,
728
00:40:05,660 --> 00:40:07,230
which is not very difficult.
729
00:40:07,230 --> 00:40:10,460
And it turns out that after you
do everything, you end up
730
00:40:10,460 --> 00:40:12,680
with an answer that
has this form.
731
00:40:12,680 --> 00:40:14,340
And you look at that,
and you suddenly
732
00:40:14,340 --> 00:40:16,930
recognize, oh, this is normal.
733
00:40:16,930 --> 00:40:20,760
And the conclusion from this
exercise, once it's done, is
734
00:40:20,760 --> 00:40:23,150
that the sum of two independent
normal random
735
00:40:23,150 --> 00:40:26,370
variables is also normal.
736
00:40:26,370 --> 00:40:31,900
Now, the mean of W is, of
course, going to be equal to
737
00:40:31,900 --> 00:40:35,710
the sum of the means of X and
Y. In this case, in this
738
00:40:35,710 --> 00:40:37,660
formula I took the
means to be 0.
739
00:40:37,660 --> 00:40:40,850
So the mean of W is also
going to be 0.
740
00:40:40,850 --> 00:40:43,650
In the more general case, the
mean of W is going to be just
741
00:40:43,650 --> 00:40:45,560
the sum of the two means.
742
00:40:45,560 --> 00:40:49,680
The variance of W is always the
sum of the variances of X
743
00:40:49,680 --> 00:40:53,350
and Y, since we have independent
random variables.
744
00:40:53,350 --> 00:40:55,700
So there's no surprise here.
745
00:40:55,700 --> 00:40:59,990
The main surprise in this
calculation is this fact here,
746
00:40:59,990 --> 00:41:02,720
that the sum of independent
normal random
747
00:41:02,720 --> 00:41:04,170
variables is normal.
748
00:41:04,170 --> 00:41:07,210
I had mentioned this fact
in a previous lecture.
749
00:41:07,210 --> 00:41:12,070
Here what we're doing is to
basically outline the argument
750
00:41:12,070 --> 00:41:14,640
that justifies this
particular fact.
751
00:41:14,640 --> 00:41:17,540
It's an exercise in integration,
where you realize
752
00:41:17,540 --> 00:41:22,680
that when you convolve two
normal curves, you also get
753
00:41:22,680 --> 00:41:26,850
back a normal one once more.
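The claim can be checked numerically without doing the integral by hand. The sketch below approximates the convolution integral with a midpoint Riemann sum and compares it against a zero-mean normal density whose variance is the sum of the two variances. The standard deviations (1 and 2), the integration range, and the function names are illustrative assumptions.

```python
import math


def normal_pdf(x, sigma):
    """Density of a zero-mean normal with standard deviation sigma."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))


def convolve_at(w, sigma_x, sigma_y, lo=-40.0, hi=40.0, n=8000):
    """Approximate the integral of f_X(x) * f_Y(w - x) dx by a midpoint sum."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += normal_pdf(x, sigma_x) * normal_pdf(w - x, sigma_y)
    return total * dx


sigma_x, sigma_y = 1.0, 2.0  # illustrative values, not from the lecture
# Variances add for independent random variables.
sigma_w = math.sqrt(sigma_x ** 2 + sigma_y ** 2)

# Largest gap between the numerical convolution and the normal density.
max_error = max(
    abs(convolve_at(w, sigma_x, sigma_y) - normal_pdf(w, sigma_w))
    for w in (-3.0, 0.0, 1.5, 4.0)
)
print(max_error)
```

A tiny max_error at every test point is consistent with the sum being normal with mean 0 and variance equal to the sum of the variances. It is a sanity check, not a proof.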
754
00:41:26,850 --> 00:41:30,230
So now, let's return to the
comment I was making here,
755
00:41:30,230 --> 00:41:33,160
that if you have a contour plot
that has, let's say, a
756
00:41:33,160 --> 00:41:36,640
shape of this kind, this
indicates some kind of
757
00:41:36,640 --> 00:41:39,620
dependence between your
two random variables.
758
00:41:39,620 --> 00:41:43,470
So instead of a contour plot,
let me throw in here a
759
00:41:43,470 --> 00:41:44,990
scatter diagram.
760
00:41:44,990 --> 00:41:47,580
What does this scatter
diagram correspond to?
761
00:41:47,580 --> 00:41:50,650
Suppose you have a discrete
distribution, and each one of
762
00:41:50,650 --> 00:41:54,760
the points in this diagram
has positive probability.
763
00:41:54,760 --> 00:41:58,600
When you look at this diagram,
what would you say?
764
00:41:58,600 --> 00:42:06,890
I would say that when
y is big then x
765
00:42:06,890 --> 00:42:09,400
also tends to be larger.
766
00:42:09,400 --> 00:42:15,580
So bigger x's are sort of
associated with bigger y's in
767
00:42:15,580 --> 00:42:18,160
some average, statistical
sense.
768
00:42:18,160 --> 00:42:21,410
Whereas, if you have a picture
of this kind, it tells you an
769
00:42:21,410 --> 00:42:26,980
association that the positive
y's tend to be associated with
770
00:42:26,980 --> 00:42:30,090
negative x's most of the time.
771
00:42:30,090 --> 00:42:34,410
Negative y's tend to be
associated mostly with
772
00:42:34,410 --> 00:42:35,660
positive x's.
773
00:42:35,660 --> 00:42:38,510
774
00:42:38,510 --> 00:42:42,210
So here there's a relation
that when one variable is
775
00:42:42,210 --> 00:42:45,790
large, the other one is also
expected to be large.
776
00:42:45,790 --> 00:42:48,800
Here there's a relation
of the opposite kind.
777
00:42:48,800 --> 00:42:50,910
How can we capture
this relation
778
00:42:50,910 --> 00:42:52,310
between two random variables?
779
00:42:52,310 --> 00:42:56,090
The way we capture it is by
defining this concept called
780
00:42:56,090 --> 00:43:03,090
the covariance, that looks at
the relation of was X bigger
781
00:43:03,090 --> 00:43:04,160
than usual?
782
00:43:04,160 --> 00:43:06,520
That's the question, whether
this is positive.
783
00:43:06,520 --> 00:43:10,110
And how does this relate to the
answer-- to the question,
784
00:43:10,110 --> 00:43:13,160
was Y bigger than usual?
785
00:43:13,160 --> 00:43:16,290
We're asking-- by calculating
this quantity, we're sort of
786
00:43:16,290 --> 00:43:19,820
asking the question, is there a
systematic relation between
787
00:43:19,820 --> 00:43:25,790
having a big X with
having a big Y?
788
00:43:25,790 --> 00:43:28,590
OK, to understand more
precisely what this does,
789
00:43:28,590 --> 00:43:32,290
let's suppose that the random
variables have 0 means, so that
790
00:43:32,290 --> 00:43:33,610
we get rid of this--
791
00:43:33,610 --> 00:43:35,290
get rid of some clutter.
792
00:43:35,290 --> 00:43:38,940
So the covariance is defined
just as this product.
793
00:43:38,940 --> 00:43:40,760
What does this do?
794
00:43:40,760 --> 00:43:45,120
If positive x's tend to go
together with positive y's,
795
00:43:45,120 --> 00:43:49,080
and negative x's tend to go
together with negative y's,
796
00:43:49,080 --> 00:43:51,860
this product will mostly
be positive.
797
00:43:51,860 --> 00:43:54,880
And the covariance will
end up being positive.
798
00:43:54,880 --> 00:43:59,090
In particular, if you sit down
with a scatter diagram and
799
00:43:59,090 --> 00:44:01,220
you do the calculations,
you'll find that the
800
00:44:01,220 --> 00:44:05,480
covariance of X and Y in this
diagram would be positive,
801
00:44:05,480 --> 00:44:09,680
because here, most of the time,
X times Y is positive.
802
00:44:09,680 --> 00:44:12,130
There's going to be a few
negative terms, but there are
803
00:44:12,130 --> 00:44:14,300
fewer than the positive ones.
804
00:44:14,300 --> 00:44:17,000
So this is a case of a
positive covariance.
805
00:44:17,000 --> 00:44:19,570
It indicates a positive relation
between the two
806
00:44:19,570 --> 00:44:20,450
random variables.
807
00:44:20,450 --> 00:44:24,700
When one is big, the other
also tends to be big.
808
00:44:24,700 --> 00:44:26,320
This is the opposite
situation.
809
00:44:26,320 --> 00:44:28,070
Here, when one variable--
810
00:44:28,070 --> 00:44:31,000
here, most of the action happens
in this quadrant and
811
00:44:31,000 --> 00:44:35,530
that quadrant, which means that
X times Y, most of the
812
00:44:35,530 --> 00:44:37,150
time, is negative.
813
00:44:37,150 --> 00:44:39,130
You get a few positive
contributions,
814
00:44:39,130 --> 00:44:40,430
but there are few.
815
00:44:40,430 --> 00:44:44,430
When you add things up, the
negative terms dominate.
816
00:44:44,430 --> 00:44:46,510
And in this case we
have covariance of
817
00:44:46,510 --> 00:44:49,560
X and Y being negative.
818
00:44:49,560 --> 00:44:53,080
So a positive covariance
indicates a sort of systematic
819
00:44:53,080 --> 00:44:56,280
relation, that there's a
positive association between
820
00:44:56,280 --> 00:44:57,370
the two random variables.
821
00:44:57,370 --> 00:45:00,280
When one is large, the other
also tends to be large.
822
00:45:00,280 --> 00:45:03,060
Negative covariance is
sort of the opposite.
823
00:45:03,060 --> 00:45:05,690
When one tends to be
large, the other
824
00:45:05,690 --> 00:45:09,920
variable tends to be small.
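To make the sign argument concrete, here is a small sketch that computes the covariance E[(X - E[X])(Y - E[Y])] for two made-up scatter diagrams in which every point is equally likely. The point sets are hypothetical and are not the ones on the slides.

```python
def covariance(points):
    """Covariance of (X, Y) when each listed point is equally likely."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Average of the products of deviations from the means.
    return sum((x - mean_x) * (y - mean_y) for x, y in points) / n


# Hypothetical diagram: mostly "big x with big y", plus one point that goes
# the other way and contributes a negative term.
upward = [(-2, -1), (-1, -2), (0, 0), (1, 2), (2, 1), (1, -1)]
# Mirror image: mostly "big x with small y".
downward = [(x, -y) for x, y in upward]

print(covariance(upward))    # positive: the positive products dominate
print(covariance(downward))  # negative: the negative products dominate
```

A few terms of the opposite sign do not change the overall sign, as long as the other terms dominate when everything is added up.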
825
00:45:09,920 --> 00:45:15,050
OK, so what else is there to
say about the covariance?
826
00:45:15,050 --> 00:45:18,280
One observation to make
is the following.
827
00:45:18,280 --> 00:45:21,105
What's the covariance
of X with X itself?
828
00:45:21,105 --> 00:45:23,940
829
00:45:23,940 --> 00:45:28,220
If you plug in X here, you see
that what we have is expected
830
00:45:28,220 --> 00:45:32,130
value of (X minus expected
value of X) squared.
831
00:45:32,130 --> 00:45:33,790
And that's just the
definition of the
832
00:45:33,790 --> 00:45:36,170
variance of a random variable.
833
00:45:36,170 --> 00:45:41,180
So that's one fact
to keep in mind.
834
00:45:41,180 --> 00:45:44,620
We had a shortcut formula for
calculating variances.
835
00:45:44,620 --> 00:45:46,900
There's a similar shortcut
formula for calculating
836
00:45:46,900 --> 00:45:48,380
covariances.
837
00:45:48,380 --> 00:45:51,440
In particular, we can calculate
covariances in this
838
00:45:51,440 --> 00:45:52,720
particular way.
839
00:45:52,720 --> 00:45:56,500
That's just the convenient way
of doing it whenever you need
840
00:45:56,500 --> 00:45:57,940
to calculate it.
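For reference, the two facts just mentioned can be written as follows. The shortcut follows from expanding the product inside the expectation; the notation cov and var is assumed.

```latex
% Covariance of X with itself is the variance.
\[
  \operatorname{cov}(X, X) \;=\; \mathbf{E}\big[(X - \mathbf{E}[X])^2\big] \;=\; \operatorname{var}(X).
\]
% Shortcut formula, obtained by expanding (X - E[X])(Y - E[Y]).
\[
  \operatorname{cov}(X, Y)
  \;=\; \mathbf{E}\big[(X - \mathbf{E}[X])(Y - \mathbf{E}[Y])\big]
  \;=\; \mathbf{E}[XY] - \mathbf{E}[X]\,\mathbf{E}[Y].
\]
```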
841
00:45:57,940 --> 00:46:02,690
And finally, covariances are
very useful when you want to
842
00:46:02,690 --> 00:46:06,420
calculate the variance of a
sum of random variables.
843
00:46:06,420 --> 00:46:08,940
844
00:46:08,940 --> 00:46:12,610
We know that if two random
variables are independent, the
845
00:46:12,610 --> 00:46:16,270
variance of the sum is the
sum of the variances.
846
00:46:16,270 --> 00:46:20,310
When the random variables are
dependent, this is no longer
847
00:46:20,310 --> 00:46:23,100
true, and we need to supplement
the formula a
848
00:46:23,100 --> 00:46:24,200
little bit.
849
00:46:24,200 --> 00:46:26,240
And there's a typo on
the slides that you
850
00:46:26,240 --> 00:46:27,680
have in your hands.
851
00:46:27,680 --> 00:46:32,870
That term of 2 shouldn't
be there.
852
00:46:32,870 --> 00:46:36,870
And let's see where that
formula comes from.
853
00:46:36,870 --> 00:46:41,550
854
00:46:41,550 --> 00:46:44,290
Let's suppose that our
random variables are
855
00:46:44,290 --> 00:46:46,530
independent of --
856
00:46:46,530 --> 00:46:47,530
not independent --
857
00:46:47,530 --> 00:46:49,395
our random variables
have 0 means.
858
00:46:49,395 --> 00:46:55,680
859
00:46:55,680 --> 00:46:57,990
And we want to calculate
the variance.
860
00:46:57,990 --> 00:47:00,900
So the variance is going
to be expected value of
861
00:47:00,900 --> 00:47:04,150
(X1 plus ... plus Xn) squared.
862
00:47:04,150 --> 00:47:07,140
What you do is you expand
the square.
863
00:47:07,140 --> 00:47:12,670
And you get the expected value
of the sum of the Xi squared.
864
00:47:12,670 --> 00:47:14,780
And then you get all
the cross terms.
865
00:47:14,780 --> 00:47:23,070
866
00:47:23,070 --> 00:47:24,510
OK.
867
00:47:24,510 --> 00:47:29,420
And so now, here, let's
assume for simplicity
868
00:47:29,420 --> 00:47:30,880
that we have 0 means.
869
00:47:30,880 --> 00:47:34,200
The expected value of this is
the sum of the expected values
870
00:47:34,200 --> 00:47:36,300
of the X squared terms.
871
00:47:36,300 --> 00:47:38,430
And that gives us
the variance.
872
00:47:38,430 --> 00:47:41,560
And then we have all the
possible cross terms.
873
00:47:41,560 --> 00:47:44,220
And each one of the possible
cross terms is the expected
874
00:47:44,220 --> 00:47:46,620
value of Xi times Xj.
875
00:47:46,620 --> 00:47:49,250
This is just the covariance.
876
00:47:49,250 --> 00:47:52,730
So if you can calculate all
the variances and the
877
00:47:52,730 --> 00:47:56,210
covariances, then you're able to
calculate also the variance
878
00:47:56,210 --> 00:47:58,540
of a sum of random variables.
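[Editor's sketch, not part of the spoken lecture: the expansion described above, written out under the lecture's simplifying assumption that every X_i has zero mean. The index notation is an assumption of this sketch. The cross-term sum runs over ordered pairs (i, j) with i different from j, so each unordered pair appears twice.]

```latex
\begin{align*}
\operatorname{var}(X_1 + \cdots + X_n)
  &= E\big[(X_1 + \cdots + X_n)^2\big] \\
  &= \sum_{i=1}^{n} E\big[X_i^2\big]
     + \sum_{(i,j):\, i \neq j} E\big[X_i X_j\big] \\
  &= \sum_{i=1}^{n} \operatorname{var}(X_i)
     + \sum_{(i,j):\, i \neq j} \operatorname{cov}(X_i, X_j)
\end{align*}
```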
879
00:47:58,540 --> 00:48:03,260
Now, if two random variables are
independent, then you look
880
00:48:03,260 --> 00:48:04,800
at this expression.
881
00:48:04,800 --> 00:48:07,700
Because of independence,
expected value of the product
882
00:48:07,700 --> 00:48:10,990
is going to be the product
of the expected values.
883
00:48:10,990 --> 00:48:14,080
And the expected value
of just this term is
884
00:48:14,080 --> 00:48:15,910
always equal to 0.
885
00:48:15,910 --> 00:48:19,790
Your expected deviation
from the mean is just 0.
886
00:48:19,790 --> 00:48:22,650
So the covariance will
turn out to be 0.
887
00:48:22,650 --> 00:48:25,110
So independent random
variables lead to 0
888
00:48:25,110 --> 00:48:28,320
covariances, although the
converse is not
889
00:48:28,320 --> 00:48:30,160
necessarily true.
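[Editor's sketch, not part of the spoken lecture: a small exact computation showing that zero covariance need not imply independence. The specific example, X uniform on {-1, 0, 1} with Y = X squared, is an assumption chosen for illustration and does not come from the lecture.]

```python
from fractions import Fraction

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2.
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}


def expectation(g):
    """Return E[g(X)] under pmf_x, using exact rational arithmetic."""
    return sum(p * g(x) for x, p in pmf_x.items())


mean_x = expectation(lambda x: x)        # E[X]
mean_y = expectation(lambda x: x * x)    # E[Y] = E[X^2]

# cov(X, Y) = E[(X - E[X]) (Y - E[Y])], with Y = X^2
cov_xy = expectation(lambda x: (x - mean_x) * (x * x - mean_y))

# Dependence check: Y = 0 exactly when X = 0, so P(X=0, Y=0) = P(X=0),
# and P(Y=0) = P(X=0) as well.
p_joint = pmf_x[0]
p_product = pmf_x[0] * pmf_x[0]

print("cov(X, Y) =", cov_xy)
print("P(X=0, Y=0) =", p_joint, "but P(X=0) P(Y=0) =", p_product)
```

Here the covariance vanishes even though Y is completely determined by X, because the relation between them is not a linear one.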
890
00:48:30,160 --> 00:48:33,250
So covariances give you some
indication of the relation
891
00:48:33,250 --> 00:48:35,430
between two random variables.
892
00:48:35,430 --> 00:48:38,370
Something that's not so
convenient conceptually about
893
00:48:38,370 --> 00:48:41,440
covariances is that they
have the wrong units.
894
00:48:41,440 --> 00:48:43,290
That's the same comment
that we had
895
00:48:43,290 --> 00:48:45,520
made regarding variances.
896
00:48:45,520 --> 00:48:48,730
And with variances we got out
of that issue by considering
897
00:48:48,730 --> 00:48:52,540
the standard deviation, which
has the correct units.
898
00:48:52,540 --> 00:48:58,090
So with the same reasoning, we
want to have a concept that
899
00:48:58,090 --> 00:49:02,150
captures the relation between
two random variables and, in
900
00:49:02,150 --> 00:49:05,790
some sense, that doesn't have
to do with the units that
901
00:49:05,790 --> 00:49:07,050
we're dealing with.
902
00:49:07,050 --> 00:49:10,630
We want to have a dimensionless
quantity
903
00:49:10,630 --> 00:49:14,040
that tells us how strongly two
random variables are related
904
00:49:14,040 --> 00:49:16,010
to each other.
905
00:49:16,010 --> 00:49:21,180
So instead of considering the
covariance of just X with Y,
906
00:49:21,180 --> 00:49:24,860
we take our random variables
and standardize them by
907
00:49:24,860 --> 00:49:28,430
dividing them by their
individual standard deviations
908
00:49:28,430 --> 00:49:30,460
and take the expectation
of this.
909
00:49:30,460 --> 00:49:34,780
So what we end up doing is take the
covariance of X and Y, which
910
00:49:34,780 --> 00:49:39,160
has units that are the units of
X times the units of Y, and
911
00:49:39,160 --> 00:49:41,710
divide by the standard
deviations, so that we get a
912
00:49:41,710 --> 00:49:44,090
quantity that doesn't
have units.
913
00:49:44,090 --> 00:49:47,890
This quantity, we call it the
correlation coefficient.
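[Editor's sketch, not part of the spoken lecture: the definition just described, in symbols. The notation rho, sigma_X and sigma_Y is assumed here for the correlation coefficient and the two standard deviations, which are taken to be nonzero.]

```latex
\[
\rho(X, Y)
  = E\!\left[\frac{X - E[X]}{\sigma_X} \cdot \frac{Y - E[Y]}{\sigma_Y}\right]
  = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}
\]
```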
914
00:49:47,890 --> 00:49:51,060
And it's a very useful quantity,
a very useful
915
00:49:51,060 --> 00:49:53,610
measure of the strength
of association
916
00:49:53,610 --> 00:49:55,580
between two random variables.
917
00:49:55,580 --> 00:49:59,750
It's very informative, because
it falls always
918
00:49:59,750 --> 00:50:02,330
between -1 and +1.
919
00:50:02,330 --> 00:50:06,240
This is an algebraic exercise
that you're going to see in
920
00:50:06,240 --> 00:50:07,780
recitation.
921
00:50:07,780 --> 00:50:10,600
And the way that you interpret
it is as follows.
922
00:50:10,600 --> 00:50:13,360
If the two random variables
are independent, the
923
00:50:13,360 --> 00:50:15,390
covariance is going to be 0.
924
00:50:15,390 --> 00:50:18,170
The correlation coefficient
is going to be 0.
925
00:50:18,170 --> 00:50:23,340
So 0 correlation coefficient
basically indicates a lack of
926
00:50:23,340 --> 00:50:26,570
a systematic relation between
the two random variables.
927
00:50:26,570 --> 00:50:31,710
On the other hand, when rho is
large, either close to 1 or
928
00:50:31,710 --> 00:50:34,850
close to -1, this is an
indication of a strong
929
00:50:34,850 --> 00:50:37,660
association between the
two random variables.
930
00:50:37,660 --> 00:50:42,770
And the extreme case is when
rho takes an extreme value.
931
00:50:42,770 --> 00:50:46,300
When rho has a magnitude
equal to 1, it's as
932
00:50:46,300 --> 00:50:47,790
big as it can be.
933
00:50:47,790 --> 00:50:50,210
In that case, the two
random variables are
934
00:50:50,210 --> 00:50:53,630
very strongly related.
935
00:50:53,630 --> 00:50:54,650
How strongly?
936
00:50:54,650 --> 00:50:58,030
Well, if you know one random
variable, if you know the
937
00:50:58,030 --> 00:51:03,530
value of y, you can recover the
value of x and conversely.
938
00:51:03,530 --> 00:51:07,210
So the case of a complete
correlation is the case where
939
00:51:07,210 --> 00:51:11,300
one random variable is a linear
function of the other
940
00:51:11,300 --> 00:51:12,560
random variable.
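[Editor's sketch, not part of the spoken lecture: a numerical check that an exact linear relation gives a correlation coefficient of magnitude 1. The function name `correlation` and the sample values are hypothetical. The listed values are treated as equally likely outcomes, which is an assumption of this sketch.]

```python
import math


def correlation(xs, ys):
    """Correlation coefficient of paired values, each pair equally likely."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance: average product of the deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # Standard deviations of each variable.
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Y is an increasing linear function of X.
rho_up = correlation(xs, [2 * x + 1 for x in xs])

# Y is a decreasing linear function of X.
rho_down = correlation(xs, [-3 * x + 4 for x in xs])

print(rho_up, rho_down)
```

Up to floating-point rounding, the first value is +1 and the second is -1: the sign follows the slope of the line, and the magnitude does not depend on the slope or the intercept.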
941
00:51:12,560 --> 00:51:16,940
In terms of a scatter plot, this
would mean that there's a
942
00:51:16,940 --> 00:51:22,060
certain line and that the only
possible (x,y) pairs that can
943
00:51:22,060 --> 00:51:24,940
happen would lie on that line.
944
00:51:24,940 --> 00:51:28,920
So if all the possible (x,y)
pairs lie on this line, then
945
00:51:28,920 --> 00:51:32,340
you have this relation, and the
correlation coefficient is
946
00:51:32,340 --> 00:51:33,440
equal to 1.
947
00:51:33,440 --> 00:51:36,580
A case where the correlation
coefficient is close to 1
948
00:51:36,580 --> 00:51:40,480
would be a scatter plot like
this, where the x's and y's
949
00:51:40,480 --> 00:51:44,820
are quite strongly aligned with
each other, maybe not
950
00:51:44,820 --> 00:51:47,920
exactly, but fairly strongly.
951
00:51:47,920 --> 00:51:50,760
All right, so you're going to
hear a little more about
952
00:51:50,760 --> 00:51:52,710
correlation coefficients
and covariances
953
00:51:52,710 --> 00:51:53,960
in recitation tomorrow.
954
00:51:53,960 --> 00:51:54,670