1
00:00:02,880 --> 00:00:07,590
In the discrete case, we saw
that we could recover the PMF
2
00:00:07,590 --> 00:00:12,370
of X and the PMF of Y
from the joint PMF.
3
00:00:12,370 --> 00:00:15,580
Indeed, the joint PMF is
supposed to contain a complete
4
00:00:15,580 --> 00:00:18,740
probabilistic description of
the two random variables.
5
00:00:18,740 --> 00:00:22,070
It is their probability law, and
any quantity of interest
6
00:00:22,070 --> 00:00:25,280
can be computed if we
know the joint.
7
00:00:25,280 --> 00:00:28,120
Things are similar in the
continuous setting.
8
00:00:28,120 --> 00:00:30,730
You can easily guess
the formula through
9
00:00:30,730 --> 00:00:32,900
the standard recipe.
10
00:00:32,900 --> 00:00:41,550
Replace sums by integrals,
and replace PMFs by PDFs.
11
00:00:41,550 --> 00:00:46,230
But a proof of this formula
is actually instructive.
12
00:00:46,230 --> 00:00:51,550
So let us start by first
finding the CDF of X.
13
00:00:51,550 --> 00:00:55,640
The CDF of X is, by definition,
the probability
14
00:00:55,640 --> 00:00:59,470
that the random variable X takes
a value less than or
15
00:00:59,470 --> 00:01:03,120
equal to a certain
number little x.
16
00:01:03,120 --> 00:01:07,540
And this is the probability of
a particular set that we can
17
00:01:07,540 --> 00:01:10,450
visualize on the two
dimensional plane.
18
00:01:10,450 --> 00:01:15,510
If here is the value of little
x, then we're talking about
19
00:01:15,510 --> 00:01:22,020
the set of all pairs x, y, for
which the x component is less
20
00:01:22,020 --> 00:01:24,530
than or equal to a
certain number.
21
00:01:24,530 --> 00:01:29,220
So we need to integrate over
this two-dimensional set the
22
00:01:29,220 --> 00:01:31,700
joint density.
23
00:01:31,700 --> 00:01:37,640
So it will be a double integral
of the joint density
24
00:01:37,640 --> 00:01:40,940
over this particular
two-dimensional set.
25
00:01:40,940 --> 00:01:43,509
Now, since we've used the
symbol x here to mean
26
00:01:43,509 --> 00:01:46,700
something specific, let us use
different symbols for the
27
00:01:46,700 --> 00:01:51,280
dummy variables that we will
use in the integration.
28
00:01:51,280 --> 00:01:55,180
And we need to integrate with
respect to the two variables,
29
00:01:55,180 --> 00:01:59,380
let's say with respect to
t and with respect to s.
30
00:01:59,380 --> 00:02:03,320
The variable t can
be anything.
31
00:02:03,320 --> 00:02:07,190
So it ranges from minus
infinity to infinity.
32
00:02:07,190 --> 00:02:11,720
But the variable s, the first
argument, ranges from minus
33
00:02:11,720 --> 00:02:14,670
infinity up to this
point, which is x.
34
00:02:17,680 --> 00:02:22,530
Think of this double integral as
an integral with respect to
35
00:02:22,530 --> 00:02:27,380
the variable s of this
complicated function inside
36
00:02:27,380 --> 00:02:28,900
the brackets.
37
00:02:28,900 --> 00:02:35,130
Now, to find the density of
X, all we need to do is to
38
00:02:35,130 --> 00:02:45,450
differentiate the CDF of X. And
when we have an integral
39
00:02:45,450 --> 00:02:48,220
of this kind and we
differentiate with respect to
40
00:02:48,220 --> 00:02:51,560
the upper limit of the
integration, what we are left
41
00:02:51,560 --> 00:02:54,630
with is the integrand.
42
00:02:54,630 --> 00:02:59,020
That is this expression here.
43
00:02:59,020 --> 00:03:04,550
It is an integral with respect
to the second variable.
44
00:03:04,550 --> 00:03:07,910
And it's an integral over the
entire space, from minus
45
00:03:07,910 --> 00:03:09,765
infinity to plus infinity.
46
00:03:12,330 --> 00:03:14,360
Here is an example.
47
00:03:14,360 --> 00:03:18,060
The simplest kind of a joint
PDF is a PDF of that is
48
00:03:18,060 --> 00:03:26,670
constant on a certain set, S,
and is 0 outside that set.
49
00:03:26,670 --> 00:03:30,020
So the overall probability, one
unit of probability, is
50
00:03:30,020 --> 00:03:33,010
spread uniformly
over that set.
51
00:03:33,010 --> 00:03:36,680
Because the total volume under
the joint PDF must be equal to
52
00:03:36,680 --> 00:03:43,790
1, the height of the PDF must
be equal to 1 over the area.
53
00:03:43,790 --> 00:03:50,690
To calculate the probability of
a certain set A, we want to
54
00:03:50,690 --> 00:03:55,700
ask how much volume is sitting
on top of that set.
55
00:03:55,700 --> 00:04:00,310
And because in this case, the
PDF is constant, we need to
56
00:04:00,310 --> 00:04:04,060
take the height of the PDF
times the relevant area.
57
00:04:04,060 --> 00:04:05,760
What is the relevant area?
58
00:04:05,760 --> 00:04:10,570
Well, actually, the PDF is 0
outside the set S. So the
59
00:04:10,570 --> 00:04:14,510
relevant area is only this
part here, which is the
60
00:04:14,510 --> 00:04:19,050
intersection of the
two sets, S and A.
61
00:04:19,050 --> 00:04:24,810
So the total volume sitting on
top of this little set is
62
00:04:24,810 --> 00:04:29,860
going to be the base, the area
of the base, which is the area
63
00:04:29,860 --> 00:04:35,159
of A intersection S times
the height of the
64
00:04:35,159 --> 00:04:38,170
PDF at those places.
65
00:04:38,170 --> 00:04:44,020
Now, the height of the PDF is 1
over the area of S. So this
66
00:04:44,020 --> 00:04:49,510
is the formula for calculating
the probability of a certain
67
00:04:49,510 --> 00:04:57,590
set, A.
68
00:04:57,590 --> 00:05:00,530
Let's now look at a
specific example.
69
00:05:00,530 --> 00:05:06,050
Suppose that we have a uniform
PDF over this particular set,
70
00:05:06,050 --> 00:05:11,500
S. This set has an area
that is equal to 4.
71
00:05:11,500 --> 00:05:15,500
It consists of four units
rectangles arranged next to
72
00:05:15,500 --> 00:05:16,470
each other.
73
00:05:16,470 --> 00:05:20,490
So the height of the joint
PDF in this example
74
00:05:20,490 --> 00:05:22,140
is going to be 1/4.
75
00:05:24,700 --> 00:05:28,790
It is one 1/4 on that set, but
of course, it's going to be 0
76
00:05:28,790 --> 00:05:30,910
outside that set.
77
00:05:30,910 --> 00:05:36,460
We can now find the marginal
PDF at some particular x.
78
00:05:36,460 --> 00:05:39,490
So we can fix a particular
value of x,
79
00:05:39,490 --> 00:05:41,060
let's say this one.
80
00:05:41,060 --> 00:05:44,070
To find the value of the
marginal PDF, we need to
81
00:05:44,070 --> 00:05:49,550
integrate over y along
that particular line.
82
00:05:49,550 --> 00:05:53,380
And the integral is going to
have a contribution only on
83
00:05:53,380 --> 00:05:54,900
that segment.
84
00:05:54,900 --> 00:05:59,320
On that segment, the value
of the joint PDF is 1/4.
85
00:05:59,320 --> 00:06:02,600
And we're integrating over
an interval that
86
00:06:02,600 --> 00:06:04,650
has a length of one.
87
00:06:04,650 --> 00:06:08,230
So the integral is going
to be equal to 1/4.
88
00:06:08,230 --> 00:06:12,760
But if x is somewhere around
here, as we integrate over
89
00:06:12,760 --> 00:06:17,640
that line, we integrate the
value of 1/4, the value of the
90
00:06:17,640 --> 00:06:23,190
PDF, over an interval that
has a length equal to 3.
91
00:06:23,190 --> 00:06:27,780
And so the result turns
out to be 3/4.
92
00:06:27,780 --> 00:06:32,220
There's a similar calculation
for the marginal PDF of y.
93
00:06:32,220 --> 00:06:38,010
For any particular value of
little y, to find the marginal
94
00:06:38,010 --> 00:06:44,460
PDF, we integrate along this
line the joint PDF.
95
00:06:44,460 --> 00:06:47,659
The joint PDF is 0 out here.
96
00:06:47,659 --> 00:06:52,030
It's nonzero only on
that interval.
97
00:06:52,030 --> 00:06:55,690
And on that interval, it
has a value of 1/4.
98
00:06:55,690 --> 00:06:59,909
And the interval has a length of
1, so the integral is going
99
00:06:59,909 --> 00:07:02,190
to end up equal to 1/4.
100
00:07:02,190 --> 00:07:06,250
But if we were to take a line
somewhere here, we integrate
101
00:07:06,250 --> 00:07:10,460
the value of 1/4 over an
interval of length 2.
102
00:07:10,460 --> 00:07:13,060
And so the result
would be 1/2.
103
00:07:13,060 --> 00:07:18,880
So we have recovered from the
joint PDF the marginal PDF of
104
00:07:18,880 --> 00:07:22,730
X and also the marginal
PDF of Y.