1
00:00:00,500 --> 00:00:02,820
The following content is
provided under a Creative
2
00:00:02,820 --> 00:00:04,340
Commons license.
3
00:00:04,340 --> 00:00:06,670
Your support will help
MIT OpenCourseWare
4
00:00:06,670 --> 00:00:11,040
continue to offer high quality
educational resources for free.
5
00:00:11,040 --> 00:00:13,650
To make a donation or
view additional materials
6
00:00:13,650 --> 00:00:17,556
from hundreds of MIT courses,
visit MIT OpenCourseWare
7
00:00:17,556 --> 00:00:18,181
at ocw.mit.edu.
8
00:00:23,447 --> 00:00:25,280
PROFESSOR: Last time
we defined the expected
9
00:00:25,280 --> 00:00:27,030
value of a random variable.
10
00:00:27,030 --> 00:00:29,570
And we talked about a lot of
ways it could be computed.
11
00:00:29,570 --> 00:00:33,090
We proved all sorts of
equivalent definitions.
12
00:00:33,090 --> 00:00:35,730
Today, we're going to keep
talking about expectation.
13
00:00:35,730 --> 00:00:38,970
And we're going to start
with an example that
14
00:00:38,970 --> 00:00:42,030
talks about the expected
number of events
15
00:00:42,030 --> 00:00:43,770
that you expect to have occur.
16
00:00:43,770 --> 00:00:45,200
And it's a
generalization of what
17
00:00:45,200 --> 00:00:50,946
we did with Chinese appetizer
and hat check from last time.
18
00:00:50,946 --> 00:00:52,910
We're going to call
this theorem 1.
19
00:00:55,600 --> 00:01:06,360
If you have a
probability space, s,
20
00:01:06,360 --> 00:01:10,860
and you've got a
collection of n events,
21
00:01:10,860 --> 00:01:20,310
let's call them A1 through
A n, and these are just
22
00:01:20,310 --> 00:01:31,750
subsets of s, then the expected
number of events to occur,
23
00:01:31,750 --> 00:01:47,375
of these events, is simply
the sum of i equals 1 to n
24
00:01:47,375 --> 00:01:52,610
of the probability of
the i-th event occurring.
25
00:01:52,610 --> 00:01:54,380
So you just sum up
the probabilities
26
00:01:54,380 --> 00:01:56,250
that the events occur.
27
00:01:56,250 --> 00:01:58,400
And that tells you
the expected number
28
00:01:58,400 --> 00:02:00,185
of events that will occur.
29
00:02:00,185 --> 00:02:03,320
So a very simple formula.
30
00:02:03,320 --> 00:02:06,730
So for example, A i
might be the event
31
00:02:06,730 --> 00:02:10,829
that the i-th man gets the
right hat back from last time.
32
00:02:10,829 --> 00:02:13,570
Or it could be the event
that the i-th person gets
33
00:02:13,570 --> 00:02:16,140
the right appetizer back
at the Chinese restaurant
34
00:02:16,140 --> 00:02:20,320
after we spin the
wheel in the center.
35
00:02:20,320 --> 00:02:21,890
So we're going to prove this.
36
00:02:21,890 --> 00:02:24,677
And the proof is very similar
to what we did last time when
37
00:02:24,677 --> 00:02:26,510
we figured out the
expected number of people
38
00:02:26,510 --> 00:02:28,350
to get the right hat back.
39
00:02:28,350 --> 00:02:31,810
In particular, we're going
to start by setting up
40
00:02:31,810 --> 00:02:36,160
an indicator variable, T sub
i, that tells us whether or not
41
00:02:36,160 --> 00:02:38,610
the i-th event, A sub i occurs.
42
00:02:42,710 --> 00:02:50,400
So we define T sub i-- and it's
a function of a sample point--
43
00:02:50,400 --> 00:02:55,650
to be 1 if the sample
point is in the i-th event,
44
00:02:55,650 --> 00:02:59,310
meaning the i-th event
occurs, and 0 otherwise.
45
00:03:03,050 --> 00:03:05,220
And this is just
another way of saying
46
00:03:05,220 --> 00:03:13,910
that T sub i is 1 if and only
if A sub i happens or occurs.
47
00:03:18,810 --> 00:03:21,970
Now what we care about is the
number of events that occur.
48
00:03:21,970 --> 00:03:25,550
And we get that just by
summing up the T sub i.
49
00:03:25,550 --> 00:03:33,805
So we'll let T be
T1 plus T2, out to T n.
50
00:03:33,805 --> 00:03:36,220
And that'll count because
we'll get a 1 every time
51
00:03:36,220 --> 00:03:37,060
an event occurs.
52
00:03:37,060 --> 00:03:39,260
By adding those up,
we'll get the number
53
00:03:39,260 --> 00:03:42,570
of events that occur.
54
00:03:42,570 --> 00:03:43,070
All right.
55
00:03:43,070 --> 00:03:46,210
Now we care about
the expected value
56
00:03:46,210 --> 00:03:49,760
of T, the expected number
of events to occur.
57
00:03:49,760 --> 00:03:53,870
And I claim that's just
the sum i equals 1 to n
58
00:03:53,870 --> 00:03:57,200
of the expected value of T i.
59
00:03:57,200 --> 00:03:58,110
Why is that true?
60
00:04:01,540 --> 00:04:06,740
Why is the expected value of T
the sum of the expected values
61
00:04:06,740 --> 00:04:09,440
of the T sub i?
62
00:04:09,440 --> 00:04:11,610
Linearity of expectations.
63
00:04:11,610 --> 00:04:15,910
Now did we need the T sub i's to
be independent events for that?
64
00:04:15,910 --> 00:04:16,970
No.
65
00:04:16,970 --> 00:04:17,884
OK, very good.
66
00:04:20,480 --> 00:04:25,180
Now the expected value of T
i is really easy to evaluate.
67
00:04:25,180 --> 00:04:27,220
It's just a 0, 1 variable.
68
00:04:27,220 --> 00:04:33,130
So it's just the
probability that T i is 1
69
00:04:33,130 --> 00:04:36,800
because it's 1 times this plus
0 times the probability of 0,
70
00:04:36,800 --> 00:04:39,470
and that cancels out.
71
00:04:39,470 --> 00:04:41,440
And the event that
T i equals 1 is just
72
00:04:41,440 --> 00:04:53,530
the situation where the i-th
event occurs because T i equals
73
00:04:53,530 --> 00:04:56,090
1 is the case that A i occurs.
74
00:04:56,090 --> 00:04:58,770
That's what it is.
75
00:04:58,770 --> 00:05:01,770
So we've shown that the expected
number of events to occur
76
00:05:01,770 --> 00:05:04,070
is simply the sum
of the probabilities
77
00:05:04,070 --> 00:05:05,620
that the events occur.
78
00:05:05,620 --> 00:05:10,630
So a very simple
formula, very handy.
79
00:05:10,630 --> 00:05:13,630
And you don't need
independence for that.
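[A quick sketch, not from the lecture, checking theorem 1 on the hat-check problem by exhaustive enumeration; the choice of n = 4 is arbitrary.]

```python
import itertools
from fractions import Fraction

# A_i is the event that person i gets their own hat back after a
# uniformly random permutation of n hats (the hat-check problem).
n = 4
perms = list(itertools.permutations(range(n)))  # whole sample space, equally likely

# P(A_i): fraction of permutations fixing position i.
p = [Fraction(sum(1 for perm in perms if perm[i] == i), len(perms))
     for i in range(n)]

# E[T] computed directly from the definition: average number of fixed points.
expected_T = Fraction(sum(sum(1 for i in range(n) if perm[i] == i)
                          for perm in perms), len(perms))

# Theorem 1: the expected number of events to occur is the sum of P(A_i).
assert expected_T == sum(p)
print(expected_T)  # 1 -- one person, on average, gets the right hat back
```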
80
00:05:13,630 --> 00:05:18,020
Any questions about that?
81
00:05:18,020 --> 00:05:20,210
We're going to use that
theorem a lot today.
82
00:05:20,210 --> 00:05:26,445
As a simple example, suppose
we flip n fair coins.
83
00:05:30,110 --> 00:05:35,640
And we let A i be the event
that the i-th coin is heads.
84
00:05:40,560 --> 00:05:43,590
And suppose we want to know
the expected number of heads
85
00:05:43,590 --> 00:05:46,640
in the n flips.
86
00:05:46,640 --> 00:05:49,530
Well, we can use this theorem.
87
00:05:49,530 --> 00:05:51,954
The expected number
of heads is just
88
00:05:51,954 --> 00:05:53,620
going to be the sum
of the probabilities
89
00:05:53,620 --> 00:05:55,930
that each coin is a heads.
90
00:05:55,930 --> 00:05:57,960
So let's do that.
91
00:05:57,960 --> 00:05:59,300
T is the number of heads.
92
00:06:02,880 --> 00:06:05,460
We want to know the
expected value of T.
93
00:06:05,460 --> 00:06:07,810
And from theorem
one, that's just
94
00:06:07,810 --> 00:06:11,070
the probability the first coin
is heads plus the probability
95
00:06:11,070 --> 00:06:13,700
the second coin is heads.
96
00:06:13,700 --> 00:06:16,070
And the same out to the
probability the n-th coin
97
00:06:16,070 --> 00:06:18,700
is heads.
98
00:06:18,700 --> 00:06:20,690
What's the probability
the first coin is heads?
99
00:06:23,750 --> 00:06:25,620
A 1/2.
100
00:06:25,620 --> 00:06:29,614
The probability the second
coin is heads is a 1/2.
101
00:06:29,614 --> 00:06:31,530
The probability the last
coin is heads is 1/2.
102
00:06:34,730 --> 00:06:38,740
And so the expected number of
heads-- we add up 1/2 n times--
103
00:06:38,740 --> 00:06:41,798
is just n/2.
104
00:06:41,798 --> 00:06:43,470
Of course, you all knew that.
105
00:06:43,470 --> 00:06:45,800
If you flip n fair coins,
the expected number of heads
106
00:06:45,800 --> 00:06:46,900
is half of them.
107
00:06:46,900 --> 00:06:50,770
But that's a very
simple way to prove it.
108
00:06:50,770 --> 00:06:53,290
Did we need the coin
tosses to be mutually
109
00:06:53,290 --> 00:06:55,720
independent to conclude that?
110
00:06:58,290 --> 00:06:59,050
No.
111
00:06:59,050 --> 00:07:03,140
I could've glued them
together in some weird way.
112
00:07:03,140 --> 00:07:06,740
In fact, I could have glued
some face up and some face down
113
00:07:06,740 --> 00:07:08,920
and done weird
things, and you still
114
00:07:08,920 --> 00:07:12,060
expect n/2 heads even if
they were glued together
115
00:07:12,060 --> 00:07:13,480
in strange ways.
116
00:07:13,480 --> 00:07:15,490
Because I don't
need independence
117
00:07:15,490 --> 00:07:17,610
for linearity of
expectation to prove this.
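[A sketch, not from the lecture, of the "glued coins" point: total dependence among the coins does not change the expectation, since linearity of expectation never uses independence.]

```python
from fractions import Fraction

# "Glued" coins: one fair flip decides all n coins at once, so the events
# "coin i is heads" are totally dependent, not independent.
n = 10
outcomes = [(Fraction(1, 2), n),  # all heads together
            (Fraction(1, 2), 0)]  # all tails together

# Linearity of expectation needs no independence: E[heads] is still n/2.
expected_heads = sum(p * heads for p, heads in outcomes)
assert expected_heads == Fraction(n, 2)
print(expected_heads)  # 5
```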
118
00:07:21,080 --> 00:07:24,242
Now that's the easy
way to evaluate
119
00:07:24,242 --> 00:07:25,450
the expected number of heads.
120
00:07:25,450 --> 00:07:27,920
There is a hard way to do it.
121
00:07:27,920 --> 00:07:29,790
Let me set that up.
122
00:07:29,790 --> 00:07:31,630
We could start from
the definition,
123
00:07:31,630 --> 00:07:38,040
a different definition, namely,
that the expected value of T
124
00:07:38,040 --> 00:07:43,140
is the sum from i equals 0 to n of
125
00:07:43,140 --> 00:07:48,300
i times the probability
that you have i heads.
126
00:07:48,300 --> 00:07:50,670
This would be a
natural way to compute
127
00:07:50,670 --> 00:07:52,260
the expected number of heads.
128
00:07:52,260 --> 00:07:54,050
You add up the
case where there's
129
00:07:54,050 --> 00:07:56,390
zero heads times the
probability of 0,
130
00:07:56,390 --> 00:07:58,600
1 times the probability
of one head, 2 times
131
00:07:58,600 --> 00:08:00,096
probably two heads,
and so forth.
132
00:08:00,096 --> 00:08:02,220
That's one of the first
definitions of expectation.
133
00:08:04,820 --> 00:08:06,555
So let's keep trying to do this.
134
00:08:09,410 --> 00:08:12,402
What is the probability
of getting i heads?
135
00:08:12,402 --> 00:08:13,860
And now I'm going
to have to assume
136
00:08:13,860 --> 00:08:15,530
mutual independence actually.
137
00:08:15,530 --> 00:08:18,360
Now I'm going to need
mutual independence.
138
00:08:18,360 --> 00:08:20,439
So already, this
method isn't as good
139
00:08:20,439 --> 00:08:21,980
because I had to
make that assumption
140
00:08:21,980 --> 00:08:24,524
to answer this question.
141
00:08:24,524 --> 00:08:25,940
If you don't make
that assumption,
142
00:08:25,940 --> 00:08:27,420
you can't answer that question.
143
00:08:27,420 --> 00:08:31,040
What's the probability of
getting i heads out of n?
144
00:08:31,040 --> 00:08:32,745
Yeah?
145
00:08:32,745 --> 00:08:36,700
AUDIENCE: 2 to the minus n-th
power times n [INAUDIBLE] i.
146
00:08:36,700 --> 00:08:40,789
PROFESSOR: Yes, because if
you look at the sample space,
147
00:08:40,789 --> 00:08:45,110
there are 2 to the n sample
points all equally likely.
148
00:08:45,110 --> 00:08:47,310
They're all probability
2 to the minus n.
149
00:08:47,310 --> 00:08:52,590
And there's n choose i of
them that have i heads.
150
00:08:52,590 --> 00:08:57,570
And now you'd have to
evaluate that sum which
151
00:08:57,570 --> 00:08:59,520
is sort of a pain.
152
00:08:59,520 --> 00:09:03,320
That one won't come
to mind readily.
153
00:09:03,320 --> 00:09:06,940
So you might say I reached
sort of a dead end here.
154
00:09:06,940 --> 00:09:09,910
But in fact, the answer
is easy
155
00:09:09,910 --> 00:09:13,330
to get using this method.
156
00:09:13,330 --> 00:09:16,580
In fact, we've actually
proved an identity here
157
00:09:16,580 --> 00:09:18,440
because we know
the answer is n/2.
158
00:09:18,440 --> 00:09:23,360
We've just proved that
this messy thing is n/2.
159
00:09:26,400 --> 00:09:28,930
In fact, you can multiply
by 2 to the n here.
160
00:09:28,930 --> 00:09:32,840
We have proved, using
probability theory and theorem
161
00:09:32,840 --> 00:09:39,660
1 over there, the sum of
i n choose i equals n 2
162
00:09:39,660 --> 00:09:43,770
to the n minus 1.
163
00:09:43,770 --> 00:09:46,370
Just multiply by 2 to
the n on each side.
164
00:09:46,370 --> 00:09:48,700
So we've given a
probability based
165
00:09:48,700 --> 00:09:55,040
proof of this identity which is
sort of hard to do otherwise,
166
00:09:55,040 --> 00:09:56,630
could be harder to do.
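[A sketch, not from the lecture, verifying numerically the identity the probability argument just proved; the range n = 1..15 is an arbitrary spot check.]

```python
from math import comb

# Equating the two computations of E[heads] (n/2 versus the sum over i of
# i * C(n, i) / 2**n) and multiplying through by 2**n gives the identity
# sum_i i * C(n, i) = n * 2**(n - 1).  Check it directly:
for n in range(1, 16):
    lhs = sum(i * comb(n, i) for i in range(n + 1))
    assert lhs == n * 2 ** (n - 1)
print("identity verified for n = 1..15")
```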
167
00:09:56,630 --> 00:09:59,640
Any questions about that?
168
00:09:59,640 --> 00:10:03,570
So again, if it comes time for
a homework problem or a test
169
00:10:03,570 --> 00:10:06,950
problem, if it
naturally divides up
170
00:10:06,950 --> 00:10:11,030
in this way where you can
take a random variable whose
171
00:10:11,030 --> 00:10:13,750
expectation you've got to compute
and make it a sum of indicator
172
00:10:13,750 --> 00:10:17,760
variables, if that is a natural
thing to do, do it that way.
173
00:10:17,760 --> 00:10:20,370
Because it's so much
easier than trying
174
00:10:20,370 --> 00:10:22,770
to go from the
definition because you
175
00:10:22,770 --> 00:10:24,430
might encounter
nasty things like
176
00:10:24,430 --> 00:10:26,290
that that you've got to evaluate.
177
00:10:32,150 --> 00:10:33,840
So in this case,
we flipped n coins.
178
00:10:33,840 --> 00:10:36,050
We expect n/2 heads.
179
00:10:36,050 --> 00:10:39,650
In the hat check, in
Chinese appetizer problems--
180
00:10:39,650 --> 00:10:42,620
we had n hats or n
appetizers-- we expected
181
00:10:42,620 --> 00:10:46,490
to get one back to the right
person-- so a smaller expected
182
00:10:46,490 --> 00:10:48,150
value.
183
00:10:48,150 --> 00:10:51,890
For some problems, the
expected value is even less.
184
00:10:51,890 --> 00:10:54,550
The expected number of events
to occur is less than 1.
185
00:10:54,550 --> 00:10:57,130
In fact, it might
be much less than 1.
186
00:10:57,130 --> 00:11:01,980
Now in those cases it turns
out that the expected value
187
00:11:01,980 --> 00:11:07,710
is an upper bound
on the probability
188
00:11:07,710 --> 00:11:10,280
that one or more events occur.
189
00:11:10,280 --> 00:11:11,910
We're going to state
this as theorem 2.
190
00:11:18,090 --> 00:11:22,190
The probability that at
least one event occurs
191
00:11:22,190 --> 00:11:25,570
is always upper bounded by
the expected number of events
192
00:11:25,570 --> 00:11:27,450
to occur.
193
00:11:27,450 --> 00:11:29,090
Now this theorem
is pretty useless
194
00:11:29,090 --> 00:11:32,220
if the expected number of
events to occur is bigger than 1
195
00:11:32,220 --> 00:11:35,230
because all probabilities
are at most 1.
196
00:11:35,230 --> 00:11:37,190
But if the expected
number of events to occur
197
00:11:37,190 --> 00:11:40,080
is small, something
much less than 1,
198
00:11:40,080 --> 00:11:43,100
this is a pretty useful bound.
199
00:11:43,100 --> 00:11:44,025
So let's prove that.
200
00:11:47,380 --> 00:11:50,710
The expected value of
T-- and in this case
201
00:11:50,710 --> 00:11:54,200
we'll use one of the definitions
we have of expected value,
202
00:11:54,200 --> 00:11:56,710
namely, the one
where you sum from i
203
00:11:56,710 --> 00:12:01,056
equals 1 to infinity
of the probability T
204
00:12:01,056 --> 00:12:04,210
is greater than or equal to i.
205
00:12:04,210 --> 00:12:08,951
Now what did we have to know
about T to use that definition?
206
00:12:08,951 --> 00:12:11,240
That doesn't work for
all random variables
207
00:12:11,240 --> 00:12:19,650
T. What condition do I have on
T to be able to use this one?
208
00:12:19,650 --> 00:12:20,770
Anybody remember?
209
00:12:20,770 --> 00:12:21,289
Yeah.
210
00:12:21,289 --> 00:12:23,330
AUDIENCE: Must be defined
on the natural numbers?
211
00:12:23,330 --> 00:12:25,371
PROFESSOR: T must be defined
on the natural numbers.
212
00:12:25,371 --> 00:12:28,890
If it is, I can use this
very simple definition.
213
00:12:28,890 --> 00:12:31,260
And is T defined on the
natural numbers here?
214
00:12:31,260 --> 00:12:34,560
Is the range of T
natural numbers?
215
00:12:34,560 --> 00:12:37,920
Well, I'm counting the number
of events that occurred.
216
00:12:37,920 --> 00:12:39,575
Could be 0, 1, 2, 3, 4.
217
00:12:39,575 --> 00:12:40,700
Has to be a natural number.
218
00:12:40,700 --> 00:12:42,240
So it's OK.
219
00:12:42,240 --> 00:12:45,300
So I can use this definition.
220
00:12:45,300 --> 00:12:48,860
Now this is summing up
probability T is at least 1
221
00:12:48,860 --> 00:12:51,680
plus probability T is
at least 2 and so forth.
222
00:12:51,680 --> 00:12:53,960
I'm just going to
use the first term.
223
00:12:53,960 --> 00:12:56,425
This is at least the
size of the first term
224
00:12:56,425 --> 00:12:58,050
because probabilities
are non-negative.
225
00:13:00,349 --> 00:13:02,890
So I'm just going to throw away
all the terms after the first
226
00:13:02,890 --> 00:13:05,400
and conclude that this is
at least the probability T
227
00:13:05,400 --> 00:13:07,000
is bigger than or equal to 1.
228
00:13:07,000 --> 00:13:08,882
And I'm done.
229
00:13:08,882 --> 00:13:10,090
I just look at it in reverse.
230
00:13:10,090 --> 00:13:12,680
The probability of at
least one event occurring
231
00:13:12,680 --> 00:13:17,480
is at most the expected value.
232
00:13:17,480 --> 00:13:18,850
Very simple.
233
00:13:18,850 --> 00:13:20,620
There's a very quick
corollary here.
234
00:13:24,330 --> 00:13:26,140
The probability
at least one event
235
00:13:26,140 --> 00:13:31,045
occurs is at most the sum of
the probabilities of the events.
236
00:13:34,870 --> 00:13:39,260
And the proof there is
just plugging in theorem 1.
237
00:13:39,260 --> 00:13:43,920
Because the expected value is
the sum of the probabilities.
238
00:13:43,920 --> 00:13:48,460
So we just plug-in theorem
1 for the expected value
239
00:13:48,460 --> 00:13:49,710
because it's just that.
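[A sketch, not from the lecture, comparing the exact probability that at least one event occurs to the corollary's bound, on the small example of three fair coin flips.]

```python
from fractions import Fraction
from itertools import product

# Corollary check on three fair coin flips, with A_i = "flip i is heads":
# the exact P(at least one event) versus the sum of the P(A_i).
space = list(product([0, 1], repeat=3))  # 8 equally likely outcomes
p_at_least_one = Fraction(sum(1 for w in space if any(w)), len(space))
bound = sum(Fraction(1, 2) for _ in range(3))  # sum of the three P(A_i)

assert p_at_least_one <= bound
print(p_at_least_one, "<=", bound)  # 7/8 <= 3/2
```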
240
00:13:53,930 --> 00:13:57,900
Any questions about the proof?
241
00:13:57,900 --> 00:13:59,670
Very simple.
242
00:13:59,670 --> 00:14:03,530
Now this theorem is very
useful in situations
243
00:14:03,530 --> 00:14:05,100
where you're trying
to upper bound
244
00:14:05,100 --> 00:14:08,400
the probability of some kind
of disaster or something
245
00:14:08,400 --> 00:14:10,160
bad happening.
246
00:14:10,160 --> 00:14:12,210
For example, suppose
you want to compute
247
00:14:12,210 --> 00:14:16,570
the probability that a
nuclear plant melts down.
248
00:14:16,570 --> 00:14:19,405
Now actually, the
government does this.
249
00:14:19,405 --> 00:14:20,780
They got to figure
this thing out
250
00:14:20,780 --> 00:14:22,488
because if it's a high
probability, well,
251
00:14:22,488 --> 00:14:24,810
we're not going to allow
anybody to build them.
252
00:14:24,810 --> 00:14:29,250
And the way they do it is they
convene a panel of experts.
253
00:14:29,250 --> 00:14:32,160
They get some people from
various good universities
254
00:14:32,160 --> 00:14:33,950
and they bring them
down to Washington.
255
00:14:33,950 --> 00:14:35,940
And they have them
figure out every way
256
00:14:35,940 --> 00:14:39,450
they can think of that
a meltdown could occur,
257
00:14:39,450 --> 00:14:42,800
every possible event that
would lead to a meltdown.
258
00:14:42,800 --> 00:14:46,470
And then they'd have them
figure out the probability
259
00:14:46,470 --> 00:14:48,600
for each one of those events.
260
00:14:48,600 --> 00:14:51,240
For example, A1
could be the event
261
00:14:51,240 --> 00:14:56,180
that the operator goes
crazy and makes it meltdown.
262
00:14:56,180 --> 00:15:00,150
A2 could be the event that an
earthquake hits and the cooling
263
00:15:00,150 --> 00:15:04,340
pipes are ruptured, and
then you got a meltdown.
264
00:15:04,340 --> 00:15:07,800
A3 is the event that terrorists
shoot their way in and cause
265
00:15:07,800 --> 00:15:09,577
a meltdown.
266
00:15:09,577 --> 00:15:11,410
So you've got a lot of
possibilities for how
267
00:15:11,410 --> 00:15:13,730
the thing can melt down.
268
00:15:13,730 --> 00:15:16,000
And then they compute
the probabilities.
269
00:15:16,000 --> 00:15:21,260
And then they add them up
just using this result.
270
00:15:21,260 --> 00:15:23,120
And they say, well,
the probability
271
00:15:23,120 --> 00:15:27,510
that a meltdown-causing
event or one or more occurs
272
00:15:27,510 --> 00:15:29,270
is at most this small number.
273
00:15:29,270 --> 00:15:31,510
And hopefully it's small.
274
00:15:31,510 --> 00:15:37,530
So for example, suppose there's
100 ways that a meltdown could
275
00:15:37,530 --> 00:15:40,730
occur, 100 things that
could cause a meltdown.
276
00:15:40,730 --> 00:15:45,350
And each one happens with
probability one in a million.
277
00:15:45,350 --> 00:15:48,710
What can you say
about the probability
278
00:15:48,710 --> 00:15:49,690
that a meltdown occurs?
279
00:15:53,510 --> 00:15:55,820
You got a hundred ways
it could happen only.
280
00:15:55,820 --> 00:15:58,902
Each is a one in
a million chance.
281
00:15:58,902 --> 00:16:00,610
What's the probability
a meltdown occurs?
282
00:16:03,370 --> 00:16:07,940
1 in 10,000 because you're
adding up one in a million
283
00:16:07,940 --> 00:16:08,940
100 times.
284
00:16:08,940 --> 00:16:10,000
n is 100.
285
00:16:10,000 --> 00:16:12,300
Each of these is
one in a million.
286
00:16:12,300 --> 00:16:13,630
So you get 100 over a million.
287
00:16:13,630 --> 00:16:14,880
That's 1 in 10,000.
288
00:16:14,880 --> 00:16:16,570
And so then they
publish a report that
289
00:16:16,570 --> 00:16:20,500
says the chance of this reactor
melting down is 1 in 10,000.
290
00:16:20,500 --> 00:16:24,969
Now what if I've
got 100 reactors?
291
00:16:24,969 --> 00:16:27,260
What's the chance that at
least one of them melts down?
292
00:16:30,990 --> 00:16:35,020
1 in 100 because I got 100
over 10,000-- same theorem.
293
00:16:35,020 --> 00:16:38,065
So there's a 1 in 100 chance
something melts down somewhere,
294
00:16:38,065 --> 00:16:40,657
at most.
295
00:16:40,657 --> 00:16:42,490
Hopefully, the numbers
are better than that.
296
00:16:45,030 --> 00:16:47,870
Same thing if you bought
100 lottery tickets, each
297
00:16:47,870 --> 00:16:50,270
a one in a million
chance, you got a 1
298
00:16:50,270 --> 00:16:54,780
in 10,000 chance of
winning, at most.
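[A sketch, not from the lecture, reproducing the meltdown arithmetic exactly with rational numbers.]

```python
from fractions import Fraction

# The lecture's arithmetic: 100 failure modes, each with probability
# one in a million, bound the per-plant meltdown probability.
p_one_plant = 100 * Fraction(1, 10**6)
assert p_one_plant == Fraction(1, 10_000)  # at most 1 in 10,000

# 100 such plants: apply the same corollary once more.
p_any_plant = 100 * p_one_plant
assert p_any_plant == Fraction(1, 100)     # at most 1 in 100
print(p_one_plant, p_any_plant)  # 1/10000 1/100
```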
299
00:16:54,780 --> 00:16:59,980
So simple fact but powerful
and used a lot in practice.
300
00:16:59,980 --> 00:17:03,080
And this is sort of
the good case when
301
00:17:03,080 --> 00:17:07,010
the expected number of
events that are bad to happen
302
00:17:07,010 --> 00:17:11,569
is small, like a
lot less than 1.
303
00:17:11,569 --> 00:17:15,760
But what if the expected number
of events to happen is big?
304
00:17:15,760 --> 00:17:18,819
Say it's 10.
305
00:17:18,819 --> 00:17:21,020
Say this government
panel gets together
306
00:17:21,020 --> 00:17:23,030
and they add up all
the probabilities
307
00:17:23,030 --> 00:17:25,530
and it comes out to be 10.
308
00:17:25,530 --> 00:17:29,260
Well, it doesn't sound so
good if that's the case.
309
00:17:29,260 --> 00:17:30,880
But is it necessarily bad?
310
00:17:30,880 --> 00:17:34,000
Does it necessarily
mean that you're
311
00:17:34,000 --> 00:17:35,160
going to have a meltdown?
312
00:17:35,160 --> 00:17:36,810
So for example,
let's say there's
313
00:17:36,810 --> 00:17:39,590
1,000 ways you could melt down.
314
00:17:39,590 --> 00:17:43,350
And let's say that the
probability of each one
315
00:17:43,350 --> 00:17:46,860
is 1 in 100.
316
00:17:46,860 --> 00:17:49,710
So the expected number of
things that could happen
317
00:17:49,710 --> 00:17:53,400
to cause a meltdown is 10.
318
00:17:53,400 --> 00:17:55,120
Am I guaranteed we're
going to melt down?
319
00:17:58,590 --> 00:17:59,710
No.
320
00:17:59,710 --> 00:18:01,680
Can anybody think
of a way where it's
321
00:18:01,680 --> 00:18:04,720
unlikely we're
going to melt down
322
00:18:04,720 --> 00:18:08,280
but respect these values
here, hypothetically?
323
00:18:12,282 --> 00:18:13,740
Is there any chance
that it's still
324
00:18:13,740 --> 00:18:14,906
unlikely to have a meltdown?
325
00:18:17,900 --> 00:18:20,150
Are we going to think of a way?
326
00:18:20,150 --> 00:18:20,800
Yeah.
327
00:18:20,800 --> 00:18:22,591
AUDIENCE: They all
happen at the same time.
328
00:18:22,591 --> 00:18:24,800
PROFESSOR: They all
happen at the same time.
329
00:18:24,800 --> 00:18:27,565
Now the examples I gave you--
the terrorists, the earthquake,
330
00:18:27,565 --> 00:18:31,470
and the crazy operator--
put that on the side.
331
00:18:31,470 --> 00:18:34,490
If they all happen together,
when any one happens,
332
00:18:34,490 --> 00:18:37,360
the others have to happen.
333
00:18:37,360 --> 00:18:42,380
So we can express
that as for all ij,
334
00:18:42,380 --> 00:18:44,750
probability of the
i-th event happening
335
00:18:44,750 --> 00:18:51,440
given the j-th event happening
is one, so total dependence.
336
00:18:51,440 --> 00:18:54,060
What's the probability of a
meltdown in that scenario?
337
00:18:58,780 --> 00:19:01,350
What's the probability one of
those meltdown-inducing events
338
00:19:01,350 --> 00:19:01,850
occurs?
339
00:19:04,960 --> 00:19:07,332
They all happen at once.
340
00:19:07,332 --> 00:19:08,320
AUDIENCE: 1 in 100.
341
00:19:08,320 --> 00:19:15,142
PROFESSOR: 1 in 100.
342
00:19:15,142 --> 00:19:17,600
Because it's the same as the
probability of the first event
343
00:19:17,600 --> 00:19:22,180
happening which, by
definition, was 1 in 100.
344
00:19:22,180 --> 00:19:25,400
So it could be that the
probability of a meltdown
345
00:19:25,400 --> 00:19:27,520
is small.
346
00:19:27,520 --> 00:19:29,270
But it might not be as well.
347
00:19:29,270 --> 00:19:31,405
There's no way,
given this, to know.
348
00:19:35,030 --> 00:19:38,940
What if I chain-- in fact,
this is like Chinese appetizer,
349
00:19:38,940 --> 00:19:40,140
right?
350
00:19:40,140 --> 00:19:45,040
If one person gets their
appetizer back, everybody does.
351
00:19:45,040 --> 00:19:47,009
So there are circumstances
where you have
352
00:19:47,009 --> 00:19:48,300
the total dependence like that.
353
00:19:51,160 --> 00:19:54,760
Let's say I change a little bit
and I don't do this scenario.
354
00:19:54,760 --> 00:19:56,760
In fact, say I
tell you the events
355
00:19:56,760 --> 00:20:02,850
are mutually independent,
but you expect 10 to occur.
356
00:20:02,850 --> 00:20:06,210
Do you sleep at night now?
357
00:20:06,210 --> 00:20:08,470
Of course, 1% is still
a pretty high number.
358
00:20:08,470 --> 00:20:12,690
But how many people think that
if they're mutually independent
359
00:20:12,690 --> 00:20:16,630
and you expect 10
that, no matter what,
360
00:20:16,630 --> 00:20:20,931
there's a least a 50%
chance of a meltdown?
361
00:20:20,931 --> 00:20:21,430
Anybody?
362
00:20:24,400 --> 00:20:25,290
OK.
363
00:20:25,290 --> 00:20:32,840
In fact, if you expect 10 and
they're mutually independent,
364
00:20:32,840 --> 00:20:35,260
a meltdown is a
virtual certainty.
365
00:20:35,260 --> 00:20:41,890
The chance you don't melt
down is less than 1 in 22,000.
366
00:20:41,890 --> 00:20:44,270
For sure something
will occur that's bad.
367
00:20:44,270 --> 00:20:50,472
And this is a theorem
that we call Murphy's law.
368
00:20:50,472 --> 00:20:52,555
And Murphy's law, probably
you've all heard of it,
369
00:20:52,555 --> 00:20:53,920
it says-- it's a famous saying.
370
00:20:53,920 --> 00:20:56,679
If something can go
wrong, it will go wrong.
371
00:20:56,679 --> 00:20:58,720
And we're going to see
why that's true, at least,
372
00:20:58,720 --> 00:21:00,900
in our circumstances here.
373
00:21:13,795 --> 00:21:15,170
That's a pretty
powerful theorem.
374
00:21:28,870 --> 00:21:49,380
If you have mutually independent
events A1 through A n, then
375
00:21:49,380 --> 00:21:55,300
the probability that none
of them occur, T equals 0,
376
00:21:55,300 --> 00:22:00,190
is upper bounded by e to the
minus expected number of events
377
00:22:00,190 --> 00:22:02,980
to occur.
378
00:22:02,980 --> 00:22:09,430
So if I expect 10 to occur,
the chance that none do
379
00:22:09,430 --> 00:22:14,660
is upper bounded by e to the
minus 10, which is very small,
380
00:22:14,660 --> 00:22:21,640
which means almost surely one of
the events or more will occur.
381
00:22:21,640 --> 00:22:26,110
And that's bad
news in this case.
382
00:22:26,110 --> 00:22:27,800
So let's prove that.
383
00:22:31,900 --> 00:22:35,690
Well, the probability
that T equals 0
384
00:22:35,690 --> 00:22:39,150
is the same as the probability
that A1 does not occur
385
00:22:39,150 --> 00:22:44,890
and A2 does not occur and all
the way to A n does not occur.
386
00:22:47,880 --> 00:22:50,700
And I claim this
[INAUDIBLE] is the product
387
00:22:50,700 --> 00:22:54,300
of the probabilities
they don't occur.
388
00:22:54,300 --> 00:22:57,110
So I'm taking the
product i equals 1 to n
389
00:22:57,110 --> 00:23:01,240
of the probability that
A i does not occur.
390
00:23:01,240 --> 00:23:02,670
Now why can I make that step?
391
00:23:05,850 --> 00:23:06,350
Yeah.
392
00:23:06,350 --> 00:23:08,270
Independence.
393
00:23:08,270 --> 00:23:11,670
This is the product rule
for independent events.
394
00:23:23,840 --> 00:23:28,560
Now the probability
that A i does not occur
395
00:23:28,560 --> 00:23:31,255
is simply 1 minus the
probability it does occur.
396
00:23:35,440 --> 00:23:39,620
And now I'm going to
use a simple fact, which
397
00:23:39,620 --> 00:23:43,840
is that for any
number x, 1 minus x
398
00:23:43,840 --> 00:23:47,367
is at most e to the minus x.
399
00:23:47,367 --> 00:23:48,700
Just a simple fact from algebra.
400
00:23:48,700 --> 00:23:53,980
So I've got 1 minus-- I'm
going to treat this as the x.
401
00:23:53,980 --> 00:24:03,180
So this is at most e to the
minus probability of A i
402
00:24:03,180 --> 00:24:05,710
using that fact.
403
00:24:05,710 --> 00:24:07,820
And now I'll take the
product and put it
404
00:24:07,820 --> 00:24:09,130
into a sum in the exponent.
405
00:24:19,440 --> 00:24:23,450
And then the sum of the
probabilities of the events
406
00:24:23,450 --> 00:24:25,190
is just the expected value.
407
00:24:25,190 --> 00:24:28,390
That was theorem 1
that I just erased.
408
00:24:28,390 --> 00:24:31,720
So this is e to the minus
expected number of events
409
00:24:31,720 --> 00:24:34,650
to occur.
410
00:24:34,650 --> 00:24:36,220
So not too hard a proof.
411
00:24:36,220 --> 00:24:38,190
We had to use that fact.
412
00:24:38,190 --> 00:24:41,000
But that gets the
expected number of events
413
00:24:41,000 --> 00:24:44,080
to occur in the exponent.
414
00:24:44,080 --> 00:24:51,010
So a simple corollary
is the case when
415
00:24:51,010 --> 00:24:52,890
we expect 10 events to occur.
416
00:25:03,840 --> 00:25:19,040
So if we expect 10 or more
mutually independent events
417
00:25:19,040 --> 00:25:35,870
to occur, the probability
that no event occurs
418
00:25:35,870 --> 00:25:41,020
is at most e to the minus
10, which is less than 1
419
00:25:41,020 --> 00:25:46,490
over 22,000.
420
00:25:46,490 --> 00:25:50,490
Now there's not even any
dependence on n here.
421
00:25:50,490 --> 00:25:52,740
It had nothing to do with a
number of possible events,
422
00:25:52,740 --> 00:25:55,460
just that if you expect
10 of them to occur,
423
00:25:55,460 --> 00:25:58,910
you're pretty sure
one of them will.
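[A sketch, not from the lecture, spot-checking the Murphy's-law bound on random probability vectors and the corollary's constant; the seed and vector sizes are arbitrary.]

```python
import math
import random

# Murphy's-law bound: for mutually independent events with probabilities
# p_i, P(no event occurs) = prod(1 - p_i) <= exp(-sum p_i) = exp(-E[T]),
# using the fact 1 - x <= e**(-x) factor by factor.
random.seed(0)
for _ in range(100):
    ps = [random.random() for _ in range(random.randint(1, 20))]
    none_occur = math.prod(1 - p for p in ps)
    assert none_occur <= math.exp(-sum(ps)) + 1e-12  # tolerance for float rounding

# The corollary's constant: e**(-10) is indeed below 1/22,000.
assert math.exp(-10) < 1 / 22000
print(math.exp(-10) < 1 / 22000)  # True
```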
424
00:25:58,910 --> 00:26:04,270
And this explains why you
see weird coincidences.
425
00:26:04,270 --> 00:26:07,880
Or people sometimes see what
they think are miracles.
426
00:26:07,880 --> 00:26:10,720
Because out in the
real world, there's
427
00:26:10,720 --> 00:26:15,360
billions of possible weird
things that could happen.
428
00:26:15,360 --> 00:26:19,450
You just can create all
sorts of crazy possibilities.
429
00:26:19,450 --> 00:26:22,230
And each one might be
one in a billion chance
430
00:26:22,230 --> 00:26:24,800
of actually happening.
431
00:26:24,800 --> 00:26:28,970
But you got billions
that could've.
432
00:26:28,970 --> 00:26:30,710
And if they're all
mutually independent--
433
00:26:30,710 --> 00:26:33,870
because you made up all these
different things-- then you
434
00:26:33,870 --> 00:26:36,940
expect some of them to happen.
435
00:26:36,940 --> 00:26:39,110
And so you should--
in fact, you're
436
00:26:39,110 --> 00:26:42,470
going to know that for sure
some of those weird things
437
00:26:42,470 --> 00:26:43,545
are going to happen.
438
00:26:43,545 --> 00:26:45,720
At least the chance that
no weird thing happens
439
00:26:45,720 --> 00:26:48,500
is at most 1 in 22,000.
440
00:26:48,500 --> 00:26:50,846
And so this can be why
somebody will go along
441
00:26:50,846 --> 00:26:52,710
and say, oh my goodness.
442
00:26:52,710 --> 00:26:55,650
You won't believe what
happened, a coincidence.
443
00:26:55,650 --> 00:26:57,880
And it's like, wow, the
chance of that happening
444
00:26:57,880 --> 00:26:59,380
was one in a billion.
445
00:26:59,380 --> 00:27:04,046
It must've been a miracle or an
act of God that this happened.
446
00:27:04,046 --> 00:27:06,420
But you're not thinking about
the other 10 billion things
447
00:27:06,420 --> 00:27:07,250
that didn't happen.
448
00:27:09,930 --> 00:27:13,050
So for sure some of those
things are going to happen.
449
00:27:13,050 --> 00:27:17,080
It's not likely that I'm going
to win megabucks next week.
450
00:27:17,080 --> 00:27:19,310
But somebody's going to win.
451
00:27:19,310 --> 00:27:22,860
If enough people play
and it's more than 1
452
00:27:22,860 --> 00:27:24,852
over the probability
that you're going to win,
453
00:27:24,852 --> 00:27:26,310
then it's very
likely somebody will
454
00:27:26,310 --> 00:27:30,010
win if everybody is
guessing randomly.
455
00:27:30,010 --> 00:27:33,110
Any questions about
what we're doing?
456
00:27:36,780 --> 00:27:40,110
So this is amazingly powerful,
this result. In fact,
457
00:27:40,110 --> 00:27:43,040
it's so powerful that
it's going to let me read
458
00:27:43,040 --> 00:27:44,860
somebody's mind in the class.
459
00:27:44,860 --> 00:27:48,980
We're going to do a
little card trick here.
460
00:27:48,980 --> 00:27:51,647
Now the way this
card trick works--
461
00:27:51,647 --> 00:27:52,730
it's a little complicated.
462
00:27:52,730 --> 00:27:54,410
I'm going to need a
volunteer, probably one of you
463
00:27:54,410 --> 00:27:55,390
guys down front.
464
00:27:55,390 --> 00:27:56,480
We'll get you.
465
00:27:56,480 --> 00:28:00,410
And one of your buddies is going
to keep you honest for me here.
466
00:28:00,410 --> 00:28:01,962
I'm going to reveal--
first I'm going
467
00:28:01,962 --> 00:28:03,129
to let you shuffle the deck.
468
00:28:03,129 --> 00:28:05,170
So go ahead and shuffle
it, do whatever you want.
469
00:28:05,170 --> 00:28:06,040
It's a normal deck.
470
00:28:06,040 --> 00:28:09,760
It's got 52 cards
and two jokers.
471
00:28:09,760 --> 00:28:12,970
And I don't care what
order they're in.
472
00:28:12,970 --> 00:28:16,410
I'm going to turn over
the cards one at a time.
473
00:28:16,410 --> 00:28:21,110
Now I'm going to ask you to
pick a number from 1 to 9
474
00:28:21,110 --> 00:28:22,200
ahead of time.
475
00:28:22,200 --> 00:28:24,530
Don't tell me or anybody else.
476
00:28:24,530 --> 00:28:27,550
In fact, I'm going to want
you guys to play along too.
477
00:28:27,550 --> 00:28:31,120
And we're going to see
where we all end up here.
478
00:28:31,120 --> 00:28:33,110
And that's your starting number.
479
00:28:33,110 --> 00:28:36,360
And as I turn over the
cards one at a time--
480
00:28:36,360 --> 00:28:39,630
say you started with a 3 was
the number you had in mind--
481
00:28:39,630 --> 00:28:44,100
on the third card I show,
that becomes your card.
482
00:28:44,100 --> 00:28:46,865
You don't tell me or jump
up and down or anything.
483
00:28:46,865 --> 00:28:47,740
But that's your card.
484
00:28:47,740 --> 00:28:52,650
And say it's a 4 of diamonds.
485
00:28:52,650 --> 00:28:55,110
Now a 4 replaces the
3 in your mind
486
00:28:55,110 --> 00:29:00,130
and you count 4 more cards,
then that becomes your card.
487
00:29:00,130 --> 00:29:04,350
Now let's say that's a jack or
a face card or a 10 or a joker.
488
00:29:04,350 --> 00:29:06,840
10, face card, and
jokers all count as 1
489
00:29:06,840 --> 00:29:08,730
just like an ace counts as 1.
490
00:29:08,730 --> 00:29:10,940
And so then the next
card would be your card
491
00:29:10,940 --> 00:29:12,540
because you count 1.
492
00:29:12,540 --> 00:29:17,510
And we keep on going until you
have a card, maybe it's a 7.
493
00:29:17,510 --> 00:29:20,230
But there's only four
cards left in the deck.
494
00:29:20,230 --> 00:29:21,760
And so you don't get a new one.
495
00:29:21,760 --> 00:29:24,390
And your last card is the 7.
496
00:29:24,390 --> 00:29:29,300
And then you're going to write
that down here, not showing me.
497
00:29:29,300 --> 00:29:31,510
And you're going to
do this, maybe do this
498
00:29:31,510 --> 00:29:33,610
with a friend over there.
499
00:29:33,610 --> 00:29:36,030
And you're going to make sure
you count right on the deck
500
00:29:36,030 --> 00:29:37,571
because if you screw
up the counting,
501
00:29:37,571 --> 00:29:40,800
it's going to be hard
for me to read your mind.
502
00:29:40,800 --> 00:29:42,965
So just to make sure we
all understand this, let
503
00:29:42,965 --> 00:29:45,423
me write the rules down here
because I want the whole class
504
00:29:45,423 --> 00:29:49,270
to pick a number from 1 to
9 and play the same game.
505
00:29:49,270 --> 00:29:53,565
And we're going to
see what happens.
506
00:29:53,565 --> 00:29:55,190
So let me show you
the rules again just
507
00:29:55,190 --> 00:29:56,648
to make sure
everybody understands.
508
00:29:59,520 --> 00:30:03,260
So say the deck
starts out like this.
509
00:30:03,260 --> 00:30:05,760
I got a 4, a 5.
510
00:30:05,760 --> 00:30:08,890
So my first few cards of
the deck go like this.
511
00:30:08,890 --> 00:30:10,410
10 equals a 1.
512
00:30:10,410 --> 00:30:12,630
Then I got a queen equals a 1.
513
00:30:12,630 --> 00:30:16,024
3, 7, 6, 4, 2.
514
00:30:16,024 --> 00:30:16,940
Say it's a small deck.
515
00:30:16,940 --> 00:30:19,740
I'm going to use 54 cards.
516
00:30:19,740 --> 00:30:26,420
And say your chosen number to
start with is a 3.
517
00:30:29,030 --> 00:30:34,640
As I show the cards, you're
going to count 1, 2, 3.
518
00:30:34,640 --> 00:30:36,650
That becomes your new card.
519
00:30:36,650 --> 00:30:38,340
Then you're going to count 1, 2.
520
00:30:38,340 --> 00:30:39,430
That becomes your card.
521
00:30:39,430 --> 00:30:42,220
It's a 10, so you
convert it to a 1
522
00:30:42,220 --> 00:30:44,390
because we're only doing
single digit numbers.
523
00:30:44,390 --> 00:30:46,760
Go to 1, that becomes your card.
524
00:30:46,760 --> 00:30:47,910
Queen converts to a 1.
525
00:30:47,910 --> 00:30:49,520
You go 1, that
becomes your card.
526
00:30:49,520 --> 00:30:52,320
3, 1, 2, 3.
527
00:30:52,320 --> 00:30:54,730
That becomes your card.
528
00:30:54,730 --> 00:31:00,240
And you can't get 4, so you
remember the final card.
529
00:31:00,240 --> 00:31:03,230
Does everybody understand
what you're supposed to do?
530
00:31:03,230 --> 00:31:06,410
Because we're going to
do 54 cards of this.
531
00:31:06,410 --> 00:31:08,859
Maybe we get the TAs
to play along here.
532
00:31:08,859 --> 00:31:11,150
And as you do it, maybe you
want to talk to your buddy,
533
00:31:11,150 --> 00:31:12,940
make sure you got
it worked out there.
534
00:31:16,742 --> 00:31:18,200
And if I could read
your mind maybe
535
00:31:18,200 --> 00:31:21,510
we'll have a gift
certificate or something.
536
00:31:21,510 --> 00:31:23,330
So you shuffle the deck?
537
00:31:23,330 --> 00:31:24,350
Got it good?
538
00:31:24,350 --> 00:31:24,880
All right.
539
00:31:24,880 --> 00:31:27,213
So I'm going to start revealing
the cards one at a time.
540
00:31:27,213 --> 00:31:30,190
So you guys play along
quietly in your mind.
541
00:31:30,190 --> 00:31:33,100
And we'll see if we can
concentrate long enough.
542
00:31:42,750 --> 00:31:43,700
Aces are 1.
543
00:31:59,950 --> 00:32:00,693
Jacks are 1.
544
00:32:18,790 --> 00:32:19,550
10's are 1.
545
00:33:27,530 --> 00:33:28,380
10's are 1.
546
00:34:11,410 --> 00:34:12,840
We're halfway done.
547
00:35:22,260 --> 00:35:23,650
Jokers are 1.
548
00:35:28,471 --> 00:35:28,970
OK.
549
00:35:28,970 --> 00:35:30,070
That's the last card.
550
00:35:30,070 --> 00:35:32,111
So remember the last
one that was yours.
551
00:35:32,111 --> 00:35:33,735
And you got to go
check with your buddy
552
00:35:33,735 --> 00:35:38,592
to make sure you guys agree
on the counting there.
553
00:35:38,592 --> 00:35:39,550
And then write it down.
554
00:35:43,840 --> 00:35:47,040
Don't tell me because I'm
going to read your mind.
555
00:35:47,040 --> 00:35:47,990
I'm going to tell you.
556
00:35:53,312 --> 00:35:54,020
This is not good.
557
00:35:54,020 --> 00:35:57,690
They're arguing
over the last card.
558
00:35:57,690 --> 00:36:00,980
I'll have to read
one of your minds.
559
00:36:00,980 --> 00:36:01,480
What's that?
560
00:36:01,480 --> 00:36:02,286
The 11 of clubs.
561
00:36:02,286 --> 00:36:03,410
That's a hard one to predict.
562
00:36:12,976 --> 00:36:14,600
Make your best guess
and write it down.
563
00:36:14,600 --> 00:36:15,183
Don't tell me.
564
00:36:15,183 --> 00:36:16,152
Write it down.
565
00:36:16,152 --> 00:36:16,652
You got two?
566
00:36:16,652 --> 00:36:17,568
Well, write them both.
567
00:36:17,568 --> 00:36:18,780
I'll predict one of them.
568
00:36:30,000 --> 00:36:32,704
I've never had a
dispute on what the--
569
00:36:32,704 --> 00:36:34,620
because if you started
with the same position,
570
00:36:34,620 --> 00:36:36,870
you've got to wind up
in the same position.
571
00:36:36,870 --> 00:36:38,244
You wrote it down?
572
00:36:38,244 --> 00:36:39,910
Now think about your
number really hard.
573
00:36:39,910 --> 00:36:40,690
We'll take yours.
574
00:36:40,690 --> 00:36:41,370
I'll trust you there.
575
00:36:41,370 --> 00:36:42,870
Think about it
really hard because I
576
00:36:42,870 --> 00:36:46,230
need the brain waves to come
over and read the mind here.
577
00:36:49,900 --> 00:36:51,920
Yeah, yeah.
578
00:36:51,920 --> 00:36:55,559
I'm getting a really strong
signal on the last card.
579
00:36:55,559 --> 00:36:56,350
Maybe I don't know.
580
00:36:56,350 --> 00:36:58,183
Maybe it's something--
it's really powerful.
581
00:36:58,183 --> 00:37:00,941
I'm going to say it's
the queen of hearts.
582
00:37:00,941 --> 00:37:02,861
Is that right?
583
00:37:02,861 --> 00:37:05,360
Both were the queen of-- oh,
you were trying to screw me up,
584
00:37:05,360 --> 00:37:06,080
mess with me.
585
00:37:06,080 --> 00:37:09,066
So you both got the
queen of hearts.
586
00:37:09,066 --> 00:37:09,820
Oh, you did.
587
00:37:09,820 --> 00:37:14,030
So how many people got
the queen of hearts?
588
00:37:14,030 --> 00:37:14,530
Oh wow.
589
00:37:14,530 --> 00:37:18,200
How many people did not
get the queen of hearts?
590
00:37:18,200 --> 00:37:18,700
Somebody.
591
00:37:18,700 --> 00:37:19,680
OK.
592
00:37:19,680 --> 00:37:22,490
Now there's a chance
you did it legitimately.
593
00:37:22,490 --> 00:37:26,030
But usually, with
a deck, we're all
594
00:37:26,030 --> 00:37:28,329
going to get the same last card.
595
00:37:28,329 --> 00:37:30,620
Now in this case, it happened
to be the very last card.
596
00:37:30,620 --> 00:37:32,880
That is typically not the case.
597
00:37:32,880 --> 00:37:34,660
So very good.
598
00:37:34,660 --> 00:37:35,670
So I read your mind.
599
00:37:35,670 --> 00:37:38,670
So you guys get the
gift certificates here.
600
00:37:38,670 --> 00:37:39,170
Very good.
601
00:37:39,170 --> 00:37:42,220
One for you and
your sponsor there.
602
00:37:42,220 --> 00:37:47,300
So it's clear how I read
his mind because I got
603
00:37:47,300 --> 00:37:49,350
the same number everybody did.
604
00:37:49,350 --> 00:37:52,630
And somehow it doesn't
matter where we started.
605
00:37:52,630 --> 00:37:55,322
We all had the same
card at the end.
606
00:37:55,322 --> 00:37:58,150
How is that possible?
607
00:37:58,150 --> 00:38:00,874
There's nine different
starting points.
608
00:38:00,874 --> 00:38:02,790
Why don't we wind up in
nine different places?
609
00:38:02,790 --> 00:38:04,460
And why isn't there
a one in nine chance
610
00:38:04,460 --> 00:38:07,420
that I guess his card?
611
00:38:07,420 --> 00:38:10,829
Why do we all wind
up in the same place?
612
00:38:10,829 --> 00:38:11,370
Any thoughts?
613
00:38:11,370 --> 00:38:12,806
Yeah?
614
00:38:12,806 --> 00:38:13,972
AUDIENCE: Get the same card.
615
00:38:13,972 --> 00:38:16,525
After that you stay [INAUDIBLE].
616
00:38:16,525 --> 00:38:17,525
PROFESSOR: That's right.
617
00:38:17,525 --> 00:38:20,960
If ever we had the
same card, then we're
618
00:38:20,960 --> 00:38:23,910
going to track together forever
and finish on the same card.
619
00:38:23,910 --> 00:38:25,710
But why should we ever
get the same card?
620
00:38:25,710 --> 00:38:29,605
What are the chances of that,
that we land on the same card?
621
00:38:32,060 --> 00:38:33,810
Why don't we just keep
missing each other?
622
00:38:33,810 --> 00:38:35,410
It's a 1 in 9
chance or something.
623
00:38:35,410 --> 00:38:36,247
I don't know.
624
00:38:36,247 --> 00:38:37,330
Why don't we keep missing?
625
00:38:37,330 --> 00:38:39,420
AUDIENCE: It seems like
there are enough low cards
626
00:38:39,420 --> 00:38:42,130
that you just move slowly
along and, eventually, you're
627
00:38:42,130 --> 00:38:43,524
going to intersect.
628
00:38:43,524 --> 00:38:44,190
PROFESSOR: Yeah.
629
00:38:44,190 --> 00:38:47,060
I did make a lot
of 1's in the deck.
630
00:38:47,060 --> 00:38:48,460
If I would've made
all these face
631
00:38:48,460 --> 00:38:53,360
cards be 10's, the chances of
my reading your mind go down.
632
00:38:53,360 --> 00:38:55,300
Why do they go down?
633
00:38:55,300 --> 00:38:56,300
What does it have to do?
634
00:38:56,300 --> 00:39:00,320
Why did I put a lot
of 1's in the deck?
635
00:39:00,320 --> 00:39:01,770
AUDIENCE: It goes on longer.
636
00:39:01,770 --> 00:39:02,728
PROFESSOR: What's that?
637
00:39:02,728 --> 00:39:04,520
AUDIENCE: The game
goes on longer?
638
00:39:04,520 --> 00:39:05,978
PROFESSOR: The game
goes on longer.
639
00:39:05,978 --> 00:39:09,180
So there's more chances
to hit together.
640
00:39:09,180 --> 00:39:12,630
Because at any given time--
you've got your card.
641
00:39:12,630 --> 00:39:13,250
I've got mine.
642
00:39:13,250 --> 00:39:17,590
If mine is behind you, I
got a chance to land on you.
643
00:39:17,590 --> 00:39:19,170
And if you're behind
me in the deck,
644
00:39:19,170 --> 00:39:22,146
you've got a chance to land
on me with your number.
645
00:39:22,146 --> 00:39:23,770
And if the numbers
are smaller, there's
646
00:39:23,770 --> 00:39:27,190
more chances to
land on each other.
647
00:39:27,190 --> 00:39:30,010
And it is true that
on any given chance,
648
00:39:30,010 --> 00:39:33,300
the chances are low that
we land on the same card.
649
00:39:33,300 --> 00:39:35,990
But there's a lot of chances.
650
00:39:35,990 --> 00:39:37,610
And if the number of chances
is a lot more than 1 over
652
00:39:37,610 --> 00:39:40,720
the probability of
landing on each other,
652
00:39:40,720 --> 00:39:44,580
we've got Murphy's law.
653
00:39:44,580 --> 00:39:49,600
If you've got a lot of chances
and each one is not less likely
654
00:39:49,600 --> 00:39:51,690
than 1 over the number
of chances,
655
00:39:51,690 --> 00:39:53,880
then we expect to have
a certain bunch of times
656
00:39:53,880 --> 00:39:56,050
that we're going to
land on each other.
657
00:39:56,050 --> 00:39:59,970
And therefore, a very
high probability we do.
658
00:39:59,970 --> 00:40:02,240
Now that was a little hand-wavy.
659
00:40:06,430 --> 00:40:09,210
And in fact, there's a
reason it was hand-wavy.
660
00:40:09,210 --> 00:40:13,510
Why doesn't Murphy's law
really apply in this case,
661
00:40:13,510 --> 00:40:14,711
really mathematically apply?
662
00:40:14,711 --> 00:40:15,210
Yeah.
663
00:40:18,399 --> 00:40:20,190
AUDIENCE: They're not
mutually independent.
664
00:40:20,190 --> 00:40:23,940
Once you draw one card,
it's not coming back.
665
00:40:23,940 --> 00:40:25,240
PROFESSOR: That's correct.
666
00:40:25,240 --> 00:40:29,640
And it means that the knowledge
that we haven't collided yet
667
00:40:29,640 --> 00:40:32,670
tells me something about the
cards we've seen-- not a lot,
668
00:40:32,670 --> 00:40:34,670
but something maybe.
669
00:40:34,670 --> 00:40:37,080
And it's a finite
deck which tells me
670
00:40:37,080 --> 00:40:39,180
something about the
cards that are coming.
671
00:40:39,180 --> 00:40:40,840
And it might influence
the probability
672
00:40:40,840 --> 00:40:42,710
that we land together,
the next person
673
00:40:42,710 --> 00:40:45,390
who's jumping on the deck.
674
00:40:45,390 --> 00:40:48,870
And so the events of-- like
for example, in this case,
675
00:40:48,870 --> 00:40:54,930
we let A i be the event of a
collision on the i-th jump.
676
00:40:54,930 --> 00:40:58,000
And there's about 20 jumps in
this game, 10 for each of us
677
00:40:58,000 --> 00:40:58,500
expected.
678
00:41:01,230 --> 00:41:06,935
So A i is the event that we
collide on the i-th jump.
679
00:41:11,420 --> 00:41:15,710
These events are not necessarily
mutually independent.
680
00:41:15,710 --> 00:41:19,900
Now if I had an
infinite deck or a deck
681
00:41:19,900 --> 00:41:24,360
with replacement so every card
is equally likely to come next
682
00:41:24,360 --> 00:41:26,230
no matter what's
come in the past,
683
00:41:26,230 --> 00:41:28,630
now you can start getting
some mutual independence here.
684
00:41:28,630 --> 00:41:31,880
And then you could start
really applying the theorem.
685
00:41:31,880 --> 00:41:34,730
Now in this case, you don't
expect 10 things to happen.
686
00:41:34,730 --> 00:41:36,190
You expect a few.
687
00:41:36,190 --> 00:41:38,945
But that's good enough
that, in fact-- so
688
00:41:38,945 --> 00:41:40,990
we did a computer
simulation once
689
00:41:40,990 --> 00:41:44,300
and I got about a 90%
chance that we'll all
690
00:41:44,300 --> 00:41:45,970
be on the same card.
691
00:41:45,970 --> 00:41:48,640
So I have a pretty good chance
that I'm going to guess right.
692
00:41:48,640 --> 00:41:51,410
And so far, I haven't
guessed wrong.
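That simulation is easy to redo. Here is a Monte Carlo sketch of the trick (my own reconstruction of the counting rule described above; the 90% figure is the lecture's, and the trial count here is just illustrative):

```python
import random

# Card values: ace = 1, 2-9 face value, and 10s, face cards,
# and jokers all count as 1, as in the trick.
DECK = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 1, 1, 1] * 4 + [1, 1]  # 52 + 2 jokers

def final_card(deck, start):
    """Kruskal count: from starting number `start` (1-9), return the
    index of the last card you land on."""
    pos = start - 1              # your first card (0-indexed)
    while pos + deck[pos] < len(deck):
        pos += deck[pos]         # jump ahead by the card's value
    return pos

def all_converge(deck):
    """Do all nine starting numbers end on the same final card?"""
    return len({final_card(deck, s) for s in range(1, 10)}) == 1

random.seed(1)
trials = 2000
deck = DECK[:]
hits = 0
for _ in range(trials):
    random.shuffle(deck)
    hits += all_converge(deck)
print(hits / trials)   # fraction of shuffles where all nine starts agree
```

On a short toy deck like the nine-card example on the board, the starts need not all converge; it is the full 54-card deck, with so many cards counting as 1, that pushes the agreement rate up near the quoted 90%.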
693
00:41:51,410 --> 00:41:53,670
But it will happen
some day that we'll
694
00:41:53,670 --> 00:41:55,180
start with a different
first number,
695
00:41:55,180 --> 00:41:57,450
and we will miss at
the end because they'll
696
00:41:57,450 --> 00:41:59,954
be two possible outcomes.
697
00:41:59,954 --> 00:42:01,620
Just the way it works
out with 52 cards.
698
00:42:01,620 --> 00:42:03,320
Now of course, if
we have more cards
699
00:42:03,320 --> 00:42:07,020
or I made more things be
1's instead of 9's, say,
700
00:42:07,020 --> 00:42:10,290
my odds go up because the
number of events I've got,
701
00:42:10,290 --> 00:42:13,970
the number of chances
to collide, increases.
702
00:42:13,970 --> 00:42:17,920
And the chance of hitting
when I jump also increases.
703
00:42:17,920 --> 00:42:19,860
Any questions on that game?
704
00:42:23,490 --> 00:42:28,090
So the point of all this is
that if the expected number
705
00:42:28,090 --> 00:42:31,730
of events to occur
is small, then
706
00:42:31,730 --> 00:42:33,400
it's an upper bound
on the probability
707
00:42:33,400 --> 00:42:38,550
that something happens, whether
they're independent or not.
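That upper bound needs no independence at all. A tiny sketch: take three fully dependent events, three copies of the same rare event, and the bound still holds.

```python
from fractions import Fraction

# Three copies of the same event, each with probability 1/10.
# They are totally dependent, yet the bound still holds:
p = Fraction(1, 10)
expected_number = 3 * p    # sum of the probabilities = 3/10
prob_at_least_one = p      # all three occur together or not at all

print(prob_at_least_one <= expected_number)  # True
```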
708
00:42:38,550 --> 00:42:42,310
If the expected number of events
to occur is bigger than 1,
709
00:42:42,310 --> 00:42:46,390
large, and if the events
are mutually independent,
710
00:42:46,390 --> 00:42:49,140
then you can be sure
one of those events
711
00:42:49,140 --> 00:42:52,260
is going to occur-- very, very
likely one of them will occur.
712
00:42:52,260 --> 00:42:54,690
And that's Murphy's law.
713
00:42:54,690 --> 00:42:57,670
Any questions about
numbers of events to occur?
714
00:43:01,930 --> 00:43:05,010
We'll talk more about the
probability in the numbers
715
00:43:05,010 --> 00:43:09,055
of events that occur next time.
716
00:43:09,055 --> 00:43:10,430
Before we do that,
I want to talk
717
00:43:10,430 --> 00:43:12,725
about some more useful
facts about expectation.
718
00:43:16,100 --> 00:43:20,020
Now we know from
linearity of expectation
719
00:43:20,020 --> 00:43:24,330
that the expected value of
a sum of random variables
720
00:43:24,330 --> 00:43:30,550
is the sum of the expected
values of the random variables.
721
00:43:30,550 --> 00:43:34,600
Now we're going to look at the
expected value of a product
722
00:43:34,600 --> 00:43:36,660
of random variables.
723
00:43:36,660 --> 00:43:38,965
And it turns out there's
a very nice rule for that.
724
00:43:41,530 --> 00:43:46,845
Theorem 4, and it's the
product rule for expectation.
725
00:43:53,010 --> 00:43:57,000
And it says that for-- if
your random variables are
726
00:43:57,000 --> 00:44:09,140
independent, R1 and
R2 are independent,
727
00:44:09,140 --> 00:44:12,855
then the expected value
of their product, also
728
00:44:12,855 --> 00:44:17,510
a random variable, is simply the
product of the expected values.
729
00:44:20,150 --> 00:44:23,650
So it's sort of the
equivalent thing
730
00:44:23,650 --> 00:44:26,672
to linearity of expectation,
except we're doing products.
731
00:44:26,672 --> 00:44:27,755
And you need independence.
732
00:44:31,781 --> 00:44:34,280
Now the proof of this is not
too hard, and it's in the book.
733
00:44:34,280 --> 00:44:37,061
So we're not going
to do it in class.
734
00:44:37,061 --> 00:44:38,185
But we can give an example.
735
00:44:41,030 --> 00:44:52,820
Say we roll two six-sided
fair and independent dice.
736
00:44:58,230 --> 00:45:02,130
And I want to know what's the
expected product of the dice.
737
00:45:10,900 --> 00:45:18,720
So we're going to let R1 be
the value on the first die,
738
00:45:18,720 --> 00:45:22,355
and R2 would be the
value on the second one.
739
00:45:25,060 --> 00:45:31,670
And the expected
value of the product
740
00:45:31,670 --> 00:45:33,320
is the product of
the expectations.
741
00:45:37,910 --> 00:45:43,160
And we already know the expected
value of a single die is 7/2.
742
00:45:43,160 --> 00:45:49,260
So we get 7/2 times 7/2
is 49/4 or 12 and 1/4.
743
00:45:49,260 --> 00:45:55,400
So it's easy to compute the
expected product of two dice.
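The shortcut can be checked against the brute-force average over all 36 equally likely outcomes:

```python
from fractions import Fraction

# Brute force: average of i*j over all 36 equally likely rolls.
brute = sum(Fraction(i * j, 36) for i in range(1, 7) for j in range(1, 7))

# Product rule: E[R1] * E[R2] = 7/2 * 7/2.
shortcut = Fraction(7, 2) * Fraction(7, 2)

print(brute, shortcut)    # 49/4 49/4
print(brute == shortcut)  # True
```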
744
00:45:55,400 --> 00:45:57,200
Any questions about that?
745
00:46:00,910 --> 00:46:06,200
Much easier than looking at all
36 outcomes to use this rule.
746
00:46:06,200 --> 00:46:11,350
Now what if the dice were
rigged, glued together somehow
747
00:46:11,350 --> 00:46:12,790
so they always came up the same?
748
00:46:15,780 --> 00:46:18,540
Would the expected
product be 12 and 1/4 then?
749
00:46:21,990 --> 00:46:22,610
No?
750
00:46:22,610 --> 00:46:24,460
Why not?
751
00:46:24,460 --> 00:46:26,060
Why wouldn't it be the case?
752
00:46:29,670 --> 00:46:31,890
Why isn't the
expected value of R1
753
00:46:31,890 --> 00:46:37,060
squared the square of
the expected value of R1?
754
00:46:37,060 --> 00:46:40,174
Isn't that what this says?
755
00:46:40,174 --> 00:46:41,150
AUDIENCE: Independent.
756
00:46:41,150 --> 00:46:42,730
PROFESSOR: They're
not independent.
757
00:46:42,730 --> 00:46:47,120
R1 is not independent of R1.
In fact, it's the same thing.
758
00:46:47,120 --> 00:46:50,490
And you need independence
for that to be the case.
759
00:46:50,490 --> 00:47:02,140
So a non example, the
expected value of R1 times R1
760
00:47:02,140 --> 00:47:07,950
is the expected
value of R1 squared.
761
00:47:07,950 --> 00:47:10,210
And to do that, we got
to go back to the basics.
762
00:47:10,210 --> 00:47:12,805
We're taking the six
possible values of R1.
763
00:47:12,805 --> 00:47:15,400
i equals 1 to 6.
764
00:47:15,400 --> 00:47:18,550
i squared, because
we're squaring it,
765
00:47:18,550 --> 00:47:21,060
times the probability
R1 equals i.
766
00:47:23,660 --> 00:47:26,020
And each of those
probabilities is 1/6.
767
00:47:26,020 --> 00:47:36,470
So we get 1/6 times 1 plus 4
plus 9 plus 16 plus 25 plus 36.
768
00:47:36,470 --> 00:47:42,250
And if you add all that
up you get 15 and 1/6,
769
00:47:42,250 --> 00:47:50,180
which is not 3 and 1/2
squared, the square of the
770
00:47:50,180 --> 00:47:54,340
expected value of R1.
771
00:47:54,340 --> 00:47:56,660
So the expected
value of the square
772
00:47:56,660 --> 00:48:01,150
is not necessarily the
square of the expectation.
773
00:48:01,150 --> 00:48:03,774
Because a random variable
is not independent of itself
774
00:48:03,774 --> 00:48:04,273
generally.
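The non-example works out as follows, computing both quantities for the same fair die:

```python
from fractions import Fraction

# E[R^2] for one fair die: sum of i^2 * (1/6).
e_of_square = sum(Fraction(i * i, 6) for i in range(1, 7))

# (E[R])^2 = (7/2)^2.
square_of_e = Fraction(7, 2) ** 2

print(e_of_square)                 # 91/6, i.e. 15 and 1/6
print(square_of_e)                 # 49/4, i.e. 12 and 1/4
print(e_of_square == square_of_e)  # False
```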
775
00:48:07,450 --> 00:48:08,190
OK.
776
00:48:08,190 --> 00:48:10,270
Any questions there?
777
00:48:12,492 --> 00:48:14,075
There's a couple of
quick corollaries.
778
00:48:19,250 --> 00:48:21,590
The first is you take
this rule and apply it
779
00:48:21,590 --> 00:48:26,067
to many random variables as long
as they're mutually independent.
780
00:48:26,067 --> 00:48:40,580
So if R1, R2, out to R n
are mutually independent,
781
00:48:40,580 --> 00:48:48,620
then the expected
value of their product
782
00:48:48,620 --> 00:48:50,510
is the product of
the expected values.
783
00:48:59,720 --> 00:49:03,620
And the proof is just by
induction on the number
784
00:49:03,620 --> 00:49:04,550
of random variables.
785
00:49:04,550 --> 00:49:06,439
So that's pretty easy.
786
00:49:06,439 --> 00:49:07,730
There's another easy corollary.
787
00:49:11,718 --> 00:49:19,670
And that says, for any
constants, constant values,
788
00:49:19,670 --> 00:49:28,150
a and b, and any
random variable R,
789
00:49:28,150 --> 00:49:35,460
the expected value of a times
R plus b is simply a times
790
00:49:35,460 --> 00:49:38,760
the expected value of R plus b.
791
00:49:41,360 --> 00:49:43,860
And the reason that's
true-- well, the sum
792
00:49:43,860 --> 00:49:47,300
works because of linearity
of expectation for the sum.
793
00:49:47,300 --> 00:49:50,030
You can think of b as a random
variable that just always
794
00:49:50,030 --> 00:49:52,950
has the value b.
795
00:49:52,950 --> 00:49:56,862
And the a comes out
in front because you
796
00:49:56,862 --> 00:49:58,570
can think of it as a
random variable that
797
00:49:58,570 --> 00:50:01,225
always has a value a.
798
00:50:01,225 --> 00:50:04,240
And that's independent of
any other random variable
799
00:50:04,240 --> 00:50:06,630
because it never changes.
800
00:50:06,630 --> 00:50:10,325
So by the product rule,
the a can come out.
801
00:50:10,325 --> 00:50:11,700
Now you've got to
prove all that.
802
00:50:11,700 --> 00:50:14,159
But it's not too hard and
not especially interesting.
803
00:50:14,159 --> 00:50:15,200
So we won't do that here.
804
00:50:18,372 --> 00:50:19,455
Any questions about those?
805
00:50:27,140 --> 00:50:28,530
So we've got a
rule for computing
806
00:50:28,530 --> 00:50:32,170
the sum of random variables,
a rule for the product
807
00:50:32,170 --> 00:50:34,420
of random variables.
808
00:50:34,420 --> 00:50:39,350
What about a rule for the
ratio of random variables?
809
00:50:39,350 --> 00:50:40,280
Let's look at that.
810
00:50:45,330 --> 00:50:47,600
So is this the corollary?
811
00:50:50,490 --> 00:50:52,890
In fact, let's take the
inverse of a random variable.
812
00:50:52,890 --> 00:50:57,690
Is the expected value
of 1/R equal to 1
813
00:50:57,690 --> 00:51:04,510
over the expected value of
R for any random variable R?
814
00:51:04,510 --> 00:51:06,560
Some folks saying yes.
815
00:51:06,560 --> 00:51:09,240
Some saying no.
816
00:51:09,240 --> 00:51:10,350
What do you think?
817
00:51:10,350 --> 00:51:11,445
Is that true?
818
00:51:14,610 --> 00:51:15,790
Oh, got a mix.
819
00:51:15,790 --> 00:51:17,910
How many say yes?
820
00:51:17,910 --> 00:51:19,385
How many say no?
821
00:51:19,385 --> 00:51:20,500
Oh, more no's.
822
00:51:20,500 --> 00:51:22,940
Somebody tell me
why that's not true.
823
00:51:22,940 --> 00:51:26,058
Who would like to
give me an example?
824
00:51:26,058 --> 00:51:31,345
Give us an example there
that'll be very convincing.
825
00:51:31,345 --> 00:51:33,954
Yeah?
826
00:51:33,954 --> 00:51:36,370
AUDIENCE: I don't think it's
one that would be immediately
827
00:51:36,370 --> 00:51:41,558
obvious, but I think if R is
the result of the roll of a die,
828
00:51:41,558 --> 00:51:43,900
I don't think it works out.
829
00:51:43,900 --> 00:51:47,060
PROFESSOR: So it's 50
chance of-- oh, I see.
830
00:51:47,060 --> 00:51:53,159
So I take the average of 1/i--
that's sort of hard to compute.
831
00:51:53,159 --> 00:51:55,200
I got to do [INAUDIBLE]
the sixth harmonic number
832
00:51:55,200 --> 00:51:56,825
and then invert it.
833
00:51:56,825 --> 00:52:00,439
There's an easier way to
show that this is false.
834
00:52:00,439 --> 00:52:00,938
Yeah?
835
00:52:00,938 --> 00:52:02,869
AUDIENCE: The expected
value equals 0?
836
00:52:02,869 --> 00:52:03,535
PROFESSOR: Yeah.
837
00:52:03,535 --> 00:52:07,330
The expected value
equals 0 which
838
00:52:07,330 --> 00:52:12,520
could happen if R is plus 1
or minus 1 equally likely.
839
00:52:12,520 --> 00:52:15,840
So here's an example here.
840
00:52:15,840 --> 00:52:23,640
So R equals 1 with
probability 1/2, and minus 1
841
00:52:23,640 --> 00:52:25,500
with probability 1/2.
842
00:52:25,500 --> 00:52:27,835
So the expected value of R is 0.
843
00:52:32,370 --> 00:52:33,370
So that blows up.
844
00:52:33,370 --> 00:52:34,625
That's infinity.
845
00:52:34,625 --> 00:52:36,000
What's the expected
value of 1/R?
846
00:52:41,320 --> 00:52:44,300
Well, 1/1 and 1 over
minus 1, it's the same.
847
00:52:44,300 --> 00:52:46,300
It equals 0.
848
00:52:46,300 --> 00:52:48,930
And this would say 0 equals 1/0.
849
00:52:48,930 --> 00:52:50,030
That's not true.
850
00:52:50,030 --> 00:52:51,980
So this is false.
851
00:52:51,980 --> 00:52:54,230
It is not true for
every random variable.
852
00:52:57,350 --> 00:53:00,180
So once you see this example,
just obviously not true.
853
00:53:00,180 --> 00:53:02,010
In fact, there's very
few random variables
854
00:53:02,010 --> 00:53:06,440
for which this is true, even
an indicator random variable.
855
00:53:06,440 --> 00:53:11,290
So it's 1 with probability 1/2
and 0 with probability 1/2.
856
00:53:11,290 --> 00:53:14,950
Then the expected value
of 1/R is infinite.
857
00:53:14,950 --> 00:53:16,790
1 over the expected
value of R is 2.
858
00:53:20,940 --> 00:53:24,570
So it's clearly not true.
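The die example the student suggested also works, once you grind through the sixth harmonic number:

```python
from fractions import Fraction

# Fair six-sided die: E[1/R] = (1/6) * (1/1 + 1/2 + ... + 1/6).
e_of_inverse = sum(Fraction(1, i) for i in range(1, 7)) / 6

# 1 / E[R] = 1 / (7/2) = 2/7.
inverse_of_e = 1 / Fraction(7, 2)

print(e_of_inverse)                  # 49/120, about 0.408
print(inverse_of_e)                  # 2/7, about 0.286
print(e_of_inverse == inverse_of_e)  # False
```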
859
00:53:24,570 --> 00:53:27,066
Let's do another one.
860
00:53:34,730 --> 00:53:37,090
What about this
potential corollary?
861
00:53:42,310 --> 00:53:53,530
Given independent random
variables R and T,
862
00:53:53,530 --> 00:54:02,570
if the expected value
of R/T is bigger than 1,
863
00:54:02,570 --> 00:54:07,630
then the expected value of R
is bigger than the expected
864
00:54:07,630 --> 00:54:15,970
value of T. And let me even give
you a potential proof of this,
865
00:54:15,970 --> 00:54:18,960
see if you like this proof.
866
00:54:18,960 --> 00:54:22,270
Well, let's assume the expected
value of R/T is bigger than 1.
867
00:54:25,970 --> 00:54:39,620
And let's multiply both sides
by the expected value of T.
868
00:54:39,620 --> 00:54:42,530
And well, the product rule
says that this is just
869
00:54:42,530 --> 00:54:48,830
the expected value
of R/T times T, which
870
00:54:48,830 --> 00:54:51,990
is just the expected value
of R because the T's cancel.
871
00:55:02,654 --> 00:55:03,570
So I gave you a proof.
872
00:55:06,370 --> 00:55:09,760
Anybody have any
quibbles with this proof?
873
00:55:15,172 --> 00:55:15,672
Yeah?
874
00:55:19,050 --> 00:55:20,330
That's a big problem.
875
00:55:20,330 --> 00:55:23,680
R/T is not independent
of T. [INAUDIBLE]
876
00:55:23,680 --> 00:55:28,240
if T is very big, it's likely
that R/T is small.
877
00:55:28,240 --> 00:55:30,280
So we can't do that step.
878
00:55:30,280 --> 00:55:34,950
We can't use the independence
here to go from here to here.
879
00:55:34,950 --> 00:55:36,030
That's wrong.
880
00:55:36,030 --> 00:55:39,474
There's actually another
big problem with this proof.
881
00:55:39,474 --> 00:55:40,640
Anybody see another problem?
882
00:55:40,640 --> 00:55:41,140
Yeah?
883
00:55:43,990 --> 00:55:44,880
Yeah.
884
00:55:44,880 --> 00:55:47,840
If the expected value
of T is negative,
885
00:55:47,840 --> 00:55:50,440
I would end up flipping the inequality.
886
00:55:50,440 --> 00:55:52,030
So that step's wrong.
887
00:55:52,030 --> 00:55:53,600
So this is a pretty lousy proof.
888
00:55:53,600 --> 00:55:55,900
Every step is wrong.
889
00:55:55,900 --> 00:55:58,170
So this is not a
good one to use.
890
00:55:58,170 --> 00:56:04,780
And in fact, the
theorem is wrong.
891
00:56:04,780 --> 00:56:07,100
Not only is the proof wrong,
but the result is wrong.
892
00:56:07,100 --> 00:56:08,950
It's not true.
893
00:56:08,950 --> 00:56:11,890
And we can see examples.
894
00:56:11,890 --> 00:56:13,840
We'll do some
examples in a minute.
895
00:56:13,840 --> 00:56:16,350
Now the amazing thing is that
despite the fact that this
896
00:56:16,350 --> 00:56:21,610
is just blatantly wrong,
it is used all the time
897
00:56:21,610 --> 00:56:23,270
in research papers.
898
00:56:23,270 --> 00:56:26,280
And let me give you
a famous example.
899
00:56:26,280 --> 00:56:32,430
This is a case of, actually, a
pretty well-known paper written
900
00:56:32,430 --> 00:56:36,090
by some very famous computer
science professors at Berkeley.
901
00:56:38,640 --> 00:56:42,130
And let me show
you what they did.
902
00:56:42,130 --> 00:56:45,950
And this is so that
you will never do this.
903
00:56:45,950 --> 00:56:54,870
They were trying to compare
two instruction sets way back
904
00:56:54,870 --> 00:56:55,510
in the day.
905
00:56:55,510 --> 00:56:58,750
And they were comparing
the RISC architecture
906
00:56:58,750 --> 00:57:01,740
to something called the Z8002.
907
00:57:01,740 --> 00:57:04,260
And they were
proponents of RISC.
908
00:57:04,260 --> 00:57:07,600
And they were using this to
prove that it was a better way
909
00:57:07,600 --> 00:57:09,290
to do things.
910
00:57:09,290 --> 00:57:14,290
So they had a bunch of benchmark
problems, probably 20 or so
911
00:57:14,290 --> 00:57:14,940
in the paper.
912
00:57:14,940 --> 00:57:16,520
And I'm not going
to do all 20 here.
913
00:57:16,520 --> 00:57:19,210
I'm going to give you a flavor
for what the data showed.
914
00:57:19,210 --> 00:57:27,110
And then they looked at
the code size for RISC
915
00:57:27,110 --> 00:57:32,182
and the other guys, Z8002.
916
00:57:32,182 --> 00:57:33,390
And then they took the ratio.
917
00:57:40,020 --> 00:57:44,720
So the first problem was
called E-string search,
918
00:57:44,720 --> 00:57:45,440
whatever that is.
919
00:57:45,440 --> 00:57:48,820
But it was some benchmark
problem out there at the time.
920
00:57:48,820 --> 00:57:51,510
And the code length
on RISC was 150
921
00:57:51,510 --> 00:57:53,450
say-- I've changed these
numbers a little bit
922
00:57:53,450 --> 00:57:54,820
to make them simpler.
923
00:57:54,820 --> 00:57:58,930
Code length here and
the Z8002 is 120.
924
00:57:58,930 --> 00:58:01,270
The ratio is 0.8.
925
00:58:01,270 --> 00:58:05,020
So for this problem, you're
trying to get low, short code.
926
00:58:05,020 --> 00:58:08,410
So this was a better way
to go to support that.
927
00:58:08,410 --> 00:58:12,640
And they had something
called F-bit test.
928
00:58:12,640 --> 00:58:14,530
And here you have 120 lines.
929
00:58:14,530 --> 00:58:16,750
Here's 180.
930
00:58:16,750 --> 00:58:20,260
So in this case, RISC is better.
931
00:58:20,260 --> 00:58:26,500
So the ratio of that way
to this way would be 1.5.
932
00:58:26,500 --> 00:58:28,465
And they had computing
an Ackermann function.
933
00:58:31,540 --> 00:58:34,260
And that was 150 and 300.
934
00:58:34,260 --> 00:58:39,390
So a big win for
RISC, ratio of 2.
935
00:58:39,390 --> 00:58:45,690
And then they had a thing called
recursive sorting problem.
936
00:58:45,690 --> 00:58:47,530
This is a hard problem.
937
00:58:47,530 --> 00:58:50,040
There's 2,800 lines on RISC.
938
00:58:50,040 --> 00:58:54,190
1400 on the old way.
939
00:58:54,190 --> 00:58:57,200
Ratio of 0.5.
940
00:58:57,200 --> 00:59:00,400
And there was a bunch more which
I'm not going to go through.
941
00:59:00,400 --> 00:59:04,470
But their analysis, what they
did is they took the ratio,
942
00:59:04,470 --> 00:59:06,380
and then they averaged it.
943
00:59:06,380 --> 00:59:12,715
And so when you do this, you
sum them up-- 2.3, 4.3, 4.8--
944
00:59:12,715 --> 00:59:19,250
and 4.8/4 is 1.2.
945
00:59:19,250 --> 00:59:23,890
So the conclusion is that on
average code in this framework
946
00:59:23,890 --> 00:59:27,400
is 20% longer than
the code on RISC.
947
00:59:27,400 --> 00:59:30,100
Therefore, clearly, RISC
is a better way to go.
948
00:59:30,100 --> 00:59:33,020
Your code on average
will be shorter.
949
00:59:33,020 --> 00:59:38,080
Using the Z8002 on average,
the code will be 20% longer.
950
00:59:38,080 --> 00:59:39,246
Makes perfect sense, right?
951
00:59:42,162 --> 00:59:44,840
In fact, this is one of
the most common things
952
00:59:44,840 --> 00:59:47,133
that is done when people
are comparing two systems.
953
00:59:50,514 --> 00:59:56,420
Now just one problem with
this approach, and that's
954
00:59:56,420 --> 00:59:59,840
that it's completely
bogus, completely bogus.
955
00:59:59,840 --> 01:00:02,820
You cannot conclude--
let's make this conclusion.
956
01:00:02,820 --> 01:00:13,090
So their conclusion, they
concluded that Z8002 programs
957
01:00:13,090 --> 01:00:17,905
are 20% longer on average.
958
01:00:24,890 --> 01:00:28,110
Everybody understands
the reasoning why, right?
959
01:00:28,110 --> 01:00:31,875
Take the ratio of all the
test cases, average them up.
960
01:00:31,875 --> 01:00:33,410
Then you get the average ratio.
961
01:00:38,380 --> 01:00:41,030
Now there could be some
hint why this is bogus.
962
01:00:41,030 --> 01:00:45,000
If I just looked at--
I took and summed
963
01:00:45,000 --> 01:00:51,670
these numbers, if I add all
those numbers up, I get 3,220.
964
01:00:51,670 --> 01:00:55,570
And all these, I get 2,000.
965
01:00:55,570 --> 01:00:59,500
RISC code is not looking
shorter if I do that.
966
01:00:59,500 --> 01:01:02,340
Looking longer.
967
01:01:02,340 --> 01:01:08,200
But all that loss
for RISC is in this one problem.
968
01:01:08,200 --> 01:01:10,960
And maybe it's not
fair to do that.
969
01:01:10,960 --> 01:01:14,537
And that's why when people
have data like this,
970
01:01:14,537 --> 01:01:15,620
they just take the ratios.
971
01:01:15,620 --> 01:01:20,110
Because now it would be-- if
I just took the average code
972
01:01:20,110 --> 01:01:22,290
length and took
the ratio of that,
973
01:01:22,290 --> 01:01:23,950
it's not fair because
one problem just
974
01:01:23,950 --> 01:01:26,409
wiped out the whole thing.
975
01:01:26,409 --> 01:01:27,700
I might as well not even do it.
976
01:01:27,700 --> 01:01:29,880
And they want every
problem to count equally.
977
01:01:29,880 --> 01:01:32,210
And that's why they
take the ratio,
978
01:01:32,210 --> 01:01:34,840
to make them all count equally.
979
01:01:34,840 --> 01:01:36,330
Let's do one more thing here.
980
01:01:36,330 --> 01:01:44,130
Let's look at what happens
if we take the ratio of RISC
981
01:01:44,130 --> 01:01:47,356
to the Z8002.
982
01:01:47,356 --> 01:01:48,355
Make some room for that.
983
01:01:48,355 --> 01:01:54,630
So this is-- this column
is Z8002 over RISC.
984
01:01:54,630 --> 01:02:00,930
What if I just did this--
RISC over the Z8002, I mean--
985
01:02:00,930 --> 01:02:03,754
the answer should come
out to 1/1.2, right?
986
01:02:03,754 --> 01:02:05,420
That's what we expect,
because I've just
987
01:02:05,420 --> 01:02:07,820
been turning it upside down.
988
01:02:07,820 --> 01:02:10,490
Well, I get 1.25 here.
989
01:02:10,490 --> 01:02:12,100
These are just being inverted.
990
01:02:12,100 --> 01:02:14,386
Here I've got 2/3, 0.67.
991
01:02:14,386 --> 01:02:15,010
Here I get 1/2.
992
01:02:18,500 --> 01:02:19,440
Here I get 2.
993
01:02:22,400 --> 01:02:23,815
Let's add those up.
994
01:02:23,815 --> 01:02:32,250
I get 1.92, 2.42, 4.42.
995
01:02:32,250 --> 01:02:38,020
Divide by 4-- wow,
I get 1.1 something
996
01:02:38,020 --> 01:02:41,100
which says, well, that
on average, RISC is
997
01:02:41,100 --> 01:02:44,740
10% longer than the other one.
998
01:02:44,740 --> 01:02:50,430
So same analysis says
that RISC programs
999
01:02:50,430 --> 01:02:54,010
are 10% longer on average.
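With the four benchmark numbers from the board you can reproduce both "conclusions" in a few lines. A quick sketch (the list names are mine):

```python
# Code lengths from the four benchmarks as given in lecture:
# E-string search, F-bit test, Ackermann, recursive sort.
risc  = [150, 120, 150, 2800]
z8002 = [120, 180, 300, 1400]

# Average the ratios one way: Z8002 looks 20% longer.
avg_z_over_r = sum(z / r for r, z in zip(risc, z8002)) / len(risc)

# Average them the other way: now RISC looks about 10% longer.
# If ratios behaved like ordinary numbers, this would be 1/1.2.
avg_r_over_z = sum(r / z for r, z in zip(risc, z8002)) / len(risc)
```

The first average is 1.2 and the second is about 1.1 -- both exceed 1 on the same data.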
1000
01:02:57,430 --> 01:03:00,210
Now the beauty of
this method is you
1001
01:03:00,210 --> 01:03:05,410
can make any conclusion you
want seem reasonable, typically.
1002
01:03:05,410 --> 01:03:08,590
You could have the
exact same data.
1003
01:03:08,590 --> 01:03:13,330
And if you want RISC to look
better you do it this way.
1004
01:03:13,330 --> 01:03:17,720
If you want RISC to look
worse, you do it that way.
1005
01:03:17,720 --> 01:03:18,540
You see?
1006
01:03:18,540 --> 01:03:19,690
Is that possible?
1007
01:03:19,690 --> 01:03:21,470
Is it possible for
one to be 20% longer
1008
01:03:21,470 --> 01:03:24,026
than the other on average,
but the other be 10% longer
1009
01:03:24,026 --> 01:03:24,525
on average?
1010
01:03:27,040 --> 01:03:29,000
How many people think
that's possible?
1011
01:03:29,000 --> 01:03:31,440
We had some weird things
happen in this class,
1012
01:03:31,440 --> 01:03:33,650
but that's not possible.
1013
01:03:33,650 --> 01:03:34,830
That can't happen.
1014
01:03:34,830 --> 01:03:36,430
These conclusions
are both bogus.
1015
01:03:41,090 --> 01:03:44,210
Now I'm not teaching you
this so that later when
1016
01:03:44,210 --> 01:03:46,510
you're doing your PhD thesis
and it's down to the wire
1017
01:03:46,510 --> 01:03:49,250
and you need a conclusion
to be proved, good news.
1018
01:03:49,250 --> 01:03:51,240
You can prove it.
1019
01:03:51,240 --> 01:03:53,957
No matter what your conclusion
is, you can prove it.
1020
01:03:53,957 --> 01:03:55,290
That's not why we're doing this.
1021
01:03:55,290 --> 01:03:58,831
We're doing this so you can spot
the flaw in this whole setup
1022
01:03:58,831 --> 01:04:00,080
and that you'll never do this.
1023
01:04:00,080 --> 01:04:02,538
And you'll see it when other
people do because people do it
1024
01:04:02,538 --> 01:04:03,190
all the time.
1025
01:04:06,160 --> 01:04:09,460
So let's try to put some
formality under this
1026
01:04:09,460 --> 01:04:10,460
in terms of probability.
1027
01:04:10,460 --> 01:04:12,607
Because when you start
talking about averages,
1028
01:04:12,607 --> 01:04:15,190
you're really talking about
expectations of random variables.
1029
01:04:15,190 --> 01:04:19,970
So let's try to view this
as a probability problem
1030
01:04:19,970 --> 01:04:22,410
and see if we can shed
some light on what's
1031
01:04:22,410 --> 01:04:25,480
going on here because it
sure seemed reasonable.
1032
01:04:35,850 --> 01:04:41,890
So let's let x be the benchmark.
1033
01:04:41,890 --> 01:04:45,400
And maybe that'll be
something in the sample space,
1034
01:04:45,400 --> 01:04:47,280
an outcome in the sample space.
1035
01:04:47,280 --> 01:05:00,490
Let's let R x be the code
length for RISC on x and Z x
1036
01:05:00,490 --> 01:05:09,340
be the code length for
the other processor on x.
1037
01:05:09,340 --> 01:05:19,060
And then we'll define a
probability of seeing x.
1038
01:05:19,060 --> 01:05:21,560
That's our problem
we're looking at.
1039
01:05:21,560 --> 01:05:28,290
And typically, you might
assume that it's uniform,
1040
01:05:28,290 --> 01:05:30,790
the distribution there.
1041
01:05:30,790 --> 01:05:33,830
We need this to be able to
define an expected value
1042
01:05:33,830 --> 01:05:37,350
for R and for Z.
1043
01:05:37,350 --> 01:05:42,610
Now what they're doing in the
paper, what really is happening
1044
01:05:42,610 --> 01:05:50,230
here, is instead of this, they
have the expected value of Z/R
1045
01:05:50,230 --> 01:05:52,160
is 1.2.
1046
01:05:52,160 --> 01:05:54,880
That is what they can conclude.
1047
01:05:54,880 --> 01:06:01,630
That does not mean that
the expected value of Z
1048
01:06:01,630 --> 01:06:05,350
is 1.2 times the expected
value of R, which
1049
01:06:05,350 --> 01:06:10,200
is what they concluded--
that the Z8002 code is
1050
01:06:10,200 --> 01:06:12,430
20% longer than RISC code.
1051
01:06:12,430 --> 01:06:14,860
This is true.
1052
01:06:14,860 --> 01:06:16,160
That is not implied.
1053
01:06:19,070 --> 01:06:21,506
And why not?
1054
01:06:21,506 --> 01:06:26,909
That's just, actually, what
this corollary was doing.
1055
01:06:26,909 --> 01:06:28,450
Really, it's just
what we were-- they
1056
01:06:28,450 --> 01:06:33,100
made the same false assumption
as happened in the corollary.
1057
01:06:33,100 --> 01:06:37,210
You can't multiply both sides
here by the expected value of R
1058
01:06:37,210 --> 01:06:41,240
and then get the expected value
of Z. Of course, if you ask them,
1059
01:06:41,240 --> 01:06:43,066
they would have known that.
1060
01:06:43,066 --> 01:06:44,440
But they don't
even think through
1061
01:06:44,440 --> 01:06:46,523
that, they just used the
standard method of taking
1062
01:06:46,523 --> 01:06:49,290
the expected value of a ratio.
1063
01:06:49,290 --> 01:06:51,220
So this is fair to conclude.
1064
01:06:51,220 --> 01:06:58,340
But as we saw, the expected
value of R/Z was 1.1.
1065
01:06:58,340 --> 01:07:00,950
So both of these can be
true at the same time.
1066
01:07:00,950 --> 01:07:01,960
That's fine.
1067
01:07:01,960 --> 01:07:03,420
But you can't make
the conclusions
1068
01:07:03,420 --> 01:07:04,419
that they tried to make.
1069
01:07:08,000 --> 01:07:11,470
Here's another-- in
fact, in this case,
1070
01:07:11,470 --> 01:07:14,100
if we had a uniform
distribution,
1071
01:07:14,100 --> 01:07:21,396
the expected value of R
is 805 under the uniform distribution.
1072
01:07:21,396 --> 01:07:28,990
And the expected
value of Z is 500.
1073
01:07:28,990 --> 01:07:31,254
And that's all you
can conclude if you're
1074
01:07:31,254 --> 01:07:32,420
taking uniform distribution.
1075
01:07:32,420 --> 01:07:36,310
In which case, of course,
if they're promoting RISC,
1076
01:07:36,310 --> 01:07:39,240
well, you don't like
that conclusion.
1077
01:07:39,240 --> 01:07:41,739
So it's better to
get this one.
1078
01:07:41,739 --> 01:07:43,530
I don't think it was
intentional of course.
1079
01:07:43,530 --> 01:07:46,690
But it's nice that
it came out that way.
1080
01:07:49,430 --> 01:07:53,770
Here's another example that
really makes it painfully clear
1081
01:07:53,770 --> 01:07:57,060
why you never want to do this.
1082
01:07:57,060 --> 01:08:02,150
So a really simple case, just
generic variables R and Z.
1083
01:08:02,150 --> 01:08:08,550
And I got two problems only--
problem one, problem two.
1084
01:08:08,550 --> 01:08:11,020
R is 2 for problem
1, and Z is 1.
1085
01:08:11,020 --> 01:08:12,768
And they reverse on problem 2.
1086
01:08:15,396 --> 01:08:19,205
Z/R is 2 and 1/2.
1087
01:08:21,760 --> 01:08:23,955
R/Z, just the reverse.
1088
01:08:28,029 --> 01:08:36,200
Now the expected
value of R/Z here
1089
01:08:36,200 --> 01:08:39,260
is 2 plus 1/2 divided
by 2 is 1 and 1/4.
1090
01:08:42,810 --> 01:08:44,359
And what's the
expected value of Z/R?
1091
01:08:49,399 --> 01:08:50,859
The average of
these is 1 and 1/4,
1092
01:08:50,859 --> 01:08:53,470
what's the average of these?
1093
01:08:53,470 --> 01:08:54,410
Same thing, 1 and 1/4.
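The two-problem table can be checked exactly with rational arithmetic. A quick sketch, following the board:

```python
from fractions import Fraction

# Problem 1: R = 2, Z = 1.  Problem 2: R = 1, Z = 2.
# Each problem is equally likely.
R = [Fraction(2), Fraction(1)]
Z = [Fraction(1), Fraction(2)]

E_R_over_Z = sum(r / z for r, z in zip(R, Z)) / 2   # (2 + 1/2) / 2
E_Z_over_R = sum(z / r for r, z in zip(R, Z)) / 2   # (1/2 + 2) / 2

# Both come out to exactly 5/4 -- each ratio "beats"
# the other on average, which is why neither average
# tells you which system has shorter code.
```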
1094
01:08:58,020 --> 01:09:00,830
So never, ever take
averages of ratios
1095
01:09:00,830 --> 01:09:02,689
without really knowing
what you're doing.
1096
01:09:05,560 --> 01:09:07,470
Any questions?
1097
01:09:07,470 --> 01:09:09,960
Yeah.
1098
01:09:09,960 --> 01:09:12,640
AUDIENCE: What would be the
word explanation of the expected
1099
01:09:12,640 --> 01:09:14,240
value of Z/R?
1100
01:09:14,240 --> 01:09:15,430
What is that?
1101
01:09:15,430 --> 01:09:18,270
PROFESSOR: That is the
average of the ratio.
1102
01:09:21,270 --> 01:09:25,859
It is not the ratio
of the average.
1103
01:09:25,859 --> 01:09:27,744
They are very different things.
1104
01:09:27,744 --> 01:09:29,660
And you can see how you
get caught up in that.
1105
01:09:29,660 --> 01:09:31,060
You could see how you have
linearity of expectation,
1106
01:09:31,060 --> 01:09:33,010
you got the product
rule for expectation.
1107
01:09:33,010 --> 01:09:39,135
You do not have a rule that
says this implies that.
1108
01:09:39,135 --> 01:09:41,590
AUDIENCE: [INAUDIBLE]
1109
01:09:41,590 --> 01:09:42,465
PROFESSOR: Which two?
1110
01:09:42,465 --> 01:09:46,710
AUDIENCE: [INAUDIBLE] Z/R?
1111
01:09:46,710 --> 01:09:48,910
PROFESSOR: Well, in
this case, they're one.
1112
01:09:48,910 --> 01:09:50,968
I don't think that'll
be true in general.
1113
01:09:50,968 --> 01:09:53,990
AUDIENCE: Does that
give you information?
1114
01:09:53,990 --> 01:09:55,820
PROFESSOR: They give
you information.
1115
01:09:55,820 --> 01:10:00,020
That may not be the
information you want.
1116
01:10:00,020 --> 01:10:01,530
It wouldn't imply
that which is what
1117
01:10:01,530 --> 01:10:04,220
you're after in some sense.
1118
01:10:04,220 --> 01:10:06,140
But it gives you
some information.
1119
01:10:06,140 --> 01:10:08,170
It's the expected average ratio.
1120
01:10:08,170 --> 01:10:12,700
The problem is the human brain
goes right from there to here.
1121
01:10:12,700 --> 01:10:14,450
It's just you do.
1122
01:10:14,450 --> 01:10:17,400
It's hard to help
yourself from doing it.
1123
01:10:17,400 --> 01:10:20,510
And it's not true.
1124
01:10:20,510 --> 01:10:23,430
That's the problem.
1125
01:10:23,430 --> 01:10:26,170
We have a version of this in
one of the homework questions
1126
01:10:26,170 --> 01:10:27,420
which is true.
1127
01:10:27,420 --> 01:10:29,490
But it's a special
version of it where
1128
01:10:29,490 --> 01:10:33,030
you can say something positive.
1129
01:10:33,030 --> 01:10:36,020
Any questions about this?
1130
01:10:36,020 --> 01:10:39,180
So anybody ever shows
you an average of ratios,
1131
01:10:39,180 --> 01:10:42,120
you want the light to go
off and say, danger, danger.
1132
01:10:42,120 --> 01:10:44,682
Think what's happening here.
1133
01:10:44,682 --> 01:10:47,015
Or if you're ever analyzing
data to compare two systems.
1134
01:10:52,880 --> 01:10:55,540
So we talked a lot
about expectation, and seen
1135
01:10:55,540 --> 01:10:56,840
a lot of ways of computing it.
1136
01:10:56,840 --> 01:10:59,310
We've done a lot of examples.
1137
01:10:59,310 --> 01:11:01,420
For the rest of today
and for next time,
1138
01:11:01,420 --> 01:11:04,180
we're going to talk
about deviations
1139
01:11:04,180 --> 01:11:06,870
from the expected value.
1140
01:11:06,870 --> 01:11:09,830
Now for some random
variables, they
1141
01:11:09,830 --> 01:11:11,880
are very likely to
take on values that
1142
01:11:11,880 --> 01:11:13,850
are near their expectation.
1143
01:11:13,850 --> 01:11:17,510
For example, if
I flip 100 coins.
1144
01:11:17,510 --> 01:11:20,770
And say they're fair and
mutually independent.
1145
01:11:20,770 --> 01:11:24,389
We know that the expected
number of heads is 50.
1146
01:11:24,389 --> 01:11:25,930
Does anybody remember
the probability
1147
01:11:25,930 --> 01:11:30,990
of getting far from that, namely
having 25 or fewer heads or 75
1148
01:11:30,990 --> 01:11:32,564
or more heads?
1149
01:11:32,564 --> 01:11:33,480
Remember, we did that?
1150
01:11:33,480 --> 01:11:35,760
It was a couple of weeks ago?
1151
01:11:35,760 --> 01:11:39,334
Is it likely to have
25 or fewer heads?
1152
01:11:39,334 --> 01:11:41,000
AUDIENCE: It's less
than 1 in a million?
1153
01:11:41,000 --> 01:11:42,140
PROFESSOR: Less
than 1 in a million.
1154
01:11:42,140 --> 01:11:42,400
Yeah.
1155
01:11:42,400 --> 01:11:43,450
It was 1 in 5
million or something.
1156
01:11:43,450 --> 01:11:45,420
I don't know, some
horribly small number.
1157
01:11:45,420 --> 01:11:49,510
So if I flip 100 coins,
I expect to get 50 heads.
1158
01:11:49,510 --> 01:11:52,480
And I'm very likely to
get close to 50 heads.
1159
01:11:52,480 --> 01:11:56,080
I'm not going to be 25 off.
1160
01:11:56,080 --> 01:11:59,240
And then the example
we had in recitation,
1161
01:11:59,240 --> 01:12:02,350
you got a noisy
channel, and you expect
1162
01:12:02,350 --> 01:12:06,640
an error rate, 1% of your
10,000 bits to be corrupted.
1163
01:12:06,640 --> 01:12:08,830
The chance of
getting 2% corrupted
1164
01:12:08,830 --> 01:12:12,210
was like-- what was it, 2 to
the minus 60 or something?
1165
01:12:12,210 --> 01:12:17,710
Extremely unlikely to be
far from the expected value.
1166
01:12:17,710 --> 01:12:21,900
But there's other cases where
you are likely-- you could well
1167
01:12:21,900 --> 01:12:24,189
be far from the expected value.
1168
01:12:24,189 --> 01:12:25,730
Can anybody remember
an example we've
1169
01:12:25,730 --> 01:12:32,510
done where you are almost
surely way off your expected
1170
01:12:32,510 --> 01:12:35,700
value for a random variable?
1171
01:12:35,700 --> 01:12:39,730
Anybody remember an example
we did that has that feature?
1172
01:12:39,730 --> 01:12:41,239
AUDIENCE: The appetizer I think.
1173
01:12:41,239 --> 01:12:42,280
PROFESSOR: The appetizer.
1174
01:12:42,280 --> 01:12:43,460
Let's see.
1175
01:12:43,460 --> 01:12:49,350
Appetizers, you expect 1, but
you're almost certain to be 0.
1176
01:12:49,350 --> 01:12:51,290
Or actually, you're
almost certain to be 0,
1177
01:12:51,290 --> 01:12:53,560
and you have a
chance of being n.
1178
01:12:53,560 --> 01:12:55,204
So if you count 0
as being close to 1,
1179
01:12:55,204 --> 01:12:57,120
you're likely to be close
to your expectation.
1180
01:12:57,120 --> 01:13:01,690
Because you're likely to
be 0, and you expect 1.
1181
01:13:01,690 --> 01:13:04,851
Remember that noisy
channel problem--
1182
01:13:04,851 --> 01:13:06,600
not the noisy channel,
the latency problem
1183
01:13:06,600 --> 01:13:08,330
across the channel?
1184
01:13:08,330 --> 01:13:12,310
And we show that the expected
latency was infinite?
1185
01:13:12,310 --> 01:13:15,580
But 99% of the time you
had 10 milliseconds,
1186
01:13:15,580 --> 01:13:17,770
something like that?
1187
01:13:17,770 --> 01:13:20,020
There's an example where
almost all the time you
1188
01:13:20,020 --> 01:13:22,870
are far from your expectation
which is infinite.
1189
01:13:22,870 --> 01:13:25,380
So there are examples
that go both ways.
1190
01:13:28,250 --> 01:13:33,570
Now let's look at another
couple of examples
1191
01:13:33,570 --> 01:13:36,700
that'll motivate the
definition that measures this.
1192
01:13:41,160 --> 01:13:45,930
Let's say that we've got a simple
Bernoulli random variable where
1193
01:13:45,930 --> 01:13:51,250
the probability that
R is 1,000 is 1/2,
1194
01:13:51,250 --> 01:13:58,660
and the probability that
R is minus 1,000 is 1/2.
1195
01:13:58,660 --> 01:14:00,660
Then the expected
value of R is 0.
1196
01:14:05,440 --> 01:14:07,770
Similarly, we could
have another one, S,
1197
01:14:07,770 --> 01:14:12,540
where the probability
that S equals 1 is 1/2,
1198
01:14:12,540 --> 01:14:16,230
and the probability that
S equals minus 1 is 1/2.
1199
01:14:16,230 --> 01:14:18,110
And the expected
value of S is 0.
1200
01:14:20,820 --> 01:14:23,240
Now if this was a
betting game and we're
1201
01:14:23,240 --> 01:14:26,470
talking about dollars--
here's where you're
1202
01:14:26,470 --> 01:14:28,910
wagering $1,000, fair game.
1203
01:14:28,910 --> 01:14:31,550
Here's where you're wagering $1.
1204
01:14:31,550 --> 01:14:36,950
Now in this game--
both games are fair.
1205
01:14:36,950 --> 01:14:38,980
Expected value is 0.
1206
01:14:38,980 --> 01:14:43,070
But here you're likely to end
up near your expected value.
1207
01:14:43,070 --> 01:14:46,130
Here you're certain to be far by
some measure from your expected
1208
01:14:46,130 --> 01:14:47,290
value.
1209
01:14:47,290 --> 01:14:52,740
And in fact, if you were
offered to play a game,
1210
01:14:52,740 --> 01:14:56,270
you might have a real decision
as to which game you played.
1211
01:14:56,270 --> 01:15:00,990
If you like risk, you
might play that game.
1212
01:15:00,990 --> 01:15:05,000
If you're risk averse, maybe
you stick with this game
1213
01:15:05,000 --> 01:15:07,180
because what you could
lose would be less.
1214
01:15:10,460 --> 01:15:16,770
Now this motivates the
definition of the variance
1215
01:15:16,770 --> 01:15:20,700
because it helps mathematicians
distinguish between these two
1216
01:15:20,700 --> 01:15:22,398
cases with a simple statistic.
1217
01:15:33,530 --> 01:15:44,670
The variance of a
random variable R--
1218
01:15:44,670 --> 01:15:49,170
we'll denote it by
var, V-A-R, of R--
1219
01:15:49,170 --> 01:15:54,210
is defined as the expected
value of the random variable
1220
01:15:54,210 --> 01:16:01,250
minus its expected
value, all squared.
1221
01:16:01,250 --> 01:16:02,950
That's sort of a mouthful there.
1222
01:16:02,950 --> 01:16:04,075
So let's break it down.
1223
01:16:07,285 --> 01:16:11,240
This is the expected value
of R. This is the deviation
1224
01:16:11,240 --> 01:16:14,130
from the expected value.
1225
01:16:14,130 --> 01:16:16,630
So this is the
deviation from the mean.
1226
01:16:21,490 --> 01:16:22,770
Then we square it.
1227
01:16:25,590 --> 01:16:31,800
So that equals the
square of the deviation.
1228
01:16:31,800 --> 01:16:36,810
And then we take the
expected value of the square.
1229
01:16:36,810 --> 01:16:42,190
So the variance equals
the expected value
1230
01:16:42,190 --> 01:16:46,156
of the square of the deviation.
1231
01:16:50,829 --> 01:16:52,370
In other words, the
variance gives us
1232
01:16:52,370 --> 01:16:55,630
the average of the
squares of the amount
1233
01:16:55,630 --> 01:16:59,662
by which the random variable
deviates from its mean.
1234
01:16:59,662 --> 01:17:04,500
Now the idea behind this is
that if a random variable is
1235
01:17:04,500 --> 01:17:08,710
likely to deviate from its
mean, the variance will be high.
1236
01:17:08,710 --> 01:17:10,580
And if it's likely
to be near its mean,
1237
01:17:10,580 --> 01:17:12,180
the variance will be low.
1238
01:17:12,180 --> 01:17:15,805
And so variance can tell us
something about the expected
1239
01:17:15,805 --> 01:17:16,305
deviation.
1240
01:17:23,920 --> 01:17:26,835
So let's compute the
variance for R and S
1241
01:17:26,835 --> 01:17:29,220
and see what happens.
1242
01:17:29,220 --> 01:17:35,010
So with R minus the
expected value of R, well,
1243
01:17:35,010 --> 01:17:37,670
that is going to be
1,000, because the expected
1244
01:17:37,670 --> 01:17:42,420
value is 0, with
probability 1/2, and minus 1,000
1245
01:17:42,420 --> 01:17:44,770
with probability 1/2.
1246
01:17:44,770 --> 01:17:45,825
Then I square that.
1247
01:17:51,220 --> 01:17:58,430
Well, I square 1,000, I get a
million with probability 1/2.
1248
01:17:58,430 --> 01:18:00,700
And I square minus
1,000, I get a million
1249
01:18:00,700 --> 01:18:05,340
again with probability 1/2.
1250
01:18:05,340 --> 01:18:09,620
And so therefore,
the variance of R,
1251
01:18:09,620 --> 01:18:12,664
well, it's the expected value
of this, which is-- well,
1252
01:18:12,664 --> 01:18:13,580
it's always a million.
1253
01:18:13,580 --> 01:18:14,610
So it's just a million.
1254
01:18:18,650 --> 01:18:20,160
Big.
1255
01:18:20,160 --> 01:18:26,140
Now if I were to do this with S,
S minus the expected value of S
1256
01:18:26,140 --> 01:18:31,170
is 1 with probability 1/2,
minus 1 with probability 1/2.
1257
01:18:31,170 --> 01:18:39,290
If I square that, well,
I get 1 squared is 1,
1258
01:18:39,290 --> 01:18:42,670
minus 1 squared is 1.
1259
01:18:42,670 --> 01:18:47,280
And so the variance of S is
the expected value of this.
1260
01:18:47,280 --> 01:18:50,010
And that's just 1.
1261
01:18:50,010 --> 01:18:52,590
So a big difference
in the variance.
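Both variance computations can be written straight from the definition Var[X] = E[(X - E[X])^2]. A sketch (the helper name is mine):

```python
def variance(dist):
    """Var[X] = E[(X - E[X])**2] for a finite distribution
    given as (value, probability) pairs."""
    mean = sum(p * v for v, p in dist)
    return sum(p * (v - mean) ** 2 for v, p in dist)

# R is +/-1,000 with probability 1/2 each; S is +/-1 likewise.
# Both have mean 0, but their variances differ by a factor of a million.
var_R = variance([(1000, 0.5), (-1000, 0.5)])   # 1,000,000
var_S = variance([(1, 0.5), (-1, 0.5)])         # 1
```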
1262
01:18:52,590 --> 01:18:55,520
So the variance being different
tells us these random variables
1263
01:18:55,520 --> 01:18:58,110
are-- the distributions
are very different even
1264
01:18:58,110 --> 01:19:01,220
though their expected
values are the same.
1265
01:19:01,220 --> 01:19:02,870
And the guy with
big variance says,
1266
01:19:02,870 --> 01:19:06,920
hey, we're likely to
deviate from the mean here.
1267
01:19:06,920 --> 01:19:09,720
And so risk averse people
stay away from strategies
1268
01:19:09,720 --> 01:19:11,678
when they're investing
that have high variance.
1269
01:19:18,370 --> 01:19:24,710
Now does anybody have any idea
why we square the deviation?
1270
01:19:24,710 --> 01:19:26,850
Why don't we just-- why
didn't mathematicians
1271
01:19:26,850 --> 01:19:28,890
when they figured out
this stuff I don't know
1272
01:19:28,890 --> 01:19:30,931
how many centuries ago,
why didn't they just take
1273
01:19:30,931 --> 01:19:32,930
the expected deviation?
1274
01:19:32,930 --> 01:19:34,510
Why do the stupid
squaring thing?
1275
01:19:34,510 --> 01:19:37,110
That only is going
to complicate it?
1276
01:19:37,110 --> 01:19:44,250
Why don't we instead
compute the expected value
1277
01:19:44,250 --> 01:19:48,730
of R minus the mean?
1278
01:19:48,730 --> 01:19:51,620
Why didn't they do that
and call that the variance?
1279
01:19:51,620 --> 01:19:53,170
Yeah?
1280
01:19:53,170 --> 01:19:54,252
That's zero.
1281
01:19:54,252 --> 01:19:55,490
Yeah.
1282
01:19:55,490 --> 01:19:57,210
Because by linearity
of expectation,
1283
01:19:57,210 --> 01:19:59,670
that corollary
[? 4-2 ?] or whatever,
1284
01:19:59,670 --> 01:20:03,590
this is just the expected value
of R minus the expected value
1285
01:20:03,590 --> 01:20:09,030
of the expected value of R. The
expected value of a scalar is
1286
01:20:09,030 --> 01:20:09,925
just that scalar.
1287
01:20:13,400 --> 01:20:17,270
And that is 0.
1288
01:20:17,270 --> 01:20:20,570
So the expected
deviation from the mean
1289
01:20:20,570 --> 01:20:24,042
is 0 because of how
the mean is defined.
1290
01:20:24,042 --> 01:20:25,750
It's the midpoint,
the weighted midpoint.
1291
01:20:25,750 --> 01:20:28,320
The times you're high
cancel out the times you're
1292
01:20:28,320 --> 01:20:31,950
low if you got the mean right.
1293
01:20:31,950 --> 01:20:33,820
And so this is a
useless definition.
1294
01:20:33,820 --> 01:20:36,150
It's always 0.
1295
01:20:36,150 --> 01:20:40,262
So mathematicians had to do
something to capture this.
1296
01:20:40,262 --> 01:20:42,220
Now what would have been
the more logical thing
1297
01:20:42,220 --> 01:20:43,390
to do that is the next step.
1298
01:20:43,390 --> 01:20:44,910
This doesn't work,
but what would
1299
01:20:44,910 --> 01:20:47,730
you think the mathematicians
would've done?
1300
01:20:47,730 --> 01:20:51,670
Absolute value would have
made a lot of sense here.
1301
01:20:51,670 --> 01:20:54,370
Why didn't they do that?
1302
01:20:54,370 --> 01:20:56,440
Well, you could
do that, but it's
1303
01:20:56,440 --> 01:20:57,960
hard to work with
mathematically.
1304
01:20:57,960 --> 01:21:01,470
You can't prove nice
theorems, it turns out.
1305
01:21:01,470 --> 01:21:05,050
If you put the square in there
and make that be the variance,
1306
01:21:05,050 --> 01:21:09,030
you can prove a theorem
about linearity of variance.
1307
01:21:09,030 --> 01:21:11,290
And if the random
variables are independent,
1308
01:21:11,290 --> 01:21:14,090
then the variance of the sum
is the sum of the variances.
1309
01:21:14,090 --> 01:21:16,630
And mathematicians like
that kind of thing.
1310
01:21:16,630 --> 01:21:19,640
It makes it easier to work
with and do things with.
1311
01:21:19,640 --> 01:21:23,320
Now there are also other
choices like, in fact,
1312
01:21:23,320 --> 01:21:26,120
there's a special name
for a weird case where
1313
01:21:26,120 --> 01:21:27,685
you take the fourth power.
1314
01:21:32,450 --> 01:21:33,360
You could do that.
1315
01:21:33,360 --> 01:21:35,690
As long as an even
power, you could do it.
1316
01:21:35,690 --> 01:21:39,040
And that's actually
called the kurtosis.
1317
01:21:39,040 --> 01:21:41,260
Sounds like a foot disease.
1318
01:21:41,260 --> 01:21:44,370
But it's the kurtosis
of the random variable.
1319
01:21:44,370 --> 01:21:46,915
Now we're not going to worry
about that in this class.
1320
01:21:46,915 --> 01:21:50,030
But we are going to
worry about variance.
1321
01:21:50,030 --> 01:21:52,350
And let me do one
more definition,
1322
01:21:52,350 --> 01:21:56,560
then we'll talk about
variance a lot more tomorrow.
1323
01:21:56,560 --> 01:21:58,480
That square is a bit of a pain.
1324
01:21:58,480 --> 01:22:02,310
And to get rid of it, they
made another definition
1325
01:22:02,310 --> 01:22:06,590
after the fact called
the standard deviation.
1326
01:22:06,590 --> 01:22:10,790
And standard deviation
is defined as follows.
1327
01:22:13,790 --> 01:22:29,650
For a random variable R,
the standard deviation of R
1328
01:22:29,650 --> 01:22:34,810
is denoted by a sigma
of R. And it's just
1329
01:22:34,810 --> 01:22:38,750
the square root of
the variance, undoing
1330
01:22:38,750 --> 01:22:41,650
that nasty square
after the fact.
1331
01:22:41,650 --> 01:22:45,250
So it turns out to be the
square root of the expectation
1332
01:22:45,250 --> 01:22:48,940
of the deviation squared.
1333
01:22:48,940 --> 01:22:52,090
Another name for this
you've probably seen,
1334
01:22:52,090 --> 01:23:02,770
it's the root of the mean of
the square of the deviations.
1335
01:23:02,770 --> 01:23:05,532
And so you get this thing
called root-mean-square,
1336
01:23:05,532 --> 01:23:06,990
which, if any of
you have ever done curve
1337
01:23:06,990 --> 01:23:08,614
fitting or any of
those kinds of things
1338
01:23:08,614 --> 01:23:12,152
in statistics or whatever, this
is what you're talking about.
1339
01:23:12,152 --> 01:23:14,360
And so that's why that
expression came about.
1340
01:23:18,090 --> 01:23:20,780
So for the standard
deviation of R--
1341
01:23:20,780 --> 01:23:23,670
what's the standard
deviation of R?
1342
01:23:23,670 --> 01:23:24,597
1,000?
1343
01:23:24,597 --> 01:23:26,180
In effect, that's
pretty close to what
1344
01:23:26,180 --> 01:23:28,360
you expect the deviation to be.
1345
01:23:28,360 --> 01:23:30,095
What's the standard
deviation of S?
1346
01:23:32,950 --> 01:23:33,450
1.
1347
01:23:33,450 --> 01:23:34,825
Square root of 1
is 1, and that's
1348
01:23:34,825 --> 01:23:37,240
what you expect its
deviation to be.
1349
01:23:37,240 --> 01:23:40,210
So we'll do more of this
tomorrow in recitation.