1
00:00:01,010 --> 00:00:03,190
ALBERT MEYER: Today's
topic is random variables.
2
00:00:03,190 --> 00:00:05,950
Random variables are an
absolutely fundamental concept
3
00:00:05,950 --> 00:00:06,950
in probability theory.
4
00:00:06,950 --> 00:00:09,750
But before we get into
officially defining them,
5
00:00:09,750 --> 00:00:11,270
let's start off
with an example that
6
00:00:11,270 --> 00:00:15,156
in fact, is a game because
that's a fun way to start.
7
00:00:15,156 --> 00:00:17,030
So we're going to play
the bigger number game
8
00:00:17,030 --> 00:00:18,480
and here's how it works.
9
00:00:18,480 --> 00:00:22,410
There are two teams,
and Team 1 has
10
00:00:22,410 --> 00:00:27,020
the task of picking two
different integers between 0
11
00:00:27,020 --> 00:00:29,819
and 7 inclusive, and
they write one integer
12
00:00:29,819 --> 00:00:31,610
on one piece of paper
and the other integer
13
00:00:31,610 --> 00:00:33,410
on the other piece of paper.
14
00:00:33,410 --> 00:00:35,470
They turn the two
pieces of paper
15
00:00:35,470 --> 00:00:37,820
face down so the
numbers are not visible,
16
00:00:37,820 --> 00:00:41,480
and the other team then sees
these two pieces of paper
17
00:00:41,480 --> 00:00:44,430
whose other sides have
different numbers written
18
00:00:44,430 --> 00:00:46,350
on them sitting on the table.
19
00:00:46,350 --> 00:00:50,752
What Team 2 then does is pick
one of the pieces of paper
20
00:00:50,752 --> 00:00:53,490
and turns it over and
looks at the number on it.
21
00:00:53,490 --> 00:00:55,350
And then, based on
what that number
22
00:00:55,350 --> 00:00:58,420
is, they make a decision,
stick with the number
23
00:00:58,420 --> 00:01:02,480
they have or switch to the
other unknown number on the face
24
00:01:02,480 --> 00:01:03,780
down piece of paper.
25
00:01:03,780 --> 00:01:05,560
And that'll be
their final number.
26
00:01:05,560 --> 00:01:08,400
And the game is that Team
2 wins if they wind up
27
00:01:08,400 --> 00:01:10,160
with the larger number.
28
00:01:10,160 --> 00:01:12,285
So they're going to look
at the number on the paper
29
00:01:12,285 --> 00:01:13,701
that they expose
and they're going
30
00:01:13,701 --> 00:01:15,090
to try to decide
whether it looks
31
00:01:15,090 --> 00:01:17,182
like a big number
or little number.
32
00:01:17,182 --> 00:01:19,390
If it looks like a big
number, they'll stick with it.
33
00:01:19,390 --> 00:01:21,150
If it looks like
a little number,
34
00:01:21,150 --> 00:01:25,260
they'll switch to the other
one that they hope is larger.
35
00:01:25,260 --> 00:01:29,269
So which team do you think
has an advantage here?
36
00:01:29,269 --> 00:01:31,060
Of course, if you've read
the notes, you know.
37
00:01:31,060 --> 00:01:33,290
But if you haven't been
exposed to this before,
38
00:01:33,290 --> 00:01:35,580
it's not really so obvious.
39
00:01:35,580 --> 00:01:38,330
And what we
encourage and what we
40
00:01:38,330 --> 00:01:41,270
used to do when we ran this
in real time in classes
41
00:01:41,270 --> 00:01:44,370
was that we would have
students in teams, split
42
00:01:44,370 --> 00:01:46,160
their team in half,
one would be Team 1
43
00:01:46,160 --> 00:01:47,868
and the other would
be Team 2, and they'd
44
00:01:47,868 --> 00:01:50,730
play the game a few times,
see if they could figure out
45
00:01:50,730 --> 00:01:53,100
who had the advantage.
46
00:01:53,100 --> 00:01:54,990
And if you have the
opportunity, this
47
00:01:54,990 --> 00:01:57,470
might be a good moment
to stop this video
48
00:01:57,470 --> 00:02:00,410
and try playing the game with
some friends if they're around.
49
00:02:00,410 --> 00:02:04,170
Otherwise, let's just proceed
and see how it all works.
50
00:02:04,170 --> 00:02:09,770
So this is the strategy
Team 2 is going to adopt.
51
00:02:09,770 --> 00:02:12,070
They're going to take this
idea about big and small
52
00:02:12,070 --> 00:02:16,640
that I mentioned and act
on it in a methodical way.
53
00:02:16,640 --> 00:02:20,190
So they're going to pick
a paper to expose, giving
54
00:02:20,190 --> 00:02:22,390
each paper equal probability.
55
00:02:22,390 --> 00:02:24,490
So that guarantees
that they have
56
00:02:24,490 --> 00:02:28,250
a 50/50 chance of picking the
big number and a 50/50 chance
57
00:02:28,250 --> 00:02:29,470
of picking the little number.
58
00:02:29,470 --> 00:02:34,529
Whatever ingenuity Team 1 tried
to use on which piece of paper
59
00:02:34,529 --> 00:02:36,320
was on the left and
which was on the right,
60
00:02:36,320 --> 00:02:39,970
it doesn't really matter if Team
2 simply picks a piece of paper
61
00:02:39,970 --> 00:02:40,580
at random.
62
00:02:40,580 --> 00:02:42,510
There's no way
that Team 1 can try
63
00:02:42,510 --> 00:02:46,080
to fake out Team 2 on
where they put the number.
64
00:02:46,080 --> 00:02:47,600
OK.
65
00:02:47,600 --> 00:02:49,950
The next step is
that Team 2 is going
66
00:02:49,950 --> 00:02:51,910
to decide whether the
number that they can see,
67
00:02:51,910 --> 00:02:54,190
the exposed number, is small.
68
00:02:54,190 --> 00:02:55,620
And if so, they switch.
69
00:02:55,620 --> 00:02:56,810
And otherwise they stick.
70
00:02:56,810 --> 00:03:01,110
So that is, they're going to
define some threshold Z where
71
00:03:01,110 --> 00:03:03,410
being less than or
equal to Z means small,
72
00:03:03,410 --> 00:03:05,960
and being greater
than Z means large.
73
00:03:05,960 --> 00:03:08,680
The question is, how
do they choose Z?
74
00:03:08,680 --> 00:03:10,660
Well, a naive thing
to do would be
75
00:03:10,660 --> 00:03:14,870
to choose Z to be in the middle
of the interval from 0 to 7.
76
00:03:14,870 --> 00:03:19,120
Let's say, you
choose Z equals 3.
77
00:03:19,120 --> 00:03:25,590
So there would be four numbers
less than or equal to Z
78
00:03:25,590 --> 00:03:28,320
and four numbers greater
than Z. But of course,
79
00:03:28,320 --> 00:03:32,410
as soon as Team 1 knew that that
was your Z, what would they do?
80
00:03:32,410 --> 00:03:36,440
Well, they would make sure
that both numbers were
81
00:03:36,440 --> 00:03:39,260
on the same side of
Z. If your Z was 3,
82
00:03:39,260 --> 00:03:43,650
they would always choose their
numbers to be, say, 0 and 1.
83
00:03:43,650 --> 00:03:46,690
And that way, when
you were switching,
84
00:03:46,690 --> 00:03:49,090
your Z would tell you that
you had a small number,
85
00:03:49,090 --> 00:03:50,710
you should switch
to the other one.
86
00:03:50,710 --> 00:03:53,540
And you'd only have a
50/50 chance of winning.
87
00:03:53,540 --> 00:03:57,750
So if you fixed that
value of Z, Team 1
88
00:03:57,750 --> 00:04:01,020
has a way of ensuring that
you have no advantage.
89
00:04:01,020 --> 00:04:03,630
You can only win with
probability 1/2.
90
00:04:03,630 --> 00:04:05,530
And that's true no
matter what Z you take.
91
00:04:05,530 --> 00:04:08,217
If Team 1 knew what
your Z was, they
92
00:04:08,217 --> 00:04:09,800
would just make sure
to pick their two
93
00:04:09,800 --> 00:04:12,880
numbers on the same
side of your Z.
94
00:04:12,880 --> 00:04:15,840
And then your Z wouldn't
really tell you anything.
95
00:04:15,840 --> 00:04:18,709
You'd switch or
stick in both cases,
96
00:04:18,709 --> 00:04:20,510
and you'd only
have a 50/50 chance
97
00:04:20,510 --> 00:04:21,930
of picking the right number.
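The weakness of a fixed threshold can be checked exactly. The following is a minimal sketch, not part of the lecture: the function name `win_probability_fixed_z` is hypothetical, and it assumes Team 2 exposes either paper with probability 1/2, switches when the exposed number is at most Z, and sticks otherwise.

```python
from fractions import Fraction

def win_probability_fixed_z(low, high, z):
    """Exact chance that Team 2 wins when the threshold z is fixed.

    Assumes Team 2 exposes either paper with probability 1/2,
    switches when the exposed number is <= z, and sticks otherwise.
    """
    wins = 0
    # The two equally likely orders in which the papers can be exposed.
    for exposed, hidden in ((low, high), (high, low)):
        final = hidden if exposed <= z else exposed
        wins += final == high
    return Fraction(wins, 2)

# Team 1 knows z = 3 and puts both numbers on the same side of it.
print(win_probability_fixed_z(0, 1, z=3))
# If the threshold happens to separate the two numbers, Team 2 always wins.
print(win_probability_fixed_z(2, 5, z=3))
```

With both numbers on the same side of Z, the result is exactly 1/2, which is why a predictable Z gives Team 2 no advantage.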
98
00:04:21,930 --> 00:04:25,190
So what you do-- and this is
where probability comes in--
99
00:04:25,190 --> 00:04:28,450
is you pick Z in a way that
can't be predicted or made
100
00:04:28,450 --> 00:04:30,040
use of by Team 1.
101
00:04:30,040 --> 00:04:35,230
You pick Z at random, to
be any number from 0 to 7,
102
00:04:35,230 --> 00:04:36,800
including 0 but not including 7.
103
00:04:36,800 --> 00:04:40,080
That is, your number is
either 0, 1, 2, up through 6.
104
00:04:40,080 --> 00:04:43,130
And being less than or
equal to Z means small,
105
00:04:43,130 --> 00:04:46,460
and being greater
than Z means large.
106
00:04:46,460 --> 00:04:48,670
And when you see a small
number, you'll switch
107
00:04:48,670 --> 00:04:52,320
and when you see a large
number, you'll stick.
108
00:04:52,320 --> 00:04:55,310
But what's going to be large
and what's going to be small
109
00:04:55,310 --> 00:04:59,920
is going to vary each time
you play the game, depending
110
00:04:59,920 --> 00:05:05,510
on what random number,
Z, comes out to be.
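The randomized strategy just described can be sketched as a small simulation. This is an illustration under stated assumptions, not the lecturer's code: the names `play_round` and `estimate_win_rate`, the number of rounds, and the seed are all hypothetical choices.

```python
import random

def play_round(low, high, rng):
    """Play one round of Team 2's randomized strategy; return True on a win."""
    # Expose one of the two papers with equal probability.
    exposed, hidden = (low, high) if rng.random() < 0.5 else (high, low)
    # Draw the threshold Z uniformly from 0, 1, ..., 6.
    z = rng.randrange(7)
    # Switch when the exposed number looks small (<= Z), otherwise stick.
    final = hidden if exposed <= z else exposed
    return final == high

def estimate_win_rate(low, high, rounds=20_000, seed=0):
    """Estimate Team 2's win rate against a fixed pair chosen by Team 1."""
    rng = random.Random(seed)
    wins = sum(play_round(low, high, rng) for _ in range(rounds))
    return wins / rounds

# Adjacent numbers are the hardest case for Team 2.
print(estimate_win_rate(3, 4))
```

The estimate is only approximate; the exact value is what the analysis that follows works out.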
111
00:05:05,510 --> 00:05:08,420
So let's analyze your
probability if you're Team 2.
112
00:05:08,420 --> 00:05:11,726
What's the probability that
you're going to win now?
113
00:05:11,726 --> 00:05:16,399
Well, let's suppose that Team
1 picks these two numbers.
114
00:05:16,399 --> 00:05:17,940
We don't know what
they are, but they
115
00:05:17,940 --> 00:05:20,451
have to pick a low number
that's less than a high number.
116
00:05:20,451 --> 00:05:22,200
So these two numbers
are at least 1 apart,
117
00:05:22,200 --> 00:05:24,890
they can't have the same
number on both pieces of paper.
118
00:05:24,890 --> 00:05:29,127
Otherwise, it's clear
that you are not
119
00:05:29,127 --> 00:05:30,960
going to be able to
pick the large one, that
120
00:05:30,960 --> 00:05:31,780
would be cheating.
121
00:05:31,780 --> 00:05:33,322
OK, so there's two
different numbers.
122
00:05:33,322 --> 00:05:35,196
So one of them has to
be less than the other.
123
00:05:35,196 --> 00:05:37,340
We don't know how much
less, might be a lot less,
124
00:05:37,340 --> 00:05:39,990
might be only one less,
but low is less than high.
125
00:05:39,990 --> 00:05:43,730
OK, now we can consider
three cases of what
126
00:05:43,730 --> 00:05:45,930
happens with your strategy.
127
00:05:45,930 --> 00:05:48,730
The most interesting
case is the middle case.
128
00:05:48,730 --> 00:05:53,100
That is, when your Z,
which was chosen at random,
129
00:05:53,100 --> 00:05:56,150
happens to fall in the
interval between low and high.
130
00:05:56,150 --> 00:05:59,240
That is, your Z is strictly
less than high and greater than
131
00:05:59,240 --> 00:06:00,630
or equal to low.
132
00:06:00,630 --> 00:06:04,510
And then in that case, your Z
is really guiding you correctly
133
00:06:04,510 --> 00:06:05,560
on what to do.
134
00:06:05,560 --> 00:06:08,350
If you turn over the
low card, then it's
135
00:06:08,350 --> 00:06:10,790
going to look low because
it's less than or equal to Z
136
00:06:10,790 --> 00:06:12,880
so you'll switch to
the high card and win.
137
00:06:12,880 --> 00:06:15,690
If you turn over
the high card, it's
138
00:06:15,690 --> 00:06:17,820
going to be greater than
Z so it'll look high
139
00:06:17,820 --> 00:06:19,910
and you'll know
to stick with it.
140
00:06:19,910 --> 00:06:23,290
So in this case, you're
guaranteed to win.
141
00:06:23,290 --> 00:06:25,150
If you were lucky
enough to guess
142
00:06:25,150 --> 00:06:28,390
the right threshold between low
and high, you're going to win.
143
00:06:28,390 --> 00:06:31,190
And so the probability
that you win,
144
00:06:31,190 --> 00:06:33,850
given the middle
case occurs, is 1.
145
00:06:33,850 --> 00:06:35,280
Now, what about the middle case?
146
00:06:35,280 --> 00:06:37,070
How often does that happen?
147
00:06:37,070 --> 00:06:41,310
Well, the difference between
low and high is at least 1,
148
00:06:41,310 --> 00:06:44,650
so there's guaranteed
to be 1 chance in 7
149
00:06:44,650 --> 00:06:48,110
that your Z is going
to fall between them.
150
00:06:48,110 --> 00:06:51,550
And it could be more if low
and high are further apart,
151
00:06:51,550 --> 00:06:53,530
but as long as they're
at least one apart,
152
00:06:53,530 --> 00:06:57,200
there's a 1/7 chance that you're
going to fall in between them.
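The chance of the middle case can be counted directly. A minimal sketch, assuming Z is uniform on 0 through 6 and that the middle case means low <= Z < high, as stated above; the function name `middle_case_probability` is hypothetical.

```python
from fractions import Fraction

def middle_case_probability(low, high):
    """Probability that a uniform Z in {0, ..., 6} satisfies low <= Z < high."""
    hits = sum(1 for z in range(7) if low <= z < high)
    return Fraction(hits, 7)

print(middle_case_probability(3, 4))  # adjacent numbers
print(middle_case_probability(0, 7))  # numbers as far apart as possible
```

For any legal pair, the count of qualifying thresholds is high minus low, so the probability is never below 1/7.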
153
00:06:57,200 --> 00:06:58,670
OK.
154
00:06:58,670 --> 00:07:02,520
Now, in case H,
that's the case where
155
00:07:02,520 --> 00:07:04,920
Z happens to be
chosen greater than
156
00:07:04,920 --> 00:07:07,490
or equal to the high
number that Team 1 chose.
157
00:07:07,490 --> 00:07:10,480
In other words, Z is at least as big
as both numbers that Team 1
158
00:07:10,480 --> 00:07:12,890
chose and put on
the pieces of paper.
159
00:07:12,890 --> 00:07:16,100
Well, in that case, Z just
isn't telling you anything.
160
00:07:16,100 --> 00:07:18,480
So what's going to happen is
that both numbers are going
161
00:07:18,480 --> 00:07:21,812
to look high to you--
sorry-- both numbers
162
00:07:21,812 --> 00:07:24,270
are going to look low to you
because they're both less than
163
00:07:24,270 --> 00:07:26,570
or equal to Z. So you'll switch.
164
00:07:26,570 --> 00:07:30,520
And that means that
you'll win, if and only
165
00:07:30,520 --> 00:07:34,810
if, you happen to turn
the low card over first.
166
00:07:34,810 --> 00:07:36,540
Well that was 50/50.
167
00:07:36,540 --> 00:07:39,570
So the probability
that you win, given
168
00:07:39,570 --> 00:07:45,390
that Z-- both cards are
on the low side of Z,
169
00:07:45,390 --> 00:07:46,920
you'll win half the time.
170
00:07:46,920 --> 00:07:50,150
And symmetrically, if Z is
less than the low card, that
171
00:07:50,150 --> 00:07:52,900
is, Z is less than both
cards chosen by Team 1,
172
00:07:52,900 --> 00:07:56,400
then they're both going to
look high, and so you'll stick.
173
00:07:56,400 --> 00:07:59,677
And that means that you'll
stick, you'll win, if and only
174
00:07:59,677 --> 00:08:01,510
if, you happen to have
picked the high card.
175
00:08:01,510 --> 00:08:03,320
There's a 50/50 chance of that.
176
00:08:03,320 --> 00:08:09,130
So again, in this case that
Z makes both cards look high,
177
00:08:09,130 --> 00:08:14,900
or Z itself is low, Team 2,
you win with probability 1/2.
178
00:08:14,900 --> 00:08:19,010
Well, that's great because now
we can apply total probability.
179
00:08:19,010 --> 00:08:26,510
And what total probability
tells us is that Team 2 wins
180
00:08:26,510 --> 00:08:29,120
is the probability that
they win given case
181
00:08:29,120 --> 00:08:32,280
M times the probability
of M plus the probability
182
00:08:32,280 --> 00:08:34,960
that they win given
not the middle case
183
00:08:34,960 --> 00:08:37,860
times the probability
of not the middle case.
184
00:08:37,860 --> 00:08:39,580
But we figured out
what these were.
185
00:08:39,580 --> 00:08:41,870
Well, at least
inequalities on them,
186
00:08:41,870 --> 00:08:46,010
because there's
probability 1 that you'll
187
00:08:46,010 --> 00:08:47,990
win 1/7 of the time.
188
00:08:47,990 --> 00:08:50,880
And there's probability
a 1/2 that you'll
189
00:08:50,880 --> 00:08:54,900
win the rest of the time,
the other 6/7 of the time.
190
00:08:54,900 --> 00:08:57,590
So you're going to win at least
4/7 of the time.
191
00:08:57,590 --> 00:09:02,050
The probability that you win
playing your strategy is 4/7.
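Written out as a worked equation, the total probability calculation looks as follows. The symbols are a notational assumption: M denotes the middle case and p = Pr[M], which is at least 1/7.

```latex
\begin{align*}
\Pr[\text{win}]
  &= \Pr[\text{win} \mid M]\,\Pr[M]
   + \Pr[\text{win} \mid \overline{M}]\,\Pr[\overline{M}] \\
  &= 1 \cdot p + \tfrac{1}{2}\,(1 - p)
   = \tfrac{1}{2} + \tfrac{p}{2} \\
  &\ge \tfrac{1}{2} + \tfrac{1}{14}
   = \tfrac{4}{7}.
\end{align*}
```

Equality holds when the two numbers are adjacent, so that p is exactly 1/7.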
192
00:09:02,050 --> 00:09:04,030
It's better than 50/50.
193
00:09:04,030 --> 00:09:06,050
You have an advantage.
194
00:09:06,050 --> 00:09:09,430
And whether that was a priori
obvious or not, I don't know.
195
00:09:09,430 --> 00:09:11,270
But I think it's kind of cool.
196
00:09:11,270 --> 00:09:14,770
OK, you win with
probability 4/7.
197
00:09:14,770 --> 00:09:17,540
Now, Team 2 has the advantage.
198
00:09:17,540 --> 00:09:19,430
And the important
thing to understand
199
00:09:19,430 --> 00:09:22,280
is it does not matter
what team does.
200
00:09:22,280 --> 00:09:26,000
No matter how smart
Team 1 is, Team 2
201
00:09:26,000 --> 00:09:27,780
has gotten control
of the situation
202
00:09:27,780 --> 00:09:30,530
because they picked--
which piece of paper
203
00:09:30,530 --> 00:09:33,210
they picked at random 50/50.
204
00:09:33,210 --> 00:09:36,000
So it doesn't matter
what strategy Team 1 used
205
00:09:36,000 --> 00:09:37,700
on where they
placed the numbers.
206
00:09:37,700 --> 00:09:41,570
And they chose Z
randomly, so again, it
207
00:09:41,570 --> 00:09:44,310
doesn't matter what
numbers Team 1 chose.
208
00:09:44,310 --> 00:09:49,540
Team 2 is still going to have
their 1/7 chance of coming out
209
00:09:49,540 --> 00:09:53,980
ahead, which is enough to tip
the balance in their favor.
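The claim that Team 1's choice cannot push Team 2 below 4/7 can be verified by brute force over every legal pair. This is a sketch under the same assumptions as the analysis above (uniform choice of paper, Z uniform on 0 through 6); the function name `exact_win_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def exact_win_probability(low, high):
    """Exact win probability of the randomized strategy against one pair."""
    wins = 0
    # 7 thresholds times 2 exposure orders give 14 equally likely outcomes.
    for z in range(7):
        for exposed, hidden in ((low, high), (high, low)):
            final = hidden if exposed <= z else exposed
            wins += final == high
    return Fraction(wins, 14)

# Every pair of distinct integers between 0 and 7 that Team 1 could choose.
pairs = list(combinations(range(8), 2))
worst = min(exact_win_probability(lo, hi) for lo, hi in pairs)
print(worst)  # 4/7
```

The minimum is reached by adjacent pairs such as 3 and 4; pairs that are further apart only help Team 2.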
210
00:09:53,980 --> 00:09:56,990
It's interesting that
symmetrically, Team 1 also
211
00:09:56,990 --> 00:09:59,390
has a random strategy
that they can use,
212
00:09:59,390 --> 00:10:04,070
which guarantees that no
matter what Team 2 does, Team 2
213
00:10:04,070 --> 00:10:06,950
wins with probability
at most 4/7.
214
00:10:06,950 --> 00:10:10,130
So either team can
force the probability
215
00:10:10,130 --> 00:10:15,800
that Team 2 wins to be at
most 4/7 and at least 4/7.
216
00:10:15,800 --> 00:10:19,740
So if they both play optimally,
it's going to stay at 4/7.
217
00:10:19,740 --> 00:10:22,780
And that's again, true no
matter what Team 2 does,
218
00:10:22,780 --> 00:10:25,930
Team 1 can put this
upper bound of 4/7 on it.
219
00:10:25,930 --> 00:10:28,310
So essentially we can
say that the value
220
00:10:28,310 --> 00:10:31,170
of this game, the
probability that Team 2 wins
221
00:10:31,170 --> 00:10:34,650
when both play optimally, is 4/7.
222
00:10:34,650 --> 00:10:38,220
OK, now what does this game
have to do with anything,
223
00:10:38,220 --> 00:10:40,600
with our general topic
of random variables?
224
00:10:40,600 --> 00:10:42,970
Well, we'll be
formal in a moment.
225
00:10:42,970 --> 00:10:44,890
But informally,
a random variable
226
00:10:44,890 --> 00:10:49,460
is simply a number that's
produced by a random process.
227
00:10:49,460 --> 00:10:52,000
And just to give an
example before we come up
228
00:10:52,000 --> 00:10:55,530
with a formal definition,
the threshold variable Z
229
00:10:55,530 --> 00:11:01,930
was a thing that took a
value from 0 to 6 inclusive,
230
00:11:01,930 --> 00:11:03,600
each with probability 1/7.
231
00:11:03,600 --> 00:11:07,970
So it was producing a
number by a random process,
232
00:11:07,970 --> 00:11:11,880
that chose a number at random
with equal probability.
233
00:11:11,880 --> 00:11:22,000
If Team 2 plays properly,
picking at random which
234
00:11:22,000 --> 00:11:25,770
piece of paper to expose, then
the number of the exposed card,
235
00:11:25,770 --> 00:11:29,760
or more precisely, whether the
exposed card is high or low,
236
00:11:29,760 --> 00:11:33,000
will also be a random variable.
237
00:11:33,000 --> 00:11:37,740
And if Team 1 plays optimally,
the number on the exposed card
238
00:11:37,740 --> 00:11:40,320
is going to be a
random variable.
239
00:11:40,320 --> 00:11:42,350
That is, Team 1 in their
optimal strategy that
240
00:11:42,350 --> 00:11:46,310
puts an upper bound of 4/7 is
in fact, going to choose the two
241
00:11:46,310 --> 00:11:47,230
numbers randomly.
242
00:11:47,230 --> 00:11:49,040
So the exposed card
is going to wind up
243
00:11:49,040 --> 00:11:51,930
being another random
variable, a number produced
244
00:11:51,930 --> 00:11:53,270
by the random process.
245
00:11:53,270 --> 00:11:55,800
And likewise, the number
of the larger card
246
00:11:55,800 --> 00:12:00,070
if Team 1 picks its larger
and smaller cards randomly,
247
00:12:00,070 --> 00:12:02,560
it's going to be another
example of a number produced
248
00:12:02,560 --> 00:12:04,256
by a random process.
249
00:12:04,256 --> 00:12:06,130
And likewise, the number
of the smaller card.
250
00:12:06,130 --> 00:12:07,230
So that's enough examples.
251
00:12:07,230 --> 00:12:09,030
This little game
has a whole bunch
252
00:12:09,030 --> 00:12:10,940
of random variables
appearing in it.
253
00:12:10,940 --> 00:12:13,480
And in the next
segment, we will look
254
00:12:13,480 --> 00:12:15,810
again officially,
what is the definition
255
00:12:15,810 --> 00:12:18,030
of a random variable?