1
00:00:00,000 --> 00:00:00,530
2
00:00:00,530 --> 00:00:02,960
The following content is
provided under a Creative
3
00:00:02,960 --> 00:00:04,370
Commons license.
4
00:00:04,370 --> 00:00:07,410
Your support will help MIT
OpenCourseWare continue to
5
00:00:07,410 --> 00:00:11,060
offer high quality educational
resources for free.
6
00:00:11,060 --> 00:00:13,960
To make a donation or view
additional materials from
7
00:00:13,960 --> 00:00:19,790
hundreds of MIT courses, visit
MIT OpenCourseWare at
8
00:00:19,790 --> 00:00:22,456
ocw.mit.edu.
9
00:00:22,456 --> 00:00:25,760
PROFESSOR: Today's focus is
probability and statistics.
10
00:00:25,760 --> 00:00:29,180
So let's start with
probability.
11
00:00:29,180 --> 00:00:33,246
Let's look at probability
for binary variables.
12
00:00:33,246 --> 00:00:39,160
13
00:00:39,160 --> 00:00:43,070
What do you mean by
a binary variable?
14
00:00:43,070 --> 00:00:45,650
It can take only two outcomes.
15
00:00:45,650 --> 00:00:48,920
So it can take only
two values.
16
00:00:48,920 --> 00:00:55,320
For example, it could be 0 or
1, head or tail, on or off.
17
00:00:55,320 --> 00:00:58,460
18
00:00:58,460 --> 00:01:03,670
So we are going to call this
variable A, for instance.
19
00:01:03,670 --> 00:01:11,900
So A could be H, or A is equal
to T. But that could happen.
20
00:01:11,900 --> 00:01:16,110
That event could happen with
a certain probability.
21
00:01:16,110 --> 00:01:18,920
So by that, I mean the
probabilities, like we are
22
00:01:18,920 --> 00:01:21,520
expressing the belief that the
23
00:01:21,520 --> 00:01:24,170
particular event could happen.
24
00:01:24,170 --> 00:01:28,190
So we could assign
a value to that.
25
00:01:28,190 --> 00:01:36,380
That is the probability
of A taking value H.
26
00:01:36,380 --> 00:01:41,170
So here, the values
of A and B--
27
00:01:41,170 --> 00:01:45,410
sorry, here, the value of A can
be either H or T, which
28
00:01:45,410 --> 00:01:49,370
means it has only two
possible outcomes.
29
00:01:49,370 --> 00:01:51,190
That's why we call it
a binary variable.
30
00:01:51,190 --> 00:02:06,470
However, P of A is equal to H
can lie anywhere from 0 to 1,
31
00:02:06,470 --> 00:02:08,878
including 0 and 1.
32
00:02:08,878 --> 00:02:10,190
AUDIENCE: They don't
have to be even?
33
00:02:10,190 --> 00:02:10,876
PROFESSOR: Sorry?
34
00:02:10,876 --> 00:02:12,750
AUDIENCE: They don't
have to be even?
35
00:02:12,750 --> 00:02:13,090
PROFESSOR: Even?
36
00:02:13,090 --> 00:02:17,190
AUDIENCE: Even chance, even
probability, like the same.
37
00:02:17,190 --> 00:02:19,900
PROFESSOR: Sorry, I didn't
get your question.
38
00:02:19,900 --> 00:02:23,050
AUDIENCE: Even though they're
binary, don't you need be able
39
00:02:23,050 --> 00:02:27,780
to have the same probability?
40
00:02:27,780 --> 00:02:29,450
PROFESSOR: OK, we'll
look at that later.
41
00:02:29,450 --> 00:02:32,630
Like, this particular event
can take a particular
42
00:02:32,630 --> 00:02:33,600
probability.
43
00:02:33,600 --> 00:02:36,500
And we'll look at that
particular case later.
44
00:02:36,500 --> 00:02:39,130
But in general, a probability
will always lie
45
00:02:39,130 --> 00:02:40,740
between 0 and 1.
46
00:02:40,740 --> 00:02:43,770
47
00:02:43,770 --> 00:02:48,740
And it can take any value
between 0 and 1 since the
48
00:02:48,740 --> 00:02:51,395
range it can take is continuous.
49
00:02:51,395 --> 00:02:57,870
50
00:02:57,870 --> 00:03:02,330
However, the value the variable
can take is going to
51
00:03:02,330 --> 00:03:03,330
be discrete.
52
00:03:03,330 --> 00:03:08,190
It can take only H or T. So
that's why you call it a
53
00:03:08,190 --> 00:03:09,520
binary variable.
54
00:03:09,520 --> 00:03:13,110
For example, take
a deck of cards.
55
00:03:13,110 --> 00:03:17,910
Here, the value could be, for
example if you consider only
56
00:03:17,910 --> 00:03:20,420
one particular suit,
then it can be any
57
00:03:20,420 --> 00:03:21,810
one of those 13 values.
58
00:03:21,810 --> 00:03:24,350
59
00:03:24,350 --> 00:03:27,560
So there, this variable
is not binary.
60
00:03:27,560 --> 00:03:30,380
However, the probability of a
particular event happening is
61
00:03:30,380 --> 00:03:33,830
always between 0 and 1.
62
00:03:33,830 --> 00:03:37,690
Now, let's look at some
probability, like what you
63
00:03:37,690 --> 00:03:42,470
asked earlier is whether they
will be equal, whether the
64
00:03:42,470 --> 00:03:46,430
probability of head and
tail can be equal.
65
00:03:46,430 --> 00:03:51,730
So let's represent the
probability of A of H. This
66
00:03:51,730 --> 00:03:54,570
can be between 0 and 1.
67
00:03:54,570 --> 00:04:00,480
What is the probability
of A not happening?
68
00:04:00,480 --> 00:04:01,810
So we call it by A bar.
69
00:04:01,810 --> 00:04:06,000
70
00:04:06,000 --> 00:04:09,700
Given P of A, can you
give me P of A bar?
71
00:04:09,700 --> 00:04:10,660
AUDIENCE: 1 minus P of A.
72
00:04:10,660 --> 00:04:16,070
PROFESSOR: 1 minus P of A.
If there are two events
73
00:04:16,070 --> 00:04:22,190
happening, for example, you're
throwing two coins, then we
74
00:04:22,190 --> 00:04:23,730
can consider their joint
probabilities.
75
00:04:23,730 --> 00:04:26,520
76
00:04:26,520 --> 00:04:33,030
So let's say we have a coin, A,
and this coin, B. So this
77
00:04:33,030 --> 00:04:34,940
coin can take two values.
78
00:04:34,940 --> 00:04:39,070
And so this coin can take
another two values.
79
00:04:39,070 --> 00:04:40,320
Sorry.
80
00:04:40,320 --> 00:04:51,618
81
00:04:51,618 --> 00:04:57,410
We know A can take H with
probability, say I assume it's
82
00:04:57,410 --> 00:04:59,380
unbiased, so it'll be 1/2.
83
00:04:59,380 --> 00:05:02,050
84
00:05:02,050 --> 00:05:03,330
All these are going to be 1/2.
85
00:05:03,330 --> 00:05:09,050
86
00:05:09,050 --> 00:05:11,415
What's the probability of HT?
87
00:05:11,415 --> 00:05:14,110
88
00:05:14,110 --> 00:05:20,020
So now, we are considering a
joint event, P of A is equal
89
00:05:20,020 --> 00:05:32,900
to H and B is equal to
T. So in probability, we
90
00:05:32,900 --> 00:05:35,160
represent it by something
like this.
91
00:05:35,160 --> 00:05:40,200
P A- do you know what is that?
92
00:05:40,200 --> 00:05:45,450
P A intersection B, you want
both events to happen.
93
00:05:45,450 --> 00:05:48,000
94
00:05:48,000 --> 00:05:57,350
That will be P of A times, in this
case, P of B. So we
95
00:05:57,350 --> 00:06:00,160
could simply say it's 1/4.
96
00:06:00,160 --> 00:06:01,570
Why is this possible?
97
00:06:01,570 --> 00:06:04,200
98
00:06:04,200 --> 00:06:06,380
It's because these two events
are independent.
99
00:06:06,380 --> 00:06:09,760
100
00:06:09,760 --> 00:06:14,100
The coin A getting head
doesn't affect
101
00:06:14,100 --> 00:06:17,810
coin B getting a tail.
102
00:06:17,810 --> 00:06:20,790
So it doesn't have
any influence.
103
00:06:20,790 --> 00:06:23,510
That's why these two events
are independent.
104
00:06:23,510 --> 00:06:27,360
The dependent events are a
bit complex to analyze.
105
00:06:27,360 --> 00:06:30,860
Let's skip them at the moment.
106
00:06:30,860 --> 00:06:33,190
So we know all these
probabilities
107
00:06:33,190 --> 00:06:34,440
are going to be 1/4.
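A minimal Python sketch of the two-coin case just described, assuming both coins are fair and independent. The names p and joint are illustrative only. It multiplies the individual probabilities to get each joint probability.

```python
from fractions import Fraction
from itertools import product

# Assumption: both coins are fair, so each face has probability 1/2.
p = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# Independence: the joint probability is the product of the two
# individual probabilities, P(A=a and B=b) = P(A=a) * P(B=b).
joint = {(a, b): p[a] * p[b] for a, b in product("HT", repeat=2)}

for outcome, prob in joint.items():
    print(outcome, prob)
```

Each of the four outcomes HH, HT, TH, TT comes out as 1/4, and the four values sum to 1.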
108
00:06:34,440 --> 00:06:36,770
109
00:06:36,770 --> 00:06:41,380
So we looked at a particular
condition here.
110
00:06:41,380 --> 00:06:44,780
That is, A taking head
and B taking tail.
111
00:06:44,780 --> 00:06:53,640
What about the condition, what
about the case where either A
112
00:06:53,640 --> 00:06:57,290
or B takes a head?
113
00:06:57,290 --> 00:06:59,950
How can we represent that?
114
00:06:59,950 --> 00:07:05,560
So it will be something like A
is equal to H or B is equal to
115
00:07:05,560 --> 00:07:13,600
H. Oh, probability at least
1, so by that, I can also
116
00:07:13,600 --> 00:07:14,850
represent something like this.
117
00:07:14,850 --> 00:07:18,370
118
00:07:18,370 --> 00:07:20,100
OK, here, this is sufficient
anyway.
119
00:07:20,100 --> 00:07:27,900
120
00:07:27,900 --> 00:07:29,400
So what are the possible
events?
121
00:07:29,400 --> 00:07:54,000
122
00:07:54,000 --> 00:08:00,660
So these three events could give
rise to this probability.
123
00:08:00,660 --> 00:08:02,840
It's better if you can represent
this in a diagram.
124
00:08:02,840 --> 00:08:06,760
So let's go and represent
this in a diagram.
125
00:08:06,760 --> 00:08:12,175
This is A and this is B
getting, say, head.
126
00:08:12,175 --> 00:08:15,200
127
00:08:15,200 --> 00:08:18,250
In one case, both
can take head.
128
00:08:18,250 --> 00:08:22,410
That is, this particular
condition, intersection we
129
00:08:22,410 --> 00:08:23,660
earlier looked at.
130
00:08:23,660 --> 00:08:32,309
131
00:08:32,309 --> 00:08:35,789
So what is this whole thing?
132
00:08:35,789 --> 00:08:41,150
133
00:08:41,150 --> 00:08:46,010
That is, either A gets
H or B gets H,
134
00:08:46,010 --> 00:08:47,720
which is this condition.
135
00:08:47,720 --> 00:08:51,200
136
00:08:51,200 --> 00:08:54,450
We call it P of A union B. Ok.
137
00:08:54,450 --> 00:09:02,360
138
00:09:02,360 --> 00:09:07,720
Is there an efficient way of
finding this rather than
139
00:09:07,720 --> 00:09:09,450
writing down all
possible cases?
140
00:09:09,450 --> 00:09:11,950
141
00:09:11,950 --> 00:09:17,760
Is there an efficient way of
finding P of A union B?
142
00:09:17,760 --> 00:09:19,916
From high school maths,
probably?
143
00:09:19,916 --> 00:09:21,730
No idea?
144
00:09:21,730 --> 00:09:22,670
OK.
145
00:09:22,670 --> 00:09:29,350
P of A union B is equal to P of
A plus P of B minus P of A
146
00:09:29,350 --> 00:09:32,530
intersection B. Because if you
consider P of A, you would
147
00:09:32,530 --> 00:09:34,800
have taken this full circle.
148
00:09:34,800 --> 00:09:35,940
When you take P of
B, you would have
149
00:09:35,940 --> 00:09:38,010
taken this full circle.
150
00:09:38,010 --> 00:09:41,490
So which means you're counting
this area twice.
151
00:09:41,490 --> 00:09:42,740
So here, we deduct it once.
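As a hedged check of the union formula, the sketch below enumerates the two-coin outcomes (fair coins assumed) and compares a direct count of "A is H or B is H" with P(A) + P(B) - P(A and B). The helper prob and the variable names are illustrative, not from the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumption: two fair coins, so the 4 outcomes are equally likely.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event as (favourable outcomes) / (all outcomes)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

p_a = prob(lambda o: o[0] == "H")                      # coin A shows H
p_b = prob(lambda o: o[1] == "H")                      # coin B shows H
p_both = prob(lambda o: o[0] == "H" and o[1] == "H")   # intersection
p_union_direct = prob(lambda o: o[0] == "H" or o[1] == "H")

# Inclusion-exclusion: subtract the overlap that was counted twice.
p_union_formula = p_a + p_b - p_both

print(p_union_direct, p_union_formula)
```

Both routes give 3/4, matching the three outcomes HH, HT, TH out of four.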
152
00:09:42,740 --> 00:09:47,130
153
00:09:47,130 --> 00:09:48,283
OK?
154
00:09:48,283 --> 00:09:49,533
Great.
155
00:09:49,533 --> 00:09:51,600
156
00:09:51,600 --> 00:09:54,720
So this is the basics
of the probability.
157
00:09:54,720 --> 00:10:00,900
Now, actually we looked at two
events, two joint events here.
158
00:10:00,900 --> 00:10:04,040
But we should have
a formal way of
159
00:10:04,040 --> 00:10:07,350
looking at multiple events.
160
00:10:07,350 --> 00:10:09,720
So how can we do that?
161
00:10:09,720 --> 00:10:11,640
The first way is doing
it by trees.
162
00:10:11,640 --> 00:10:16,020
163
00:10:16,020 --> 00:10:20,120
Let's say we represent
the outcome of the
164
00:10:20,120 --> 00:10:22,540
first trial by a branch.
165
00:10:22,540 --> 00:10:28,240
166
00:10:28,240 --> 00:10:32,240
We can represent the outcome of
the second trial by another
167
00:10:32,240 --> 00:10:37,160
branch from these two
previous branches.
168
00:10:37,160 --> 00:10:38,710
So this would be
HH, HT, TH, TT.
169
00:10:38,710 --> 00:10:51,840
170
00:10:51,840 --> 00:10:56,370
And we know this could happen
with probability 1/2.
171
00:10:56,370 --> 00:11:00,430
So we know it's, again,
1/2, 1/2, 1/2, 1/2.
172
00:11:00,430 --> 00:11:01,680
So this is 1/4.
173
00:11:01,680 --> 00:11:11,440
174
00:11:11,440 --> 00:11:18,955
Suppose we want to do this for
an outcome of throwing dice.
175
00:11:18,955 --> 00:11:22,060
176
00:11:22,060 --> 00:11:25,540
Then, probably we would
have 6 branches here.
177
00:11:25,540 --> 00:11:28,330
178
00:11:28,330 --> 00:11:33,270
Which, again, forks into
another 36 branches.
179
00:11:33,270 --> 00:11:35,730
So there should be another
easier way.
180
00:11:35,730 --> 00:11:38,730
For that, we could use a second
method called a grid.
181
00:11:38,730 --> 00:11:42,410
182
00:11:42,410 --> 00:11:44,130
We could simply put
that in a diagram.
183
00:11:44,130 --> 00:11:52,470
184
00:11:52,470 --> 00:11:54,310
So this is the first trial.
185
00:11:54,310 --> 00:11:57,040
186
00:11:57,040 --> 00:11:59,165
And this will be our
second trial.
187
00:11:59,165 --> 00:12:11,580
188
00:12:11,580 --> 00:12:17,710
So now, we can represent any
possible outcome on this grid.
189
00:12:17,710 --> 00:12:22,090
For example, can you give me an
example where you throw the
190
00:12:22,090 --> 00:12:27,430
same number in both
the trials?
191
00:12:27,430 --> 00:12:30,710
Then, what would be the layout
of it in this grid?
192
00:12:30,710 --> 00:12:34,150
193
00:12:34,150 --> 00:12:37,790
Throwing the same number
in both the trials.
194
00:12:37,790 --> 00:12:39,025
Here's the first trial.
195
00:12:39,025 --> 00:12:40,976
This, the second.
196
00:12:40,976 --> 00:12:42,720
Then it would be the diagonal.
197
00:12:42,720 --> 00:12:48,000
198
00:12:48,000 --> 00:12:51,940
If you want to calculate the
probability, do you know the
199
00:12:51,940 --> 00:13:01,580
probability is the ratio between
the outcomes we expect
200
00:13:01,580 --> 00:13:04,170
over all possible outcomes?
201
00:13:04,170 --> 00:13:09,170
So here, we know there will
be 6 instances in this
202
00:13:09,170 --> 00:13:11,060
highlighted area.
203
00:13:11,060 --> 00:13:15,840
Compare that to all
36 possibilities.
204
00:13:15,840 --> 00:13:17,266
So it'll be simply 6/36.
205
00:13:17,266 --> 00:13:22,620
206
00:13:22,620 --> 00:13:26,840
How can you find the probability
of getting a
207
00:13:26,840 --> 00:13:28,670
cumulative total of, say, 6?
208
00:13:28,670 --> 00:13:33,880
209
00:13:33,880 --> 00:13:36,480
Then again, it would
be very simple.
210
00:13:36,480 --> 00:13:43,930
It could be 1, 5; 2, 4;
3, 3; 4, 2; 5, 1.
211
00:13:43,930 --> 00:13:46,000
All right?
212
00:13:46,000 --> 00:13:49,360
So it'll be 5 by 36.
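The grid method can be sketched in Python as a list of (first roll, second roll) pairs. This assumes two fair six-sided dice; the variable names are illustrative. It counts the diagonal cells and the cells whose total is 6.

```python
from fractions import Fraction
from itertools import product

# Assumption: two fair six-sided dice, so all 36 grid cells are equally likely.
grid = list(product(range(1, 7), repeat=2))

# Diagonal of the grid: the same number on both rolls.
same_number = [cell for cell in grid if cell[0] == cell[1]]

# Cells whose two rolls add up to 6.
total_six = [cell for cell in grid if cell[0] + cell[1] == 6]

p_same = Fraction(len(same_number), len(grid))
p_total_six = Fraction(len(total_six), len(grid))

print(total_six)
print(p_same, p_total_six)
```

The five cells with total 6 are (1, 5), (2, 4), (3, 3), (4, 2), (5, 1), giving 5/36. The diagonal gives 6/36, which Fraction reduces to 1/6.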
213
00:13:49,360 --> 00:13:52,550
214
00:13:52,550 --> 00:13:53,480
OK?
215
00:13:53,480 --> 00:13:56,550
So either by using trees or
grid, you can easily find the
216
00:13:56,550 --> 00:13:57,800
probabilities.
217
00:13:57,800 --> 00:14:04,060
218
00:14:04,060 --> 00:14:06,806
Now, let's look at a few
concrete examples.
219
00:14:06,806 --> 00:14:22,440
220
00:14:22,440 --> 00:14:24,030
Let's see.
221
00:14:24,030 --> 00:14:27,710
Suppose we are throwing
three coins.
222
00:14:27,710 --> 00:14:33,610
Then, what is the probability
of one particular outcome in
223
00:14:33,610 --> 00:14:36,590
that trial, in all
three trials?
224
00:14:36,590 --> 00:14:38,980
What is the probability,
assuming that these are
225
00:14:38,980 --> 00:14:41,020
unbiased coins?
226
00:14:41,020 --> 00:14:44,170
What is the probability of
one particular outcome?
227
00:14:44,170 --> 00:14:46,915
Because how many possible
outcomes are there if you are
228
00:14:46,915 --> 00:14:48,165
throwing three coins?
229
00:14:48,165 --> 00:14:50,440
230
00:14:50,440 --> 00:14:53,410
Consider this tree.
231
00:14:53,410 --> 00:14:54,760
First, it splits into 2.
232
00:14:54,760 --> 00:14:56,040
Then, it splits into 4.
233
00:14:56,040 --> 00:14:57,940
Then?
234
00:14:57,940 --> 00:15:00,860
8, all right?
235
00:15:00,860 --> 00:15:04,990
OK, so there are 8 possible
outcomes.
236
00:15:04,990 --> 00:15:08,110
So each outcome will have
the probability 1/8.
237
00:15:08,110 --> 00:15:12,230
238
00:15:12,230 --> 00:15:16,310
So what is the probability of
heads appearing exactly twice?
239
00:15:16,310 --> 00:15:19,370
240
00:15:19,370 --> 00:15:21,760
How can you do that?
241
00:15:21,760 --> 00:15:24,620
Of course, you can write
the tree and count.
242
00:15:24,620 --> 00:15:26,480
What is the easier way
of doing that?
243
00:15:26,480 --> 00:15:30,450
Since we know this count, since
we know this probability
244
00:15:30,450 --> 00:15:32,310
of a particular event
happening?
245
00:15:32,310 --> 00:15:34,590
How can we come up with
the probability of
246
00:15:34,590 --> 00:15:36,510
getting exactly 2 heads?
247
00:15:36,510 --> 00:15:41,960
248
00:15:41,960 --> 00:15:44,880
It could be head, head, or
tail-- so this is by
249
00:15:44,880 --> 00:15:47,790
enumerating all the
possible outcomes.
250
00:15:47,790 --> 00:15:51,700
So it could have been head,
head, tail, where we put the
251
00:15:51,700 --> 00:15:54,080
tail only at the end.
252
00:15:54,080 --> 00:15:56,545
It could have been
head, tail, head.
253
00:15:56,545 --> 00:16:01,100
Or it could have been
tail, head, head.
254
00:16:01,100 --> 00:16:06,670
In these three cases, you're
getting exactly 2 heads.
255
00:16:06,670 --> 00:16:10,150
So we are enumerating all
possible outcomes.
256
00:16:10,150 --> 00:16:12,560
And we know each possible
outcome will take the
257
00:16:12,560 --> 00:16:14,400
probability 1/8.
258
00:16:14,400 --> 00:16:18,360
So the total probability
here is 3/8.
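A short Python sketch of the same enumeration, assuming three fair coins so that all eight sequences are equally likely. Variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumption: three fair coins, so all 2**3 = 8 sequences are equally likely.
sequences = list(product("HT", repeat=3))

# Keep only the sequences with exactly two heads.
two_heads = [s for s in sequences if s.count("H") == 2]

# Each sequence has probability 1/8, so the answer is (count) / 8.
p_two_heads = Fraction(len(two_heads), len(sequences))

print(["".join(s) for s in two_heads])
print(p_two_heads)
```

The three matching sequences are HHT, HTH and THH, so the result is 3/8.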
259
00:16:18,360 --> 00:16:18,950
OK?
260
00:16:18,950 --> 00:16:21,540
So this is one way of handling
a probability question.
261
00:16:21,540 --> 00:16:25,600
262
00:16:25,600 --> 00:16:28,670
You can do that only because
these are mutually exclusive outcomes.
263
00:16:28,670 --> 00:16:31,240
And you can sum them.
264
00:16:31,240 --> 00:16:32,490
We'll come to that later.
265
00:16:32,490 --> 00:16:43,070
266
00:16:43,070 --> 00:16:47,500
Suppose you are rolling
two four-sided dice.
267
00:16:47,500 --> 00:16:52,000
And assuming they're fair,
how many possible
268
00:16:52,000 --> 00:16:53,900
outcomes are there?
269
00:16:53,900 --> 00:16:59,585
Two four-sided dice, and
assuming that each of them is
270
00:16:59,585 --> 00:17:02,890
fair-- that means unbiased--
271
00:17:02,890 --> 00:17:05,740
how many possible outcomes
are there?
272
00:17:05,740 --> 00:17:08,839
Consider this tree.
273
00:17:08,839 --> 00:17:13,040
First, it branches into 4, OK?
274
00:17:13,040 --> 00:17:15,849
In the first trial, it's a
four-sided dice, so there are
275
00:17:15,849 --> 00:17:17,710
4 possible outcomes.
276
00:17:17,710 --> 00:17:18,960
So it branches into 4.
277
00:17:18,960 --> 00:17:22,770
278
00:17:22,770 --> 00:17:25,329
Then, each branch will,
in turn, fork
279
00:17:25,329 --> 00:17:27,280
into another 4 branches.
280
00:17:27,280 --> 00:17:31,100
So there are 16
outcomes in total.
281
00:17:31,100 --> 00:17:35,540
So what is the probability
of rolling a 2 and a 3?
282
00:17:35,540 --> 00:17:39,900
What is the probability of
rolling a 2 and a 3?
283
00:17:39,900 --> 00:17:44,950
Not in a given order, not
in the given order.
284
00:17:44,950 --> 00:17:46,770
Can anyone give the answer?
285
00:17:46,770 --> 00:17:49,870
286
00:17:49,870 --> 00:17:51,130
OK, let's see.
287
00:17:51,130 --> 00:17:54,510
So we have to roll
a 2 and a 3.
288
00:17:54,510 --> 00:17:57,250
So which means it could have
been 2, 3, or 3, 2.
289
00:17:57,250 --> 00:18:00,350
290
00:18:00,350 --> 00:18:05,730
And we know the probability
of each event is 1/16.
291
00:18:05,730 --> 00:18:07,750
So this will be 1/16.
292
00:18:07,750 --> 00:18:12,230
And this will be 1/16.
293
00:18:12,230 --> 00:18:14,810
So the total probability
is 1/8.
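A hedged Python check of the "2 and a 3 in either order" question, assuming two fair four-sided dice with faces 1 to 4. The names rolls and two_and_three are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumption: two fair four-sided dice, giving 4 * 4 = 16 equally likely pairs.
rolls = list(product(range(1, 5), repeat=2))

# Order does not matter, so compare the sorted pair with (2, 3).
two_and_three = [r for r in rolls if sorted(r) == [2, 3]]

p_two_and_three = Fraction(len(two_and_three), len(rolls))

print(two_and_three, p_two_and_three)
```

Only (2, 3) and (3, 2) qualify, so the probability is 2/16 = 1/8.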
294
00:18:14,810 --> 00:18:17,560
295
00:18:17,560 --> 00:18:23,820
What is the probability of
getting the sum of the rolls
296
00:18:23,820 --> 00:18:25,960
an odd number?
297
00:18:25,960 --> 00:18:28,090
What is the probability of
getting an odd number as sum
298
00:18:28,090 --> 00:18:30,110
of the rolls?
299
00:18:30,110 --> 00:18:33,790
Now, this is getting a bit
tricky because now it's maybe
300
00:18:33,790 --> 00:18:37,880
a bit harder to enumerate
all possible cases.
301
00:18:37,880 --> 00:18:39,130
So how can we do that?
302
00:18:39,130 --> 00:18:46,250
303
00:18:46,250 --> 00:18:47,190
There should be a short cut.
304
00:18:47,190 --> 00:18:48,681
AUDIENCE: It can either
be odd or even.
305
00:18:48,681 --> 00:18:49,540
PROFESSOR: Sorry?
306
00:18:49,540 --> 00:18:51,640
AUDIENCE: You can either
get odd or even.
307
00:18:51,640 --> 00:18:53,620
PROFESSOR: It can be either
odd or even, right?
308
00:18:53,620 --> 00:18:55,230
So it will be 1/2.
309
00:18:55,230 --> 00:18:59,000
OK, there's another trick we
might be able to use to get
310
00:18:59,000 --> 00:19:00,250
the answers quickly.
311
00:19:00,250 --> 00:19:02,640
312
00:19:02,640 --> 00:19:07,060
What is the probability of the
first roll being equal to the
313
00:19:07,060 --> 00:19:08,310
second roll?
314
00:19:08,310 --> 00:19:13,200
315
00:19:13,200 --> 00:19:16,240
In the same line,
you can think.
316
00:19:16,240 --> 00:19:19,840
What is the probability of
getting the first roll equal
317
00:19:19,840 --> 00:19:21,320
to the second roll?
318
00:19:21,320 --> 00:19:22,580
It's quite similar to this.
319
00:19:22,580 --> 00:19:25,870
320
00:19:25,870 --> 00:19:27,120
Any ideas?
321
00:19:27,120 --> 00:19:30,550
322
00:19:30,550 --> 00:19:31,840
It's a four-sided dice.
323
00:19:31,840 --> 00:19:34,380
There are 4 possible outcomes.
324
00:19:34,380 --> 00:19:37,510
This is one case where it could
be 1, 1, or it could be
325
00:19:37,510 --> 00:19:39,900
2, 2, or 3, 3, or 4, 4.
326
00:19:39,900 --> 00:19:45,320
And if it's an n-sided die,
it would be n, right?
327
00:19:45,320 --> 00:19:50,690
So if it's an n-sided die, there
are n possible outcomes
328
00:19:50,690 --> 00:19:56,550
desired, and totally
n by n outcomes.
329
00:19:56,550 --> 00:19:58,960
So you get 1/n probability.
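The two shortcuts above can be checked by brute force. This sketch assumes fair dice with faces 1 to n; the function names p_odd_sum and p_equal_rolls are hypothetical helpers. Note that the "odd or even" shortcut gives 1/2 here because a four-sided die has as many odd faces as even faces; for a die with an odd number of faces the count would differ.

```python
from fractions import Fraction
from itertools import product

def p_odd_sum(n):
    """P(sum of two fair n-sided dice is odd), by enumeration."""
    rolls = list(product(range(1, n + 1), repeat=2))
    odd = sum(1 for a, b in rolls if (a + b) % 2 == 1)
    return Fraction(odd, len(rolls))

def p_equal_rolls(n):
    """P(first roll equals second roll) for two fair n-sided dice."""
    rolls = list(product(range(1, n + 1), repeat=2))
    equal = sum(1 for a, b in rolls if a == b)
    return Fraction(equal, len(rolls))

print(p_odd_sum(4))       # four-sided dice from the example
print(p_equal_rolls(4))   # n desired outcomes out of n * n
print(p_equal_rolls(6))
```

For n = 4 the odd-sum probability is 1/2 and the equal-rolls probability is 4/16 = 1/4, in line with the 1/n rule.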
330
00:19:58,960 --> 00:20:03,240
331
00:20:03,240 --> 00:20:08,340
What is the probability of at
least 1 roll equal to 4?
332
00:20:08,340 --> 00:20:10,340
At least 1 roll equal to 4?
333
00:20:10,340 --> 00:20:14,490
334
00:20:14,490 --> 00:20:15,750
This is very interesting.
335
00:20:15,750 --> 00:20:17,040
These type of questions,
you'll get in
336
00:20:17,040 --> 00:20:19,770
that Psets, I know.
337
00:20:19,770 --> 00:20:21,890
Probably in the quiz, too.
338
00:20:21,890 --> 00:20:25,060
What is the probability
of getting at least 1
339
00:20:25,060 --> 00:20:26,310
roll equal to 4?
340
00:20:26,310 --> 00:20:28,690
341
00:20:28,690 --> 00:20:30,780
OK, so what are the
possible outcomes?
342
00:20:30,780 --> 00:20:35,330
First roll, could be a 4.
343
00:20:35,330 --> 00:20:39,300
And the second roll
could be anything.
344
00:20:39,300 --> 00:20:42,565
345
00:20:42,565 --> 00:20:44,480
Or it could be 4, and
the first roll
346
00:20:44,480 --> 00:20:46,650
could have been anything.
347
00:20:46,650 --> 00:20:49,740
Or both could have been 4, but
we would have considered that
348
00:20:49,740 --> 00:20:50,990
here, as well.
349
00:20:50,990 --> 00:20:56,640
350
00:20:56,640 --> 00:20:59,340
So what we had to do is we had
to calculate this probability
351
00:20:59,340 --> 00:21:02,030
and this probability, add them,
and deduct this, because
352
00:21:02,030 --> 00:21:04,760
this would have been
double counted.
353
00:21:04,760 --> 00:21:08,230
It's quite like, this
intersection.
354
00:21:08,230 --> 00:21:12,490
We want to remove that, and we
want to find the union, OK?
355
00:21:12,490 --> 00:21:15,560
So what is this probability?
356
00:21:15,560 --> 00:21:18,455
Since we don't care about the
second roll, we have to care
357
00:21:18,455 --> 00:21:21,300
only about the first roll,
our first roll
358
00:21:21,300 --> 00:21:24,590
getting 4, which is 1/4.
359
00:21:24,590 --> 00:21:28,450
And this is 1/4 similarly.
360
00:21:28,450 --> 00:21:32,770
And this is 1/4 by
1/4, so 1/16.
361
00:21:32,770 --> 00:21:35,280
So it'll be 1/2 minus 1/16.
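A minimal sketch of the "at least one 4" calculation, assuming two fair four-sided dice. It compares three routes: the add-and-deduct formula from the lecture, the complement (neither roll is a 4), and a direct count. Variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumption: two fair four-sided dice with faces 1 to 4.
p_four = Fraction(1, 4)

# Route 1: add both single-roll probabilities, deduct the double-counted overlap.
p_formula = p_four + p_four - p_four * p_four

# Route 2: complement, 1 minus the probability that neither roll is a 4.
p_complement = 1 - (1 - p_four) ** 2

# Route 3: count the qualifying cells among all 16 outcomes.
rolls = list(product(range(1, 5), repeat=2))
p_counted = Fraction(sum(1 for r in rolls if 4 in r), len(rolls))

print(p_formula, p_complement, p_counted)
```

All three agree on 7/16, which is the value of 1/2 minus 1/16.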
362
00:21:35,280 --> 00:21:37,890
363
00:21:37,890 --> 00:21:41,460
And when you give the answers,
if it's hard, you can just
364
00:21:41,460 --> 00:21:43,050
leave it like this.
365
00:21:43,050 --> 00:21:46,190
So this is what we call giving
the answers as formula instead
366
00:21:46,190 --> 00:21:47,990
of giving exact fractions.
367
00:21:47,990 --> 00:21:50,280
Because sometimes it might be
hard to find the fraction.
368
00:21:50,280 --> 00:21:53,500
Suppose it's something like 1
over, say, 2 to the power 5
369
00:21:53,500 --> 00:21:55,370
and a 3 to the 2, something
like this.
370
00:21:55,370 --> 00:21:56,800
Or we'll say 5.
371
00:21:56,800 --> 00:22:00,095
You're not required to give the
exact value in this amount
372
00:22:00,095 --> 00:22:01,100
or even the fractions.
373
00:22:01,100 --> 00:22:03,180
You can give such formulas.
374
00:22:03,180 --> 00:22:07,205
You can give something like
this, too, to give the inverse
375
00:22:07,205 --> 00:22:10,930
probability of that
not happening.
376
00:22:10,930 --> 00:22:11,400
Let's see.
377
00:22:11,400 --> 00:22:16,310
Let's move into a little bit
more complicated example.
378
00:22:16,310 --> 00:22:18,710
A pack of cards--
379
00:22:18,710 --> 00:22:21,750
what is the probability
of getting an ace?
380
00:22:21,750 --> 00:22:23,000
Anyone?
381
00:22:23,000 --> 00:22:25,442
382
00:22:25,442 --> 00:22:26,354
AUDIENCE: 1 out of 2?
383
00:22:26,354 --> 00:22:28,180
PROFESSOR: 1 out of 2?
384
00:22:28,180 --> 00:22:30,880
AUDIENCE: out of 52.
385
00:22:30,880 --> 00:22:32,920
PROFESSOR: Not a particular--
386
00:22:32,920 --> 00:22:37,055
an ace, yes, just ace.
387
00:22:37,055 --> 00:22:38,438
AUDIENCE: Is it 4 out of 52?
388
00:22:38,438 --> 00:22:42,030
PROFESSOR: 4/52, yes.
389
00:22:42,030 --> 00:22:44,400
Or if you consider one
suit, it would have
390
00:22:44,400 --> 00:22:46,100
been like 1/13, right?
391
00:22:46,100 --> 00:22:48,480
You could have considered
one suit, and out of--
392
00:22:48,480 --> 00:22:50,690
OK.
393
00:22:50,690 --> 00:22:52,612
It's the same analysis, right?
394
00:22:52,612 --> 00:22:54,220
OK.
395
00:22:54,220 --> 00:22:57,630
What is the probability of
getting a specific card, which
396
00:22:57,630 --> 00:22:59,795
means, say, the ace of hearts?
397
00:22:59,795 --> 00:23:04,690
398
00:23:04,690 --> 00:23:08,560
It's what she said,
yeah, 1/52.
399
00:23:08,560 --> 00:23:10,990
What is the probability
of not getting an ace?
400
00:23:10,990 --> 00:23:14,190
401
00:23:14,190 --> 00:23:15,170
AUDIENCE: [INAUDIBLE]?
402
00:23:15,170 --> 00:23:16,750
PROFESSOR: Sorry?
403
00:23:16,750 --> 00:23:18,220
AUDIENCE: 1 minus--
404
00:23:18,220 --> 00:23:19,470
PROFESSOR: 1/13.
405
00:23:19,470 --> 00:23:22,060
406
00:23:22,060 --> 00:23:25,950
OK, this is where we make use
of the inverse probability.
407
00:23:25,950 --> 00:23:29,480
OK, so that will come into
play very often.
408
00:23:29,480 --> 00:23:33,980
OK, now let's get into two
decks of playing cards.
409
00:23:33,980 --> 00:23:39,160
OK, what is the sample size?
410
00:23:39,160 --> 00:23:42,930
What is the sample size
of drawing cards from
411
00:23:42,930 --> 00:23:44,630
two decks of cards?
412
00:23:44,630 --> 00:23:45,420
Two cards, actually.
413
00:23:45,420 --> 00:23:48,110
You're going to draw two cards
from two different decks.
414
00:23:48,110 --> 00:23:51,930
415
00:23:51,930 --> 00:23:53,530
Sorry?
416
00:23:53,530 --> 00:23:54,470
OK.
417
00:23:54,470 --> 00:23:59,850
What is the sample size of
drawing a card from one deck?
418
00:23:59,850 --> 00:24:03,530
There are 52 possible
outcomes.
419
00:24:03,530 --> 00:24:07,890
So for each outcome here, we
have 52 outcomes there, right?
420
00:24:07,890 --> 00:24:09,500
So it's 52 by 52.
421
00:24:09,500 --> 00:24:11,810
It's like the tree, but here,
we have 52 branches.
422
00:24:11,810 --> 00:24:15,060
423
00:24:15,060 --> 00:24:17,830
So eventually, you will
have 52 by 52.
424
00:24:17,830 --> 00:24:19,220
This is where you can't
enumerate all
425
00:24:19,220 --> 00:24:20,650
the possible cases.
426
00:24:20,650 --> 00:24:24,420
So you should have a way
to find the final
427
00:24:24,420 --> 00:24:26,317
probability, OK?
428
00:24:26,317 --> 00:24:29,810
429
00:24:29,810 --> 00:24:33,440
So in this case, what is the
probability of getting at
430
00:24:33,440 --> 00:24:34,775
least one ace?
431
00:24:34,775 --> 00:24:37,820
432
00:24:37,820 --> 00:24:42,280
What's the probability of
getting at least one ace?
433
00:24:42,280 --> 00:24:46,150
This is, again, similar
to this case.
434
00:24:46,150 --> 00:24:47,000
Remember this diagram.
435
00:24:47,000 --> 00:24:48,250
It's called Venn diagram.
436
00:24:48,250 --> 00:24:53,250
437
00:24:53,250 --> 00:24:54,720
Remember this.
438
00:24:54,720 --> 00:24:58,260
So what is the probability of
getting at least one ace,
439
00:24:58,260 --> 00:25:01,170
which means you could have got
the ace from the first deck,
440
00:25:01,170 --> 00:25:03,940
or the second deck, or both.
441
00:25:03,940 --> 00:25:05,950
But if you're getting from both,
you have to deduct it
442
00:25:05,950 --> 00:25:11,510
because otherwise, you would
have double counted it.
443
00:25:11,510 --> 00:25:16,220
So getting an ace from the
first deck is 1/13.
444
00:25:16,220 --> 00:25:18,130
Second deck, 1/13.
445
00:25:18,130 --> 00:25:22,240
Getting from both
is 1/52 by 52.
446
00:25:22,240 --> 00:25:27,460
Sorry, 1/13 by 1/13.
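A hedged Python check of the two-deck question, assuming one card is drawn uniformly at random from each of two standard 52-card decks. The deck construction and variable names are illustrative. It compares the formula 1/13 + 1/13 - 1/13 * 1/13 with a count over all 52 by 52 pairs.

```python
from fractions import Fraction
from itertools import product

# Assumption: standard 52-card decks, one uniformly random card from each.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["spades", "hearts", "clubs", "diamonds"]
deck = list(product(ranks, suits))          # 52 (rank, suit) cards

pairs = list(product(deck, deck))           # 52 * 52 equally likely pairs

# Direct count: at least one of the two cards is an ace.
with_ace = sum(1 for first, second in pairs
               if first[0] == "A" or second[0] == "A")
p_counted = Fraction(with_ace, len(pairs))

# Formula: add both decks, deduct the case counted twice (ace from both).
p_ace = Fraction(4, 52)
p_formula = p_ace + p_ace - p_ace * p_ace

print(p_counted, p_formula)
```

Both give 25/169, so the probability that neither card is an ace is 1 - 25/169 = 144/169.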
447
00:25:27,460 --> 00:25:39,900
448
00:25:39,900 --> 00:25:40,375
Sorry.
449
00:25:40,375 --> 00:25:41,784
AUDIENCE: Are you adding them?
450
00:25:41,784 --> 00:25:46,010
PROFESSOR: Yeah, that's what
I explained earlier.
451
00:25:46,010 --> 00:25:47,310
You're doing two trials.
452
00:25:47,310 --> 00:25:50,400
453
00:25:50,400 --> 00:25:52,290
You could have got the
ace from here.
454
00:25:52,290 --> 00:25:54,310
And this could have
been anything.
455
00:25:54,310 --> 00:25:56,270
You could have got the ace from
here, and this could have
456
00:25:56,270 --> 00:25:57,150
been anything.
457
00:25:57,150 --> 00:26:00,400
You could have got
an ace from both.
458
00:26:00,400 --> 00:26:03,150
So you should add these two
probabilities because we need
459
00:26:03,150 --> 00:26:07,590
a case where at least
one card is ace.
460
00:26:07,590 --> 00:26:10,775
But the problem is, this could
have happened here and here.
461
00:26:10,775 --> 00:26:12,025
And so you will deduct it.
462
00:26:12,025 --> 00:26:16,330
463
00:26:16,330 --> 00:26:20,690
What is the probability of
getting neither card--
464
00:26:20,690 --> 00:26:22,690
what is the probability of
neither card being an ace?
465
00:26:22,690 --> 00:26:26,394
466
00:26:26,394 --> 00:26:27,320
AUDIENCE: 1 minus that?
467
00:26:27,320 --> 00:26:31,890
PROFESSOR: 1 minus
this, exactly.
468
00:26:31,890 --> 00:26:33,040
OK, you're getting comfortable
with the
469
00:26:33,040 --> 00:26:35,550
inverse probability now.
470
00:26:35,550 --> 00:26:42,320
What's the probability of two
cards from the same suit?
471
00:26:42,320 --> 00:26:44,060
What is the probability
of getting two cards
472
00:26:44,060 --> 00:26:45,310
from the same suit?
473
00:26:45,310 --> 00:26:50,290
474
00:26:50,290 --> 00:26:52,810
Now, it's getting interesting.
475
00:26:52,810 --> 00:26:55,390
Two cards from the same suit.
476
00:26:55,390 --> 00:26:58,600
So how can we think
about this?
477
00:26:58,600 --> 00:27:02,910
Of course, you can enumerate all
possible cases and count.
478
00:27:02,910 --> 00:27:04,160
We don't want to do that.
479
00:27:04,160 --> 00:27:08,970
480
00:27:08,970 --> 00:27:13,315
OK, you're going to use the grid
here to visualize this.
481
00:27:13,315 --> 00:27:18,100
482
00:27:18,100 --> 00:27:19,110
OK?
483
00:27:19,110 --> 00:27:21,240
It could have been a
spades, or hearts,
484
00:27:21,240 --> 00:27:22,890
or clubs, or a diamond.
485
00:27:22,890 --> 00:27:29,270
486
00:27:29,270 --> 00:27:32,270
So we want two cards of
the same suit, right?
487
00:27:32,270 --> 00:27:38,440
488
00:27:38,440 --> 00:27:42,280
So it's 4 of the 16 possible
outcomes.
489
00:27:42,280 --> 00:27:45,480
490
00:27:45,480 --> 00:27:47,310
Do you see that?
491
00:27:47,310 --> 00:27:50,270
So see, we are using
all the tools
492
00:27:50,270 --> 00:27:51,650
available at our disposal--
493
00:27:51,650 --> 00:27:58,340
trees, grids, counting, Venn
diagrams, inverse probability.
494
00:27:58,340 --> 00:28:01,000
Yeah, you should be able to
do that to get the answers
495
00:28:01,000 --> 00:28:04,270
quickly because you could have
actually done-- you could have
496
00:28:04,270 --> 00:28:06,130
done something like this, too.
497
00:28:06,130 --> 00:28:08,180
But it will take more
time, right?
498
00:28:08,180 --> 00:28:14,240
So this will be a simpler way
of visualizing things.
499
00:28:14,240 --> 00:28:18,170
What is the probability of
getting neither card a diamond
500
00:28:18,170 --> 00:28:19,420
nor a club?
501
00:28:19,420 --> 00:28:25,300
502
00:28:25,300 --> 00:28:27,615
Neither card is diamond
nor club.
503
00:28:27,615 --> 00:28:28,865
That is tricky.
504
00:28:28,865 --> 00:28:31,000
505
00:28:31,000 --> 00:28:36,080
But since we have this grid, we
can easily visualize that.
506
00:28:36,080 --> 00:28:39,360
507
00:28:39,360 --> 00:28:43,590
So if neither card is diamond
nor club, then it could have
508
00:28:43,590 --> 00:28:45,130
been only these two
values, right?
509
00:28:45,130 --> 00:28:48,740
510
00:28:48,740 --> 00:28:52,530
Which is, again, 4/16.
511
00:28:52,530 --> 00:28:54,200
So there are 4 possible cases.
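A minimal sketch of the grid counting above, assuming the two cards are drawn with replacement so that all 16 suit pairs are equally likely. The names `SUITS`, `grid`, and `probability` are illustrative, not from the lecture code.

```python
from fractions import Fraction
from itertools import product

SUITS = ["spades", "hearts", "clubs", "diamonds"]

# Every (first card suit, second card suit) pair: the 4 x 4 grid.
grid = list(product(SUITS, repeat=2))

def probability(event):
    """Fraction of grid cells for which event(first, second) is true."""
    favorable = sum(1 for first, second in grid if event(first, second))
    return Fraction(favorable, len(grid))

# Two cards from the same suit: the 4 cells on the diagonal.
same_suit = probability(lambda a, b: a == b)

# Neither card is a diamond nor a club: both must be spades or hearts.
allowed = ("spades", "hearts")
neither_diamond_nor_club = probability(lambda a, b: a in allowed and b in allowed)

print(same_suit, neither_diamond_nor_club)  # 1/4 1/4
```

Both answers come out as 4 cells out of 16, which is what the grid shows.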
512
00:28:54,200 --> 00:28:57,490
513
00:28:57,490 --> 00:28:58,740
OK?
514
00:28:58,740 --> 00:29:06,930
515
00:29:06,930 --> 00:29:09,860
So what is the summary?
516
00:29:09,860 --> 00:29:11,870
What is the take home
message here?
517
00:29:11,870 --> 00:29:18,680
518
00:29:18,680 --> 00:29:22,940
Probability is a measure of
the belief, or
519
00:29:22,940 --> 00:29:26,635
the way of expressing
the belief, of a
520
00:29:26,635 --> 00:29:29,320
particular event happening.
521
00:29:29,320 --> 00:29:32,990
Now, there could be several
possible outcomes.
522
00:29:32,990 --> 00:29:35,390
Out of those possible outcomes,
you have a certain
523
00:29:35,390 --> 00:29:37,400
number of desired outcomes.
524
00:29:37,400 --> 00:29:39,900
How can you find that?
525
00:29:39,900 --> 00:29:41,600
You can either enumerate
all of them.
526
00:29:41,600 --> 00:29:44,610
You can put them in a tree, or
you can put them in a grid.
527
00:29:44,610 --> 00:29:48,430
Or you can use some sort of Venn
diagram and come up with
528
00:29:48,430 --> 00:29:50,470
some sort of analysis.
529
00:29:50,470 --> 00:29:57,310
Here, we start with our belief
that the coin is unbiased, or
530
00:29:57,310 --> 00:29:59,350
we have a fair chance
of drawing any card
531
00:29:59,350 --> 00:30:00,820
from the deck of cards.
532
00:30:00,820 --> 00:30:06,650
So we have all these unbiased
beliefs, or beliefs about the
533
00:30:06,650 --> 00:30:09,680
characteristics of each trial.
534
00:30:09,680 --> 00:30:11,140
So we start from that.
535
00:30:11,140 --> 00:30:13,690
536
00:30:13,690 --> 00:30:19,440
Then, we find the probability
of a particular event
537
00:30:19,440 --> 00:30:22,540
happening in a certain
number of trials.
538
00:30:22,540 --> 00:30:30,230
But what if you don't have the
knowledge about the coin?
539
00:30:30,230 --> 00:30:32,250
What if you don't know whether
it's fair or not?
540
00:30:32,250 --> 00:30:37,920
What if you don't know P of A is
equal to H is equal to 1/2?
541
00:30:37,920 --> 00:30:38,910
Suppose you don't know that.
542
00:30:38,910 --> 00:30:42,380
Suppose it's P. How
can you find it?
543
00:30:42,380 --> 00:30:47,250
544
00:30:47,250 --> 00:30:50,800
What you could do is you
could simulate this.
545
00:30:50,800 --> 00:30:55,040
You can throw a coin several
times and count the total
546
00:30:55,040 --> 00:30:58,910
number of heads you get, OK?
547
00:30:58,910 --> 00:31:04,140
So n of heads
over n of trials will
548
00:31:04,140 --> 00:31:05,550
give you the P, right?
549
00:31:05,550 --> 00:31:10,860
550
00:31:10,860 --> 00:31:14,950
This is a way of finding the
probabilities through a
551
00:31:14,950 --> 00:31:16,290
certain number of trials.
552
00:31:16,290 --> 00:31:20,150
It's like simulating
the experiments.
553
00:31:20,150 --> 00:31:21,515
It's called Monte Carlo
simulation.
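A minimal sketch of estimating P from repeated trials, assuming we stand in for the unknown coin with a simulated one. The function name `estimate_p_heads` and the parameters `true_p` and `seed` are illustrative assumptions.

```python
import random

def estimate_p_heads(num_trials, true_p=0.5, seed=None):
    """Estimate P(heads) as (number of heads) / (number of trials)."""
    rng = random.Random(seed)  # a seed makes the run repeatable
    num_heads = sum(1 for _ in range(num_trials) if rng.random() < true_p)
    return num_heads / num_trials

# The estimate gets closer to the hidden value as the trials increase.
for n in (10, 1000, 100000):
    print(n, estimate_p_heads(n, true_p=0.3, seed=0))
```

In a real experiment `true_p` is exactly the thing we do not know. Here it only plays the role of the physical coin.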
554
00:31:21,515 --> 00:31:24,400
555
00:31:24,400 --> 00:31:27,950
And using that, we try
to find a particular
556
00:31:27,950 --> 00:31:29,986
parameter of the model.
557
00:31:29,986 --> 00:31:33,750
You know how they actually found
the value of pi at the
558
00:31:33,750 --> 00:31:35,240
beginning, pi?
559
00:31:35,240 --> 00:31:38,210
560
00:31:38,210 --> 00:31:40,290
It's again using a Monte
Carlo simulation.
561
00:31:40,290 --> 00:31:49,980
What you could do is for a given
radius, you can actually
562
00:31:49,980 --> 00:31:51,920
check whether it lies within
a circle or not.
563
00:31:51,920 --> 00:31:53,680
You can run the Monte
Carlo simulation.
564
00:31:53,680 --> 00:31:59,070
And given this radius, you can
come up with a particular
565
00:31:59,070 --> 00:32:04,520
location at random and check
whether it's within this
566
00:32:04,520 --> 00:32:07,700
boundary or not, OK?
567
00:32:07,700 --> 00:32:10,080
So then, you know the outcome.
568
00:32:10,080 --> 00:32:11,170
You know the outcomes, right?
569
00:32:11,170 --> 00:32:22,280
So suppose this is n_a, And
the total outcome is n_t.
570
00:32:22,280 --> 00:32:24,250
This gives you the
area, right?
571
00:32:24,250 --> 00:32:29,212
We know this is r-squared,
and this is pi r-squared.
572
00:32:29,212 --> 00:32:30,462
Sorry.
573
00:32:30,462 --> 00:32:36,920
574
00:32:36,920 --> 00:32:39,621
This is 4 r-squared, when
this side is 2r, right?
575
00:32:39,621 --> 00:32:44,700
576
00:32:44,700 --> 00:32:47,135
So using this, you can
easily calculate pi.
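A minimal sketch of this calculation, assuming a circle of radius 1 inside a square of side 2. The fraction of random points landing inside the circle, n_a over n_t, approximates the area ratio pi r-squared over 4 r-squared, so pi is roughly 4 times n_a over n_t. The function name `estimate_pi` is illustrative.

```python
import random

def estimate_pi(num_points, seed=None):
    """Estimate pi from random points in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    inside = 0  # n_a: points that fall within the circle of radius 1
    for _ in range(num_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    # n_a / n_t is about (pi * r^2) / (4 * r^2), so multiply by 4.
    return 4.0 * inside / num_points

print(estimate_pi(200000, seed=1))
```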
577
00:32:47,135 --> 00:32:55,020
578
00:32:55,020 --> 00:32:59,270
So now, since we are going to
come up with these parameters
579
00:32:59,270 --> 00:33:05,800
through repeating the trials, we
need to have a standardized
580
00:33:05,800 --> 00:33:08,740
way of finding these
parameters.
581
00:33:08,740 --> 00:33:11,470
We can't simply say
this, right?
582
00:33:11,470 --> 00:33:13,615
Take this example.
583
00:33:13,615 --> 00:33:17,640
You know this MIT
shuttle right?
584
00:33:17,640 --> 00:33:21,380
A shuttle arriving at the
right time, or the time
585
00:33:21,380 --> 00:33:24,870
difference between the arrival
and the actual quoted time can
586
00:33:24,870 --> 00:33:27,380
be plotted in a graph.
587
00:33:27,380 --> 00:33:31,220
So if you put that it is
spread around 0, right?
588
00:33:31,220 --> 00:33:35,010
Probably, or we hope so.
589
00:33:35,010 --> 00:33:36,370
OK?
590
00:33:36,370 --> 00:33:47,840
Now, from this, we can see that
actually the mean of this
591
00:33:47,840 --> 00:33:52,950
simulation will give you the
expected difference in the
592
00:33:52,950 --> 00:33:58,330
time, the expected difference
in the arrival time from the
593
00:33:58,330 --> 00:33:59,580
actual quoted time.
594
00:33:59,580 --> 00:34:01,890
595
00:34:01,890 --> 00:34:06,830
And we hope this expectation
to be 0.
596
00:34:06,830 --> 00:34:09,150
We call that mean.
597
00:34:09,150 --> 00:34:10,400
Mean is taking the average.
598
00:34:10,400 --> 00:34:20,550
599
00:34:20,550 --> 00:34:26,389
But this distribution might
actually give you some
600
00:34:26,389 --> 00:34:29,650
information, some extra
information, as well.
601
00:34:29,650 --> 00:34:34,150
That is, how well we can
actually believe this, how
602
00:34:34,150 --> 00:34:35,700
much we can rely on this.
603
00:34:35,700 --> 00:34:41,340
If the spread is greater,
something like this, then
604
00:34:41,340 --> 00:34:44,449
probably you might actually not
trust the system, right?
605
00:34:44,449 --> 00:34:47,340
Although the mean is 0,
it's going to come
606
00:34:47,340 --> 00:34:48,449
early or late, right?
607
00:34:48,449 --> 00:34:49,699
Which means it's useless.
608
00:34:49,699 --> 00:34:52,750
609
00:34:52,750 --> 00:35:00,090
Similarly, in this case, we have
a spread around mean 0.
610
00:35:00,090 --> 00:35:10,280
But if you take the score, the
marks you get for 6.00, it
611
00:35:10,280 --> 00:35:11,290
could be something like this.
612
00:35:11,290 --> 00:35:13,730
It's not centered
around 0, right?
613
00:35:13,730 --> 00:35:14,790
Hopefully.
614
00:35:14,790 --> 00:35:18,570
It's probably, say, 50.
615
00:35:18,570 --> 00:35:22,850
Then, we actually want the
spread to be small or large?
616
00:35:22,850 --> 00:35:25,420
617
00:35:25,420 --> 00:35:31,290
We want the spread to be large
because we want to distinguish
618
00:35:31,290 --> 00:35:32,580
the levels, right?
619
00:35:32,580 --> 00:35:34,360
The students' level
of understanding.
620
00:35:34,360 --> 00:35:40,270
6.00.
621
00:35:40,270 --> 00:35:44,930
Anyway, so the spread tells
us how much
622
00:35:44,930 --> 00:35:50,650
variation is present in the
distribution of the scores.
623
00:35:50,650 --> 00:35:53,305
We measure that by a variable
called standard deviation.
624
00:35:53,305 --> 00:35:59,340
625
00:35:59,340 --> 00:36:05,630
In this case, this particular
sample will be different from
626
00:36:05,630 --> 00:36:10,770
its mean by a particular
value, right?
627
00:36:10,770 --> 00:36:17,980
We can express that as
x_i minus its mean.
628
00:36:17,980 --> 00:36:19,318
Let's call the mean mu.
629
00:36:19,318 --> 00:36:22,070
630
00:36:22,070 --> 00:36:24,440
So this would be
the difference.
631
00:36:24,440 --> 00:36:29,400
Standard deviation is summing
up all the differences.
632
00:36:29,400 --> 00:36:32,210
But the problem is, when you sum
up the differences, it'll
633
00:36:32,210 --> 00:36:34,210
be 0, right?
634
00:36:34,210 --> 00:36:36,890
The total summation of the
differences will be 0 if
635
00:36:36,890 --> 00:36:42,380
that's how you get the mean
because if you expand this,
636
00:36:42,380 --> 00:36:43,760
it'll be something
like this, right?
637
00:36:43,760 --> 00:36:49,000
638
00:36:49,000 --> 00:36:50,250
Which will be n mu.
639
00:36:50,250 --> 00:37:03,030
640
00:37:03,030 --> 00:37:05,160
Should be equal to 0.
641
00:37:05,160 --> 00:37:08,690
So we have to sum, or
actually take the
642
00:37:08,690 --> 00:37:10,490
differences into account.
643
00:37:10,490 --> 00:37:12,540
So, let's square this.
644
00:37:12,540 --> 00:37:17,350
So now, it will no
longer be 0.
645
00:37:17,350 --> 00:37:21,142
Now, this gives us
the differences.
646
00:37:21,142 --> 00:37:24,330
It's the sum of the squared
differences averaged across
647
00:37:24,330 --> 00:37:25,580
all the samples.
648
00:37:25,580 --> 00:37:27,820
649
00:37:27,820 --> 00:37:29,315
We call this variance.
650
00:37:29,315 --> 00:37:32,280
651
00:37:32,280 --> 00:37:33,370
And the square root of
652
00:37:33,370 --> 00:37:36,555
variance is standard deviation.
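A minimal sketch of these definitions, assuming we divide by the number of samples (the population form). The sample values in `scores` are made up for illustration.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Average of the squared differences from the mean."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def std_dev(xs):
    """Square root of the variance."""
    return math.sqrt(variance(xs))

scores = [40, 45, 50, 55, 60]  # illustrative data, mean 50
mu = mean(scores)

# The plain differences cancel out: sum(x_i) - n * mu = 0.
print(sum(x - mu for x in scores))        # 0.0
# Squaring stops the cancellation.
print(variance(scores), std_dev(scores))
```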
653
00:37:36,555 --> 00:37:45,530
654
00:37:45,530 --> 00:37:47,320
OK?
655
00:37:47,320 --> 00:37:51,930
Now, having a standard
deviation--
656
00:37:51,930 --> 00:37:54,650
657
00:37:54,650 --> 00:37:57,050
so we know the standard
deviation tells you how spread
658
00:37:57,050 --> 00:37:59,910
the distribution is.
659
00:37:59,910 --> 00:38:04,280
But can we actually rely only
on the standard deviation to
660
00:38:04,280 --> 00:38:09,230
determine the consistency
of some event?
661
00:38:09,230 --> 00:38:11,390
Can we?
662
00:38:11,390 --> 00:38:12,070
Probably not.
663
00:38:12,070 --> 00:38:19,050
Suppose we take two examples.
One is the scores, centered at 50.
664
00:38:19,050 --> 00:38:20,920
And suppose the standard
deviation is minus
665
00:38:20,920 --> 00:38:24,270
10, plus 10, OK?
666
00:38:24,270 --> 00:38:26,290
So the standard deviation
is 10 here.
667
00:38:26,290 --> 00:38:29,720
Suppose it lies in this form.
668
00:38:29,720 --> 00:38:34,420
Consider another example, the
weight, the weight of the
669
00:38:34,420 --> 00:38:38,360
people, like say at MIT.
670
00:38:38,360 --> 00:38:44,850
And suppose it's centered
around 150.
671
00:38:44,850 --> 00:38:50,120
Now, if the standard deviation
is, say, 10, then the standard
672
00:38:50,120 --> 00:38:53,780
deviation 10 here and the
standard deviation 10 here
673
00:38:53,780 --> 00:38:59,640
don't convey the same
message, OK?
674
00:38:59,640 --> 00:39:07,110
So we need to have a different
way of expressing the
675
00:39:07,110 --> 00:39:10,650
consistency of a distribution.
676
00:39:10,650 --> 00:39:29,110
So we represent it by
coefficient of variation,
677
00:39:29,110 --> 00:39:37,690
which is equal to the standard
deviation divided by mean.
678
00:39:37,690 --> 00:39:42,810
679
00:39:42,810 --> 00:39:47,240
Now here, it will be 10/150.
680
00:39:47,240 --> 00:39:50,810
Here, it will be 10/50.
681
00:39:50,810 --> 00:39:55,100
So we know this is more
consistent than this.
682
00:39:55,100 --> 00:40:01,110
The weights of the students at
MIT, it's more consistent than
683
00:40:01,110 --> 00:40:05,215
the marks you might get,
or you get, for 6.00.
684
00:40:05,215 --> 00:40:06,465
It might be true.
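A minimal sketch of the comparison above, using the same numbers: a standard deviation of 10 around a mean score of 50, and a standard deviation of 10 around a mean weight of 150. The function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation divided by the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for mean 0")
    return std_dev / mean

cv_scores = coefficient_of_variation(10, 50)    # 10/50
cv_weights = coefficient_of_variation(10, 150)  # 10/150

# The smaller ratio is the more consistent distribution.
print(cv_scores, cv_weights, cv_weights < cv_scores)
```

Note that the ratio is not meaningful when the mean is 0, as in the shuttle example, which is why the sketch refuses that case.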
685
00:40:06,465 --> 00:40:11,120
686
00:40:11,120 --> 00:40:16,530
Now, what is the use of
the standard deviation?
687
00:40:16,530 --> 00:40:17,780
How can we use that?
688
00:40:17,780 --> 00:40:20,610
689
00:40:20,610 --> 00:40:28,220
Let's look at this graph where
suppose the mean is 0 and the
690
00:40:28,220 --> 00:40:31,150
standard deviation is, say, 5.
691
00:40:31,150 --> 00:40:34,370
692
00:40:34,370 --> 00:40:37,460
Consider another example where
standard deviation is 10.
693
00:40:37,460 --> 00:40:43,890
694
00:40:43,890 --> 00:40:46,510
It might have been
like this, OK?
695
00:40:46,510 --> 00:40:58,680
Now, before that, let me sort
of digress a little bit so I
696
00:40:58,680 --> 00:41:00,030
can explain this better.
697
00:41:00,030 --> 00:41:03,120
698
00:41:03,120 --> 00:41:08,106
We can take the outcome of a
particular event as a sample
699
00:41:08,106 --> 00:41:10,440
in our distribution.
700
00:41:10,440 --> 00:41:12,920
So suppose you're
throwing a die.
701
00:41:12,920 --> 00:41:15,560
So you get an outcome.
702
00:41:15,560 --> 00:41:21,650
You can represent that outcome
as a distribution, OK?
703
00:41:21,650 --> 00:41:29,810
So here, there's x, which
can take 1 to, say, 6.
704
00:41:29,810 --> 00:41:33,450
And we can represent x_i as
a sample point in our
705
00:41:33,450 --> 00:41:36,110
distribution.
706
00:41:36,110 --> 00:41:43,060
So I don't know, it might be
uniform, probably, we hope.
707
00:41:43,060 --> 00:41:46,750
So it's with 1/6 probability,
we always take
708
00:41:46,750 --> 00:41:47,402
one of these values.
709
00:41:47,402 --> 00:41:48,652
OK.
710
00:41:48,652 --> 00:41:50,120
711
00:41:50,120 --> 00:41:53,020
But this might not be the
case with all events.
712
00:41:53,020 --> 00:41:57,250
713
00:41:57,250 --> 00:42:02,090
OK, so what I'm trying to say
here is you can actually
714
00:42:02,090 --> 00:42:07,040
represent the outcome of the
trial in the distribution.
715
00:42:07,040 --> 00:42:10,410
Or you can also represent the
probability of something
716
00:42:10,410 --> 00:42:12,115
happening in a distribution.
717
00:42:12,115 --> 00:42:15,650
718
00:42:15,650 --> 00:42:16,700
How does it work?
719
00:42:16,700 --> 00:42:19,840
OK, in this case, we
throw our die.
720
00:42:19,840 --> 00:42:20,930
We get an outcome.
721
00:42:20,930 --> 00:42:23,170
We go and put it
in the x-axis.
722
00:42:23,170 --> 00:42:26,140
It could be between 1 and 6.
723
00:42:26,140 --> 00:42:27,575
And it takes this
distribution.
724
00:42:27,575 --> 00:42:30,220
725
00:42:30,220 --> 00:42:34,550
In addition, what you could
do is you could
726
00:42:34,550 --> 00:42:36,780
have, say, 100 trials.
727
00:42:36,780 --> 00:42:38,810
So you throw a coin.
728
00:42:38,810 --> 00:42:40,270
You take 100 trials.
729
00:42:40,270 --> 00:42:44,570
You get the mean, you get the
probability of getting a head.
730
00:42:44,570 --> 00:42:46,850
And you have that mean, right?
731
00:42:46,850 --> 00:42:49,230
So probability of getting
a head for 100
732
00:42:49,230 --> 00:42:53,690
trials, say, 0.51.
733
00:42:53,690 --> 00:42:58,610
You do another 100 trials,
you got another one.
734
00:42:58,610 --> 00:43:00,900
So you have now another
distribution.
735
00:43:00,900 --> 00:43:03,600
So there's a distribution
of probabilities.
736
00:43:03,600 --> 00:43:05,910
So you can have a distribution
of probabilities, or you can
737
00:43:05,910 --> 00:43:08,600
have a distribution
for the events.
738
00:43:08,600 --> 00:43:12,280
We handle these two cases
in the p-set.
739
00:43:12,280 --> 00:43:16,150
So probably you should be able
to distinguish those two.
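A minimal sketch of the two kinds of distribution, assuming a fair die and a fair coin. The first collects raw outcomes of single trials. The second collects one estimated probability per batch of 100 flips. All names and the batch counts are illustrative.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the run is repeatable

# Distribution of events: each sample is the outcome of one die throw.
die_rolls = [rng.randint(1, 6) for _ in range(600)]
outcome_counts = Counter(die_rolls)

def head_fraction(num_flips):
    """Fraction of heads in one batch of flips of a fair coin."""
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips

# Distribution of probabilities: each sample is one 100-flip estimate.
head_fractions = [head_fraction(100) for _ in range(50)]

print(sorted(outcome_counts.items()))
print(min(head_fractions), max(head_fractions))
```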
740
00:43:16,150 --> 00:43:21,410
Anyway, so here in this
particular example, let's take
741
00:43:21,410 --> 00:43:23,710
this as our mu.
742
00:43:23,710 --> 00:43:25,700
Let's take this as our
standard deviation.
743
00:43:25,700 --> 00:43:29,090
And for the first distribution,
let's take the
744
00:43:29,090 --> 00:43:30,870
standard deviation to be 5.
745
00:43:30,870 --> 00:43:32,920
When the standard deviation
is greater, it's
746
00:43:32,920 --> 00:43:36,130
going to be more spread.
747
00:43:36,130 --> 00:43:39,320
It's going to be more
distributed than the former.
748
00:43:39,320 --> 00:43:41,960
So here, say the standard
deviation is 10.
749
00:43:41,960 --> 00:43:45,330
750
00:43:45,330 --> 00:43:49,200
The standard deviation is a
way of expressing how many
751
00:43:49,200 --> 00:43:53,790
items, how many samples are
going to lie between those
752
00:43:53,790 --> 00:43:56,610
particular boundaries.
753
00:43:56,610 --> 00:44:02,080
So for a normal distribution, we
know the exact area, exact
754
00:44:02,080 --> 00:44:03,950
probability of things
happening.
755
00:44:03,950 --> 00:44:07,670
756
00:44:07,670 --> 00:44:12,320
Around the mean mu, we know
within the first standard
757
00:44:12,320 --> 00:44:26,710
deviation, about 68% of
events lie in that area.
758
00:44:26,710 --> 00:44:28,135
Within two standard
deviations--
759
00:44:28,135 --> 00:44:34,150
760
00:44:34,150 --> 00:44:39,465
OK, one standard
deviation, 68%.
761
00:44:39,465 --> 00:44:42,580
762
00:44:42,580 --> 00:44:45,520
Two standard deviations
on either side,
763
00:44:45,520 --> 00:44:47,910
it's going to be 95%.
764
00:44:47,910 --> 00:44:53,110
Three standard deviations,
it's going to be about 99.7%.
765
00:44:53,110 --> 00:44:59,760
So suppose you conducted
so many trials.
766
00:44:59,760 --> 00:45:02,260
And you get the values.
767
00:45:02,260 --> 00:45:08,500
And in the distribution, suppose
mu, mean, is 10, and
768
00:45:08,500 --> 00:45:09,750
the standard deviation
is, say, 1.
769
00:45:09,750 --> 00:45:12,510
770
00:45:12,510 --> 00:45:19,430
So now, with about 99.7% confidence, we
can say then the outcome of
771
00:45:19,430 --> 00:45:22,710
the next trial is going
to be between what?
772
00:45:22,710 --> 00:45:26,200
773
00:45:26,200 --> 00:45:31,250
7 and 13, right?
774
00:45:31,250 --> 00:45:34,480
So this is where finding the
distribution and standard
775
00:45:34,480 --> 00:45:40,540
deviation helps us give
a confidence interval,
776
00:45:40,540 --> 00:45:43,340
expressing our belief of that
particular event happening.
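A minimal sketch of where these percentages come from, assuming the distribution is normal. For a normal distribution the fraction within k standard deviations of the mean is erf(k / sqrt(2)), which gives roughly 68%, 95%, and 99.7% for k = 1, 2, 3. The function name is illustrative.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))

# The example above: mean 10, standard deviation 1, three deviations.
mu, sigma = 10, 1
interval = (mu - 3 * sigma, mu + 3 * sigma)
print(interval)  # (7, 13)
```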
777
00:45:43,340 --> 00:45:47,050
778
00:45:47,050 --> 00:45:51,290
We will look at a few examples
because you might need this in
779
00:45:51,290 --> 00:45:52,540
your p-set.
780
00:45:52,540 --> 00:46:18,600
781
00:46:18,600 --> 00:46:20,156
So this particular
function you have
782
00:46:20,156 --> 00:46:22,930
already seen in the lecture.
783
00:46:22,930 --> 00:46:27,310
784
00:46:27,310 --> 00:46:31,870
But we need to understand
this particular part.
785
00:46:31,870 --> 00:46:35,160
786
00:46:35,160 --> 00:46:39,150
Suppose you have a probability
of something happening.
787
00:46:39,150 --> 00:46:40,620
Suppose you estimated
the probability
788
00:46:40,620 --> 00:46:41,300
of something happening.
789
00:46:41,300 --> 00:46:46,880
Suppose you're given the
coin is biased, OK?
790
00:46:46,880 --> 00:46:47,940
Sorry, unbiased.
791
00:46:47,940 --> 00:46:51,840
So we know p of H
is equal to 1/2.
792
00:46:51,840 --> 00:46:55,210
How can we simulate
an outcome?
793
00:46:55,210 --> 00:46:57,740
How can you simulate an outcome
and see whether it's a
794
00:46:57,740 --> 00:47:01,030
head or a tail with this
particular probability?
795
00:47:01,030 --> 00:47:08,090
We do that by calling this
function, random.random(),
796
00:47:08,090 --> 00:47:12,160
which is going to give you a
random value between 0 and 1.
797
00:47:12,160 --> 00:47:14,160
And you're going to
check whether it's
798
00:47:14,160 --> 00:47:16,300
below this or not.
799
00:47:16,300 --> 00:47:19,710
If it's below this, we
can take it as head.
800
00:47:19,710 --> 00:47:21,620
If it's not, it's tail.
801
00:47:21,620 --> 00:47:25,740
And this will happen with
probability 1/2, because the
802
00:47:25,740 --> 00:47:29,780
random function is going to
return a value between 0 and 1
803
00:47:29,780 --> 00:47:31,180
with equal probabilities.
804
00:47:31,180 --> 00:47:34,270
It's uniform probabilities.
805
00:47:34,270 --> 00:47:37,970
So to simulate a head or tail,
you call that function.
806
00:47:37,970 --> 00:47:41,930
You write the expression
like that, OK?
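A minimal sketch of that expression. `random.random()` returns a float in the range 0.0 up to but not including 1.0, so comparing it with the probability of heads gives a head with that probability. The function name `flip` and the parameter `p_heads` are illustrative.

```python
import random

def flip(p_heads=0.5):
    """Return 'H' with probability p_heads, otherwise 'T'."""
    return "H" if random.random() < p_heads else "T"

print([flip() for _ in range(10)])
```

The same comparison works for a biased coin: pass a different `p_heads`.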
807
00:47:41,930 --> 00:47:48,180
808
00:47:48,180 --> 00:47:53,110
Then, if you consider this
example, for a certain number
809
00:47:53,110 --> 00:47:56,890
of flips, we simulate
the event.
810
00:47:56,890 --> 00:47:58,970
And we count the number
of heads we obtain.
811
00:47:58,970 --> 00:48:03,950
812
00:48:03,950 --> 00:48:06,240
And also from that, you
can calculate the
813
00:48:06,240 --> 00:48:07,590
number of tails as well.
814
00:48:07,590 --> 00:48:11,580
If you know the total flips, you
know the number of tails.
815
00:48:11,580 --> 00:48:14,765
Using that, we are taking
two quantities.
816
00:48:14,765 --> 00:48:16,890
Now, the ratio between the
heads and tails, and the
817
00:48:16,890 --> 00:48:19,690
difference between
heads and tails.
818
00:48:19,690 --> 00:48:24,160
We are doing this for a certain
number of trials.
819
00:48:24,160 --> 00:48:27,530
And we're going to take the mean
and standard deviation of
820
00:48:27,530 --> 00:48:32,220
these trials, OK?
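A minimal sketch of this kind of experiment, not the lecture's own function. It assumes a fair coin, takes the absolute difference between heads and tails, and skips a trial if it has no tails, since the ratio is undefined there. All names are illustrative.

```python
import math
import random

def run_trial(num_flips, rng):
    """Flip a fair coin num_flips times; return (heads, tails)."""
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads, num_flips - heads

def mean_and_std(values):
    """Mean and (population) standard deviation of a list of numbers."""
    if not values:
        raise ValueError("need at least one value")
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var)

def summarize(num_flips, num_trials, seed=None):
    """Mean and std of heads/tails ratios and |heads - tails| over trials."""
    rng = random.Random(seed)
    ratios, diffs = [], []
    for _ in range(num_trials):
        heads, tails = run_trial(num_flips, rng)
        if tails == 0:
            continue  # ratio undefined; assumption: drop this trial
        ratios.append(heads / tails)
        diffs.append(abs(heads - tails))
    return mean_and_std(ratios), mean_and_std(diffs)

print(summarize(10000, 20, seed=0))
```

Each trial contributes one ratio and one difference, so the distribution being summarized is a distribution over trials, not over single flips.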
821
00:48:32,220 --> 00:48:38,170
So here in our distribution,
what are we considering?
822
00:48:38,170 --> 00:48:42,220
823
00:48:42,220 --> 00:48:46,000
What is going to build our
distribution here?
824
00:48:46,000 --> 00:48:49,010
825
00:48:49,010 --> 00:48:50,310
The ratios, right?
826
00:48:50,310 --> 00:48:53,560
The ratios of the events.
827
00:48:53,560 --> 00:48:58,130
And we simulated a certain number
of trials to get those
828
00:48:58,130 --> 00:49:01,570
events, OK?
829
00:49:01,570 --> 00:49:04,470
Only if you simulate a certain
number of trials, you can
830
00:49:04,470 --> 00:49:08,240
actually summarize the outcome
of the events in mean and
831
00:49:08,240 --> 00:49:10,530
standard deviation.
832
00:49:10,530 --> 00:49:14,480
This is exactly like the
difference in the times of the
833
00:49:14,480 --> 00:49:20,350
bus arriving and the
quoted times.
834
00:49:20,350 --> 00:49:23,700
Let's check this example.
835
00:49:23,700 --> 00:49:25,010
Let's plot this and see.
836
00:49:25,010 --> 00:49:41,830
837
00:49:41,830 --> 00:49:43,080
It's going to take a while.
838
00:49:43,080 --> 00:49:48,390
839
00:49:48,390 --> 00:49:51,590
OK, that's another thing I want
to explain here because
840
00:49:51,590 --> 00:49:53,555
since you're going to
be going to plot--
841
00:49:53,555 --> 00:49:58,050
we are going to use PyLab
extensively and plot graphs.
842
00:49:58,050 --> 00:50:02,090
You'll need to put a title and
labels to all the plots you're
843
00:50:02,090 --> 00:50:03,040
generating.
844
00:50:03,040 --> 00:50:06,160
Plus, you can use this
text to actually put
845
00:50:06,160 --> 00:50:07,190
the text in the graph.
846
00:50:07,190 --> 00:50:09,720
We will show that in a while.
847
00:50:09,720 --> 00:50:10,970
Plus--
848
00:50:10,970 --> 00:50:13,250
849
00:50:13,250 --> 00:50:14,230
here, sorry.
850
00:50:14,230 --> 00:50:19,310
If you want to change the axis
to log-log scale, you can call
851
00:50:19,310 --> 00:50:24,310
this command at the end after
calling the plot.
852
00:50:24,310 --> 00:50:27,250
Because you might sometimes need
to change the axis to log
853
00:50:27,250 --> 00:50:28,913
scale in x and y-axis.
854
00:50:28,913 --> 00:50:33,930
855
00:50:33,930 --> 00:50:40,260
So this is the mean,
heads versus tails.
856
00:50:40,260 --> 00:50:45,760
And if you can see it, the mean
tends to be 1 when we
857
00:50:45,760 --> 00:50:48,870
have a large number of flips.
858
00:50:48,870 --> 00:50:52,860
So to get the consistency,
we need to simulate
859
00:50:52,860 --> 00:50:55,950
a large number of trials.
860
00:50:55,950 --> 00:51:00,610
Then only it will tend to be
close to the mean, OK?
861
00:51:00,610 --> 00:51:04,000
862
00:51:04,000 --> 00:51:07,740
This is sort of a way of
checking the evolution of the
863
00:51:07,740 --> 00:51:13,540
series by actually doing it for
a certain number of flips
864
00:51:13,540 --> 00:51:14,926
at every time.
865
00:51:14,926 --> 00:51:19,280
So it's quite like
a scatter plot.
866
00:51:19,280 --> 00:51:27,250
A scatter plot is like plotting
the outcomes of our
867
00:51:27,250 --> 00:51:28,500
experiments.
868
00:51:28,500 --> 00:51:30,530
869
00:51:30,530 --> 00:51:36,230
Suppose it's x1 and
x2 in a graph.
870
00:51:36,230 --> 00:51:37,420
So we are going to say--
871
00:51:37,420 --> 00:51:42,800
so for example, suppose you
have a variable, and the
872
00:51:42,800 --> 00:51:46,090
variable causes an outcome--
873
00:51:46,090 --> 00:51:51,360
a probability of the coin flip,
so p of H. And it can
874
00:51:51,360 --> 00:51:57,290
result in a certain number of
heads appearing, say n of H.
875
00:51:57,290 --> 00:52:02,150
Now, you can do a scatter plot
between these two variables.
876
00:52:02,150 --> 00:52:04,050
And it will be probably
a spread.
877
00:52:04,050 --> 00:52:07,990
But we know that if you increase
the probability of
878
00:52:07,990 --> 00:52:11,510
heads, the number of heads is
going to increase as well.
879
00:52:11,510 --> 00:52:13,340
So it would be probably
something like this.
880
00:52:13,340 --> 00:52:18,320
881
00:52:18,320 --> 00:52:20,742
From this, we can assume
that it's linear or
882
00:52:20,742 --> 00:52:21,140
something like that.
883
00:52:21,140 --> 00:52:24,660
But the scatter plot is actually
representing the
884
00:52:24,660 --> 00:52:28,620
outcomes of the trial versus
some other variable in the
885
00:52:28,620 --> 00:52:30,877
graph and visualizing it.
886
00:52:30,877 --> 00:52:33,560
887
00:52:33,560 --> 00:52:36,230
And let me show the last
graph, and we'll
888
00:52:36,230 --> 00:52:37,480
be done with that.
889
00:52:37,480 --> 00:52:52,040
890
00:52:52,040 --> 00:52:55,710
So this, again, we actually
know, instead of putting a
891
00:52:55,710 --> 00:53:00,080
scatter plot, we're actually
giving the distribution as a
892
00:53:00,080 --> 00:53:06,340
histogram and printing a
text box in the graph.
893
00:53:06,340 --> 00:53:09,970
This might be useful if you want
to display something on
894
00:53:09,970 --> 00:53:12,840
your graph.
895
00:53:12,840 --> 00:53:15,990
I guess we will be uploading
the code to the site.
896
00:53:15,990 --> 00:53:19,080
So you can check the code
if you want later, OK?
897
00:53:19,080 --> 00:53:20,510
Sure.
898
00:53:20,510 --> 00:53:21,760
See you next week.
899
00:53:21,760 --> 00:53:27,615