1
00:00:00,360 --> 00:00:03,020
Let us now put to use our
understanding of the
2
00:00:03,020 --> 00:00:06,000
coin-tossing model and the
associated binomial
3
00:00:06,000 --> 00:00:07,550
probabilities.
4
00:00:07,550 --> 00:00:10,210
We will solve the following
problem.
5
00:00:10,210 --> 00:00:13,910
We have a coin, which
is tossed 10 times.
6
00:00:13,910 --> 00:00:17,710
And we're told that exactly
three out of the 10 tosses
7
00:00:17,710 --> 00:00:19,800
resulted in heads.
8
00:00:19,800 --> 00:00:22,530
Given this information, we would
like to calculate the
9
00:00:22,530 --> 00:00:27,310
probability that the first
two tosses were heads.
10
00:00:27,310 --> 00:00:29,940
This is a question of
calculating a conditional
11
00:00:29,940 --> 00:00:33,670
probability of one event
given another.
12
00:00:33,670 --> 00:00:37,720
The conditional probability of
event A, namely that the first
13
00:00:37,720 --> 00:00:41,480
two tosses were heads, given
that another event B has
14
00:00:41,480 --> 00:00:45,250
occurred, namely that we
had exactly three heads
15
00:00:45,250 --> 00:00:47,600
out of the 10 tosses.
16
00:00:47,600 --> 00:00:51,560
However, before we can start
working towards the solution
17
00:00:51,560 --> 00:00:54,990
to this problem, we need to
specify a probability model
18
00:00:54,990 --> 00:00:57,080
that we will be working with.
19
00:00:57,080 --> 00:01:00,070
We need to be explicit about
our assumptions.
20
00:01:00,070 --> 00:01:03,290
To this effect, let us introduce
the following
21
00:01:03,290 --> 00:01:04,780
assumptions.
22
00:01:04,780 --> 00:01:08,140
We will assume that the
different coin tosses are
23
00:01:08,140 --> 00:01:09,390
independent.
24
00:01:09,390 --> 00:01:13,270
In addition, we will assume
that each coin toss has a
25
00:01:13,270 --> 00:01:18,300
fixed probability, p, the same
for each toss, that the
26
00:01:18,300 --> 00:01:21,560
particular toss results
in heads.
27
00:01:21,560 --> 00:01:24,090
These are the exact same
assumptions that we made
28
00:01:24,090 --> 00:01:27,580
earlier when we derived the
binomial probabilities.
29
00:01:27,580 --> 00:01:31,450
And in particular, we have the
following formula that if we
30
00:01:31,450 --> 00:01:35,650
have n tosses, the probability
that we obtain exactly k heads
31
00:01:35,650 --> 00:01:37,880
is given by this expression.
32
00:01:37,880 --> 00:01:42,550
So now, we have a model in place
and also the tools that
33
00:01:42,550 --> 00:01:46,400
we can use to analyze this
particular model.
34
00:01:46,400 --> 00:01:48,289
Let us start working
towards a solution.
35
00:01:48,289 --> 00:01:51,600
Actually, we will develop two
different solutions and
36
00:01:51,600 --> 00:01:54,200
compare them at the end.
37
00:01:54,200 --> 00:01:56,060
The first approach,
which is the
38
00:01:56,060 --> 00:01:58,710
safest one, is the following.
39
00:01:58,710 --> 00:02:02,125
Since we want to calculate a
conditional probability, let
40
00:02:02,125 --> 00:02:05,000
us just start with the
definition of conditional
41
00:02:05,000 --> 00:02:06,520
probabilities.
42
00:02:06,520 --> 00:02:10,228
The conditional probability of
an event given another event
43
00:02:10,228 --> 00:02:13,970
is the probability that both
events happen, divided by the
44
00:02:13,970 --> 00:02:16,540
probability of the conditioning
event.
45
00:02:16,540 --> 00:02:21,070
Now, let us specialize to the
particular example that we're
46
00:02:21,070 --> 00:02:23,740
trying to solve.
47
00:02:23,740 --> 00:02:27,110
So in the numerator, we're
talking about the probability
48
00:02:27,110 --> 00:02:31,329
that event A happens and
event B happens.
49
00:02:31,329 --> 00:02:33,060
What does that mean?
50
00:02:33,060 --> 00:02:35,440
This means that event
A happens--
51
00:02:35,440 --> 00:02:39,970
that is, the first two tosses
resulted in heads, which I'm
52
00:02:39,970 --> 00:02:43,440
going to denote symbolically
this way.
53
00:02:43,440 --> 00:02:47,610
But in addition to that,
event B happens.
54
00:02:47,610 --> 00:02:51,260
And event B requires that there
is a total of three
55
00:02:51,260 --> 00:02:57,720
heads, which means that we
had one more head in
56
00:02:57,720 --> 00:02:59,250
the remaining tosses.
57
00:02:59,250 --> 00:03:07,510
So we have one head in tosses
3 all the way to 10.
58
00:03:15,810 --> 00:03:19,530
As for the denominator, let's
keep it the way it is for now.
59
00:03:22,410 --> 00:03:26,270
So let's continue with
the numerator.
60
00:03:26,270 --> 00:03:29,380
We're talking about the
probability of two events
61
00:03:29,380 --> 00:03:33,970
happening, that the first two
tosses were heads and that in
62
00:03:33,970 --> 00:03:38,700
tosses 3 up to 10, we had
exactly one head.
63
00:03:38,700 --> 00:03:42,550
Here comes the independence
assumption.
64
00:03:42,550 --> 00:03:45,750
Because the different tosses
are independent, whatever
65
00:03:45,750 --> 00:03:49,650
happens in the first two tosses
is independent from
66
00:03:49,650 --> 00:03:53,480
whatever happened in
tosses 3 up to 10.
67
00:03:53,480 --> 00:03:57,270
So the probability of these two
events happening is the
68
00:03:57,270 --> 00:04:00,420
product of their individual
probabilities.
69
00:04:00,420 --> 00:04:04,880
So we first have the probability
that the first two
70
00:04:04,880 --> 00:04:10,170
tosses were heads, which
is p squared.
71
00:04:10,170 --> 00:04:13,610
And we need to multiply it
with the probability that
72
00:04:13,610 --> 00:04:17,230
there was exactly one head
in the tosses numbered
73
00:04:17,230 --> 00:04:19,019
from 3 up to 10.
74
00:04:19,019 --> 00:04:21,450
These are eight tosses.
75
00:04:21,450 --> 00:04:25,980
The probability of one head in
eight tosses is given by the
76
00:04:25,980 --> 00:04:31,820
binomial formula, with k equal
to 1 and n equal to 8.
77
00:04:31,820 --> 00:04:38,190
So this expression, this part,
becomes 8 choose 1, p to the
78
00:04:38,190 --> 00:04:44,790
first power times 1 minus
p to the seventh power.
79
00:04:44,790 --> 00:04:47,560
So this is the numerator.
80
00:04:47,560 --> 00:04:51,420
The denominator is
easier to find.
81
00:04:51,420 --> 00:04:53,230
This is the probability
that we had
82
00:04:53,230 --> 00:04:55,909
three heads in 10 tosses.
83
00:04:55,909 --> 00:04:57,690
So we just use this formula.
84
00:04:57,690 --> 00:05:04,220
The probability of three heads
is given by: 10 tosses choose
85
00:05:04,220 --> 00:05:10,685
three, p to the third, times 1
minus p to the seventh power.
86
00:05:13,740 --> 00:05:16,690
And here we notice that terms
in the numerator and
87
00:05:16,690 --> 00:05:23,350
denominator cancel out, and we
obtain 8 choose 1 divided by
88
00:05:23,350 --> 00:05:25,690
10 choose 3.
89
00:05:25,690 --> 00:05:29,480
And to simplify things
just a little more,
90
00:05:29,480 --> 00:05:31,090
what is 8 choose 1?
91
00:05:31,090 --> 00:05:34,150
This is the number of ways that
we can choose one item
92
00:05:34,150 --> 00:05:36,090
out of eight items.
93
00:05:36,090 --> 00:05:37,680
And this is just 8.
94
00:05:42,220 --> 00:05:45,930
And let's leave the denominator
the way it is.
95
00:05:45,930 --> 00:05:51,150
So this is the answer to the
question that we had.
96
00:05:51,150 --> 00:05:55,840
And now let us work towards
developing a second approach
97
00:05:55,840 --> 00:05:58,140
towards this particular
answer.
98
00:06:00,710 --> 00:06:05,790
In our second approach, we start
first by looking at the
99
00:06:05,790 --> 00:06:08,750
sample space and understanding
what
100
00:06:08,750 --> 00:06:12,110
conditioning is all about.
101
00:06:12,110 --> 00:06:16,760
In our model, we have
a sample space.
102
00:06:16,760 --> 00:06:19,660
As usual we can denote
it by omega.
103
00:06:19,660 --> 00:06:25,380
And the sample space contains a
bunch of possible outcomes.
104
00:06:25,380 --> 00:06:33,090
A typical outcome is going to
be a sequence of length 10.
105
00:06:33,090 --> 00:06:36,409
It's a sequence of
heads or tails.
106
00:06:36,409 --> 00:06:39,070
And it's a sequence that
has length 10.
107
00:06:42,680 --> 00:06:45,870
We want to calculate conditional
probabilities.
108
00:06:45,870 --> 00:06:50,290
And this places us in a
conditional universe.
109
00:06:50,290 --> 00:06:54,500
We have the conditioning event
B, which is some set.
110
00:06:58,280 --> 00:07:02,010
And conditional probabilities
are probabilities defined
111
00:07:02,010 --> 00:07:05,810
inside this set B and define
the probabilities, the
112
00:07:05,810 --> 00:07:09,140
conditional probabilities of
the different outcomes.
113
00:07:09,140 --> 00:07:11,460
What are the elements
of the set B?
114
00:07:11,460 --> 00:07:17,140
A typical element of the set B
is a sequence, which is, again
115
00:07:17,140 --> 00:07:23,630
of length 10, but has
exactly three heads.
116
00:07:23,630 --> 00:07:26,593
So these are the three-head
sequences.
117
00:07:32,580 --> 00:07:37,750
Now, since we're conditioning
on event B, we can just work
118
00:07:37,750 --> 00:07:39,790
with conditional
probabilities.
119
00:07:39,790 --> 00:07:44,850
So let us find the conditional
probability law.
120
00:07:44,850 --> 00:07:50,880
Recall that any three-head
sequence has the same
121
00:07:50,880 --> 00:07:55,520
probability of occurring in
the original unconditional
122
00:07:55,520 --> 00:07:59,870
probability model, namely as
we discussed earlier, any
123
00:07:59,870 --> 00:08:04,560
particular three-head sequence
has a probability equal to
124
00:08:04,560 --> 00:08:06,530
this expression.
125
00:08:06,530 --> 00:08:09,150
So three-head sequences are
all equally likely.
126
00:08:09,150 --> 00:08:11,740
This means that the
unconditional probabilities of
127
00:08:11,740 --> 00:08:14,790
all the elements of
B are the same.
128
00:08:14,790 --> 00:08:18,300
When we construct conditional
probabilities given an event
129
00:08:18,300 --> 00:08:23,510
B, what happens is that the
ratio or the relative
130
00:08:23,510 --> 00:08:28,520
proportions of the probabilities
remain the same.
131
00:08:28,520 --> 00:08:31,640
So conditional probabilities
are proportional to
132
00:08:31,640 --> 00:08:34,070
unconditional probabilities.
133
00:08:34,070 --> 00:08:36,730
These elements of B were
equally likely in
134
00:08:36,730 --> 00:08:38,159
the original model.
135
00:08:38,159 --> 00:08:41,960
Therefore, they remain equally
likely in the conditional
136
00:08:41,960 --> 00:08:43,870
model as well.
137
00:08:43,870 --> 00:08:48,260
What this means is that the
conditional probability law on
138
00:08:48,260 --> 00:08:50,910
the set B is uniform.
139
00:08:50,910 --> 00:08:55,140
Given that B occurred, all the
possible outcomes now have the
140
00:08:55,140 --> 00:08:56,910
same probability.
141
00:08:56,910 --> 00:08:59,310
Since we have a uniform
probability law, this means
142
00:08:59,310 --> 00:09:01,560
that we can now answer
probability
143
00:09:01,560 --> 00:09:03,940
questions by just counting.
144
00:09:03,940 --> 00:09:06,850
We're interested in the
probability of a certain
145
00:09:06,850 --> 00:09:11,640
event, A, given that
B occurred.
146
00:09:11,640 --> 00:09:15,660
Now, given that B occurred, this
part of A cannot happen.
147
00:09:15,660 --> 00:09:18,900
So we're interested in the
probability of outcomes that
148
00:09:18,900 --> 00:09:23,240
belong in this shaded region,
those outcomes that belong
149
00:09:23,240 --> 00:09:28,830
within the set B. To find the
probability of this shaded
150
00:09:28,830 --> 00:09:33,080
region occurring, we just need
to count how many outcomes
151
00:09:33,080 --> 00:09:37,220
belong to the shaded region and
divide them by the number
152
00:09:37,220 --> 00:09:41,290
of outcomes that belong
to the set B.
153
00:09:41,290 --> 00:09:44,880
That is, we work inside this
conditional universe.
154
00:09:44,880 --> 00:09:47,230
All of the elements in this
conditional universe are
155
00:09:47,230 --> 00:09:48,700
equally likely.
156
00:09:48,700 --> 00:09:51,860
And therefore, we calculate
probabilities by counting.
157
00:09:51,860 --> 00:09:55,410
So the desired probability is
going to be the number of
158
00:09:55,410 --> 00:10:00,520
elements in the shaded region,
which is the intersection of A
159
00:10:00,520 --> 00:10:08,280
with B, divided by the number of
elements that belong to the
160
00:10:08,280 --> 00:10:13,000
set B.
161
00:10:13,000 --> 00:10:18,270
How many elements are there in
the intersection of A and B?
162
00:10:18,270 --> 00:10:22,590
These are the outcomes or
sequences of length 10, in
163
00:10:22,590 --> 00:10:24,980
which the first two tosses
were heads--
164
00:10:24,980 --> 00:10:26,940
no choice here.
165
00:10:26,940 --> 00:10:29,630
And there is one more head.
166
00:10:29,630 --> 00:10:33,350
That additional head can appear
in one out of eight
167
00:10:33,350 --> 00:10:34,830
possible places.
168
00:10:34,830 --> 00:10:38,240
So there's eight possible
sequences that have the
169
00:10:38,240 --> 00:10:40,710
desired property.
170
00:10:40,710 --> 00:10:44,010
How many elements are
there in the set B?
171
00:10:44,010 --> 00:10:49,270
How many three-head sequences
are there?
172
00:10:49,270 --> 00:10:54,800
Well, the number of three-head
sequences is the same as the
173
00:10:54,800 --> 00:10:58,920
number of ways that we can
choose three elements out of a
174
00:10:58,920 --> 00:11:01,450
set of cardinality 10.
175
00:11:01,450 --> 00:11:07,580
And this is 10 choose 3, as
we also discussed earlier.
176
00:11:07,580 --> 00:11:11,750
So this is the same answer as
we derived before with our
177
00:11:11,750 --> 00:11:13,100
first approach.
178
00:11:13,100 --> 00:11:17,650
So both approaches, of course,
give the same solution.
179
00:11:17,650 --> 00:11:21,310
This second approach is a little
easier, because we
180
00:11:21,310 --> 00:11:24,730
never had to involve any
p's in our calculation.
181
00:11:24,730 --> 00:11:27,210
We go to the answer directly.
182
00:11:27,210 --> 00:11:31,110
The reason that this approach
worked was that the
183
00:11:31,110 --> 00:11:35,970
conditional universe, the
event B, had a uniform
184
00:11:35,970 --> 00:11:37,300
probability law on it.