WEBVTT

00:00:00.040 --> 00:00:02.390
The following content is
provided under a Creative

00:00:02.390 --> 00:00:03.680
Commons License.

00:00:03.680 --> 00:00:06.640
Your support will help MIT
OpenCourseWare continue to

00:00:06.640 --> 00:00:09.980
offer high quality educational
resources for free.

00:00:09.980 --> 00:00:12.820
To make a donation or to view
additional materials from

00:00:12.820 --> 00:00:16.760
hundreds of MIT courses, visit
MIT OpenCourseWare at

00:00:16.760 --> 00:00:18.010
ocw.mit.edu.

00:00:22.260 --> 00:00:26.740
PROFESSOR: This is How to
Randomize II, and what--

00:00:26.740 --> 00:00:28.165
AUDIENCE: Are there slides?

00:00:31.020 --> 00:00:32.759
PROFESSOR: Sorry.

00:00:32.759 --> 00:00:37.590
What we're going to talk about,
it's just recap in case

00:00:37.590 --> 00:00:41.270
we missed anything this morning
about the different

00:00:41.270 --> 00:00:45.470
methods of introducing an
element of randomization into

00:00:45.470 --> 00:00:48.100
your project.

00:00:48.100 --> 00:00:52.010
Then I want to talk about the
unit of randomization, whether

00:00:52.010 --> 00:00:55.610
you randomize individuals,
or schools,

00:00:55.610 --> 00:00:59.060
or clinics, or districts.

00:00:59.060 --> 00:01:02.660
If you are very lucky and work
somewhere like Indonesia, Ben

00:01:02.660 --> 00:01:06.680
Olken gets to randomize on the
district level of many

00:01:06.680 --> 00:01:08.780
hundreds of thousands of
people per unit of

00:01:08.780 --> 00:01:12.300
randomization, you need a very
big country to do that.

00:01:12.300 --> 00:01:18.220
Multiple treatments, and we'll
go through an example of how

00:01:18.220 --> 00:01:22.400
you can design an evaluation
with different treatments to

00:01:22.400 --> 00:01:30.060
get at some really underlying
questions, big questions in

00:01:30.060 --> 00:01:35.120
the literature or in the
development field, rather than

00:01:35.120 --> 00:01:39.600
just does this program work,
but much more of the deep

00:01:39.600 --> 00:01:40.860
level questions.

00:01:40.860 --> 00:01:42.253
Then I want to talk about
stratification.

00:01:45.310 --> 00:01:48.610
And that's something where
actually the theory has

00:01:48.610 --> 00:01:54.370
developed a little bit more, and
as Cynthia can attest, it

00:01:54.370 --> 00:01:59.800
basically saved our project.

00:01:59.800 --> 00:02:03.610
In one case, we thought we just
didn't have enough sample

00:02:03.610 --> 00:02:06.840
to do this, but we had
stratified very carefully.

00:02:06.840 --> 00:02:09.800
And thank goodness we actually
managed to get a result out of

00:02:09.800 --> 00:02:10.539
that project.

00:02:10.539 --> 00:02:13.530
And it was only because we did
a good stratification that

00:02:13.530 --> 00:02:14.320
that was possible.

00:02:14.320 --> 00:02:17.660
So it's definitely worth
thinking about

00:02:17.660 --> 00:02:19.950
how to do it correctly.

00:02:19.950 --> 00:02:23.360
And then very briefly just talk
about the mechanics of

00:02:23.360 --> 00:02:24.220
randomization.

00:02:24.220 --> 00:02:29.500
But I think that's actually
best done in the groups.

00:02:29.500 --> 00:02:32.480
And we'll also be circulating
and we'll put up on the

00:02:32.480 --> 00:02:36.540
website some exercises.

00:02:36.540 --> 00:02:38.730
If you actually literally--

00:02:38.730 --> 00:02:41.290
I've learned all about
randomization, but how do I

00:02:41.290 --> 00:02:43.680
literally do it?

00:02:43.680 --> 00:02:45.940
And the answer for
us is normally--

00:02:45.940 --> 00:02:48.450
the answer with me is I
get an RA to do it.

00:02:48.450 --> 00:02:53.190
[LAUGHTER]

00:02:53.190 --> 00:02:57.480
You can write Stata code, but
you can also do it in Excel.

00:03:01.180 --> 00:03:06.030
So this should be a recap of
what you did this morning, but

00:03:06.030 --> 00:03:09.380
I just want to talk about--

00:03:09.380 --> 00:03:12.220
I like kind of putting
things in boxes and

00:03:12.220 --> 00:03:13.470
seeing pros and cons.

00:03:13.470 --> 00:03:18.240
The different kinds of ways of
introducing some element of

00:03:18.240 --> 00:03:22.910
randomization into your
project, to be able to

00:03:22.910 --> 00:03:30.230
evaluate it: basic lottery, just
some in, some out, some

00:03:30.230 --> 00:03:34.630
get the program, some
don't; a phase in.

00:03:34.630 --> 00:03:36.860
Can someone explain to
me what a randomized

00:03:36.860 --> 00:03:38.110
phase in design is?

00:03:40.848 --> 00:03:42.330
Hopefully you did
it this morning.

00:03:42.330 --> 00:03:47.728
Does anyone remember what a
randomized phase in design is?

00:03:47.728 --> 00:03:49.380
AUDIENCE: Is that the one
where everyone gets

00:03:49.380 --> 00:03:51.040
it, but over time?

00:03:51.040 --> 00:03:51.380
PROFESSOR: Yes.

00:03:51.380 --> 00:03:55.440
Everyone gets it in the
end, but you randomize

00:03:55.440 --> 00:03:56.790
when they get it.

00:03:56.790 --> 00:03:59.130
So some people get it the first
year, some people get it

00:03:59.130 --> 00:04:02.320
the second, and that's a very
natural way in which projects

00:04:02.320 --> 00:04:04.250
expand over time.

00:04:04.250 --> 00:04:09.570
And so you introduce your
element of randomization at

00:04:09.570 --> 00:04:11.570
that point and say, well,
who gets it the

00:04:11.570 --> 00:04:14.480
first year is random.
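
NOTE
Editorial sketch, not part of the lecture: a minimal Python illustration of
the randomized phase in just described, assuming nine hypothetical schools
rolled out in three yearly waves. The function name, seed, and school names
are invented for illustration; this is one possible way to do it, not the
speaker's own procedure.
```python
import random
def phase_in_schedule(units, n_waves, seed):
    """Randomly order the units, then deal them into waves; wave 1 starts first."""
    order = sorted(units)
    random.Random(seed).shuffle(order)  # fixed seed so the draw can be reproduced
    return {unit: 1 + i % n_waves for i, unit in enumerate(order)}
# Hypothetical units: nine schools, three waves.
schools = [f"school_{i}" for i in range(9)]
schedule = phase_in_schedule(schools, n_waves=3, seed=7)
def is_treated(unit, year):
    """A unit is treated from its start year onward."""
    return year >= schedule[unit]
# Count how many schools are still untreated (controls) in each year.
controls_by_year = {year: sum(not is_treated(s, year) for s in schools) for year in (1, 2, 3)}
```
Everyone is treated by the last year; the only random part is who starts
when, and the pool of untreated comparison schools shrinks each year.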

00:04:14.480 --> 00:04:19.310
Rotation, randomized rotation.

00:04:19.310 --> 00:04:20.660
Did Dean talk about that?

00:04:23.180 --> 00:04:24.920
AUDIENCE: The way I remember it
is it's almost like phase

00:04:24.920 --> 00:04:28.726
in, except for the service goes
away from some people

00:04:28.726 --> 00:04:29.130
after a certain point.

00:04:29.130 --> 00:04:29.940
PROFESSOR: Yeah, exactly.

00:04:29.940 --> 00:04:33.750
So with phase in, you're
building up over time till

00:04:33.750 --> 00:04:34.580
everyone gets it.

00:04:34.580 --> 00:04:37.180
With rotation, you get it this
year, but then you don't get

00:04:37.180 --> 00:04:39.530
it the next year.

00:04:39.530 --> 00:04:41.390
Encouragement, an encouragement
design.

00:04:46.650 --> 00:04:48.150
OK, yeah?

00:04:48.150 --> 00:04:51.012
AUDIENCE: Basically
that treatment--

00:04:51.012 --> 00:04:52.200
you use the same word.

00:04:52.200 --> 00:04:56.282
You're encouraging people to
apply for a program or to get

00:04:56.282 --> 00:04:59.060
the intervention and then
you're comparing all the

00:04:59.060 --> 00:05:01.990
people who have access to the
program or the intervention

00:05:01.990 --> 00:05:03.870
versus people who don't.

00:05:03.870 --> 00:05:06.950
PROFESSOR: You're comparing the
people who were encouraged

00:05:06.950 --> 00:05:09.860
to go to the program, where they
may not all actually get

00:05:09.860 --> 00:05:13.070
the program, but the ones who
were given this extra special

00:05:13.070 --> 00:05:15.150
encouragement or information
about the

00:05:15.150 --> 00:05:16.390
program, that's right.

00:05:16.390 --> 00:05:21.390
So let's just think about
when those are useful.

00:05:21.390 --> 00:05:24.140
How do you decide
which of those--

00:05:24.140 --> 00:05:27.320
what are the times when you
might want to use these?

00:05:27.320 --> 00:05:31.880
So basic lottery it's very
natural to do when a program

00:05:31.880 --> 00:05:33.250
is oversubscribed.

00:05:33.250 --> 00:05:35.130
So when you've got a training
course, and more people have

00:05:35.130 --> 00:05:40.610
applied for the training
course than

00:05:40.610 --> 00:05:41.930
you've got places for.

00:05:41.930 --> 00:05:44.130
Again, I'm sure Dean talked
about the fact that that

00:05:44.130 --> 00:05:45.990
doesn't mean you have to accept
everyone, whether

00:05:45.990 --> 00:05:47.420
they're qualified or not.

00:05:47.420 --> 00:05:49.720
You can throw out the people who
aren't qualified and then

00:05:49.720 --> 00:05:51.885
just randomize within the people
who are qualified.
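
NOTE
Editorial sketch, not part of the lecture: a minimal Python illustration of
a basic lottery for an oversubscribed program, where unqualified applicants
are screened out first and the draw is made only among the qualified. The
applicant names, the screening rule, the number of places, and the seed are
all hypothetical.
```python
import random
def basic_lottery(applicants, is_qualified, places, seed):
    """Screen out unqualified applicants, then draw `places` winners at random."""
    eligible = sorted(a for a in applicants if is_qualified(a))
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced and audited
    winners = set(rng.sample(eligible, min(places, len(eligible))))
    treatment = [a for a in eligible if a in winners]
    control = [a for a in eligible if a not in winners]
    return treatment, control
# Hypothetical data: ten applicants; odd-numbered ones fail the screen.
applicants = [f"applicant_{i}" for i in range(10)]
def passes_screen(name):
    return int(name.split("_")[1]) % 2 == 0
treatment, control = basic_lottery(applicants, passes_screen, places=3, seed=2009)
```
Unqualified applicants never enter the draw, so the comparison is between
qualified winners and qualified non-winners.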

00:05:55.510 --> 00:06:00.340
And that's OK when it's
politically acceptable for

00:06:00.340 --> 00:06:03.430
some people to get nothing.

00:06:03.430 --> 00:06:07.360
Sometimes that's OK, sometimes
it isn't OK.

00:06:07.360 --> 00:06:09.650
A phase in is a good
design when you're

00:06:09.650 --> 00:06:12.180
expanding over time.

00:06:12.180 --> 00:06:14.980
You don't have enough capacity
to train everyone the first

00:06:14.980 --> 00:06:18.200
year, or get all the programs up
and running the first year,

00:06:18.200 --> 00:06:22.810
so you've got to have some phase
in anyway, so why not

00:06:22.810 --> 00:06:25.340
randomize the phase in?

00:06:25.340 --> 00:06:30.940
And it's also useful when
politically you have to give

00:06:30.940 --> 00:06:34.210
something to everyone by the
end of the treatment.

00:06:34.210 --> 00:06:35.930
Maybe you are worried
that people won't

00:06:35.930 --> 00:06:37.460
cooperate with you.

00:06:37.460 --> 00:06:41.820
Maybe you just feel that unless
they're going to get

00:06:41.820 --> 00:06:45.160
something at the end, maybe
you feel that it's just

00:06:45.160 --> 00:06:47.790
inappropriate not to
treat everyone you

00:06:47.790 --> 00:06:50.080
possibly can by the end.

00:06:50.080 --> 00:06:52.100
Whatever your reason, if you
feel that you have to give

00:06:52.100 --> 00:06:54.680
everyone that you're contacting
something by the

00:06:54.680 --> 00:06:58.490
end, then it's a
good approach.

00:06:58.490 --> 00:07:02.840
Rotation, again, is useful
when you can't

00:07:02.840 --> 00:07:05.010
have a complete control.

00:07:05.010 --> 00:07:06.670
Just politically,
it's difficult.

00:07:06.670 --> 00:07:08.440
People won't cooperate
with you.

00:07:08.440 --> 00:07:11.100
In the Balsakhi example, they
were very nervous that the

00:07:11.100 --> 00:07:14.400
schools just weren't going to
let them come in and test the

00:07:14.400 --> 00:07:17.570
kids unless they were going to
get something out of it.

00:07:17.570 --> 00:07:21.630
So they had to give them all
something at some point, but

00:07:21.630 --> 00:07:24.300
you didn't have enough
resources to do

00:07:24.300 --> 00:07:26.380
every one by the end.

00:07:26.380 --> 00:07:28.830
You only had enough resources
to do half the people.

00:07:28.830 --> 00:07:30.640
So you can do half and
then switch, and

00:07:30.640 --> 00:07:33.260
then the other half.
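
NOTE
Editorial sketch, not part of the lecture: a minimal Python illustration of
a randomized rotation with two periods, assuming eight hypothetical schools
and resources to treat only half of them at a time. Names and seed are
invented for illustration.
```python
import random
def rotation_groups(units, seed):
    """Randomly split units in half: group 1 is treated in period 1, group 2 in period 2."""
    order = sorted(units)
    random.Random(seed).shuffle(order)
    half = len(order) // 2
    return {unit: (1 if i < half else 2) for i, unit in enumerate(order)}
# Hypothetical units: eight schools.
schools = [f"school_{i}" for i in range(8)]
groups = rotation_groups(schools, seed=11)
# A school is treated only in its own period, so each period keeps a control half.
treated_by_period = {p: [s for s in schools if groups[s] == p] for p in (1, 2)}
control_by_period = {p: [s for s in schools if groups[s] != p] for p in (1, 2)}
```
Every school gets the program in one of the two periods, and in each
period the other half serves as the comparison group.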

00:07:33.260 --> 00:07:39.340
Now an encouragement design is
very useful when you can't

00:07:39.340 --> 00:07:43.220
deny anyone access
to the program.

00:07:43.220 --> 00:07:48.630
So it's been used and discussed
in when you're

00:07:48.630 --> 00:07:57.410
setting up business or farmer
support centers that

00:07:57.410 --> 00:08:01.710
anyone can walk into and use,
you don't want to say, if

00:08:01.710 --> 00:08:03.730
someone walks in your door--
you're desperately trying to

00:08:03.730 --> 00:08:05.470
drum up custom for this--

00:08:05.470 --> 00:08:08.320
somebody walks in the door, you
don't want to say you're

00:08:08.320 --> 00:08:09.570
not on our list, go away.

00:08:12.520 --> 00:08:15.140
That doesn't make sense
for your program.

00:08:15.140 --> 00:08:17.390
Your program is trying
to attract people.

00:08:17.390 --> 00:08:21.040
But it might make sense to
spend some extra money to

00:08:21.040 --> 00:08:24.030
encourage some particular
people to come

00:08:24.030 --> 00:08:25.590
and visit your center.

00:08:30.680 --> 00:08:35.640
So it's very useful when
everyone is eligible for the

00:08:35.640 --> 00:08:39.669
program, but the take
up isn't very high.

00:08:39.669 --> 00:08:43.039
So you've got these centers,
anybody could walk in, but

00:08:43.039 --> 00:08:45.120
most people aren't walking in.

00:08:45.120 --> 00:08:48.440
If most people are walking in
and using the service anyway,

00:08:48.440 --> 00:08:49.730
you got a problem.

00:08:49.730 --> 00:08:51.790
Those are going to be very hard
to evaluate, because you

00:08:51.790 --> 00:08:54.700
haven't got any margin in
which to change things.

00:08:54.700 --> 00:08:58.600
But if take up is currently
low, but everyone is

00:08:58.600 --> 00:09:02.400
eligible, then that's
an opportunity to do

00:09:02.400 --> 00:09:03.710
encouragement design.

00:09:03.710 --> 00:09:08.055
It's also possible to do when
you've only got two--

00:09:08.055 --> 00:09:12.270
I was talking over lunch about
trying to evaluate some

00:09:12.270 --> 00:09:15.920
agriculture interventions where
they're setting up two

00:09:15.920 --> 00:09:17.760
rice mills in Sierra Leone.

00:09:17.760 --> 00:09:20.570
Two is just not enough
to randomize.

00:09:20.570 --> 00:09:21.920
You don't want to randomize
where you put

00:09:21.920 --> 00:09:23.670
the rice mill anyway.

00:09:23.670 --> 00:09:28.850
But you can talk about
encouraging some people or

00:09:28.850 --> 00:09:31.080
informing some people that
there's going to be a new

00:09:31.080 --> 00:09:35.660
place where they'll be able to
sell rice or get extension

00:09:35.660 --> 00:09:39.910
services associated with
the rice mill.

00:09:39.910 --> 00:09:42.800
So advantages.

00:09:42.800 --> 00:09:45.590
A basic lottery is very
familiar to people.

00:09:45.590 --> 00:09:48.430
It's easy to understand.

00:09:48.430 --> 00:09:50.390
It's very intuitive.

00:09:50.390 --> 00:09:52.590
You're pulling names
out of a hat.

00:09:52.590 --> 00:09:55.510
You've got an equal chance
of getting it.

00:09:55.510 --> 00:09:57.770
It's very easy to implement,
and you can

00:09:57.770 --> 00:09:58.990
implement it in public.

00:09:58.990 --> 00:10:02.550
Sometimes it's useful to be
able to show people that

00:10:02.550 --> 00:10:04.120
you're being fair.

00:10:04.120 --> 00:10:06.560
They see their names going into
the hat, and they see

00:10:06.560 --> 00:10:08.920
people pulling them
out of the hat.

00:10:08.920 --> 00:10:14.490
Sometimes that's important and
useful to be able to do that.

00:10:14.490 --> 00:10:17.480
Again, the phase in is
relatively easy to understand

00:10:17.480 --> 00:10:19.010
what's going on.

00:10:19.010 --> 00:10:21.700
We're rolling it out, and we're
giving everyone an equal

00:10:21.700 --> 00:10:23.180
chance of having it
in the first year.

00:10:23.180 --> 00:10:28.160
Don't worry, you'll get it later
on, but you'll have to

00:10:28.160 --> 00:10:29.410
wait a little bit.

00:10:35.586 --> 00:10:37.370
I understand what I'm
writing here.

00:10:37.370 --> 00:10:40.240
Control comply as expect
to benefit.

00:10:40.240 --> 00:10:43.750
Oh yes, so the control is going
to comply with you.

00:10:43.750 --> 00:10:46.460
They're going to take the
surveys because they know that

00:10:46.460 --> 00:10:48.060
they're going to get something
in the end.

00:10:48.060 --> 00:10:50.120
So they're willing to keep
talking to you for the three

00:10:50.120 --> 00:10:52.420
years because--

00:10:52.420 --> 00:11:01.200
So the rotation will give you
more data points than the

00:11:01.200 --> 00:11:05.580
phase in, because the problem
with the phase in is over

00:11:05.580 --> 00:11:08.290
time, you're running
out of controls.

00:11:08.290 --> 00:11:11.510
By the end, you don't have
any controls left.

00:11:11.510 --> 00:11:14.380
Whereas the rotation, you have
some controls the whole time,

00:11:14.380 --> 00:11:16.790
because some are phased out.

00:11:16.790 --> 00:11:21.090
Encouragement, as I say, you can
get away with a smaller

00:11:21.090 --> 00:11:22.630
sample size.

00:11:22.630 --> 00:11:25.020
You can do something even though
you've only got two

00:11:25.020 --> 00:11:26.840
rice mills or two business
centers in the

00:11:26.840 --> 00:11:28.860
whole of the country.

00:11:28.860 --> 00:11:31.660
And you can randomize at an
individual level, even when

00:11:31.660 --> 00:11:35.870
the program is at a
much bigger level.

00:11:35.870 --> 00:11:38.680
But we'll talk more about the
unit of randomization.

00:11:38.680 --> 00:11:40.930
You'll see what I mean by
that in a bit more.

00:11:40.930 --> 00:11:42.150
So the disadvantages--

00:11:42.150 --> 00:11:44.140
I probably should have
kept going along

00:11:44.140 --> 00:11:45.260
one line, but anyway--

00:11:45.260 --> 00:11:49.960
so a basic lottery is easy to
understand and implement.

00:11:49.960 --> 00:11:55.630
The disadvantage is that you've
got a real control, and

00:11:55.630 --> 00:11:57.970
the real control doesn't
have any incentive to

00:11:57.970 --> 00:11:59.020
cooperate with you.

00:11:59.020 --> 00:12:01.650
And sometimes that's a problem
and sometimes it isn't.

00:12:01.650 --> 00:12:04.920
I mean, a lot of where I work in
rural Sierra Leone, people

00:12:04.920 --> 00:12:06.480
are very happy to
answer surveys.

00:12:06.480 --> 00:12:08.930
They don't have very
much else to do.

00:12:08.930 --> 00:12:10.400
They like the attention.

00:12:10.400 --> 00:12:12.830
Oh, can you come and
survey me too?

00:12:12.830 --> 00:12:17.190
But if you're talking about
urban India, there's lots of

00:12:17.190 --> 00:12:18.320
other things they
should be doing.

00:12:18.320 --> 00:12:19.690
They've got to go get
to their job.

00:12:19.690 --> 00:12:22.630
You've got to time the survey
very carefully, otherwise

00:12:22.630 --> 00:12:25.370
they're really not going
to want to talk to you.

00:12:25.370 --> 00:12:29.620
So you've got to worry a bit
when you have a real control

00:12:29.620 --> 00:12:32.150
in some areas about differential
attrition.

00:12:32.150 --> 00:12:34.610
We'll talk about attrition
later on in the week.

00:12:34.610 --> 00:12:39.640
But if you have more people
unwilling to talk to you in

00:12:39.640 --> 00:12:41.270
control than you
have in treatment,

00:12:41.270 --> 00:12:42.520
you have a real problem.

00:12:44.740 --> 00:12:49.290
A phase in, again,
it's very--

00:12:49.290 --> 00:12:52.380
as I say, the advantage is
it's very natural to the way

00:12:52.380 --> 00:12:54.120
that a lot of organizations
work.

00:12:54.120 --> 00:12:57.080
They often expand over time.

00:12:57.080 --> 00:13:02.930
But there's a problem of
anticipation of the effects.

00:13:02.930 --> 00:13:05.430
So if they know they're going
to get it in two years, that

00:13:05.430 --> 00:13:07.480
may change what they're
doing now.

00:13:07.480 --> 00:13:07.690
Yeah?

00:13:07.690 --> 00:13:11.176
AUDIENCE: Can I ask a question
about control groups not being

00:13:11.176 --> 00:13:12.172
willing to participate?

00:13:12.172 --> 00:13:15.658
Are there examples when an
incentive has been used in

00:13:15.658 --> 00:13:19.144
order to get people to comply
with the survey?

00:13:19.144 --> 00:13:21.966
There's obviously optimal
ways to design that so

00:13:21.966 --> 00:13:23.640
you don't ruin the--

00:13:23.640 --> 00:13:29.710
PROFESSOR: So sometimes we give
small things, like didn't

00:13:29.710 --> 00:13:31.020
you use backpacks?

00:13:31.020 --> 00:13:33.620
Little backpacks for
kids in Balsakhi?

00:13:33.620 --> 00:13:35.160
GUEST SPEAKER: No, we used
those to actually get our

00:13:35.160 --> 00:13:36.344
surveyors to comply.

00:13:36.344 --> 00:13:40.780
[LAUGHTER]

00:13:40.780 --> 00:13:43.260
PROFESSOR: So sometimes people
give things out.

00:13:43.260 --> 00:13:44.975
Normally we don't do that.

00:13:47.560 --> 00:13:49.780
Our most recent problem
with it--

00:13:49.780 --> 00:13:53.660
and there's also ethical
constraints.

00:13:53.660 --> 00:13:55.860
So everything we do
has to go through

00:13:55.860 --> 00:13:57.780
human subjects clearance.

00:13:57.780 --> 00:14:02.170
And you have to justify if
you're going to give people an

00:14:02.170 --> 00:14:03.950
incentive to comply, and
is that going to

00:14:03.950 --> 00:14:07.050
change how they respond?

00:14:07.050 --> 00:14:11.080
The one case we've had problems
recently which really

00:14:11.080 --> 00:14:14.875
screwed us, as Eric can attest
to because he was doing all

00:14:14.875 --> 00:14:20.160
the analysis, was people
wouldn't comply to get their

00:14:20.160 --> 00:14:22.660
hemoglobin checked, which
required a pin

00:14:22.660 --> 00:14:26.230
prick taking blood.

00:14:26.230 --> 00:14:29.390
And then the people in the
treatment were more willing to

00:14:29.390 --> 00:14:33.140
give blood than the people in
the control, and that caused

00:14:33.140 --> 00:14:34.460
us a lot of problems.

00:14:34.460 --> 00:14:37.160
But I think probably human
subjects saying, we'll pay you

00:14:37.160 --> 00:14:41.820
to take your blood, we've
had trouble there.

00:14:44.400 --> 00:14:48.420
But I think if you're worrying
about time and things, and

00:14:48.420 --> 00:14:53.120
kind of snacks, if it takes a
long time, and you want people

00:14:53.120 --> 00:14:58.580
to come into centers to do
experiments and games.

00:14:58.580 --> 00:15:01.690
Sometimes people use games as
a way of getting an outcome

00:15:01.690 --> 00:15:03.720
measure, and that takes quite
a lot of time to get them to

00:15:03.720 --> 00:15:05.270
play all these different
games and see how

00:15:05.270 --> 00:15:06.320
they're playing them.

00:15:06.320 --> 00:15:10.420
And then providing food at the
testing is kind of a very

00:15:10.420 --> 00:15:13.360
natural, and I think nobody's
going to complain about that.

00:15:13.360 --> 00:15:16.000
And it helps make sure
that people come in.

00:15:16.000 --> 00:15:18.300
In the US, it's actually
very common to pay

00:15:18.300 --> 00:15:20.180
people to submit surveys.

00:15:20.180 --> 00:15:25.230
So you give them a voucher,
you'll get a voucher if you

00:15:25.230 --> 00:15:25.890
fill in the survey.

00:15:25.890 --> 00:15:28.760
I'm not used to doing surveys
in the US, but I know my

00:15:28.760 --> 00:15:33.750
colleagues who do it in the US
will pay to get people to send

00:15:33.750 --> 00:15:35.000
in surveys.

00:15:37.170 --> 00:15:40.835
So there's a bit of a cultural
thing about what's the

00:15:40.835 --> 00:15:44.460
appropriate way to do this.

00:15:44.460 --> 00:15:47.510
So as I say, with the phase in,
you have to worry about

00:15:47.510 --> 00:15:49.090
anticipation of effects.

00:15:49.090 --> 00:15:51.440
And again, this really depends
on what you're measuring,

00:15:51.440 --> 00:15:53.550
whether this is going
to be a problem.

00:15:53.550 --> 00:15:56.290
If you're looking at
accumulation of capital or

00:15:56.290 --> 00:16:01.180
savings, or buying a durable,
if you know you're going to

00:16:01.180 --> 00:16:03.380
get money next year, then it'll
affect whether you buy

00:16:03.380 --> 00:16:04.640
something this year.

00:16:04.640 --> 00:16:08.990
Whereas if it's going to
school, you're not going to

00:16:08.990 --> 00:16:11.980
wait till next year to go
to school probably.

00:16:11.980 --> 00:16:13.890
So you have to think about
it in the context

00:16:13.890 --> 00:16:16.080
of what you're doing.

00:16:16.080 --> 00:16:19.080
The other real problem with
a phase in is it's very

00:16:19.080 --> 00:16:20.575
difficult to get long
term effects.

00:16:24.660 --> 00:16:26.950
Why is it difficult to get
a long term effect

00:16:26.950 --> 00:16:29.458
in a phase in design?

00:16:29.458 --> 00:16:32.386
AUDIENCE: Because within a short
period of time, everyone

00:16:32.386 --> 00:16:34.582
has the treatment, and therefore
it's hard to tell

00:16:34.582 --> 00:16:36.790
the difference between control
and treatment phase.

00:16:36.790 --> 00:16:37.070
PROFESSOR: Right.

00:16:37.070 --> 00:16:39.230
Because if you're looking 10 years
out, you're looking at

00:16:39.230 --> 00:16:41.980
someone who's had it for nine
years versus 10 years.

00:16:44.750 --> 00:16:47.110
Whereas if you want the 10 year
effect of a project,

00:16:47.110 --> 00:16:50.040
you really want somebody to have
not got it for 10 years

00:16:50.040 --> 00:16:51.650
versus to have got
it for 10 years.

00:16:51.650 --> 00:16:54.980
So that can often be the
complete death knell to using

00:16:54.980 --> 00:16:56.060
a phase in.

00:16:56.060 --> 00:16:58.820
The one exception to that is if
you've got a school program

00:16:58.820 --> 00:17:02.390
or a kind of age cohort, then
you phase it in over time.

00:17:02.390 --> 00:17:04.440
Some people will just have missed
it because they

00:17:04.440 --> 00:17:05.790
will have moved on.

00:17:05.790 --> 00:17:10.950
So one of our longest horizon
projects is actually a phase

00:17:10.950 --> 00:17:13.579
in project, which is the
deworming, because they're

00:17:13.579 --> 00:17:17.369
managing to follow up the
cohorts who had left school by

00:17:17.369 --> 00:17:20.280
the time it reached
their school.

00:17:20.280 --> 00:17:22.790
So it's not always impossible,
but it's something you really

00:17:22.790 --> 00:17:25.660
have to think about.

00:17:25.660 --> 00:17:30.820
Encouragement design, this
you'll talk about more in the

00:17:30.820 --> 00:17:34.120
analysis session.

00:17:34.120 --> 00:17:37.040
You always have to think
about what's the

00:17:37.040 --> 00:17:39.840
question that I'm answering.

00:17:39.840 --> 00:17:42.470
And with an encouragement
design, you're answering the

00:17:42.470 --> 00:17:47.160
question, what's the effect of
the program on people who

00:17:47.160 --> 00:17:50.640
respond to the incentive?

00:17:50.640 --> 00:17:53.070
Because some people are
responding to the

00:17:53.070 --> 00:17:54.510
incentive to take up.

00:17:54.510 --> 00:17:57.030
Some people are already doing
it without the incentive.

00:17:57.030 --> 00:17:59.980
Some people won't do it even
with the incentive.

00:17:59.980 --> 00:18:02.330
When you're measuring the
impact, you're measuring the

00:18:02.330 --> 00:18:06.280
impact on the kind of person who
responds to the incentive,

00:18:06.280 --> 00:18:08.500
who's not your average person.

00:18:08.500 --> 00:18:11.480
Now maybe that's exactly
who you want to be

00:18:11.480 --> 00:18:12.930
measuring the effect on.

00:18:12.930 --> 00:18:16.000
Because if you're looking at a
program that might encourage

00:18:16.000 --> 00:18:20.160
people to do more, say you're
looking at savings and you've

00:18:20.160 --> 00:18:24.040
got an encouragement to come to
a meeting of a 401k plan,

00:18:24.040 --> 00:18:27.340
and that encourages them to take
up a 401k plan, that's

00:18:27.340 --> 00:18:30.810
kind of exactly the people
you're interested in, who's a

00:18:30.810 --> 00:18:35.570
marginal 401k participant.

00:18:35.570 --> 00:18:39.690
In other cases, you're more
interested in kind of the

00:18:39.690 --> 00:18:42.180
average effect of a program.

00:18:42.180 --> 00:18:45.190
So again, you have to
worry about that.

00:18:45.190 --> 00:18:46.760
You need a big enough
inducement to

00:18:46.760 --> 00:18:48.150
really change take up.

00:18:48.150 --> 00:18:50.720
If you change it a little bit,
you're not going to have

00:18:50.720 --> 00:18:53.000
enough statistical power
to measure the effect.

00:18:53.000 --> 00:18:56.620
We'll talk about statistical
power tomorrow.

00:18:56.620 --> 00:18:58.940
The other thing you have to
worry about is are you

00:18:58.940 --> 00:19:01.340
measuring the effect of taking
up the program, or are you

00:19:01.340 --> 00:19:05.060
measuring the effect of the
incentive to take it up?

00:19:05.060 --> 00:19:08.470
If you do a really big incentive
to take it up, that

00:19:08.470 --> 00:19:10.850
might have a direct effect.

00:19:10.850 --> 00:19:15.570
So there's no right answer as
to which is the best design.

00:19:15.570 --> 00:19:18.840
It completely depends on
what your project is.

00:19:18.840 --> 00:19:21.910
And hopefully this is what
you're learning this week, is

00:19:21.910 --> 00:19:24.680
which of these is suitable
to my particular

00:19:24.680 --> 00:19:25.605
problem and my situation.

00:19:25.605 --> 00:19:25.950
Yeah?

00:19:25.950 --> 00:19:29.265
AUDIENCE: Could you explain what
the control group would

00:19:29.265 --> 00:19:32.380
be in the encouragement, and
how would you gather

00:19:32.380 --> 00:19:35.020
information about the
control group?

00:19:35.020 --> 00:19:37.710
PROFESSOR: So the control group
in the encouragement is

00:19:37.710 --> 00:19:41.610
the people who weren't given the
encouragement to attend.

00:19:47.400 --> 00:19:52.160
There's an example on our
website of the 401k plan, and

00:19:52.160 --> 00:19:54.740
they're looking at what's
the impact on--

00:19:54.740 --> 00:19:59.010
they wanted to answer, if you
take up a 401k plan, does it

00:19:59.010 --> 00:20:01.590
just shift your saving from one
kind of asset to another

00:20:01.590 --> 00:20:02.170
kind of asset?

00:20:02.170 --> 00:20:06.270
Or does it actually totally
increase your savings?

00:20:06.270 --> 00:20:07.830
So they needed some variation.

00:20:07.830 --> 00:20:10.570
They weren't going to be able
to randomly persuade this

00:20:10.570 --> 00:20:13.040
company to have a 401k.

00:20:13.040 --> 00:20:15.700
Nobody is going to decide
whether to have a 401k plan in

00:20:15.700 --> 00:20:18.070
their company based on
the toss of the coin.

00:20:18.070 --> 00:20:19.720
It's too important
of a decision.

00:20:19.720 --> 00:20:22.510
However, you will find that a
lot of companies, a lot of

00:20:22.510 --> 00:20:23.160
universities--

00:20:23.160 --> 00:20:25.460
and this was done within
a university context--

00:20:25.460 --> 00:20:29.270
a lot of the eligible people
are not signed up.

00:20:29.270 --> 00:20:32.840
So they took the list of all
the people who are eligible

00:20:32.840 --> 00:20:36.260
who had not signed up and
randomly sent letters to some

00:20:36.260 --> 00:20:37.490
of them saying--

00:20:37.490 --> 00:20:40.310
I think maybe even a monetary
encouragement--

00:20:40.310 --> 00:20:44.650
to come to the meeting, where
they learned about the 401k plan.

00:20:44.650 --> 00:20:49.250
More of those people ended up
signing up for a 401k plan

00:20:49.250 --> 00:20:52.310
than the people who had
not received a letter.

00:20:52.310 --> 00:20:56.650
Some of the people who had not
received a letter did sign up.

00:20:56.650 --> 00:21:00.660
But fewer of them signed up
than the people who had

00:21:00.660 --> 00:21:03.900
received an encouragement to
attend the meeting and sign up

00:21:03.900 --> 00:21:05.340
for a 401k plan.

00:21:05.340 --> 00:21:07.920
So all you need is
a difference.

00:21:07.920 --> 00:21:11.230
More of the people in the
treatment group sign up, the

00:21:11.230 --> 00:21:13.400
control group are the people
who are not encouraged.

00:21:13.400 --> 00:21:15.250
And there were fewer of
them who signed up.

00:21:15.250 --> 00:21:16.670
As long as there's
a difference.

00:21:16.670 --> 00:21:21.470
In our microfinance example, we
have in our treatment areas

00:21:21.470 --> 00:21:27.270
where microfinance is offered,
but it's not actually like we

00:21:27.270 --> 00:21:30.760
can say, you are taking
microfinance, you are not.

00:21:30.760 --> 00:21:32.470
We can only offer it.

00:21:32.470 --> 00:21:36.170
It's available, and then people
have to sign up for it.

00:21:36.170 --> 00:21:39.590
We have some difference-- not
a huge difference, but some

00:21:39.590 --> 00:21:42.100
difference in the percentage
of people who were offered

00:21:42.100 --> 00:21:45.980
microfinance who take it up
versus those in areas where

00:21:45.980 --> 00:21:47.970
they were not offered it.

00:21:47.970 --> 00:21:51.050
So as long as there's some
difference there, you can

00:21:51.050 --> 00:21:53.870
statistically tease
out the effect.

00:21:53.870 --> 00:21:55.980
And it's random whether
you're offered.

00:21:55.980 --> 00:21:57.620
It's not random whether
you take it up.

00:21:57.620 --> 00:21:59.390
It's random whether
you're offered.

00:21:59.390 --> 00:22:01.650
And you'll learn in the analysis
session how you

00:22:01.650 --> 00:22:05.430
actually cope with the analysis
when not everyone

00:22:05.430 --> 00:22:07.540
takes it up, but some
people take it up.

00:22:07.540 --> 00:22:09.420
Yeah?

00:22:09.420 --> 00:22:12.790
AUDIENCE: How was that
a nontrivial finding?

00:22:12.790 --> 00:22:16.490
More people that you market
a 401k to will sign up?

00:22:16.490 --> 00:22:18.380
PROFESSOR: No, no, that's
not the finding.

00:22:18.380 --> 00:22:23.960
The finding is using the fact
that more people who are

00:22:23.960 --> 00:22:30.530
marketed to sign up, you can
then look at how their savings

00:22:30.530 --> 00:22:32.450
behavior changed.

00:22:32.450 --> 00:22:35.440
Did they just shift money out of
their other savings and put

00:22:35.440 --> 00:22:37.830
it in the 401k, or did it
actually lead to an increase

00:22:37.830 --> 00:22:39.060
in total savings?

00:22:39.060 --> 00:22:41.640
And that's kind of the
fundamental policy question

00:22:41.640 --> 00:22:46.940
about 401ks, does giving tax
preference to savings, does it

00:22:46.940 --> 00:22:49.920
increase total savings, or does it
just move your savings from

00:22:49.920 --> 00:22:51.500
one kind of instrument
to another?

00:22:54.010 --> 00:22:56.970
And you look on average at the
people who were offered, do

00:22:56.970 --> 00:23:00.460
they have more total savings
versus the people who were not

00:23:00.460 --> 00:23:02.740
encouraged to do 401ks?

00:23:02.740 --> 00:23:06.860
And then basically you adjust
for the number of people who

00:23:06.860 --> 00:23:08.384
actually took up.

00:23:08.384 --> 00:23:09.650
Another question?

00:23:09.650 --> 00:23:11.057
AUDIENCE: This subject has been
brought up a couple times

00:23:11.057 --> 00:23:13.498
so far, but I'm still
confused on.

00:23:13.498 --> 00:23:15.680
You say within the
disadvantages, there's the

00:23:15.680 --> 00:23:18.110
problem that you're going to
measure the impact of those

00:23:18.110 --> 00:23:19.530
who respond to the incentive.

00:23:19.530 --> 00:23:23.150
And this seems like a major
disadvantage, that it puts in

00:23:23.150 --> 00:23:25.256
a lot of selection bias, because
whoever is responding

00:23:25.256 --> 00:23:25.974
to the incentive.

00:23:25.974 --> 00:23:29.160
So what has been brought up so
far is that you then look at

00:23:29.160 --> 00:23:31.720
the intended treatment rather
than the treatment itself, but

00:23:31.720 --> 00:23:32.740
I still don't understand--.

00:23:32.740 --> 00:23:35.470
PROFESSOR: OK, so you're going
to do intent to treat versus

00:23:35.470 --> 00:23:40.340
treatment on the treated
on Friday.

00:23:40.340 --> 00:23:42.880
So the actual mechanics
of how you do it,

00:23:42.880 --> 00:23:44.250
we're putting it off.

00:23:44.250 --> 00:23:46.740
You're not meant to be able to
do it yet, because you have a

00:23:46.740 --> 00:23:48.810
whole hour and a half on that.

00:23:48.810 --> 00:23:51.040
AUDIENCE: But if you then
do that, then does

00:23:51.040 --> 00:23:52.380
that selection disappear?

00:23:52.380 --> 00:23:53.630
PROFESSOR: No.

00:23:58.070 --> 00:24:00.460
So you said you're worried
about selection bias, the

00:24:00.460 --> 00:24:03.550
people who are going
to show up.

00:24:03.550 --> 00:24:07.260
It's not that we measure the
outcomes of those who sign up

00:24:07.260 --> 00:24:08.710
versus all the control.

00:24:08.710 --> 00:24:12.125
We measure, on average, the
effect of all the people who

00:24:12.125 --> 00:24:16.230
are offered the treatment versus
the average of all the

00:24:16.230 --> 00:24:18.970
people who were not offered.

00:24:18.970 --> 00:24:23.020
So on average treatment versus
control where treatment is

00:24:23.020 --> 00:24:25.360
being offered, not taking up.

00:24:25.360 --> 00:24:26.765
So we have no selection bias.

00:24:30.320 --> 00:24:36.900
If we see a change, we assume
that all that change comes

00:24:36.900 --> 00:24:39.090
from the few people who
actually changed their

00:24:39.090 --> 00:24:41.010
behavior as a result
of the incentive.

00:24:41.010 --> 00:24:46.470
So say half the people take up
and we see a change of 2%.

00:24:49.920 --> 00:24:53.210
If all the 2% is coming from
just half the people changing

00:24:53.210 --> 00:24:56.630
their behavior, then we assume
that the change in behavior

00:24:56.630 --> 00:24:58.380
there was 4%.

00:24:58.380 --> 00:25:02.060
Because it's coming from half
the sample, and averaged over

00:25:02.060 --> 00:25:04.900
everyone it's 2%, so if it's
only coming from half, they

00:25:04.900 --> 00:25:06.150
must have changed by 4%.

00:25:08.750 --> 00:25:12.690
So that's what you
do, very simply.
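
NOTE
A minimal sketch of the arithmetic just described, assuming the whole change comes from the people who took up the offer. The function names and the numbers passed in are hypothetical, chosen to mirror the 2% and 4% in the example. The mechanics are covered properly in the analysis session.
```python
def intention_to_treat(offered_mean, not_offered_mean):
    """Average outcome of everyone offered minus average of everyone not offered."""
    return offered_mean - not_offered_mean
def effect_on_takers(itt, takeup_offered, takeup_not_offered=0.0):
    """Scale the offered-vs-not-offered gap by the difference in take-up rates."""
    takeup_gap = takeup_offered - takeup_not_offered
    if takeup_gap <= 0:
        raise ValueError("take-up must be higher in the offered group")
    return itt / takeup_gap
# Hypothetical averages: a 2-point gap between the offered and not-offered groups.
itt = intention_to_treat(offered_mean=12.0, not_offered_mean=10.0)
# Half of the offered group took up, nobody in the comparison group did.
print(effect_on_takers(itt, takeup_offered=0.5))  # 4.0
```
The comparison is always between the full offered group and the full not-offered group. The division only rescales that gap and never compares takers with non-takers directly.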

00:25:12.690 --> 00:25:14.780
It's not a selection bias
because we're taking the

00:25:14.780 --> 00:25:19.770
averages of two completely
equivalent groups.

00:25:19.770 --> 00:25:23.240
But we are taking it
from the change in

00:25:23.240 --> 00:25:25.850
behavior of certain people.

00:25:25.850 --> 00:25:29.150
And so what we are measuring is
how the program changed the

00:25:29.150 --> 00:25:31.370
behavior of those
certain people.

00:25:31.370 --> 00:25:35.590
So it's not selection
bias, it's just

00:25:35.590 --> 00:25:36.850
what are you measuring.

00:25:36.850 --> 00:25:38.610
Who's changing?

00:25:38.610 --> 00:25:39.950
Who are you looking at?

00:25:39.950 --> 00:25:42.460
It's just like saying if you
do the project in the

00:25:42.460 --> 00:25:45.120
mountains, you're getting the
impact of doing the project in

00:25:45.120 --> 00:25:50.440
the mountains, whereas it may
not tell you about what's the

00:25:50.440 --> 00:25:52.070
effect of doing it
on the plains.

00:25:52.070 --> 00:25:55.790
AUDIENCE: It's the external
validity, right?

00:25:55.790 --> 00:25:57.220
PROFESSOR: Yeah, it's
external validity.

00:25:57.220 --> 00:25:59.870
But it really depends
on the question.

00:25:59.870 --> 00:26:01.530
You can't say that's
right or wrong.

00:26:01.530 --> 00:26:04.030
Because if your question is what
happens to people who are

00:26:04.030 --> 00:26:06.430
in the mountains, then that's
the right answer.

00:26:06.430 --> 00:26:08.420
If you want to know what happens
to people in the

00:26:08.420 --> 00:26:12.425
plains, then you have to think
about does this make sense?

00:26:16.340 --> 00:26:19.370
In this case it's changing the
effect on the marginal person

00:26:19.370 --> 00:26:21.100
who responds to incentive.

00:26:21.100 --> 00:26:25.270
If that's the very poor who
respond to the incentive, then

00:26:25.270 --> 00:26:29.510
you know the effect of doing the
program on the very poor.

00:26:29.510 --> 00:26:31.800
And maybe that's what
you want to know.

00:26:31.800 --> 00:26:36.170
But as long as you know what
the question is that you're

00:26:36.170 --> 00:26:38.130
answering, I think
that's important.

00:26:38.130 --> 00:26:41.160
Then you can think about
whether it makes sense,

00:26:41.160 --> 00:26:46.120
whether you've got an external
validity question or whether--

00:26:46.120 --> 00:26:50.420
you care about the poor
anyway, so I'm happy.

00:26:50.420 --> 00:26:54.250
So I've taken half an hour
on one slide, so I should

00:26:54.250 --> 00:26:58.090
probably speed up.

00:26:58.090 --> 00:26:59.340
So unit of randomization.

00:27:05.600 --> 00:27:09.320
We just went over an awful lot of
material in that slide, so

00:27:09.320 --> 00:27:10.520
it's quite fundamental.

00:27:10.520 --> 00:27:13.780
So I'm glad you asked
questions about it.

00:27:13.780 --> 00:27:17.570
So the unit of randomization is
are we going to randomize

00:27:17.570 --> 00:27:19.990
individuals to get the program,
or are we going to

00:27:19.990 --> 00:27:24.400
randomize communities to get the
program, or a group level?

00:27:24.400 --> 00:27:26.110
It could be a school,
a community, a

00:27:26.110 --> 00:27:28.390
health center, a district.

00:27:28.390 --> 00:27:31.720
But it's a clump of people.
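
NOTE
A rough sketch of the two choices, individual-level versus group-level assignment. The roster, the school names, and both function names are made up for illustration. A real design would also need to consider sample size and stratification.
```python
import random
def randomize_individuals(people, seed=0):
    """Assign each person on their own, ignoring which group they belong to."""
    rng = random.Random(seed)
    return {person: rng.choice(["treatment", "control"]) for person, _ in people}
def randomize_groups(people, seed=0):
    """Assign whole groups, so everyone in a group shares one status."""
    rng = random.Random(seed)
    groups = sorted({group for _, group in people})
    rng.shuffle(groups)
    treated = set(groups[: len(groups) // 2])
    return {person: ("treatment" if group in treated else "control") for person, group in people}
# Hypothetical roster of (person, school) pairs.
roster = [("p1", "school_a"), ("p2", "school_a"), ("p3", "school_b"), ("p4", "school_b")]
by_person = randomize_individuals(roster)
by_school = randomize_groups(roster)
print(by_school["p1"] == by_school["p2"])  # classmates always share a status
```
With group-level assignment, two children in the same school can never end up on opposite sides of the comparison. That is the property that matters when sharing or logistics rule out individual assignment.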

00:27:31.720 --> 00:27:35.000
So how do we make a decision
about what unit

00:27:35.000 --> 00:27:36.250
to randomize at?

00:27:40.060 --> 00:27:45.580
So if we do it at an individual
level, you get the

00:27:45.580 --> 00:27:49.820
program, you don't get the
program, you get the program,

00:27:49.820 --> 00:27:54.070
we can do a really nice,
detailed evaluation at a

00:27:54.070 --> 00:27:56.760
relatively small cost.

00:27:56.760 --> 00:27:59.280
That's the benefit.

00:27:59.280 --> 00:28:04.420
But it may be politically
difficult to do that.

00:28:04.420 --> 00:28:08.200
To have different treatments
within one community,

00:28:08.200 --> 00:28:09.670
particularly if you're
thinking--

00:28:09.670 --> 00:28:11.900
imagine if you were in
a school context, and

00:28:11.900 --> 00:28:12.910
you've got a class.

00:28:12.910 --> 00:28:17.390
And you say, well,
you get a lunch.

00:28:17.390 --> 00:28:19.752
We're providing lunch to
you, but I'm sorry, you

00:28:19.752 --> 00:28:21.900
don't get any lunch.

00:28:21.900 --> 00:28:25.800
It's just very hard to do that,
and often inappropriate

00:28:25.800 --> 00:28:26.380
to do that.

00:28:26.380 --> 00:28:29.760
And what's more, it often
doesn't work, because most

00:28:29.760 --> 00:28:33.270
kids, when they're given
something and their neighbor

00:28:33.270 --> 00:28:35.956
isn't, they all share
with their neighbor.

00:28:35.956 --> 00:28:38.270
At least most kids in developing
countries, maybe

00:28:38.270 --> 00:28:42.120
not my kids.

00:28:42.120 --> 00:28:44.690
And then you've just totally
screwed up your evaluation.

00:28:44.690 --> 00:28:46.850
Because if they're sharing with
their neighbor, who's the

00:28:46.850 --> 00:28:50.840
control, you don't know what the
effect of having lunch is,

00:28:50.840 --> 00:28:54.960
because actually there isn't any
difference between them.

00:28:54.960 --> 00:28:58.495
So that's not going to work.
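
NOTE
A toy illustration of why sharing ruins the comparison. It assumes, purely for illustration, that each treated child passes a fixed share of the lunch to one control neighbor and that the benefit is proportional to the amount eaten. The function name and the numbers are hypothetical.
```python
def measured_difference(true_effect, share_given_away):
    """Observed treatment-control gap when treated children share with control neighbors.
    Assumes the benefit is proportional to the amount of lunch actually eaten."""
    treated_dose = 1.0 - share_given_away
    control_dose = share_given_away
    return true_effect * (treated_dose - control_dose)
print(measured_difference(10.0, 0.0))  # no sharing: the full effect shows up
print(measured_difference(10.0, 0.5))  # equal sharing: the gap disappears
```
Under these assumptions, an even split leaves no measurable difference between treatment and control, even though the lunch itself may have a real effect.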

00:29:01.010 --> 00:29:04.830
So sometimes a program can
only be implemented at a

00:29:04.830 --> 00:29:06.110
certain level.

00:29:06.110 --> 00:29:09.790
There's just kind of logistical
things which mean

00:29:09.790 --> 00:29:13.135
we're setting up a center
in a community.

00:29:16.030 --> 00:29:18.570
We don't set it up for
individuals, we set up one in

00:29:18.570 --> 00:29:19.200
the community.

00:29:19.200 --> 00:29:21.780
So we either do it
or we don't.

00:29:21.780 --> 00:29:27.380
So that often gives you the
answer right there as to

00:29:27.380 --> 00:29:30.350
what's the unit that you
can randomize at.

00:29:30.350 --> 00:29:34.680
Spillovers is exactly what we
were talking about in terms of

00:29:34.680 --> 00:29:36.130
sharing the food.

00:29:36.130 --> 00:29:37.980
Sitting next to someone
who gets the

00:29:37.980 --> 00:29:40.540
treatment may impact you.

00:29:40.540 --> 00:29:43.450
And if it does, you have to take
that into account when

00:29:43.450 --> 00:29:48.960
you design what unit that
you're going to do the

00:29:48.960 --> 00:29:51.670
evaluation at.

00:29:51.670 --> 00:29:55.210
So as they say, encouragement
is this kind of weird thing

00:29:55.210 --> 00:29:58.130
that's halfway between the two,
in the sense that the

00:29:58.130 --> 00:30:01.950
program may be implemented at
a village or district level,

00:30:01.950 --> 00:30:06.140
but you can randomize
encouragement to take it up at

00:30:06.140 --> 00:30:07.200
an individual level.

00:30:07.200 --> 00:30:10.550
So sometimes that's a nice
way out of that.

00:30:13.960 --> 00:30:19.470
So multiple treatments is something
people sometimes list as kind

00:30:19.470 --> 00:30:23.110
of an alternative method
of randomization.

00:30:23.110 --> 00:30:32.860
But really, any of
these different approaches

00:30:32.860 --> 00:30:34.320
could be done with multiple

00:30:34.320 --> 00:30:36.270
treatments or with one treatment.

00:30:42.610 --> 00:30:46.200
Now I'm going to take a little
time to kind of work through a

00:30:46.200 --> 00:30:50.060
couple of different examples
that are all

00:30:50.060 --> 00:30:51.490
based around schools.

00:30:51.490 --> 00:30:52.370
I don't know why.

00:30:52.370 --> 00:30:54.280
My last lecture was all
based around schools.

00:30:54.280 --> 00:31:00.170
I guess you had Dean talking
about different examples in

00:31:00.170 --> 00:31:01.420
the last lecture.

00:31:03.690 --> 00:31:07.480
So going back to the Balsakhi
case and thinking about the

00:31:07.480 --> 00:31:09.820
problems that came out,
and the issues around

00:31:09.820 --> 00:31:11.500
the Balsakhi case.

00:31:11.500 --> 00:31:14.220
We're going to look at two
different approaches to

00:31:14.220 --> 00:31:19.210
answering some of those
questions, and sort of discuss

00:31:19.210 --> 00:31:20.770
what are the pros and cons
of the different

00:31:20.770 --> 00:31:22.440
ways of doing it.

00:31:22.440 --> 00:31:26.920
So the fundamental issues, if
we think about the needs

00:31:26.920 --> 00:31:32.200
assessment around Balsakhi, what
were the problems in the

00:31:32.200 --> 00:31:33.700
needs assessment?

00:31:33.700 --> 00:31:36.550
You had very large
class sizes.

00:31:36.550 --> 00:31:39.610
You had children at different
levels of learning.

00:31:39.610 --> 00:31:42.800
You had teachers who
were often absent.

00:31:42.800 --> 00:31:45.870
And you had curricula that were
inappropriate for many of

00:31:45.870 --> 00:31:50.410
the kids, particularly the
most marginalized kids.

00:31:50.410 --> 00:31:54.260
This was urban India, pretty
similar problems in many

00:31:54.260 --> 00:31:57.500
developing countries.

00:31:57.500 --> 00:32:00.620
Most places I've worked
have that problem,

00:32:00.620 --> 00:32:02.690
all of those problems.

00:32:02.690 --> 00:32:06.780
So what are the different
interventions that we could do

00:32:06.780 --> 00:32:09.040
to address those issues?

00:32:09.040 --> 00:32:11.300
We could have more teachers,
and that would allow us to

00:32:11.300 --> 00:32:14.580
split the classes into
smaller classes.

00:32:14.580 --> 00:32:16.520
If we're worrying about children
being at different

00:32:16.520 --> 00:32:20.310
levels of learning, we could
stream pupils, i.e.

00:32:20.310 --> 00:32:23.570
divide the classes such that
you have people who are of

00:32:23.570 --> 00:32:26.475
more similar ability
in each class.

00:32:29.380 --> 00:32:32.260
How do we cope with teachers
often being absent?

00:32:32.260 --> 00:32:34.120
You could make teachers
more accountable, and

00:32:34.120 --> 00:32:36.560
they may show up more.

00:32:36.560 --> 00:32:38.470
How do we cope with the
curricula being often

00:32:38.470 --> 00:32:41.980
inappropriate for the most
marginalized children?

00:32:41.980 --> 00:32:46.050
Well, you might want to change
the curricula and make it more

00:32:46.050 --> 00:32:51.330
basic or more focused on where
the children actually are at.

00:32:51.330 --> 00:32:57.750
All too often the curricula
appear to be set according to

00:32:57.750 --> 00:33:02.650
what the kids of the Minister
of Education, their level

00:33:02.650 --> 00:33:06.260
rather than what's the
appropriate level for kids in

00:33:06.260 --> 00:33:08.440
poor schools.

00:33:08.440 --> 00:33:14.120
So how did the Balsakhi study
approach those questions and

00:33:14.120 --> 00:33:17.660
try and answer those
questions?

00:33:17.660 --> 00:33:20.050
The Balsakhi was limited
in the fact that

00:33:20.050 --> 00:33:23.150
they had one treatment.

00:33:23.150 --> 00:33:26.760
So it's going to be a little
more complicated to tease out

00:33:26.760 --> 00:33:29.000
all of those different
questions, because this is

00:33:29.000 --> 00:33:32.630
going back to the question you
discussed before, which is

00:33:32.630 --> 00:33:34.750
we've got a package
of interventions.

00:33:34.750 --> 00:33:38.490
They had a package which was
the Balsakhi program.

00:33:38.490 --> 00:33:40.540
And they managed to get
an awful lot of

00:33:40.540 --> 00:33:41.820
information out of that.

00:33:41.820 --> 00:33:44.120
And then we'll look at an
alternative that looks at

00:33:44.120 --> 00:33:48.140
multiple treatments and is able
to get at these questions

00:33:48.140 --> 00:33:50.510
in a somewhat more
precise way.

00:33:50.510 --> 00:33:53.430
So in the Balsakhi study you've
got this package, which

00:33:53.430 --> 00:33:57.200
is that each school in the
treatment got a Balsakhi, a

00:33:57.200 --> 00:34:00.820
tutor in grades three or four.

00:34:00.820 --> 00:34:05.240
The lowest achieving children in
the class were sent to the

00:34:05.240 --> 00:34:08.440
Balsakhi for half the day, and
all the children at the end

00:34:08.440 --> 00:34:09.170
were given a test.

00:34:09.170 --> 00:34:10.739
So that was the design
of the project.

00:34:17.159 --> 00:34:20.290
Now we're going to go through
the questions that we want to

00:34:20.290 --> 00:34:21.000
try and answer.

00:34:21.000 --> 00:34:25.024
The first question is do smaller
class sizes improve

00:34:25.024 --> 00:34:27.460
test scores?

00:34:27.460 --> 00:34:32.960
So as you went through in the
case, even though it was one

00:34:32.960 --> 00:34:34.540
project and it was designed--

00:34:34.540 --> 00:34:37.690
one study and it was designed
to test the effectiveness of

00:34:37.690 --> 00:34:40.929
the Balsakhi program, they
actually were able to answer

00:34:40.929 --> 00:34:46.570
this question to some extent by
saying, the kids who were

00:34:46.570 --> 00:34:50.780
at a high level at the beginning
didn't get these

00:34:50.780 --> 00:34:51.860
other elements of Balsakhi.

00:34:51.860 --> 00:34:55.639
All they got was that the low
ability kids were moved out of

00:34:55.639 --> 00:34:57.570
their class.

00:34:57.570 --> 00:35:01.170
So they actually had smaller
class sizes as a result for

00:35:01.170 --> 00:35:02.860
half of the day.

00:35:02.860 --> 00:35:05.990
So that gives us a chance
to look at do

00:35:05.990 --> 00:35:07.970
lower class sizes help?

00:35:07.970 --> 00:35:11.250
So you just compare the high
achieving pupils in treatment

00:35:11.250 --> 00:35:16.010
and control, some of those in
the treatment classes had

00:35:16.010 --> 00:35:17.740
smaller class sizes
for half the day;

00:35:17.740 --> 00:35:21.000
those without did not.

00:35:21.000 --> 00:35:24.465
Now another question that we had
on our list is does having

00:35:24.465 --> 00:35:26.670
an accountable teacher
get better results?

00:35:26.670 --> 00:35:28.410
So we're worried about teachers
not showing up.

00:35:28.410 --> 00:35:31.190
If you make the teacher more
accountable, do they show up

00:35:31.190 --> 00:35:32.685
more often, and do you
get better results?

00:35:39.340 --> 00:35:42.490
Now in this case, the Balsakhi
is more accountable than the

00:35:42.490 --> 00:35:46.070
regular teacher, because the
Balsakhi is hired by an NGO.

00:35:46.070 --> 00:35:49.260
They can be fired if they're
not doing their job.

00:35:49.260 --> 00:35:56.440
But the other teacher is a
government teacher, and as we

00:35:56.440 --> 00:35:59.650
know, government teachers are
very rarely fired, whether in

00:35:59.650 --> 00:36:01.970
developing countries or
developed countries, whether

00:36:01.970 --> 00:36:05.150
they're doing a good
job or not.

00:36:05.150 --> 00:36:09.930
So we could look at the
treatment effect for low

00:36:09.930 --> 00:36:10.920
versus high children.

00:36:10.920 --> 00:36:14.370
What do I mean by the
treatment effect?

00:36:14.370 --> 00:36:16.760
What do I mean by that,
comparing the treatment effect

00:36:16.760 --> 00:36:18.570
for low versus higher
achieving children?

00:36:21.684 --> 00:36:25.960
AUDIENCE: Maybe the result
of the test scores.

00:36:25.960 --> 00:36:29.850
PROFESSOR: Yeah, so we're going
to use the test scores.

00:36:29.850 --> 00:36:33.350
But the treatment effect is
whose test scores am I

00:36:33.350 --> 00:36:36.000
comparing to get the
treatment effect

00:36:36.000 --> 00:36:37.490
for low scoring children?

00:36:42.760 --> 00:36:44.083
AUDIENCE: The one with the
government teacher versus the

00:36:44.083 --> 00:36:46.760
one with Balsakhi teacher?

00:36:46.760 --> 00:36:51.170
PROFESSOR: So all the
low scoring people

00:36:51.170 --> 00:36:53.160
are going to be--

00:36:53.160 --> 00:36:56.730
if they're in treatment, they'll
get the Balsakhi.

00:36:56.730 --> 00:36:57.880
Right?

00:36:57.880 --> 00:36:59.830
Yes, so it's--

00:36:59.830 --> 00:37:00.960
I see what you mean.

00:37:00.960 --> 00:37:01.480
You're right.

00:37:01.480 --> 00:37:07.640
So what we're saying is what's
the effect of the treatment

00:37:07.640 --> 00:37:08.770
for low scoring?

00:37:08.770 --> 00:37:17.380
That means compare the test
score improvement for the low

00:37:17.380 --> 00:37:20.135
performing kids in treatment
with the low performing kids

00:37:20.135 --> 00:37:20.790
in control.

00:37:20.790 --> 00:37:21.880
That's the treatment effect.

00:37:21.880 --> 00:37:23.640
What was the effect
of the treatment?

00:37:23.640 --> 00:37:27.000
Compare treatment and control
for low scoring kids.

00:37:27.000 --> 00:37:31.000
The difference is the
treatment effect.

00:37:31.000 --> 00:37:33.690
So what's the difference between
treatment and control

00:37:33.690 --> 00:37:41.450
for low versus the treatment
effect for high scoring kids?

00:37:41.450 --> 00:37:52.830
So basically all of the kids
in the treatment group got

00:37:52.830 --> 00:37:54.580
smaller class sizes.

00:37:54.580 --> 00:37:59.910
Only the initially low scoring
kids got the Balsakhi, got the

00:37:59.910 --> 00:38:02.660
accountable teacher.

00:38:02.660 --> 00:38:06.720
So if you look at the high
scoring kids, you get just the

00:38:06.720 --> 00:38:08.910
effect of class size.

00:38:08.910 --> 00:38:12.910
If you look at just the low
scoring kids, you get class

00:38:12.910 --> 00:38:16.600
size, lower class size, and
accountable teacher.

00:38:16.600 --> 00:38:19.540
You also get a different
curriculum.

00:38:19.540 --> 00:38:21.940
So you've got three changes.

00:38:21.940 --> 00:38:24.270
One of the changes we've taken
care of, because we've looked

00:38:24.270 --> 00:38:25.700
at that on its own.

00:38:25.700 --> 00:38:28.740
We've got smaller class sizes on
its own by looking at just

00:38:28.740 --> 00:38:30.190
high scoring kids.

00:38:30.190 --> 00:38:33.400
So the two left are changing
curricula and

00:38:33.400 --> 00:38:36.430
changing kind of teacher.

00:38:36.430 --> 00:38:39.830
And that's the difference
between the treatment effect

00:38:39.830 --> 00:38:42.510
for low and the treatment effect
for high scoring kids.

00:38:45.760 --> 00:38:48.910
So as I say, you've got two
things going on for the low

00:38:48.910 --> 00:38:49.750
scoring kids.

00:38:49.750 --> 00:38:52.050
You've got three things going
on, but we've controlled for

00:38:52.050 --> 00:38:53.350
one of them.

00:38:53.350 --> 00:38:55.850
The two things that are
different about the low

00:38:55.850 --> 00:38:58.480
scoring kids is they get a
different kind of teacher and

00:38:58.480 --> 00:39:00.640
they get a different
kind of curricula.

00:39:00.640 --> 00:39:02.960
So it's going to be really hard
to tease out those two

00:39:02.960 --> 00:39:04.610
things from each other.
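
NOTE
A small sketch of the subtraction described here, using invented test-score gains rather than figures from the study. It assumes the class-size effect is the same for both groups of children and that gains are measured on a comparable scale, which is a strong assumption.
```python
def treatment_effect(treatment_mean, control_mean):
    """Average gain in treatment schools minus average gain in control schools."""
    return treatment_mean - control_mean
# Hypothetical average gains, not results from the Balsakhi evaluation.
high_effect = treatment_effect(treatment_mean=20.0, control_mean=20.0)
low_effect = treatment_effect(treatment_mean=18.0, control_mean=10.0)
# High scoring children only experienced the smaller class.
class_size_effect = high_effect
# Low scoring children also got a different teacher and a different curriculum.
teacher_plus_curriculum = low_effect - high_effect
print(class_size_effect, teacher_plus_curriculum)  # 0.0 8.0
```
The second number is a single combined quantity. With one treatment arm, nothing in this calculation separates the teacher from the curriculum.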

00:39:08.140 --> 00:39:12.610
So does streaming improve
test scores?

00:39:12.610 --> 00:39:17.610
We can look at what happens to
the high scoring kids, because

00:39:17.610 --> 00:39:18.880
they don't have--

00:39:18.880 --> 00:39:20.760
we don't have to worry about the
fact that they're changing

00:39:20.760 --> 00:39:22.290
the kind of teacher, because
they've still got the

00:39:22.290 --> 00:39:22.760
government teacher.

00:39:22.760 --> 00:39:26.070
All they've got is the low
scoring kids taken out of

00:39:26.070 --> 00:39:28.500
their class.

00:39:28.500 --> 00:39:30.840
So we could look at the
high scoring kids.

00:39:30.840 --> 00:39:32.760
Now they found nothing.

00:39:32.760 --> 00:39:34.720
And that makes it easier to
interpret, because if you

00:39:34.720 --> 00:39:38.840
don't find anything, and there's
more streaming, they

00:39:38.840 --> 00:39:41.000
haven't got the low scoring
kids in their class, and

00:39:41.000 --> 00:39:44.290
they've got smaller
class sizes.

00:39:44.290 --> 00:39:47.380
So they didn't find anything.

00:39:47.380 --> 00:39:52.640
So they said, well, smaller
class sizes didn't help and

00:39:52.640 --> 00:39:53.750
streaming didn't help.

00:39:53.750 --> 00:39:56.150
But if they'd found a small
effect, they wouldn't have

00:39:56.150 --> 00:39:58.610
known whether it was
this or this.

00:39:58.610 --> 00:40:01.540
Because lots of things are being
changed, and they've

00:40:01.540 --> 00:40:03.360
only got one treatment.

00:40:03.360 --> 00:40:04.710
That's basically what
I'm trying to say.

00:40:04.710 --> 00:40:08.240
They managed to tease out a lot,
but they can't nail down

00:40:08.240 --> 00:40:10.240
everything, because they're
changing lots of different

00:40:10.240 --> 00:40:15.070
things about the classroom, but
they've only got treatment

00:40:15.070 --> 00:40:16.390
versus control.

00:40:16.390 --> 00:40:19.400
So if they got a zero and two
things going on, they can

00:40:19.400 --> 00:40:21.800
actually say, well, both
of them must be zero.

00:40:21.800 --> 00:40:26.200
Unless of course, one helped and
one hurt and they exactly

00:40:26.200 --> 00:40:28.870
offset each other.

00:40:28.870 --> 00:40:33.700
Again, focusing on basic
improvements, focusing the

00:40:33.700 --> 00:40:36.620
curricula, as we say, for the
low scoring kids, there was an

00:40:36.620 --> 00:40:38.790
improvement, but we don't know
whether it was because the

00:40:38.790 --> 00:40:43.050
teacher was more accountable
or the curricula changed.

00:40:43.050 --> 00:40:46.150
We can maybe look at teacher
attendance and see did that

00:40:46.150 --> 00:40:50.170
change a lot, in which case if
it didn't, if the Balsakhi

00:40:50.170 --> 00:40:53.310
didn't turn up more than the
regular teacher, then it's

00:40:53.310 --> 00:40:55.880
probably the curricula
that's going up.

00:40:55.880 --> 00:40:56.100
Yeah?

00:40:56.100 --> 00:40:59.016
AUDIENCE: Is it methodologically
sound to

00:40:59.016 --> 00:41:03.410
compare side by side the
treatment effect for low

00:41:03.410 --> 00:41:06.331
achieving students and high
achieving students, even

00:41:06.331 --> 00:41:08.110
though they're starting in
very different places?

00:41:08.110 --> 00:41:12.717
Can you make them comparable
because they're starting at

00:41:12.717 --> 00:41:15.707
such different levels, and it
might be more difficult to get

00:41:15.707 --> 00:41:17.810
from one level than
from another?

00:41:17.810 --> 00:41:18.310
PROFESSOR: Right.

00:41:18.310 --> 00:41:25.830
So again, you can't judge where
low scoring kids improve

00:41:25.830 --> 00:41:28.235
by 10%, high scoring kids
improve by 15%.

00:41:30.820 --> 00:41:33.050
Is one really bigger than the
other, or is it just the way

00:41:33.050 --> 00:41:33.990
you measure it?

00:41:33.990 --> 00:41:38.750
Is it harder to get high scoring
kids up or easier?

00:41:38.750 --> 00:41:39.880
Then you're in trouble.

00:41:39.880 --> 00:41:43.040
But in this case, the high
scoring kids didn't see any

00:41:43.040 --> 00:41:45.330
improvement at all.

00:41:45.330 --> 00:41:47.510
So that's kind of easier to
interpret, whereas the low

00:41:47.510 --> 00:41:50.210
scoring kids saw a huge
amount of improvement.

00:41:50.210 --> 00:41:53.310
You could say, well, it's just
hard to get the high scoring

00:41:53.310 --> 00:41:54.980
kids much better.

00:41:54.980 --> 00:41:59.450
But when we say high scoring
kids, we're like they're on

00:41:59.450 --> 00:42:00.450
grade level.

00:42:00.450 --> 00:42:02.210
They're not massively
behind grade level.

00:42:02.210 --> 00:42:04.870
It's not like they're such
superstars that there's no

00:42:04.870 --> 00:42:06.110
improvement possible.

00:42:06.110 --> 00:42:07.630
They were just on grade level.

00:42:07.630 --> 00:42:09.840
That's all we mean by high
scoring in this context.

00:42:09.840 --> 00:42:12.180
It wasn't that they were
desperately falling behind.

00:42:16.370 --> 00:42:19.400
But as I say, you don't want to
compare an improvement of

00:42:19.400 --> 00:42:22.400
10 versus an improvement in 15,
because that depends on

00:42:22.400 --> 00:42:25.530
how you scored the test.

00:42:25.530 --> 00:42:27.870
So it's hard to interpret.

00:42:27.870 --> 00:42:31.710
But in this case, the program
just didn't help the top.

00:42:31.710 --> 00:42:35.640
And in the textbook example that
I talked about before,

00:42:35.640 --> 00:42:39.370
giving textbooks just didn't
help the average kid.

00:42:39.370 --> 00:42:41.150
It only helped the top 20%.

00:42:41.150 --> 00:42:44.630
And then I think you
can say something.

00:42:44.630 --> 00:42:46.750
Again, it's pointing you to
the fact that maybe the

00:42:46.750 --> 00:42:49.320
curricula is just so
over their heads.

00:42:49.320 --> 00:42:52.990
If you don't see any movement
in the bottom 80% from a

00:42:52.990 --> 00:42:57.330
program, and you only see
movement in the top, then you

00:42:57.330 --> 00:42:58.860
can say something.

00:42:58.860 --> 00:43:02.590
Again, remember though my
comment that you should say in

00:43:02.590 --> 00:43:05.126
advance what it is you're
looking for.

00:43:05.126 --> 00:43:07.570
That you care about this,
because you don't want to dice

00:43:07.570 --> 00:43:10.200
it every percentile.

00:43:10.200 --> 00:43:13.740
The 56th percentile, it really
worked for them.

00:43:13.740 --> 00:43:15.800
It didn't work for anybody
else, but the

00:43:15.800 --> 00:43:17.590
56th percentile really--

00:43:17.590 --> 00:43:19.340
well, that's not very
convincing.
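
NOTE
A back-of-the-envelope sketch of why slicing by every percentile is unconvincing. It assumes a conventional 5% significance threshold and, as a simplification, independent tests. Neither assumption comes from the lecture, and the function names are hypothetical.
```python
def expected_false_positives(n_subgroups, alpha=0.05):
    """Expected number of subgroups that look significant when there is no true effect."""
    return n_subgroups * alpha
def chance_of_any_false_positive(n_subgroups, alpha=0.05):
    """Probability that at least one subgroup looks significant, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_subgroups
# Testing each of 100 percentiles separately.
print(expected_false_positives(100))
print(chance_of_any_false_positive(100))
```
Under these assumptions, a program with no effect at all would still be expected to look like it worked for a handful of percentiles, which is why the subgroup should be named in advance.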

00:43:19.340 --> 00:43:22.490
But if you've got a story that
I'm worried about the

00:43:22.490 --> 00:43:27.410
curricula being appropriate, and
it's really the kids who

00:43:27.410 --> 00:43:29.640
are falling behind who are going
to benefit most, and

00:43:29.640 --> 00:43:32.610
this is the intervention
designed for them, then it's

00:43:32.610 --> 00:43:35.280
completely appropriate to
be looking at does this

00:43:35.280 --> 00:43:37.820
intervention actually help the
kids it was designed to help,

00:43:37.820 --> 00:43:40.880
which are the low
achieving kids?

00:43:40.880 --> 00:43:43.996
That basic question
is appropriate.

00:43:43.996 --> 00:43:44.908
Yeah?

00:43:44.908 --> 00:43:47.190
AUDIENCE: This is a
little off topic.

00:43:47.190 --> 00:43:51.584
In our group we were reviewing
this, we talked about criteria

00:43:51.584 --> 00:43:54.512
for determining a low achieving
student, and that

00:43:54.512 --> 00:43:59.068
that was not based on anything
quantitative necessarily, that

00:43:59.068 --> 00:44:02.372
that was sort of a more
discretionary choice on the

00:44:02.372 --> 00:44:03.788
part of the teacher.

00:44:03.788 --> 00:44:06.690
But in fact, the thing that
you used to evaluate the

00:44:06.690 --> 00:44:09.920
efficacy of intervention
at the end was a test.

00:44:09.920 --> 00:44:12.670
And it was a pretest.

00:44:12.670 --> 00:44:14.860
So I'm just curious about
the test itself.

00:44:14.860 --> 00:44:18.485
Was this a specially designed
test by you?

00:44:18.485 --> 00:44:22.595
Was this something that
the NGO already had?

00:44:22.595 --> 00:44:27.900
Were students familiar with
this kind of a test?

00:44:27.900 --> 00:44:31.330
PROFESSOR: It was a test that
was designed to pick out

00:44:31.330 --> 00:44:35.090
people who were falling
behind the curricula.

00:44:35.090 --> 00:44:39.850
And the idea of the program was
that we want to pull out

00:44:39.850 --> 00:44:42.550
the kids who are really falling
behind the curricula.

00:44:42.550 --> 00:44:45.450
So it was a test designed to
pick out the people that they

00:44:45.450 --> 00:44:51.580
thought would benefit from the
program, and the things that

00:44:51.580 --> 00:44:54.960
they were targeting
in the program.

00:44:54.960 --> 00:44:58.050
You're right that in a sense
it would've been more

00:44:58.050 --> 00:45:01.790
effective to say, we'll
decide based on the

00:45:01.790 --> 00:45:03.850
test which kids go.

00:45:03.850 --> 00:45:05.730
This test is designed to figure
out who's going to

00:45:05.730 --> 00:45:08.270
benefit from the program,
who's the target

00:45:08.270 --> 00:45:09.570
group for the program.

00:45:09.570 --> 00:45:12.000
We'll tell the teachers these
are the kids that

00:45:12.000 --> 00:45:12.830
you should pull out.

00:45:12.830 --> 00:45:16.200
But I guess it was just seen
that in this context, the

00:45:16.200 --> 00:45:17.680
teachers have a lot
of control.

00:45:17.680 --> 00:45:22.940
And it was seen that they needed
to make that decision.

00:45:22.940 --> 00:45:26.050
Now in the end, they could see
whether the kids that were

00:45:26.050 --> 00:45:28.410
pulled out were in fact
the ones who scored

00:45:28.410 --> 00:45:29.930
low on the test.

00:45:29.930 --> 00:45:33.740
It wasn't a complete match,
but it was a decent match.

00:45:33.740 --> 00:45:38.450
So this wouldn't make any sense
if there wasn't some

00:45:38.450 --> 00:45:42.810
correspondence between
those two, if it wasn't a

00:45:42.810 --> 00:45:45.290
pretty good predictor of
who got the program.

00:45:45.290 --> 00:45:47.015
AUDIENCE: Did you develop
it, or had they

00:45:47.015 --> 00:45:48.433
already developed it?

00:45:48.433 --> 00:45:52.600
GUEST SPEAKER: Pratham
had developed it.

00:45:52.600 --> 00:45:53.485
PROFESSOR: So it
was the NGO--.

00:45:53.485 --> 00:45:55.470
GUEST SPEAKER: Pratham based
it on the school curriculum.

00:45:55.470 --> 00:45:59.100
So they basically went through
the textbook of the grades

00:45:59.100 --> 00:46:00.540
three and four--

00:46:00.540 --> 00:46:06.329
actually on grades two and
three, and included questions

00:46:06.329 --> 00:46:07.323
[UNINTELLIGIBLE].

00:46:07.323 --> 00:46:11.300
AUDIENCE: And they did that
specifically for the purposes

00:46:11.300 --> 00:46:12.660
of this study?

00:46:12.660 --> 00:46:15.660
Or was this part of their
program, but it just wasn't

00:46:15.660 --> 00:46:17.290
the main criterion
for the study?

00:46:20.150 --> 00:46:21.270
GUEST SPEAKER: They were
testing people

00:46:21.270 --> 00:46:22.728
independent of the study.

00:46:22.728 --> 00:46:25.887
But I think in this particular
instance, this test was

00:46:25.887 --> 00:46:29.060
designed at the same
time as the study.

00:46:29.060 --> 00:46:32.020
PROFESSOR: They've since
actually developed another

00:46:32.020 --> 00:46:36.760
tool which we now use in a lot
of other places, which is a

00:46:36.760 --> 00:46:43.420
very, very basic tool for seeing
basically can you read.

00:46:43.420 --> 00:46:47.790
And it's a very nice tool for
sorting the very bottom.

00:46:47.790 --> 00:46:51.020
Because a lot of education tests
are all aimed at the

00:46:51.020 --> 00:46:53.780
top, and they don't sort out
the bottom half of the

00:46:53.780 --> 00:46:54.930
distribution.

00:46:54.930 --> 00:46:59.210
And Pratham's developed a very
nice test, which they now

00:46:59.210 --> 00:47:01.280
introduce in--

00:47:01.280 --> 00:47:06.210
they do a nationwide testing
of kids across India.

00:47:06.210 --> 00:47:09.240
And they have a very good
mapping of where a kid's

00:47:09.240 --> 00:47:10.020
falling behind.

00:47:10.020 --> 00:47:14.640
And it's a test that's really
designed to get at the bottom.

00:47:14.640 --> 00:47:18.856
I used it when I was doing
a study in Bangladesh.

00:47:18.856 --> 00:47:20.820
And people in the World
Bank are using

00:47:20.820 --> 00:47:22.070
it in Africa.

00:47:24.610 --> 00:47:27.320
It's called a dipstick because
it can be done quite quickly,

00:47:27.320 --> 00:47:30.600
but it gives you quite a good,
accurate indication of where

00:47:30.600 --> 00:47:31.900
people are going.

00:47:31.900 --> 00:47:33.680
So that was kind
of complicated.

00:47:33.680 --> 00:47:36.460
We're having trouble sorting out
exactly what was going on

00:47:36.460 --> 00:47:38.540
in different--

00:47:38.540 --> 00:47:41.370
although we get a very clear
idea of does the program work,

00:47:41.370 --> 00:47:42.170
and we know it works.

00:47:42.170 --> 00:47:46.030
If we want to try to get at
these fundamental ideas, we

00:47:46.030 --> 00:47:48.260
got some sense from it,
but it's a little

00:47:48.260 --> 00:47:50.330
hard to get at precisely.

00:47:50.330 --> 00:47:53.550
So how can we do a better job
of really nailing these

00:47:53.550 --> 00:47:55.940
fundamental issues that
we want to get at?

00:47:55.940 --> 00:48:01.200
Well, the alternative is
the old economics rule: if you

00:48:01.200 --> 00:48:05.485
want to get at four outcomes, you've
got to have four instruments.

00:48:08.470 --> 00:48:11.770
So do smaller classes
improve test scores?

00:48:11.770 --> 00:48:13.450
Well what's the obvious
way to do that?

00:48:13.450 --> 00:48:17.620
You just do a treatment
that adds teachers.

00:48:17.620 --> 00:48:21.450
Do accountable teachers
get better results?

00:48:21.450 --> 00:48:26.580
You make the new teachers more
accountable, and you randomize

00:48:26.580 --> 00:48:32.310
whether a kid gets a smaller
class with an accountable

00:48:32.310 --> 00:48:35.510
teacher, or a smaller class
with the old teacher, you

00:48:35.510 --> 00:48:36.330
randomize that.

00:48:36.330 --> 00:48:39.290
So you can test the new teachers
who are accountable

00:48:39.290 --> 00:48:43.770
versus the regular teachers
that were not accountable.

00:48:43.770 --> 00:48:45.980
Does streaming improve
test scores?

00:48:45.980 --> 00:48:48.480
Well, you do that separately
with a different treatment.

00:48:52.390 --> 00:48:55.930
In some schools, you divide
classes into ability

00:48:55.930 --> 00:48:58.160
groupings, and in other
schools, you just

00:48:58.160 --> 00:48:59.230
randomly mix them.

00:48:59.230 --> 00:49:03.390
And you've got an opportunity
to do that division because

00:49:03.390 --> 00:49:05.820
you've just added some
more teachers.

00:49:05.820 --> 00:49:08.660
But you randomize so some
schools get the division,

00:49:08.660 --> 00:49:12.440
unlike the Balsakhi, where
all the classes

00:49:12.440 --> 00:49:13.630
got divided by ability.

00:49:13.630 --> 00:49:16.220
In this case sometimes they will
be and sometimes they won't,

00:49:16.220 --> 00:49:18.840
and then we've got more
variation, and we can pick up

00:49:18.840 --> 00:49:21.340
those things more precisely.

00:49:21.340 --> 00:49:23.910
Does focusing on the basics
improve results?

00:49:23.910 --> 00:49:29.060
Well, you can train some people
to focus on the basics

00:49:29.060 --> 00:49:31.920
and only introduce those
in some schools.

00:49:31.920 --> 00:49:37.910
This is an actual project that did
the first three of these.

00:49:37.910 --> 00:49:40.900
If you wanted to do this
one, you'd have

00:49:40.900 --> 00:49:44.010
to add another treatment.

00:49:44.010 --> 00:49:49.390
So extra teacher provision in
Kenya, at least one of those

00:49:49.390 --> 00:49:53.840
involved in Balsakhi went off
and did it again with more

00:49:53.840 --> 00:49:56.600
schools and more treatments.

00:49:56.600 --> 00:50:00.700
Esther was part of both
of these projects.

00:50:00.700 --> 00:50:06.490
So they started with a target
population of 330 schools.

00:50:06.490 --> 00:50:09.930
Some of them were randomized
into pure control group, no

00:50:09.930 --> 00:50:14.880
extra teacher, exactly as
they were doing before.

00:50:14.880 --> 00:50:18.490
Another bunch, many more, and
you'll realize why, because

00:50:18.490 --> 00:50:22.590
I'm about to subdivide this into
many other things, got an

00:50:22.590 --> 00:50:23.820
extra teacher.

00:50:23.820 --> 00:50:27.350
And that extra teacher was
a contract teacher, i.e.

00:50:27.350 --> 00:50:31.320
they could be fired if
they didn't show up.

00:50:31.320 --> 00:50:43.970
So then you split it into those
who are streamed, i.e.

00:50:43.970 --> 00:50:46.960
they are grouped by ability when
they're segregated into

00:50:46.960 --> 00:50:50.210
the different classes,
and those who--

00:50:50.210 --> 00:50:54.410
I shouldn't say ability, I
should say achievement--

00:50:54.410 --> 00:50:58.820
versus those who are
just randomly

00:50:58.820 --> 00:51:00.640
divided between classes.

00:51:00.640 --> 00:51:01.890
There's no attempt to stream.
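
NOTE
Editor's sketch, not part of the lecture: one way the school-level assignment described above could be drawn, assuming a simple shuffle-and-slice procedure. The 330-school count comes from the lecture; the arm names, the one-third control share, the even split of extra-teacher schools, and the seed are illustrative assumptions, not the study's actual protocol.
```python
import random
def assign_schools(school_ids, control_share=1 / 3, seed=0):
    """Randomly split schools into control, streamed, and mixed extra-teacher arms."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    ids = list(school_ids)
    rng.shuffle(ids)
    # Assumed share: a minority of schools stay as pure control.
    n_control = round(len(ids) * control_share)
    treated = ids[n_control:]
    # Assumed split: half of the extra-teacher schools stream by achievement.
    n_streamed = len(treated) // 2
    arms = {sid: "control" for sid in ids[:n_control]}
    arms.update({sid: "extra_teacher_streamed" for sid in treated[:n_streamed]})
    arms.update({sid: "extra_teacher_mixed" for sid in treated[n_streamed:]})
    return arms
arms = assign_schools(range(330))
```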

00:51:04.600 --> 00:51:07.750
Within that you've got
your extra teachers.

00:51:07.750 --> 00:51:10.050
Some of them are contract
teachers, some of them not.

00:51:10.050 --> 00:51:14.220
So you need to randomize which
classes get the government

00:51:14.220 --> 00:51:16.620
teacher and which get the
contract teacher.

00:51:16.620 --> 00:51:18.910
What would happen if you didn't
insist on that being

00:51:18.910 --> 00:51:21.430
randomized?

00:51:21.430 --> 00:51:23.920
Well no actually, that's
fine, because these

00:51:23.920 --> 00:51:25.390
classes are the same.

00:51:25.390 --> 00:51:30.950
But when I go here, we've got
some of the classes are

00:51:30.950 --> 00:51:33.300
grouped to be the low
level learning

00:51:33.300 --> 00:51:35.190
when they enter school.

00:51:35.190 --> 00:51:38.840
In Kenya, this was done by
whether you knew any English,

00:51:38.840 --> 00:51:43.640
because all the teaching is
in English in school.

00:51:43.640 --> 00:51:47.440
So there were some kids who
were turning up in school

00:51:47.440 --> 00:51:51.250
knowing some English and some
kids turning up who have no

00:51:51.250 --> 00:51:52.740
words of English.

00:51:52.740 --> 00:51:55.670
So the idea is, that's kind of
the fundamental thing when you

00:51:55.670 --> 00:51:58.190
start school.

00:51:58.190 --> 00:52:00.320
You're talking to a bunch of
people, some of whom know

00:52:00.320 --> 00:52:03.420
English and some don't,
can you adapt your

00:52:03.420 --> 00:52:04.580
teaching level to that?

00:52:04.580 --> 00:52:07.050
And this allows them to adapt
their teaching to whether the

00:52:07.050 --> 00:52:10.640
kids in the class know
any English or not.

00:52:10.640 --> 00:52:12.610
So some are grouped.

00:52:12.610 --> 00:52:14.690
So these are all grouped
by ability.

00:52:14.690 --> 00:52:18.470
You've got the lower level of
English learning at the

00:52:18.470 --> 00:52:20.230
beginning and the higher
level of English

00:52:20.230 --> 00:52:21.630
learning at the beginning.

00:52:21.630 --> 00:52:28.920
Now, if I didn't randomize, so
some of these got government

00:52:28.920 --> 00:52:29.980
teachers and--

00:52:29.980 --> 00:52:33.150
or contract teachers, and some
of these got government

00:52:33.150 --> 00:52:34.980
teachers versus contract
teachers.

00:52:34.980 --> 00:52:36.850
If I didn't insist
on randomizing

00:52:36.850 --> 00:52:39.490
this, what would happen?

00:52:39.490 --> 00:52:41.820
What would I fear would happen,
and why would that

00:52:41.820 --> 00:52:47.080
screw up my ability to do
the evaluation properly?

00:52:47.080 --> 00:52:52.010
If I just sorted them by the
level of scoring on English at

00:52:52.010 --> 00:52:54.700
the beginning, and then said
OK, you've got contract

00:52:54.700 --> 00:52:56.220
teachers, you've got government
teachers, you

00:52:56.220 --> 00:52:58.090
figure out who teaches
which class.

00:52:58.090 --> 00:53:03.736
What would I worry
about happening?

00:53:03.736 --> 00:53:06.176
AUDIENCE: They would assign
teachers in a systematic way,

00:53:06.176 --> 00:53:09.104
and you wouldn't be able to
determine whether the effects

00:53:09.104 --> 00:53:11.056
were due to the tracking,
or because we

00:53:11.056 --> 00:53:13.500
got the better teacher.

00:53:13.500 --> 00:53:13.840
PROFESSOR: Right.

00:53:13.840 --> 00:53:17.950
My guess is all the higher level
classes would be taken

00:53:17.950 --> 00:53:21.930
by the government teacher, and
all the lower classes would be

00:53:21.930 --> 00:53:24.070
given to the contract teacher.

00:53:24.070 --> 00:53:26.675
And then I wouldn't know whether
it was that streaming works

00:53:26.675 --> 00:53:30.540
for high level but not for low
level or vice versa, or

00:53:30.540 --> 00:53:33.920
whether it's something to
do with the teacher.

00:53:33.920 --> 00:53:37.830
I have to take control and say,
you're only doing this if

00:53:37.830 --> 00:53:42.310
I get to decide which class
is taken by which teacher.

00:53:42.310 --> 00:53:46.970
But this way we know that half
of the lower level kids got

00:53:46.970 --> 00:53:49.110
taken by a government teacher,
half of them

00:53:49.110 --> 00:53:50.340
by a contract teacher.

00:53:50.340 --> 00:53:54.095
Which classes it was
was randomized.

00:53:54.095 --> 00:53:59.605
Because otherwise I'm going to
get all the low ones being--

00:53:59.605 --> 00:54:02.530
or most of the low ones being
taken by the contract teacher,

00:54:02.530 --> 00:54:04.300
most of the high ones
being taken by

00:54:04.300 --> 00:54:05.460
the government teacher.

00:54:05.460 --> 00:54:08.740
And I've got two things going
on, and only one difference

00:54:08.740 --> 00:54:12.282
so I can't sort
out the effects.
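
NOTE
Editor's sketch, not part of the lecture: a minimal illustration of taking teacher assignment out of the school's hands, assuming each streamed school has one low and one high section. The function name, the section labels, and the seed are hypothetical. Shuffling and splitting the schools in half means the contract teacher takes the low section in exactly half of them, so teacher type is not tied to class level.
```python
import random
def assign_teachers(streamed_school_ids, seed=0):
    """Randomize, school by school, which section the contract teacher takes."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    ids = list(streamed_school_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    assignment = {}
    for position, sid in enumerate(ids):
        if position < half:
            # First half of the shuffled list: contract teacher takes the low section.
            assignment[sid] = {"low": "contract", "high": "government"}
        else:
            # Second half: contract teacher takes the high section.
            assignment[sid] = {"low": "government", "high": "contract"}
    return assignment
assignment = assign_teachers(range(110))
# Share of low sections taught by a contract teacher.
share_low_contract = sum(a["low"] == "contract" for a in assignment.values()) / len(assignment)
```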

00:54:12.282 --> 00:54:16.815
AUDIENCE: If the interest is
to determine the effect of

00:54:16.815 --> 00:54:18.065
[UNINTELLIGIBLE PHRASE].

00:54:26.320 --> 00:54:28.050
PROFESSOR: But I'm not.

00:54:28.050 --> 00:54:29.300
What I'm--

00:54:31.730 --> 00:54:34.812
lots of clicks involved.

00:54:34.812 --> 00:54:37.200
What were my questions
at the beginning?

00:54:40.651 --> 00:54:41.901
Oh dear.

00:54:45.090 --> 00:54:51.250
I'm trying to identify things
to deal with all of these.

00:54:51.250 --> 00:54:56.030
And I'm trying to identify a
bunch of different things.

00:54:56.030 --> 00:54:57.890
I'm trying to identify
what's the effect of

00:54:57.890 --> 00:55:00.510
smaller class sizes?

00:55:00.510 --> 00:55:03.380
I'm also worrying that children
in these big classes

00:55:03.380 --> 00:55:05.080
are at different levels.

00:55:05.080 --> 00:55:07.745
So I also want to answer
what's the effect

00:55:07.745 --> 00:55:09.830
of streaming kids?

00:55:09.830 --> 00:55:10.560
So that's our--

00:55:10.560 --> 00:55:10.910
I agree.

00:55:10.910 --> 00:55:14.360
It's a whole separate
question.

00:55:14.360 --> 00:55:18.010
And I'm describing a design of
an evaluation that does not

00:55:18.010 --> 00:55:19.870
answer one question.

00:55:19.870 --> 00:55:22.700
It answers three questions.

00:55:22.700 --> 00:55:24.900
This particular one didn't
answer this one.

00:55:24.900 --> 00:55:28.580
So three questions that-- so one
evaluation with multiple

00:55:28.580 --> 00:55:32.830
treatments is designed to answer
three of, I would argue

00:55:32.830 --> 00:55:36.890
three of the most fundamental
questions in education.

00:55:36.890 --> 00:55:39.410
How important is class size?

00:55:39.410 --> 00:55:42.500
How important is
it to deliver--

00:55:42.500 --> 00:55:45.710
to have coherent classes that
are all at the same level so

00:55:45.710 --> 00:55:48.300
that you can deliver a
message that's at the

00:55:48.300 --> 00:55:50.400
level of the children?

00:55:50.400 --> 00:55:52.680
And how important is it to make
teachers accountable?

00:55:52.680 --> 00:55:54.650
Those are three completely
different questions.

00:55:54.650 --> 00:55:59.520
The Balsakhi was getting some
answers to these, but they

00:55:59.520 --> 00:56:01.500
only changed one thing.

00:56:01.500 --> 00:56:05.470
They only had one program, so
it was hard to tease out.

00:56:05.470 --> 00:56:10.070
So we're designing this
to answer those

00:56:10.070 --> 00:56:13.120
three different questions.

00:56:13.120 --> 00:56:14.660
So you're right.

00:56:14.660 --> 00:56:16.840
If you just want to answer
what's the effect of adding

00:56:16.840 --> 00:56:19.700
teachers, you don't need such
a complicated design.

00:56:19.700 --> 00:56:23.600
But they also wanted to answer
what's the most effective way

00:56:23.600 --> 00:56:26.520
of adding more teachers, or
alternatively, just what's a

00:56:26.520 --> 00:56:28.180
better way of organizing
a school?

00:56:28.180 --> 00:56:31.570
Does this improve the
way the school works

00:56:31.570 --> 00:56:33.660
to do it this way?

00:56:33.660 --> 00:56:36.090
And then you see that
we've got--

00:56:36.090 --> 00:56:38.880
that's why we started with
lots more schools in this

00:56:38.880 --> 00:56:42.480
group, because we're dividing
it more times, and we need

00:56:42.480 --> 00:56:44.340
these different groups.

00:56:44.340 --> 00:56:47.820
So how are we going to actually
look at the results

00:56:47.820 --> 00:56:50.650
to be able to answer
this question?

00:56:50.650 --> 00:56:53.840
So the first hypothesis we
have is providing extra

00:56:53.840 --> 00:56:56.540
teachers leads to better
educational outcomes.

00:56:56.540 --> 00:57:02.090
Just smaller class sizes, better
educational outcomes.

00:57:02.090 --> 00:57:06.510
So to answer that question, we
simply compare all the people

00:57:06.510 --> 00:57:12.280
who are in this group with all
the people who are in control.

00:57:12.280 --> 00:57:17.800
That's the comparison when we
do the analysis, we do that.

00:57:17.800 --> 00:57:24.370
Our secondary thing is, is it
more important to have smaller

00:57:24.370 --> 00:57:27.490
class sizes for lower
performing kids?

00:57:27.490 --> 00:57:29.780
Are they the ones who
benefit most?

00:57:29.780 --> 00:57:31.530
And then we can look at
the low performing

00:57:31.530 --> 00:57:33.870
kids in these groups.

00:57:33.870 --> 00:57:37.540
This is a subgroup analysis
saying, is the program effect

00:57:37.540 --> 00:57:41.410
different for different
kinds of kids?

00:57:41.410 --> 00:57:45.595
Is this kind of approach most
important for people who start

00:57:45.595 --> 00:57:47.730
at a given level?

00:57:47.730 --> 00:57:50.620
So that's relatively simple.

00:57:50.620 --> 00:57:54.540
Our second hypothesis is
students in classes grouped by

00:57:54.540 --> 00:57:57.490
ability perform better on
average than those in mixed

00:57:57.490 --> 00:57:58.310
level classes.

00:57:58.310 --> 00:58:01.152
So this is exactly the second
question, which is, I agree, a

00:58:01.152 --> 00:58:03.260
completely separate question.

00:58:03.260 --> 00:58:06.960
And that, we don't
look at this one.

00:58:09.870 --> 00:58:14.460
Because these have a different
class size, so you don't want

00:58:14.460 --> 00:58:17.020
to throw them in with this
lot, because then you're

00:58:17.020 --> 00:58:19.280
changing two things at once.

00:58:19.280 --> 00:58:24.660
You take all of those who have
smaller class sizes, some who

00:58:24.660 --> 00:58:29.690
are mixed ability, and some who
are split by the level of

00:58:29.690 --> 00:58:30.940
attainment when they come in.

00:58:34.760 --> 00:58:39.640
And a big question in the
education literature is, maybe

00:58:39.640 --> 00:58:43.550
it's good for high performing
kids to be separated out, but

00:58:43.550 --> 00:58:46.900
maybe that actually hurts
low performing kids.

00:58:46.900 --> 00:58:48.210
So they were able
to look at that.

00:58:48.210 --> 00:58:52.150
Actually they found that both
low performing at the baseline

00:58:52.150 --> 00:58:55.600
and high performing at the
baseline, both did better

00:58:55.600 --> 00:59:00.060
under streaming than under
mixed classes.

00:59:00.060 --> 00:59:03.750
Their argument being that
those who are in the low

00:59:03.750 --> 00:59:05.750
performing group
were actually--

00:59:05.750 --> 00:59:09.460
their teacher could teach to
their level and they did

00:59:09.460 --> 00:59:11.160
better as a result.

00:59:14.740 --> 00:59:22.080
Now the other question we have
is, is it better to--?

00:59:22.080 --> 00:59:24.140
AUDIENCE: Just a point
of interest.

00:59:24.140 --> 00:59:26.110
That's a very different
conclusion than

00:59:26.110 --> 00:59:26.865
has been found here.

00:59:26.865 --> 00:59:27.771
PROFESSOR: Yes.

00:59:27.771 --> 00:59:31.240
AUDIENCE: Because the biggest
educational research study

00:59:31.240 --> 00:59:34.820
here found the Coleman
effect, the benefit

00:59:34.820 --> 00:59:37.995
to the poorer kids.

00:59:37.995 --> 00:59:41.060
So it's interesting why in a
developing country it would be

00:59:41.060 --> 00:59:43.760
so different.

00:59:43.760 --> 00:59:44.804
PROFESSOR: I don't know
the answer to that.

00:59:44.804 --> 00:59:48.660
It was interesting that
virtually everything they

00:59:48.660 --> 00:59:51.060
found was opposite to
what you find in the

00:59:51.060 --> 00:59:53.120
developed country example.

00:59:53.120 --> 00:59:56.680
So class size did not
have an effect.

00:59:56.680 --> 01:00:00.500
Well, the class size evidence
is somewhat mixed.

01:00:00.500 --> 01:00:04.150
But a lot of the more recent
stuff, the randomized class

01:00:04.150 --> 01:00:09.100
size ones, I think, have
found class size--

01:00:09.100 --> 01:00:10.970
you probably know the--

01:00:10.970 --> 01:00:14.190
Mike, you know the US education

01:00:14.190 --> 01:00:15.120
literature better than me.

01:00:15.120 --> 01:00:18.070
But my understanding is a lot of
the recent stuff has found

01:00:18.070 --> 01:00:21.500
class size effects in the US.

01:00:21.500 --> 01:00:26.380
But now, some people argue it
only helps if you bring it

01:00:26.380 --> 01:00:29.760
down below 20.

01:00:29.760 --> 01:00:34.320
And they're bringing
it from 100 to 50.

01:00:34.320 --> 01:00:37.490
And a lot of educationalists
will say, well, there's no

01:00:37.490 --> 01:00:40.820
point, because unless you give
individual attention, and you

01:00:40.820 --> 01:00:43.230
can't give individual
attention at 50.

01:00:43.230 --> 01:00:44.650
You've got to bring
it way down.

01:00:44.650 --> 01:00:47.800
So there's another question
about these things could be

01:00:47.800 --> 01:00:50.050
very different in how
much you would do.

01:00:50.050 --> 01:00:52.580
You can't just say class
size doesn't matter.

01:00:52.580 --> 01:00:55.850
It may matter over some ranges
and not over other ranges.

01:00:55.850 --> 01:01:01.120
But in terms of a practical
proposal, these countries

01:01:01.120 --> 01:01:03.785
aren't going to get
down to below 20.

01:01:03.785 --> 01:01:06.600
So they're arguing about should
we get it down from 100

01:01:06.600 --> 01:01:09.080
to 50, and it doesn't seem
to have a very big

01:01:09.080 --> 01:01:11.660
effect in this context.

01:01:11.660 --> 01:01:19.310
What does have an effect is the
streaming, and also which

01:01:19.310 --> 01:01:22.560
teacher you have; whether you
have an accountable teacher.

01:01:22.560 --> 01:01:29.320
So the accountable teacher, the
contract teachers who have

01:01:29.320 --> 01:01:32.890
less experience, almost
as much training

01:01:32.890 --> 01:01:34.070
actually in this context.

01:01:34.070 --> 01:01:36.880
In the Balsakhi case, the
contract teachers had less

01:01:36.880 --> 01:01:39.790
training than government
teachers, still performed

01:01:39.790 --> 01:01:42.750
amazingly; were the ones
who saw the really big

01:01:42.750 --> 01:01:44.000
improvements.

01:01:47.220 --> 01:01:49.360
But here, the contract teachers
are basically people

01:01:49.360 --> 01:01:52.630
who are trained and waiting to
become a government teacher,

01:01:52.630 --> 01:01:53.690
but haven't--

01:01:53.690 --> 01:01:55.770
they've trained lots of
government teachers, but they

01:01:55.770 --> 01:01:57.340
haven't got any places
for them.

01:01:57.340 --> 01:01:59.270
So they're mainly the people
who are taking up these

01:01:59.270 --> 01:02:04.290
contracts, and they did much better,
with much higher test scores.

01:02:04.290 --> 01:02:07.380
You see here, we've got three
different boxes with contract

01:02:07.380 --> 01:02:10.430
teachers, and three different
boxes of government.

01:02:10.430 --> 01:02:13.450
And we can pool all of these
together to make the

01:02:13.450 --> 01:02:14.790
comparison.

01:02:14.790 --> 01:02:15.780
Why can we do that?

01:02:15.780 --> 01:02:17.970
Because some of them have lower
class sizes and some of

01:02:17.970 --> 01:02:19.090
them don't.

01:02:19.090 --> 01:02:22.390
But these have higher class
sizes and these have higher

01:02:22.390 --> 01:02:24.260
class sizes.

01:02:24.260 --> 01:02:28.280
So the ones with higher class
sizes are in both treatment

01:02:28.280 --> 01:02:29.640
and control, so we're fine.

01:02:29.640 --> 01:02:33.280
The average class size in all
this red box is the same as

01:02:33.280 --> 01:02:37.740
the average class size
in all this red box.

01:02:37.740 --> 01:02:40.690
We could pool all of them
together, which helps a lot on

01:02:40.690 --> 01:02:42.460
our sample size, because you
remember there are only about

01:02:42.460 --> 01:02:45.571
55 in each one of these.
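
NOTE
Editor's sketch, not part of the lecture: how the three comparisons could be expressed as differences in means over pooled cells. The records and scores below are made-up toy values, and the field names and helper functions are hypothetical. It assumes the contract-versus-government comparison is pooled only across extra-teacher schools. A real analysis would use a regression with standard errors clustered by school.
```python
from statistics import mean
def diff_in_means(records, in_treatment, in_comparison, outcome="score"):
    """Mean outcome of the treatment cells minus mean outcome of the comparison cells."""
    treated = [r[outcome] for r in records if in_treatment(r)]
    comparison = [r[outcome] for r in records if in_comparison(r)]
    return mean(treated) - mean(comparison)
# Hypothetical toy records, one row per class, with invented scores.
records = [
    {"arm": "control", "teacher": "government", "score": 50.0},
    {"arm": "mixed", "teacher": "government", "score": 50.0},
    {"arm": "mixed", "teacher": "contract", "score": 60.0},
    {"arm": "streamed", "teacher": "government", "score": 56.0},
    {"arm": "streamed", "teacher": "contract", "score": 66.0},
]
def has_extra_teacher(r):
    return r["arm"] != "control"
def is_control(r):
    return r["arm"] == "control"
def is_streamed(r):
    return r["arm"] == "streamed"
def is_mixed(r):
    return r["arm"] == "mixed"
def is_contract(r):
    return has_extra_teacher(r) and r["teacher"] == "contract"
def is_government_with_extra(r):
    return has_extra_teacher(r) and r["teacher"] == "government"
# Hypothesis 1: all extra-teacher classes versus pure control.
h1_extra_teacher = diff_in_means(records, has_extra_teacher, is_control)
# Hypothesis 2: streamed versus mixed, leaving out control because class size differs.
h2_streaming = diff_in_means(records, is_streamed, is_mixed)
# Hypothesis 3: contract versus government teachers, pooled over streamed and mixed.
h3_contract = diff_in_means(records, is_contract, is_government_with_extra)
```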

01:02:45.571 --> 01:02:47.735
AUDIENCE: What exactly
were the criteria for

01:02:47.735 --> 01:02:48.938
the contract teacher?

01:02:48.938 --> 01:02:50.390
What's their contract
based on?

01:02:50.390 --> 01:02:54.910
PROFESSOR: So their contract
is with the local NGO who

01:02:54.910 --> 01:02:56.660
hires them.

01:02:56.660 --> 01:02:59.090
They have a whole other cross
cutting thing which I haven't

01:02:59.090 --> 01:03:02.920
got into, which is some
of the communities--

01:03:02.920 --> 01:03:06.090
so they were hired by the
community actually, with

01:03:06.090 --> 01:03:08.480
funding from an NGO.

01:03:08.480 --> 01:03:12.110
They were responsible
to the community.

01:03:12.110 --> 01:03:17.920
And sometimes the communities
were given extra training to

01:03:17.920 --> 01:03:21.700
help them oversee these
teachers and

01:03:21.700 --> 01:03:22.450
sometimes they weren't.

01:03:22.450 --> 01:03:26.260
But they're contracted in the
sense, if you get a job with

01:03:26.260 --> 01:03:28.360
the government, it's for life.

01:03:28.360 --> 01:03:31.040
And these guys can be terminated
at any time.

01:03:31.040 --> 01:03:31.800
AUDIENCE: Would they
be terminated

01:03:31.800 --> 01:03:35.020
based on test stores?

01:03:35.020 --> 01:03:35.780
PROFESSOR: No.

01:03:35.780 --> 01:03:39.580
So the main thing that's
going on here is that

01:03:39.580 --> 01:03:41.582
these guys show up.

01:03:41.582 --> 01:03:42.990
AUDIENCE: Oh I see,
[UNINTELLIGIBLE].

01:03:42.990 --> 01:03:45.390
PROFESSOR: It's just like
they actually--

01:03:45.390 --> 01:03:48.020
AUDIENCE: If you didn't show
up, you would get fired?

01:03:48.020 --> 01:03:48.300
PROFESSOR: Yes.

01:03:48.300 --> 01:03:51.910
So it's not like you've got to
have an improvement of 10% in

01:03:51.910 --> 01:03:53.350
your test scores.

01:03:53.350 --> 01:03:58.390
No, it's like maybe if they like
severely beat the kids

01:03:58.390 --> 01:04:02.570
regularly, or got known for
raping the kids, which is

01:04:02.570 --> 01:04:05.570
actually relatively
common, that might

01:04:05.570 --> 01:04:06.460
get them into trouble.

01:04:06.460 --> 01:04:09.780
But mainly the issue here is
that they showed up, because

01:04:09.780 --> 01:04:12.030
they knew that if they didn't
show up on a regular basis,

01:04:12.030 --> 01:04:13.280
they'd be fired.

01:04:16.790 --> 01:04:22.070
And that seems to be the big
thing that's going on.

01:04:22.070 --> 01:04:25.620
Interestingly, the cross cutting
thing I talked about,

01:04:25.620 --> 01:04:28.420
we're looking at training
the communities to

01:04:28.420 --> 01:04:30.800
oversee these teachers.

01:04:30.800 --> 01:04:33.770
Sometimes when you've got the
contract teacher showing up,

01:04:33.770 --> 01:04:35.345
the government teacher
showed up less.

01:04:43.740 --> 01:04:46.000
So sorry, I was saying
about class size.

01:04:46.000 --> 01:04:49.550
We don't have lower class
size, but we do have

01:04:49.550 --> 01:04:51.800
differences in these in terms
of whether they're split

01:04:51.800 --> 01:04:53.940
randomly or streamed.

01:04:53.940 --> 01:04:56.560
So there are other differences
going on between these

01:04:56.560 --> 01:04:59.500
different boxes, but they all
have the same class size.

01:04:59.500 --> 01:05:01.910
And on average, they all
have the same thing.

01:05:01.910 --> 01:05:04.880
On average, all the other
characteristics are similar

01:05:04.880 --> 01:05:08.140
between these two groups.

01:05:08.140 --> 01:05:10.730
Sometimes the government teacher
saw there was an extra

01:05:10.730 --> 01:05:14.020
teacher, so they showed
up even less than

01:05:14.020 --> 01:05:16.080
in this pure control.

01:05:16.080 --> 01:05:19.180
But where you had the community
trained to oversee

01:05:19.180 --> 01:05:20.810
them, that actually
happened less.

01:05:20.810 --> 01:05:24.320
So that was the one benefit of
giving help to the community

01:05:24.320 --> 01:05:31.660
to oversee, because the contract
teacher said look,

01:05:31.660 --> 01:05:33.540
the community's breathing
down my neck.

01:05:33.540 --> 01:05:34.950
I'm meant to be teaching
this class.

01:05:34.950 --> 01:05:37.770
I can't always be teaching
your class as well.

01:05:37.770 --> 01:05:43.230
But I'm just stunned in terms
of the education literature.

01:05:43.230 --> 01:05:45.530
In all the work that we've done,
this is just the first

01:05:45.530 --> 01:05:46.940
order problem.

01:05:46.940 --> 01:05:49.280
Just showing up is just
one of the first

01:05:49.280 --> 01:05:52.470
order problems in education.

01:05:52.470 --> 01:05:54.350
And in health too.

01:05:54.350 --> 01:05:56.420
Health is even worse.

01:05:56.420 --> 01:06:00.390
Fewer health workers show up
than teachers show up in any

01:06:00.390 --> 01:06:01.640
country we've ever studied.

01:06:04.050 --> 01:06:08.060
So that's not very complicated,
it's just that

01:06:08.060 --> 01:06:09.210
they actually do their job.

01:06:09.210 --> 01:06:10.078
Yeah?

01:06:10.078 --> 01:06:12.263
AUDIENCE: How do you parse out
what [UNINTELLIGIBLE] the

01:06:12.263 --> 01:06:14.260
government teachers sometimes
show up even less?

01:06:14.260 --> 01:06:15.975
So there's no counter-factual--?

01:06:19.190 --> 01:06:21.010
PROFESSOR: Yeah, because you've
got government teachers

01:06:21.010 --> 01:06:23.840
here, in the pure control.

01:06:23.840 --> 01:06:25.000
They are all government
teachers.

01:06:25.000 --> 01:06:31.690
Now they have a bigger class
size, but you can compare the

01:06:31.690 --> 01:06:35.840
government teachers here with
the government teachers here.

01:06:35.840 --> 01:06:40.060
And indeed, that's what's
giving you your--

01:06:40.060 --> 01:06:42.500
a pure class size effect
without changing

01:06:42.500 --> 01:06:46.478
accountability, you would look
at this versus this.

01:06:46.478 --> 01:06:49.610
AUDIENCE: Do you have any
problems that spill over when

01:06:49.610 --> 01:06:52.484
the government teacher doesn't
turn up, so the contract

01:06:52.484 --> 01:06:55.090
teacher has to teach
both classes?

01:06:55.090 --> 01:06:56.370
PROFESSOR: In a sense,
that's just the

01:06:56.370 --> 01:06:59.140
program effect, right?

01:06:59.140 --> 01:07:05.460
It's not a spill over to the
control, because these are all

01:07:05.460 --> 01:07:08.980
within the fact that then--

01:07:08.980 --> 01:07:11.960
it's not like they're then
showing up more to some

01:07:11.960 --> 01:07:12.760
control over here.

01:07:12.760 --> 01:07:16.180
So a spillover is when you have
an effect on the control.

01:07:16.180 --> 01:07:19.130
But these aren't in the
control, they're

01:07:19.130 --> 01:07:20.380
between these two.

01:07:24.250 --> 01:07:29.030
But that's the effect of having
a contract teacher.

01:07:29.030 --> 01:07:31.140
When you have a contract
teacher, you've got to take

01:07:31.140 --> 01:07:34.430
into account that it may
change the effect.

01:07:34.430 --> 01:07:38.240
It may change what the
government teacher does.

01:07:38.240 --> 01:07:41.060
So it's not going to be
effective if it totally

01:07:41.060 --> 01:07:44.374
reduces what a government
teacher does.

01:07:44.374 --> 01:07:48.358
AUDIENCE: But wouldn't it limit
your ability to compare

01:07:48.358 --> 01:07:50.350
across your different
treatments?

01:07:50.350 --> 01:07:53.338
Because if you've got--

01:07:53.338 --> 01:07:57.820
please go back to
the last slide--

01:07:57.820 --> 01:08:00.808
it influences your class size.

01:08:00.808 --> 01:08:05.389
Because if I've got a contract
teacher who's now having to do

01:08:05.389 --> 01:08:09.213
the work of a government teacher
too, what were two

01:08:09.213 --> 01:08:12.615
classrooms have now effectively
become one, so it

01:08:12.615 --> 01:08:15.595
is a spillover in
that that is now

01:08:15.595 --> 01:08:17.090
behaving like your control.

01:08:17.090 --> 01:08:20.800
PROFESSOR: Well, when you look
at the class size effect,

01:08:20.800 --> 01:08:29.319
you're looking at here,
between these two.

01:08:29.319 --> 01:08:32.729
And what you're saying is, you
think you've halved class size

01:08:32.729 --> 01:08:35.920
by having twice as
many teachers.

01:08:35.920 --> 01:08:39.970
But actually you haven't, but
you take into account when

01:08:39.970 --> 01:08:43.060
you're comparing that.

01:08:43.060 --> 01:08:45.080
So you're saying you're
doubling the number of

01:08:45.080 --> 01:08:55.140
teachers, but the class size is
improving by less than that

01:08:55.140 --> 01:08:57.510
slightly, because you've
got more absenteeism.

01:08:57.510 --> 01:08:59.609
But on the other hand, if you're
looking at the program,

01:08:59.609 --> 01:09:02.420
is this a good thing to do?

01:09:02.420 --> 01:09:06.260
That's part of what it is.

01:09:06.260 --> 01:09:09.520
So in all of these cases, when
you write it up and you

01:09:09.520 --> 01:09:14.220
explain what's going on, you
don't just show the numbers,

01:09:14.220 --> 01:09:17.180
you explain this is
the mechanism.

01:09:17.180 --> 01:09:19.850
And you're measuring all that
you were talking about before

01:09:19.850 --> 01:09:22.279
in terms of the theory of change
and measuring each step

01:09:22.279 --> 01:09:23.260
along the way.

01:09:23.260 --> 01:09:25.580
Then what is it we think
is going on.

01:09:30.359 --> 01:09:33.399
And then you can see, as I say,
if you've got this other

01:09:33.399 --> 01:09:36.500
design of looking at which--

01:09:36.500 --> 01:09:39.580
another treatment that actually
managed to keep the

01:09:39.580 --> 01:09:41.840
two classes more separate
than what's--

01:09:41.840 --> 01:09:45.200
what do we see in those versus
the ones that weren't able to

01:09:45.200 --> 01:09:46.120
keep the two separate.

01:09:46.120 --> 01:09:48.580
It wasn't that they
never showed up.

01:09:48.580 --> 01:09:51.819
It was just that they
did see that--

01:09:51.819 --> 01:09:53.260
and it is important.

01:09:53.260 --> 01:09:55.470
What I would call that is
a kind of unintended

01:09:55.470 --> 01:10:03.430
consequences, and that just
emphasizes the fact that you

01:10:03.430 --> 01:10:06.060
really need to be thinking about
what are some of the

01:10:06.060 --> 01:10:08.360
potential unintended
consequences of what you're

01:10:08.360 --> 01:10:14.120
doing, so that you can measure
them and make sure that you

01:10:14.120 --> 01:10:15.400
pick them up.

01:10:15.400 --> 01:10:17.750
And then if it doesn't work,
maybe it's because of that.

01:10:17.750 --> 01:10:21.140
And then what can you do to
offset that, and hopefully

01:10:21.140 --> 01:10:22.460
maybe you have another
treatment that

01:10:22.460 --> 01:10:23.710
tries to offset that.

01:10:29.400 --> 01:10:34.770
So then you can say, is it more
effective in those ones

01:10:34.770 --> 01:10:37.300
where you've got tracking
or not?

01:10:41.290 --> 01:10:41.480
Yeah?

01:10:41.480 --> 01:10:44.072
AUDIENCE: Are government
teachers assigned in the same

01:10:44.072 --> 01:10:47.404
manner within the community,
where usually if you're

01:10:47.404 --> 01:10:49.320
training, you'd stay in
the same community?

01:10:49.320 --> 01:10:50.570
PROFESSOR: No.

01:10:53.410 --> 01:10:55.202
AUDIENCE: I don't understand
how large

01:10:55.202 --> 01:10:55.935
these communities are.

01:10:55.935 --> 01:10:58.025
But if you're comparing somebody
who was drawn from a

01:10:58.025 --> 01:11:01.502
community, like a contract teacher
was drawn from a community,

01:11:01.502 --> 01:11:04.208
and I understand you're mostly
focusing on absenteeism, not

01:11:04.208 --> 01:11:05.930
the ability of the teacher.

01:11:05.930 --> 01:11:09.866
But if the teacher is drawn from
the same community as the

01:11:09.866 --> 01:11:12.326
contract teacher, or the
government teachers are

01:11:12.326 --> 01:11:16.480
randomized, is there any way
you could have some sort of

01:11:16.480 --> 01:11:17.600
bias situation there?

01:11:17.600 --> 01:11:20.710
Because if you're from the
same community, maybe you

01:11:20.710 --> 01:11:21.833
consistently get--.

01:11:21.833 --> 01:11:26.100
PROFESSOR: OK, let's be really
careful when we bandy around

01:11:26.100 --> 01:11:33.290
the word bias, because bias
means something specific.

01:11:33.290 --> 01:11:37.380
So this is I think what you're
talking about is, be careful

01:11:37.380 --> 01:11:41.330
that you're understanding what
it is that's generating the

01:11:41.330 --> 01:11:43.280
difference between the contract
teachers and the

01:11:43.280 --> 01:11:45.360
government teachers.

01:11:45.360 --> 01:11:49.820
I'm saying that it's the fact
that they're accountable and

01:11:49.820 --> 01:11:51.880
they can be fired.

01:11:51.880 --> 01:11:54.990
But it may also be
that they're more

01:11:54.990 --> 01:11:57.460
local to the area.

01:11:57.460 --> 01:11:59.080
They may live--

01:11:59.080 --> 01:12:01.930
or not live, because they all
live in the area now, but they

01:12:01.930 --> 01:12:07.550
are originally from that
community, and that may make

01:12:07.550 --> 01:12:12.740
them more likely to be more
responsive to the

01:12:12.740 --> 01:12:14.110
needs of the community.

01:12:14.110 --> 01:12:16.620
So it's not just that they can
be fired, it's that they know

01:12:16.620 --> 01:12:18.920
people in the community, and
they're more responsive.

01:12:18.920 --> 01:12:24.530
So that's potentially part of
the reason for why this is

01:12:24.530 --> 01:12:26.590
different from this.

01:12:26.590 --> 01:12:29.240
If again, if you're looking at
the practical thing of what

01:12:29.240 --> 01:12:32.920
somebody would do when they do
these kind of locally hired

01:12:32.920 --> 01:12:37.580
para-teachers, which is this
isn't just made up for some

01:12:37.580 --> 01:12:38.730
academic point.

01:12:38.730 --> 01:12:41.590
This is happening around the
world a lot, because

01:12:41.590 --> 01:12:45.100
governments can't afford
to double the number of

01:12:45.100 --> 01:12:46.510
government teachers.

01:12:46.510 --> 01:12:51.530
So a lot of what they're doing
in a lot of places is you have

01:12:51.530 --> 01:12:53.390
shiksha mitras in
India, you have

01:12:53.390 --> 01:12:54.950
para-teachers across Africa.

01:12:54.950 --> 01:13:01.320
People who are hired, not
tenured, and often have more

01:13:01.320 --> 01:13:03.345
roots in the local community.

01:13:03.345 --> 01:13:06.220
AUDIENCE: And they're
paid less?

01:13:06.220 --> 01:13:06.620
PROFESSOR: Yes.

01:13:06.620 --> 01:13:08.510
A lot, lot, lot less.

01:13:08.510 --> 01:13:12.880
So these guys are paid a lot
less than these guys.

01:13:12.880 --> 01:13:14.110
So you're right.

01:13:14.110 --> 01:13:17.970
You're changing a few things
from this to this, and they're

01:13:17.970 --> 01:13:20.240
doing better.

01:13:20.240 --> 01:13:22.320
And is it because they're
paid less?

01:13:22.320 --> 01:13:26.410
I doubt it's directly because
they're paid less.

01:13:26.410 --> 01:13:29.980
But this is a relevant package
is I guess what I'm saying.

01:13:29.980 --> 01:13:31.870
This is a relevant package
that a lot of

01:13:31.870 --> 01:13:34.150
countries are exploring.

01:13:34.150 --> 01:13:37.270
And you're right, it could be
some link to a local area,

01:13:37.270 --> 01:13:43.530
although a lot of people in
Kenya, they will be allocated

01:13:43.530 --> 01:13:48.960
by the central government, but
they will request to go to if

01:13:48.960 --> 01:13:51.480
not their exact community,
to their general area.

01:13:55.500 --> 01:14:00.650
And interesting, shiksha
mitras in India are

01:14:00.650 --> 01:14:05.430
theoretically controlled by
the local community, but

01:14:05.430 --> 01:14:11.540
actually the money comes from
the state government at least.

01:14:11.540 --> 01:14:17.290
And they don't show up any more
than regular teachers do.

01:14:17.290 --> 01:14:19.660
And they are literally
from the community.

01:14:19.660 --> 01:14:22.030
So it doesn't seem to have
helped that much in India.

01:14:28.020 --> 01:14:32.320
So again, you can just split
this and look at it, is the

01:14:32.320 --> 01:14:35.770
government teacher versus the
contract teacher better in

01:14:35.770 --> 01:14:37.623
different situations or
other situations?

01:14:42.870 --> 01:14:46.110
Are they particularly better for
low performing kids or for

01:14:46.110 --> 01:14:48.670
high performing kids?

01:14:48.670 --> 01:14:51.380
And that you can only do though
if you've got-- if the

01:14:51.380 --> 01:14:54.790
55, remember we've got
55 in each of these.

01:14:54.790 --> 01:14:57.590
So when we pooled them, we have
a lot more statistical

01:14:57.590 --> 01:14:59.760
power than if we try and--

01:14:59.760 --> 01:15:03.430
where I'm just doing one box
versus another box, I don't

01:15:03.430 --> 01:15:05.550
have a lot of statistical
power.
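
NOTE
Editor's sketch, not from the lecture: a rough illustration of why pooling boxes buys power.
It assumes independent groups with a common standard deviation and ignores clustering;
the 110-per-side pooled figure is a hypothetical case of two boxes of 55 combined on each side.
```python
import math
def se_of_difference(sd, n_a, n_b):
    """Standard error of a difference in means for two independent groups with a common sd."""
    return sd * math.sqrt(1 / n_a + 1 / n_b)
sd = 1.0  # outcome measured in standard deviation units (assumption)
one_box = se_of_difference(sd, 55, 55)  # one box of 55 schools against another
pooled = se_of_difference(sd, 110, 110)  # two boxes pooled on each side (assumption)
print(round(one_box, 3), round(pooled, 3))
```
A smaller standard error means a smaller effect can be detected with the same budget.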

01:15:05.550 --> 01:15:08.310
So you have to think at the
beginning exactly how--

01:15:08.310 --> 01:15:11.280
do you really want to tease
out those real detailed

01:15:11.280 --> 01:15:16.350
questions, or am I OK with the
general pooling of government

01:15:16.350 --> 01:15:17.600
versus contract teachers?

01:15:20.430 --> 01:15:24.420
So that's an example of--

01:15:24.420 --> 01:15:27.005
two different examples,
Balsakhi, where we tried to

01:15:27.005 --> 01:15:30.890
answer all these questions with
changing one treatment.

01:15:30.890 --> 01:15:40.080
So the benefits of these
cross-cutting treatments is

01:15:40.080 --> 01:15:49.340
you can explicitly test these
more specific questions.

01:15:49.340 --> 01:15:52.690
You can also explicitly test
interactions if you do it in a

01:15:52.690 --> 01:15:56.030
cross-cutting, as opposed to
one experiment here that

01:15:56.030 --> 01:16:00.350
reduces class size, one
experiment here that has

01:16:00.350 --> 01:16:02.000
contract teachers.

01:16:02.000 --> 01:16:05.040
If you do it all in one place,
you can see, does it work

01:16:05.040 --> 01:16:08.640
better if I pile all of them
on together, or if it works

01:16:08.640 --> 01:16:11.860
better with this particular
subgroup?

01:16:11.860 --> 01:16:14.700
And often, you can economize
on data collection because

01:16:14.700 --> 01:16:19.520
you're doing one survey, and
changing different things.

01:16:19.520 --> 01:16:25.200
And as long as you keep a handle
and keep it straight,

01:16:25.200 --> 01:16:28.460
there's a lot of piloting of the
questionnaire, which you

01:16:28.460 --> 01:16:29.520
only have to do once.

01:16:29.520 --> 01:16:33.130
And maybe you only have to stick
one RA and you hope that

01:16:33.130 --> 01:16:36.970
they cover 330 schools instead
of 100, but you only have to

01:16:36.970 --> 01:16:39.820
pay one salary.

01:16:39.820 --> 01:16:46.930
So the problem is that
when you've got a

01:16:46.930 --> 01:16:50.100
cross-cutting design--

01:16:50.100 --> 01:16:58.960
so a cross-cutting is this,
where we're using all of the--

01:16:58.960 --> 01:17:01.150
we're looking at government
teachers versus contract

01:17:01.150 --> 01:17:07.330
teachers across these different
groups, versus

01:17:07.330 --> 01:17:09.510
across these different groups.

01:17:09.510 --> 01:17:16.860
The negative is that I'm looking
at a contract teacher

01:17:16.860 --> 01:17:19.000
versus a government teacher.

01:17:19.000 --> 01:17:23.120
In the background, some of those
classes are streamed

01:17:23.120 --> 01:17:24.370
versus not streamed.

01:17:31.310 --> 01:17:34.570
It should all fall out in the
wash, because these have

01:17:34.570 --> 01:17:35.420
streamed as well.

01:17:35.420 --> 01:17:39.940
So to the extent that streaming
has an effect, it

01:17:39.940 --> 01:17:41.700
has an equal effect
in these two.

01:17:48.590 --> 01:17:51.870
And whenever you do an
evaluation, some of the

01:17:51.870 --> 01:17:54.120
schools that you're working
in will have

01:17:54.120 --> 01:17:56.170
some particular approach.

01:17:56.170 --> 01:17:57.620
Some of the schools you're
working in will have a

01:17:57.620 --> 01:17:58.960
different approach.

01:17:58.960 --> 01:18:02.720
And you're changing across the
schools, say the balance of

01:18:02.720 --> 01:18:06.770
half the schools will be
Catholic schools, and half the

01:18:06.770 --> 01:18:10.420
schools will be government
schools, and that's fine,

01:18:10.420 --> 01:18:12.870
because that's the world.

01:18:12.870 --> 01:18:16.130
Some schools are big, some
schools are small.

01:18:16.130 --> 01:18:20.620
But you're introducing your
own differences in the

01:18:20.620 --> 01:18:23.160
background, in the
average score.

01:18:23.160 --> 01:18:29.030
So if in Kenya, no one else
streamed, no schools streamed,

01:18:29.030 --> 01:18:32.540
then you're testing this in an
environment where over half of

01:18:32.540 --> 01:18:35.330
your schools are doing something
that Kenyan schools

01:18:35.330 --> 01:18:37.530
don't normally do.

01:18:37.530 --> 01:18:41.090
So you're still getting a valid
effect of the difference

01:18:41.090 --> 01:18:45.710
between contract versus
government teachers, but

01:18:45.710 --> 01:18:49.140
you're testing it in an
environment which may be a bit

01:18:49.140 --> 01:18:53.170
different, because you're
also doing other things.

01:18:53.170 --> 01:18:56.880
So I guess what I'm saying is
layering it in this way of

01:18:56.880 --> 01:19:00.070
lots of different things allows
you to answer lots of

01:19:00.070 --> 01:19:03.160
questions with a
smaller budget.

01:19:03.160 --> 01:19:08.230
But you have to remember that
in the end, some of these

01:19:08.230 --> 01:19:11.630
schools don't look like
completely typical Kenyan

01:19:11.630 --> 01:19:13.080
schools, because they've
got streaming.

01:19:13.080 --> 01:19:17.180
And most Kenyan schools
don't stream.

01:19:17.180 --> 01:19:23.020
So you're testing something in
a slightly atypical school.

01:19:23.020 --> 01:19:24.890
Is that important or is
it not important?

01:19:24.890 --> 01:19:27.040
How important a concern
is that to you?

01:19:27.040 --> 01:19:28.990
It's all about trade-offs.

01:19:28.990 --> 01:19:32.910
Stratification, 10 minutes
to do stratification.

01:19:32.910 --> 01:19:36.750
The point of stratifying,
you've done your little

01:19:36.750 --> 01:19:40.540
exercise on stratification
in your cases?

01:19:40.540 --> 01:19:43.456
So you should understand
it all.

01:19:43.456 --> 01:19:44.930
That's fine.

01:19:44.930 --> 01:19:52.540
The point of it is to give
yourself extra confidence,

01:19:52.540 --> 01:19:58.270
extra chance, a higher chance
than normal of making sure

01:19:58.270 --> 01:20:02.770
that your treatment and
comparison groups are the same

01:20:02.770 --> 01:20:06.180
on anything you can measure.

01:20:06.180 --> 01:20:09.560
It's particularly important when
you have a small sample.

01:20:09.560 --> 01:20:12.230
You did your little exercise,
and so your bars go up and

01:20:12.230 --> 01:20:12.990
down, right?

01:20:12.990 --> 01:20:15.740
If you have a really big sample,
it doesn't give you

01:20:15.740 --> 01:20:19.450
that much extra power, extra
benefit, because they're going

01:20:19.450 --> 01:20:20.530
to be balanced anyway.

01:20:20.530 --> 01:20:23.930
Law of large numbers, if you
draw enough times, it's going

01:20:23.930 --> 01:20:25.180
to be balanced anyway.

01:20:27.530 --> 01:20:29.910
What is it?

01:20:29.910 --> 01:20:33.990
Sometimes people get really
stressed out about what am I

01:20:33.990 --> 01:20:35.260
stratifying on?

01:20:35.260 --> 01:20:37.690
That's not the most important
thing that you're going to

01:20:37.690 --> 01:20:40.600
learn here in what
you stratify on.

01:20:40.600 --> 01:20:44.170
I would encourage you to
stratify, but if you can't,

01:20:44.170 --> 01:20:47.330
again, it doesn't bias you, it
doesn't give you the wrong

01:20:47.330 --> 01:20:51.700
answer, but it just helps
on the margin.

01:20:51.700 --> 01:20:55.140
And sometimes that margin, as
they say, we've discovered to

01:20:55.140 --> 01:20:59.330
our great relief, it saved our
bacon when you're really on

01:20:59.330 --> 01:21:00.970
the edge of not having
enough sample.

01:21:04.510 --> 01:21:11.150
So all it is is dividing your
sample into different buckets

01:21:11.150 --> 01:21:13.400
before you do your
randomization.

01:21:13.400 --> 01:21:14.930
That's all it is.

01:21:14.930 --> 01:21:17.040
So it's a very simple concept.

01:21:17.040 --> 01:21:19.060
It's not always simple
to do, but it's

01:21:19.060 --> 01:21:20.520
a very simple concept.

01:21:20.520 --> 01:21:22.920
Instead of putting everyone into
one hat and pulling out

01:21:22.920 --> 01:21:26.140
of one hat, you divide it into
different hats and then draw

01:21:26.140 --> 01:21:28.010
half of each hat.

01:21:28.010 --> 01:21:30.440
That's all we're
talking about.
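
NOTE
Editor's sketch, not from the lecture: one way the "different hats" idea could look in code.
The function name, the school names, and the urban/rural strata are hypothetical.
With an odd number of units in a hat, this sketch puts the extra unit in control.
```python
import random
from collections import defaultdict
def stratified_assign(units, stratum_of, seed=0):
    """Assign half of each stratum to treatment; `units` is the full list."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    hats = defaultdict(list)
    for unit in units:
        hats[stratum_of(unit)].append(unit)  # one "hat" per stratum
    assignment = {}
    for stratum in sorted(hats):
        members = hats[stratum]
        rng.shuffle(members)  # random order within the hat
        half = len(members) // 2  # with an odd count, the extra unit goes to control
        for i, unit in enumerate(members):
            assignment[unit] = "treatment" if i < half else "control"
    return assignment
# Hypothetical example: 6 urban and 4 rural schools
schools = [f"urban_{i}" for i in range(6)] + [f"rural_{i}" for i in range(4)]
result = stratified_assign(schools, lambda name: name.split("_")[0])
treated = sorted(s for s, arm in result.items() if arm == "treatment")
print(treated)
```
Whatever the seed, exactly half of the urban and half of the rural schools end up treated.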

01:21:30.440 --> 01:21:33.010
So you select treatment and
control from each of the

01:21:33.010 --> 01:21:35.060
subgroups, so you make
sure that you're

01:21:35.060 --> 01:21:37.280
balanced on each one.

01:21:37.280 --> 01:21:38.795
So what happens if you
don't stratify?

01:21:41.510 --> 01:21:43.530
What could go wrong if
you don't stratify?

01:21:43.530 --> 01:21:46.994
What's the danger if
you don't stratify?

01:21:46.994 --> 01:21:52.840
AUDIENCE: You're taking a risk
that some particular group

01:21:52.840 --> 01:21:55.830
don't get into the sample.

01:21:55.830 --> 01:21:56.110
PROFESSOR: Right.

01:21:56.110 --> 01:21:59.930
So you're taking a risk that
maybe more of your treatment

01:21:59.930 --> 01:22:04.500
ends up being richer
than your control.

01:22:04.500 --> 01:22:07.180
And that was the whole point
of randomizing, was to make

01:22:07.180 --> 01:22:08.180
sure that they were equal.

01:22:08.180 --> 01:22:11.820
And if you have a big enough
sample, you'll be OK, they

01:22:11.820 --> 01:22:12.840
will be equal.

01:22:12.840 --> 01:22:16.660
But you risk it going wrong.

01:22:20.360 --> 01:22:24.810
So if you do your pull, and even
though you stratified it

01:22:24.810 --> 01:22:30.850
for some, or you can't stratify,
you do your pull,

01:22:30.850 --> 01:22:33.950
and just by chance you get
a really bad draw.

01:22:33.950 --> 01:22:36.760
And when you draw it, your
treatment and comparison look

01:22:36.760 --> 01:22:42.150
very different, I would advise
you to draw it again.

01:22:42.150 --> 01:22:44.320
And then phone us, and
we'll tell you how to

01:22:44.320 --> 01:22:45.590
fix it in the analysis.

01:22:45.590 --> 01:22:47.210
Because you do have to do
something funny with your

01:22:47.210 --> 01:22:48.900
standard errors, and
that's fine.

01:22:48.900 --> 01:22:53.370
But you're really screwed if,
just by chance, you do your

01:22:53.370 --> 01:22:56.930
pull and you end up with all the
rich guys in one treatment

01:22:56.930 --> 01:22:58.530
and all the poor guys
in another.

01:22:58.530 --> 01:22:59.780
Then you're done.

01:23:04.060 --> 01:23:06.340
Then your evaluation isn't
going to work.

01:23:06.340 --> 01:23:07.810
This is statistics, right?

01:23:07.810 --> 01:23:10.340
It's on chance, and by chance
you just might get

01:23:10.340 --> 01:23:13.660
a really bad pull.

01:23:13.660 --> 01:23:16.470
This is stacking the decks
to make sure it doesn't.

01:23:16.470 --> 01:23:20.720
If you do that and you see that
it's just screwed, as I

01:23:20.720 --> 01:23:22.060
say, I would pull again.
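
NOTE
Editor's sketch, not from the lecture: a minimal way to check a draw and pull again if it is badly unbalanced.
The balance rule (gap in baseline means no larger than max_gap) and all names are assumptions.
As the lecture says, re-drawing affects the standard errors, so record the rule and get the analysis checked.
```python
import random
from statistics import mean
def draw_with_balance_check(baseline, max_gap, seed=0, max_tries=1000):
    """Split units in half at random; redraw while the baseline means differ by more than max_gap."""
    rng = random.Random(seed)
    units = sorted(baseline)  # fixed starting order so the result depends only on the seed
    for attempt in range(1, max_tries + 1):
        rng.shuffle(units)
        half = len(units) // 2
        treatment, control = units[:half], units[half:]
        gap = abs(mean(baseline[u] for u in treatment) - mean(baseline[u] for u in control))
        if gap <= max_gap:
            return treatment, control, attempt
    raise RuntimeError("no acceptable draw found; loosen max_gap or stratify instead")
# Hypothetical baseline incomes for 20 households
income_rng = random.Random(1)
incomes = {f"hh_{i:02d}": income_rng.uniform(50, 150) for i in range(20)}
treatment, control, tries = draw_with_balance_check(incomes, max_gap=5.0)
print(tries, len(treatment), len(control))
```
Returning the number of attempts keeps a record of how many pulls were needed.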

01:23:33.050 --> 01:23:34.160
Onto a different thing.

01:23:34.160 --> 01:23:35.750
So when should you stratify?

01:23:39.010 --> 01:23:42.780
You should stratify on the
variables that have an

01:23:42.780 --> 01:23:44.990
important impact on your
outcome variable.

01:23:48.530 --> 01:23:50.910
If you're trying to improve
incomes, then it's income at

01:23:50.910 --> 01:23:51.650
the beginning.

01:23:51.650 --> 01:23:55.040
Or you're trying to improve
education, and it's test

01:23:55.040 --> 01:23:57.010
scores at the beginning,
those are going

01:23:57.010 --> 01:23:58.655
to be critical things.

01:23:58.655 --> 01:24:01.500
If it's something totally
irrelevant, then you don't

01:24:01.500 --> 01:24:04.220
need to stratify it.

01:24:04.220 --> 01:24:07.210
So the most important thing
is your outcome.

01:24:07.210 --> 01:24:08.660
Stratify on subgroups
that you're

01:24:08.660 --> 01:24:10.600
particularly interested in.

01:24:10.600 --> 01:24:13.130
Say you're really interested
in what's the effect of the

01:24:13.130 --> 01:24:19.460
program on the poor, or the
originally low achieving, or

01:24:19.460 --> 01:24:24.870
does this work for the Muslim
minority, make sure you have

01:24:24.870 --> 01:24:26.630
enough Muslims in your treatment
and control,

01:24:26.630 --> 01:24:27.810
otherwise you're not
going to be able

01:24:27.810 --> 01:24:30.050
to answer that question.

01:24:32.830 --> 01:24:36.260
As I say, if you've got a huge
sample, you'll find we're

01:24:36.260 --> 01:24:39.460
virtually never in that
situation, but it's

01:24:39.460 --> 01:24:41.470
particularly important to do
it when you have a small

01:24:41.470 --> 01:24:44.990
sample set, or where your power
is weak, and then it can

01:24:44.990 --> 01:24:47.800
just-- it's not going to help
you enormously, but it just

01:24:47.800 --> 01:24:49.700
might push you over
the threshold to

01:24:49.700 --> 01:24:52.100
getting enough power.

01:24:52.100 --> 01:24:56.400
It starts getting very
complicated to do if you try

01:24:56.400 --> 01:24:58.970
and stratify on everything.

01:24:58.970 --> 01:25:04.735
You're both income and gender,
and you can imagine there may

01:25:04.735 --> 01:25:09.260
be some buckets that don't
have any people in them.

01:25:09.260 --> 01:25:12.350
So it starts getting
very complicated.

01:25:12.350 --> 01:25:14.640
It also makes it less
transparent.

01:25:14.640 --> 01:25:19.950
It's very hard to do a
complicated stratification and

01:25:19.950 --> 01:25:20.880
do a public draw.

01:25:20.880 --> 01:25:23.500
It's very easy to do a
simple one, you just

01:25:23.500 --> 01:25:25.380
literally have two hats.

01:25:25.380 --> 01:25:29.230
Divide it into two hats and then
you do the draw from the

01:25:29.230 --> 01:25:32.860
urban places and do a draw from
the rural, and everyone

01:25:32.860 --> 01:25:33.720
can understand that.

01:25:33.720 --> 01:25:39.230
If you're trying to do urban
and rural and rich and poor

01:25:39.230 --> 01:25:42.370
and whatever, then nobody is
going to understand what

01:25:42.370 --> 01:25:43.930
you're doing with all
these 15 different

01:25:43.930 --> 01:25:46.740
hats, and it's a mess.

01:25:49.960 --> 01:25:54.040
So usually when we stratify,
we do it on the computer.

01:25:54.040 --> 01:25:56.680
Again, if you do a public draw,
you can't redraw if it

01:25:56.680 --> 01:25:59.570
turns out wrong.

01:25:59.570 --> 01:26:01.425
Increasingly we don't
do public draws.

01:26:04.990 --> 01:26:08.150
You can stratify on
index variables.

01:26:08.150 --> 01:26:11.380
You can put lots of things
into an index and then

01:26:11.380 --> 01:26:15.090
stratify on that
if you want to.

01:26:15.090 --> 01:26:18.450
So mechanics, how do
I actually do this?

01:26:18.450 --> 01:26:21.540
This is a comment from earlier
courses, like we talked all

01:26:21.540 --> 01:26:26.410
about it, but how do I
do a random drawing?

01:26:26.410 --> 01:26:29.390
We talk about hats, and you
could do it in a hat, but we

01:26:29.390 --> 01:26:31.640
don't normally do it.

01:26:31.640 --> 01:26:34.940
The first and really important
thing for designing your

01:26:34.940 --> 01:26:40.820
evaluation is you can only do
a random draw from a list.

01:26:40.820 --> 01:26:43.270
You have to start with a list.

01:26:43.270 --> 01:26:46.230
And sometimes that's actually
practically quite hard to do,

01:26:46.230 --> 01:26:47.560
because you don't have
a list of all the

01:26:47.560 --> 01:26:48.810
people in the community.

01:26:51.660 --> 01:26:54.010
You have to go and get a
list, there's really no

01:26:54.010 --> 01:26:56.980
other way to do it.

01:26:56.980 --> 01:27:00.320
People will talk about--

01:27:00.320 --> 01:27:05.640
and people do this when they do
a random draw to interview

01:27:05.640 --> 01:27:08.730
people for doing a survey, to
make sure you get a random

01:27:08.730 --> 01:27:11.220
sample-- you're surveying a
random sample of people in the

01:27:11.220 --> 01:27:14.290
community, they'll talk about
go to the center of the

01:27:14.290 --> 01:27:19.690
village, walk along the street
and take every other person.

01:27:19.690 --> 01:27:22.570
Not a fan of that, because
actually when people have done

01:27:22.570 --> 01:27:27.440
that for us, it's not a random
sample, because they never get

01:27:27.440 --> 01:27:28.320
to the outlying areas.

01:27:28.320 --> 01:27:30.560
And as we all know, the people
who live in the outlying

01:27:30.560 --> 01:27:32.940
parts of the community are very
different from the people

01:27:32.940 --> 01:27:37.600
who live at the center of the
community, so you really

01:27:37.600 --> 01:27:40.620
basically need to get a list.

01:27:40.620 --> 01:27:43.560
And that's expensive.

01:27:43.560 --> 01:27:47.870
But you either use a census,
or the list of kids in the

01:27:47.870 --> 01:27:50.180
school, or a list
of schools from

01:27:50.180 --> 01:27:53.005
the Ministry of Education.

01:27:53.005 --> 01:27:57.220
And that we know can take a lot
of heartache and time to

01:27:57.220 --> 01:28:02.760
persuade them to hand
over those lists, or

01:28:02.760 --> 01:28:05.870
you go and do a census.

01:28:05.870 --> 01:28:11.210
Often we've just had to go and
count people, list people, get

01:28:11.210 --> 01:28:16.860
the eligible people, write a
list, and then draw from that.

01:28:16.860 --> 01:28:17.830
So how can you do it?

01:28:17.830 --> 01:28:22.000
You could literally pull it
out of a hat or a bucket.

01:28:22.000 --> 01:28:24.200
I keep talking about hats and
buckets because it's very easy

01:28:24.200 --> 01:28:26.150
to visualize what
you're doing.

01:28:26.150 --> 01:28:28.890
It's transparent, but it's
time consuming and it's

01:28:28.890 --> 01:28:30.710
complicated if you have
large groups, and

01:28:30.710 --> 01:28:32.850
it's hard to stratify.

01:28:32.850 --> 01:28:35.220
So what do we normally do?

01:28:35.220 --> 01:28:39.220
Well, you could use a random
number generator in a

01:28:39.220 --> 01:28:41.770
spreadsheet program,
and you order the

01:28:41.770 --> 01:28:42.980
observations randomly.
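
NOTE
Editor's sketch, not from the lecture: the spreadsheet approach (a random-number column, then sort) written in Python.
The roster and function name are hypothetical; the list itself still has to come from a census or ministry records.
```python
import random
def assign_by_random_order(names, seed=0):
    """Attach a random number to each name, sort by it, and treat the first half."""
    rng = random.Random(seed)
    keyed = [(rng.random(), name) for name in names]  # like a RAND() column
    keyed.sort()  # order the observations by their random number
    half = len(keyed) // 2
    return [name for _, name in keyed[:half]], [name for _, name in keyed[half:]]
# Hypothetical list of 8 schools obtained from a ministry roster
roster = [f"school_{i}" for i in range(1, 9)]
treatment, control = assign_by_random_order(roster)
print(treatment)
print(control)
```
Keeping the seed on file lets someone else reproduce and double-check the draw.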

01:28:42.980 --> 01:28:46.020
Did you do an exercise where
you actually did this?

01:28:46.020 --> 01:28:48.200
OK, so why am I talking
to you about this?

01:28:48.200 --> 01:28:50.490
So I'm just going to
go through it.

01:28:50.490 --> 01:28:54.130
So you can use Excel to do this,
and it's much better if

01:28:54.130 --> 01:28:59.100
you go over it in your groups
and do it, or you can run a

01:28:59.100 --> 01:29:00.310
Stata program.

01:29:00.310 --> 01:29:02.700
Did you do a Stata program
in your groups?

01:29:02.700 --> 01:29:04.880
No.

01:29:04.880 --> 01:29:08.610
But we can provide
people with--

01:29:08.610 --> 01:29:11.160
so I don't know, how many people
here have used Stata?

01:29:14.620 --> 01:29:19.570
So this is a program, a
statistical program, that

01:29:19.570 --> 01:29:22.650
economists tend to use.

01:29:22.650 --> 01:29:24.480
So the people who
are familiar--

01:29:24.480 --> 01:29:27.850
I wouldn't advise you to do it
this way if you're not used to

01:29:27.850 --> 01:29:28.610
using Stata.

01:29:28.610 --> 01:29:30.886
But if you're used to using
Stata, you know about it, but

01:29:30.886 --> 01:29:34.910
you just haven't done pulling
a random sample from Stata,

01:29:34.910 --> 01:29:38.150
then we can just show you some
code examples and you could

01:29:38.150 --> 01:29:39.790
build off those.

01:29:39.790 --> 01:29:43.380
For people who haven't done
that, it's really perfectly

01:29:43.380 --> 01:29:48.090
possible to do it in Excel, as
long as you don't get too

01:29:48.090 --> 01:29:51.170
complicated in your
stratification.

01:29:51.170 --> 01:29:57.080
We are also, I hope, going to
be setting up a help desk in

01:29:57.080 --> 01:30:02.930
the near future for people who
have attended this course.

01:30:02.930 --> 01:30:05.480
We're always available anyway
for people who've attended the

01:30:05.480 --> 01:30:07.250
course to come and--

01:30:07.250 --> 01:30:10.455
I'm doing this, you remember you
talked about this, and I'm

01:30:10.455 --> 01:30:12.730
stuck because it's a bit more
complicated than I thought I

01:30:12.730 --> 01:30:14.410
understood it.

01:30:14.410 --> 01:30:17.410
But we're going to actually
formalize that process, and

01:30:17.410 --> 01:30:21.770
that would mean that if you need
help actually doing it,

01:30:21.770 --> 01:30:26.270
you can come to us and
ask us, can you just

01:30:26.270 --> 01:30:27.590
check my Stata code?

01:30:27.590 --> 01:30:29.320
Can you check that I'm doing
this right in Excel?

01:30:29.320 --> 01:30:31.210
Because you don't want
to screw that up.

01:30:31.210 --> 01:30:35.340
And we'll have a team of
graduate students or the kind

01:30:35.340 --> 01:30:39.810
of the people who are here TAing
who will go over that

01:30:39.810 --> 01:30:41.710
and make sure that you--

01:30:41.710 --> 01:30:46.750
because we would hate for the
sake of a rusty nail that the

01:30:46.750 --> 01:30:49.210
horseshoe falls off and the
kingdom is lost and stuff.

01:30:49.210 --> 01:30:52.420
If for the sake of getting your
stratification or your

01:30:52.420 --> 01:30:58.120
random pull right, it's worth if
you want someone to have a

01:30:58.120 --> 01:31:01.160
double check of that and make
sure that you're doing it,

01:31:01.160 --> 01:31:02.420
that you're happy with it.

01:31:02.420 --> 01:31:05.450
Because if you get it wrong,
then that invalidates all the

01:31:05.450 --> 01:31:06.225
rest of your work.

01:31:06.225 --> 01:31:06.500
Yeah?

01:31:06.500 --> 01:31:10.140
AUDIENCE: When you stratify,
aren't you compromising on the

01:31:10.140 --> 01:31:11.510
randomness?

01:31:11.510 --> 01:31:16.890
PROFESSOR: No, because within
groups you are randomizing.

01:31:16.890 --> 01:31:21.530
So all you're saying is I
could randomly pull--

01:31:21.530 --> 01:31:26.730
I could take all of this group
and randomly draw some of you

01:31:26.730 --> 01:31:31.830
to come to the dinner tonight,
and some not to.

01:31:31.830 --> 01:31:36.460
Or I could say I want
to make sure that we

01:31:36.460 --> 01:31:37.710
have at least half--

01:31:40.210 --> 01:31:42.090
equivalent numbers of
men and women as

01:31:42.090 --> 01:31:43.820
we have in the course.

01:31:43.820 --> 01:31:46.410
And then I'd get all the men
over on one side, and all the

01:31:46.410 --> 01:31:49.370
women over the other, and I
would draw half of the men and

01:31:49.370 --> 01:31:51.550
half of the women.

01:31:51.550 --> 01:31:53.320
So it's still random.

01:31:53.320 --> 01:31:56.180
There's still equal chance of
getting picked, I'm just

01:31:56.180 --> 01:32:00.190
making sure that I get
proportionate.
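
NOTE
Editor's sketch, not from the lecture: a minimal Python illustration of the stratified draw just described, assuming two strata (men and women) and that half of each stratum is drawn. The names stratified_assign and people, and the seed, are hypothetical.
```python
import random
from collections import defaultdict
#
def stratified_assign(units, stratum_of, seed=0):
    """Return {unit: True if drawn}, drawing half of each stratum at random."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for unit in units:
        groups[stratum_of[unit]].append(unit)
    drawn = {}
    for stratum in sorted(groups):
        members = groups[stratum]
        rng.shuffle(members)           # the random step happens inside each stratum
        half = len(members) // 2       # an odd-sized stratum rounds down (an assumption)
        for i, unit in enumerate(members):
            drawn[unit] = i < half
    return drawn
#
# Hypothetical class of 6 men and 4 women.
people = {f"m{i}": "man" for i in range(6)}
people.update({f"w{i}": "woman" for i in range(4)})
assignment = stratified_assign(list(people), people, seed=42)
men_drawn = sum(assignment[u] for u in people if people[u] == "man")
women_drawn = sum(assignment[u] for u in people if people[u] == "woman")
print(men_drawn, women_drawn)  # 3 2 for any seed
```
Because the shuffle happens inside each stratum, every person still has the same chance of being drawn; only the split by stratum is fixed in advance.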

01:32:03.100 --> 01:32:10.170
So in the microfinance example
where I said we got extra

01:32:10.170 --> 01:32:14.170
power, we got communities,
and we linked them up, we

01:32:14.170 --> 01:32:18.010
paired them as close as possible
to each other.

01:32:18.010 --> 01:32:23.500
And then within pairs, one was
treatment and one was control.

01:32:23.500 --> 01:32:26.800
So we always knew that there was
somebody who looked almost

01:32:26.800 --> 01:32:30.560
identical to that community
in the other bucket.

01:32:30.560 --> 01:32:33.960
There was always a control that
was almost identical, and

01:32:33.960 --> 01:32:39.090
that meant that we just got
less variation than

01:32:39.090 --> 01:32:39.840
we would have had otherwise.

01:32:39.840 --> 01:32:41.430
Less variation
gives you

01:32:41.430 --> 01:32:43.000
more statistical power.

01:32:43.000 --> 01:32:46.310
But everyone had an equal chance
of being in, we're just

01:32:46.310 --> 01:32:50.080
stacking the decks to make
sure that they were as

01:32:50.080 --> 01:32:51.310
comparable as possible.

01:32:51.310 --> 01:32:54.420
Because when you draw randomly,
by chance they could

01:32:54.420 --> 01:32:55.850
not be comparable.

01:32:55.850 --> 01:33:02.960
But if you literally pair them
and switch one versus another,

01:33:02.960 --> 01:33:07.840
you are sure that there'll be a
control that looks very much

01:33:07.840 --> 01:33:09.230
like the treatment.

01:33:09.230 --> 01:33:13.470
And you don't get the chance
case that you just happened to

01:33:13.470 --> 01:33:16.410
get all the rich communities in
the treatment and all the

01:33:16.410 --> 01:33:20.240
poor communities
in the control.
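
NOTE
Editor's sketch, not from the lecture: one simple way to pair communities and randomize within pairs. The lecture does not say how the pairs were formed; sorting on a single baseline value and pairing neighbours is an assumption, and pair_and_assign and the baseline numbers are hypothetical.
```python
import random
#
def pair_and_assign(baseline, seed=0):
    """baseline: {community: baseline value}. Returns (pairs, set of treated communities)."""
    rng = random.Random(seed)
    ordered = sorted(baseline, key=baseline.get)   # similar communities end up adjacent
    # Pair neighbours; with an odd count the last community is left unpaired.
    pairs = [(ordered[i], ordered[i + 1]) for i in range(0, len(ordered) - 1, 2)]
    treated = {rng.choice(pair) for pair in pairs}  # coin flip inside each pair
    return pairs, treated
#
baseline = {"A": 10, "B": 52, "C": 11, "D": 50, "E": 30, "F": 31}
pairs, treated = pair_and_assign(baseline, seed=1)
for a, b in pairs:
    print(a, b, "treated:", a if a in treated else b)
```
Every community has a one-in-two chance of treatment, and each treated community has a control with a nearly identical baseline value, so an all-rich treatment group cannot happen by chance.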

01:33:20.240 --> 01:33:27.270
With the stratifying, you have
to take that into account when

01:33:27.270 --> 01:33:28.770
you do your analysis.

01:33:28.770 --> 01:33:32.730
So you just put in a dummy
variable, those people who are

01:33:32.730 --> 01:33:34.510
actually going to run the
regression at the end, you

01:33:34.510 --> 01:33:36.620
need to put in a dummy
variable for each

01:33:36.620 --> 01:33:38.500
stratum that you used.
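
NOTE
Editor's sketch, not from the lecture: a pure-Python illustration of a regression with a dummy for each stratum. It uses the fact that the OLS coefficient on treatment with stratum dummies equals the slope after subtracting stratum means from both outcome and treatment. The function name and the data are hypothetical.
```python
from collections import defaultdict
#
def effect_with_stratum_dummies(rows):
    """rows: list of (stratum, treated 0/1, outcome). Returns the coefficient on treated."""
    by_stratum = defaultdict(list)
    for stratum, d, y in rows:
        by_stratum[stratum].append((d, y))
    num = den = 0.0
    for members in by_stratum.values():
        d_bar = sum(d for d, _ in members) / len(members)
        y_bar = sum(y for _, y in members) / len(members)
        for d, y in members:
            # Demeaning within the stratum plays the role of the stratum dummy.
            num += (d - d_bar) * (y - y_bar)
            den += (d - d_bar) ** 2
    return num / den
#
# Made-up data: rich and poor strata differ a lot in level, treatment adds 2 in both.
rows = [("rich", 1, 12.0), ("rich", 1, 12.0), ("rich", 0, 10.0), ("rich", 0, 10.0),
        ("poor", 1, 4.0), ("poor", 1, 4.0), ("poor", 0, 2.0), ("poor", 0, 2.0)]
estimate = effect_with_stratum_dummies(rows)
print(estimate)  # 2.0
```
The large gap between rich and poor strata is absorbed by the stratum means, so it does not add noise to the treatment comparison.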

01:33:38.500 --> 01:33:40.960
AUDIENCE: What's the cost
of stratification?

01:33:40.960 --> 01:33:44.520
PROFESSOR: So that
is the one cost.

01:33:44.520 --> 01:33:47.650
You have to put in a dummy
in every case.

01:33:47.650 --> 01:33:50.260
So as I say, it gets quite
complicated to do it

01:33:50.260 --> 01:33:51.130
when you have lots.

01:33:51.130 --> 01:33:51.700
The main--

01:33:51.700 --> 01:33:53.330
AUDIENCE: And you cut
a degree of freedom?

01:33:53.330 --> 01:33:54.900
PROFESSOR: Yeah, you lose
a degree of freedom.

01:33:54.900 --> 01:33:56.170
AUDIENCE: You lose power?

01:33:56.170 --> 01:33:57.910
PROFESSOR: Yes.

01:33:57.910 --> 01:34:02.990
So there's been a long esoteric
debate in the

01:34:02.990 --> 01:34:06.280
econometrics, amongst the
serious econometricians over

01:34:06.280 --> 01:34:10.440
the last year about whether you
can decide not to put the

01:34:10.440 --> 01:34:11.440
dummy in the end.

01:34:11.440 --> 01:34:14.830
I think the conclusion is that
you have to decide in advance

01:34:14.830 --> 01:34:15.790
whether you're going
to put your dummy in.

01:34:15.790 --> 01:34:18.510
So basically, you just have to
put your dummy variables in.

01:34:18.510 --> 01:34:21.800
So you do lose a degree
of freedom.

01:34:21.800 --> 01:34:24.650
The argument was, maybe then
you could be worse off.

01:34:24.650 --> 01:34:27.120
Someone did a whole bunch
of simulations.

01:34:27.120 --> 01:34:30.220
They proved that you could
potentially be worse off.

01:34:30.220 --> 01:34:34.310
Nobody has been able, even in a
simulation, to come up with

01:34:34.310 --> 01:34:37.570
an example where you were
actually worse off,

01:34:37.570 --> 01:34:43.480
where you lost more power
than you gained.

01:34:43.480 --> 01:34:46.960
But if you remember my advice
to you was to do it on the

01:34:46.960 --> 01:34:50.180
ones that are likely to be
important, and that is because

01:34:50.180 --> 01:34:52.617
potentially there is a
theoretical possibility you

01:34:52.617 --> 01:34:55.150
could over-stratify, because you
could lose some degrees of

01:34:55.150 --> 01:34:57.690
freedom at the end, because you
have to put in a dummy for

01:34:57.690 --> 01:35:01.860
something that didn't actually
help you reduce variance.
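
NOTE
Editor's sketch, not from the lecture: the degrees-of-freedom cost written out, assuming an OLS regression of the outcome on an intercept, a treatment indicator, and one dummy per stratum except a reference stratum. The symbols N (units), S (strata) and P (pairs) are assumptions.
```latex
\mathrm{df}_{\text{no dummies}} = N - 2, \qquad
\mathrm{df}_{\text{with dummies}} = N - 2 - (S - 1) = N - S - 1
```
In the extreme case of matched pairs, N = 2P and S = P, so the residual degrees of freedom fall from 2P - 2 to P - 1. That loss only hurts if the stratifying variable does little to reduce variance.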

01:35:01.860 --> 01:35:04.820
But as I say, even someone who
is trying to prove this case

01:35:04.820 --> 01:35:07.920
did a bunch of simulations and
couldn't find a single example

01:35:07.920 --> 01:35:09.530
where they actually
ended up with less

01:35:09.530 --> 01:35:11.550
power through stratifying.

01:35:11.550 --> 01:35:15.320
So my gut from this, and
especially after having done

01:35:15.320 --> 01:35:20.490
the microfinance one was
on the whole, I would--

01:35:20.490 --> 01:35:23.560
now the main constraint is you
normally don't have a lot of

01:35:23.560 --> 01:35:25.590
information before you start.

01:35:25.590 --> 01:35:28.110
If you've done your baseline
before you randomize, you have

01:35:28.110 --> 01:35:29.290
a ton of information.

01:35:29.290 --> 01:35:30.760
Then you can stratify
on anything

01:35:30.760 --> 01:35:31.445
you have in the baseline.

01:35:31.445 --> 01:35:35.410
If you randomize before your
baseline, you probably don't

01:35:35.410 --> 01:35:37.030
know very much about
these communities,

01:35:37.030 --> 01:35:38.090
and it's pretty hard.

01:35:38.090 --> 01:35:39.890
But if you're randomizing
after you've done your

01:35:39.890 --> 01:35:46.900
baseline, and had it entered and
cleaned, basically in order to

01:35:46.900 --> 01:35:49.690
do that, you have to delay
your implementation for

01:35:49.690 --> 01:35:52.000
several months after you've
done your baseline.

01:35:52.000 --> 01:35:54.090
If you want to do your
implementation immediately

01:35:54.090 --> 01:35:56.280
after you've done your baseline,
you're probably not going to

01:35:56.280 --> 01:35:58.870
have that many variables
entered.

01:35:58.870 --> 01:36:01.550
And then you just have to
stratify on what variables you

01:36:01.550 --> 01:36:03.240
have entered or cleaned
or whatever.

01:36:03.240 --> 01:36:09.540
So that's usually the
constraint, that you don't

01:36:09.540 --> 01:36:14.330
have much information on
which to stratify.

01:36:14.330 --> 01:36:16.428
That's really what it comes
down to usually.

01:36:20.820 --> 01:36:24.909
OK, I went over a bit,
but any questions?

01:36:24.909 --> 01:36:25.382
Yeah?

01:36:25.382 --> 01:36:27.115
AUDIENCE: I just have one
question-- more of a

01:36:27.115 --> 01:36:30.112
clarification-- but it goes
back to the questions--

01:36:30.112 --> 01:36:33.610
the different levels of
questions that you asked.

01:36:33.610 --> 01:36:36.974
So were those questions that the
client came to you with?

01:36:36.974 --> 01:36:40.160
Or did the client come to you
and say, we have a program, we

01:36:40.160 --> 01:36:42.260
want to know if it works?

01:36:42.260 --> 01:36:44.900
And then as you were talking
with them, the different

01:36:44.900 --> 01:36:47.030
layers came out because
you realized you're

01:36:47.030 --> 01:36:49.440
doing all this stuff?

01:36:49.440 --> 01:36:52.240
PROFESSOR: So if you remember
where I talked about in terms

01:36:52.240 --> 01:36:54.720
of when do you do an evaluation,
I talked about

01:36:54.720 --> 01:36:57.210
different possibilities.

01:36:57.210 --> 01:37:02.100
And one was you have a program,
that you want to test

01:37:02.100 --> 01:37:05.660
the program, and that was
the Balsakhi case.

01:37:05.660 --> 01:37:08.940
And they had this program, they
wanted to test it, and

01:37:08.940 --> 01:37:14.720
the researchers used the fact
that the program did some

01:37:14.720 --> 01:37:17.480
things, and that some of the
things affected some kids and

01:37:17.480 --> 01:37:21.260
not other kids to tease out some
interesting questions.

01:37:21.260 --> 01:37:25.200
In the extra teacher provision
case, it was more like where I

01:37:25.200 --> 01:37:27.620
said, you want to know
some questions?

01:37:27.620 --> 01:37:31.280
Set up a field site and just go
experiment the hell out of

01:37:31.280 --> 01:37:34.280
it so that you can find out what
the hell's going on, and

01:37:34.280 --> 01:37:37.350
you understand these deep
parameters that then you could

01:37:37.350 --> 01:37:38.970
go off and design.

01:37:38.970 --> 01:37:42.300
So the extra teacher program
was really that.

01:37:42.300 --> 01:37:47.370
It was saying there are these
fundamental questions in

01:37:47.370 --> 01:37:50.980
education that we don't
know the answer to.

01:37:50.980 --> 01:37:54.400
And we're going off the
literature in the US where

01:37:54.400 --> 01:37:56.350
they've done lots of
experiments, but we don't know

01:37:56.350 --> 01:37:59.440
if that's relevant to
these countries.

01:37:59.440 --> 01:38:04.790
So those are important questions
to lots of people

01:38:04.790 --> 01:38:05.980
around the world.

01:38:05.980 --> 01:38:07.880
Let's find out what
the answers to--

01:38:10.490 --> 01:38:12.230
it wasn't really a very--

01:38:12.230 --> 01:38:17.440
it was an NGO program, but it
was mainly done for these

01:38:17.440 --> 01:38:20.520
research purposes, and these
more general policy purposes.

01:38:20.520 --> 01:38:24.630
And it was paid for not by the
NGO, the evaluation wasn't

01:38:24.630 --> 01:38:26.570
paid for by the NGO.

01:38:26.570 --> 01:38:31.050
But it was, I think, relevant.

01:38:31.050 --> 01:38:34.690
As I say, it is a relevant
policy package, because the

01:38:34.690 --> 01:38:37.610
reason they did it was not just
of academic interest to

01:38:37.610 --> 01:38:40.310
see does class size matter,
but governments around the

01:38:40.310 --> 01:38:43.940
world are worrying about whether
to hire more teachers,

01:38:43.940 --> 01:38:46.710
and should they hire them as
government teachers or

01:38:46.710 --> 01:38:47.960
contract teachers?

01:38:50.130 --> 01:38:52.760
There isn't actually much
discussion in the policy world

01:38:52.760 --> 01:38:55.090
about the streaming.

01:38:55.090 --> 01:38:59.800
But given our other findings
about how the curriculum is so

01:38:59.800 --> 01:39:03.240
over the head of other kids,
and the tentative findings

01:39:03.240 --> 01:39:07.780
from Balsakhi that it really
looked like focusing--

01:39:07.780 --> 01:39:10.120
the lower performing kids seemed
to do better being

01:39:10.120 --> 01:39:13.640
pulled out, that was the
motivation for saying, well,

01:39:13.640 --> 01:39:15.870
actually maybe this is something
that's relevant in

01:39:15.870 --> 01:39:16.480
other cases.

01:39:16.480 --> 01:39:19.440
And I think it's more that that
then raises the question,

01:39:19.440 --> 01:39:22.480
and then people are starting to
think about it, rather than

01:39:22.480 --> 01:39:26.780
that streaming really was a
big question in the policy

01:39:26.780 --> 01:39:28.718
context of Kenya at the time.

01:39:35.010 --> 01:39:38.240
And the ETP one, I should say,
that's probably one of our

01:39:38.240 --> 01:39:40.460
most complicated.

01:39:40.460 --> 01:39:44.220
Don't look at that complicated
diagram of all the different

01:39:44.220 --> 01:39:47.350
things that were being tested
and multiple treatments.

01:39:47.350 --> 01:39:50.910
That's kind of one of the most
complicated designs we've ever

01:39:50.910 --> 01:39:55.930
done, so that's not like we're
expecting everyone to go out

01:39:55.930 --> 01:39:58.150
of here, or go off and do
something like that.

01:39:58.150 --> 01:40:02.240
But it was just kind of the
extreme of this is the

01:40:02.240 --> 01:40:05.306
potential that you could
potentially get at.

01:40:05.306 --> 01:40:06.556
OK, thanks.