WEBVTT

00:00:00.040 --> 00:00:02.390
The following content is
provided under a Creative

00:00:02.390 --> 00:00:03.680
Commons license.

00:00:03.680 --> 00:00:06.640
Your support will help MIT
OpenCourseWare continue to

00:00:06.640 --> 00:00:09.980
offer high quality educational
resources for free.

00:00:09.980 --> 00:00:12.820
To make a donation or to view
additional materials from

00:00:12.820 --> 00:00:15.246
hundreds of MIT courses, visit
MIT OpenCourseWare at

00:00:15.246 --> 00:00:16.496
ocw.mit.edu.

00:00:22.070 --> 00:00:26.270
So for the course, we have
four days of lectures.

00:00:26.270 --> 00:00:30.530
Today we'll try to convince
you that it was actually a

00:00:30.530 --> 00:00:35.340
good idea to come here, why
randomized evaluation is such

00:00:35.340 --> 00:00:39.470
a useful tool and why it's
superior to many other kinds

00:00:39.470 --> 00:00:41.830
of impact evaluation.

00:00:41.830 --> 00:00:44.190
Once we've convinced you that
it's a good idea to come here,

00:00:44.190 --> 00:00:46.940
then we'll start going through
the nuts and bolts of actually

00:00:46.940 --> 00:00:50.410
how to run randomized
evaluations.

00:00:50.410 --> 00:00:56.210
Tomorrow we'll go over some
of the general design

00:00:56.210 --> 00:00:57.830
possibilities.

00:00:57.830 --> 00:01:00.840
The following day we'll go
into some more of the

00:01:00.840 --> 00:01:04.680
technical aspects like sample
size and measurement.

00:01:04.680 --> 00:01:09.750
The last day we'll kind of
discuss the fact that even in

00:01:09.750 --> 00:01:12.280
randomized evaluations, things
can go wrong, and how

00:01:12.280 --> 00:01:13.650
you deal with that.

00:01:13.650 --> 00:01:17.520
And throughout the entire course
as you're learning

00:01:17.520 --> 00:01:23.250
this, you'll be incorporating
what you learn by designing in

00:01:23.250 --> 00:01:26.860
step with the lectures your own
randomized evaluation in

00:01:26.860 --> 00:01:28.300
the groups that you've
been preassigned.

00:01:28.300 --> 00:01:30.090
So if you check out your name
tags, you'll see that you have

00:01:30.090 --> 00:01:31.950
a group number.

00:01:31.950 --> 00:01:35.590
Find other people with the
same color and number and

00:01:35.590 --> 00:01:36.680
those will be your
group members.

00:01:36.680 --> 00:01:40.480
We tried to put you together
in ways that made sense.

00:01:40.480 --> 00:01:43.460
So we tried to have people who
were interested in agriculture

00:01:43.460 --> 00:01:47.150
work with other people
within agriculture.

00:01:47.150 --> 00:01:51.360
And over the course of the four
days, you'll be designing

00:01:51.360 --> 00:01:54.650
your own randomized
evaluation.

00:01:54.650 --> 00:01:58.830
And on the last day, on
Saturday, you will be

00:01:58.830 --> 00:02:03.040
presenting your evaluation
to the entire group.

00:02:03.040 --> 00:02:05.470
So that is pretty much
what to expect over

00:02:05.470 --> 00:02:06.230
the next five days.

00:02:06.230 --> 00:02:09.180
Have I forgotten anything?

00:02:09.180 --> 00:02:12.140
Then let me reintroduce
Rachel who will start

00:02:12.140 --> 00:02:13.390
five minutes early.

00:02:17.880 --> 00:02:20.480
RACHEL GLENNERSTER: The group work
that Mark was talking about is

00:02:20.480 --> 00:02:22.710
a really integral part
of the course.

00:02:22.710 --> 00:02:25.970
So unfortunately that is not
something that you'll be able

00:02:25.970 --> 00:02:27.400
to get online.

00:02:27.400 --> 00:02:32.610
But hopefully it's a chance for
you to go through, each

00:02:32.610 --> 00:02:37.900
time we present an idea in the
lecture, you then need to go

00:02:37.900 --> 00:02:46.840
and apply it in the case that
you're developing in your

00:02:46.840 --> 00:02:50.840
evaluation you're developing
in your group.

00:02:50.840 --> 00:02:53.430
And that's why all the teaching
assistants are here,

00:02:53.430 --> 00:03:00.360
to help you go through the
case studies, but also to

00:03:00.360 --> 00:03:03.810
develop your own evaluations.

00:03:03.810 --> 00:03:12.120
So I'm going to start with the
general question of what is an

00:03:12.120 --> 00:03:15.000
impact evaluation and when
should we do one.

00:03:21.960 --> 00:03:24.590
One of the objectives of this
lecture is just to make sure

00:03:24.590 --> 00:03:28.390
that we're all on the same page
when we start using terms

00:03:28.390 --> 00:03:31.720
like process evaluation
and impact evaluation.

00:03:31.720 --> 00:03:36.420
Because I realize the more I
spend time with people who are

00:03:36.420 --> 00:03:38.810
kind of professional evaluators,
I realize that

00:03:38.810 --> 00:03:42.620
economists and professional
evaluators use the same terms

00:03:42.620 --> 00:03:46.300
but to mean different things,
which is incredibly confusing.

00:03:46.300 --> 00:03:50.210
So I'm sure a lot of this will
be familiar to you, but on the

00:03:50.210 --> 00:03:54.020
other hand, we need to make sure
that everyone is at the

00:03:54.020 --> 00:03:57.270
same level in using the terms in
the same way before we kind

00:03:57.270 --> 00:04:00.000
of head into the
nuts and bolts.

00:04:00.000 --> 00:04:05.380
But I also have incorporated in
this discussion about when

00:04:05.380 --> 00:04:07.670
you should do an impact
evaluation, which is something

00:04:07.670 --> 00:04:13.480
that comes up an awful lot
when I go and talk to

00:04:13.480 --> 00:04:17.769
organizations who are trying
to think through their

00:04:17.769 --> 00:04:21.959
evaluation strategy.

00:04:21.959 --> 00:04:25.270
They've heard that they ought
to be doing more impact

00:04:25.270 --> 00:04:26.220
evaluations.

00:04:26.220 --> 00:04:28.270
There's lots of focus on this.

00:04:28.270 --> 00:04:29.480
But they're expensive.

00:04:29.480 --> 00:04:34.490
So how do they decide which of
their programs to evaluate and

00:04:34.490 --> 00:04:36.480
at what stage to evaluate it.

00:04:36.480 --> 00:04:39.170
They're getting pressure
from donors to do this.

00:04:39.170 --> 00:04:41.890
But they're not quite sure
when it's appropriate.

00:04:41.890 --> 00:04:44.310
So we'll try and cover
those ideas.

00:04:44.310 --> 00:04:49.220
So we'll start with why is it
that we here at J-PAL are

00:04:49.220 --> 00:04:50.730
focused on impact evaluation.

00:04:50.730 --> 00:04:53.030
Because there's lots of other
things in evaluating our

00:04:53.030 --> 00:04:55.160
programs that are important.

00:04:55.160 --> 00:04:57.960
But we just do impact
evaluation.

00:04:57.960 --> 00:05:01.230
We also only do randomized
impact evaluation.

00:05:01.230 --> 00:05:03.300
And that's not to say that's
the only thing

00:05:03.300 --> 00:05:04.160
that's worth doing.

00:05:04.160 --> 00:05:05.280
We certainly don't think that.

00:05:05.280 --> 00:05:07.090
That's what we do.

00:05:07.090 --> 00:05:08.400
There's a reason we
do it, because

00:05:08.400 --> 00:05:09.500
we think it's important.

00:05:09.500 --> 00:05:12.420
But it's certainly not the only
thing that you should be

00:05:12.420 --> 00:05:14.440
doing in your organizations.

00:05:14.440 --> 00:05:19.670
So step back and look at the
objectives of evaluation and a

00:05:19.670 --> 00:05:25.030
model of change, which is very
important in terms of how to

00:05:25.030 --> 00:05:29.410
think about your evaluation,
how to design it, different

00:05:29.410 --> 00:05:33.350
types of evaluation, how
evaluation feeds into cost

00:05:33.350 --> 00:05:37.440
benefit analysis, and then
into why to do an impact

00:05:37.440 --> 00:05:39.910
evaluation, and putting
it all together into

00:05:39.910 --> 00:05:42.340
an evaluation strategy.

00:05:42.340 --> 00:05:45.160
And then coming back
to how do we learn.

00:05:45.160 --> 00:05:50.880
How do we make an organization
that learns from its

00:05:50.880 --> 00:05:53.960
evaluation strategy rather
than just doing this as

00:05:53.960 --> 00:05:58.040
something a funder wants
me to do or I have to

00:05:58.040 --> 00:05:59.240
do to tick a box.

00:05:59.240 --> 00:06:03.700
How do I develop an organization
that learns from

00:06:03.700 --> 00:06:08.300
its evaluations and makes it
a better organization?

00:06:08.300 --> 00:06:13.240
So this is the motivation
for what we do.

00:06:17.300 --> 00:06:21.610
And I think this point is sort
of the main point here.

00:06:21.610 --> 00:06:24.860
If you step back and you think
about how much evidence we

00:06:24.860 --> 00:06:31.970
have in development, to make the
decisions that we need to

00:06:31.970 --> 00:06:35.340
make, it's really quite
appalling how little

00:06:35.340 --> 00:06:36.590
information we have.

00:06:39.780 --> 00:06:42.640
If you think about some of the
biggest challenges in the

00:06:42.640 --> 00:06:48.330
world in development about how
to prevent the spread of

00:06:48.330 --> 00:06:53.210
HIV/AIDS in Sub-Saharan Africa,
how to improve the

00:06:53.210 --> 00:06:58.330
productivity of small farmers
across the world, it's really

00:06:58.330 --> 00:07:02.570
amazing how little really
rigorous evidence we have to

00:07:02.570 --> 00:07:04.180
make those decisions.

00:07:04.180 --> 00:07:08.130
And we may know that this
project may work or that

00:07:08.130 --> 00:07:13.760
project may work, but we very
rarely know what is the most

00:07:13.760 --> 00:07:17.270
cost-effective place to put
a dollar that I have.

00:07:17.270 --> 00:07:22.690
If I'm choosing in HIV
prevention, if I've got to

00:07:22.690 --> 00:07:26.930
choose between a lot of
different seemingly great

00:07:26.930 --> 00:07:31.610
projects, what is the project
that's going to give me the

00:07:31.610 --> 00:07:33.580
most bang for my buck?

00:07:33.580 --> 00:07:36.990
And we really don't have that
kind of consistent rigorous

00:07:36.990 --> 00:07:42.250
impact evaluation data in order
to make those decisions.

00:07:42.250 --> 00:07:46.240
And that was really the reason
why J-PAL was started, because

00:07:46.240 --> 00:07:49.170
of the feeling that we could
do so much better if we had

00:07:49.170 --> 00:07:50.920
that kind of data.

00:07:50.920 --> 00:07:54.980
And it's also too often the
case that decisions about

00:07:54.980 --> 00:08:02.490
development are based on emotion
rather than data.

00:08:02.490 --> 00:08:05.060
You can see this in proposals
that people write and the

00:08:05.060 --> 00:08:08.570
discussions that people have,
very compelling, personal

00:08:08.570 --> 00:08:12.640
stories, which are important,
but aren't really what we

00:08:12.640 --> 00:08:15.960
should be making all
our decisions on.

00:08:15.960 --> 00:08:18.620
That may be very motivating
to get people involved.

00:08:18.620 --> 00:08:21.040
But when you're talking about
trade-offs, you've got to have

00:08:21.040 --> 00:08:24.500
a lot more rigorous evidence.

00:08:29.650 --> 00:08:33.299
If we had that kind of evidence,
we could be a lot

00:08:33.299 --> 00:08:38.000
more effective with the
money that we have.

00:08:38.000 --> 00:08:41.620
I also think it's true,
sometimes people say oh, well

00:08:41.620 --> 00:08:45.190
you're just talking about
passing a dollar and spending

00:08:45.190 --> 00:08:47.690
it in a slightly more marginally
effective way.

00:08:47.690 --> 00:08:50.570
But what we really need
is more money going

00:08:50.570 --> 00:08:52.670
into poverty relief.

00:08:52.670 --> 00:08:56.660
But arguably, potentially one of
the most important ways to

00:08:56.660 --> 00:08:59.620
get more money to go into
poverty relief is to convince

00:08:59.620 --> 00:09:02.170
people that the money that's
going in is actually used

00:09:02.170 --> 00:09:03.240
effectively.

00:09:03.240 --> 00:09:05.400
So I don't see these
as either/or.

00:09:05.400 --> 00:09:08.600
Using the money effectively
and raising more money, I

00:09:08.600 --> 00:09:14.040
think, both can come from
having more evidence.

00:09:14.040 --> 00:09:20.080
So it's also important, I think,
in a way, to move from

00:09:20.080 --> 00:09:24.070
what I think is a very damaging
and nonconstructive

00:09:24.070 --> 00:09:26.740
debate between the aid
optimists and the aid

00:09:26.740 --> 00:09:28.010
pessimists.

00:09:28.010 --> 00:09:34.010
It's a very kind of polarized
debate with Jeff Sachs on one

00:09:34.010 --> 00:09:36.140
side and Bill Easterly
on the other.

00:09:36.140 --> 00:09:39.420
This is a quote from Jeff Sachs:
"I've identified the

00:09:39.420 --> 00:09:42.370
specific investments
that are needed"--

00:09:42.370 --> 00:09:43.860
from the previous sentence,
you know that

00:09:43.860 --> 00:09:45.530
this is to end poverty--

00:09:45.530 --> 00:09:48.240
"found ways to plan and
implement them and show that

00:09:48.240 --> 00:09:49.880
they can be affordable."

00:09:49.880 --> 00:09:52.120
Now if you think we
know everything

00:09:52.120 --> 00:09:54.530
about development already--

00:09:54.530 --> 00:09:57.350
we know what's needed, we know
how to implement it-- then,

00:09:57.350 --> 00:09:59.980
kind of, this is the wrong
course for you.

00:09:59.980 --> 00:10:03.120
But I think most of us would
agree that that's slightly

00:10:03.120 --> 00:10:05.850
overstating how much information
we have about how

00:10:05.850 --> 00:10:06.910
to end poverty.

00:10:06.910 --> 00:10:12.050
There's a lot more questions out
there than that suggests.

00:10:12.050 --> 00:10:15.120
His argument is, but we have
to get people motivated.

00:10:15.120 --> 00:10:17.650
So we have got to say that
we know everything.

00:10:17.650 --> 00:10:21.060
I don't think we have to be
quite that optimistic.

00:10:21.060 --> 00:10:24.530
On the other hand, I think this
is way too pessimistic.

00:10:24.530 --> 00:10:28.580
After $2.3 trillion over five
decades, why are the desperate

00:10:28.580 --> 00:10:31.530
needs of the world's poor still
so tragically unmet?

00:10:31.530 --> 00:10:34.690
Isn't it finally time to end the
impunity of foreign aid?

00:10:34.690 --> 00:10:38.300
So Bill Easterly is kind of
saying, oh, it has not worked.

00:10:38.300 --> 00:10:40.620
So let's throw it all away.

00:10:40.620 --> 00:10:43.490
We've got to find a middle
ground here.

00:10:43.490 --> 00:10:48.890
It's just about time that
we have a development.

00:10:48.890 --> 00:10:50.540
And they're talking about aid.

00:10:50.540 --> 00:10:52.500
And I would argue it's
much more about

00:10:52.500 --> 00:10:53.990
development than aid.

00:10:53.990 --> 00:10:57.440
Aid is only a small fraction of
the money that's spent on

00:10:57.440 --> 00:11:00.870
reducing poverty in developing
countries.

00:11:00.870 --> 00:11:05.220
Development pessimism
is just as bad.

00:11:05.220 --> 00:11:09.540
We've got to think more
strategically about not just

00:11:09.540 --> 00:11:13.720
that all aid is bad or
development funding is wasted,

00:11:13.720 --> 00:11:19.320
but how do we focus the money
on the right things.

00:11:19.320 --> 00:11:22.160
So that's kind of
the motivation

00:11:22.160 --> 00:11:24.320
for what we're doing.

00:11:24.320 --> 00:11:28.540
But if you think on a very
grand scale, but thinking

00:11:28.540 --> 00:11:33.080
about the objective of
evaluation in general, you can

00:11:33.080 --> 00:11:34.210
think of it as three things.

00:11:34.210 --> 00:11:39.560
Accountability, did we do what
we said we were going to do?

00:11:39.560 --> 00:11:44.850
And again, this is true of aid
agencies, NGOs, government.

00:11:44.850 --> 00:11:48.160
And did we have a positive
impact on people's lives?

00:11:48.160 --> 00:11:52.230
So those are two different
aspects of accountability that

00:11:52.230 --> 00:11:57.220
evaluation needs to speak to.

00:11:57.220 --> 00:11:59.890
Evaluation isn't only about
accountability though.

00:11:59.890 --> 00:12:04.910
I think it's very importantly
about lesson learning so we do

00:12:04.910 --> 00:12:06.800
better in the future.

00:12:06.800 --> 00:12:09.400
And that's about does a
particular program work or

00:12:09.400 --> 00:12:12.540
not, and what's the most
effective route to achieve a

00:12:12.540 --> 00:12:14.600
certain outcome?

00:12:14.600 --> 00:12:15.940
Are there similarities?

00:12:15.940 --> 00:12:19.540
Are there lessons that you can
learn across projects?

00:12:19.540 --> 00:12:23.030
Are there similarities about
what we're finding in the

00:12:23.030 --> 00:12:27.480
evaluation of this project
and that project?

00:12:27.480 --> 00:12:31.200
For example, are there
ways that you're learning

00:12:31.200 --> 00:12:34.310
about how to change people's
behavior in health, and

00:12:34.310 --> 00:12:36.300
agriculture, and education?

00:12:36.300 --> 00:12:39.060
Are there similarities, sort of
underlying principles, that

00:12:39.060 --> 00:12:44.440
we're learning about that we can
use in different contexts?

00:12:44.440 --> 00:12:47.420
And ultimately, to reduce
poverty through more effective

00:12:47.420 --> 00:12:51.390
programs is the ultimate
objective of evaluation.

00:12:51.390 --> 00:12:57.200
So using that as a framework,
what makes a good evaluation?

00:12:57.200 --> 00:13:01.990
Well, the key thing is
it's got to answer

00:13:01.990 --> 00:13:03.240
an important question.

00:13:05.495 --> 00:13:08.430
But it's no good if it answers
an important question but it

00:13:08.430 --> 00:13:10.270
answers it badly.

00:13:10.270 --> 00:13:12.430
It's got to answer it
in an unbiased way.

00:13:12.430 --> 00:13:13.130
What do I mean by that?

00:13:13.130 --> 00:13:15.490
I mean that it's got to
find the truthful

00:13:15.490 --> 00:13:19.290
answer to the question.

00:13:19.290 --> 00:13:24.890
And really to do that, you
need to have a model or a

00:13:24.890 --> 00:13:30.120
theory of change about how the
project is working so that you

00:13:30.120 --> 00:13:33.130
can test the different
steps in the model.

00:13:33.130 --> 00:13:37.700
And that's the best way to
then learn the most.

00:13:37.700 --> 00:13:41.220
If we simply say, we test
whether the project worked or

00:13:41.220 --> 00:13:44.040
didn't work, we learn
something, but we learn an

00:13:44.040 --> 00:13:47.270
awful lot more if you have a
specific model of how the

00:13:47.270 --> 00:13:50.230
project is going to work and you
test the different steps

00:13:50.230 --> 00:13:51.500
along the way.

00:13:51.500 --> 00:13:54.780
Sometimes people say--

00:13:54.780 --> 00:13:57.050
this is something that drives
me mad in the evaluation

00:13:57.050 --> 00:13:57.570
literature--

00:13:57.570 --> 00:14:00.030
you hear people saying,
well, randomized

00:14:00.030 --> 00:14:01.550
evaluations are a black box.

00:14:01.550 --> 00:14:04.210
They can tell you whether
something works or not, but

00:14:04.210 --> 00:14:07.010
they can't tell you why.

00:14:07.010 --> 00:14:11.980
I hope in the next few days,
we're going to show you how

00:14:11.980 --> 00:14:15.230
you design an evaluation that
tells you not just whether it

00:14:15.230 --> 00:14:19.060
works or not at the end, but
why and how, and the steps

00:14:19.060 --> 00:14:23.640
along the way, and design it
cleverly so that you learn as

00:14:23.640 --> 00:14:27.270
much as you possibly can from
the evaluation about the

00:14:27.270 --> 00:14:28.760
fundamental question.

00:14:28.760 --> 00:14:30.380
And that's about getting
the questions

00:14:30.380 --> 00:14:31.850
right at the beginning.

00:14:31.850 --> 00:14:36.910
And it's about doing your model
correctly and thinking

00:14:36.910 --> 00:14:41.080
of indicators along the way that
are going to allow you to

00:14:41.080 --> 00:14:44.760
get to all those steps and
really understand the theory

00:14:44.760 --> 00:14:46.450
of change that's happening.

00:14:49.400 --> 00:14:52.870
The model is going to start with
what is it we're trying

00:14:52.870 --> 00:14:56.900
to do, who are the targets,
and what are their needs.

00:14:56.900 --> 00:15:01.600
So in an evaluation in
development, this would often

00:15:01.600 --> 00:15:05.120
be called a needs assessment.

00:15:05.120 --> 00:15:06.430
And what are their needs?

00:15:06.430 --> 00:15:08.585
But then what's the program
seeking to change?

00:15:12.810 --> 00:15:16.780
And looking at precise and
individual bits of the

00:15:16.780 --> 00:15:22.700
program, what's the precise
program or part of the program

00:15:22.700 --> 00:15:26.210
that's being evaluated,
so asking

00:15:26.210 --> 00:15:28.810
very specific questions.

00:15:28.810 --> 00:15:31.950
So let's look at an example.

00:15:31.950 --> 00:15:33.850
And all of this we're going
to come back to

00:15:33.850 --> 00:15:35.310
and do in more detail.

00:15:35.310 --> 00:15:37.630
How do you do a logframe?

00:15:37.630 --> 00:15:40.470
Again, maybe some of you
have done that before.

00:15:40.470 --> 00:15:42.410
Maybe you haven't.

00:15:42.410 --> 00:15:49.620
But hopefully you'll learn more
about how we think about

00:15:49.620 --> 00:15:51.290
doing a logframe.

00:15:51.290 --> 00:15:56.110
So here's an example of an
evaluation that looked, a very

00:15:56.110 --> 00:16:01.230
simple one, does giving
textbooks to children in Kenya

00:16:01.230 --> 00:16:04.490
improve test scores was
the evaluation.

00:16:04.490 --> 00:16:07.230
But what was the need?

00:16:07.230 --> 00:16:12.830
What was the problem that the
program was trying to test?

00:16:12.830 --> 00:16:16.680
Well poor children in Busia
District in Kenya had low

00:16:16.680 --> 00:16:18.270
learning levels.

00:16:18.270 --> 00:16:20.190
They also had low incomes.

00:16:20.190 --> 00:16:22.190
They had few books.

00:16:22.190 --> 00:16:25.940
That meant that they couldn't
take the books home, and that,

00:16:25.940 --> 00:16:28.180
the theory was, made
it hard to learn.

00:16:28.180 --> 00:16:30.100
So it was hard to learn because
they didn't have a

00:16:30.100 --> 00:16:32.410
book in front of them in class,
but also because there

00:16:32.410 --> 00:16:35.370
were so few, they couldn't take
them home and read up

00:16:35.370 --> 00:16:38.730
more and do exercises at home.

00:16:38.730 --> 00:16:40.810
So what was the input?

00:16:40.810 --> 00:16:46.265
The input was that a local NGO
bought additional textbooks.

00:16:52.430 --> 00:16:55.970
In order to get to your
long-term goal, you not only

00:16:55.970 --> 00:16:58.010
need the books, you
need to make sure

00:16:58.010 --> 00:16:59.250
that they're delivered.

00:16:59.250 --> 00:17:03.460
Because, again, making this
chain along the way will help

00:17:03.460 --> 00:17:06.380
you understand if it doesn't
work, where did it go wrong,

00:17:06.380 --> 00:17:08.869
if the books were bought but
they never got there, or they

00:17:08.869 --> 00:17:10.430
were stuck in the cupboard.

00:17:10.430 --> 00:17:13.790
How many times have we been to
schools and oh, yes, we have

00:17:13.790 --> 00:17:14.410
lots of books.

00:17:14.410 --> 00:17:16.725
And we don't want them to get
messy when the children are

00:17:16.725 --> 00:17:17.109
using them.

00:17:17.109 --> 00:17:20.599
So they're all nicely in
their sealed package.

00:17:20.599 --> 00:17:23.500
Well you need to be able to
distinguish if something

00:17:23.500 --> 00:17:25.380
doesn't work, is it because it's
stuck in the cupboard?

00:17:25.380 --> 00:17:27.950
Or was it because even when the
books are out there they

00:17:27.950 --> 00:17:30.320
didn't get used or
they didn't help?

00:17:30.320 --> 00:17:33.040
So the books are delivered
and used.

00:17:33.040 --> 00:17:36.990
The children use the books and
they're able to study better.

00:17:36.990 --> 00:17:42.250
And finally, the impact, which
is what we're about here, is

00:17:42.250 --> 00:17:44.540
yes, you got all of
those steps done.

00:17:44.540 --> 00:17:48.070
But did it actually change
their lives?

00:17:48.070 --> 00:17:51.260
Did it actually achieve the
impact you are hoping to get

00:17:51.260 --> 00:17:52.860
is high test scores?

00:17:52.860 --> 00:17:56.370
The long-term goal would be not
just high test scores but

00:17:56.370 --> 00:17:57.580
higher income.
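
NOTE
A minimal sketch of the chain just described, assuming each step has a simple
yes/no indicator. The Step class, the first_break helper, and the achieved
flags are hypothetical and only illustrate how measuring every step shows
where a program went wrong. They are not results from the Kenya study.
```python
from dataclasses import dataclass
from typing import Optional
@dataclass
class Step:
    level: str                 # position in the chain, e.g. "input" or "output"
    description: str           # what should happen at this step
    achieved: Optional[bool]   # True/False if measured, None if not measured
# Steps follow the textbook example; the achieved flags are invented for illustration.
TEXTBOOK_CHAIN = [
    Step("input", "local NGO buys additional textbooks", True),
    Step("output", "books are delivered and used", True),
    Step("outcome", "children use the books and are able to study better", False),
    Step("impact", "higher test scores", None),
    Step("long-term goal", "higher income", None),
]
def first_break(chain):
    """Return the first step that was measured and not achieved, or None."""
    for step in chain:
        if step.achieved is False:
            return step
    return None
broken = first_break(TEXTBOOK_CHAIN)
print(broken.level if broken else "no measured break")  # prints "outcome"
```
With these made-up flags the chain breaks at the outcome step, which is the
kind of distinction the cupboard example is about.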

00:17:57.580 --> 00:18:02.530
And that long-term goal may be
very difficult to test in the

00:18:02.530 --> 00:18:04.560
evaluation.

00:18:04.560 --> 00:18:09.810
And you may use some other
work that may have linked

00:18:09.810 --> 00:18:15.100
these in previous studies in the
same country to make the

00:18:15.100 --> 00:18:18.090
assumption that if we got higher
test scores, it will

00:18:18.090 --> 00:18:20.000
have a positive impact.

00:18:20.000 --> 00:18:22.480
So again, that's a decision
when you make in your

00:18:22.480 --> 00:18:26.600
evaluation how far along this
chain you go, if it's a

00:18:26.600 --> 00:18:30.090
process evaluation you
may stop here.

00:18:30.090 --> 00:18:32.660
If it's an impact evaluation
you have to stop here.

00:18:32.660 --> 00:18:35.080
But you may not have enough
money to take it all the way

00:18:35.080 --> 00:18:39.010
through to the finest, finest
level that you would like to.

00:18:39.010 --> 00:18:43.740
Oh, I didn't do my little red
triangles at the right point.

00:18:43.740 --> 00:18:44.990
OK.

00:18:46.440 --> 00:18:49.500
So I've already, in a
sense, introduced

00:18:49.500 --> 00:18:52.040
some of these concepts.

00:18:52.040 --> 00:18:54.350
But again, let's review them so
we know we're talking about

00:18:54.350 --> 00:18:55.810
the same thing.

00:18:55.810 --> 00:18:59.580
There are many different
kinds of evaluation.

00:18:59.580 --> 00:19:04.700
And needs assessment is where
you go in and look at a

00:19:04.700 --> 00:19:08.450
population, see what
are the issues.

00:19:08.450 --> 00:19:11.460
How many of them
have bed nets?

00:19:11.460 --> 00:19:12.830
What are test scores
at the moment?

00:19:12.830 --> 00:19:14.340
How many books are there?

00:19:14.340 --> 00:19:16.180
What's class size?

00:19:16.180 --> 00:19:20.720
What are the problems in
your target population?

00:19:20.720 --> 00:19:24.220
And a process evaluation, does
someone want to tell me

00:19:24.220 --> 00:19:29.870
what they would see as
a process evaluation?

00:19:29.870 --> 00:19:31.820
We talked a little
bit about it.

00:19:31.820 --> 00:19:31.990
Someone?

00:19:31.990 --> 00:19:32.800
Yeah.

00:19:32.800 --> 00:19:33.040
AUDIENCE:
[UNINTELLIGIBLE PHRASE] the

00:19:33.040 --> 00:19:40.740
chain that you just presented
to see how you get from the

00:19:40.740 --> 00:19:46.500
input to the output from the
output to the outcome of this

00:19:46.500 --> 00:19:47.393
RACHEL GLENNERSTER: Right.

00:19:47.393 --> 00:19:50.704
AUDIENCE: Are we successful in
doing that in transforming our

00:19:50.704 --> 00:19:52.600
input into output?

00:19:52.600 --> 00:19:53.330
RACHEL GLENNERSTER: Right.

00:19:53.330 --> 00:20:01.150
So process evaluation looks at
did we buy the textbooks?

00:20:01.150 --> 00:20:02.730
Were they delivered?

00:20:02.730 --> 00:20:04.350
Were they used?

00:20:04.350 --> 00:20:10.270
So moving inputs, outputs,
outcomes, but stopping short

00:20:10.270 --> 00:20:12.300
before we get to the impact.

00:20:12.300 --> 00:20:16.860
And that's a very useful thing
to do, and should be done

00:20:16.860 --> 00:20:22.680
basically everywhere you do a
program, or at least some of

00:20:22.680 --> 00:20:25.180
those steps need to be measured
almost every time you

00:20:25.180 --> 00:20:27.090
do a program.

00:20:27.090 --> 00:20:29.830
But it kind of stops short
before you get

00:20:29.830 --> 00:20:30.950
to the impact stage.

00:20:30.950 --> 00:20:34.992
Have we actually changed
people's lives?

00:20:34.992 --> 00:20:36.610
We wanted to build a school.

00:20:36.610 --> 00:20:38.810
Did we build a school?

00:20:38.810 --> 00:20:39.830
We wanted to build a bridge.

00:20:39.830 --> 00:20:40.940
Did we build a bridge?

00:20:40.940 --> 00:20:42.270
We wanted to deliver things.

00:20:42.270 --> 00:20:43.630
Did we deliver things?

00:20:43.630 --> 00:20:46.810
But it's stopping before you
get to the point of knowing

00:20:46.810 --> 00:20:49.430
whether this has actually
changed people's lives.

00:20:49.430 --> 00:20:53.300
So an impact evaluation then
goes to the next stage and

00:20:53.300 --> 00:20:57.150
says, given that we have done
what we said we're going to

00:20:57.150 --> 00:21:00.640
do, has that actually
changed things?

00:21:00.640 --> 00:21:04.380
And this is where there
was a big gap in

00:21:04.380 --> 00:21:08.780
terms of what we know.

00:21:08.780 --> 00:21:10.010
There's a lot of
lesson learning

00:21:10.010 --> 00:21:11.710
you can do from process.

00:21:11.710 --> 00:21:17.200
But in terms of knowing what
kind of project is going to be

00:21:17.200 --> 00:21:21.100
successful in reducing poverty,
you really need to go

00:21:21.100 --> 00:21:22.350
this next step.

00:21:24.730 --> 00:21:28.040
Now we used to just talk
about those three.

00:21:28.040 --> 00:21:32.930
But increasingly, as I say, as
I have more contact outside

00:21:32.930 --> 00:21:38.080
economics and the more research
side of evaluation, and

00:21:38.080 --> 00:21:45.800
work a lot with DFID and other
organizations, other

00:21:45.800 --> 00:21:49.010
foundations, and other agencies,
I realize a lot of

00:21:49.010 --> 00:21:52.180
what people outside the academic
community call

00:21:52.180 --> 00:21:55.180
evaluation I would
call review.

00:21:55.180 --> 00:21:55.770
It's very weird.

00:21:55.770 --> 00:21:59.550
Because they would often call
what I call something else.

00:21:59.550 --> 00:22:07.870
But what I mean by review is,
it's sort of an assessment.

00:22:07.870 --> 00:22:12.280
It's sending a knowledgeable
person in and reviewing the

00:22:12.280 --> 00:22:16.480
program and giving their
comments on it, which can be

00:22:16.480 --> 00:22:21.050
extremely helpful if you have
a good person going and

00:22:21.050 --> 00:22:24.400
talking to the people involved,
and saying, well, in

00:22:24.400 --> 00:22:27.430
my experience, it could have
been done differently.

00:22:27.430 --> 00:22:31.120
But it doesn't quite actually
do any of these things.

00:22:35.000 --> 00:22:39.170
It's not just focused on
did I build the school.

00:22:39.170 --> 00:22:43.290
But it's asking questions
about was there enough

00:22:43.290 --> 00:22:45.370
participation.

00:22:45.370 --> 00:22:49.030
How well organized
was the NGO?

00:22:49.030 --> 00:22:51.200
And a lot of this is
very subjective.

00:22:51.200 --> 00:22:53.610
So I'm not saying that
this is bad.

00:22:53.610 --> 00:22:56.430
It's just kind of different.

00:22:56.430 --> 00:23:00.550
And if you have someone
very good doing it,

00:23:00.550 --> 00:23:01.800
it can be very useful.

00:23:04.120 --> 00:23:06.630
My concern with it is that it's
very subjective to the

00:23:06.630 --> 00:23:07.470
person who's going.

00:23:07.470 --> 00:23:07.640
Yeah.

00:23:07.640 --> 00:23:08.420
Logan?

00:23:08.420 --> 00:23:15.490
AUDIENCE: I think that you see
so many reviews simply because

00:23:15.490 --> 00:23:18.070
the way-- you just
mentioned DFID.

00:23:18.070 --> 00:23:20.420
USAID, I think, is
the same way.

00:23:20.420 --> 00:23:22.010
It's all retroactive.

00:23:22.010 --> 00:23:24.680
The way that contracts are
awarded and things like that,

00:23:24.680 --> 00:23:27.990
usually it's because it's a
requirement to evaluate a

00:23:27.990 --> 00:23:29.490
certain number of programs.

00:23:29.490 --> 00:23:33.340
And it's not until after the
program is actually done that

00:23:33.340 --> 00:23:35.660
they decide they're going
to evaluate it.

00:23:35.660 --> 00:23:41.080
And it's obviously cheaper to
send one person over and do

00:23:41.080 --> 00:23:42.450
the simple review.

00:23:42.450 --> 00:23:44.340
I think it would
be interesting.

00:23:44.340 --> 00:23:47.390
We'll probably get to this when
we talk about how you can

00:23:47.390 --> 00:23:51.050
apply some of the randomized
controlled trial methodology to

00:23:51.050 --> 00:23:53.720
something that you're
doing retroactively.

00:23:56.890 --> 00:24:01.670
RACHEL GLENNERSTER: So just to
repeat, the argument is we do

00:24:01.670 --> 00:24:04.400
a lot of reviews because a
lot of evaluation is done

00:24:04.400 --> 00:24:07.210
retroactively.

00:24:07.210 --> 00:24:09.870
What you can do at that
point is very limited.

00:24:09.870 --> 00:24:11.730
Yes.

00:24:11.730 --> 00:24:14.180
So this is a big distinction
between the kinds of

00:24:14.180 --> 00:24:20.480
evaluations, is one that's set
up beforehand and one that is

00:24:20.480 --> 00:24:22.270
after the event.

00:24:22.270 --> 00:24:23.280
We've got this program.

00:24:23.280 --> 00:24:24.530
We want to know whether
it works.

00:24:27.140 --> 00:24:30.330
Basically it's really
hard to do that.

00:24:30.330 --> 00:24:32.730
You've already kind of shot
yourself in the foot if you

00:24:32.730 --> 00:24:34.390
haven't set it up beforehand.

00:24:34.390 --> 00:24:37.530
If we think about what I was
saying about it's crucial to

00:24:37.530 --> 00:24:41.580
have a theory of change, a model
about what we're trying

00:24:41.580 --> 00:24:44.040
to achieve and how we're going
to try and achieve it, and

00:24:44.040 --> 00:24:48.340
measuring each of those steps,
if you're coming in

00:24:48.340 --> 00:24:52.930
afterwards, then you're kind
of ad hoc-ly making up what

00:24:52.930 --> 00:24:56.000
your theory of change is.

00:24:56.000 --> 00:25:01.540
And if you haven't set up systems
to measure those steps along

00:25:01.540 --> 00:25:05.900
the way, it's going to
be very hard to do.

00:25:05.900 --> 00:25:10.700
And that's exactly why you end
up with a lot of reviews.

00:25:10.700 --> 00:25:11.750
You're in this mess.

00:25:11.750 --> 00:25:13.745
And so you just send someone
knowledgeable and hope they

00:25:13.745 --> 00:25:16.110
can figure it out.

00:25:16.110 --> 00:25:19.880
To answer your specific question
though, you can't do

00:25:19.880 --> 00:25:23.090
a randomized evaluation
after the event.

00:25:23.090 --> 00:25:28.040
Because the whole point is
you're moving people into

00:25:28.040 --> 00:25:31.100
treatment and control based
on a flip of the coin.

00:25:31.100 --> 00:25:35.320
And then after the event, people
have been allocated to

00:25:35.320 --> 00:25:36.970
the treatment or not
the treatment.

00:25:36.970 --> 00:25:41.760
And it's very difficult to know
afterwards were these

00:25:41.760 --> 00:25:45.630
people similar beforehand.

00:25:45.630 --> 00:25:47.930
It's impossible to
distinguish.

00:25:47.930 --> 00:25:49.110
They may look different now.

00:25:49.110 --> 00:25:52.400
But you don't know whether
they look different now

00:25:52.400 --> 00:25:58.970
because they were different in
the beginning or because

00:25:58.970 --> 00:26:01.120
they're different because
of the program.

00:26:01.120 --> 00:26:02.210
Yeah?

00:26:02.210 --> 00:26:04.070
AUDIENCE: I was really
interested to read the first

00:26:04.070 --> 00:26:09.110
case study because it seemed
that you were applying

00:26:09.110 --> 00:26:10.560
randomized control
methodology.

00:26:10.560 --> 00:26:12.300
But it seemed to be actually
done retroactively.

00:26:14.885 --> 00:26:16.550
RACHEL GLENNERSTER: No.

00:26:16.550 --> 00:26:18.095
It wasn't.

00:26:18.095 --> 00:26:19.620
It might look that way.

00:26:19.620 --> 00:26:25.180
But it was set up so the first
case study uses a lot of

00:26:25.180 --> 00:26:29.370
different methodologies,
compares different

00:26:29.370 --> 00:26:30.450
methodologies.

00:26:30.450 --> 00:26:34.150
But they couldn't use all those
methodologies if they

00:26:34.150 --> 00:26:36.470
hadn't designed it as a
randomized study at the

00:26:36.470 --> 00:26:37.720
beginning actually.

00:26:41.620 --> 00:26:44.270
If you've set up a randomized
evaluation, you can always do

00:26:44.270 --> 00:26:45.900
a non-randomized evaluation
of it.

00:26:45.900 --> 00:26:48.700
But if you haven't done it as
a randomized to start with,

00:26:48.700 --> 00:26:49.950
you can't make it randomized.

00:26:56.290 --> 00:26:59.880
Prospective evaluation, setting
up the evaluation from

00:26:59.880 --> 00:27:04.840
the beginning, is very important
I would say in any

00:27:04.840 --> 00:27:07.900
methodology you use.

00:27:07.900 --> 00:27:11.290
But it's impossible to do it.

00:27:14.150 --> 00:27:18.010
There are a couple of examples
where people have done a

00:27:18.010 --> 00:27:21.060
randomized evaluation afterwards
or an evaluation

00:27:21.060 --> 00:27:21.830
afterwards.

00:27:21.830 --> 00:27:24.300
And that is because the
randomization happened

00:27:24.300 --> 00:27:25.520
beforehand.

00:27:25.520 --> 00:27:28.580
But it wasn't done because
it was an evaluation.

00:27:28.580 --> 00:27:34.070
So if you look at the case on
the women's empowerment in

00:27:34.070 --> 00:27:39.670
India, which you will do later in the
week, that was not set up

00:27:39.670 --> 00:27:40.740
as an evaluation.

00:27:40.740 --> 00:27:44.700
It was set up as a randomized
program.

00:27:44.700 --> 00:27:48.010
And the rationale was they
wanted to be fair.

00:27:48.010 --> 00:27:50.430
So where there's limited
resources, sometimes

00:27:50.430 --> 00:27:55.620
governments are the ones who
randomize in order to be fair

00:27:55.620 --> 00:27:58.820
to the different participants.

00:27:58.820 --> 00:28:01.060
Some people, in this
case, will have to

00:28:01.060 --> 00:28:02.410
get a women's leader.

00:28:02.410 --> 00:28:05.960
In the Colombia project, which you'll
find on our website,

00:28:05.960 --> 00:28:10.870
the Colombian government wanted
to provide vouchers to

00:28:10.870 --> 00:28:11.690
go to private school.

00:28:11.690 --> 00:28:13.590
But they couldn't afford
it for everyone.

00:28:13.590 --> 00:28:17.030
So they randomized who
would get them.

00:28:17.030 --> 00:28:19.920
So that's the one case where
you can do a randomized

00:28:19.920 --> 00:28:22.290
evaluation after the event,
when somebody else has

00:28:22.290 --> 00:28:25.510
randomized beforehand, but they
weren't actually thinking

00:28:25.510 --> 00:28:29.150
of it as an evaluation.

00:28:29.150 --> 00:28:32.370
But even then, it would've
been nice to have data

00:28:32.370 --> 00:28:34.980
beforehand.

00:28:34.980 --> 00:28:38.170
So the last thing on this list
is cost-benefit analysis,

00:28:38.170 --> 00:28:42.200
which is something that you can
do with the input from all

00:28:42.200 --> 00:28:43.450
of these other things.

00:28:46.540 --> 00:28:50.320
As I say, the piece of
information that we have so

00:28:50.320 --> 00:28:54.120
little of is what's the effect
of a dollar here versus a

00:28:54.120 --> 00:28:55.470
dollar here.

00:28:55.470 --> 00:29:00.960
And you can only do that if
that's one of your ultimate

00:29:00.960 --> 00:29:04.780
objectives when you're doing
these other impact evaluations

00:29:04.780 --> 00:29:06.800
or these other evaluation
methodologies.

00:29:06.800 --> 00:29:09.890
Because you need to be
collecting data about costs.

00:29:09.890 --> 00:29:13.380
And the benefits will come from
your impact evaluation.

00:29:13.380 --> 00:29:16.440
But you need to get your costs
from your process evaluation.

00:29:16.440 --> 00:29:18.610
And you can put the
two together.

00:29:18.610 --> 00:29:20.760
And you can do a
cost-effectiveness analysis.

00:29:20.760 --> 00:29:25.190
Then if somebody else has done
that in their other study, you

00:29:25.190 --> 00:29:28.560
can do a cost-effectiveness
comparison across studies.

00:29:28.560 --> 00:29:35.660
Or even you can evaluate a range
of different options on

00:29:35.660 --> 00:29:36.840
your impact evaluation.

00:29:36.840 --> 00:29:38.400
And that will give
you comparative

00:29:38.400 --> 00:29:43.500
cost-effectiveness across.

00:29:43.500 --> 00:29:47.440
So going into a bit more detail
in some of these, needs

00:29:47.440 --> 00:29:48.750
assessment.

00:29:48.750 --> 00:29:51.580
We'll look at who's the
target population.

00:29:51.580 --> 00:29:56.410
Is it all children or are we
particularly focused on

00:29:56.410 --> 00:30:00.840
helping the lowest-achieving
in the group?

00:30:00.840 --> 00:30:04.100
What's the nature of the
problem being solved?

00:30:04.100 --> 00:30:06.450
Many of these communities will
have lots of problems.

00:30:06.450 --> 00:30:10.120
So what are we particularly
trying to focus on here?

00:30:10.120 --> 00:30:11.140
Is it test scores?

00:30:11.140 --> 00:30:14.590
Is it attendance at school?

00:30:14.590 --> 00:30:16.590
How will textbooks solve
the problem?

00:30:16.590 --> 00:30:18.170
I was talking about textbooks.

00:30:18.170 --> 00:30:20.220
Maybe it's because they
can take them home.

00:30:20.220 --> 00:30:23.380
Well, if that's part of your
model, your theory of change,

00:30:23.380 --> 00:30:25.830
you need to be actually
measuring that, not just did

00:30:25.830 --> 00:30:28.550
they arrive.

00:30:28.550 --> 00:30:33.540
How does the service fit
into the environment?

00:30:33.540 --> 00:30:36.660
How many times have we sat
in an office and designed

00:30:36.660 --> 00:30:39.100
something that we thought made
complete sense, and gone out

00:30:39.100 --> 00:30:42.190
to the field and thought,
what was I thinking?

00:30:42.190 --> 00:30:43.440
This isn't going to work.

00:30:46.480 --> 00:30:48.030
How does it feel for
the teachers?

00:30:48.030 --> 00:30:50.180
Do they understand
the new books?

00:30:50.180 --> 00:30:52.720
Do they know how to
teach from them?

00:30:52.720 --> 00:30:54.560
How do the books fit
into the curricula?

00:31:01.690 --> 00:31:04.320
What are you trying to
get out of this?

00:31:04.320 --> 00:31:06.090
As I say, you want a clear
sense of the target

00:31:06.090 --> 00:31:08.540
population.

00:31:08.540 --> 00:31:14.010
So then you want to see are
the students responding?

00:31:14.010 --> 00:31:17.640
If you're particularly worried
about low-performing kids, are

00:31:17.640 --> 00:31:20.640
they responding to
the textbooks?

00:31:20.640 --> 00:31:26.650
Students who are falling behind,
a sense of the needs

00:31:26.650 --> 00:31:31.140
the program will fill, what
are the teachers lacking?

00:31:31.140 --> 00:31:33.010
How are we going to deliver
the textbooks?

00:31:33.010 --> 00:31:35.660
How many textbooks are
we going to deliver?

00:31:35.660 --> 00:31:37.990
And what are the potential
barriers for people learning

00:31:37.990 --> 00:31:39.240
from the textbooks?

00:31:41.550 --> 00:31:48.960
Then a clear articulation of
the program benefits, and a

00:31:48.960 --> 00:31:50.240
sense of alternatives.

00:31:50.240 --> 00:31:53.970
And as I say, if you want to
look at cost-effectiveness of

00:31:53.970 --> 00:31:57.530
alternative approaches, it's
very important to think

00:31:57.530 --> 00:32:00.930
through not just this program in
isolation, but what are the

00:32:00.930 --> 00:32:02.510
alternatives that we
could be doing?

00:32:02.510 --> 00:32:04.470
And how does that
fit with them?

00:32:04.470 --> 00:32:06.580
Is this one of the most
expensive things we're going

00:32:06.580 --> 00:32:10.010
to try, one of the cheapest
things we want to try, and

00:32:10.010 --> 00:32:11.260
everything in between?

00:32:13.860 --> 00:32:17.250
So you may be thinking in this
context, is this a replicable

00:32:17.250 --> 00:32:20.350
program that I'm going to
be able to do elsewhere?

00:32:20.350 --> 00:32:24.190
Is this the gold-plated version
that I'll do if I get

00:32:24.190 --> 00:32:25.810
lots of funding?

00:32:25.810 --> 00:32:28.160
Or is this something that
I can replicate in

00:32:28.160 --> 00:32:31.080
lots of other places?

00:32:31.080 --> 00:32:34.490
Process evaluation, I've really
sort of talked quite a

00:32:34.490 --> 00:32:35.160
bit about these.

00:32:35.160 --> 00:32:36.410
So I'm going through
them faster.

00:32:39.160 --> 00:32:43.020
And when you do an impact
evaluation, because the impact

00:32:43.020 --> 00:32:46.950
evaluation is the last thing on
that chain, you need to do

00:32:46.950 --> 00:32:48.850
all the other bits on
the chain as well.

00:32:48.850 --> 00:32:51.740
You can't do an impact
evaluation without a process

00:32:51.740 --> 00:32:54.670
evaluation or you won't
understand what the hell your

00:32:54.670 --> 00:32:59.440
answer means at the end.

00:32:59.440 --> 00:33:02.130
So as we say, a process
evaluation is asking are the

00:33:02.130 --> 00:33:03.830
services being delivered?

00:33:03.830 --> 00:33:04.875
Is the money being spent?

00:33:04.875 --> 00:33:06.910
Are the textbooks reaching
the classroom?

00:33:06.910 --> 00:33:08.160
Are they being used?

00:33:10.660 --> 00:33:13.830
And it's also, as I say,
important to be asking

00:33:13.830 --> 00:33:17.480
yourself, what are
the alternatives.

00:33:17.480 --> 00:33:18.910
Could you do this
in a better way?

00:33:22.330 --> 00:33:26.410
Just like a company is always
thinking are there ways to

00:33:26.410 --> 00:33:27.660
reduce costs.

00:33:30.140 --> 00:33:31.610
You should be thinking
are there ways to

00:33:31.610 --> 00:33:34.200
do this more cheaply.

00:33:34.200 --> 00:33:36.840
Are the services reaching
the right populations?

00:33:36.840 --> 00:33:38.540
Which students are
taking them home?

00:33:38.540 --> 00:33:40.540
Is it the ones that I'm
targeting or only the most

00:33:40.540 --> 00:33:42.130
motivated ones?

00:33:42.130 --> 00:33:44.110
And also, are the clients
satisfied?

00:33:44.110 --> 00:33:46.130
What's their response
to the program?

00:33:51.410 --> 00:33:57.480
So an impact evaluation, am
I missing a top bit here?

00:33:57.480 --> 00:33:58.440
No.

00:33:58.440 --> 00:33:59.690
OK.

00:34:01.280 --> 00:34:02.110
Here we go.

00:34:02.110 --> 00:34:03.460
We're out of order.

00:34:03.460 --> 00:34:06.430
So an impact evaluation
is, as I say,

00:34:06.430 --> 00:34:07.620
taking it from there.

00:34:07.620 --> 00:34:11.639
So assuming once you've got all
the processes working, and

00:34:11.639 --> 00:34:15.590
it's all happening, but
if it happens, does

00:34:15.590 --> 00:34:16.840
it produce an impact?

00:34:30.480 --> 00:34:33.550
Take our theory of
change seriously.

00:34:33.550 --> 00:34:36.800
And say what we might expect
to change if that theory of

00:34:36.800 --> 00:34:37.969
change is happening.

00:34:37.969 --> 00:34:41.100
So we've got this theory of
change that says, this is how

00:34:41.100 --> 00:34:44.340
we expect things to change.

00:34:44.340 --> 00:34:47.989
These are the processes by which
we expect, like the kids

00:34:47.989 --> 00:34:49.550
taking the books home.

00:34:49.550 --> 00:34:52.590
So we want to design some
intermediate indicators and

00:34:52.590 --> 00:34:56.190
final outcomes that will
trace out that model.

00:35:00.840 --> 00:35:05.030
So our primary focus is going to
be, did the textbooks cause

00:35:05.030 --> 00:35:06.870
children to learn more.

00:35:06.870 --> 00:35:09.460
But we might also be
interested in some

00:35:09.460 --> 00:35:11.970
distributional issues.

00:35:11.970 --> 00:35:16.430
Not just on average, we might
also be interested in was it

00:35:16.430 --> 00:35:18.790
the high-achieving kids
that learned more?

00:35:18.790 --> 00:35:20.570
Was it the low-achieving kids?

00:35:20.570 --> 00:35:22.590
Because very often in
development, we're just as

00:35:22.590 --> 00:35:25.920
interested in the distributional
implications of

00:35:25.920 --> 00:35:27.890
a project as the average.

00:35:27.890 --> 00:35:29.270
So who is it who learned?

00:35:33.740 --> 00:35:38.330
How does impact differ
from process?

00:35:38.330 --> 00:35:41.290
In the process, we describe
what happened.

00:35:41.290 --> 00:35:43.130
And you can do that from
reading documents,

00:35:43.130 --> 00:35:46.250
interviewing people, and
administrative records.

00:35:46.250 --> 00:35:51.340
In an impact question, we need
to compare what happened to

00:35:51.340 --> 00:35:55.260
the people who got the program
with what would have happened.

00:35:55.260 --> 00:35:58.960
This is the fundamental question
that Dan is going to

00:35:58.960 --> 00:36:05.090
hammer on about in his lecture
about why do we use randomized

00:36:05.090 --> 00:36:06.340
evaluations.

00:36:08.500 --> 00:36:10.830
We talk about this as
the counterfactual.

00:36:10.830 --> 00:36:15.270
What would have happened if the
program hadn't happened?

00:36:15.270 --> 00:36:17.100
That's the fundamental
question that we're

00:36:17.100 --> 00:36:18.100
trying to get at.

00:36:18.100 --> 00:36:21.640
Obviously it's impossible to
know exactly what would have

00:36:21.640 --> 00:36:24.420
happened if the program
hadn't happened.

00:36:24.420 --> 00:36:26.280
But that's what we're
trying to get at.

00:36:26.280 --> 00:36:27.400
Just one second.

00:36:27.400 --> 00:36:28.300
Yeah?

00:36:28.300 --> 00:36:32.610
AUDIENCE: So one thing that
would seem to fit in somewhere

00:36:32.610 --> 00:36:37.090
with the impact thing but
doesn't quite meet the

00:36:37.090 --> 00:36:40.065
criteria that you've just
described that we've used

00:36:40.065 --> 00:36:45.240
sometimes is this
pre-post test.

00:36:45.240 --> 00:36:47.760
And that isn't necessarily
going to say

00:36:47.760 --> 00:36:49.070
what would have happened.

00:36:49.070 --> 00:36:51.990
But it will say, well,
what were the

00:36:51.990 --> 00:36:53.118
conditions when you started?

00:36:53.118 --> 00:36:56.610
And we extrapolate from that
looking at where we are when

00:36:56.610 --> 00:37:01.283
we ended, what can we say
about the impact of the

00:37:01.283 --> 00:37:01.450
intervention?

00:37:01.450 --> 00:37:03.430
RACHEL GLENNERSTER: Right.

00:37:03.430 --> 00:37:07.480
So that is one way that people
often try and do an impact

00:37:07.480 --> 00:37:10.650
evaluation and measure are
they having an impact.

00:37:10.650 --> 00:37:14.530
And I guess it can give you some
sense of whether you're

00:37:14.530 --> 00:37:16.850
having an impact or
flag problems.

00:37:16.850 --> 00:37:18.840
It's to say, well, what were
conditions at the beginning?

00:37:18.840 --> 00:37:21.330
What are they like now?

00:37:21.330 --> 00:37:25.760
Then you have this assumption,
which is that all the

00:37:25.760 --> 00:37:30.770
difference between then and
now is due to the program.

00:37:30.770 --> 00:37:34.125
And often that's not a very
appropriate assumption.

00:37:36.680 --> 00:37:39.770
Often things happen.

00:37:39.770 --> 00:37:43.180
If we take our example of
schools, the kids will know

00:37:43.180 --> 00:37:44.700
more at the end of
the year than

00:37:44.700 --> 00:37:45.780
they knew at the beginning.

00:37:45.780 --> 00:37:49.620
Well, would they have known more
even if we hadn't given

00:37:49.620 --> 00:37:51.620
them more textbooks?

00:37:51.620 --> 00:37:52.870
Probably.

00:37:56.240 --> 00:37:58.460
So that's kind of
the fundamental

00:37:58.460 --> 00:37:59.720
assumption you're making.

00:37:59.720 --> 00:38:02.760
And it's a difficult
one to make.

00:38:02.760 --> 00:38:06.520
It's also the case that we
talked to people who were

00:38:06.520 --> 00:38:10.120
doing a project in Gujarat.

00:38:10.120 --> 00:38:14.990
And they were tearing their hair
out and saying, well, we

00:38:14.990 --> 00:38:19.690
seem to be doing terribly.

00:38:19.690 --> 00:38:22.310
Our program is doing terribly.

00:38:22.310 --> 00:38:27.990
People now are worse off
than when we started.

00:38:27.990 --> 00:38:35.410
This was, well, Mark will know
the years of the riots and

00:38:35.410 --> 00:38:37.720
earthquake in Gujarat.

00:38:37.720 --> 00:38:40.870
They'd basically taken data
when they started.

00:38:40.870 --> 00:38:44.080
In the meantime, there had been
a massive earthquake and

00:38:44.080 --> 00:38:50.180
massive ethnic riots against
Muslims in Gujarat.

00:38:50.180 --> 00:38:51.790
Of course people
were worse off.

00:38:51.790 --> 00:38:53.540
And that's not because of you.

00:38:53.540 --> 00:38:56.660
So it can go either
way actually.

00:38:56.660 --> 00:38:59.440
You can assume that your program
is doing much better

00:38:59.440 --> 00:39:01.000
because other things are coming

00:39:01.000 --> 00:39:02.440
along and helping people.

00:39:02.440 --> 00:39:06.320
And you're attributing all the
change to your program.

00:39:06.320 --> 00:39:11.330
Or it could be the case in
this extreme example.

00:39:11.330 --> 00:39:15.560
There's a massive earthquake
and massive religious and

00:39:15.560 --> 00:39:19.790
ethnic riots and you
attribute all the

00:39:19.790 --> 00:39:20.930
negative to your program.

00:39:20.930 --> 00:39:27.340
So it's a way that sometimes
people use of

00:39:27.340 --> 00:39:28.590
trying to get an impact.

00:39:32.990 --> 00:39:35.560
It's not a very accurate way of
getting your impact, which

00:39:35.560 --> 00:39:39.290
is why a randomized evaluation
would help.

00:39:39.290 --> 00:39:42.830
So, as you say, it doesn't
quite fit these criteria.

00:39:42.830 --> 00:39:45.650
Because it doesn't
quite answer.

00:39:45.650 --> 00:39:47.300
It says what happened
over the period.

00:39:47.300 --> 00:39:48.950
It doesn't say what would
have happened.

00:39:48.950 --> 00:39:51.300
It is not a comparison of what
would have happened with what

00:39:51.300 --> 00:39:52.180
actually happened.

00:39:52.180 --> 00:39:58.880
And that's how you want
to get at your impact.
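
NOTE
Editorial sketch of the contrast drawn above between a pre-post comparison
and a comparison against a counterfactual. This is a hypothetical simulation,
not data from the lecture: the 10-point natural gain, the 3-point textbook
effect, the baseline scores, and every name in the code are assumptions.
```python
import random
NATURAL_GAIN = 10      # assumed: points every child gains over the year anyway
PROGRAM_EFFECT = 3     # assumed: extra points caused by the textbooks
def mean(values):
    return sum(values) / len(values)
# Hypothetical baseline test scores for 10,000 children (values 30..70).
baseline = [30 + (i % 41) for i in range(10_000)]
# Flip of the coin: shuffle the child indices and treat the first half.
children = list(range(len(baseline)))
random.Random(0).shuffle(children)
treated, control = children[:5_000], children[5_000:]
# Endline scores: everyone gains NATURAL_GAIN, only treated children gain more.
endline = {c: baseline[c] + NATURAL_GAIN for c in control}
endline.update({c: baseline[c] + NATURAL_GAIN + PROGRAM_EFFECT for c in treated})
treated_endline = mean([endline[c] for c in treated])
# Pre-post: compares treated children with themselves at the start,
# so the natural gain is wrongly credited to the program.
pre_post_estimate = treated_endline - mean([baseline[c] for c in treated])
# Randomized comparison: the control group stands in for the counterfactual.
randomized_estimate = treated_endline - mean([endline[c] for c in control])
print(f"pre-post estimate:   {pre_post_estimate:.2f}")
print(f"randomized estimate: {randomized_estimate:.2f}")
```
Under these assumptions the pre-post figure is the natural gain plus the
effect, while the randomized comparison lands close to the effect alone.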

00:39:58.880 --> 00:40:02.620
So there are various
ways to get at it.

00:40:02.620 --> 00:40:07.050
But some of them are more
effective than others.

00:40:07.050 --> 00:40:10.870
So let's go back to our
objectives and see if we can

00:40:10.870 --> 00:40:15.010
match these different kinds of
evaluations to our different

00:40:15.010 --> 00:40:20.720
objectives for evaluation and
find out which evaluation will

00:40:20.720 --> 00:40:22.580
answer which question.

00:40:22.580 --> 00:40:29.520
So accountability: the first
question for accountability is

00:40:29.520 --> 00:40:32.860
just did we do what we said
we were going to do.

00:40:32.860 --> 00:40:37.860
Now for that, you can use a process
evaluation to do that.

00:40:37.860 --> 00:40:40.970
Because did I do what I said
I was going to do.

00:40:40.970 --> 00:40:42.480
I promised to deliver books.

00:40:42.480 --> 00:40:44.240
Did I actually deliver books?

00:40:44.240 --> 00:40:48.210
Process evaluation is fine
for that kind of level of

00:40:48.210 --> 00:40:49.460
accountability.

00:40:55.200 --> 00:40:59.610
If my accountability is not just
did I do what I said, but

00:40:59.610 --> 00:41:02.600
did what I do help?

00:41:02.600 --> 00:41:04.320
Ultimately I'm there
to help people.

00:41:04.320 --> 00:41:06.120
Am I actually helping people?

00:41:06.120 --> 00:41:09.080
That's a deeper level
of accountability.

00:41:09.080 --> 00:41:13.430
And that, you can only answer
with an impact evaluation.

00:41:13.430 --> 00:41:17.570
Did I actually make the change
that I wanted to happen?

00:41:17.570 --> 00:41:24.900
If we look at lesson learning,
the first kind of lesson

00:41:24.900 --> 00:41:30.820
learning is, does a particular
program work or not work.

00:41:30.820 --> 00:41:36.110
So an impact evaluation can tell
you whether a particular

00:41:36.110 --> 00:41:37.360
program worked.

00:41:37.360 --> 00:41:39.550
If you look at different
impact evaluations of

00:41:39.550 --> 00:41:42.600
different programs, you can
start saying which ones

00:41:42.600 --> 00:41:45.380
worked, whether they work in
different situations, or

00:41:45.380 --> 00:41:47.620
whether a particular kind of
program works in different

00:41:47.620 --> 00:41:49.910
situations or not.

00:41:49.910 --> 00:41:53.150
Now what is the most effective
route for achieving a certain

00:41:53.150 --> 00:41:58.920
outcome is kind of an even
deeper level of learning.

00:41:58.920 --> 00:42:02.050
What kind of thing is the best
thing to do in this situation?

00:42:02.050 --> 00:42:04.490
And there you want to have
a cost-benefit analysis

00:42:04.490 --> 00:42:09.330
comparing several programs based
on a number of different

00:42:09.330 --> 00:42:12.100
impact evaluations.

00:42:12.100 --> 00:42:15.100
And then we said an even
deeper level is, can I

00:42:15.100 --> 00:42:20.770
understand how we
change behavior?

00:42:20.770 --> 00:42:24.740
Understand deep parameters of
what makes a successful

00:42:24.740 --> 00:42:27.740
program, of how do we change
behavior from health to

00:42:27.740 --> 00:42:28.330
agriculture?

00:42:28.330 --> 00:42:32.990
What are some similarities and
understanding of how people

00:42:32.990 --> 00:42:36.510
tick, and how we can use that
to design better programs?

00:42:36.510 --> 00:42:44.010
And again, that's linking our
results back to theories.

00:42:44.010 --> 00:42:47.000
You have got to have a deeper
theory understanding it, and

00:42:47.000 --> 00:42:49.670
then test that with different
impact evaluations.

00:42:49.670 --> 00:42:52.370
And you can get some kind of
general lessons from looking

00:42:52.370 --> 00:42:53.640
across impact evaluations.

00:42:56.350 --> 00:43:01.620
And then if we want to have our
reduced poverty through

00:43:01.620 --> 00:43:04.640
more effective programs, which
is our ultimate objective of

00:43:04.640 --> 00:43:08.880
doing evaluations, we've got to
say, did we learn from our

00:43:08.880 --> 00:43:09.910
impact evaluations?

00:43:09.910 --> 00:43:12.670
Because if we don't learn from
them and change our programs

00:43:12.670 --> 00:43:16.730
as a result, then we're not
going to achieve that.

00:43:16.730 --> 00:43:22.080
And I guess to say that solid,
reliable impact evaluations

00:43:22.080 --> 00:43:24.470
are a building block.

00:43:24.470 --> 00:43:26.590
You're not going to get
everything out of one impact

00:43:26.590 --> 00:43:27.090
evaluation.

00:43:27.090 --> 00:43:30.320
But if you build up enough, you
can generate the general

00:43:30.320 --> 00:43:31.810
lessons that you need
to do that.

00:43:37.660 --> 00:43:39.290
I've said quite a lot
of this already.

00:43:39.290 --> 00:43:42.380
But needs assessments give you
the metric for defining the

00:43:42.380 --> 00:43:43.990
cost-benefit ratio.

00:43:43.990 --> 00:43:48.390
So when we're looking at
cost-benefit analysis, we're

00:43:48.390 --> 00:43:51.660
looking at what's the most
cost-effective way of

00:43:51.660 --> 00:43:52.700
achieving x?

00:43:52.700 --> 00:43:54.160
Well, you need a needs
assessment to

00:43:54.160 --> 00:43:55.990
say what's the x?

00:43:55.990 --> 00:44:00.820
What's the thing that I should
be really trying to solve?

00:44:00.820 --> 00:44:04.880
Process evaluation gives you the
costs for your inputs to

00:44:04.880 --> 00:44:06.940
do a cost-benefit analysis.

00:44:06.940 --> 00:44:10.430
And an impact evaluation
tells you the benefit.

00:44:10.430 --> 00:44:14.110
So all of these different inputs
are needed to be able

00:44:14.110 --> 00:44:16.060
to do an effective cost-benefit
analysis.
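
NOTE
Editorial sketch of how the inputs described above might be combined. The
program names and all figures are hypothetical: cost per child stands in for
what a process evaluation would record, and extra school days per child for
what an impact evaluation would estimate.
```python
# Hypothetical inputs; none of these numbers come from the lecture.
programs = {
    "textbooks": {"cost_per_child": 4.0, "extra_school_days": 2.0},
    "uniforms": {"cost_per_child": 6.0, "extra_school_days": 12.0},
}
def cost_per_extra_day(program):
    """Dollars spent per additional day of schooling (lower is better)."""
    return program["cost_per_child"] / program["extra_school_days"]
# Rank programs from most to least cost-effective.
ranked = sorted(programs, key=lambda name: cost_per_extra_day(programs[name]))
for name in ranked:
    print(f"{name}: ${cost_per_extra_day(programs[name]):.2f} per extra school day")
```
The ranking is only as good as the two inputs, which is why the cost data
and the impact estimate both have to be collected by design.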

00:44:16.060 --> 00:44:17.250
AUDIENCE: Rachel?

00:44:17.250 --> 00:44:18.054
RACHEL GLENNERSTER: Yeah?

00:44:18.054 --> 00:44:21.580
AUDIENCE: The needs assessment
seems to be more of a program

00:44:21.580 --> 00:44:24.010
design sort of a
[UNINTELLIGIBLE], whereas the

00:44:24.010 --> 00:44:26.570
remaining three are more like
the program has already been

00:44:26.570 --> 00:44:29.655
designed and we are being
cautious, we have thought that

00:44:29.655 --> 00:44:31.940
this is the right program
to go with.

00:44:31.940 --> 00:44:35.873
Please design a process
evaluation for this or a

00:44:35.873 --> 00:44:37.490
program evaluation for this.

00:44:37.490 --> 00:44:39.890
How is that needs assessment
different from the one that

00:44:39.890 --> 00:44:41.600
feeds into program design?

00:44:41.600 --> 00:44:46.600
RACHEL GLENNERSTER: Well, in a
sense, there's two different

00:44:46.600 --> 00:44:48.120
concepts here.

00:44:48.120 --> 00:44:48.540
You're right.

00:44:48.540 --> 00:44:52.910
There's a needs assessment
for a particular project.

00:44:56.450 --> 00:45:02.090
We're working with an NGO in
India called [UNINTELLIGIBLE]

00:45:02.090 --> 00:45:04.570
working in rural Rajasthan.

00:45:04.570 --> 00:45:08.330
And they said, we want to
do more on health in our

00:45:08.330 --> 00:45:09.070
communities.

00:45:09.070 --> 00:45:13.170
We've done a lot of education
and community building.

00:45:13.170 --> 00:45:15.410
But we want to do a lot
more in health.

00:45:15.410 --> 00:45:18.400
But before we start, we want
to know what are the health

00:45:18.400 --> 00:45:19.920
problems in this community?

00:45:19.920 --> 00:45:23.060
It doesn't make sense
to design the

00:45:23.060 --> 00:45:24.280
project until you know.

00:45:24.280 --> 00:45:29.980
So we went in and did, what
are the health problems?

00:45:29.980 --> 00:45:31.930
What's the level of services?

00:45:31.930 --> 00:45:33.430
Who are they getting
their health from?

00:45:33.430 --> 00:45:37.960
We did a very comprehensive
analysis of the issues.

00:45:37.960 --> 00:45:40.640
And that was a needs assessment
for that particular

00:45:40.640 --> 00:45:42.880
NGO in that particular area.

00:45:42.880 --> 00:45:46.780
But you can kind of think of
that in a wider context of

00:45:46.780 --> 00:45:52.940
saying, what are the key
problems in health in India or

00:45:52.940 --> 00:45:54.280
in developing countries?

00:45:54.280 --> 00:45:57.720
What are the top priority
things that we should be

00:45:57.720 --> 00:46:00.810
focusing on?

00:46:00.810 --> 00:46:02.320
Because again--

00:46:02.320 --> 00:46:04.230
and I'm going to get on to
strategy in a minute-- if

00:46:04.230 --> 00:46:07.150
you're thinking as an
organization, you can't do an

00:46:07.150 --> 00:46:09.250
impact evaluation
for everything.

00:46:09.250 --> 00:46:13.280
You can't look at comparative
cost-effectiveness for

00:46:13.280 --> 00:46:14.620
outcomes in the world.

00:46:14.620 --> 00:46:16.860
Or at least you've got
to start somewhere.

00:46:16.860 --> 00:46:20.390
You've got to start on, what
do I most want to know?

00:46:20.390 --> 00:46:24.560
What's the main thing I want
to change to see what's the

00:46:24.560 --> 00:46:26.760
cost of changing that thing?

00:46:26.760 --> 00:46:30.070
So is it test scores
in schools?

00:46:30.070 --> 00:46:31.680
Or is it attendance?

00:46:31.680 --> 00:46:34.160
Am I most concerned about
improving attendance?

00:46:34.160 --> 00:46:36.070
If you look at the Millennium
Development Goals, in a

00:46:36.070 --> 00:46:39.120
sense, that's the world
prioritizing.

00:46:39.120 --> 00:46:42.230
They're saying, these are the
things that I most want to

00:46:42.230 --> 00:46:44.320
change in the world.

00:46:44.320 --> 00:46:47.360
And there they made the
decision, rightly or wrongly,

00:46:47.360 --> 00:46:51.510
on education, that they wanted
to get kids in school.

00:46:51.510 --> 00:46:55.600
And there isn't anything about
actually learning.

00:46:55.600 --> 00:46:58.210
And whether your need is
getting kids in school or

00:46:58.210 --> 00:47:00.530
learning, you would design
very different projects.

00:47:00.530 --> 00:47:03.640
But you would also design
different impact evaluations,

00:47:03.640 --> 00:47:06.390
because those are two very
different questions.

00:47:06.390 --> 00:47:09.250
So the needs assessment
is telling you

00:47:09.250 --> 00:47:10.520
what are the problems?

00:47:10.520 --> 00:47:14.260
What am I prioritizing for my
programs, but also for my

00:47:14.260 --> 00:47:16.785
impact evaluations?

00:47:16.785 --> 00:47:18.180
Yeah?

00:47:18.180 --> 00:47:21.570
AUDIENCE: Do you need to make
a decision early on whether

00:47:21.570 --> 00:47:24.260
you're interested in actually
doing a cost-effectiveness

00:47:24.260 --> 00:47:27.200
analysis as opposed
to a cost-benefit.

00:47:27.200 --> 00:47:27.490
RACHEL GLENNERSTER:
[INTERPOSING VOICE]

00:47:27.490 --> 00:47:28.740
AUDIENCE:
[UNINTELLIGIBLE PHRASE].

00:47:30.826 --> 00:47:33.296
efficiency measure, whereas
cost [UNINTELLIGIBLE]--

00:47:38.140 --> 00:47:42.610
RACHEL GLENNERSTER: So I'm kind
of using those too easily

00:47:42.610 --> 00:47:44.210
interchangeably.

00:47:44.210 --> 00:47:48.010
I don't think it's so
important here.

00:47:51.660 --> 00:47:55.702
How would you define the
difference between them?

00:47:55.702 --> 00:47:59.960
AUDIENCE: As I understand it,
but [UNINTELLIGIBLE PHRASE]

00:47:59.960 --> 00:48:03.620
they getting a better answer.

00:48:03.620 --> 00:48:06.450
But cost-effectiveness is
a productivity measure.

00:48:06.450 --> 00:48:09.360
And it would mean that you
would have to, in an

00:48:09.360 --> 00:48:13.080
evaluation say, OK, I'm going to
look at I put one buck into

00:48:13.080 --> 00:48:15.380
this program and I get
how many more days of

00:48:15.380 --> 00:48:16.410
schooling out of it.

00:48:16.410 --> 00:48:16.730
Right?

00:48:16.730 --> 00:48:17.650
RACHEL GLENNERSTER: Right.

00:48:17.650 --> 00:48:26.650
AUDIENCE: Whereas cost-benefit
requires that it all be in

00:48:26.650 --> 00:48:27.955
dollars or some other
[UNINTELLIGIBLE].

00:48:27.955 --> 00:48:30.380
RACHEL GLENNERSTER: So you've
got to change your benefit

00:48:30.380 --> 00:48:31.850
into dollars.

00:48:31.850 --> 00:48:35.173
So I'll give you an example
of the difference.

00:48:35.173 --> 00:48:36.423
AUDIENCE: Like
[INAUDIBLE PHRASE].

00:48:42.285 --> 00:48:43.790
RACHEL GLENNERSTER: Let's make
sure everybody's following

00:48:43.790 --> 00:48:45.040
this discussion.

00:48:48.900 --> 00:48:52.350
A cost-effectiveness question
would be to say, I want to

00:48:52.350 --> 00:48:56.430
increase the number
of kids in school.

00:48:56.430 --> 00:49:00.050
How much would it cost to get
an additional year of

00:49:00.050 --> 00:49:03.430
schooling from all of these
different programs?

00:49:03.430 --> 00:49:06.810
And I'm just assuming that
getting kids in school is a

00:49:06.810 --> 00:49:08.560
good thing to do.

00:49:08.560 --> 00:49:08.900
Right?

00:49:08.900 --> 00:49:10.540
I want to do it.

00:49:10.540 --> 00:49:13.610
So I'm asking what's the cost
per additional year of

00:49:13.610 --> 00:49:21.430
schooling from conditional cash
transfer, from making it

00:49:21.430 --> 00:49:25.140
cheaper to go to school by
giving free school uniforms,

00:49:25.140 --> 00:49:27.420
or providing school meals.

00:49:27.420 --> 00:49:29.260
There are many different things
I could do that will

00:49:29.260 --> 00:49:31.530
encourage children to
come to school.

00:49:31.530 --> 00:49:33.250
But I know I want children
to come to school.

00:49:33.250 --> 00:49:35.690
I'm not questioning that goal.

00:49:35.690 --> 00:49:37.100
So I just want to
know the cost of

00:49:37.100 --> 00:49:38.980
getting a child in school.

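NOTE Editor's sketch, not part of the lecture. The cost-effectiveness question described here
amounts to dividing a program's cost by the extra years of schooling it induces, then comparing
programs on that ratio. The program names and figures are hypothetical placeholders, not the
numbers from the lecture's chart.
```python
def cost_per_additional_year(total_cost, additional_years):
    """Cost-effectiveness: dollars spent per extra year of schooling induced."""
    if additional_years <= 0:
        raise ValueError("program induced no additional schooling")
    return total_cost / additional_years
# Hypothetical inputs: (total cost in dollars, extra school-years induced).
programs = {
    "program_a": (10_000.0, 50.0),
    "program_b": (10_000.0, 400.0),
}
# Cheapest cost per additional year first.
ranked = sorted(programs, key=lambda name: cost_per_additional_year(*programs[name]))
```
Ranking on this ratio takes the goal (more schooling) as given. It does not say whether the goal itself is worth the money.
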
00:49:38.980 --> 00:49:42.610
Cost-benefit kind of
squishes it all.

00:49:42.610 --> 00:49:46.200
And it really asks the question,
is it worth getting

00:49:46.200 --> 00:49:47.050
kids in school?

00:49:47.050 --> 00:49:49.990
Because then you can say, if I
get kids in school, they will

00:49:49.990 --> 00:49:54.000
earn more and that will
generate income.

00:49:54.000 --> 00:49:56.700
So if I put a dollar in, am
I going to get more than a

00:49:56.700 --> 00:49:57.950
dollar out at the end?

00:50:01.860 --> 00:50:04.080
I'm not going to flick all
the way back to it.

00:50:04.080 --> 00:50:07.800
But if you remember that chart
that went through the process

00:50:07.800 --> 00:50:10.920
and impact, and then the final
thing of high test scores was

00:50:10.920 --> 00:50:17.910
higher income, ultimately am I
getting more money out of it

00:50:17.910 --> 00:50:19.160
than I'm putting in?

00:50:22.340 --> 00:50:27.870
I think that's sort of a
philosophical decision for the

00:50:27.870 --> 00:50:29.460
organization to make.

00:50:32.800 --> 00:50:35.690
It's very convincing to be able
to say, for every dollar

00:50:35.690 --> 00:50:39.760
we put in, I think for the
deworming case that you've got

00:50:39.760 --> 00:50:44.760
and you do later in the
week, they do both,

00:50:44.760 --> 00:50:46.520
cost-effectiveness
and cost-benefit.

00:50:46.520 --> 00:50:49.630
And the cost-effectiveness
says, this is the most

00:50:49.630 --> 00:50:52.230
cost-effective way to get
children in school.

00:50:52.230 --> 00:50:59.210
But they also then go further
and say, assuming that these

00:50:59.210 --> 00:51:02.680
studies that look at whether children
in school in Kenya earn higher

00:51:02.680 --> 00:51:07.220
incomes are correct, then given
how much it costs to get

00:51:07.220 --> 00:51:09.420
an additional year of schooling,
and given an

00:51:09.420 --> 00:51:13.440
assumption about how much extra
kids will earn in the future

00:51:13.440 --> 00:51:16.370
because they went to school,
then for every dollar we put

00:51:16.370 --> 00:51:20.520
in, I think it's you
get $30 back.

00:51:20.520 --> 00:51:23.570
So you kind of have to make an
awful lot more assumptions.

00:51:23.570 --> 00:51:25.600
You have to go to that
final thing and put

00:51:25.600 --> 00:51:26.910
everything on income.

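NOTE Editor's sketch, not part of the lecture. Cost-benefit goes one step further than
cost-effectiveness: it converts the outcome into dollars using an assumed income gain per extra
year of schooling, then asks whether a dollar in returns more than a dollar out. Both inputs
below are hypothetical, and the income gain is exactly the kind of assumption the speaker warns about.
```python
def benefit_cost_ratio(cost_per_year, income_gain_per_year):
    """Dollars of assumed future income per dollar spent on the program."""
    if cost_per_year <= 0:
        raise ValueError("cost per additional year must be positive")
    return income_gain_per_year / cost_per_year
# Hypothetical: $5 buys one extra school-year, assumed to be worth $100 of future income.
ratio = benefit_cost_ratio(5.0, 100.0)
# The cost-benefit test: more than a dollar out per dollar in.
worth_it = ratio > 1.0
```
The ratio is only as reliable as the assumed income gain, which is why the result depends on far more assumptions than the cost-effectiveness figure.
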
00:51:26.910 --> 00:51:33.150
Now if I was doing the women's
empowerment study, then I'm

00:51:33.150 --> 00:51:36.720
not sure that I would want
to reduce women's

00:51:36.720 --> 00:51:38.650
empowerment to dollars.

00:51:38.650 --> 00:51:41.050
I might just care about it.

00:51:41.050 --> 00:51:46.220
I might care that women are more
empowered whether or not

00:51:46.220 --> 00:51:49.510
it actually leads to
higher incomes.

00:51:49.510 --> 00:51:54.270
So it kind of depends on the
argument that you're making.

00:51:54.270 --> 00:51:59.440
If you want to try and make a
this is really worth it, this

00:51:59.440 --> 00:52:02.100
is a great program, not just
because it's more effective

00:52:02.100 --> 00:52:05.870
than another program, but that
it generates more income than

00:52:05.870 --> 00:52:07.260
I'm putting in.

00:52:07.260 --> 00:52:08.400
That's a great motivation.

00:52:08.400 --> 00:52:11.940
But I wouldn't say you always
have to reduce it to dollars.

00:52:11.940 --> 00:52:15.370
Because you have to make an
awful lot of assumptions.

00:52:15.370 --> 00:52:19.400
And we don't necessarily
always want to reduce

00:52:19.400 --> 00:52:21.215
everything to dollars.

00:52:24.600 --> 00:52:25.830
So here it is.

00:52:25.830 --> 00:52:27.850
We've just been talking
about it.

00:52:27.850 --> 00:52:29.770
So this is a cost-effectiveness.

00:52:29.770 --> 00:52:34.410
So this is the cost per
additional year

00:52:34.410 --> 00:52:35.320
of schooling induced.

00:52:35.320 --> 00:52:39.250
We're not linking it back to
dollars we're measuring.

00:52:39.250 --> 00:52:42.370
We're just assuming that
we want kids in school.

00:52:42.370 --> 00:52:45.980
Millennium Development Goals
have it as a goal.

00:52:45.980 --> 00:52:49.260
We just think it's a good
thing, whether or not it

00:52:49.260 --> 00:52:52.430
generates income.

00:52:52.430 --> 00:52:56.510
What we did is take all the
randomized impact evaluations

00:52:56.510 --> 00:53:02.450
that had as an outcome getting
more children in school and

00:53:02.450 --> 00:53:07.420
calculated the cost per
additional year of schooling

00:53:07.420 --> 00:53:09.640
that resulted.

00:53:09.640 --> 00:53:12.330
So you see a very wide range
of different things.

00:53:12.330 --> 00:53:18.190
Now conditional cash transfers
turn out to be by far the most

00:53:18.190 --> 00:53:23.730
expensive way of getting an
additional year of schooling.

00:53:23.730 --> 00:53:27.020
Now that's partly because mainly
they're done in Latin

00:53:27.020 --> 00:53:31.490
America where enrollment rates
are already very high.

00:53:31.490 --> 00:53:36.370
So it's often more expensive to
get the last kid in school

00:53:36.370 --> 00:53:41.930
than the 50th percentile
kid in school.

00:53:41.930 --> 00:53:46.010
And then the other thing, of
course, in general, things

00:53:46.010 --> 00:53:49.290
cost more in Mexico than in
Kenya, especially when you're

00:53:49.290 --> 00:53:50.950
talking about people.

00:53:50.950 --> 00:53:54.380
Teachers' wages or
wages outside of

00:53:54.380 --> 00:53:55.630
school are more expensive.

00:53:58.650 --> 00:54:03.860
But the thing that was amazing
was that providing children

00:54:03.860 --> 00:54:07.680
with deworming tablets
was just unbelievably

00:54:07.680 --> 00:54:08.450
cost-effective.

00:54:08.450 --> 00:54:14.850
So $3.50 for an additional year
of schooling induced.

00:54:14.850 --> 00:54:18.970
And putting it this way I think
really brought out that

00:54:18.970 --> 00:54:20.000
difference.

00:54:20.000 --> 00:54:22.985
The other thing I should say
in comparing this is, there

00:54:22.985 --> 00:54:26.030
were other benefits
to these programs.

00:54:26.030 --> 00:54:31.510
So Progresa actually gave
people cash as well.

00:54:31.510 --> 00:54:33.700
So it wasn't just about getting
kids in school.

00:54:33.700 --> 00:54:35.690
So of course it was
expensive, right?

00:54:35.690 --> 00:54:39.280
And we haven't calculated
in those costs.

00:54:39.280 --> 00:54:41.730
In cost-benefit, if we reduced
everything to dollars, it

00:54:41.730 --> 00:54:44.845
would look very different
because you've got a value of

00:54:44.845 --> 00:54:46.095
all these other benefits.

00:54:48.810 --> 00:54:50.810
But again, deworming
had other benefits.

00:54:50.810 --> 00:54:53.490
It had health benefits as well
as education benefits.

00:54:53.490 --> 00:54:57.498
So we're just looking at one
measure of outcomes here.

00:54:57.498 --> 00:54:59.210
AUDIENCE: Excuse me.

00:54:59.210 --> 00:54:59.740
RACHEL GLENNERSTER: Yeah?

00:54:59.740 --> 00:54:59.900
AUDIENCE: Are these being
adjusted for Purchasing Power

00:54:59.900 --> 00:55:01.150
Parity, PPP?

00:55:02.650 --> 00:55:05.220
RACHEL GLENNERSTER: So
this is not PPP.

00:55:05.220 --> 00:55:06.820
This is absolute.

00:55:06.820 --> 00:55:09.560
So again, we've sort
of debated it

00:55:09.560 --> 00:55:10.530
backwards and forwards.

00:55:10.530 --> 00:55:15.280
So if you're a country, you
care more about PPP.

00:55:15.280 --> 00:55:17.240
But if you're a donor and you're
wondering whether to

00:55:17.240 --> 00:55:22.510
send a dollar or a pound to
Mexico or Kenya, you don't

00:55:22.510 --> 00:55:23.410
care about PPP.

00:55:23.410 --> 00:55:26.980
You care about where your dollar
is going to get most

00:55:26.980 --> 00:55:28.190
kids in school.

00:55:28.190 --> 00:55:31.650
So there's different ways
of thinking about it.

00:55:31.650 --> 00:55:34.120
It sort of depends on the
question you're asking and

00:55:34.120 --> 00:55:34.910
who's the person.

00:55:34.910 --> 00:55:39.820
For a donor, I think this
is the relevant way.

00:55:39.820 --> 00:55:42.440
If you're a donor who only cares
about getting kids in

00:55:42.440 --> 00:55:45.260
school, this is what
you care about.

00:55:45.260 --> 00:55:52.030
We also can redo this taking
out the transfers.

00:55:52.030 --> 00:55:54.030
There's this other benefit,
to the families,

00:55:54.030 --> 00:55:55.560
of getting the money.

00:55:55.560 --> 00:55:59.440
So this is the cost
to a donor.

00:55:59.440 --> 00:56:00.940
So that's one way of
presenting it.

00:56:00.940 --> 00:56:02.580
But you can present it
in other ways too.

00:56:02.580 --> 00:56:05.442
AUDIENCE: Can you also sometimes
do a cost-benefit of

00:56:05.442 --> 00:56:07.500
the evaluation itself?

00:56:07.500 --> 00:56:14.340
RACHEL GLENNERSTER: That's kind
of hard to do because the

00:56:14.340 --> 00:56:16.650
benefits may come
ten years later.

00:56:25.900 --> 00:56:28.850
The way to think about that is
to think about who's going to

00:56:28.850 --> 00:56:33.040
use it, and only do it if you
think it's going to actually

00:56:33.040 --> 00:56:37.210
have some benefits in terms of
being used and not just maybe

00:56:37.210 --> 00:56:38.330
within the organization.

00:56:38.330 --> 00:56:41.430
But if it's expensive,
is it going to be

00:56:41.430 --> 00:56:43.060
useful for other people?

00:56:43.060 --> 00:56:45.920
Is it answering a general
question that lots of people

00:56:45.920 --> 00:56:47.520
will find useful?

00:56:47.520 --> 00:56:50.460
So often evaluations are
expensive in the context of a

00:56:50.460 --> 00:56:51.800
particular program.

00:56:51.800 --> 00:56:54.120
But they're answering a question
that lots of other

00:56:54.120 --> 00:56:55.450
people will benefit from.

00:56:55.450 --> 00:57:00.930
So the Progresa evaluation has
spurred not just the expansion

00:57:00.930 --> 00:57:06.210
of Progresa in Mexico, but it
has spurred it in many other

00:57:06.210 --> 00:57:10.300
countries as well because it
did prove very effective.

00:57:10.300 --> 00:57:13.560
Although it's slightly less
cost-effective in these terms.

00:57:13.560 --> 00:57:18.410
But it led to an awful
lot of learning in

00:57:18.410 --> 00:57:19.590
many, many other countries.

00:57:19.590 --> 00:57:23.390
So, I think, in that sense, it
was an extremely effective

00:57:23.390 --> 00:57:24.960
program evaluation.

00:57:24.960 --> 00:57:28.302
AUDIENCE: Excuse me, I just have
a question on that very

00:57:28.302 --> 00:57:28.650
last item there.

00:57:28.650 --> 00:57:32.890
RACHEL GLENNERSTER: OK, so this
one is even cheaper and

00:57:32.890 --> 00:57:35.990
it's a relatively new result.

00:57:35.990 --> 00:57:39.660
But it only works in certain
circumstances.

00:57:39.660 --> 00:57:44.120
When people don't know the
benefits of staying on in

00:57:44.120 --> 00:57:50.170
school, i.e., how much higher
wages they're going to get if

00:57:50.170 --> 00:57:54.140
they have a primary education,
then telling them that

00:57:54.140 --> 00:57:58.130
information is very cheap.

00:57:58.130 --> 00:58:03.270
And both in the Dominican
Republic and Madagascar--

00:58:03.270 --> 00:58:07.580
so two completely different
contexts, different rates of

00:58:07.580 --> 00:58:11.930
staying on in school, different
continents, very

00:58:11.930 --> 00:58:13.770
different schooling systems--

00:58:13.770 --> 00:58:19.430
in both cases it was extremely
effective at increasing the

00:58:19.430 --> 00:58:21.040
number of kids staying
on in school.

00:58:26.950 --> 00:58:30.300
But that only works if people
are underestimating the

00:58:30.300 --> 00:58:31.880
returns of staying
on in school.

00:58:31.880 --> 00:58:34.760
If they're overestimating them,
then it would reduce

00:58:34.760 --> 00:58:44.070
staying on in school or if they
know already, then it's

00:58:44.070 --> 00:58:45.260
not going to be effective.

00:58:45.260 --> 00:58:48.960
So this is something that I
think is a very interesting

00:58:48.960 --> 00:58:53.010
thing to do, and again,
is worth doing.

00:58:53.010 --> 00:58:57.130
But you need to first go in and
test whether people know

00:58:57.130 --> 00:59:00.600
what the benefits of staying
on in school are.

00:59:00.600 --> 00:59:03.530
Basically they just told them
what's the wage if you

00:59:03.530 --> 00:59:07.020
complete primary education
versus what's the wage if you

00:59:07.020 --> 00:59:09.060
don't complete primary
education.

00:59:09.060 --> 00:59:10.360
It's very cheap.

00:59:10.360 --> 00:59:15.484
So if it changes anything, it's
incredibly effective.

00:59:15.484 --> 00:59:17.355
AUDIENCE: Is the issue of
marginal returns a problem?

00:59:17.355 --> 00:59:20.040
Do you have to say that every
program is only relevant to

00:59:20.040 --> 00:59:23.580
places where it's
at the same level of

00:59:23.580 --> 00:59:25.910
enrollment or admission?

00:59:25.910 --> 00:59:34.920
RACHEL GLENNERSTER: Well this is
a sort of wider question of

00:59:34.920 --> 00:59:36.170
external validity.

00:59:39.070 --> 00:59:41.900
When we do a randomized
evaluation, we look at what's

00:59:41.900 --> 00:59:48.180
the impact of a project
in that situation.

00:59:48.180 --> 00:59:52.750
Now at least you know whether
it worked in that situation,

00:59:52.750 --> 00:59:55.200
which is better than not really
knowing whether it

00:59:55.200 --> 00:59:58.550
worked in that situation.

00:59:58.550 --> 01:00:00.540
Then you've got to make a
decision about whether you

01:00:00.540 --> 01:00:03.840
think that is useful to
another situation.

01:00:03.840 --> 01:00:06.690
A great way of doing that is
to test it in a couple of

01:00:06.690 --> 01:00:07.590
different places.

01:00:07.590 --> 01:00:11.750
So again, this was tested in two
very different situations.

01:00:11.750 --> 01:00:14.350
The deworming had very
similar effects.

01:00:17.590 --> 01:00:21.800
In rural primary schools in
Kenya, it works through

01:00:21.800 --> 01:00:22.850
reducing anemia.

01:00:22.850 --> 01:00:27.400
Reducing anemia in preschool
urban India had almost

01:00:27.400 --> 01:00:29.340
identical effects.

01:00:29.340 --> 01:00:33.730
Getting rid of worms in a
non-randomized evaluation to

01:00:33.730 --> 01:00:37.780
be sure, but kind of a really
nicely designed one in the

01:00:37.780 --> 01:00:40.280
south of the United
States had almost

01:00:40.280 --> 01:00:41.400
exactly the same effect.

01:00:41.400 --> 01:00:45.420
So they got rid of hookworm
in the 1900s.

01:00:45.420 --> 01:00:48.450
And again, it would increase
school attendance, increase

01:00:48.450 --> 01:00:54.840
test scores, and actually
increase wages just from

01:00:54.840 --> 01:00:56.310
getting rid of hookworm.

01:00:56.310 --> 01:00:59.180
And they reckoned a quite
substantial percentage.

01:00:59.180 --> 01:01:04.380
This paper by Hoyt Bleakley at
Chicago found that quite a

01:01:04.380 --> 01:01:07.840
substantial difference in the
income of the North and the

01:01:07.840 --> 01:01:12.780
South of the United States in 1900
was simply due to hookworm.

01:01:12.780 --> 01:01:14.210
So this is being tested.

01:01:14.210 --> 01:01:16.960
So ideally you test something
in very different

01:01:16.960 --> 01:01:17.410
environments.

01:01:17.410 --> 01:01:22.330
But you also think about whether
it makes sense that it

01:01:22.330 --> 01:01:23.040
replicates.

01:01:23.040 --> 01:01:31.450
So if I take the findings of the
women's empowerment study

01:01:31.450 --> 01:01:38.090
in India where it works through
local governance

01:01:38.090 --> 01:01:41.200
bodies that are quite active in
India and have quite a lot

01:01:41.200 --> 01:01:43.920
of power, and tried to replicate
in Bangladesh where

01:01:43.920 --> 01:01:49.860
there is no equivalent system,
I would worry about it.

01:01:49.860 --> 01:01:52.700
Whereas worms cause anemia
around the world.

01:01:52.700 --> 01:01:55.330
And anemia causes
you to be tired.

01:01:55.330 --> 01:01:59.890
And being tired is likely to
affect you going to school.

01:01:59.890 --> 01:02:02.530
That's something that seems
like it would replicate.

01:02:02.530 --> 01:02:04.950
So you have to think through
these things and

01:02:04.950 --> 01:02:05.920
ideally test them.

01:02:05.920 --> 01:02:08.890
If I'm doing microfinance,
would I assume it has

01:02:08.890 --> 01:02:12.620
identical effects in Africa, or
Asia, and Latin America?

01:02:12.620 --> 01:02:13.340
No.

01:02:13.340 --> 01:02:16.500
Because it's very dependent
on what are the earning

01:02:16.500 --> 01:02:19.050
opportunities in those
environments.

01:02:19.050 --> 01:02:20.590
And they're likely to
be very different.

01:02:20.590 --> 01:02:23.660
So I'd want to test it in those
different environments.

01:02:23.660 --> 01:02:25.570
We're falling a bit behind.

01:02:25.570 --> 01:02:29.930
So I promised to do when to
do an impact evaluation.

01:02:29.930 --> 01:02:33.110
So there are important
questions you need to

01:02:33.110 --> 01:02:34.360
know the answer to.

01:02:37.100 --> 01:02:40.520
So that might be because there's
a program that you do

01:02:40.520 --> 01:02:43.470
in lots of places, and you have
no idea whether it works.

01:02:43.470 --> 01:02:45.940
That would be a reason
to do one.

01:02:45.940 --> 01:02:48.310
You're very uncertain about
which strategy to

01:02:48.310 --> 01:02:49.932
use to solve a problem.

01:02:53.240 --> 01:02:56.790
Or there are key questions that
underlie a lot of your

01:02:56.790 --> 01:03:02.680
programs, for example, adding
beneficiary control, having

01:03:02.680 --> 01:03:05.115
some participatory element
to your program.

01:03:05.115 --> 01:03:07.360
It might be something that you
do in lots of different

01:03:07.360 --> 01:03:10.490
programs when you don't know
what's the best way to do it

01:03:10.490 --> 01:03:11.740
or whether it's being
effective.

01:03:14.650 --> 01:03:17.750
An opportunity to do it is when
you're rolling out a big

01:03:17.750 --> 01:03:20.000
new program.

01:03:20.000 --> 01:03:22.830
And you're going to invest an
awful lot of money in this

01:03:22.830 --> 01:03:24.295
program, you want to know
whether it works.

01:03:26.920 --> 01:03:29.690
This is a tricky one.

01:03:29.690 --> 01:03:31.940
You're developing a new
program and you

01:03:31.940 --> 01:03:33.345
want to scale it up.

01:03:33.345 --> 01:03:36.660
At what point in that process
should you do the impact

01:03:36.660 --> 01:03:37.940
evaluation?

01:03:37.940 --> 01:03:40.010
Well you don't want to do it
once you've scaled it up for

01:03:40.010 --> 01:03:40.800
everywhere.

01:03:40.800 --> 01:03:44.720
Because then you find out it
doesn't work, and you've just

01:03:44.720 --> 01:03:47.640
spent millions of dollars
scaling it up.

01:03:47.640 --> 01:03:49.900
Well that's not a good idea.

01:03:49.900 --> 01:03:53.280
On the other hand, you don't
want to do it when it's your

01:03:53.280 --> 01:03:55.070
very first designs.

01:03:55.070 --> 01:03:59.200
Because often it changes an
awful lot in the first couple

01:03:59.200 --> 01:04:02.150
of years as you're tweaking it,
and developing it, and

01:04:02.150 --> 01:04:06.860
understanding how to make
it work on the ground.

01:04:06.860 --> 01:04:09.760
So you want to wait until you've
got the basic kinks

01:04:09.760 --> 01:04:11.260
ironed out.

01:04:11.260 --> 01:04:14.950
But you want to do it before
you scale it up too far.

01:04:14.950 --> 01:04:19.550
We've done a lot of work
with this NGO in

01:04:19.550 --> 01:04:21.010
India called Pratham.

01:04:21.010 --> 01:04:23.810
And we started doing
some work for them.

01:04:23.810 --> 01:04:26.370
And by the time we finished
doing an evaluation, their

01:04:26.370 --> 01:04:27.620
program had completely
changed.

01:04:29.740 --> 01:04:31.960
So we kind of did another one.

01:04:31.960 --> 01:04:36.190
So we probably did that one
a little bit too early.

01:04:36.190 --> 01:04:40.410
But on the other hand, now
they're scaling up massively.

01:04:40.410 --> 01:04:43.980
And it would be silly to wait
until they'd done the whole of

01:04:43.980 --> 01:04:46.860
India before we evaluated it.

01:04:46.860 --> 01:04:50.160
AUDIENCE: You said it may be
more appropriate to do a

01:04:50.160 --> 01:04:54.610
process evaluation initially to
get a program to the point

01:04:54.610 --> 01:04:56.650
where it can be fully
implemented and all the kinks

01:04:56.650 --> 01:04:58.780
are worked out.

01:04:58.780 --> 01:05:01.050
RACHEL GLENNERSTER:
Yeah, exactly.

01:05:01.050 --> 01:05:06.540
If we're going back to our
textbook example again, you

01:05:06.540 --> 01:05:08.790
don't want to be doing it until
you've got your delivery

01:05:08.790 --> 01:05:12.520
system for the textbooks worked
out, and you've made

01:05:12.520 --> 01:05:15.510
sure you've got the
right textbook.

01:05:15.510 --> 01:05:19.500
It's a bit of a waste of money
until you've got those things.

01:05:19.500 --> 01:05:22.350
And exactly, a process
evaluation can tell you

01:05:22.350 --> 01:05:27.240
whether you've got those
things working.

01:05:27.240 --> 01:05:31.160
The other thing that makes it a
good time or a good program

01:05:31.160 --> 01:05:34.730
to do an impact evaluation of
is one that's representative

01:05:34.730 --> 01:05:37.480
and not gold-plated.

01:05:37.480 --> 01:05:43.410
Because if Millennium
Development Villages, $1

01:05:43.410 --> 01:05:45.960
million per village.

01:05:45.960 --> 01:05:51.140
If we find that that has an
impact on people's lives,

01:05:51.140 --> 01:05:51.930
that's great.

01:05:51.930 --> 01:05:53.270
But what do we do with that?

01:05:53.270 --> 01:05:55.960
We can't give $1 million to
every village in Africa.

01:05:55.960 --> 01:06:01.210
So it's not quite,
what's the point?

01:06:01.210 --> 01:06:05.870
But it's less useful than
testing something that you

01:06:05.870 --> 01:06:08.940
could replicate across the whole
of Africa, that you have

01:06:08.940 --> 01:06:12.170
enough money to replicate
in a big scale.

01:06:12.170 --> 01:06:17.470
So that's interesting because
you can use it more.

01:06:17.470 --> 01:06:20.560
Because if you throw everything
at a community,

01:06:20.560 --> 01:06:22.020
yes, you can probably
change things.

01:06:22.020 --> 01:06:25.680
But what are you learning
from it?

01:06:25.680 --> 01:06:28.000
So it takes time, and
expertise, and

01:06:28.000 --> 01:06:29.700
money to do it right.

01:06:29.700 --> 01:06:33.170
So it's very important to think
about when you're going

01:06:33.170 --> 01:06:38.490
to do it and designing the right
evaluation to answer the

01:06:38.490 --> 01:06:42.762
right question that you're
going to learn from.

01:06:42.762 --> 01:06:47.154
AUDIENCE: If a program hasn't
been successful, have you

01:06:47.154 --> 01:06:50.250
found that the NGOs have
abandoned that program?

01:06:50.250 --> 01:06:51.500
RACHEL GLENNERSTER:
Yes, mainly.

01:06:58.500 --> 01:07:04.450
We worked with an NGO in
Kenya that didn't work.

01:07:04.450 --> 01:07:07.140
They just moved on to
something else.

01:07:07.140 --> 01:07:11.140
Pratham, we actually did two
things, both of which worked,

01:07:11.140 --> 01:07:15.140
but one which was more
cost-effective than the other.

01:07:15.140 --> 01:07:18.660
And they dumped the computer
assisted learning even though

01:07:18.660 --> 01:07:21.570
it was like phenomenally
successful.

01:07:21.570 --> 01:07:23.600
But the other one was
even cheaper.

01:07:23.600 --> 01:07:25.270
So they really scaled that up.

01:07:25.270 --> 01:07:27.720
And they haven't really done
computer assisted learning

01:07:27.720 --> 01:07:32.240
even though it had a very big
effect on math test scores.

01:07:32.240 --> 01:07:35.240
And compared to anybody else
doing education, it was very

01:07:35.240 --> 01:07:38.600
cost-effective. But compared to
their other approach, which

01:07:38.600 --> 01:07:43.140
was even more cost-effective,
they were like, OK.

01:07:43.140 --> 01:07:45.540
We'll do the one that's
most cost-effective.

01:07:45.540 --> 01:07:47.240
Now there are some organizations
that

01:07:47.240 --> 01:07:48.845
kind of do one thing.

01:07:48.845 --> 01:07:53.080
And it's much harder for them
to stop doing that one thing

01:07:53.080 --> 01:07:54.180
if you find it doesn't work.

01:07:54.180 --> 01:07:57.010
They tend to think, well,
how can I adapt it?

01:07:57.010 --> 01:08:01.420
But these organizations that do
many things are often very

01:08:01.420 --> 01:08:03.550
happy to, OK, that
didn't work.

01:08:03.550 --> 01:08:05.280
We'll go this direction.

01:08:10.250 --> 01:08:13.580
So we want to develop an
evaluation strategy to help us

01:08:13.580 --> 01:08:18.600
prioritize what evaluations
to do when.

01:08:18.600 --> 01:08:21.580
So the first thing to do is step
back and ask, what are

01:08:21.580 --> 01:08:25.580
the key questions for
your organization?

01:08:25.580 --> 01:08:28.870
What are the things that I
really, really need to know?

01:08:28.870 --> 01:08:34.300
What are the things that would
make me be more successful,

01:08:34.300 --> 01:08:36.880
that I'm spending lots of money
on but I don't know the

01:08:36.880 --> 01:08:43.279
answer, or some of these more
fundamental questions, as they

01:08:43.279 --> 01:08:48.359
say, about how do I get
beneficiary control across my

01:08:48.359 --> 01:08:49.609
different programs.

01:08:53.849 --> 01:08:57.500
The other key thing is you're
not going to be able to answer

01:08:57.500 --> 01:09:01.890
all of them by your own
impact evaluations.

01:09:01.890 --> 01:09:04.910
And as they say, it's expensive
to do them.

01:09:04.910 --> 01:09:08.319
So the first thing to do is to
go out and see if somebody

01:09:08.319 --> 01:09:11.859
else has done a really good
impact evaluation that's

01:09:11.859 --> 01:09:15.750
relevant to you to answer
your questions already.

01:09:15.750 --> 01:09:19.580
Or half answer or more gives you
the hypotheses to look at.

01:09:23.939 --> 01:09:26.350
How many can I answer just
from improved process?

01:09:26.350 --> 01:09:31.330
Because if my problems are about
logistics, and getting

01:09:31.330 --> 01:09:35.390
things to people, and getting
cooperation from people, then

01:09:35.390 --> 01:09:38.740
I can get that from process
evaluation.

01:09:38.740 --> 01:09:41.584
So from that you can select your
top priority questions

01:09:41.584 --> 01:09:45.200
for an impact evaluation
and establish a plan

01:09:45.200 --> 01:09:46.210
for answering them.

01:09:46.210 --> 01:09:50.479
So then you've got to look for
opportunities where you can

01:09:50.479 --> 01:09:54.109
develop an impact evaluation
that will enable you to answer

01:09:54.109 --> 01:09:55.960
those questions.

01:09:55.960 --> 01:10:02.320
So am I rolling out a new
program in a new area?

01:10:02.320 --> 01:10:04.670
And I can do an impact
evaluation there.

01:10:04.670 --> 01:10:07.590
Or you might even want to
say, I want to set up an

01:10:07.590 --> 01:10:08.840
experimental site.

01:10:11.350 --> 01:10:14.260
I don't really know whether to
go this way or that way.

01:10:14.260 --> 01:10:17.010
So I'm just going to
take a place and

01:10:17.010 --> 01:10:19.720
try different things.

01:10:19.720 --> 01:10:23.070
And it's not going to be really
part of my general

01:10:23.070 --> 01:10:27.550
rollout. But I'm going to focus
in on the questions.

01:10:27.550 --> 01:10:29.460
Should I be charging
for this or not?

01:10:29.460 --> 01:10:30.750
How much should I charge?

01:10:30.750 --> 01:10:33.690
Or how should I present
this to people?

01:10:33.690 --> 01:10:36.750
And you can take a site and
kind of try a bunch of

01:10:36.750 --> 01:10:39.880
different things against each
other, figure out your design,

01:10:39.880 --> 01:10:43.280
really hone it down, and
then roll that out.

01:10:43.280 --> 01:10:47.500
So those are two kinds of
different options of thinking

01:10:47.500 --> 01:10:49.930
about how to do it.

01:10:49.930 --> 01:10:52.950
And then, when you've got those
key questions of your

01:10:52.950 --> 01:11:00.320
impact, you can combine that
with process evaluations to

01:11:00.320 --> 01:11:01.500
get your global impact.

01:11:01.500 --> 01:11:02.750
What do I mean by that?

01:11:05.460 --> 01:11:08.410
Let's go back to our
textbook example.

01:11:08.410 --> 01:11:15.970
If you're giving out textbooks
across many states or

01:11:15.970 --> 01:11:18.760
throughout the country,
you've evaluated it

01:11:18.760 --> 01:11:21.150
carefully in one region.

01:11:21.150 --> 01:11:23.740
And you find that the impact
on test scores

01:11:23.740 --> 01:11:26.530
is whatever it is.

01:11:26.530 --> 01:11:29.460
And then you know very
carefully, and maybe you've

01:11:29.460 --> 01:11:33.040
tested it in two different
locations in the country and

01:11:33.040 --> 01:11:34.790
you've got very similar
results.

01:11:34.790 --> 01:11:38.040
So then you can say, well I know
that every time I give a

01:11:38.040 --> 01:11:41.680
textbook, I get this impact
on test scores.

01:11:41.680 --> 01:11:44.780
Then from the process
evaluation, you know how many

01:11:44.780 --> 01:11:47.850
textbooks are getting in
the hands of kids.

01:11:47.850 --> 01:11:52.840
Then you can combine the two,
multiply up your impact

01:11:52.840 --> 01:11:56.300
numbers by the number of
textbooks you give out.

01:11:56.300 --> 01:12:03.167
Malaria control with bed nets,
if I hand out this many bed

01:12:03.167 --> 01:12:08.150
nets, then I'm saving
this many lives.

01:12:08.150 --> 01:12:10.890
I've done that through a careful
impact evaluation.

01:12:10.890 --> 01:12:13.505
And then all I need to do is
just count the number of bed

01:12:13.505 --> 01:12:15.640
nets that are getting
to people and I

01:12:15.640 --> 01:12:17.520
know my overall impact.

01:12:17.520 --> 01:12:20.100
So that's a way that you
can combine the two.

01:12:20.100 --> 01:12:23.320
You don't have to do an impact
evaluation for every single

01:12:23.320 --> 01:12:24.360
bed net you hand out.

01:12:24.360 --> 01:12:30.280
Because you've really got the
underlying evaluation impact

01:12:30.280 --> 01:12:34.456
model, and you can
extrapolate.
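
NOTE
A minimal sketch of the scale-up arithmetic described above, not part of the lecture itself. It assumes the per-unit effect measured in the impact evaluation holds wherever the program is rolled out. The function name and every number are hypothetical, chosen only for illustration.
```python
def overall_impact(effect_per_unit: float, units_delivered: int) -> float:
    """Scale a per-unit effect (from the impact evaluation) by the number
    of units that actually reached people (from the process evaluation)."""
    if units_delivered < 0:
        raise ValueError("units_delivered must be non-negative")
    return effect_per_unit * units_delivered
# Hypothetical figures: 0.002 lives saved per bed net, 50,000 nets delivered.
lives_saved = overall_impact(0.002, 50_000)
print(round(lives_saved))
```
The product is only as reliable as its two inputs: the effect estimate comes from the careful impact evaluation, the delivery count from routine process monitoring.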

01:12:34.456 --> 01:12:35.392
AUDIENCE: Rachel?

01:12:35.392 --> 01:12:35.860
RACHEL GLENNERSTER: Yeah?

01:12:35.860 --> 01:12:39.750
AUDIENCE: Do you think in the
beginning when you've got a

01:12:39.750 --> 01:12:41.620
program that you're interested
in, do you think that's the

01:12:41.620 --> 01:12:46.370
moment to think about the size
of the impact that you're

01:12:46.370 --> 01:12:49.010
looking at that people expect?

01:12:49.010 --> 01:12:53.830
And also, as part of that,
what's going to be the

01:12:53.830 --> 01:12:56.580
audience, the ultimate audience
that you're trying to

01:12:56.580 --> 01:13:00.056
get to if you're successful
with a scale-up.

01:13:00.056 --> 01:13:01.710
And those two things, I think,
frequently come together.

01:13:01.710 --> 01:13:05.013
Because it's the scaling up
process where people are going

01:13:05.013 --> 01:13:07.328
to start to look at those
cost-effectiveness measures

01:13:07.328 --> 01:13:08.720
and cost-benefit.

01:13:08.720 --> 01:13:10.790
RACHEL GLENNERSTER: I mean, I
would argue that you've always

01:13:10.790 --> 01:13:14.750
got to be thinking about your
ultimate plans for scaling it

01:13:14.750 --> 01:13:17.680
up when you're designing
the project.

01:13:17.680 --> 01:13:20.950
Because you design a project
very differently if you're

01:13:20.950 --> 01:13:25.180
just trying to treat a small
area than if you're thinking

01:13:25.180 --> 01:13:27.960
about, if I get this right,
I want to do it on

01:13:27.960 --> 01:13:29.560
a much wider area.

01:13:29.560 --> 01:13:33.290
If you've always got that in
mind, you're thinking a lot

01:13:33.290 --> 01:13:35.050
about is this scalable?

01:13:35.050 --> 01:13:40.430
Am I using a resource that is
either money or expertise that

01:13:40.430 --> 01:13:44.420
is in very short supply, in
which case there's no point in

01:13:44.420 --> 01:13:48.340
designing it this way because
I won't be able to scale it

01:13:48.340 --> 01:13:51.570
beyond this small study area.

01:13:51.570 --> 01:13:55.760
So if that's your ultimate
objective, you need to be

01:13:55.760 --> 01:14:00.620
putting that into the impact
evaluation from the moment.

01:14:00.620 --> 01:14:03.530
Because there's no point in
doing the impact evaluation,

01:14:03.530 --> 01:14:08.750
very resource-intensive
project, and

01:14:08.750 --> 01:14:10.010
say, well, that works.

01:14:10.010 --> 01:14:12.380
But I can't do that
everywhere.

01:14:12.380 --> 01:14:14.610
Well then what have
you learned?

01:14:14.610 --> 01:14:17.210
You want to be testing the thing
that ultimately you're

01:14:17.210 --> 01:14:19.960
going to be able to
bring everywhere.

01:14:19.960 --> 01:14:25.120
So in a lot of our cases, we're
encouraging our partners

01:14:25.120 --> 01:14:26.100
to scale it back.

01:14:26.100 --> 01:14:30.260
Because you won't be able to
do this on a big scale.

01:14:30.260 --> 01:14:33.850
So scale it back to what you
would actually be doing if

01:14:33.850 --> 01:14:36.540
you're trying to do the
whole of India or the

01:14:36.540 --> 01:14:38.990
whole of this state.

01:14:38.990 --> 01:14:42.350
Because that's what's
useful to learn.

01:14:42.350 --> 01:14:45.290
And you want to be able
to sell to someone

01:14:45.290 --> 01:14:47.810
to finance the scale-up.

01:14:47.810 --> 01:14:53.410
So I think having those ideas in
your mind at the beginning

01:14:53.410 --> 01:14:56.720
is very important, and as I
say, making it into a

01:14:56.720 --> 01:14:59.730
strategy, not a project by
project evaluation, but

01:14:59.730 --> 01:15:03.080
thinking about where do I want
to go as an organization.

01:15:03.080 --> 01:15:08.320
What's the evidence I need to
get there, and then designing

01:15:08.320 --> 01:15:10.340
the impact evaluations to
get you that evidence.

01:15:13.010 --> 01:15:18.660
And people often ask me about
how do you make sure that

01:15:18.660 --> 01:15:25.440
people use the evidence from
impact evaluations.

01:15:25.440 --> 01:15:29.600
And I think the main
answer to that is

01:15:29.600 --> 01:15:32.180
ask the right question.

01:15:32.180 --> 01:15:35.670
Because it's not about
browbeating people to make

01:15:35.670 --> 01:15:37.600
them read studies afterwards.

01:15:37.600 --> 01:15:40.920
If you find the answer to an
interesting question, it'll

01:15:40.920 --> 01:15:42.350
take off like wildfire.

01:15:42.350 --> 01:15:43.770
It will be used.

01:15:43.770 --> 01:15:46.835
But if you answer a stupid
question, then nobody is going

01:15:46.835 --> 01:15:49.730
to want to read your results.

01:15:49.730 --> 01:15:52.480
So we're learning from an impact
evaluation, so learning

01:15:52.480 --> 01:15:56.240
from in a single study did the
program work in this context?

01:15:56.240 --> 01:15:59.890
Should we expand it to
a similar population?

01:15:59.890 --> 01:16:02.600
Learning from an accumulation
of studies, which is what we

01:16:02.600 --> 01:16:06.230
want to get to eventually, is
did the same program work in a

01:16:06.230 --> 01:16:09.720
range of different contexts,
India, Kenya, south of the

01:16:09.720 --> 01:16:12.250
United States?

01:16:12.250 --> 01:16:16.580
And that's incredibly valuable
because then your learning is

01:16:16.580 --> 01:16:20.900
much wider and you can take
it to many more places.

01:16:20.900 --> 01:16:25.010
Did some variation in the same
program work differently, i.e.,

01:16:25.010 --> 01:16:28.560
take one program and try
different variants of it and

01:16:28.560 --> 01:16:32.900
test it out so that we know
how to design it.

01:16:32.900 --> 01:16:35.480
Did this same mechanism
seem to be present

01:16:35.480 --> 01:16:37.750
in different areas?

01:16:37.750 --> 01:16:41.550
So there's a lot of studies
looking at the impact of user

01:16:41.550 --> 01:16:43.890
fees in education and health.

01:16:43.890 --> 01:16:46.270
You seem to get some very
similar results.

01:16:46.270 --> 01:16:48.600
And again, that's even
more useful.

01:16:48.600 --> 01:16:51.340
Because then you're not just
talking about moving deworming

01:16:51.340 --> 01:16:52.140
to another country.

01:16:52.140 --> 01:16:54.800
You're talking about
user fees.

01:16:54.800 --> 01:16:57.540
What have we learned about
user fees across a lot of

01:16:57.540 --> 01:16:58.440
different sectors?

01:16:58.440 --> 01:17:01.560
There's some common
understandings and learnings

01:17:01.560 --> 01:17:03.370
to take to even a sector
that we may not

01:17:03.370 --> 01:17:07.460
have studied before.

01:17:07.460 --> 01:17:11.200
And then, as I say, putting
these learnings in place,

01:17:11.200 --> 01:17:16.160
in filling in an overall
strategy of what were my gaps

01:17:16.160 --> 01:17:16.710
in knowledge?

01:17:16.710 --> 01:17:18.190
And am I slowly filling
them in?

01:17:21.142 --> 01:17:23.610
So, I think that's it.

01:17:23.610 --> 01:17:26.030
So I'm sorry the last bit
was a little bit rushed.

01:17:28.900 --> 01:17:32.660
The idea was to kind of
motivate why we're

01:17:32.660 --> 01:17:35.050
doing all of this.

01:17:35.050 --> 01:17:39.210
Today you're going to
be in your groups.

01:17:39.210 --> 01:17:41.750
The task for your groups today,
as well as doing the

01:17:41.750 --> 01:17:48.860
case, is to decide on a question
for an evaluation

01:17:48.860 --> 01:17:52.330
that you're going to design
over the next five days.

01:17:52.330 --> 01:17:56.010
So hopefully that's made you
think about what's an

01:17:56.010 --> 01:17:58.120
interesting question.

01:17:58.120 --> 01:17:59.440
What should we be testing?

01:17:59.440 --> 01:18:04.240
Because I think often an
overlooked element of

01:18:04.240 --> 01:18:10.340
designing an evaluation is
what's the question that we

01:18:10.340 --> 01:18:12.180
want to be answering with
this evaluation?

01:18:12.180 --> 01:18:14.010
Is it a useful question?

01:18:14.010 --> 01:18:16.570
How am I going to use it?

01:18:16.570 --> 01:18:20.730
What's it going to tell me for
making my program, my whole

01:18:20.730 --> 01:18:23.440
organization more effective
in the future?

01:18:23.440 --> 01:18:26.138
So any questions?

01:18:26.138 --> 01:18:28.826
AUDIENCE: What would you say
are some of the main

01:18:28.826 --> 01:18:29.580
limitations of randomization?

01:18:29.580 --> 01:18:32.340
So I assume one of them is
extrapolating to populations

01:18:32.340 --> 01:18:33.390
that are different?

01:18:33.390 --> 01:18:36.260
Are there other main ones
that you can think of?

01:18:36.260 --> 01:18:43.020
RACHEL GLENNERSTER: So it's
important to distinguish when

01:18:43.020 --> 01:18:44.810
we talk about limitations.

01:18:44.810 --> 01:18:50.460
One is just general, what's
the limitation to say,

01:18:50.460 --> 01:18:52.720
extrapolating beyond?

01:18:52.720 --> 01:18:54.980
But the other thing is to think
of it in the context of

01:18:54.980 --> 01:18:59.010
what's the limitation versus
other mechanisms?

01:18:59.010 --> 01:19:01.980
Because, for example, the
extrapolating to other

01:19:01.980 --> 01:19:05.800
populations is not really a
limitation of randomized

01:19:05.800 --> 01:19:08.610
evaluations compared to
any other impact.

01:19:08.610 --> 01:19:13.160
Any impact evaluation is done
on a particular population.

01:19:13.160 --> 01:19:16.510
And so there's always a question
as to whether it

01:19:16.510 --> 01:19:18.035
generalizes to another
population.

01:19:22.760 --> 01:19:26.800
And the way to deal with that is
to design it in a way that

01:19:26.800 --> 01:19:29.500
you learn as much as you
possibly can about the

01:19:29.500 --> 01:19:33.440
mechanisms, about the routes
through which it worked.

01:19:33.440 --> 01:19:35.710
And then you can ask yourself
when you bring it to another

01:19:35.710 --> 01:19:40.800
population, do those routes
seem like they might be

01:19:40.800 --> 01:19:43.870
applicable, or is there
an obvious gap?

01:19:43.870 --> 01:19:49.640
This worked through the
local organization.

01:19:49.640 --> 01:19:52.300
But if that organization isn't
there, is there another

01:19:52.300 --> 01:19:58.190
organization that it could work
through there or not?

01:19:58.190 --> 01:20:01.820
If you think that deworming
works through the mechanism of

01:20:01.820 --> 01:20:05.080
anemia, well it works through
the mechanism of there being

01:20:05.080 --> 01:20:09.290
worms and of anemia, you
can go out and test.

01:20:09.290 --> 01:20:10.880
Are there worms in the area?

01:20:10.880 --> 01:20:13.070
Is the population anemic? Because
there may be worms and

01:20:13.070 --> 01:20:13.970
they're not anemic.

01:20:13.970 --> 01:20:21.020
So that's a way to design the
evaluation to limit that

01:20:21.020 --> 01:20:24.740
limitation or reduce the problem
of that limitation.

01:20:24.740 --> 01:20:27.800
But it's not like the very
act of flipping a coin and

01:20:27.800 --> 01:20:31.670
randomizing causes the
problem of not being

01:20:31.670 --> 01:20:32.330
able to extend it.

01:20:32.330 --> 01:20:35.070
It's true of any impact
evaluation.

01:20:35.070 --> 01:20:40.290
I think one limitation which
you will find in your

01:20:40.290 --> 01:20:43.650
frustration as you want to try
and answer every single

01:20:43.650 --> 01:20:46.910
question that you have, and you
get into the mechanics of

01:20:46.910 --> 01:20:50.900
sample size and how much
sample size do I need--

01:20:50.900 --> 01:20:54.690
and again, that's not
necessarily just of a

01:20:54.690 --> 01:20:58.620
randomized evaluation, but any
quantitative evaluation--

01:20:58.620 --> 01:21:03.060
you can test a limited
number of hypotheses.

01:21:03.060 --> 01:21:08.040
And every hypothesis you want
to test needs more sample.

01:21:08.040 --> 01:21:12.830
And so the number of questions
you can answer very rigorously

01:21:12.830 --> 01:21:14.300
is limited.
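
NOTE
A rough sketch of why each extra hypothesis costs sample, not taken from the lecture. It assumes a simple two-sided comparison of means under the normal approximation, individual-level randomization with no clustering, and one extra treatment group per hypothesis. The function names, effect size, and standard deviation are hypothetical.
```python
from statistics import NormalDist
import math
def n_per_arm(effect: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size per arm for a two-sided comparison of two means,
    using the normal approximation n = 2 * (z_a + z_b)^2 * sd^2 / effect^2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_a + z_b) ** 2 * sd ** 2 / effect ** 2)
def total_sample(treatment_arms: int, effect: float, sd: float) -> int:
    """One comparison group plus one group per treatment variant,
    each sized to detect `effect` against the comparison group."""
    return (treatment_arms + 1) * n_per_arm(effect, sd)
# Total sample needed as the number of program variants under test grows.
for arms in (1, 2, 3):
    print(arms, total_sample(arms, effect=0.2, sd=1.0))
```
Under these assumptions the total grows with every variant added, and halving the effect you want to detect roughly quadruples the sample in each arm.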

01:21:14.300 --> 01:21:18.460
And I think that's the
limitation that we often find

01:21:18.460 --> 01:21:19.090
very binding.

01:21:19.090 --> 01:21:22.280
Again, any rigorous quantitative
evaluation will

01:21:22.280 --> 01:21:24.860
have that limitation.

01:21:24.860 --> 01:21:30.020
We'll talk a lot tomorrow
about sometimes

01:21:30.020 --> 01:21:33.190
you just can't randomize.

01:21:33.190 --> 01:21:36.980
Freedom of the press is not
something that you can

01:21:36.980 --> 01:21:39.570
randomize except by country.

01:21:39.570 --> 01:21:42.160
And then we'd need every
country in the world.

01:21:42.160 --> 01:21:44.970
It's just not going to happen.

01:21:44.970 --> 01:21:48.510
So we'll look at a lot of new
techniques or different

01:21:48.510 --> 01:21:51.670
techniques that you can use to
bring randomization to areas

01:21:51.670 --> 01:21:55.703
where you think it would be
impossible to bring it to.

01:21:58.960 --> 01:22:02.400
Compared to other quantitative
evaluations, you sometimes

01:22:02.400 --> 01:22:06.680
have political constraints about
where you can randomize.

01:22:06.680 --> 01:22:11.880
But as I say, quantitative
versus qualitative, the

01:22:11.880 --> 01:22:16.910
qualitative isn't so limited
by sample size constraints.

01:22:16.910 --> 01:22:19.950
And you're not so limited
to answer very specific

01:22:19.950 --> 01:22:22.780
hypotheses.

01:22:22.780 --> 01:22:25.800
The flip side is you don't
answer any specific

01:22:25.800 --> 01:22:27.120
hypotheses

01:22:27.120 --> 01:22:28.720
in the same
rigorous way.

01:22:28.720 --> 01:22:33.240
But it's much more open.

01:22:33.240 --> 01:22:36.280
So very often what we do is we
combine a qualitative and

01:22:36.280 --> 01:22:38.430
quantitative, and spend a lot
of time doing qualitative

01:22:38.430 --> 01:22:43.450
before to hone our hypotheses,
and then use a randomized

01:22:43.450 --> 01:22:45.980
impact evaluation to test
those specific hypotheses.

01:22:45.980 --> 01:22:49.710
But if you sit in your office
and design your hypotheses

01:22:49.710 --> 01:22:54.190
without any going out into the
field, you will almost

01:22:54.190 --> 01:22:57.160
certainly waste your money
because you won't have asked

01:22:57.160 --> 01:22:58.830
the right question.

01:22:58.830 --> 01:22:59.880
You won't have designed it.

01:22:59.880 --> 01:23:04.110
So you need some element of
qualitative to make sure some

01:23:04.110 --> 01:23:06.580
needs assessment, some work on
the ground to make sure that

01:23:06.580 --> 01:23:09.240
you are asking, you're designing
your hypotheses

01:23:09.240 --> 01:23:11.045
correctly because you've
only got a few shots.

01:23:13.685 --> 01:23:14.627
Yeah?

01:23:14.627 --> 01:23:16.982
AUDIENCE: I was wondering,
do you know of some good

01:23:16.982 --> 01:23:20.230
evaluation or randomized
impact evaluation on

01:23:20.230 --> 01:23:21.810
conservation programs?

01:23:21.810 --> 01:23:25.445
RACHEL GLENNERSTER: On
conservation programs?

01:23:25.445 --> 01:23:30.732
I can't think of any, I'm
afraid, but eminently doable.

01:23:30.732 --> 01:23:35.140
But we can talk about that if
you can persuade your group to

01:23:35.140 --> 01:23:37.840
think about designing one.

01:23:37.840 --> 01:23:41.610
Anyone else think of a
conservation program?

01:23:45.020 --> 01:23:46.505
Yes?

01:23:46.505 --> 01:23:48.980
AUDIENCE: I don't
have an example.

01:23:48.980 --> 01:23:52.180
I wish I did.

01:23:52.180 --> 01:23:54.036
And you've mentioned this.

01:23:54.036 --> 01:23:56.671
I just need to really underline
it for myself.

01:23:56.671 --> 01:24:00.280
A lot of the programs that
my organization does are

01:24:00.280 --> 01:24:01.880
comprehensive in nature.

01:24:01.880 --> 01:24:06.150
So they have lots of different
elements meant to in the end,

01:24:06.150 --> 01:24:09.340
collectively,
[UNINTELLIGIBLE PHRASE].

01:24:09.340 --> 01:24:14.120
What I'm understanding here is
that you could do an impact

01:24:14.120 --> 01:24:16.010
evaluation of all of
those collectively.

01:24:16.010 --> 01:24:20.650
But really it would be more
useful to pull them out and

01:24:20.650 --> 01:24:23.040
look at the different
interventions

01:24:23.040 --> 01:24:25.055
side by side or something.

01:24:25.055 --> 01:24:28.930
Because that way you'll
get a more targeted--

01:24:28.930 --> 01:24:32.060
RACHEL GLENNERSTER: It's true.

01:24:32.060 --> 01:24:35.730
The question was, if you have a
big package of programs that

01:24:35.730 --> 01:24:39.180
does lots of things, you can do
an evaluation of the whole

01:24:39.180 --> 01:24:41.950
package and see whether
it works as a package.

01:24:41.950 --> 01:24:47.030
But in terms of learning about
how you should design future

01:24:47.030 --> 01:24:51.880
programs, you would probably
learn more by trying to tease

01:24:51.880 --> 01:24:55.820
out, take one away, or
try them separately.

01:24:55.820 --> 01:24:58.030
Because there might be elements
of the package that

01:24:58.030 --> 01:25:04.030
are very expensive but are not
generating as much benefit as

01:25:04.030 --> 01:25:04.940
they cost.

01:25:04.940 --> 01:25:09.180
And you would get more effect by
doing a smaller package in

01:25:09.180 --> 01:25:10.940
more places.

01:25:10.940 --> 01:25:14.420
You don't know unless you
take the package apart.

01:25:14.420 --> 01:25:18.130
Now then if you test each one
individually, that's a very

01:25:18.130 --> 01:25:19.250
expensive process.

01:25:19.250 --> 01:25:21.440
Because it needs a lot of
sample size to test each

01:25:21.440 --> 01:25:22.320
individually.

01:25:22.320 --> 01:25:24.720
There's also a very interesting
hypothesis that's

01:25:24.720 --> 01:25:26.720
true in lots of different
areas.

01:25:26.720 --> 01:25:30.370
People often feel, well, there
are lots of barriers, so we

01:25:30.370 --> 01:25:33.940
have to attack all of them.

01:25:33.940 --> 01:25:35.250
It only makes sense.

01:25:35.250 --> 01:25:37.700
You won't get any movement
unless you do.

01:25:37.700 --> 01:25:40.390
There are lots of things
stopping kids going to school.

01:25:40.390 --> 01:25:44.950
There's stopping, say, girls
going to school.

01:25:44.950 --> 01:25:47.980
They're needed at home.

01:25:47.980 --> 01:25:50.040
There are attitudes.

01:25:50.040 --> 01:25:52.270
There is their own health.

01:25:52.270 --> 01:25:54.120
There's maybe they
are sick a lot.

01:25:54.120 --> 01:25:56.950
So we have to address all
of those if we're

01:25:56.950 --> 01:25:59.870
going to have an impact.

01:25:59.870 --> 01:26:03.010
We don't know, the answer is.

01:26:03.010 --> 01:26:06.680
And indeed, in that example
where we're working with Save

01:26:06.680 --> 01:26:09.270
the Children in Bangladesh,
they had this

01:26:09.270 --> 01:26:10.500
comprehensive approach.

01:26:10.500 --> 01:26:11.550
Well, there are all
these problems.

01:26:11.550 --> 01:26:13.960
So let's tackle them all.

01:26:13.960 --> 01:26:18.290
We convinced them to divide it
up a bit, and test different

01:26:18.290 --> 01:26:24.850
things, and see whether some on their
own worked, or whether you

01:26:24.850 --> 01:26:27.170
needed to do all of them
together before you changed

01:26:27.170 --> 01:26:31.910
anything, which is a perfectly
possible hypothesis and one

01:26:31.910 --> 01:26:37.050
that a lot of people have, but
hasn't really been tested.

01:26:37.050 --> 01:26:39.820
The idea that you've got to get
over a critical threshold.

01:26:39.820 --> 01:26:41.010
And you've got to
build up to it.

01:26:41.010 --> 01:26:43.560
And only once you're over there
do you see any movement.

01:26:43.560 --> 01:26:46.070
Well actually on girls going
to school, it's quite

01:26:46.070 --> 01:26:46.880
interesting.

01:26:46.880 --> 01:26:49.360
Most of the evaluations that
have looked at, just

01:26:49.360 --> 01:26:54.345
generally, improving attendance
at school, have had

01:26:54.345 --> 01:26:57.920
their biggest impact on girls.

01:26:57.920 --> 01:27:02.420
I should say most of those were
not done in the toughest

01:27:02.420 --> 01:27:05.890
environments for girls
going to school.

01:27:05.890 --> 01:27:10.510
They're not in Afghanistan
or somewhere where it's

01:27:10.510 --> 01:27:12.630
particularly difficult.

01:27:12.630 --> 01:27:14.390
But it is interesting.

01:27:14.390 --> 01:27:19.990
Just general things and
approaches in Africa and India

01:27:19.990 --> 01:27:23.250
have had their biggest impacts
on girls, which suggests that

01:27:23.250 --> 01:27:26.000
you've got to hit every
possible thing

01:27:26.000 --> 01:27:29.280
is maybe not right.

01:27:29.280 --> 01:27:31.160
Yeah?

01:27:31.160 --> 01:27:33.040
AUDIENCE: [INAUDIBLE PHRASE]

01:27:33.040 --> 01:27:36.569
and the political constraints
[INAUDIBLE PHRASE].

01:27:48.910 --> 01:27:49.230
RACHEL GLENNERSTER: Right.

01:27:49.230 --> 01:27:55.820
So we'll talk actually tomorrow
quite a lot about the

01:27:55.820 --> 01:28:01.900
politics of introducing an
evaluation or at least the

01:28:01.900 --> 01:28:04.590
different ways that you can
introduce randomization to

01:28:04.590 --> 01:28:07.010
make it more politically
acceptable.

01:28:07.010 --> 01:28:13.980
That's slightly different from
whether the senior political

01:28:13.980 --> 01:28:17.440
figures want to know whether
the program works or are

01:28:17.440 --> 01:28:20.080
willing to fund an evaluation.

01:28:23.530 --> 01:28:26.870
I've actually been amazingly
surprised.

01:28:26.870 --> 01:28:28.530
Obviously we find that
some places.

01:28:32.500 --> 01:28:34.250
There are certain partners
or people we've

01:28:34.250 --> 01:28:35.780
started talking with.

01:28:35.780 --> 01:28:41.640
And you can see the moment the
penny drops that they're not

01:28:41.640 --> 01:28:42.900
going to have any control.

01:28:42.900 --> 01:28:46.100
Because you're going to do
a treatment comparison.

01:28:46.100 --> 01:28:47.080
You're going to stand back.

01:28:47.080 --> 01:28:49.870
At the end of the day the
result's going to be that.

01:28:49.870 --> 01:28:52.800
There's no fiddling with it,
which is one of the beauties

01:28:52.800 --> 01:28:54.070
of the design.

01:28:54.070 --> 01:28:58.360
But it will be what it will be,
which is kind of why it's

01:28:58.360 --> 01:28:58.980
convincing.

01:28:58.980 --> 01:29:02.840
But there are certain groups who
kind of figure that out.

01:29:02.840 --> 01:29:07.610
And they run for the exit
because there's going to be an

01:29:07.610 --> 01:29:13.030
MIT stamp of approval evaluation
potentially saying

01:29:13.030 --> 01:29:14.280
their program doesn't work.

01:29:19.450 --> 01:29:20.320
That's life.

01:29:20.320 --> 01:29:23.720
Some people don't
want to know.

01:29:23.720 --> 01:29:28.100
The best thing I can say in
that situation is test

01:29:28.100 --> 01:29:29.800
alternatives.

01:29:29.800 --> 01:29:33.440
It's much less threatening
to test alternatives.

01:29:33.440 --> 01:29:36.440
Because there's always some
alternative of this versus

01:29:36.440 --> 01:29:38.330
that, that people don't know.

01:29:38.330 --> 01:29:42.020
And then you're not raising
the does it work.

01:29:42.020 --> 01:29:45.010
You're saying well, does this
work better than that?

01:29:45.010 --> 01:29:47.570
And that is much less
threatening.

01:29:47.570 --> 01:29:50.280
It doesn't tell you
quite as much.

01:29:50.280 --> 01:29:52.080
But it's much less
threatening.

01:30:02.910 --> 01:30:05.520
There's a report called When
Will We Ever Learn, looking at

01:30:05.520 --> 01:30:09.430
the politics of why don't we
have more impact evaluations,

01:30:09.430 --> 01:30:11.760
which was very pessimistic.

01:30:11.760 --> 01:30:14.710
But if you look at somewhere
like the World Bank that just

01:30:14.710 --> 01:30:18.720
put a purse of money for
doing randomized impact

01:30:18.720 --> 01:30:19.900
evaluations out there.

01:30:19.900 --> 01:30:22.190
And anybody in the
bank could apply.

01:30:22.190 --> 01:30:23.000
And people were like why?

01:30:23.000 --> 01:30:26.290
There's no incentives
for them to do it.

01:30:26.290 --> 01:30:27.710
Program officers, they've
already got a

01:30:27.710 --> 01:30:28.560
lot on their plate.

01:30:28.560 --> 01:30:31.180
Why would they add doing this?

01:30:31.180 --> 01:30:34.240
It's going to find out that
they're opening themselves to

01:30:34.240 --> 01:30:38.040
all these risks because maybe
their program doesn't work.

01:30:38.040 --> 01:30:41.650
Massively oversubscribed, first
year, six times more

01:30:41.650 --> 01:30:43.940
applicants than there
was money.

01:30:43.940 --> 01:30:46.130
It just came out of the woodwork
as soon as there was

01:30:46.130 --> 01:30:48.200
some money to do it.

01:30:48.200 --> 01:30:50.510
So I'm not saying every
organization is like that.

01:30:50.510 --> 01:30:52.520
Obviously not everybody in
the bank did that.

01:30:52.520 --> 01:30:56.560
But it was, to me, actually
quite surprising how many

01:30:56.560 --> 01:30:59.330
people were willing
to come forward.

01:30:59.330 --> 01:31:02.560
Now we have the luxury of
working with the willing,

01:31:02.560 --> 01:31:06.850
which if you're working within
an organization, you don't

01:31:06.850 --> 01:31:10.390
necessarily have that luxury.

01:31:10.390 --> 01:31:13.410
You will see as you get into the
details of these things,

01:31:13.410 --> 01:31:18.380
that you need absolutely full
cooperation and complete

01:31:18.380 --> 01:31:20.950
dedication on the part of the
practitioners who were doing

01:31:20.950 --> 01:31:24.610
these evaluations alongside
the evaluators.

01:31:24.610 --> 01:31:27.570
You can't do this with
a partner who

01:31:27.570 --> 01:31:28.745
doesn't want to be evaluated.

01:31:28.745 --> 01:31:30.050
It just doesn't work.

01:31:30.050 --> 01:31:35.350
They are so able to throw monkey
wrenches in there if

01:31:35.350 --> 01:31:36.870
they don't want to find
out the answer.

01:31:36.870 --> 01:31:41.150
Then it's just not worth
doing it because it's a

01:31:41.150 --> 01:31:43.230
partnership like that.

01:31:43.230 --> 01:31:47.260
It's not someone coming along
afterwards and interviewing.

01:31:47.260 --> 01:31:51.010
It is the practitioners and the
evaluators working hand in

01:31:51.010 --> 01:31:52.950
hand throughout the
whole process.

01:31:52.950 --> 01:31:55.760
And therefore if the
practitioners don't want to be

01:31:55.760 --> 01:32:01.280
evaluated, there's not a hope
in hell of getting a result.

01:32:01.280 --> 01:32:02.830
We should wrap up.

01:32:02.830 --> 01:32:06.900
A lot of these things we're
going to talk about.

01:32:06.900 --> 01:32:07.980
But I'll take one more.

01:32:07.980 --> 01:32:08.470
Yeah?

01:32:08.470 --> 01:32:11.842
AUDIENCE: How important, or
how relevant is it, or how

01:32:11.842 --> 01:32:15.300
much skepticism can there be
about a case where the

01:32:15.300 --> 01:32:17.100
evaluators and the practitioners
work for the

01:32:17.100 --> 01:32:20.300
same people or are funded
by the same people?

01:32:20.300 --> 01:32:22.395
RACHEL GLENNERSTER: Yeah, we've
even got practitioners

01:32:22.395 --> 01:32:27.570
as co-authors on our studies.

01:32:27.570 --> 01:32:31.500
This is another place where I
kind of part company from the

01:32:31.500 --> 01:32:39.570
classic evaluation guidelines,
which say that it's very

01:32:39.570 --> 01:32:41.660
important to be independent.

01:32:41.660 --> 01:32:45.780
I'd argue if your methodology is
independent, what you want

01:32:45.780 --> 01:32:46.400
is not independence.

01:32:46.400 --> 01:32:49.000
You want objectivity.

01:32:49.000 --> 01:32:52.300
And the methodology of a
randomized evaluation can

01:32:52.300 --> 01:32:53.800
provide you the objectivity.

01:32:53.800 --> 01:32:55.685
And therefore you don't have to
worry about independence.

01:32:58.200 --> 01:33:02.820
Now there's one caveat
to that.

01:33:02.820 --> 01:33:06.540
The beauty of the design is
you set it up, as I say.

01:33:06.540 --> 01:33:08.650
Well you don't stand
back in the sense.

01:33:08.650 --> 01:33:11.770
You've got to manage all your
threats and things.

01:33:11.770 --> 01:33:15.830
But you can't fiddle very
much with it at the end.

01:33:15.830 --> 01:33:18.680
The one exception
to that is that

01:33:18.680 --> 01:33:21.010
you can look at subgroups.

01:33:21.010 --> 01:33:28.550
So there was an evaluation in
the UK of a welfare program.

01:33:28.550 --> 01:33:30.240
And it was a randomized
evaluation.

01:33:32.790 --> 01:33:34.050
And there was some
complaining.

01:33:34.050 --> 01:33:36.950
Because at the end, they went
through and looked at every

01:33:36.950 --> 01:33:39.240
ethnic minority.

01:33:39.240 --> 01:33:43.200
And then, you know, I can't
remember whether it did work

01:33:43.200 --> 01:33:43.820
in general.

01:33:43.820 --> 01:33:49.060
But it didn't work for one
minority, or it didn't work.

01:33:49.060 --> 01:33:53.280
But anyway, you can find one
subgroup for whom the result

01:33:53.280 --> 01:33:54.220
was flipped.

01:33:54.220 --> 01:33:56.690
And that was the thing on the
front page of the newspapers,

01:33:56.690 --> 01:33:59.570
rather than the overall
effect.

01:33:59.570 --> 01:34:05.280
So there's a way to deal with
that, which is increasingly

01:34:05.280 --> 01:34:11.200
being stressed by people who are
kind of looking over the

01:34:11.200 --> 01:34:15.670
shoulder and making sure that
what is done in randomized

01:34:15.670 --> 01:34:19.230
evaluations is done properly,
which is to say that you need

01:34:19.230 --> 01:34:21.495
to set out in advance--

01:34:21.495 --> 01:34:24.190
we'll talk about this a bit
later on-- but you need to set

01:34:24.190 --> 01:34:27.260
out in advance what you're
going to do.

01:34:27.260 --> 01:34:32.910
So if you want to look at a
subgroup like does it affect

01:34:32.910 --> 01:34:35.860
the lowest performing kids in
the school differently from

01:34:35.860 --> 01:34:38.130
the highest performing kids--

01:34:38.130 --> 01:34:40.820
do I care most about the
lowest performing kid--

01:34:40.820 --> 01:34:43.570
if you want to do that, you need
to say you're going to do

01:34:43.570 --> 01:34:46.170
that before you actually
look at the numbers.

01:34:46.170 --> 01:34:49.490
Because even with a randomized
evaluation, you can data mine

01:34:49.490 --> 01:34:52.660
to some extent.

01:34:52.660 --> 01:34:54.980
Well if I look at least ten
kids, does it work for them?

01:34:54.980 --> 01:34:58.050
If I look at those ten kids,
does it work for them?

01:34:58.050 --> 01:35:02.590
Statistically you will be able
to find some subset of your

01:35:02.590 --> 01:35:05.500
sample for whom it does work.

01:35:05.500 --> 01:35:09.640
So you can't just keep trying
100 different subgroups.

01:35:09.640 --> 01:35:12.550
Because eventually it will
work for one of them.
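
NOTE Editorial sketch, not part of the lecture. The point about trying 100 subgroups
can be illustrated with a small simulation. It assumes a 5% significance threshold
(|z| above 1.96), independent subgroups of 50 treated and 50 control units each, and a
true effect of exactly zero. All function names and parameters are hypothetical.
```python
import random
import statistics
def subgroup_z(treated, control):
    # Two-sample z statistic: difference in means divided by its standard error.
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return (statistics.mean(treated) - statistics.mean(control)) / se
def count_false_positives(n_subgroups=100, n_per_arm=50, seed=0):
    # Count subgroups that look "significant" even though the program does nothing.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_subgroups):
        # Both arms are drawn from the same distribution, so the true effect is zero.
        treated = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        control = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        if abs(subgroup_z(treated, control)) > 1.96:
            hits += 1
    return hits
def chance_of_any_false_positive(n_subgroups, alpha=0.05):
    # Probability that at least one of n independent tests passes by luck alone.
    return 1 - (1 - alpha) ** n_subgroups
print(count_false_positives())
print(round(chance_of_any_false_positive(100), 3))  # 0.994
```
Under these assumptions, about 5 subgroups in 100 would be expected to pass by chance,
which is why a subgroup of interest has to be named before looking at the data.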

01:35:12.550 --> 01:35:14.660
So on the whole, you
need to look at the

01:35:14.660 --> 01:35:15.990
main average effect.

01:35:15.990 --> 01:35:19.290
What's the average effect
for the whole sample?

01:35:19.290 --> 01:35:23.090
If you are particularly
interested in a special group

01:35:23.090 --> 01:35:26.690
within the whole sample, you
need to say that before you

01:35:26.690 --> 01:35:28.940
start looking at the data.

01:35:28.940 --> 01:35:33.690
So that's the only way in which
you get to fiddle with

01:35:33.690 --> 01:35:34.860
the results.

01:35:34.860 --> 01:35:39.000
And otherwise it provides an
enormous amount of objectivity

01:35:39.000 --> 01:35:40.320
in the methodology.

01:35:40.320 --> 01:35:43.080
And therefore, you don't have
to worry so much about a

01:35:43.080 --> 01:35:45.680
Chinese wall between the
evaluators and the

01:35:45.680 --> 01:35:49.130
practitioners, which, I think,
is incredibly important.

01:35:49.130 --> 01:35:53.540
Because we couldn't do the work
that we do if we had that

01:35:53.540 --> 01:35:54.160
Chinese wall.

01:35:54.160 --> 01:35:57.920
It just wouldn't make sense,
doing your theory of change,

01:35:57.920 --> 01:36:02.120
finding out how it's working,
designing it so it asks the

01:36:02.120 --> 01:36:03.390
right questions.

01:36:03.390 --> 01:36:06.890
None of that would be possible
if you had a wall between you.

01:36:06.890 --> 01:36:09.150
So it just wouldn't be anything
like as useful.

01:36:09.150 --> 01:36:12.970
So getting your objectivity from
the methodology allows

01:36:12.970 --> 01:36:17.300
you to be very integrated
with the evaluation, and

01:36:17.300 --> 01:36:18.720
practitioners to be
very integrated.