WEBVTT

00:00:04.490 --> 00:00:09.210
Now it's time to evaluate our
models on the testing set.

00:00:09.210 --> 00:00:11.570
So the first model we're
going to want to look at

00:00:11.570 --> 00:00:14.530
is that smart baseline model
that basically just took

00:00:14.530 --> 00:00:17.880
a look at the polling results
from the Rasmussen poll

00:00:17.880 --> 00:00:19.630
and used those to
determine who was

00:00:19.630 --> 00:00:22.060
predicted to win the election.

00:00:22.060 --> 00:00:24.850
So it's very easy to
compute the outcome

00:00:24.850 --> 00:00:27.980
for this simple baseline
on the testing set.

00:00:27.980 --> 00:00:31.620
We're going to want to table
the testing set outcome

00:00:31.620 --> 00:00:35.260
variable, Republican,
and we're going

00:00:35.260 --> 00:00:38.090
to compare that against
the prediction

00:00:38.090 --> 00:00:40.110
of the smart baseline,
which as you recall

00:00:40.110 --> 00:00:44.490
would be the sign of the testing
set's Rasmussen variable.
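
NOTE
A minimal sketch of the baseline table described above, assuming a data frame named Test with columns Republican and Rasmussen. The six rows are invented for illustration and are not the lecture's data.
```r
# Hypothetical toy stand-in for the lecture's Test data frame (values invented).
Test <- data.frame(
  Republican = c(0, 0, 1, 1, 0, 1),
  Rasmussen  = c(-5, 2, 7, 0, -1, 3)
)
# sign() maps a negative margin to -1 (Democrat), zero to 0 (inconclusive),
# and a positive margin to 1 (Republican).
baseline <- sign(Test$Rasmussen)
# Rows are the actual outcome, columns are the baseline's call.
baseline_table <- table(Test$Republican, baseline)
print(baseline_table)
```
Reading the table: row 0 with column 1 counts states the baseline called Republican that the Democrat actually won.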

00:00:47.770 --> 00:00:51.180
And we can see that
for these results,

00:00:51.180 --> 00:00:54.890
there are 18 times where
the smart baseline predicted

00:00:54.890 --> 00:00:57.320
that the Democrat would
win and was correct,

00:00:57.320 --> 00:01:00.880
21 where it predicted
the Republican would win

00:01:00.880 --> 00:01:04.870
and was correct, two times
when it was inconclusive,

00:01:04.870 --> 00:01:07.430
and four times where
it predicted Republican

00:01:07.430 --> 00:01:09.650
but the Democrat actually won.

00:01:09.650 --> 00:01:13.050
So that's four mistakes and
two inconclusive results

00:01:13.050 --> 00:01:14.539
on the testing set.

00:01:14.539 --> 00:01:15.920
So this is going
to be what we're

00:01:15.920 --> 00:01:20.490
going to compare our logistic
regression-based model against.

00:01:20.490 --> 00:01:25.110
So we need to obtain final
testing set predictions

00:01:25.110 --> 00:01:25.930
from our model.

00:01:25.930 --> 00:01:30.030
So we selected mod2, which
was the two-variable model.

00:01:30.030 --> 00:01:35.400
So we'll say, TestPrediction
is equal to the predict

00:01:35.400 --> 00:01:37.500
of that model that we selected.

00:01:37.500 --> 00:01:40.450
Now, since we're actually
making testing set predictions,

00:01:40.450 --> 00:01:44.490
we'll pass in newdata
= Test, and again,

00:01:44.490 --> 00:01:47.250
since we want probabilities
to be returned,

00:01:47.250 --> 00:01:48.750
we're going to pass
type="response".
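
NOTE
A rough sketch of the predict call described above. The training data, the test data, and the predictor names Var1 and Var2 are hypothetical placeholders, and the mod2 fitted here is only a stand-in for the lecture's two-variable model.
```r
# Hypothetical toy training data (values invented); Var1 and Var2 are placeholders.
Train <- data.frame(
  Republican = c(0, 0, 0, 1, 0, 1, 1, 1, 0, 1),
  Var1 = c(-8, -5, -3, -2, 1, 2, -1, 6, 3, 9),
  Var2 = c(-4, -6, 1, -1, -2, 3, 2, 5, -3, 4)
)
# Stand-in for the lecture's mod2: a logistic regression on two predictors.
mod2 <- glm(Republican ~ Var1 + Var2, data = Train, family = binomial)
# Hypothetical test observations with the same predictor columns.
Test <- data.frame(Var1 = c(-4, 0, 5), Var2 = c(-2, 1, 3))
# newdata = Test scores the held-out rows; type = "response" returns probabilities
# rather than values on the log-odds scale.
TestPrediction <- predict(mod2, newdata = Test, type = "response")
print(TestPrediction)
```
Without type = "response", predict on a glm returns the linear predictor, so the 0.5 cutoff used later would not apply directly.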

00:01:54.280 --> 00:01:58.070
And the moment of truth,
we're finally going to table

00:01:58.070 --> 00:02:04.840
the test set Republican value
against the test prediction

00:02:04.840 --> 00:02:09.419
being greater than or equal to
0.5, at least a 50% probability

00:02:09.419 --> 00:02:11.910
of the Republican winning.

00:02:11.910 --> 00:02:15.840
And we see that for this
particular case, in all but one

00:02:15.840 --> 00:02:20.050
of the 45 observations in the
testing set, we're correct.
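
NOTE
A small sketch of tabling outcomes against a 0.5 probability cutoff, as described above. The outcomes and probabilities are invented for illustration and do not reproduce the lecture's 45 test observations.
```r
# Hypothetical outcomes and predicted probabilities (invented for illustration).
Republican <- c(0, 0, 1, 1, 0, 1)
TestPrediction <- c(0.10, 0.62, 0.91, 0.50, 0.03, 0.77)
# Rows are the actual outcome, columns are whether we predicted Republican.
confusion <- table(Republican, TestPrediction >= 0.5)
print(confusion)
# Accuracy is the share of observations on the correct diagonal.
accuracy <- (confusion["0", "FALSE"] + confusion["1", "TRUE"]) / length(Republican)
print(accuracy)
```
Note that a probability of exactly 0.5 counts as a Republican prediction because the comparison is greater than or equal to.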

00:02:20.050 --> 00:02:22.370
Now, we could have tried
changing this threshold

00:02:22.370 --> 00:02:27.250
from 0.5 to other values and
computed an ROC curve,

00:02:27.250 --> 00:02:29.710
but that doesn't quite make
as much sense in this setting

00:02:29.710 --> 00:02:31.640
where we're just trying
to accurately predict

00:02:31.640 --> 00:02:34.440
the outcome of each state
and we don't care more

00:02:34.440 --> 00:02:37.040
about one sort of error--
when we predicted Republican

00:02:37.040 --> 00:02:39.510
and it was actually
Democrat-- than the other,

00:02:39.510 --> 00:02:42.540
where we predicted Democrat
and it was actually Republican.

00:02:42.540 --> 00:02:45.450
So in this particular
case, we feel OK just

00:02:45.450 --> 00:02:51.170
using the cutoff of 0.5
to evaluate our model.

00:02:51.170 --> 00:02:54.010
So let's take a look now
at the mistake we made

00:02:54.010 --> 00:02:56.850
and see if we can
understand what's going on.

00:02:56.850 --> 00:03:00.140
So to actually pull out
the mistake we made,

00:03:00.140 --> 00:03:03.950
we can just take a
subset of the testing set

00:03:03.950 --> 00:03:07.120
and limit it to when
we predicted Republican,

00:03:07.120 --> 00:03:09.100
but actually the
Democrat won, which

00:03:09.100 --> 00:03:11.930
is the case where
our one mistake occurred.

00:03:11.930 --> 00:03:15.880
So this would be
when TestPrediction

00:03:15.880 --> 00:03:21.700
is greater than or equal to 0.5,
and it was not a Republican.

00:03:21.700 --> 00:03:26.980
So Republican was equal to zero.
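
NOTE
A minimal sketch of pulling out the misclassified rows with subset, assuming a Test data frame and a TestPrediction vector of matching length. The state labels and values are hypothetical.
```r
# Hypothetical test rows and predicted probabilities (invented for illustration).
Test <- data.frame(State = c("A", "B", "C", "D"), Republican = c(0, 1, 0, 1))
TestPrediction <- c(0.2, 0.9, 0.7, 0.6)
# Keep rows where we predicted Republican (probability at least 0.5)
# but the Republican did not win (Republican equal to 0).
mistakes <- subset(Test, TestPrediction >= 0.5 & Republican == 0)
print(mistakes)
```
In this toy data only state C satisfies both conditions, so the subset has a single row.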

00:03:26.980 --> 00:03:29.150
So here is that
subset, which just

00:03:29.150 --> 00:03:32.170
has one observation since
we made just one mistake.

00:03:32.170 --> 00:03:34.840
So this was for the year
2012, the testing set year.

00:03:34.840 --> 00:03:36.930
This was the state of Florida.

00:03:36.930 --> 00:03:39.470
And looking through these
predictor variables,

00:03:39.470 --> 00:03:42.070
we see why we made the mistake.

00:03:42.070 --> 00:03:45.520
The Rasmussen poll gave the
Republican a two percentage

00:03:45.520 --> 00:03:49.000
point lead, SurveyUSA
called a tie,

00:03:49.000 --> 00:03:51.400
DiffCount said there
were six more polls that

00:03:51.400 --> 00:03:53.320
predicted Republican
than Democrat,

00:03:53.320 --> 00:03:55.110
and two-thirds of
the polls predicted

00:03:55.110 --> 00:03:56.880
the Republican was going to win.

00:03:56.880 --> 00:03:59.090
But actually in this case,
the Republican didn't win.

00:03:59.090 --> 00:04:02.280
Barack Obama won
the state of Florida

00:04:02.280 --> 00:04:04.450
in 2012 over Mitt Romney.

00:04:04.450 --> 00:04:06.990
So the models here
are not magic,

00:04:06.990 --> 00:04:11.030
and given this sort of data,
it's pretty unsurprising

00:04:11.030 --> 00:04:13.010
that our model actually
didn't get Florida

00:04:13.010 --> 00:04:15.280
correct in this case
and made the mistake.

00:04:15.280 --> 00:04:17.690
However, overall, it
seems to be outperforming

00:04:17.690 --> 00:04:19.740
the smart baseline
that we selected,

00:04:19.740 --> 00:04:22.380
and so we think that maybe
this would be a nice model

00:04:22.380 --> 00:04:25.150
to use for
election prediction.
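
NOTE
The comparison above can be checked with the counts quoted in the lecture: the baseline was right 18 + 21 = 39 times out of 45, and the model was right in all but one of 45. Treating the two inconclusive baseline calls as not correct is an assumption of this sketch.
```r
# Counts quoted in the lecture for the smart baseline on the testing set.
n_test <- 18 + 21 + 2 + 4          # correct Democrat, correct Republican, inconclusive, wrong
baseline_correct <- 18 + 21
# The logistic regression model was wrong on exactly one observation.
model_correct <- n_test - 1
baseline_accuracy <- baseline_correct / n_test
model_accuracy <- model_correct / n_test
print(round(c(baseline = baseline_accuracy, model = model_accuracy), 3))
```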