PROFESSOR: Today's topic is factor modeling. The subject basically exploits multivariate statistical analysis in financial markets, where our concern is using factors to model returns and variances, covariances, and correlations. With these models there are two basic cases. In the first, the factors are observable. Those can be macroeconomic factors, or fundamental factors on assets or securities, that might explain returns and covariances. In the second class of models, the factors are hidden or latent, and statistical factor models are used to specify them. In particular, there are two methodologies, factor analysis and principal components analysis, which we'll get into in some detail during the lecture.

So let's proceed to the setup for a linear factor model. We have m assets, or instruments, or indexes whose values correspond to a multivariate stochastic process we're modeling, and we have T time periods. With the factor model, we model the t-th value of the i-th object -- whether it's a stock price, futures price, or currency -- as a linear function of factors f_1 through f_k. So it's basically like a state-space model for the value of the stochastic process, as it depends on these underlying factors. The dependence is given by coefficients beta_1 through beta_k, which depend on i, the asset. So we allow each asset, say if we're thinking of stocks, to depend on the factors in different ways. If a certain underlying factor changes in value, the beta corresponds to the impact of that underlying factor. So we have common factors.
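For reference, the model just described can be written out in one equation (this restates the lecture's setup; the notation follows the slides):

```latex
% Linear factor model: asset i at time t
x_{i,t} = \alpha_i + \beta_{i,1} f_{1,t} + \cdots + \beta_{i,k} f_{k,t} + \epsilon_{i,t}
        = \alpha_i + \boldsymbol{\beta}_i' \mathbf{f}_t + \epsilon_{i,t},
\qquad i = 1,\ldots,m,\quad t = 1,\ldots,T
```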
Now these common factors f -- this is really going to be a nice model if the number of factors we're using is relatively small. So the number k of common factors is generally very, very small relative to m. If you think about modeling, say, equity asset returns in a market, there can be hundreds or thousands of securities. So in terms of modeling those returns and covariances, what we're trying to do is characterize them in terms of a modest number of underlying factors, which simplifies the problem greatly.

The vectors beta_i are termed the factor loadings of an asset, and epsilon_(i,t) is the specific factor of asset i in period t. So in factor modeling, we talk about common factors affecting the dynamics of the system, and the factors associated with the particular cases are the specific factors.

This setup is really very familiar. It just looks like the standard sort of regression model that we've seen before. So let's see how it can be set up as a set of cross-sectional regressions. Now we're going to fix the period t, the time t, and consider the m-variate variable x_t as satisfying a regression model with intercept given by the alphas. The independent-variables matrix is B, given by the coefficients of the factor loadings. And then we have the residuals epsilon_t for the m assets. The cross-sectional terminology means we're fixing time and looking across all the assets at that one fixed time, and we're trying to explain how, say, the returns of the assets vary depending upon the underlying factors.

OK, what's random in this process? Well, certainly the residual term is considered to be random. That's basically going to be assumed to be white noise with mean 0, possibly with a covariance matrix psi, and it's going to be uncorrelated across different time cross sections.
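In vector form, the cross-sectional regression at a fixed time t reads (again just restating the lecture's setup):

```latex
% Cross-sectional regression at fixed t
\mathbf{x}_t = \boldsymbol{\alpha} + B\,\mathbf{f}_t + \boldsymbol{\epsilon}_t,
\qquad
E[\boldsymbol{\epsilon}_t] = \mathbf{0},\quad
\operatorname{Cov}(\boldsymbol{\epsilon}_t) = \Psi = \operatorname{diag}(\psi_1,\ldots,\psi_m)
```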
So in this model, the realizations of the underlying factors are random variables, and the returns on the assets depend on those underlying factors. The factors are going to be assumed to have some mean, mu_f, and some covariance matrix, omega_f, whose dimension is k by k. So in terms of modeling this problem, we go from an m by m system of covariances and correlations to focusing initially on a k by k system of covariances and correlations between the underlying factors. Psi is a diagonal matrix with the specific variances of the underlying assets. So the covariance matrix of the epsilons is a diagonal matrix, and the covariance matrix of f is given by this omega_f.

With those specifications we can get the covariance for the overall m-variate stochastic process. And we have this model here for the conditional moments. Basically, the conditional expectation of the process given the underlying factors is this linear model in the underlying factors f, and the conditional covariance matrix is the psi matrix, which is diagonal.

The unconditional moments are obtained by taking expectations of these. The unconditional expectation of x is alpha plus B mu_f. The unconditional covariance of x is the expectation of the conditional covariance plus the covariance of the conditional expectation. So one of the formulas that's important to realize here is that the covariance of x_t, which is the covariance of B f_t plus epsilon_t, equals the covariance of B f_t, plus the covariance of epsilon_t, plus twice the covariance between those two terms -- but those terms are uncorrelated. And so this is equal to B times the covariance of f_t times B transpose, plus psi.
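Here is a minimal numerical sketch of that covariance identity in Python; the dimensions (m = 5 assets, k = 2 factors) and the particular matrices are made up for illustration:

```python
import numpy as np

# Illustrative sizes (made up for this sketch): m = 5 assets, k = 2 factors.
m, k = 5, 2
rng = np.random.default_rng(0)

B = rng.normal(size=(m, k))              # factor loadings, m x k
A = rng.normal(size=(k, k))
Omega_f = A @ A.T                        # factor covariance, k x k, PSD by construction
Psi = np.diag(rng.uniform(0.1, 0.5, m))  # diagonal specific-variance matrix

# Unconditional covariance implied by the factor model:
Sigma = B @ Omega_f @ B.T + Psi

assert np.allclose(Sigma, Sigma.T)            # symmetric
assert np.all(np.linalg.eigvalsh(Sigma) > 0)  # positive definite
```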
With m assets, how many parameters are in the covariance matrix if there are no constraints on the covariance matrix?

AUDIENCE: [INAUDIBLE]

PROFESSOR: How many parameters? Right. So this is sigma -- the number of parameters in sigma.

AUDIENCE: [INAUDIBLE]

PROFESSOR: m plus what?

AUDIENCE: [INAUDIBLE]

PROFESSOR: OK, this is a square matrix, m by m. So there are possibly m squared parameters, but it's symmetric, so we're double-counting off the diagonal. So it's m times m plus 1, over 2.

How many parameters do we have with the factor model? Let's call its covariance matrix sigma star. What is the number of parameters in sigma star? Well, B is an m by k matrix, so we have possibly m times k values. The covariance of f_t contributes the number of elements in the covariance matrix of f, which is k by k, so k times k plus 1, over 2. And then we have psi, which is diagonal of dimension m, contributing m more. So depending on how we structure things, we can have many, many fewer parameters in this factor model than in the unconstrained case. And we're going to see that we can actually reduce the number of parameters in the covariance matrix of f dramatically, because of flexibility in choosing those factors.
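To make the savings concrete, here is a quick count with illustrative sizes (m = 1000 assets and k = 10 factors are example numbers of mine, not from the lecture):

```python
m, k = 1000, 10  # illustrative sizes: 1000 assets, 10 factors

unconstrained = m * (m + 1) // 2             # full covariance matrix
factor_model = m * k + k * (k + 1) // 2 + m  # loadings + factor covariance + specific variances

print(unconstrained)  # 500500
print(factor_model)   # 11055
```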
Let's also look at the interpretation of the factor model as a series of time series regressions. You'll remember that when we talked about multivariate regression a few lectures ago, we discussed cross-sectional regressions and time series regressions, and looking at the collection of all the regressions in a multivariate regression setting. Here we can do the same thing. In contrast to the cross-sectional regression, where we fix time and look at all the assets, here we fix the asset i and run the regression over time for that single asset. So the values of x_i, ranging from time 1 up to time capital T, follow a regression model equal to the intercept alpha_i plus the matrix F times beta_i, where beta_i corresponds to the regression parameters in this regression -- the loadings of asset i on the k different factors.

In this setting, the covariance matrix of the epsilon_i vector is sigma_i squared times the identity. And so these are the classic Gauss-Markov assumptions for a single linear regression model.

Well, as we did previously, we can group all of these time series regressions for the m assets together. So we start off with x_i equal to alpha_i plus F beta_i plus epsilon_i, and we can consider x_1, x_2, up to x_m. So we have a T by m matrix for the m assets, and that's equal to the regression model given on the slides here. So we're able to group everything together and deal with everything at once, which computationally is convenient in fitting these models.

Let's look at the simplest example of a factor model: the single-factor model of Sharpe. We went through the capital asset pricing model, and how the excess return on a stock can be modeled as a linear regression on the excess return of the market. The regression parameter beta_i corresponds to the level of risk associated with the asset. And all we're doing in this model is, by choosing different assets, choosing different levels of risk scaled by beta_i. And the assets may have returns that vary, given by alpha_i.

The unconditional covariance matrix of the assets has this structure: it's equal to the variance of the market times beta beta transpose, plus psi. And that equation is really very simple -- it's really self-evident from what we've discussed, but let me just highlight what it is. Sigma squared beta beta transpose plus psi is equal to sigma squared times a column vector of all the betas, beta_1 down to beta_m, times its transpose, plus a diagonal matrix with the psis. So this is really a very, very simple structure for the covariance.
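Written out, with x_{i,t} and r_{M,t} denoting excess returns, Sharpe's single-index model and the covariance structure just described are:

```latex
% Sharpe's single-index model (excess returns) and implied covariance
x_{i,t} = \alpha_i + \beta_i\, r_{M,t} + \epsilon_{i,t}
\qquad\Longrightarrow\qquad
\Sigma = \sigma_M^2\, \boldsymbol{\beta}\boldsymbol{\beta}' + \Psi
```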
And if you wanted to apply this model to thousands of securities, it's basically no problem -- you can construct the covariance matrix. And if this were appropriate for a large collection of securities, then the reduction in what you're estimating is enormous. Rather than estimating every cross-correlation and covariance of all the assets, the factor model tells us what those cross-covariances are. So that's really where the power of the model comes in.

As for why this is so useful: in portfolio management, one of the key drivers of asset allocation is the covariance matrix of the assets. So if you have an effective model for the covariance, that simplifies the portfolio allocation problem, because you can specify a covariance matrix that you are confident in. And in risk management, effective approaches deal with how we anticipate what would happen if different scenarios occur in the market. The different scenarios that can occur can be associated with what happens to the underlying factors that affect the system. So we can consider risk management approaches that vary these underlying factors, and look at how that impacts the covariance matrix very directly.

Estimation of Sharpe's single-index model: we went through before how we estimate the alphas and the betas. In terms of estimating the specific variances, that comes from the simple regression as well -- basically, the sum of the squared estimated residuals divided by T minus 2. The T minus 2 gives us unbiasedness, because we have two parameters estimated per model. Then the variance of the market portfolio has a simple estimate as well. The psi hat matrix is just the diagonal of the specific variances. And then the unconditional covariance matrix is estimated by simply plugging in these parameter estimates.
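Here is a compact sketch of that estimation in Python; the function name and arguments are mine, and `x` is assumed to hold excess returns in a (T, m) array:

```python
import numpy as np

def sharpe_single_index(x, r_mkt):
    """Estimate Sharpe's single-index model.

    x     : (T, m) array of asset excess returns
    r_mkt : (T,)   array of market excess returns
    Returns (alpha, beta, psi, Sigma_hat).
    """
    T, m = x.shape
    F = np.column_stack([np.ones(T), r_mkt])      # design matrix [1, r_mkt]
    coef, *_ = np.linalg.lstsq(F, x, rcond=None)  # OLS for all m assets at once
    alpha, beta = coef[0], coef[1]

    resid = x - F @ coef
    psi = (resid ** 2).sum(axis=0) / (T - 2)      # unbiased: 2 params per asset

    sigma2_mkt = r_mkt.var(ddof=1)                # sample variance of market return
    Sigma_hat = sigma2_mkt * np.outer(beta, beta) + np.diag(psi)
    return alpha, beta, psi, Sigma_hat
```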
So this is very simple and effective if that single-factor model is appropriate. Now, needless to say, a single-factor model typically doesn't characterize the full structure of the covariances and/or the returns. And so we want to consider more general, multi-factor models. The first set of models we're going to talk about uses common factor variables that can actually be observed.

Basically, any factor that you can observe is a potential candidate for being a relevant factor in a linear factor model. The effectiveness of a potential factor is determined by fitting the model and seeing how much contribution that factor makes to the explanation of the returns and the covariance structure.

Chen, Ross, and Roll wrote a classic paper in 1986. Ross is actually here at MIT. In their paper, rather than including macroeconomic factors in the model directly, they looked at transforming those factors into surprise factors. So rather than having interest rates plugged directly into the model as a simple factor, it would be the change in interest rates -- and not only the change in interest rates, but the unanticipated change in interest rates given market information. Their paper goes through modeling different macroeconomic variables with vector autoregression models, and then using those to specify unanticipated changes in the underlying factors. And that's where the power comes in. It highlights how, when you're applying these models, it does involve some creativity to get the most bang for the buck. The idea they had of incorporating unanticipated changes was really a very good one, and it is applied quite widely now.
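As a minimal sketch of the "surprise" construction, here is the idea with a single macro series and an AR(1) forecast standing in for the full vector autoregression the paper uses (that simplification, and the names, are mine):

```python
import numpy as np

def surprise_factor(z):
    """Unanticipated change in a macro series z (length T), via an AR(1) forecast.

    The 'surprise' at time t is z_t minus its forecast from time t-1; a full
    VAR over several macro series would follow the same idea.
    """
    z = np.asarray(z, dtype=float)
    F = np.column_stack([np.ones(len(z) - 1), z[:-1]])  # regress z_t on [1, z_{t-1}]
    coef, *_ = np.linalg.lstsq(F, z[1:], rcond=None)
    forecast = F @ coef
    return z[1:] - forecast                             # AR(1) residuals = surprises
```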
Now with this setup, if one has empirical data over times 1 through capital T and wants to specify these models, one has the observations on the x_i process -- you have basically observed all the returns historically. We also, because the factors are observable, have the F matrix as a set of observed variables. So we can use those to estimate the beta_i's, and also to estimate the variances of the residual terms, with simple regression methods. So implementing these models is quite feasible, and it basically applies methods we have from before.

What this slide discusses is how we estimate the underlying parameters. We need to be a little bit careful about the Gauss-Markov assumptions. You'll remember that if we have a regression model where the residual terms are uncorrelated with constant variance, then the simple least squares estimates are the best ones. If there are unequal variances of the residuals, and maybe even covariances between them, then we need to use generalized least squares. The notes go through those computations and the formulas, which are just simple extensions of the regression theory from previous lectures.

Let me go through an example. With common factor variables that use fundamental or asset-specific attributes, there's the approach called the BARRA Approach. This is from Barr Rosenberg. Actually, I have to say, he was one of the inspirations for me to go into statistical modeling and finance. He was a professor at UC Berkeley who left academia very early to apply models and manage money. As an anecdote, his current situation is a little different -- I'll let you look that up. But anyway, he provided the BARRA Approach for factor modeling and risk analysis, which is still used extensively today.

So with common factor variables using asset-specific attributes, the factor realizations are in fact unobserved, but they are estimated in the application of these models. So let's see how that goes. Oh, OK, this slide talks about the Fama-French approach. Fama, of course, we talked about in the last lecture.
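For reference, the two estimators being contrasted, written for a generic regression y = X beta + epsilon with Cov(epsilon) = Psi:

```latex
\hat{\beta}_{OLS} = (X'X)^{-1}X'y,
\qquad
\hat{\beta}_{GLS} = (X'\Psi^{-1}X)^{-1}X'\Psi^{-1}y
```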
He got the Nobel Prize for his work in modeling asset price returns and market efficiency. Fama and French found that there are some very important factors affecting equity asset returns. Basically, returns tended to vary depending upon the size of firms. If you compare small firms with large firms, small firms tended to have returns that were more similar to each other, and large firms tended to have returns that were more similar to each other. So there's basically a big-versus-small factor operating in the market. Sometimes the market prefers small stocks, sometimes it prefers large stocks.

And similarly, there's another factor, which is value versus growth. Stocks that are considered good values are stocks which are cheap, basically, for what they have -- you're getting the stock at a discount. Value stocks can be identified by looking at the ratio of price to book equity. If that's low, then the price you're paying for the equity in the firm is low, and the stock is cheap. That compares with stocks for which the price relative to the book value is very, very high. Why are people willing to pay a lot for those stocks? Because the growth prospects of those stocks are high, and there's an anticipation that the current price reflects a projection of how much growth potential there is.

Now the Fama-French approach, for each of these factors, is to rank order all the stocks by the factor and divide them up into quintiles. So say the factor is market cap. We can consider a histogram of the market caps of all the stocks in our universe, and then divide it up into the bottom fifth, the next fifth, and so on up to the top fifth. And the Fama-French approach says: let's take an equal-weighted average of the top fifth, and basically buy that and sell the bottom fifth.
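A small sketch of that construction for one period (names are mine, and this is the rough idea rather than Fama and French's exact recipe):

```python
import numpy as np

def size_factor_return(returns, market_cap):
    """One-period Fama-French-style size factor (a sketch).

    returns    : (m,) asset returns for the period
    market_cap : (m,) market caps used to rank assets into quintiles
    Buys the small-cap quintile and sells the large-cap quintile
    (small-minus-big, matching the direction noted in the lecture).
    """
    q20, q80 = np.quantile(market_cap, [0.2, 0.8])
    small = returns[market_cap <= q20].mean()  # equal-weighted bottom fifth
    big = returns[market_cap >= q80].mean()    # equal-weighted top fifth
    return small - big
```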
So that would be the big-versus-small market factor of Fama and French. Now, if you look at their work, they actually take the bottom minus the top, because the small side has tended to outperform, so the factor's more positive values are associated more generally with positive returns. That factor can then be used to correlate with individual asset returns as well.

Now, the BARRA industry factors -- this is getting back to the BARRA Approach. The simplest way to understand the BARRA industry factor models is to consider dividing stocks up into different industry groups. We might expect that, say, oil stocks will tend to move together and have greater common variability. And that could be very different from utility stocks, which tend to be quite low-risk stocks. Utility companies are very highly regulated, and the profitability of those firms is basically overseen by the regulators: they don't want the utilities to gouge consumers and make too much profit from delivering power to customers. So utilities tend to have fairly low volatility but very consistent returns, based on levels of profitability that are reasonable from a regulatory standpoint.

With an industry factor model, what we can do is associate factor loadings that load each asset onto the industry group it's in. So we actually know the beta values for these stocks, but we don't know the underlying factor realizations. In terms of the betas, with these factors we get well-defined beta vectors and a B matrix for all the stocks. The problem then is: how do we specify the realizations of the underlying factors? Well, the realizations of the underlying factors are basically just estimated with a regression model.
And so if we have all of our assets x_i at different times t, those would follow a model given by factor realizations corresponding to the k industry groups, with known beta_(i,j) values.

For the estimation, we basically have a simple regression model where the realizations of the factor returns f_t play the role of the regression coefficients: we have the asset returns x_t, the known factor loadings B, and the unknown factor realizations f_t. Just plugging into the regression formula, if we do it very simply, we get this expression for f hat t, the simple linear regression estimate of those realizations.

Now this particular estimate of the factor realizations assumes that the components of x have the same variance -- this is like the linear regression estimates under the standard Gauss-Markov assumptions. But the epsilon_i's will vary across the different assets: different assets will have different variabilities, different specific variances. So there's actually going to be heteroscedasticity in these models, and this particular vector of industry averages should be extended to accommodate that.

So the covariance matrix of the factors can then be estimated using these estimates of the realizations, and the residual covariance matrix can be estimated as well. An initial estimate of the covariance matrix, sigma hat, is given by the known matrix B, times the sample covariance matrix of the estimated factor realizations, times B transpose, plus the diagonal matrix psi hat. And a second step in this process can incorporate information about the heteroscedasticity along the diagonal of the psis to adjust the regression estimates. So we get a refinement of the estimates that does account for the non-constant variability.
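A minimal sketch of that first-pass estimate in Python; the names are mine, and with 0/1 industry loadings the OLS formula reduces to within-industry average returns:

```python
import numpy as np

def industry_factor_realizations(x_t, industry, n_industries):
    """First-pass (OLS) estimate of BARRA industry factor realizations
    at one time t: f_hat = (B'B)^{-1} B' x_t.

    x_t      : (m,) asset returns at time t
    industry : (m,) integer industry labels in [0, n_industries)
    Assumes every industry contains at least one asset (else B'B is singular).
    """
    m = len(x_t)
    B = np.zeros((m, n_industries))
    B[np.arange(m), industry] = 1.0  # known 0/1 factor loadings
    return np.linalg.solve(B.T @ B, B.T @ x_t)
```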
Now this issue of heteroscedasticity versus homoscedasticity in estimating the regression parameters -- it may seem like a nicety of the statistical theory that we just have to check but that isn't too big a deal. But let me highlight where this issue comes up again and again. With portfolio optimization, we went through last time that for mean-variance optimization we want a weighting of assets that weights the assets by their expected returns, pre-multiplied by the inverse of the covariance matrix. So in portfolio allocation we want to allocate to assets with high return, but we penalize those with high variance. So the impact of discounting values with high variability arises in asset allocation. And of course it arises in statistical estimation: with signals with high noise, you want to normalize by the level of noise before you incorporate the impact of that variable on the particular model.

So here are just some notes about the inefficiency of estimates due to heteroscedasticity -- we can apply generalized least squares. A second bullet here is that factor realizations can be scaled to represent factor mimicking portfolios.

Now with the Fama-French factors, where we have, say, big versus small stocks or value versus growth stocks, it would be nice to know what the real value of trading that factor is. If you were to put unit weight on trading that factor, would you make money or not? Under what periods would you make money? And so the notion of factor mimicking portfolios is important. Let's go back to the specification of the factor realizations here.
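(For reference, "doing this in the proper way" below means replacing the OLS estimate f hat t = (B'B)^(-1) B' x_t with its GLS counterpart, following the GLS formula pattern above, weighting by the estimated specific variances:)

```latex
\hat{\mathbf{f}}_t = (B'\hat{\Psi}^{-1}B)^{-1}B'\hat{\Psi}^{-1}\,\mathbf{x}_t
```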
f hat t, the t-th realization of the k factors, is given by essentially the regression estimate of those factors from the realizations of the asset returns. And if we're doing this in the proper way, we'd be correcting for the heteroscedasticity.

Well, this realization of the factor returns is a weighted average, or a weighted sum, of the x_t. So we have basically f hat t equal to a matrix times x_t, where the matrix is (B'B)^(-1) B'. So our k-dimensional realizations -- this is basically k by 1 -- each of these k factors is a weighted sum of the x's. Now if the x's are returns on the underlying assets, then we can consider normalizing these factors, or basically normalizing this matrix, so that the row weights sum to 1, say, for a unit of capital. So if we were to invest a net unit of capital in these assets, then the factor realization would give us the return on a portfolio of the assets that is perfectly correlated with the factor realization. So factor mimicking portfolios can be defined that way, and they have a good interpretation in terms of the returns on potential investments.

So let's go back. The next subject is statistical factor models. This is the case where we begin the analysis with just our collection of outcomes for the process x_t -- basically our time series of asset returns for m assets over T time units. We have no clue initially what the underlying factors are, but we hypothesize that there are factors that do characterize the returns. Factor analysis and principal components analysis provide ways of uncovering those underlying factors and defining them in terms of the data themselves.

So we'll first talk about factor analysis, then we'll turn to principal components analysis. Both of these methods are efforts to model the covariance matrix. The underlying covariance matrix for the assets x can be estimated with sample data as the sample covariance matrix. Here I've just written out in matrix form how that is computed.
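That matrix-form computation, with X the m by T data matrix and 1 a T-vector of ones, is:

```latex
% Sample covariance from the m x T data matrix X
\hat{\Sigma} = \frac{1}{T}\, X\left(I_T - \tfrac{1}{T}\mathbf{1}\mathbf{1}'\right)X'
```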
So with this m by T matrix X, we take the matrix, remove the means -- computing them by multiplying by the averaging matrix -- then take the sums of deviations about the means for the m assets, individually and across each other, and divide by capital T.

Now, the setup for statistical factor models is exactly the same as before, except the only thing we observe is x_t. So we're hypothesizing a model where alpha is basically the vector of mean returns of the individual assets, B is a matrix of factor loadings on k factors f_t, and epsilon_t is white noise with mean 0 and a diagonal covariance matrix. The setup here is the same basic setup as before, but we don't have the matrix B or the vector f_t -- or, of course, alpha.

Now in this setup, it's important that there is an indeterminacy in the model, because for any given specification of the matrix B and the factors f, we can transform them by a k by k invertible matrix H. For a given specification of the model, if we transform the underlying factor realizations f by the matrix H, and transform the factor loadings B by H inverse, we get the same model. So there is an indeterminacy of these particular variables, but there's also flexibility in how we define the factor model. In trying to uncover a factor model with statistical factor analysis, there is some flexibility in defining our factors: we can arbitrarily transform the factors by an invertible transformation in the k-dimensional space.

And it's important to note what changes when we do that transformation. The linear functional form stays the same; what changes is the covariance matrix of the underlying factors. If we have a covariance matrix for those factors, we need to accommodate the matrix transformation H in it. So it has an impact there.
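The indeterminacy just described, written out: for any invertible k by k matrix H,

```latex
\mathbf{x}_t = \boldsymbol{\alpha} + B\,\mathbf{f}_t + \boldsymbol{\epsilon}_t
            = \boldsymbol{\alpha} + (B H^{-1})(H\,\mathbf{f}_t) + \boldsymbol{\epsilon}_t
```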
But one of the things we can do is define a matrix H that diagonalizes the factor covariance. So in some settings it's useful to consider factor models with uncorrelated factor components, and it's possible to define linear factor models with uncorrelated factor components by the choice of H. With any linear factor model, in fact, we can have uncorrelated factor components if that's useful.

So this first bullet highlights the point that we can take the factors to be orthonormal. And we can also have zero-mean factors, by adjusting the data to incorporate the mean of the factors. If we make these particular assumptions, then the model simplifies to the covariance matrix sigma_x being the factor loadings B times its transpose, plus a diagonal matrix.

And just to reiterate, the power of this is that no matter how large m is, as m increases the B matrix just grows by k entries for every increment in m, plus one additional diagonal entry in psi. So as we add more and more assets to our modeling, the complexity basically doesn't increase very much.

With all of our statistical models, one of the questions is how we specify the particular parameters. Maximum likelihood estimation is the first thing to go through, and with normal linear factor models we have normal distributions for all the underlying random variables. So the residuals epsilon_t are independent and identically distributed multivariate normal of dimension m, with diagonal covariance matrix psi given by the individual elements' variances. The factor realizations f_t, which are k-dimensional, can be taken to have mean 0, and to have identity covariance we can scale them and make them uncorrelated. And then x_t is normally distributed with mean alpha and covariance matrix sigma_x, given by the formulas on the previous slide.

With these assumptions, we can write down the model likelihood. The model likelihood is the joint density of our data given the unknown parameters.
The standard setup for statistical factor modeling is to assume independence over time. Now, we know that there can be time series dependence; we won't deal with that at this point -- let's just assume the observations are independent across time. Then we can write the likelihood as simply the product over t of the conditional density of x_t given alpha and sigma, which has this form.

This form for the density function of a multivariate normal should be very familiar to you at this point. It's the extension of the univariate normal distribution to the m-variate case. We have 1 over the square root of 2 pi raised to the m-th power -- there are m components -- and we divide by the square root of the determinant of the covariance matrix. And then there's the exponential of this term here, which for the t-th case is a quadratic form in the x's: we take x minus its mean, and look at the quadratic form of that with the inverse of the covariance matrix.

So there's the log-likelihood function; it reduces to this form here. And maximum likelihood estimation methods can be applied to specify all the parameters of B and psi.
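Written out under these assumptions, the log-likelihood the slide refers to is:

```latex
% Log-likelihood under x_t i.i.d. N(alpha, Sigma_x), with Sigma_x = BB' + Psi
\ell(\boldsymbol{\alpha}, B, \Psi)
= -\frac{mT}{2}\log(2\pi)
  -\frac{T}{2}\log|\Sigma_x|
  -\frac{1}{2}\sum_{t=1}^{T}(\mathbf{x}_t-\boldsymbol{\alpha})'\Sigma_x^{-1}(\mathbf{x}_t-\boldsymbol{\alpha})
```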
And there's an EM algorithm, which is applied in this case. I think I may have highlighted it before, but the EM algorithm is a very powerful estimation methodology for maximum likelihood in statistics, for models that are complicated by the fact that we have hidden variables -- the hidden variables lead to very complex likelihood functions. The simplification underlying the EM algorithm is that if we could observe the hidden variables, then our likelihood functions would be very simple and could be computed directly. So the EM algorithm alternates: it estimates the hidden variables, then treats the estimated hidden variables as known and does the simple estimates, then estimates the hidden variables again, iterating that process again and again. And it converges. The paper that introduced the EM algorithm demonstrates that it applies in many, many different application settings. It's just a very, very powerful estimation methodology, and it is applied here with statistical factor analysis.

I indicated that for now we could just assume independence over time of the data points in computing the likelihood. You'll recall our discussion a couple of lectures back about linear state-space models. Essentially, that linear state-space framework can be applied here as well, to incorporate time dependence in the data. So that simplification is not binding in terms of holding us up in estimating these models.

Let me go back here. OK. So the maximum likelihood estimation process will give us estimates of the B matrix and the psi matrix. Applying this EM algorithm, a good computing package can actually produce estimates of B and psi -- and the underlying alpha, of course. Now from these we can estimate the factor realizations f_t. These can be estimated by simply this regression formula: using our estimates of the factor loadings B hat and our estimates of alpha, we can estimate the factor realizations. So with statistical factor analysis, we use the EM algorithm to estimate the covariance matrix parameters, and then, in the next step, we can estimate the factor realizations.

So as the output from factor analysis, we can work with these factor realizations. Those estimates of the realizations of the factors can then be used for risk modeling as well. We could do a statistical factor analysis of returns in, say, the commodities markets, and identify what factors are driving returns and covariances in those markets. We can then get estimates of those underlying factors from the methodology, and use those as inputs to other models. Certain stocks may depend on significant factors in the commodity markets, and where those dependencies are, well, we can use statistical modeling to identify them.
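As a concrete sketch of this pipeline, scikit-learn's FactorAnalysis fits this Gaussian factor model by maximum likelihood; the data here are simulated from a known 3-factor model, and all sizes and names are mine, purely for illustration:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Simulate returns from a known 3-factor model (illustration only).
rng = np.random.default_rng(1)
T, m, k = 500, 12, 3
B_true = rng.normal(size=(m, k))
f = rng.normal(size=(T, k))
x = f @ B_true.T + rng.normal(scale=0.3, size=(T, m))  # (T, m) returns

fa = FactorAnalysis(n_components=k).fit(x)
B_hat = fa.components_.T       # estimated loadings, m x k
psi_hat = fa.noise_variance_   # estimated specific variances, (m,)
f_hat = fa.transform(x)        # estimated factor realizations, T x k

# Implied covariance matrix from the fitted factor model:
Sigma_hat = B_hat @ B_hat.T + np.diag(psi_hat)
```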
So getting these realizations of the statistical factors is very useful, not only to understand what happened in the past with the process and how these underlying factors varied, but also because you can use them as inputs to other models.

Finally, let's see, there was a lot of interest with statistical factor analysis in the interpretation of the underlying factors. Of course, in terms of using any model, one's confidence rises when you have highly interpretable results. One of the initial applications of statistical factor analysis was in measuring IQ. How many people here have taken an IQ test? Probably everybody, or almost everybody? Well, actually, if you want to work for some hedge funds, you'll have to take some IQ tests. But basically in an IQ test there are 20, 30, 40 questions, and they're trying to measure different aspects of your ability. And statistical factor analysis has been used to try to understand what the underlying dimensions of intelligence are. And one has the flexibility of considering different transformations of any given set of estimated factors by this transformation matrix H. And so there has been work in statistical factor analysis to find rotations of the factor loadings that make the factors more interpretable. So I just raise that because there's potential to get insight into these underlying factors, if that's appropriate. In the IQ setting, the effort was actually to try to find interpretations of the different dimensions of intelligence.

We previously talked about factor mimicking portfolios. The same thing applies here. One final thing: with likelihood ratio tests, one can test whether the linear factor model is a good description of the data. And so with likelihood ratio tests, we compare the likelihood of the data where we fit our unknown parameters, the mean vector alpha and covariance matrix sigma, without any constraints. And then we compare that to the likelihood function under the factor model with, say, k factors.
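Both of those last points have direct analogues in R, sketched here under the same assumptions as before: stats::varimax implements one standard choice of the rotation matrix H, and factanal itself reports a likelihood ratio statistic for whether k factors are sufficient. (factanal applies a varimax rotation by default; it is called explicitly below just for illustration.)

    # rotate the estimated loadings to aid interpretation
    rot <- varimax(loadings(fa))
    rot$rotmat       # the orthogonal transformation matrix H
    rot$loadings     # rotated, hopefully more interpretable, loadings

    # likelihood ratio test that k = 2 factors suffice,
    # against the unconstrained covariance matrix
    fa2 <- factanal(yield_changes, factors = 2)
    fa2$STATISTIC    # approximately chi-squared under the null
    fa2$dof          # difference in the number of free parameters
    fa2$PVAL         # p-value of the test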
And the likelihood ratio tests are computed by looking at twice the difference in log likelihoods. If you take an advanced course in statistics, you'll see that under many conditions twice this difference in the log-likelihood functions is approximately a chi-squared random variable, with degrees of freedom equal to the difference in the number of parameters under the two models. So that's why it's specified this way. But anyway, one can test for the dimensionality of the factor model.

Before going into an example of factor modeling, I want to cover principal components analysis. Actually, principal components analysis comes up in factor modeling, but it's also a methodology that's appropriate for modeling multivariate data and considering dimensionality reduction. You're dealing with data in very many dimensions. You're wondering, is there a simple characterization of the multivariate structure that lies in a smaller dimensional space? And principal components analysis gives us that.

The theoretical framework for principal components analysis is to consider an m-variate random variable-- so this is like a single realization of asset returns at a given time-- which has some mean alpha and covariance matrix sigma. The principal components analysis is going to exploit the eigenvalues and eigenvectors of the covariance matrix. Choongbum went through eigenvalues and singular value decompositions. So here we basically have the eigenvalue/eigenvector decomposition of our covariance matrix sigma, which is the sum over i of the scalar eigenvalue lambda_i times the eigenvector gamma_i times its transpose. So we actually are able to decompose our covariance matrix with eigenvalues and eigenvectors. The principal component variables are defined by taking away the mean alpha from the random vector x, and then considering the weighted average of those de-meaned x's with weights given by the i-th eigenvector.
So these are going to be called the principal component variables, where gamma_1 is the first one, corresponding to the largest eigenvalue, and gamma_m is going to be the m-th, or last, corresponding to the smallest.

The properties of these principal component variables are that they have mean 0, and their covariance matrix is given by the diagonal matrix of eigenvalues. So the principal component variables are a very simple sort of affine transformation of the original variable x. We translate x to a new origin, basically to the 0 origin, by subtracting off its mean. And then we multiply that de-meaned x value by an orthogonal matrix, gamma prime. And what does that do? That simply rotates the coordinate axes. So what we're doing is creating a new coordinate system for our data, which hasn't changed the relative position of the data, or the random variable, at all in the space. Basically, it just is using a new coordinate system, with no change in the overall variability of what we're working with.

In matrix form, we can express these principal component variables as p. Let's consider partitioning p into the first k elements, p_1, and the last m minus k elements, p_2. Then our original random vector x has this decomposition. And we can think of this as being approximately a linear factor model. We can consider, from principal components analysis, that if p_1, the first k principal component variables, correspond to our factors, then our linear factor model would have B given by gamma_1 and F given by p_1. And our epsilon vector would be given by gamma_2 p_2. So the principal components decomposition is almost a linear factor model. The only issue is that this gamma_2 p_2 is an m-vector, but it may not have a diagonal covariance matrix. Under the linear factor model with a given set of k factors, k less than m, we always are assuming that the residual vector has a covariance matrix equal to a diagonal matrix. With a principal components analysis, that may or may not be true.
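Those two properties are easy to check empirically; a minimal sketch in R, again assuming a hypothetical n x m data matrix yield_changes:

    # de-mean the data, then rotate by the eigenvectors of the sample covariance
    X   <- scale(yield_changes, center = TRUE, scale = FALSE)
    eig <- eigen(cov(X))
    P   <- X %*% eig$vectors               # rows are the sample principal component variables
    round(colMeans(P), 10)                 # means are (numerically) zero
    round(cov(P) - diag(eig$values), 10)   # covariance is diag(lambda_1, ..., lambda_m)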
So this is like an approximate factor model, but that's why this is called principal components analysis. It's not called principal factor analysis yet.

Now for the empirical principal components analysis. We've gone through just a description of theoretical principal components: if we have a mean vector alpha and covariance matrix sigma, how we would define these principal component variables. If we just have sample data, then this slide goes through the computations of the empirical principal components results. So all we're doing is substituting in estimates of the means and covariance matrix, and computing the eigenvalue/eigenvector decomposition of that. And we get sample principal component variables: we basically compute x, the de-meaned vector, or matrix of realizations, and pre-multiply that by gamma hat prime, which is the matrix of eigenvectors corresponding to the eigenvalue/eigenvector decomposition of the sample covariance matrix.

This slide goes through the singular value decomposition. You don't have to go through and compute variances and covariances to derive estimates of the principal component variables. You can work simply with the singular value decomposition. I'll let you go through that on your own.

There's an alternate definition of the principal component variable, though, that's very important. If we consider a linear combination of the components of x, x_1 through x_m, with coefficients given by w, which maximizes the variability of that linear combination subject to the norm of the coefficients w being equal to 1, then this is the first principal component variable. So if we have, in two dimensions, x_1 and x_2, and we have points that look like an ellipsoidal distribution, this would correspond to having alpha_1 there and alpha_2 there, with some degree of variability. The principal components analysis says, let's shift the origin to (alpha_1, alpha_2), and then let's rotate the axes to align with the eigenvectors.
Well, the first principal component variable finds the direction-- the coordinate axis-- along which the variability is a maximum. And basically along this dimension here, this is where the variability would be the maximum. And that's the first principal component variable. So this principal components analysis is essentially identifying where there's the most variability in the data-- the most variability without doing any change in the scaling of the data. All we're doing is shifting and rotating. Then the second principal component variable is basically the direction orthogonal to the first which has the maximum variance. And continuing that process defines all m principal component variables.

In principal components analysis, there are discussions of the total variability of the data and how well that's explained by the principal components. If we have a covariance matrix sigma, the total variance is defined as the sum of the diagonal entries-- so it's the trace of the covariance matrix. We'll call that the total variance of this multivariate x. That is equal to the sum of the eigenvalues as well. So we have a decomposition of the total variability into the variability of the different principal component variables. And the principal component variables themselves are uncorrelated. You remember the covariance matrix of the principal component variables was lambda, the diagonal matrix of eigenvalues, so the off-diagonal entries are 0. So the principal component variables are uncorrelated, have variability lambda_k, and basically decompose the variability. So principal components analysis provides this very nice decomposition of the data into different dimensions, with highest to lowest information content as given by the eigenvalues.

I want to go through a case study here of doing factor modeling with U.S. Treasury yields.
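Before the case study, here is how the empirical version looks in R. prcomp computes the sample principal components through the singular value decomposition of the centered data, so no explicit covariance computation is needed, and the trace identity for the total variance can be verified directly; yield_changes is again the hypothetical data matrix.

    pca <- prcomp(yield_changes)      # empirical PCA via the SVD of the centered data
    pca$sdev^2                        # estimated eigenvalues lambda-hat_i
    sum(diag(cov(yield_changes)))     # total variance: trace of the sample covariance...
    sum(pca$sdev^2)                   # ...equals the sum of the eigenvalues
    pca$sdev^2 / sum(pca$sdev^2)      # proportion of total variance per component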
I loaded data into R which ranged from the beginning of 2000 to the end of May 2013. And here are the yields on constant-maturity U.S. Treasury securities, ranging from 3 months and 6 months up to 20 years. So this is essentially the term structure of US Government [INAUDIBLE] of varying levels of risk. Here's a plot of [INAUDIBLE] over that period. So starting in the [INAUDIBLE], we can see the rather dramatic evolution of the term structure over this entire period.

If we wanted to do a principal components analysis of this-- well, if we did the entire period, we'd be measuring variability of all kinds of things, when things go down, up, down. What I've done in this note is just initially to look at the period from 2001 up through 2005. So we have five years of data on basically the early part of this period that I want to focus on, and do a principal components analysis of the yields in this data. So here's basically the series over that five-year period.

The beginning of this analysis is on the actual yield changes. So just as we might be modeling, say, asset prices over time and then doing an analysis of the changes, the returns, here we're looking at yield changes. So first, you can see basically the average daily values for the different yield tenors, ranging from 3 months up to 20 years; those are actually all negative. That corresponds to the time series over that five-year period: basically the time series were all at lower levels from beginning to end, on average. The daily volatility is the daily standard deviation. Those vary from 0.0384 up to 0.0698 for-- is that the three year? And this is the standard deviation of daily yield changes, where 1 is like 1%. And so basically the variation in the yield changes is roughly four to seven basis points a day. So that's something that's reasonable.
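A hedged sketch of this data preparation, producing the yield_changes matrix used in the earlier sketches; the file name, its layout, and the use of ISO dates as row names are assumptions for illustration only.

    # hypothetical CSV of constant-maturity Treasury yields, one column per tenor,
    # dates as row names in YYYY-MM-DD format
    yields <- read.csv("treasury_yields.csv", row.names = 1)
    sub    <- yields[rownames(yields) >= "2001-01-01" &
                     rownames(yields) <= "2005-12-31", ]
    yield_changes <- apply(sub, 2, diff)   # daily yield changes, in percent
    colMeans(yield_changes)                # average daily changes (all negative here)
    apply(yield_changes, 2, sd)            # daily volatilities, roughly 0.04 to 0.07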
When you look at the news or a newspaper and see how interest rates changed from one day to the next, it's generally a few basis points.

This next matrix is the correlation matrix of the yield changes. If you look at this closely, which you can when you download these results, you'll see that near the diagonal the values are very high, like above 90% correlation. And as you move across, away from the diagonal, the correlations get lower and lower. Mathematically, that is what is happening. We can look at these things graphically, which I always like to do. Here is just a graph, a bar chart, of the average yield changes and the standard deviations of the yield changes-- the daily volatilities-- ranging from very short yields to long-tenor yields, up to 20 years. So there's variability there.

Here is a pairs plot of the data. So what I've done is just looked at, basically for every single tenor-- this is, say, the 5-year, 7-year, 10-year, 20-year-- I basically plotted the yield changes of each of those against each other. We could do this with basically all nine different tenors, and we'd have a very dense page of a pairs plot, so I split it up into looking just at the top and bottom block diagonals. But you can see basically how the correlation between these yield changes is very tight, and then gets less tight as you move further away. With the long tenors-- let's see, the short tenors, one more. Here are the short tenors, ranging from 3 year, 2 year, 1 year, 6 month, and so forth. So here you can see how it gets less and less correlated as you move away from a given tenor.

Well, if you conduct a principal components analysis, basically the standard output is first a table of how the variability of the series is broken down across the different component variables.
And so there's basically the importance of components for each of the nine component variables, where importance is measured in terms of the squared standard deviations of these variables relative to their sum. And the proportion of variance explained by the first component variable is 0.849. So basically 85% of the total variability is explained by the first principal component variable. Looking at the second row, second entry in, 0.0919: that's the proportion of total variability explained by the second principal component variable, so 9%. And then for the third it's around 3%. And it just goes down closer to 0.

There's a scree plot for principal components analysis, which is just a plot of the variability of the different principal component variables. So you can see whether the principal components analysis is explaining much variability in the first few components or not. Here there's a huge amount of variability explained by the first principal component variable. I've plotted here the standard deviations of the original yield changes in green, versus the standard deviations of the principal component variables in blue. So with the principal component variables, we basically are capturing most of the variability in the first few principal components.

Now let's look at the interpretation of the principal component variables. There's the loadings matrix, which is the gamma matrix for the principal component variables. Looking at numbers is less informative for me than looking at graphs. Here's a plot of the loadings on the different yield changes for the first principal component variable. So the first principal component variable is a weighted average of all the yield changes, giving greatest weight to the five year. What's that? Well, that's just a measure of a level shift in the yield curve. It's like, what's the average yield change across the whole range? So that's what the first principal component variable is measuring.
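The scree plot and the loading plots being discussed correspond to something like the following, continuing the hypothetical pca fit from above.

    screeplot(pca, type = "lines")   # variances of the component variables
    # bar charts of the loadings (columns of the gamma-hat matrix)
    op <- par(mfrow = c(3, 1))
    for (j in 1:3)
        barplot(pca$rotation[, j], main = paste("Loadings of principal component", j))
    par(op)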
The second principal component variable gives positive weight to the long tenors and negative weight to the short tenors. So it's looking at the difference between the yield changes on the long tenors versus the yield changes on the short tenors. So that's looking at how the spread in yields is changing.

Then the third principal component variable has this structure. And this structure for the weights is like a double difference: it's looking at the difference between the long tenor and the medium tenor, minus the difference between the medium tenor and the short tenor. So that's giving us a measure of the curvature of the term structure and how that's changing over time. So these principal component variables are measuring the level shift for the first, the spread for the second, and the curvature for the third.

With principal components analysis, many times I think people focus just on the first few principal component variables and then say they're done. The last principal component variable, and the last few, can be very, very interesting as well, because these are the linear combinations of the original variables which have the least variability. And if you look at the ninth principal component variable-- there were nine yield changes here-- it's basically looking at a weighted average of the 5- and 10-year minus the 7-year. So this is like the hedge of the 7-year yield with the 5- and 10-year. That combination of yield changes is going to have the least variability.

The principal component variables have zero correlation. Here's just a pairs plot of the first three principal component variables and the ninth. And you can see that those have been transformed to have zero correlations with each other.

One can plot the cumulative principal component variables over time to see how these underlying factors evolved over the time period.
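One hedged way to produce those cumulative plots in R, using the score matrix from the hypothetical pca fit above:

    # accumulate the daily component scores to trace each factor through time
    cum_scores <- apply(pca$x[, c(1, 2, 3)], 2, cumsum)
    matplot(cum_scores, type = "l", lty = 1,
            ylab = "cumulative score",
            main = "Level, spread, and curvature components over time")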
And you'll recall that we talked about the first being the level shift. Basically, from 2001 to 2005, the overall level of interest rates went down and then went up. And this is captured by this first principal component variable, accumulating from 0 down to minus 8 and back up to 0. And the scale of this change, from 0 to minus 8, reflects that this component has the greatest variability.

The second principal component variable accumulates from 0 up to a bit less than 6, and back down to 0. So this is a measure of the spread between long and short rates: the spread increased, and then it decreased, over the period. And then the curvature varies from 0 down to minus 1.5 and back up to 0. So the change in curvature over this entire period was much, much smaller, which is perhaps as it should be. But these graphs indicate basically how these underlying factors evolved over the time period.

In the case note, I go through and fit a statistical factor analysis model to these same data and look at identifying the number of factors, and I also compare the results over this five-year period with the period from 2009 to 2013. They are different, and so it really matters over what period one fits these models. And fitting these models is really just a starting point: ultimately you want to model the dynamics of these factors and their structural relationships.

So we'll finish there.