1
00:00:00,060 --> 00:00:02,500
The following content is
provided under a Creative
2
00:00:02,500 --> 00:00:04,019
Commons license.
3
00:00:04,019 --> 00:00:06,360
Your support will help
MIT OpenCourseWare
4
00:00:06,360 --> 00:00:10,730
continue to offer high quality
educational resources for free.
5
00:00:10,730 --> 00:00:13,340
To make a donation or
view additional materials
6
00:00:13,340 --> 00:00:17,217
from hundreds of MIT courses,
visit MIT OpenCourseWare
7
00:00:17,217 --> 00:00:17,842
at ocw.mit.edu.
8
00:00:22,240 --> 00:00:25,970
PROFESSOR: Today's topic
is factor modeling,
9
00:00:25,970 --> 00:00:32,420
and the subject here basically
exploits multivariate analysis
10
00:00:32,420 --> 00:00:37,910
in statistics to financial
markets where our concern is
11
00:00:37,910 --> 00:00:42,790
using factors to model
returns and variances,
12
00:00:42,790 --> 00:00:44,900
covariances, correlations.
13
00:00:44,900 --> 00:00:48,970
And with these models,
there are two basic cases.
14
00:00:48,970 --> 00:00:52,150
There's one where the
factors are observable.
15
00:00:52,150 --> 00:00:55,150
Those can be
macroeconomic factors.
16
00:00:55,150 --> 00:00:59,690
They can be fundamental
factors on assets or securities
17
00:00:59,690 --> 00:01:03,070
that might explain
returns and covariances.
18
00:01:03,070 --> 00:01:06,490
A second class of models
is where these factors
19
00:01:06,490 --> 00:01:08,930
are hidden or latent.
20
00:01:08,930 --> 00:01:11,850
And statistical
factor models are then
21
00:01:11,850 --> 00:01:15,240
used to specify these models.
22
00:01:15,240 --> 00:01:17,110
In particular, there
are two methodologies.
23
00:01:17,110 --> 00:01:21,310
There's factor analysis and
principal components analysis,
24
00:01:21,310 --> 00:01:24,930
which we'll get into some
detail during the lecture.
25
00:01:24,930 --> 00:01:31,200
So let's proceed to talk about
the setup for a linear factor
26
00:01:31,200 --> 00:01:33,410
model.
27
00:01:33,410 --> 00:01:38,540
We have m assets, or
instruments, or indexes
28
00:01:38,540 --> 00:01:42,710
whose values correspond to a
multivariate stochastic process
29
00:01:42,710 --> 00:01:44,030
we're modeling.
30
00:01:44,030 --> 00:01:47,530
And we have n time periods t.
31
00:01:47,530 --> 00:01:52,840
And with the factor model
we model the t-th value
32
00:01:52,840 --> 00:01:58,140
for the i-th object-- whether
it's a stock price, futures
33
00:01:58,140 --> 00:02:04,750
price, currency-- as a
linear function of factors
34
00:02:04,750 --> 00:02:07,360
f_1 through f_k.
35
00:02:07,360 --> 00:02:10,690
So there's basically
like a state-space model
36
00:02:10,690 --> 00:02:12,845
for the value of the
stochastic process,
37
00:02:12,845 --> 00:02:16,020
as it depends on these
underlying factors.
38
00:02:16,020 --> 00:02:20,080
And the dependence is given
by coefficients beta_1
39
00:02:20,080 --> 00:02:27,600
through beta_k, which are
depending upon i, the asset.
40
00:02:27,600 --> 00:02:31,730
So we allow each asset, say
if we're thinking of stocks,
41
00:02:31,730 --> 00:02:34,770
to depend on factors
in different ways.
42
00:02:34,770 --> 00:02:38,900
If a certain underlying
factor changes in value,
43
00:02:38,900 --> 00:02:44,340
the beta corresponds to the
impact of that underlying
44
00:02:44,340 --> 00:02:46,330
factor.
45
00:02:46,330 --> 00:02:49,440
So we have common factors.
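The model just described can be sketched as a quick simulation in Python; all sizes and values below are illustrative assumptions, not from the lecture:

```python
import numpy as np

# A minimal simulation of the linear factor model setup (illustrative values)
rng = np.random.default_rng(0)
m, k, T = 5, 2, 100                  # m assets, k common factors, T periods

alpha = rng.normal(size=m)           # intercepts, one per asset
B = rng.normal(size=(m, k))          # factor loadings beta_i (one row per asset)
f = rng.normal(size=(T, k))          # common factor realizations f_1..f_k
eps = 0.1 * rng.normal(size=(T, m))  # specific factors epsilon_{i,t}

# x[t, i] = alpha_i + beta_i . f_t + epsilon_{i,t}
x = alpha + f @ B.T + eps
print(x.shape)  # (100, 5)
```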
46
00:02:52,080 --> 00:02:54,250
Now these common factors
f, this is really
47
00:02:54,250 --> 00:02:58,150
going to be a nice model if the
number of factors that we're
48
00:02:58,150 --> 00:03:01,300
using is relatively small.
49
00:03:01,300 --> 00:03:05,360
So the number k
of common factors
50
00:03:05,360 --> 00:03:09,490
is generally very, very
small relative to m.
51
00:03:09,490 --> 00:03:13,010
And if you think about modeling,
say asset-- equity asset
52
00:03:13,010 --> 00:03:16,000
returns in a market, there
can be hundreds or thousands
53
00:03:16,000 --> 00:03:17,570
of securities.
54
00:03:17,570 --> 00:03:22,530
And so in terms of modeling
those returns and covariances,
55
00:03:22,530 --> 00:03:24,360
what we're trying to
do is characterize
56
00:03:24,360 --> 00:03:28,230
those in terms of a modest
number of underlying factors
57
00:03:28,230 --> 00:03:30,510
which simplifies
the problem greatly.
58
00:03:33,450 --> 00:03:37,190
The vectors beta_i are
termed the factor loadings
59
00:03:37,190 --> 00:03:38,610
of an asset.
60
00:03:38,610 --> 00:03:43,680
And the epsilon_(i,t)'s are
the specific factors of asset i,
61
00:03:43,680 --> 00:03:44,470
period t.
62
00:03:44,470 --> 00:03:48,260
So in factor modeling,
we talk about there
63
00:03:48,260 --> 00:03:53,340
being common factors affecting
the dynamics of the system,
64
00:03:53,340 --> 00:03:59,210
and the factor associated
with particular cases
65
00:03:59,210 --> 00:04:02,450
are the specific factors.
66
00:04:02,450 --> 00:04:05,120
So this setup is
really very familiar.
67
00:04:05,120 --> 00:04:08,430
It just looks like a standard
sort of regression type model
68
00:04:08,430 --> 00:04:11,240
that we've seen before.
69
00:04:11,240 --> 00:04:14,270
And so let's see how
this can be set up
70
00:04:14,270 --> 00:04:18,100
as a set of cross-sectional
regressions.
71
00:04:18,100 --> 00:04:25,870
So now we're going to fix
the period t, the time t,
72
00:04:25,870 --> 00:04:31,040
and consider the
m-variate x variable
73
00:04:31,040 --> 00:04:38,460
as satisfying a regression model
with intercept given by alphas.
74
00:04:38,460 --> 00:04:43,140
And then the independent
variables matrix
75
00:04:43,140 --> 00:04:48,710
is B, given by the coefficients
of the factor loadings.
76
00:04:48,710 --> 00:04:54,210
And then we have the residuals
epsilon_t for the m assets.
77
00:04:54,210 --> 00:04:57,640
So the cross-sectional
terminology
78
00:04:57,640 --> 00:05:00,700
means we're fixing time
and looking across all
79
00:05:00,700 --> 00:05:02,970
the assets for one fixed time.
80
00:05:02,970 --> 00:05:09,310
And we're trying to explain
how, say, the returns of assets
81
00:05:09,310 --> 00:05:12,240
are varying depending upon
the underlying factors.
82
00:05:12,240 --> 00:05:19,990
And so the-- well OK, what's
random in this process?
83
00:05:19,990 --> 00:05:23,770
Well certainly the residual
term is considered to be random.
84
00:05:23,770 --> 00:05:26,410
That's basically
going to be assumed
85
00:05:26,410 --> 00:05:29,660
to be white noise with mean 0.
86
00:05:29,660 --> 00:05:35,490
There's going to be possibly
a covariance matrix psi.
87
00:05:35,490 --> 00:05:38,010
And it's going to
be uncorrelated
88
00:05:38,010 --> 00:05:41,860
across different
time cross sections.
89
00:05:41,860 --> 00:05:44,550
Let's see if I can move the
mouse, if this is what's
90
00:05:44,550 --> 00:05:46,700
causing the problem down here.
91
00:05:46,700 --> 00:05:54,450
So in this model we have the
realizations on the underlying
92
00:05:54,450 --> 00:05:56,600
factors being random variables.
93
00:05:56,600 --> 00:05:59,710
The returns on the assets depend
on the underlying factors.
94
00:05:59,710 --> 00:06:04,540
Those are going to be assumed
to have some mean, mu_f,
95
00:06:04,540 --> 00:06:07,010
and some covariance matrix.
96
00:06:07,010 --> 00:06:09,760
And basically the
dimension of that
97
00:06:09,760 --> 00:06:13,250
covariance matrix omega_f
is going to be k by k.
98
00:06:13,250 --> 00:06:16,975
So in terms of modeling this
problem, we go from an m
99
00:06:16,975 --> 00:06:22,130
by m system of
covariances, correlations,
100
00:06:22,130 --> 00:06:27,360
to focusing initially on a
k by k system of covariances
101
00:06:27,360 --> 00:06:30,730
and correlations between
the underlying factors.
102
00:06:30,730 --> 00:06:38,380
Psi is a diagonal matrix
with the specific variances
103
00:06:38,380 --> 00:06:40,310
of the underlying assets.
104
00:06:40,310 --> 00:06:50,270
So we have basically epsilon--
the covariance matrix
105
00:06:50,270 --> 00:06:53,010
of the epsilons is
a diagonal matrix,
106
00:06:53,010 --> 00:06:59,500
and the covariance matrix of
f is given by this omega_f.
107
00:06:59,500 --> 00:07:01,690
Well, with those
specifications we
108
00:07:01,690 --> 00:07:09,070
can get the covariance
for the overall vector
109
00:07:09,070 --> 00:07:13,880
of the m-variate
stochastic process.
110
00:07:13,880 --> 00:07:19,880
And we have this model here
for the conditional moments.
111
00:07:19,880 --> 00:07:23,270
Basically, the
conditional expectation
112
00:07:23,270 --> 00:07:25,810
of the process given
the underlying factors
113
00:07:25,810 --> 00:07:30,310
is this linear model in terms
of the underlying factors f.
114
00:07:30,310 --> 00:07:34,025
And the covariance matrix is the
psi matrix, which is diagonal.
115
00:07:38,040 --> 00:07:42,840
And the unconditional
moments are
116
00:07:42,840 --> 00:07:46,290
obtained by just taking
the expectations of these.
117
00:07:46,290 --> 00:07:50,130
Well actually, the unconditional
expectation of x is this.
118
00:07:50,130 --> 00:07:52,860
The unconditional
covariance of x
119
00:07:52,860 --> 00:07:56,340
is actually equal
to the expectation
120
00:07:56,340 --> 00:08:02,129
of this plus the variance of
the conditional expectation,
121
00:08:02,129 --> 00:08:04,170
or the covariance of the
conditional expectation.
122
00:08:04,170 --> 00:08:08,690
So one of the formulas that's
important to realize here
123
00:08:08,690 --> 00:08:13,620
is that if we're considering
the covariance of x_t, which
124
00:08:13,620 --> 00:08:21,530
is equal to covariance of B
f_t plus epsilon_t, that's
125
00:08:21,530 --> 00:08:27,715
equal to the covariance of
B f_t plus the covariance
126
00:08:27,715 --> 00:08:35,100
of epsilon_t plus
twice the covariance
127
00:08:35,100 --> 00:08:39,600
between this term and this,
but those are uncorrelated.
128
00:08:39,600 --> 00:08:47,520
And so this is equal to B
covariance of f_t B transpose
129
00:08:47,520 --> 00:08:49,240
plus psi.
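That decomposition is easy to compute directly; the loadings and variances here are made-up numbers, chosen only to illustrate the formula:

```python
import numpy as np

# Illustrative loadings and variances (not taken from the lecture)
B = np.array([[1.0, 0.5],
              [0.8, -0.2],
              [0.3, 1.1]])            # m = 3 assets, k = 2 factors
Omega_f = np.array([[1.0, 0.3],
                    [0.3, 0.5]])      # k x k factor covariance
Psi = np.diag([0.2, 0.1, 0.3])        # diagonal specific variances

# Cov(x_t) = B Omega_f B' + Psi; the cross term drops out because the
# common factors and the specific factors are assumed uncorrelated
Sigma = B @ Omega_f @ B.T + Psi
print(Sigma.shape, np.allclose(Sigma, Sigma.T))  # (3, 3) True
```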
130
00:08:54,700 --> 00:08:56,865
With m assets, how
many parameters
131
00:08:56,865 --> 00:08:59,890
are in the covariance
matrix if there's
132
00:08:59,890 --> 00:09:02,987
no constraints on the
covariance matrix?
133
00:09:02,987 --> 00:09:03,903
AUDIENCE: [INAUDIBLE].
134
00:09:07,340 --> 00:09:08,670
PROFESSOR: How many parameters?
135
00:09:08,670 --> 00:09:09,560
Right.
136
00:09:09,560 --> 00:09:11,370
So this is sigma.
137
00:09:11,370 --> 00:09:15,214
So the number of
parameters in sigma.
138
00:09:15,214 --> 00:09:16,130
AUDIENCE: [INAUDIBLE].
139
00:09:19,954 --> 00:09:21,595
PROFESSOR: m plus what?
140
00:09:21,595 --> 00:09:23,970
AUDIENCE: [INAUDIBLE].
141
00:09:23,970 --> 00:09:29,250
PROFESSOR: OK, this is
a square matrix, m by m.
142
00:09:29,250 --> 00:09:32,440
So there's possibly m
squared, but it's symmetric.
143
00:09:32,440 --> 00:09:36,660
So we're double-counting
off the diagonal.
144
00:09:36,660 --> 00:09:39,950
So it's m times m plus 1 over 2.
145
00:09:43,490 --> 00:09:47,540
How many parameters do we
have with the factor model?
146
00:09:52,150 --> 00:09:57,210
So if we think of a--
let's call this sigma star.
147
00:09:57,210 --> 00:10:01,640
The number of parameters
in sigma star is what?
148
00:10:05,810 --> 00:10:10,986
Well, B is an m by k matrix.
149
00:10:15,920 --> 00:10:22,315
This is m by k, so we have
possibly m times k values.
150
00:10:25,870 --> 00:10:39,030
The omega_f is-- or the
covariance of f_t
151
00:10:39,030 --> 00:10:45,460
is the number of elements in
the covariance matrix of f,
152
00:10:45,460 --> 00:10:48,360
which is k by k.
153
00:10:48,360 --> 00:10:58,470
And then we have psi, which
is a diagonal of dimension m.
154
00:10:58,470 --> 00:11:00,680
So depending on how
we structure things,
155
00:11:00,680 --> 00:11:03,970
we can have many, many fewer
parameters in this factor model
156
00:11:03,970 --> 00:11:05,609
than in the unconstrained case.
157
00:11:05,609 --> 00:11:07,400
And we're going to see
that we can actually
158
00:11:07,400 --> 00:11:12,630
reduce this number in the
covariance matrix of f
159
00:11:12,630 --> 00:11:15,637
dramatically because
of flexibility
160
00:11:15,637 --> 00:11:16,720
in choosing those factors.
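The parameter counts just worked out can be tallied in a short helper; the numbers in the example call are illustrative:

```python
def cov_param_counts(m, k):
    """Free parameters in an m x m covariance: unconstrained vs. factor model."""
    unconstrained = m * (m + 1) // 2             # symmetric m x m matrix
    factor_model = m * k + k * (k + 1) // 2 + m  # B, Omega_f, diagonal Psi
    return unconstrained, factor_model

# e.g. a thousand assets driven by five factors
print(cov_param_counts(1000, 5))  # (500500, 6015)
```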
161
00:11:21,940 --> 00:11:27,990
Well let's also look at the
interpretation of the factor
162
00:11:27,990 --> 00:11:30,110
model as a series of
time series regressions.
163
00:11:30,110 --> 00:11:35,410
You remember when we talked
about multivariate regression
164
00:11:35,410 --> 00:11:38,490
a few lectures ago, we
talked about cross-sectional
165
00:11:38,490 --> 00:11:41,760
regressions and time
series regressions,
166
00:11:41,760 --> 00:11:45,760
and looking at the collection
of all the regressions
167
00:11:45,760 --> 00:11:47,770
in a multivariate
regression setting.
168
00:11:47,770 --> 00:11:50,460
Here we can do the same thing.
169
00:11:50,460 --> 00:11:52,620
In contrast to the
cross-sectional regression
170
00:11:52,620 --> 00:11:55,680
where we're fixing time and
looking at all the assets,
171
00:11:55,680 --> 00:12:01,570
here we're looking at fixing
the asset i and the regression
172
00:12:01,570 --> 00:12:04,590
over time for that single asset.
173
00:12:04,590 --> 00:12:09,980
So the values of x_i,
ranging from time 1
174
00:12:09,980 --> 00:12:16,130
up to time capital T, basically
follows a regression model
175
00:12:16,130 --> 00:12:22,890
that's equal to the intercept
alpha_i plus this matrix F
176
00:12:22,890 --> 00:12:30,055
times beta_i, where beta_i
corresponds to the regression
177
00:12:30,055 --> 00:12:31,680
parameters in this
regression, but they
178
00:12:31,680 --> 00:12:35,985
are the factor loadings of
asset i on the k different
179
00:12:35,985 --> 00:12:36,485
factors.
180
00:12:39,430 --> 00:12:45,470
In this setting, the covariance
matrix of the epsilon_i vector
181
00:12:45,470 --> 00:12:50,640
is the diagonal matrix sigma
squared times the identity.
182
00:12:50,640 --> 00:12:54,580
And so this is the classic
Gauss-Markov assumptions
183
00:12:54,580 --> 00:12:58,180
for a single linear
regression model.
184
00:13:04,530 --> 00:13:09,600
Well, as we did previously,
we can group together
185
00:13:09,600 --> 00:13:13,700
all of these time series
regressions for all the m
186
00:13:13,700 --> 00:13:19,220
assets together by simply
putting them all together.
187
00:13:19,220 --> 00:13:28,620
So we start off with x_i
equal to basically F beta_i
188
00:13:28,620 --> 00:13:31,030
plus epsilon_i.
189
00:13:31,030 --> 00:13:39,980
And we can basically
consider x_1, x_2, up to x_m.
190
00:13:39,980 --> 00:13:46,260
So we have a T by m
matrix for the m assets.
191
00:13:46,260 --> 00:13:56,230
And that's equal to a regression
model given by basically
192
00:13:56,230 --> 00:13:58,470
what's on the slides here.
193
00:13:58,470 --> 00:14:01,370
So basically, we're able to
group everything together
194
00:14:01,370 --> 00:14:05,900
and deal with everything all
at once, which is how
195
00:14:05,900 --> 00:14:08,530
these models are fit in practice.
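With an observed factor matrix, all m time-series regressions can indeed be solved in one call; this sketch uses simulated data with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
T, m, k = 200, 4, 2                   # illustrative sizes, not from the lecture
F = rng.normal(size=(T, k))           # observed factor realizations
true_B = rng.normal(size=(m, k))      # true loadings to recover
X = np.column_stack([np.ones(T), F])  # design matrix: intercept + factors

# Simulated T x m matrix of asset values [x_1 | ... | x_m]
x = F @ true_B.T + 0.01 * rng.normal(size=(T, m))

# One least-squares call fits all m time-series regressions at once
coef, *_ = np.linalg.lstsq(X, x, rcond=None)  # shape (k + 1, m)
alpha_hat, B_hat = coef[0], coef[1:].T        # intercepts and m x k loadings
print(np.allclose(B_hat, true_B, atol=0.05))  # recovers the loadings
```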
196
00:14:16,630 --> 00:14:21,780
Let's look at the simplest
example of a factor model.
197
00:14:21,780 --> 00:14:24,610
This is the single-factor
model of Sharpe.
198
00:14:24,610 --> 00:14:27,640
We went through the capital
asset pricing model,
199
00:14:27,640 --> 00:14:33,382
how returns on assets and
stocks are basically--
200
00:14:33,382 --> 00:14:35,090
the excess return on
stock can be modeled
201
00:14:35,090 --> 00:14:39,360
in terms as a linear
regression on the excess return
202
00:14:39,360 --> 00:14:40,530
of the market.
203
00:14:40,530 --> 00:14:43,860
And the regression
parameter beta_i
204
00:14:43,860 --> 00:14:48,760
corresponds to the level of
risk associated with the asset.
205
00:14:48,760 --> 00:14:54,110
And all we're doing
in this model is,
206
00:14:54,110 --> 00:14:57,050
by choosing different
assets we're choosing assets
207
00:14:57,050 --> 00:15:01,800
with different levels of
risk scaled by the beta_i.
208
00:15:01,800 --> 00:15:04,510
And they may have
returns that vary
209
00:15:04,510 --> 00:15:08,760
across assets given by alpha_i.
210
00:15:08,760 --> 00:15:16,380
The covariance
matrix of the assets
211
00:15:16,380 --> 00:15:18,600
has-- the unconditional
covariance matrix
212
00:15:18,600 --> 00:15:20,540
has this structure.
213
00:15:20,540 --> 00:15:25,190
It's basically equal to the
variance of the market times
214
00:15:25,190 --> 00:15:28,580
beta beta prime plus psi.
215
00:15:28,580 --> 00:15:33,780
And so that equation
is really very simple.
216
00:15:37,070 --> 00:15:41,270
It's really self-evident from
what we've discussed, but let
217
00:15:41,270 --> 00:15:45,580
me just highlight what it is.
218
00:15:45,580 --> 00:15:53,276
Sigma squared beta beta
transposed plus psi.
219
00:15:53,276 --> 00:15:55,170
And that's equal
to sigma squared
220
00:15:55,170 --> 00:15:58,720
times basically a column vector
of all the betas, beta_1 down
221
00:15:58,720 --> 00:16:08,460
to beta_m times its transpose
plus a diagonal matrix
222
00:16:08,460 --> 00:16:09,740
with the psi.
223
00:16:09,740 --> 00:16:12,460
So this is really a very,
very simple structure
224
00:16:12,460 --> 00:16:14,790
for the covariance.
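That single-factor covariance structure can be written down directly; the betas and variances here are made-up numbers for illustration:

```python
import numpy as np

sigma_m2 = 0.04                      # market variance (illustrative)
beta = np.array([0.9, 1.2, 1.5])     # market betas, one per asset
psi = np.array([0.02, 0.03, 0.05])   # specific variances

# Sigma = sigma_m^2 * beta beta' + Psi
Sigma = sigma_m2 * np.outer(beta, beta) + np.diag(psi)

# Every off-diagonal covariance is pinned down by the betas alone:
print(Sigma[0, 1], sigma_m2 * beta[0] * beta[1])  # equal: ~0.0432
```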
225
00:16:14,790 --> 00:16:18,610
And if you had wanted to
apply this model to thousands
226
00:16:18,610 --> 00:16:20,850
of securities, it's
basically no problem.
227
00:16:20,850 --> 00:16:23,270
You can construct a
covariance matrix.
228
00:16:23,270 --> 00:16:26,510
And if this were appropriate
for a large collection
229
00:16:26,510 --> 00:16:30,110
of securities, then the
amount-- the reduction
230
00:16:30,110 --> 00:16:35,810
in terms of what you're
estimating is enormous.
231
00:16:35,810 --> 00:16:39,103
Rather than estimating
each cross-correlation
232
00:16:39,103 --> 00:16:44,190
and covariance of
all the assets,
233
00:16:44,190 --> 00:16:49,190
the factor model tells us what
those cross covariances are.
234
00:16:49,190 --> 00:16:54,141
So that's really where the
power of the model comes in.
235
00:16:54,141 --> 00:16:58,310
And in terms of why
is this so useful,
236
00:16:58,310 --> 00:17:03,980
well in portfolio management
one of the key drivers
237
00:17:03,980 --> 00:17:07,450
of asset allocation is
the covariance matrix
238
00:17:07,450 --> 00:17:08,460
of the assets.
239
00:17:08,460 --> 00:17:09,910
So if you have an
effective model
240
00:17:09,910 --> 00:17:12,319
for modeling the
covariance, then that
241
00:17:12,319 --> 00:17:14,920
simplifies the portfolio
allocation problem
242
00:17:14,920 --> 00:17:17,800
because you can specify
a covariance matrix
243
00:17:17,800 --> 00:17:20,510
that you are confident with.
244
00:17:20,510 --> 00:17:28,089
And also in risk
management, effective models
245
00:17:28,089 --> 00:17:32,010
of risk management
deal with, how
246
00:17:32,010 --> 00:17:37,820
do we anticipate what would
happen if different scenarios
247
00:17:37,820 --> 00:17:38,750
occur in the market?
248
00:17:38,750 --> 00:17:41,320
Well, the different
scenarios that can occur
249
00:17:41,320 --> 00:17:45,900
can be associated with what's
happening to underlying factors
250
00:17:45,900 --> 00:17:48,460
that affect the system.
251
00:17:48,460 --> 00:17:51,580
And so we can consider
risk management approaches
252
00:17:51,580 --> 00:17:54,200
that vary these underlying
factors, and look at
253
00:17:54,200 --> 00:17:57,172
how that has an impact
on the covariance matrix
254
00:17:57,172 --> 00:17:57,755
very directly.
255
00:18:04,640 --> 00:18:08,350
Estimation of Sharpe's
single index model.
256
00:18:08,350 --> 00:18:11,460
We went through
before how we estimate
257
00:18:11,460 --> 00:18:14,970
the alphas and the betas.
258
00:18:14,970 --> 00:18:17,440
In terms of estimating
the sigmas--
259
00:18:17,440 --> 00:18:20,800
the specific
variances-- basically,
260
00:18:20,800 --> 00:18:23,640
that comes from the
simple regression as well.
261
00:18:23,640 --> 00:18:26,920
Basically, the sum of the
squared estimated residuals
262
00:18:26,920 --> 00:18:28,840
divided by T minus 2.
263
00:18:28,840 --> 00:18:31,870
Here we obtain unbiasedness
because we have two parameters
264
00:18:31,870 --> 00:18:34,220
estimated per model.
265
00:18:34,220 --> 00:18:40,700
Then for the market
portfolio, that basically
266
00:18:40,700 --> 00:18:42,580
has a simple estimate as well.
267
00:18:42,580 --> 00:18:46,470
The psi hat matrix would
just be the diagonal
268
00:18:46,470 --> 00:18:53,680
of that-- the diagonal of
the specific variances.
269
00:18:53,680 --> 00:18:56,620
And then the unconditional
covariance matrix
270
00:18:56,620 --> 00:19:00,940
is estimated by simply plugging
in these parameter estimates.
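Those plug-in estimates can be sketched end to end on simulated data; this is a simplified illustration of the steps described, with made-up sizes and noise levels:

```python
import numpy as np

def fit_single_index(x, r_m):
    """Plug-in estimates for Sharpe's single-index model (a sketch).

    x: (T, m) excess asset returns; r_m: (T,) excess market returns.
    """
    T, m = x.shape
    X = np.column_stack([np.ones(T), r_m])        # intercept + market return
    coef, *_ = np.linalg.lstsq(X, x, rcond=None)  # row 0: alphas, row 1: betas
    alpha, beta = coef[0], coef[1]
    resid = x - X @ coef
    psi = (resid ** 2).sum(axis=0) / (T - 2)      # unbiased: 2 params per asset
    sigma_m2 = r_m.var(ddof=1)                    # market variance estimate
    Sigma = sigma_m2 * np.outer(beta, beta) + np.diag(psi)  # plug-in covariance
    return alpha, beta, psi, Sigma

rng = np.random.default_rng(1)
T = 2000
r_m = rng.normal(0.0, 0.02, T)                    # simulated market returns
true_beta = np.array([0.8, 1.0, 1.3])
x = np.outer(r_m, true_beta) + rng.normal(0, 0.001, (T, 3))
alpha, beta, psi, Sigma = fit_single_index(x, r_m)
print(np.round(beta, 2))  # close to the simulated betas
```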
271
00:19:00,940 --> 00:19:08,490
So, very simple and effective
if that single-factor
272
00:19:08,490 --> 00:19:09,760
model is appropriate.
273
00:19:09,760 --> 00:19:13,660
Now needless to say,
a single-factor model
274
00:19:13,660 --> 00:19:18,860
doesn't characterize the
structure of the covariances
275
00:19:18,860 --> 00:19:22,220
and/or the returns typically.
276
00:19:22,220 --> 00:19:26,860
And so we want to consider
more general models,
277
00:19:26,860 --> 00:19:28,880
multi-factor models.
278
00:19:28,880 --> 00:19:31,160
And the first set of models
we're going to talk about
279
00:19:31,160 --> 00:19:36,640
are common factor variables
that can actually be observed.
280
00:19:39,630 --> 00:19:44,430
Basically any factor
that you can observe
281
00:19:44,430 --> 00:19:49,480
is a potential candidate
for being a relevant factor
282
00:19:49,480 --> 00:19:51,240
in a linear factor model.
283
00:19:51,240 --> 00:19:54,100
The effectiveness of
a potential factor
284
00:19:54,100 --> 00:19:56,490
is determined by
fitting the model
285
00:19:56,490 --> 00:20:00,050
and seeing how much
contribution that factor
286
00:20:00,050 --> 00:20:03,567
makes to the explanation
of the returns
287
00:20:03,567 --> 00:20:04,775
and the covariance structure.
288
00:20:07,300 --> 00:20:12,970
Chen, Ross, and Roll wrote
a classic paper in 1986.
289
00:20:12,970 --> 00:20:17,460
Now Ross is actually
here at MIT.
290
00:20:17,460 --> 00:20:30,580
And with their paper,
they looked at modeling--
291
00:20:30,580 --> 00:20:32,180
rather than looking
at these factors
292
00:20:32,180 --> 00:20:34,560
directly, including
those in the model,
293
00:20:34,560 --> 00:20:39,940
they looked at
transforming these factors
294
00:20:39,940 --> 00:20:43,080
into surprise factors.
295
00:20:43,080 --> 00:20:47,550
So rather than having
interest rates just
296
00:20:47,550 --> 00:20:50,230
as a simple factor directly
plugged into the model,
297
00:20:50,230 --> 00:20:54,100
it would be the change
in interest rates.
298
00:20:54,100 --> 00:20:56,890
And additionally, not only just
the change in interest rate,
299
00:20:56,890 --> 00:20:59,180
but the unanticipated
change in interest
300
00:20:59,180 --> 00:21:01,550
rates given market information.
301
00:21:01,550 --> 00:21:07,480
So their paper goes
through modeling different
302
00:21:07,480 --> 00:21:12,130
macroeconomic variables with
vector autoregression models,
303
00:21:12,130 --> 00:21:17,270
and then using those to
specify unanticipated changes
304
00:21:17,270 --> 00:21:19,540
in these underlying factors.
305
00:21:19,540 --> 00:21:22,680
And so that's where
the power comes in.
306
00:21:22,680 --> 00:21:27,680
And that highlights how when
you're applying these models,
307
00:21:27,680 --> 00:21:30,960
it does involve some
creativity to get the most bang
308
00:21:30,960 --> 00:21:32,340
for the buck with your models.
309
00:21:32,340 --> 00:21:36,780
And the idea they had of
incorporating unanticipated
310
00:21:36,780 --> 00:21:39,060
changes was really
a very good one
311
00:21:39,060 --> 00:21:41,555
and is applied quite widely now.
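A simplified univariate stand-in for that idea: fit an autoregression to a factor series and take the residuals as the unanticipated changes. (The lecture describes vector autoregressions; this sketch uses a plain AR(1) and a made-up rate series.)

```python
import numpy as np

def surprise_series(z):
    """Unanticipated changes in a factor: residuals from a fitted AR(1)."""
    z = np.asarray(z, dtype=float)
    x, y = z[:-1], z[1:]
    X = np.column_stack([np.ones(len(x)), x])     # intercept + lagged value
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # regress z_t on z_{t-1}
    return y - X @ coef                           # actual minus anticipated

rates = [2.0, 2.1, 2.05, 2.2, 2.15, 2.3, 2.4, 2.35]  # illustrative rates
s = surprise_series(rates)
print(abs(s.mean()) < 1e-10)  # OLS residuals with intercept average to ~0
```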
312
00:21:55,050 --> 00:22:03,380
Now with this setup,
one basically--
313
00:22:03,380 --> 00:22:10,040
if one has empirical data over
times 1 through capital T,
314
00:22:10,040 --> 00:22:13,580
then if one wants to
specify these models,
315
00:22:13,580 --> 00:22:17,740
one has their observations
on the x_i process.
316
00:22:17,740 --> 00:22:22,120
You basically have observed
all the returns historically.
317
00:22:22,120 --> 00:22:24,940
We also, because the
factors are observable,
318
00:22:24,940 --> 00:22:29,290
have the F matrix as a
set of observed variables.
319
00:22:29,290 --> 00:22:36,300
So we can basically use those to
estimate the beta_i's and also
320
00:22:36,300 --> 00:22:40,970
estimate the variances
of the residual terms
321
00:22:40,970 --> 00:22:44,310
with simple regression methods.
322
00:22:44,310 --> 00:22:49,970
So implementing these
is quite feasible,
323
00:22:49,970 --> 00:22:53,860
and basically applies methods
that we have from before.
324
00:22:53,860 --> 00:23:01,110
So what this slide now discusses
is how we basically estimate
325
00:23:01,110 --> 00:23:02,700
the underlying parameters.
326
00:23:02,700 --> 00:23:06,990
We need to be a little bit
careful about the Gauss-Markov
327
00:23:06,990 --> 00:23:08,210
assumptions.
328
00:23:08,210 --> 00:23:15,100
You'll remember that if we
have a regression model where
329
00:23:15,100 --> 00:23:18,650
the residual terms are
uncorrelated and constant
330
00:23:18,650 --> 00:23:22,560
variance, then the
simple linear regression
331
00:23:22,560 --> 00:23:25,210
estimates are the best ones.
332
00:23:25,210 --> 00:23:32,740
If there is unequal
variances of the residuals,
333
00:23:32,740 --> 00:23:36,650
and maybe even
covariances, then we
334
00:23:36,650 --> 00:23:40,250
need to use generalized
least squares.
335
00:23:40,250 --> 00:23:46,850
So the notes go through those
computations and the formulas,
336
00:23:46,850 --> 00:23:51,380
which are just simple extensions
of our regression model theory
337
00:23:51,380 --> 00:23:53,755
that we had in
previous lectures.
338
00:24:04,240 --> 00:24:11,433
Let me go through an example.
339
00:24:17,720 --> 00:24:19,790
With common factor
variables that
340
00:24:19,790 --> 00:24:25,560
are using either fundamental
or asset-specific attributes,
341
00:24:25,560 --> 00:24:29,047
there's the approach of-- well,
it's called the BARRA Approach.
342
00:24:29,047 --> 00:24:30,213
This is from Barr Rosenberg.
343
00:24:33,470 --> 00:24:36,090
Actually, I have to say, he
was one of the inspirations
344
00:24:36,090 --> 00:24:41,040
to me for going into statistical
modeling and finance.
345
00:24:41,040 --> 00:24:44,910
He was a professor at UC
Berkeley who left academics
346
00:24:44,910 --> 00:24:51,770
very early to basically
apply models to trading money.
347
00:24:51,770 --> 00:24:55,250
As an anecdote, his
current situation
348
00:24:55,250 --> 00:24:57,210
is a little different.
349
00:24:57,210 --> 00:24:58,620
I'll let you look that up.
350
00:24:58,620 --> 00:25:04,170
But anyway, this
approach basically
351
00:25:04,170 --> 00:25:09,260
provided the BARRA Approach
for factor modeling and risk
352
00:25:09,260 --> 00:25:11,950
analysis, which is still
used extensively today.
353
00:25:15,360 --> 00:25:23,710
So with common factor
variables using
354
00:25:23,710 --> 00:25:28,740
asset-specific
attributes-- in fact,
355
00:25:28,740 --> 00:25:33,890
the factor realizations
are unobserved
356
00:25:33,890 --> 00:25:38,960
but are estimated in the
application of these models.
357
00:25:38,960 --> 00:25:41,930
So let's see how that goes.
358
00:25:41,930 --> 00:25:50,410
Oh, OK, this slide talks about
the Fama-French approach, which
359
00:25:50,410 --> 00:25:54,610
concerns-- OK, Fama and
French, Fama of course
360
00:25:54,610 --> 00:25:56,780
we talked about him
in the last lecture.
361
00:25:56,780 --> 00:25:58,920
He got the Nobel
Prize for his work
362
00:25:58,920 --> 00:26:05,220
in modeling asset price
returns and market efficiency.
363
00:26:05,220 --> 00:26:08,230
Fama and French
found that there were
364
00:26:08,230 --> 00:26:11,860
some very important
factors affecting
365
00:26:11,860 --> 00:26:14,330
asset returns in equities.
366
00:26:14,330 --> 00:26:18,300
Basically, returns
tended to vary depending
367
00:26:18,300 --> 00:26:20,910
upon the size of firms.
368
00:26:20,910 --> 00:26:25,680
So if you consider small
firms versus large firms,
369
00:26:25,680 --> 00:26:27,516
small firms tended to
have returns that were
370
00:26:27,516 --> 00:26:28,640
more similar to each other.
371
00:26:28,640 --> 00:26:30,660
Large firms tended to
have returns that were
372
00:26:30,660 --> 00:26:32,110
more similar to each other.
373
00:26:32,110 --> 00:26:35,920
So there's basically a
big versus small factor
374
00:26:35,920 --> 00:26:38,500
that is operating in the market.
375
00:26:38,500 --> 00:26:40,610
Sometimes the market
prefers small stocks,
376
00:26:40,610 --> 00:26:42,790
sometimes it prefers
large stocks.
377
00:26:42,790 --> 00:26:48,580
And similarly,
there's another factor
378
00:26:48,580 --> 00:26:50,950
which is value versus growth.
379
00:26:54,030 --> 00:26:58,410
Basically, stocks that
are considered good values
380
00:26:58,410 --> 00:27:02,914
are stocks which are cheap,
basically, for what they have.
381
00:27:02,914 --> 00:27:04,955
So you're basically getting
a stock at a discount
382
00:27:04,955 --> 00:27:08,390
if you're getting a good value.
383
00:27:08,390 --> 00:27:12,500
And value stocks can be
measured by looking at the price
384
00:27:12,500 --> 00:27:13,165
relative to book equity.
385
00:27:13,165 --> 00:27:15,940
If that's low, then
the price you're
386
00:27:15,940 --> 00:27:20,600
paying for that equity in the
firm is low, and it's cheap.
387
00:27:20,600 --> 00:27:24,150
And that compares
with stocks for which
388
00:27:24,150 --> 00:27:28,110
the price relative to the
book value is very, very high.
389
00:27:28,110 --> 00:27:32,240
Why are people willing
to pay a lot for stocks?
390
00:27:32,240 --> 00:27:35,000
In that case, well it's
because the growth prospects
391
00:27:35,000 --> 00:27:39,030
of those stocks is high,
and there's an anticipation
392
00:27:39,030 --> 00:27:41,580
basically that the
current price is just
393
00:27:41,580 --> 00:27:47,610
reflecting a projection of how
much growth potential there is.
394
00:27:47,610 --> 00:27:51,670
Now the Fama-French approach
is for each of these factors
395
00:27:51,670 --> 00:27:57,080
to basically rank order all
the stocks by this factor
396
00:27:57,080 --> 00:28:01,800
and divide them
up into quintiles.
397
00:28:01,800 --> 00:28:06,550
So say this is market cap.
398
00:28:06,550 --> 00:28:12,030
We can divide up all the
stocks in-- basically
399
00:28:12,030 --> 00:28:15,000
consider a histogram,
or whatever,
400
00:28:15,000 --> 00:28:18,230
of all the market caps of all
the stocks in our universe.
401
00:28:18,230 --> 00:28:23,500
And then divide it up into
basically the bottom fifth,
402
00:28:23,500 --> 00:28:27,026
the next fifth, and
then-- it probably
403
00:28:27,026 --> 00:28:29,830
needs to go up-- the top fifth.
404
00:28:33,120 --> 00:28:35,960
And the Fama-French
approach says, well,
405
00:28:35,960 --> 00:28:41,080
let's look at an equal-weighted
average of the top fifth.
406
00:28:41,080 --> 00:28:50,920
And basically, buy that
and sell the bottom fifth.
407
00:28:50,920 --> 00:28:55,620
And so that would be the
big versus small market
408
00:28:55,620 --> 00:28:57,640
factor of Fama and French.
409
00:28:57,640 --> 00:29:00,300
Now, if you look at
their work, they actually
410
00:29:00,300 --> 00:29:03,080
do the bottom minus the
top, because the value
411
00:29:03,080 --> 00:29:04,670
tends to outperform the other.
412
00:29:04,670 --> 00:29:07,010
So they have a factor
whose more positive
413
00:29:07,010 --> 00:29:08,510
values are associated
more generally
414
00:29:08,510 --> 00:29:10,660
with positive returns.
415
00:29:10,660 --> 00:29:14,780
But that factor can
be applied and used
416
00:29:14,780 --> 00:29:20,400
to correlate with individual
asset returns as well.
417
00:29:26,590 --> 00:29:28,580
Now, with the BARRA
Industry Factor--
418
00:29:28,580 --> 00:29:35,960
this is just getting back
to the BARRA Approach--
419
00:29:35,960 --> 00:29:37,840
the simplest case
of understanding
420
00:29:37,840 --> 00:29:40,580
the BARRA industry
factor models is
421
00:29:40,580 --> 00:29:42,820
to consider looking
at dividing stocks up
422
00:29:42,820 --> 00:29:45,020
into different industry groups.
423
00:29:45,020 --> 00:29:56,610
So we might expect that,
say oil stocks will
424
00:29:56,610 --> 00:30:02,100
tend to move together and
have greater variability
425
00:30:02,100 --> 00:30:04,790
or common variability.
426
00:30:04,790 --> 00:30:10,640
And that could be very different
from utility stocks, which
427
00:30:10,640 --> 00:30:13,105
tend to actually be
quite low-risk stocks.
428
00:30:17,749 --> 00:30:19,290
Utility companies
are companies which
429
00:30:19,290 --> 00:30:21,910
are very highly regulated.
430
00:30:21,910 --> 00:30:26,360
And the profitability
of those firms
431
00:30:26,360 --> 00:30:30,850
is basically overseen
by the regulators.
432
00:30:30,850 --> 00:30:37,110
They don't want the
utilities to gouge consumers
433
00:30:37,110 --> 00:30:41,890
and make too much profit from
delivering power to customers.
434
00:30:41,890 --> 00:30:46,555
So utilities tend to have
fairly low volatility
435
00:30:46,555 --> 00:30:50,330
but very consistent
returns, which
436
00:30:50,330 --> 00:30:55,530
are based on reasonable,
from a regulatory standpoint,
437
00:30:55,530 --> 00:30:58,110
levels of profitability
for those companies.
438
00:30:58,110 --> 00:31:03,270
Well with an industry
factor model, what we can do
439
00:31:03,270 --> 00:31:08,710
is associate factor
loadings, which basically
440
00:31:08,710 --> 00:31:13,570
are loading each asset in
terms of which industry group
441
00:31:13,570 --> 00:31:14,520
it's in.
442
00:31:14,520 --> 00:31:20,140
So we actually know the beta
values for these stocks,
443
00:31:20,140 --> 00:31:23,080
but we don't know
the underlying factor
444
00:31:23,080 --> 00:31:26,400
realizations for these stocks.
445
00:31:26,400 --> 00:31:29,480
But in terms of the
betas, with these factors
446
00:31:29,480 --> 00:31:34,641
we can basically get
well-defined beta vectors and a B
447
00:31:34,641 --> 00:31:37,390
matrix for all the stocks.
448
00:31:37,390 --> 00:31:40,650
And the problem
then is, how do we
449
00:31:40,650 --> 00:31:44,540
specify the realization
of the underlying factors?
450
00:31:44,540 --> 00:31:51,000
Well the realization of
the underlying factors
451
00:31:51,000 --> 00:31:56,190
basically is just estimated
with a regression model.
452
00:31:56,190 --> 00:32:06,300
And so if we have all of our
assets x_i for different times
453
00:32:06,300 --> 00:32:13,700
t, those would have a model
given by factor realizations
454
00:32:13,700 --> 00:32:21,380
corresponding to the k industry
groups with known beta_(i,j)
455
00:32:21,380 --> 00:32:21,880
values.
456
00:32:29,010 --> 00:32:34,030
And the estimation of
these, we basically
457
00:32:34,030 --> 00:32:36,940
have a simple
regression model where
458
00:32:36,940 --> 00:32:43,060
the realizations of
the factor returns f_t
459
00:32:43,060 --> 00:32:45,840
are given by essentially
a regression coefficient
460
00:32:45,840 --> 00:32:50,270
in this regression, where
we have the asset returns
461
00:32:50,270 --> 00:32:54,840
x_t, the known
factor loadings B,
462
00:32:54,840 --> 00:32:58,520
the unknown factor
realizations f_t.
463
00:32:58,520 --> 00:33:01,930
And just plugging
into the regression,
464
00:33:01,930 --> 00:33:05,500
if we do it very simply
we get this expression
465
00:33:05,500 --> 00:33:10,710
for f hat t, which is the
simple linear regression model
466
00:33:10,710 --> 00:33:13,310
estimating those realizations.
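A minimal sketch of that regression, under an assumed setup of six stocks in two industry groups (the matrix B and the return vector x_t below are illustrative): each row of the known loading matrix B has a 1 in the column of that asset's industry, and the OLS estimate of the factor realizations is f_hat = (B'B)^{-1} B' x_t.

```python
import numpy as np

# Hypothetical setup: m = 6 assets in k = 2 industry groups.  Row i of the
# known loading matrix B has a 1 in the column of asset i's industry.
B = np.array([[1, 0], [1, 0], [1, 0],
              [0, 1], [0, 1], [0, 1]], dtype=float)
x_t = np.array([0.02, 0.03, 0.01, -0.01, 0.00, -0.02])  # returns at time t

# OLS estimate of the factor realizations: f_hat = (B'B)^{-1} B' x_t.
f_hat = np.linalg.solve(B.T @ B, B.T @ x_t)

# With 0/1 industry loadings this reduces to each industry's average return.
```

With disjoint 0/1 industry loadings, each estimated factor realization is just the equal-weighted average return of the stocks in that industry.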
467
00:33:13,310 --> 00:33:20,660
Now this particular estimate
of the factor realizations
468
00:33:20,660 --> 00:33:29,020
is assuming that the variability
of the components of x
469
00:33:29,020 --> 00:33:31,960
have the same variance.
470
00:33:31,960 --> 00:33:33,830
This is like the
linear regression
471
00:33:33,830 --> 00:33:36,940
estimates under normal
Gauss-Markov assumptions.
472
00:33:36,940 --> 00:33:42,970
But basically the
epsilon_i's will
473
00:33:42,970 --> 00:33:44,420
vary across the
different assets.
474
00:33:44,420 --> 00:33:47,240
The different assets will
have different variabilities,
475
00:33:47,240 --> 00:33:48,500
different specific variances.
476
00:33:48,500 --> 00:33:53,630
So that's actually going
to be heteroscedasticity
477
00:33:53,630 --> 00:33:54,720
in these models.
478
00:33:54,720 --> 00:33:57,840
So this particular vector
of industry averages
479
00:33:57,840 --> 00:34:06,670
should actually be extended
to account for that.
480
00:34:06,670 --> 00:34:10,940
So the covariance matrix
481
00:34:10,940 --> 00:34:14,900
of the factors can
then be estimated
482
00:34:14,900 --> 00:34:17,909
using these estimates
of the realizations.
483
00:34:17,909 --> 00:34:21,599
And our estimation of the
residual covariance matrix
484
00:34:21,599 --> 00:34:22,515
can then be estimated.
485
00:34:25,310 --> 00:34:29,639
So I guess an initial estimate
of the covariance matrix sigma
486
00:34:29,639 --> 00:34:34,409
hat is given by this known
matrix B times our sample
487
00:34:34,409 --> 00:34:39,340
estimate of the factor
realizations plus the diagonal
488
00:34:39,340 --> 00:34:42,330
matrix C hat.
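That two-term estimate can be sketched as follows, with stand-in factor realizations and residuals (all sizes and values are illustrative, and the diagonal matrix called C hat in the slide is written Psi_hat here):

```python
import numpy as np

rng = np.random.default_rng(1)
m, k, T = 6, 2, 250                          # illustrative sizes
B = np.array([[1, 0], [1, 0], [1, 0],
              [0, 1], [0, 1], [0, 1]], dtype=float)   # known loadings
F_hat = rng.normal(0, 0.01, (T, k))          # stand-in factor realizations
resid = rng.normal(0, 0.02, (T, m))          # stand-in regression residuals

# Sigma_hat = B * (sample covariance of f_hat) * B' + diagonal residual
# covariance (the diagonal matrix the slide calls C hat).
Sigma_f = np.cov(F_hat, rowvar=False)
Psi_hat = np.diag(resid.var(axis=0, ddof=1))
Sigma_hat = B @ Sigma_f @ B.T + Psi_hat
```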
489
00:34:42,330 --> 00:34:46,310
And a second step
in this process
490
00:34:46,310 --> 00:34:49,659
can incorporate
information about there
491
00:34:49,659 --> 00:34:53,659
being heteroscedasticity along
the diagonal of the psi's
492
00:34:53,659 --> 00:34:56,699
to adjust the
regression estimates.
493
00:34:56,699 --> 00:35:00,950
So we basically get a
refinement of the estimates
494
00:35:00,950 --> 00:35:05,640
that does account for the
non-constant variability.
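A sketch of that refinement under assumed specific variances (the psi values below are hypothetical): generalized least squares replaces (B'B)^{-1} B' x_t with (B' Psi^{-1} B)^{-1} B' Psi^{-1} x_t, which down-weights the noisier assets.

```python
import numpy as np

# Same illustrative 0/1 industry loadings as before, but the assets now
# have different assumed specific variances psi_i (heteroscedasticity).
B = np.array([[1, 0], [1, 0], [1, 0],
              [0, 1], [0, 1], [0, 1]], dtype=float)
x_t = np.array([0.02, 0.03, 0.01, -0.01, 0.00, -0.02])
psi = np.array([0.01, 0.04, 0.01, 0.02, 0.02, 0.08])  # assumed variances

# GLS estimate: f_hat = (B' Psi^{-1} B)^{-1} B' Psi^{-1} x_t.
W = np.diag(1.0 / psi)
f_hat_gls = np.linalg.solve(B.T @ W @ B, B.T @ W @ x_t)
```

The OLS version weights every asset in an industry equally; the GLS version shrinks the influence of the assets whose assumed specific variance is largest.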
495
00:35:05,640 --> 00:35:13,750
Now this property of
heteroscedasticity versus
496
00:35:13,750 --> 00:35:20,270
homoscedasticity in estimating
the regression parameters,
497
00:35:20,270 --> 00:35:22,460
it may seem like
this is a nicety
498
00:35:22,460 --> 00:35:27,550
of the statistical theory that
we just have to try and check,
499
00:35:27,550 --> 00:35:29,180
but it's not too big a deal.
500
00:35:29,180 --> 00:35:36,820
But let me highlight where this
issue comes up again and again.
501
00:35:36,820 --> 00:35:43,880
With portfolio optimization,
we went through last time--
502
00:35:43,880 --> 00:35:46,300
for mean-variance
optimization, we
503
00:35:46,300 --> 00:35:50,550
want to consider a weighting of
assets that basically weights
504
00:35:50,550 --> 00:35:56,640
the assets by the expected
returns, pre-multiplied
505
00:35:56,640 --> 00:36:00,500
by the inverse of the
covariance matrix.
506
00:36:00,500 --> 00:36:04,150
And so we basically in
portfolio allocation
507
00:36:04,150 --> 00:36:07,010
want to allocate to
assets with high return,
508
00:36:07,010 --> 00:36:10,970
but we're going to penalize
those with high variance.
509
00:36:10,970 --> 00:36:21,450
And so the impact of discounting
values with high variability
510
00:36:21,450 --> 00:36:23,810
arises in asset allocation.
511
00:36:23,810 --> 00:36:27,480
And then of course arises
in statistical estimation.
512
00:36:27,480 --> 00:36:30,160
Basically with signals
with high noise,
513
00:36:30,160 --> 00:36:33,070
you want to normalize
by the level of noise
514
00:36:33,070 --> 00:36:37,900
before you incorporate the
impact of that variable
515
00:36:37,900 --> 00:36:38,900
on the particular model.
516
00:36:45,400 --> 00:36:47,560
So here are just some notes
about the inefficiency
517
00:36:47,560 --> 00:36:50,170
of estimates due to
heteroscedasticity.
518
00:36:50,170 --> 00:36:53,032
We can apply generalized
least squares.
519
00:36:53,032 --> 00:36:56,470
A second bullet here is
that factor realizations
520
00:36:56,470 --> 00:37:00,063
can be scaled to represent
factor mimicking portfolios.
521
00:37:02,690 --> 00:37:06,360
Now with the
Fama-French factors,
522
00:37:06,360 --> 00:37:09,360
where we have say big
versus small stocks or value
523
00:37:09,360 --> 00:37:11,410
versus growth
stocks, it would be
524
00:37:11,410 --> 00:37:16,060
nice to know, well what's
the real value of trading
525
00:37:16,060 --> 00:37:17,390
that factor?
526
00:37:17,390 --> 00:37:21,550
If you were to have unit
weight to trading that factor,
527
00:37:21,550 --> 00:37:22,740
would you make money or not?
528
00:37:22,740 --> 00:37:26,040
Or under what periods
would you make money?
529
00:37:26,040 --> 00:37:31,010
And the notion of factor
mimicking portfolios
530
00:37:31,010 --> 00:37:31,590
is important.
531
00:37:31,590 --> 00:37:38,060
Let's go back to the
specification of the factor
532
00:37:38,060 --> 00:37:41,440
realizations here.
533
00:37:41,440 --> 00:37:48,210
f hat t, the t-th realization
of the factors-- the k factors--
534
00:37:48,210 --> 00:37:50,680
is given by essentially
the regression estimate
535
00:37:50,680 --> 00:37:54,400
of those factors from the
realizations of the asset
536
00:37:54,400 --> 00:37:55,150
returns.
537
00:37:55,150 --> 00:37:57,230
And if we're doing
this in the proper way,
538
00:37:57,230 --> 00:38:01,370
we'd be correcting for
the heteroscedasticity.
539
00:38:01,370 --> 00:38:06,810
Well this realization
of the factor returns
540
00:38:06,810 --> 00:38:17,430
is a weighted average or
a weighted sum of the x_t.
541
00:38:17,430 --> 00:38:29,100
So we have basically f_t
is equal to a matrix times
542
00:38:29,100 --> 00:38:39,409
x_t, where this matrix is B prime B,
to the minus 1, times B prime.
543
00:38:42,250 --> 00:38:49,216
So our k-dimensional
realizations-- let's see,
544
00:38:49,216 --> 00:38:54,000
this is basically k by 1.
545
00:38:57,470 --> 00:39:04,100
Each of these k factors is
a weighted sum of these x's.
546
00:39:04,100 --> 00:39:06,570
Now the x's, if these are
returns on the underlying
547
00:39:06,570 --> 00:39:13,770
assets, then we can consider
normalizing these factors.
548
00:39:13,770 --> 00:39:16,740
Or basically normalizing
this matrix here
549
00:39:16,740 --> 00:39:23,880
so that the row weights sum to
1, say, for a unit of capital.
550
00:39:23,880 --> 00:39:28,380
So if we were to invest
a net of one capital
551
00:39:28,380 --> 00:39:32,550
unit in these assets, then
this factor realization
552
00:39:32,550 --> 00:39:38,510
would give us the return on
a portfolio of the assets
553
00:39:38,510 --> 00:39:43,290
that is perfectly correlated
with the factor realization.
554
00:39:43,290 --> 00:39:49,820
So factor mimicking portfolios
can be defined that way.
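One way to sketch that rescaling, with an illustrative loading matrix B: take the regression weight matrix (B'B)^{-1} B' and normalize each of its rows to sum to 1, so each row spends one net unit of capital.

```python
import numpy as np

# Illustrative loading matrix for m = 5 assets and k = 2 factors.
B = np.array([[1.0, 0.0], [2.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 3.0]])

# Regression weights mapping asset returns to factor realizations:
# W = (B'B)^{-1} B', a k x m matrix.
W = np.linalg.solve(B.T @ B, B.T)

# Rescale each row so its weights sum to 1 -- one net unit of capital per
# factor.  Each row of W_mimic is then a factor mimicking portfolio whose
# return is perfectly correlated with the corresponding factor estimate.
W_mimic = W / W.sum(axis=1, keepdims=True)
```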
555
00:39:49,820 --> 00:39:54,220
And they have a
good interpretation
556
00:39:54,220 --> 00:39:57,080
in terms of the realization
of potential investments.
557
00:40:02,980 --> 00:40:03,990
So let's go back.
558
00:40:18,000 --> 00:40:21,870
The next subject is
statistical factor models.
559
00:40:21,870 --> 00:40:26,540
This is the case where
we begin the analysis
560
00:40:26,540 --> 00:40:31,740
with just our collection of
outcomes for the process x_t.
561
00:40:31,740 --> 00:40:34,150
So basically our
time series of asset
562
00:40:34,150 --> 00:40:40,100
returns for m assets
over T time units.
563
00:40:40,100 --> 00:40:44,300
And we have no clue initially
what the underlying factors
564
00:40:44,300 --> 00:40:47,360
are, but we hypothesize
that there are factors that
565
00:40:47,360 --> 00:40:49,570
do characterize the returns.
566
00:40:49,570 --> 00:40:52,090
And factor analysis and
principal components analysis
567
00:40:52,090 --> 00:40:57,890
provide ways of uncovering
those underlying factors
568
00:40:57,890 --> 00:41:01,335
and defining them in terms
of the data themselves.
569
00:41:15,540 --> 00:41:18,020
So we'll first talk
about factor analysis.
570
00:41:18,020 --> 00:41:21,290
Then we'll turn to principal
components analysis.
571
00:41:21,290 --> 00:41:26,290
Both of these
methods are efforts
572
00:41:26,290 --> 00:41:29,360
to model the covariance matrix.
573
00:41:29,360 --> 00:41:37,710
And the underlying covariance
matrix for the assets x
574
00:41:37,710 --> 00:41:40,810
can be estimated with sample
data in terms of the sample
575
00:41:40,810 --> 00:41:42,100
covariance matrix.
576
00:41:42,100 --> 00:41:45,090
So here I've just written
out in matrix form
577
00:41:45,090 --> 00:41:47,700
how that would be computed.
578
00:41:47,700 --> 00:41:57,260
And so with this m by T
matrix x, we basically
579
00:41:57,260 --> 00:42:03,280
take that matrix, take
out the means by computing
580
00:42:03,280 --> 00:42:06,380
the means with multiplying
by this matrix,
581
00:42:06,380 --> 00:42:10,480
and then take the
sum of deviations
582
00:42:10,480 --> 00:42:14,280
about the means for
all the m assets
583
00:42:14,280 --> 00:42:17,470
individually and
across each other,
584
00:42:17,470 --> 00:42:40,170
and divide that by capital T.
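That matrix computation can be checked numerically. Here is a sketch with simulated data (the sizes m and T are arbitrary): right-multiplying the m by T data matrix by the centering matrix I minus 1 1'/T takes out the means, and the outer product of the deviations divided by T gives the sample covariance.

```python
import numpy as np

rng = np.random.default_rng(2)
m, T = 4, 500                         # illustrative sizes
X = rng.normal(size=(m, T))           # m assets observed over T periods

# Demean by right-multiplying with the centering matrix (I - 1 1'/T),
# then divide the outer product of the deviations by T.
center = np.eye(T) - np.ones((T, T)) / T
dev = X @ center
Sigma_hat = dev @ dev.T / T
```

This agrees with `np.cov(X, bias=True)`, which uses the same 1/T convention.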
585
00:42:40,170 --> 00:42:42,750
Now, the setup for
statistical factor models
586
00:42:42,750 --> 00:42:48,810
is exactly the same as
before, except the only thing
587
00:42:48,810 --> 00:42:51,620
that we observe is x_t.
588
00:42:51,620 --> 00:42:59,490
So we're hypothesizing a
model where alpha is basically
589
00:42:59,490 --> 00:43:06,580
a vector that is basically
the vector of mean returns
590
00:43:06,580 --> 00:43:08,420
of the individual assets.
591
00:43:08,420 --> 00:43:14,500
B is a matrix of factor
loadings on k factors f_t.
592
00:43:14,500 --> 00:43:18,240
And epsilon_t is white
noise with mean 0,
593
00:43:18,240 --> 00:43:20,910
covariance matrix
given by the diagonal.
594
00:43:20,910 --> 00:43:25,410
So the setup here is the
same basic setup as before,
595
00:43:25,410 --> 00:43:33,100
but we don't have the
matrix B or the vector f_t.
596
00:43:35,790 --> 00:43:37,030
Or, of course, alpha.
597
00:43:39,920 --> 00:43:43,240
Now in this setup,
it's important
598
00:43:43,240 --> 00:43:49,920
that there is an
indeterminacy of this model,
599
00:43:49,920 --> 00:43:57,450
because for any given
specification of the matrix B
600
00:43:57,450 --> 00:44:04,540
or the factors f, we can
actually transform those
601
00:44:04,540 --> 00:44:09,840
by a k by k invertible
matrix H. So for a given
602
00:44:09,840 --> 00:44:13,900
specification of this model,
if we transform the underlying
603
00:44:13,900 --> 00:44:19,230
factor realizations f by the
matrix H, which is k by k,
604
00:44:19,230 --> 00:44:25,890
then if we transform the
factor loadings B by H inverse,
605
00:44:25,890 --> 00:44:28,290
we get the same model.
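The indeterminacy is easy to verify numerically; below, H is an arbitrary invertible k by k matrix and B, f are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(3)
m, k = 5, 2
B = rng.normal(size=(m, k))              # some factor loadings
f = rng.normal(size=k)                   # some factor realization
H = np.array([[2.0, 1.0], [0.0, 1.0]])   # any invertible k x k matrix

# Transform f -> H f and B -> B H^{-1}: the fitted values B f are
# unchanged, so the data cannot distinguish the two parameterizations.
B2 = B @ np.linalg.inv(H)
f2 = H @ f
```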
606
00:44:28,290 --> 00:44:31,800
So there is an indeterminacy
here, or a-- OK,
607
00:44:31,800 --> 00:44:37,940
there's an indeterminacy of
these particular variables,
608
00:44:37,940 --> 00:44:41,970
but there's basically
flexibility in how
609
00:44:41,970 --> 00:44:44,630
we define the factor model.
610
00:44:44,630 --> 00:44:48,050
So in trying to uncover a factor
model with statistical factor
611
00:44:48,050 --> 00:44:50,990
analysis, there is
some flexibility
612
00:44:50,990 --> 00:44:53,130
in defining our factors.
613
00:44:53,130 --> 00:44:57,030
We can arbitrarily
transform the factors
614
00:44:57,030 --> 00:45:00,500
by an invertible
transformation in the k space.
615
00:45:15,050 --> 00:45:19,560
And I guess it's important
to note that what changes
616
00:45:19,560 --> 00:45:22,550
when we do that transformation?
617
00:45:22,550 --> 00:45:24,570
Well the linear
function stays the same
618
00:45:24,570 --> 00:45:28,040
in terms of the covariance
matrix of the underlying
619
00:45:28,040 --> 00:45:29,180
factors.
620
00:45:29,180 --> 00:45:31,880
Well, if we have a covariance
matrix for those underlying
621
00:45:31,880 --> 00:45:36,350
factors, we need to accommodate
the matrix transformation
622
00:45:36,350 --> 00:45:37,690
H in that.
623
00:45:37,690 --> 00:45:39,640
So that has an impact there.
624
00:45:39,640 --> 00:45:44,030
But one of the
things we can do is
625
00:45:44,030 --> 00:45:49,040
consider trying to
define a matrix H, that
626
00:45:49,040 --> 00:45:50,780
diagonalizes the factors.
627
00:45:50,780 --> 00:45:53,790
So in some settings, it's useful
to consider factor models where
628
00:45:53,790 --> 00:46:00,260
you have uncorrelated
factor components.
629
00:46:00,260 --> 00:46:04,140
And it's possible to
define linear factor
630
00:46:04,140 --> 00:46:09,290
models with uncorrelated factor
components by a choice of H.
631
00:46:09,290 --> 00:46:12,440
So with any linear
factor model in fact,
632
00:46:12,440 --> 00:46:17,571
we can have uncorrelated factor
components if that's useful.
633
00:46:21,720 --> 00:46:26,300
So this first bullet
highlights that point
634
00:46:26,300 --> 00:46:30,200
that we can get
orthonormal factors.
635
00:46:32,930 --> 00:46:37,490
And we can also have
zero mean factors
636
00:46:37,490 --> 00:46:41,530
by adjusting the
data to incorporate
637
00:46:41,530 --> 00:46:43,340
the mean of these factors.
638
00:46:45,930 --> 00:46:53,290
And if we make these
particular assumptions,
639
00:46:53,290 --> 00:46:55,840
then the model does
simplify to just
640
00:46:55,840 --> 00:47:02,400
being the covariance matrix
sigma_x is the factor
641
00:47:02,400 --> 00:47:08,020
loadings B times its transpose
plus a diagonal matrix.
642
00:47:08,020 --> 00:47:11,060
And just to reiterate,
the power of this
643
00:47:11,060 --> 00:47:19,130
is basically no matter how
large m is, as m increases
644
00:47:19,130 --> 00:47:28,004
the B matrix just increases
by k for every increment in m.
645
00:47:28,004 --> 00:47:32,490
And we also have an additional
diagonal entry on the psi.
646
00:47:32,490 --> 00:47:39,320
So as we add more and more
assets to our modeling,
647
00:47:39,320 --> 00:47:42,660
the complexity basically
doesn't increase very much.
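A quick parameter count illustrates that economy (the sizes 500 and 10 below are just for illustration): the factor model needs m times k loadings plus m specific variances, versus m(m+1)/2 free entries in an unrestricted covariance matrix.

```python
# Parameter counts for the covariance model Sigma_x = B B' + Psi versus an
# unrestricted covariance matrix.  Sizes here are illustrative.
def factor_params(m, k):
    return m * k + m          # m*k loadings plus m specific variances

def full_cov_params(m):
    return m * (m + 1) // 2   # free entries of a symmetric m x m matrix

print(factor_params(500, 10))   # 5500
print(full_cov_params(500))     # 125250
```

Each additional asset adds only k loadings and one diagonal entry, while the unrestricted count grows quadratically in m.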
648
00:47:51,720 --> 00:47:55,520
With all of our statistical
models, one of the questions
649
00:47:55,520 --> 00:47:59,850
is how do we specify the
particular parameters?
650
00:47:59,850 --> 00:48:05,930
Maximum likelihood estimation is
the first thing to go through,
651
00:48:05,930 --> 00:48:12,050
and with normal
linear factor models
652
00:48:12,050 --> 00:48:13,580
we have normal
distributions for all
653
00:48:13,580 --> 00:48:16,890
the underlying random variables.
654
00:48:16,890 --> 00:48:19,755
So the residuals
epsilon_t are independent
655
00:48:19,755 --> 00:48:23,970
and identically distributed,
multivariate normal dimension m
656
00:48:23,970 --> 00:48:30,760
with diagonal matrix psi given
by the individual elements'
657
00:48:30,760 --> 00:48:31,740
variances.
658
00:48:31,740 --> 00:48:36,950
f_t, the realization
of the factors,
659
00:48:36,950 --> 00:48:40,490
the k-dimensional
factors can have mean 0,
660
00:48:40,490 --> 00:48:43,970
and just to have the
identity covariance
661
00:48:43,970 --> 00:48:48,550
we can scale them and
make them uncorrelated.
662
00:48:48,550 --> 00:48:53,050
And then x_t will be
normally distributed
663
00:48:53,050 --> 00:48:55,360
with mean alpha and
covariance matrix
664
00:48:55,360 --> 00:48:59,210
sigma_x given by the formulas
in the previous slide.
665
00:49:03,020 --> 00:49:05,370
With these assumptions,
we can write down
666
00:49:05,370 --> 00:49:08,130
the model likelihood.
667
00:49:08,130 --> 00:49:10,440
The model likelihood
is the joint density
668
00:49:10,440 --> 00:49:12,195
of our data given the
unknown parameters.
669
00:49:22,670 --> 00:49:28,720
And the standard setup actually
for statistical factor modeling
670
00:49:28,720 --> 00:49:31,290
is to assume
independence over time.
671
00:49:31,290 --> 00:49:34,900
Now we know that there can
be time series dependence.
672
00:49:34,900 --> 00:49:37,240
We won't deal with
that at this point.
673
00:49:37,240 --> 00:49:41,340
Let's just assume that they
are independent across time.
674
00:49:41,340 --> 00:49:46,280
Then we can consider this
as simply the product
675
00:49:46,280 --> 00:49:51,570
of the conditional density
of x_t given alpha and sigma,
676
00:49:51,570 --> 00:49:54,020
which has this form.
677
00:49:54,020 --> 00:50:00,120
This form for the density
function of a multivariate
678
00:50:00,120 --> 00:50:05,210
normal should be very
familiar to you at this point.
679
00:50:05,210 --> 00:50:07,950
It's basically the extension
of the univariate normal
680
00:50:07,950 --> 00:50:10,010
distribution to m-variate.
681
00:50:10,010 --> 00:50:14,370
So we have 1 over the square
root of 2 pi to the m power.
682
00:50:14,370 --> 00:50:16,380
There are m components.
683
00:50:16,380 --> 00:50:23,170
And then we divide by the square
root of the individual variance
684
00:50:23,170 --> 00:50:26,430
or the determinant of
the covariance matrix.
685
00:50:26,430 --> 00:50:31,970
And then the exponential
of this term here,
686
00:50:31,970 --> 00:50:41,370
which for the t-th case is
a quadratic form in the x's.
687
00:50:41,370 --> 00:50:46,050
So this multivariate normal
x, we take off its mean
688
00:50:46,050 --> 00:50:48,759
and look at the quadratic
form of that with the inverse
689
00:50:48,759 --> 00:50:49,800
of its covariance matrix.
690
00:50:57,650 --> 00:50:59,660
So there's the
log-likelihood function.
691
00:50:59,660 --> 00:51:06,400
It reduces to this form here.
692
00:51:06,400 --> 00:51:09,170
And maximum likelihood
estimation methods
693
00:51:09,170 --> 00:51:16,550
can be applied to specify all
the parameters of B and psi.
694
00:51:16,550 --> 00:51:23,620
And there's an EM algorithm,
which is applied in this case.
695
00:51:23,620 --> 00:51:26,070
I think I may have
highlighted it before,
696
00:51:26,070 --> 00:51:30,000
but the EM algorithm is a very
powerful estimation methodology
697
00:51:30,000 --> 00:51:33,850
for maximum likelihood
in statistics.
698
00:51:33,850 --> 00:51:40,520
When one has very
complicated models which
699
00:51:40,520 --> 00:51:44,530
can be simplified-- well,
models that are complicated
700
00:51:44,530 --> 00:51:47,830
by the fact that we have
hidden variables-- basically
701
00:51:47,830 --> 00:51:51,760
the hidden variables lead
to very complex likelihood
702
00:51:51,760 --> 00:51:54,550
functions.
703
00:51:54,550 --> 00:51:56,330
A simplification
of the EM algorithm
704
00:51:56,330 --> 00:52:00,450
is that if we could observe
some of the hidden variables,
705
00:52:00,450 --> 00:52:02,325
then our likelihood
functions are very simple
706
00:52:02,325 --> 00:52:05,070
and can be computed directly.
707
00:52:05,070 --> 00:52:10,820
And the EM algorithm
alternates estimating
708
00:52:10,820 --> 00:52:14,620
the hidden variables, assuming
the hidden variables are known
709
00:52:14,620 --> 00:52:18,362
doing the simple estimates with
the observed hidden variables,
710
00:52:18,362 --> 00:52:20,320
and then estimating the
hidden variables again,
711
00:52:20,320 --> 00:52:22,860
and just iterating that
process again and again.
712
00:52:22,860 --> 00:52:24,100
And it converges.
713
00:52:24,100 --> 00:52:26,460
And their paper
demonstrates that this
714
00:52:26,460 --> 00:52:29,980
applies in many, many
different application settings.
715
00:52:29,980 --> 00:52:33,610
And it's just a very, very
powerful estimation methodology
716
00:52:33,610 --> 00:52:39,900
that is applied here with
statistical factor analysis.
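A minimal sketch of that EM iteration for this model (these are the standard Rubin-Thayer updates for the normal factor model, assuming demeaned returns; the simulated data and starting values are illustrative, not production code):

```python
import numpy as np

def fa_em(X, k, n_iter=500):
    """EM for x = B f + eps with f ~ N(0, I_k), eps ~ N(0, diag(psi)).
    X is T x m and assumed to be demeaned already."""
    T, m = X.shape
    rng = np.random.default_rng(0)
    B = rng.normal(scale=0.1, size=(m, k))       # arbitrary starting point
    psi = X.var(axis=0)
    for _ in range(n_iter):
        # E-step: posterior moments of the hidden factors given B, psi.
        Pinv = 1.0 / psi
        G = np.linalg.inv(np.eye(k) + (B.T * Pinv) @ B)   # Cov(f | x)
        Ef = X @ (Pinv[:, None] * B) @ G                  # E[f_t | x_t]
        Eff = T * G + Ef.T @ Ef                           # sum E[f f' | x]
        # M-step: re-estimate loadings and specific variances.
        B = np.linalg.solve(Eff, Ef.T @ X).T
        psi = (X * X).sum(axis=0) / T \
            - np.einsum('ij,tj,ti->i', B, Ef, X) / T
        psi = np.maximum(psi, 1e-8)                       # numerical floor
    return B, psi

# Fit on data simulated from a 2-factor model.
rng = np.random.default_rng(4)
T, m, k = 1000, 6, 2
B_true = rng.normal(size=(m, k))
psi_true = 0.3 + 0.3 * rng.uniform(size=m)
X = rng.normal(size=(T, k)) @ B_true.T \
    + rng.normal(size=(T, m)) * np.sqrt(psi_true)
X = X - X.mean(axis=0)
B_hat, psi_hat = fa_em(X, k)

# Implied covariance B B' + diag(psi) versus the sample covariance.
Sigma_fit = B_hat @ B_hat.T + np.diag(psi_hat)
S_sample = np.cov(X, rowvar=False)
```

The loadings are recovered only up to the rotation discussed earlier, so the comparison is made on the implied covariance, which is rotation-invariant.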
717
00:52:39,900 --> 00:52:45,460
I indicated that for now we
could just assume independence
718
00:52:45,460 --> 00:52:49,970
over time of the data points
in computing its likelihood.
719
00:52:49,970 --> 00:52:53,060
You recall our discussion
a couple of lectures back
720
00:52:53,060 --> 00:52:57,260
about the state-space models,
linear state-space models.
721
00:52:57,260 --> 00:53:00,710
Essentially, that linear
state-space model framework
722
00:53:00,710 --> 00:53:03,830
can be applied here as
well to incorporate time
723
00:53:03,830 --> 00:53:06,840
dependence in the data as well.
724
00:53:10,220 --> 00:53:16,190
So that simplification is not
binding in terms of holding us
725
00:53:16,190 --> 00:53:17,970
up in estimating these models.
726
00:53:25,555 --> 00:53:28,320
Let me go back here, OK.
727
00:53:28,320 --> 00:53:32,160
So the maximum likelihood
estimation process
728
00:53:32,160 --> 00:53:37,920
will give us estimates of the
B matrix and the psi matrix.
729
00:53:37,920 --> 00:53:43,630
So applying this EM
algorithm, a good computer
730
00:53:43,630 --> 00:53:51,880
can actually get estimates of
B and psi and the underlying
731
00:53:51,880 --> 00:53:53,880
alpha, of course.
732
00:53:53,880 --> 00:54:03,660
Now from these we can estimate
the factor realizations f_t.
733
00:54:03,660 --> 00:54:11,560
And these can be estimated by
simply this regression formula,
734
00:54:11,560 --> 00:54:13,640
using our estimates for
the factor loadings B
735
00:54:13,640 --> 00:54:17,720
hat, our estimates of
alpha, we can actually
736
00:54:17,720 --> 00:54:20,540
estimate the factor
realizations.
737
00:54:20,540 --> 00:54:24,390
So with statistical
factor analysis,
738
00:54:24,390 --> 00:54:27,360
we use the EM algorithm to
estimate the covariance matrix
739
00:54:27,360 --> 00:54:28,510
parameters.
740
00:54:28,510 --> 00:54:32,455
Then the next step, we
can estimate the factor
741
00:54:32,455 --> 00:54:32,996
realizations.
742
00:54:37,240 --> 00:54:41,310
So as the output
from factor analysis,
743
00:54:41,310 --> 00:54:45,830
we can work with these
factor realizations.
744
00:54:45,830 --> 00:54:50,610
And those realizations
or those estimates
745
00:54:50,610 --> 00:54:52,590
of the realizations
of the factors
746
00:54:52,590 --> 00:55:00,570
can then be used basically
for risk modeling as well.
747
00:55:00,570 --> 00:55:10,150
So we could do a statistical
factor analysis of returns
748
00:55:10,150 --> 00:55:15,980
in, say, the
commodities markets.
749
00:55:15,980 --> 00:55:21,610
And identify what factors are
driving returns and covariances
750
00:55:21,610 --> 00:55:23,910
in commodity markets.
751
00:55:23,910 --> 00:55:26,120
We can then get estimates
of those underlying
752
00:55:26,120 --> 00:55:29,570
factors from the methodology.
753
00:55:29,570 --> 00:55:32,610
We could then use those
as inputs to other models.
754
00:55:32,610 --> 00:55:35,210
Certain stocks may depend
on significant factors
755
00:55:35,210 --> 00:55:36,900
in the commodity markets.
756
00:55:36,900 --> 00:55:41,310
And what they depend on, well
we can use statistical modeling
757
00:55:41,310 --> 00:55:44,880
to identify where
the dependencies are.
758
00:55:44,880 --> 00:55:49,530
So getting these realizations
of the statistical factors
759
00:55:49,530 --> 00:55:52,610
is very useful, not
only to understand
760
00:55:52,610 --> 00:55:55,330
what happened in the
past with the process
761
00:55:55,330 --> 00:55:57,030
and how these
underlying factors vary,
762
00:55:57,030 --> 00:56:00,080
but you can also use those
as inputs to other models.
763
00:56:11,770 --> 00:56:16,950
Finally, let's see,
there was a lot
764
00:56:16,950 --> 00:56:19,050
of interest with
statistical factor
765
00:56:19,050 --> 00:56:23,030
analysis on the interpretation
of the underlying factors.
766
00:56:23,030 --> 00:56:28,320
Of course, in terms
of using any model,
767
00:56:28,320 --> 00:56:32,460
one's confidence
rises when you have
768
00:56:32,460 --> 00:56:34,580
highly interpretable results.
769
00:56:34,580 --> 00:56:37,630
One of the initial applications
of statistical factor analysis
770
00:56:37,630 --> 00:56:40,310
was in measuring IQ.
771
00:56:40,310 --> 00:56:45,070
And how many people here
have taken an IQ test?
772
00:56:45,070 --> 00:56:47,580
Probably everybody
or almost everybody?
773
00:56:47,580 --> 00:56:51,240
Well actually if you want to
work for some hedge funds,
774
00:56:51,240 --> 00:56:54,690
you'll have to
take some IQ tests.
775
00:56:54,690 --> 00:57:00,402
But basically in an IQ test
there are 20, 30, 40 questions.
776
00:57:00,402 --> 00:57:02,360
And they're trying to
measure different aspects
777
00:57:02,360 --> 00:57:04,630
of your ability.
778
00:57:04,630 --> 00:57:09,820
And statistical
factor analysis has
779
00:57:09,820 --> 00:57:12,990
been used to try and understand
what are the underlying
780
00:57:12,990 --> 00:57:14,930
dimensions of intelligence.
781
00:57:14,930 --> 00:57:21,930
And one has the
flexibility of considering
782
00:57:21,930 --> 00:57:25,350
different transformations
of any given
783
00:57:25,350 --> 00:57:30,290
set of estimated factors by this
H matrix for transformation.
784
00:57:30,290 --> 00:57:35,230
And so there has been work in
statistical factor analysis
785
00:57:35,230 --> 00:57:38,520
to find rotations of
the factor loadings
786
00:57:38,520 --> 00:57:42,220
that make the factors
more interpretable.
787
00:57:42,220 --> 00:57:48,650
So I just raise that as there's
potential to get insight
788
00:57:48,650 --> 00:57:51,390
into these underlying factors
if that's appropriate.
789
00:57:51,390 --> 00:57:54,100
In the IQ setting, the
effort was actually
790
00:57:54,100 --> 00:57:57,979
to try and find what
are interpretations
791
00:57:57,979 --> 00:57:59,645
of different dimensions
of intelligence?
792
00:58:07,400 --> 00:58:10,940
We previously talked about
factor mimicking portfolios.
793
00:58:10,940 --> 00:58:13,280
The same thing applies.
794
00:58:13,280 --> 00:58:18,460
One final thing is with
likelihood ratio tests,
795
00:58:18,460 --> 00:58:23,700
one can test for whether
the linear factor model is
796
00:58:23,700 --> 00:58:25,870
a good description of the data.
797
00:58:25,870 --> 00:58:29,950
And so with likelihood
ratio tests,
798
00:58:29,950 --> 00:58:32,950
we compare the
likelihood of the data
799
00:58:32,950 --> 00:58:36,650
where we fit our unknown
parameters, the mean vector
800
00:58:36,650 --> 00:58:41,190
alpha and covariance matrix
sigma, without any constraints.
801
00:58:41,190 --> 00:58:45,850
And then we compare that
to the likelihood function
802
00:58:45,850 --> 00:58:50,070
under the factor model
with, say, k factors.
803
00:58:50,070 --> 00:58:56,930
And the likelihood
ratio tests are
804
00:58:56,930 --> 00:59:00,510
computed by looking at twice the
difference in log likelihoods.
805
00:59:00,510 --> 00:59:04,710
If you take an advanced
course in statistics,
806
00:59:04,710 --> 00:59:08,790
you'll see that basically this
difference in the likelihood
807
00:59:08,790 --> 00:59:13,280
functions under many conditions
is approximately a chi
808
00:59:13,280 --> 00:59:16,030
squared random
variable with degrees
809
00:59:16,030 --> 00:59:18,030
of freedom equal to the
difference in parameters
810
00:59:18,030 --> 00:59:20,070
under the two models.
811
00:59:20,070 --> 00:59:25,230
So that's why it's
specified this way.
812
00:59:25,230 --> 00:59:29,035
But anyway, one can test for
the dimensionality of the factor
813
00:59:29,035 --> 00:59:29,535
model.
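A sketch of that statistic and its degrees of freedom: since the unrestricted MLE of the covariance is the sample matrix S itself, twice the difference in log likelihoods reduces to a log-determinant and trace comparison of S with the fitted factor-model covariance (the matrix S below is illustrative).

```python
import numpy as np

def lr_stat(S, Sigma_k, T):
    """2*(logL_unrestricted - logL_factor) for T observations.  The
    unrestricted MLE of the covariance is S itself, so the statistic
    reduces to this log-determinant / trace form."""
    m = S.shape[0]
    _, logdet_k = np.linalg.slogdet(Sigma_k)
    _, logdet_s = np.linalg.slogdet(S)
    return T * (logdet_k - logdet_s
                + np.trace(np.linalg.solve(Sigma_k, S)) - m)

def lr_df(m, k):
    # m(m+1)/2 free parameters unrestricted, versus m*k + m for (B, psi),
    # less k(k-1)/2 for the rotational indeterminacy of B.
    return m * (m + 1) // 2 - (m * k + m - k * (k - 1) // 2)

# The statistic is zero when the factor model reproduces S exactly, and is
# compared with a chi-squared quantile on lr_df(m, k) degrees of freedom.
S = np.array([[2.0, 0.3], [0.3, 1.0]])
stat_at_truth = lr_stat(S, S, T=100)
```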
814
00:59:33,940 --> 00:59:36,280
Before going into an
example of factor modeling,
815
00:59:36,280 --> 00:59:39,890
I want to cover principal
components analysis.
816
00:59:42,606 --> 00:59:44,230
Actually, principal
components analysis
817
00:59:44,230 --> 00:59:46,990
comes up in factor
modeling, but it's also
818
00:59:46,990 --> 00:59:52,700
a methodology that's appropriate
for modeling multivariate data
819
00:59:52,700 --> 00:59:56,410
and considering
dimensionality reduction.
820
00:59:56,410 --> 00:59:59,750
You're dealing with data
in very many dimensions.
821
00:59:59,750 --> 01:00:05,680
You're wondering is there
a simple characterization
822
01:00:05,680 --> 01:00:08,720
of the multivariate
structure that lies
823
01:00:08,720 --> 01:00:10,770
in a smaller dimensional space?
824
01:00:10,770 --> 01:00:14,130
And principal components
analysis gives us that.
825
01:00:14,130 --> 01:00:18,320
The theoretical framework for
principal components analysis
826
01:00:18,320 --> 01:00:23,300
is to consider an
m-variate random variable.
827
01:00:23,300 --> 01:00:27,650
So this is like a single
realization of asset returns
828
01:00:27,650 --> 01:00:31,620
in a given time, which has
some mean and covariance matrix
829
01:00:31,620 --> 01:00:32,120
sigma.
830
01:00:34,876 --> 01:00:36,250
The principal
components analysis
831
01:00:36,250 --> 01:00:41,190
is going to exploit the
eigenvalues and eigenvectors
832
01:00:41,190 --> 01:00:42,490
of the covariance matrix.
833
01:00:45,530 --> 01:00:50,320
Choongbum went through
eigenvalues and singular value
834
01:00:50,320 --> 01:00:51,370
decompositions.
835
01:00:51,370 --> 01:00:55,640
So here we basically have
the eigenvalue/eigenvector
836
01:00:55,640 --> 01:00:58,930
decomposition of our
covariance matrix sigma, which
837
01:00:58,930 --> 01:01:04,700
is the sum of the scalar eigenvalues
lambda_i times the eigenvectors
838
01:01:04,700 --> 01:01:08,270
gamma_i times their transposes.
839
01:01:08,270 --> 01:01:12,900
So we actually are able to
decompose our covariance matrix
840
01:01:12,900 --> 01:01:15,450
with eigenvalues, eigenvectors.
841
01:01:15,450 --> 01:01:20,980
The principal
component variables
842
01:01:20,980 --> 01:01:28,190
are defined by subtracting the
mean vector alpha from the
843
01:01:28,190 --> 01:01:29,390
random vector x.
844
01:01:29,390 --> 01:01:35,800
And then consider the weighted
average of those de-meaned x's
845
01:01:35,800 --> 01:01:39,630
given by the i-th eigenvector.
846
01:01:39,630 --> 01:01:42,210
So these are going to be
called the principal component
847
01:01:42,210 --> 01:01:46,710
variables, where gamma_1 is
the first one corresponding
848
01:01:46,710 --> 01:01:48,450
to the largest eigenvalue.
849
01:01:48,450 --> 01:01:51,609
gamma_m is going to be the
m-th, or last, corresponding
850
01:01:51,609 --> 01:01:52,275
to the smallest.
851
01:01:59,690 --> 01:02:07,650
The properties of these
principal component variables
852
01:02:07,650 --> 01:02:14,030
are that they have mean 0,
and their covariance matrix
853
01:02:14,030 --> 01:02:17,610
is given by the diagonal
matrix of eigenvalues.
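A minimal numpy sketch of these two properties, using simulated data in place of asset returns (the mean vector and covariance matrix are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical m-variate data standing in for asset returns.
X = rng.multivariate_normal([0.1, 0.2, 0.3],
                            [[1.0, 0.6, 0.3],
                             [0.6, 1.0, 0.5],
                             [0.3, 0.5, 1.0]], size=5000)

alpha = X.mean(axis=0)                   # sample mean vector
sigma = np.cov(X, rowvar=False)          # sample covariance matrix

# Eigendecomposition: sigma = Gamma Lambda Gamma'
eigvals, Gamma = np.linalg.eigh(sigma)
order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
eigvals, Gamma = eigvals[order], Gamma[:, order]

# Principal component variables: p = Gamma' (x - alpha)
P = (X - alpha) @ Gamma

# Their mean is 0, and their covariance is diag(lambda_1, ..., lambda_m).
cov_P = np.cov(P, rowvar=False)
```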
854
01:02:17,610 --> 01:02:21,670
So the principal
component variables
855
01:02:21,670 --> 01:02:25,210
are a very simple sort
of affine transformation
856
01:02:25,210 --> 01:02:29,760
of the original variable x.
857
01:02:29,760 --> 01:02:38,450
We translate x to a new origin,
basically to the 0 origin,
858
01:02:38,450 --> 01:02:41,260
by subtracting the means off it.
859
01:02:41,260 --> 01:02:46,710
And then we multiply
that de-meaned x value
860
01:02:46,710 --> 01:02:51,335
by an orthogonal
matrix gamma prime.
861
01:02:54,004 --> 01:02:54,920
And what does that do?
862
01:02:54,920 --> 01:02:59,450
That simply rotates
the coordinate axes.
863
01:02:59,450 --> 01:03:04,330
So what we're doing is creating
a new coordinate system
864
01:03:04,330 --> 01:03:07,860
for our data, which
hasn't changed
865
01:03:07,860 --> 01:03:11,380
the relative position of the
data or the random variable
866
01:03:11,380 --> 01:03:14,600
at all in the space.
867
01:03:14,600 --> 01:03:18,280
Basically, it just is using
a new coordinate system
868
01:03:18,280 --> 01:03:22,389
with no change in the
overall variability of what
869
01:03:22,389 --> 01:03:23,886
we're working with.
870
01:03:38,170 --> 01:03:46,350
In matrix form, we can express
this principal component
871
01:03:46,350 --> 01:03:48,660
variables p.
872
01:03:51,540 --> 01:03:54,830
Let's consider partitioning
p into the first k
873
01:03:54,830 --> 01:03:59,750
elements p_1 and the last
m minus k elements p_2.
874
01:03:59,750 --> 01:04:05,463
Then our original random vector
x has this decomposition.
875
01:04:09,320 --> 01:04:13,530
And we can think of this
as being approximately
876
01:04:13,530 --> 01:04:14,850
a linear factor model.
877
01:04:19,790 --> 01:04:24,260
We can consider from
principal components analysis
878
01:04:24,260 --> 01:04:26,720
essentially if p_1, the
principal component variables,
879
01:04:26,720 --> 01:04:32,900
correspond to our factors,
then our linear factor model
880
01:04:32,900 --> 01:04:37,940
would have B as given by
gamma_1, F as given by p_1.
881
01:04:37,940 --> 01:04:42,400
And our epsilon vector would
be given by gamma_2 p_2.
882
01:04:42,400 --> 01:04:45,110
So the principal
components decomposition
883
01:04:45,110 --> 01:04:48,830
is almost a linear factor model.
884
01:04:48,830 --> 01:04:59,910
The only issue is this
gamma_2 p_2 is an m-vector,
885
01:04:59,910 --> 01:05:06,340
but it may not have a
diagonal covariance matrix.
886
01:05:06,340 --> 01:05:10,360
Under the linear factor model
with a given set of factors
887
01:05:10,360 --> 01:05:12,630
k less than m, we
always are assuming
888
01:05:12,630 --> 01:05:17,810
that the residual vector
has covariance matrix
889
01:05:17,810 --> 01:05:19,140
equal to a diagonal.
890
01:05:19,140 --> 01:05:21,530
With a principal
components analysis,
891
01:05:21,530 --> 01:05:25,810
that may or may not be true.
892
01:05:25,810 --> 01:05:29,870
So this is like an
approximate factor model,
893
01:05:29,870 --> 01:05:32,540
but that's why this is called
principal components analysis.
894
01:05:32,540 --> 01:05:35,792
It's not called principal
factor analysis yet.
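This gap can be seen numerically: the decomposition x = alpha + gamma_1 p_1 + gamma_2 p_2 reconstructs x exactly, but the "residual" gamma_2 p_2 generally does not have a diagonal covariance matrix. A sketch on simulated data (the covariance matrix and choice k = 1 are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.multivariate_normal(np.zeros(4),
                            np.array([[2.0, 0.8, 0.4, 0.2],
                                      [0.8, 1.5, 0.6, 0.3],
                                      [0.4, 0.6, 1.0, 0.5],
                                      [0.2, 0.3, 0.5, 0.8]]), size=4000)
alpha = X.mean(axis=0)
sigma = np.cov(X, rowvar=False)
eigvals, Gamma = np.linalg.eigh(sigma)
order = np.argsort(eigvals)[::-1]
eigvals, Gamma = eigvals[order], Gamma[:, order]

k = 1                                   # keep the first k components as "factors"
Gamma1, Gamma2 = Gamma[:, :k], Gamma[:, k:]
P = (X - alpha) @ Gamma
P1, P2 = P[:, :k], P[:, k:]

# x = alpha + Gamma1 p1 + Gamma2 p2 exactly; Gamma2 p2 plays the role of epsilon.
X_rebuilt = alpha + P1 @ Gamma1.T + P2 @ Gamma2.T

# The residual covariance is generally NOT diagonal, unlike a true factor model.
eps_cov = np.cov(P2 @ Gamma2.T, rowvar=False)
```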
895
01:05:45,130 --> 01:05:49,870
The empirical principal
components analysis now.
896
01:05:49,870 --> 01:05:51,870
We've gone through
just a description
897
01:05:51,870 --> 01:05:54,670
of theoretical principal
components, where
898
01:05:54,670 --> 01:05:58,454
if we have a mean vector alpha,
covariance matrix sigma, how
899
01:05:58,454 --> 01:06:00,620
we would define these
principal component variables.
900
01:06:00,620 --> 01:06:05,400
If we just have sample
data, then this slide
901
01:06:05,400 --> 01:06:08,782
goes through the computations
of the empirical principal
902
01:06:08,782 --> 01:06:09,970
components results.
903
01:06:09,970 --> 01:06:14,120
So all we're doing is
substituting in estimates
904
01:06:14,120 --> 01:06:17,220
of the means and
covariance matrix,
905
01:06:17,220 --> 01:06:19,110
and computing the
eigenvalue/eigenvector
906
01:06:19,110 --> 01:06:21,060
decomposition of that.
907
01:06:21,060 --> 01:06:25,180
And we get sample principal
component variables
908
01:06:25,180 --> 01:06:31,720
which are-- we basically
compute x, the de-meaned vector,
909
01:06:31,720 --> 01:06:38,780
or matrix of realizations and
pre-multiply that by gamma hat
910
01:06:38,780 --> 01:06:44,470
prime, which is the
matrix of eigenvectors
911
01:06:44,470 --> 01:06:46,570
corresponding to the
eigenvalue/eigenvector
912
01:06:46,570 --> 01:06:48,964
decomposition of the
sample covariance matrix.
913
01:06:54,790 --> 01:06:59,960
This slide goes through the
singular value decomposition.
914
01:06:59,960 --> 01:07:03,840
You don't have to go through and
compute variances, covariances
915
01:07:03,840 --> 01:07:08,340
to derive estimates of the
principal component variables.
916
01:07:08,340 --> 01:07:11,600
You can work simply with the
singular value decomposition.
917
01:07:11,600 --> 01:07:15,804
I'll let you go through
that on your own.
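The SVD route can be sketched as follows: the singular values of the de-meaned data matrix give the eigenvalues of the sample covariance after squaring and dividing by n minus 1, and no covariance matrix ever needs to be formed. The data here are simulated, not the lecture's:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 5))          # hypothetical data matrix
Xc = X - X.mean(axis=0)                 # de-meaned data matrix

# Thin SVD of the centered data: Xc = U S V'
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

n = Xc.shape[0]
eig_from_svd = s**2 / (n - 1)           # eigenvalues of the sample covariance
eig_direct = np.sort(np.linalg.eigvalsh(np.cov(Xc, rowvar=False)))[::-1]

# Sample principal component variables, with no covariance matrix computed.
P = Xc @ Vt.T                           # equals U scaled columnwise by s
```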
918
01:07:15,804 --> 01:07:18,220
There's an alternate definition
of the principal component
919
01:07:18,220 --> 01:07:19,803
variable though
that's very important.
920
01:07:27,270 --> 01:07:32,470
If we consider a
linear combination
921
01:07:32,470 --> 01:07:40,850
of the components of x, x_1
through x_m, given by w,
922
01:07:40,850 --> 01:07:45,150
if we consider a linear
combination of that which
923
01:07:45,150 --> 01:07:48,390
maximizes the variability
of that linear combination
924
01:07:48,390 --> 01:07:56,040
subject to the norm of the
coefficients w equals 1,
925
01:07:56,040 --> 01:08:00,340
then this is the first
principal component variable.
926
01:08:00,340 --> 01:08:08,250
So if we have in two
dimensions the x_1 and x_2,
927
01:08:08,250 --> 01:08:21,540
if we have points that look like
an ellipsoidal distribution,
928
01:08:21,540 --> 01:08:28,850
this would correspond to having
alpha 1 there, alpha 2 there,
929
01:08:28,850 --> 01:08:32,410
a sort of degree of variability.
930
01:08:32,410 --> 01:08:35,770
The principal
components analysis
931
01:08:35,770 --> 01:08:42,620
says, let's shift to the origin
being at (alpha_1, alpha_2).
932
01:08:42,620 --> 01:08:50,370
And then let's rotate the axes
to align with the eigenvectors.
933
01:08:50,370 --> 01:08:54,170
Well the first principal
component variable
934
01:08:54,170 --> 01:09:02,550
finds the coordinate axis
along which
935
01:09:02,550 --> 01:09:04,800
the variability is a maximum.
936
01:09:04,800 --> 01:09:07,350
And basically along
this dimension
937
01:09:07,350 --> 01:09:11,790
here, this is where
the variability
938
01:09:11,790 --> 01:09:12,800
would be the maximum.
939
01:09:12,800 --> 01:09:15,529
And that's the first
principal component variable.
940
01:09:15,529 --> 01:09:18,660
So this principal
components analysis
941
01:09:18,660 --> 01:09:20,930
is identifying
essentially where's
942
01:09:20,930 --> 01:09:23,620
there the most
variability in the data,
943
01:09:23,620 --> 01:09:28,390
where it's the most variability
without doing any change
944
01:09:28,390 --> 01:09:30,270
in the scaling of the data?
945
01:09:30,270 --> 01:09:33,816
All we're doing is
shifting and rotating.
946
01:09:33,816 --> 01:09:35,649
Then the second principal
component variable
947
01:09:35,649 --> 01:09:38,420
is basically the
direction which is
948
01:09:38,420 --> 01:09:42,529
orthogonal to the first, which
has the maximum variance.
949
01:09:42,529 --> 01:09:46,400
And continuing that
process to define all
950
01:09:46,400 --> 01:09:48,160
m principal component variables.
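The variational definition can be spot-checked numerically: no random unit vector achieves more variance w' sigma w than the top eigenvector, whose variance equals the largest eigenvalue. The covariance matrix below is a made-up example:

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = np.array([[2.0, 0.9, 0.3],      # hypothetical covariance matrix
                  [0.9, 1.5, 0.4],
                  [0.3, 0.4, 1.0]])
eigvals, Gamma = np.linalg.eigh(sigma)  # eigh sorts eigenvalues ascending
w_star, lam_max = Gamma[:, -1], eigvals[-1]

# Spot-check: no random unit vector beats the top eigenvector's variance.
best_random = 0.0
for _ in range(10000):
    w = rng.normal(size=3)
    w /= np.linalg.norm(w)              # enforce the constraint ||w|| = 1
    best_random = max(best_random, w @ sigma @ w)
```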
951
01:09:56,780 --> 01:09:58,180
In principal
components analysis,
952
01:09:58,180 --> 01:10:01,600
there are discussions of the
total variability of the data
953
01:10:01,600 --> 01:10:05,460
and how well that's explained
by principal components.
954
01:10:05,460 --> 01:10:09,030
If we have a covariance
matrix sigma,
955
01:10:09,030 --> 01:10:13,390
the total variance
can be defined
956
01:10:13,390 --> 01:10:17,960
and is defined as the sum
of the diagonal entries.
957
01:10:17,960 --> 01:10:21,040
So it's the trace of
a covariance matrix.
958
01:10:21,040 --> 01:10:25,220
We'll call that the total
variance of this multivariate
959
01:10:25,220 --> 01:10:27,160
x.
960
01:10:27,160 --> 01:10:31,630
That is equal to the sum
of the eigenvalues as well.
961
01:10:31,630 --> 01:10:36,520
So we have a decomposition
of the total variability
962
01:10:36,520 --> 01:10:40,050
into the variability of
different principal component
963
01:10:40,050 --> 01:10:42,070
variables.
964
01:10:42,070 --> 01:10:44,250
And the principal
component variables
965
01:10:44,250 --> 01:10:48,459
themselves are uncorrelated.
966
01:10:48,459 --> 01:10:49,875
You remember the
covariance matrix
967
01:10:49,875 --> 01:10:51,374
of the principal
component variables
968
01:10:51,374 --> 01:10:56,750
was the lambda, the diagonal
matrix of eigenvalues.
969
01:10:56,750 --> 01:11:00,060
So the off-diagonal
entries are 0.
970
01:11:00,060 --> 01:11:01,590
So the principal
component variables
971
01:11:01,590 --> 01:11:05,100
are uncorrelated, and
have variability lambda_k,
972
01:11:05,100 --> 01:11:07,610
and basically decompose
the variability.
973
01:11:07,610 --> 01:11:09,760
So principal components
analysis provides
974
01:11:09,760 --> 01:11:14,140
this very nice
decomposition of the data
975
01:11:14,140 --> 01:11:18,020
into different
dimensions, with highest
976
01:11:18,020 --> 01:11:22,450
to lowest information content
as given by the eigenvalues.
977
01:11:28,690 --> 01:11:34,140
I want to go
through a case study
978
01:11:34,140 --> 01:11:41,295
here of doing factor modeling
with the U.S. Treasury yields.
979
01:11:43,922 --> 01:11:49,040
I loaded in data into R, which
ranged from the beginning
980
01:11:49,040 --> 01:11:54,280
of 2000 to the end of May 2013.
981
01:11:54,280 --> 01:11:58,750
And here are the yields on
constant maturity U.S. Treasury
982
01:11:58,750 --> 01:12:01,100
securities ranging from
3 months, 6 months,
983
01:12:01,100 --> 01:12:03,050
up to 20 years.
984
01:12:03,050 --> 01:12:06,100
So this is essentially
the term structure
985
01:12:06,100 --> 01:12:12,858
of US Government [INAUDIBLE]
of varying levels of risk.
986
01:12:18,170 --> 01:12:25,292
Here's a plot of [INAUDIBLE]
over that period.
987
01:12:33,148 --> 01:12:36,585
So starting in the
[INAUDIBLE], we
988
01:12:36,585 --> 01:12:41,570
can see this, the rather
dramatic evolution of the term
989
01:12:41,570 --> 01:12:44,891
structure over
this entire period.
990
01:12:44,891 --> 01:12:47,320
If we wanted to have
set any [INAUDIBLE].
991
01:12:52,732 --> 01:12:55,190
If we wanted to do a principal
components analysis of this,
992
01:12:55,190 --> 01:12:57,900
well if we did the
entire period we'd
993
01:12:57,900 --> 01:13:01,040
be measuring variability
of all kinds of things.
994
01:13:01,040 --> 01:13:03,580
When things go down, up, down.
995
01:13:03,580 --> 01:13:07,380
What I've done in this
note is just initially
996
01:13:07,380 --> 01:13:15,750
to look at the period
from 2001 up through 2005.
997
01:13:15,750 --> 01:13:20,620
So we have five years of data
on basically the early part
998
01:13:20,620 --> 01:13:23,380
of this period that I
want to focus on and do
999
01:13:23,380 --> 01:13:32,340
a principal components analysis
of the yields on this data.
1000
01:13:32,340 --> 01:13:40,845
So here's basically the series
over that five year period.
1001
01:13:44,470 --> 01:13:47,110
Beginning of this
analysis, this analysis
1002
01:13:47,110 --> 01:13:50,590
is on the actual yield changes.
1003
01:13:50,590 --> 01:13:53,940
So just as we might be modeling
say asset prices over time
1004
01:13:53,940 --> 01:13:58,515
and then doing an analysis
of the changes, the returns,
1005
01:13:58,515 --> 01:14:00,015
here we're looking
at yield changes.
1006
01:14:07,080 --> 01:14:12,000
So first, you can see
there's-- basically,
1007
01:14:12,000 --> 01:14:17,170
the average daily value for the
different yield tenors ranging
1008
01:14:17,170 --> 01:14:20,250
from 3 months up to 20, those
are actually all negative.
1009
01:14:20,250 --> 01:14:24,360
That corresponds to the time
series over that five year
1010
01:14:24,360 --> 01:14:25,160
period.
1011
01:14:25,160 --> 01:14:29,480
Basically the time series
were all at lower levels
1012
01:14:29,480 --> 01:14:31,800
from beginning to
end on average.
1013
01:14:31,800 --> 01:14:36,400
The daily volatility is the
daily standard deviation.
1014
01:14:36,400 --> 01:14:42,590
Those vary from
0.0384 up to .0698
1015
01:14:42,590 --> 01:14:45,650
for-- is that the three year?
1016
01:14:45,650 --> 01:14:49,920
And this is the standard
deviation of daily yield
1017
01:14:49,920 --> 01:14:55,650
changes where 1 is like 1%.
1018
01:14:55,650 --> 01:15:02,310
And so basically the variation
in the yield changes is
1019
01:15:02,310 --> 01:15:05,407
between three and six
basis points a day.
1020
01:15:05,407 --> 01:15:06,990
So that's something
that's reasonable.
1021
01:15:06,990 --> 01:15:10,720
When you look at the
news or a newspaper
1022
01:15:10,720 --> 01:15:13,520
and see how interest
rates change from one day
1023
01:15:13,520 --> 01:15:15,780
to the next, it's generally
a few basis points
1024
01:15:15,780 --> 01:15:17,250
from one day to the next.
1025
01:15:20,680 --> 01:15:26,560
This next matrix is
the correlation matrix
1026
01:15:26,560 --> 01:15:27,885
of the yield changes.
1027
01:15:30,440 --> 01:15:32,650
If you look at
this closely, which
1028
01:15:32,650 --> 01:15:38,480
you can when you
download these results,
1029
01:15:38,480 --> 01:15:42,310
you'll see that
near the diagonal
1030
01:15:42,310 --> 01:15:47,870
the values are very high, like
above 90% for correlation.
1031
01:15:47,870 --> 01:15:51,930
And as you move across,
away from the diagonal,
1032
01:15:51,930 --> 01:15:53,918
the correlations
get lower and lower.
1033
01:15:58,180 --> 01:16:02,870
Mathematically that
is what is happening.
1034
01:16:02,870 --> 01:16:05,800
We can look at these things
graphically, which I always
1035
01:16:05,800 --> 01:16:06,810
like to do.
1036
01:16:06,810 --> 01:16:11,840
Here is just a graph, a bar
chart of the yield changes
1037
01:16:11,840 --> 01:16:14,290
and the standard
deviations of the yield
1038
01:16:14,290 --> 01:16:18,910
changes, daily volatilities
ranging from very short yields
1039
01:16:18,910 --> 01:16:22,020
to long-tenor yields,
up to 20 years.
1040
01:16:25,956 --> 01:16:28,416
So there's variability there.
1041
01:16:35,840 --> 01:16:40,500
Here is a pairs
plot of the data.
1042
01:16:40,500 --> 01:16:45,460
So what I've done is
just looked at basically
1043
01:16:45,460 --> 01:16:50,390
for every single tenor, this
is say the 5 year, 7 year,
1044
01:16:50,390 --> 01:16:53,015
10 year, 20 year.
1045
01:16:53,015 --> 01:16:55,970
I basically plotted
the yield changes
1046
01:16:55,970 --> 01:16:57,800
of each of those
against each other.
1047
01:16:57,800 --> 01:17:01,245
We could do this with basically
all nine different tenors,
1048
01:17:01,245 --> 01:17:08,690
and we'd have a very dense
page of a pairs plot.
1049
01:17:08,690 --> 01:17:10,950
So I split it up
into looking just
1050
01:17:10,950 --> 01:17:14,890
at the top and bottom
block diagonals.
1051
01:17:14,890 --> 01:17:18,000
But you can see basically
how the correlation
1052
01:17:18,000 --> 01:17:23,190
between these yield
changes is very tight
1053
01:17:23,190 --> 01:17:26,110
and then gets less tight
as you move further away.
1054
01:17:26,110 --> 01:17:33,250
With the long
tenors-- let's see,
1055
01:17:33,250 --> 01:17:39,030
the short tenors--
one, one more.
1056
01:17:39,030 --> 01:17:44,070
Here the short tenors, ranging
from 3 year, 2 year, 1 year,
1057
01:17:44,070 --> 01:17:45,240
6 month, and so forth.
1058
01:17:45,240 --> 01:17:48,660
So here you can see how it
gets less and less correlated
1059
01:17:48,660 --> 01:17:50,950
as you move away
from a given tenor.
1060
01:17:53,730 --> 01:17:58,100
Well the principal
components analysis
1061
01:17:58,100 --> 01:18:11,700
gives us-- if you conduct
a principal components analysis,
1062
01:18:11,700 --> 01:18:14,200
basically the standard
output is first
1063
01:18:14,200 --> 01:18:18,610
a table of how the
variability of the series
1064
01:18:18,610 --> 01:18:22,210
is broken down across the
different component variables.
1065
01:18:22,210 --> 01:18:24,990
And so there's
basically the importance
1066
01:18:24,990 --> 01:18:29,640
of components for each of
the nine component variables
1067
01:18:29,640 --> 01:18:36,260
where it's measured in terms
of the relative squared
1068
01:18:36,260 --> 01:18:41,140
standard deviations of these
variables relative to the sum.
1069
01:18:41,140 --> 01:18:43,400
And the proportion
of variance explained
1070
01:18:43,400 --> 01:18:47,030
by the first component
variable is 0.849.
1071
01:18:47,030 --> 01:18:50,300
So basically 85% of
the total variability
1072
01:18:50,300 --> 01:18:54,000
is explained by the first
principal component variable.
1073
01:18:54,000 --> 01:18:57,990
Looking at the second
row, second in, 0.0919,
1074
01:18:57,990 --> 01:19:02,042
that's the percentage
of total variability
1075
01:19:02,042 --> 01:19:04,250
explained by the second
principal component variable.
1076
01:19:04,250 --> 01:19:05,700
So 9%.
1077
01:19:05,700 --> 01:19:08,920
And then for third
it's around 3%.
1078
01:19:08,920 --> 01:19:20,940
And it just goes
down closer to 0.
1079
01:19:20,940 --> 01:19:23,800
There's a scree plot for
principal components analysis,
1080
01:19:23,800 --> 01:19:26,352
which is just a plot
of the variability
1081
01:19:26,352 --> 01:19:28,310
of the different principal
component variables.
1082
01:19:28,310 --> 01:19:34,510
So you can see whether
the principal components
1083
01:19:34,510 --> 01:19:37,830
is explaining much variability
in the first few components
1084
01:19:37,830 --> 01:19:38,350
or not.
1085
01:19:38,350 --> 01:19:41,280
Here there's a huge
amount of variability
1086
01:19:41,280 --> 01:19:43,910
explained by the first
principal component variable.
1087
01:19:43,910 --> 01:19:47,930
I've plotted here the
standard deviations
1088
01:19:47,930 --> 01:19:50,616
of the original yield
changes in green,
1089
01:19:50,616 --> 01:19:52,990
versus the standard deviations
of the principal component
1090
01:19:52,990 --> 01:19:55,620
variables in blue.
1091
01:19:55,620 --> 01:20:02,090
So we basically are modeling
with principal component
1092
01:20:02,090 --> 01:20:03,900
variables most of
the variability
1093
01:20:03,900 --> 01:20:06,620
in the first few
principal components.
1094
01:20:06,620 --> 01:20:08,489
Now let's look at
the interpretation
1095
01:20:08,489 --> 01:20:10,030
of the principal
component variables.
1096
01:20:10,030 --> 01:20:12,400
There's the loadings
matrix, which
1097
01:20:12,400 --> 01:20:16,280
is the gamma matrix for the
principal components variables.
1098
01:20:19,440 --> 01:20:25,200
Looking at numbers is
less informative for me
1099
01:20:25,200 --> 01:20:26,160
than looking at graphs.
1100
01:20:26,160 --> 01:20:30,620
Here's a plot of the loadings
on the different yield
1101
01:20:30,620 --> 01:20:34,070
changes for the first
principal component variable.
1102
01:20:34,070 --> 01:20:36,120
So the first principal
component variable
1103
01:20:36,120 --> 01:20:39,760
is a weighted average of
all the yield changes,
1104
01:20:39,760 --> 01:20:44,690
giving greatest weight
to the five year.
1105
01:20:44,690 --> 01:20:45,210
What's that?
1106
01:20:45,210 --> 01:20:51,220
Well that's just a measure of a
level shift in the yield curve.
1107
01:20:51,220 --> 01:20:52,750
It's like, what's
the average yield
1108
01:20:52,750 --> 01:20:55,747
change across the whole range?
1109
01:20:55,747 --> 01:20:57,580
So that's what the first
principal component
1110
01:20:57,580 --> 01:20:59,720
variable is measuring.
1111
01:20:59,720 --> 01:21:03,440
The second principal component
variable gives positive weight
1112
01:21:03,440 --> 01:21:07,250
to the long tenors, negative
weight to the short tenors.
1113
01:21:07,250 --> 01:21:11,540
So it's looking at the
difference between the yield
1114
01:21:11,540 --> 01:21:13,920
changes on the long
tenors versus the yield
1115
01:21:13,920 --> 01:21:15,610
change on the short tenors.
1116
01:21:15,610 --> 01:21:19,774
So that's looking at how the
spread in yields is changing.
1117
01:21:27,090 --> 01:21:32,250
Then the third principal
component variable
1118
01:21:32,250 --> 01:21:36,190
has this structure.
1119
01:21:36,190 --> 01:21:38,780
And this structure
for the weights
1120
01:21:38,780 --> 01:21:40,050
is like a double difference.
1121
01:21:40,050 --> 01:21:44,570
It's looking at the difference
between the long tenor
1122
01:21:44,570 --> 01:21:48,150
and medium tenor, minus
the difference between
1123
01:21:48,150 --> 01:21:50,710
the medium tenor and short tenor.
1124
01:21:50,710 --> 01:21:54,100
So that's giving us a measure
of the curvature of the term
1125
01:21:54,100 --> 01:21:57,440
structure and how that's
changing over time.
1126
01:21:57,440 --> 01:21:59,350
So these principal
component variables
1127
01:21:59,350 --> 01:22:01,600
are measuring the level
shift for the first,
1128
01:22:01,600 --> 01:22:04,710
the spread for the second, and
the curvature for the third.
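The three interpretations amount to simple contrasts of the yield changes. A toy calculation with made-up numbers (in percent), not the lecture's estimated loadings:

```python
# Hypothetical daily yield changes at a short, medium, and long tenor.
d_short, d_mid, d_long = 0.03, 0.05, 0.04

level = (d_short + d_mid + d_long) / 3            # first PC: average shift
spread = d_long - d_short                         # second PC: long minus short
curvature = (d_long - d_mid) - (d_mid - d_short)  # third PC: double difference
```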
1129
01:22:07,350 --> 01:22:09,250
With principal
components analysis,
1130
01:22:09,250 --> 01:22:11,879
many times I think
people focus just
1131
01:22:11,879 --> 01:22:14,170
on the first few principal
component variables and then
1132
01:22:14,170 --> 01:22:16,480
say they're done.
1133
01:22:16,480 --> 01:22:19,137
The last principal component
variable, and the last few,
1134
01:22:19,137 --> 01:22:20,720
can be very, very
interesting as well,
1135
01:22:20,720 --> 01:22:27,640
because these are the variables
of the original scales,
1136
01:22:27,640 --> 01:22:33,420
the linear combinations which
have the least variability.
1137
01:22:33,420 --> 01:22:35,760
And if you look at the
ninth principal component
1138
01:22:35,760 --> 01:22:37,500
variable-- there were
nine yield changes
1139
01:22:37,500 --> 01:22:43,810
here-- it's basically looking
at a weighted average of the 5
1140
01:22:43,810 --> 01:22:47,580
and 10 year minus the 7 year.
1141
01:22:47,580 --> 01:22:53,240
So this is like the hedge of the
7 year yield with the 5 and 10
1142
01:22:53,240 --> 01:22:53,739
year.
1143
01:22:56,910 --> 01:23:00,600
So that difference
in yield change
1144
01:23:00,600 --> 01:23:03,005
is-- that combination
of yield change
1145
01:23:03,005 --> 01:23:04,630
is going to have the
least variability.
1146
01:23:07,310 --> 01:23:08,720
The principal
component variables
1147
01:23:08,720 --> 01:23:10,860
have zero correlation.
1148
01:23:10,860 --> 01:23:14,840
Here's just a pairs plot of the
first three principal component
1149
01:23:14,840 --> 01:23:16,010
variables and the ninth.
1150
01:23:16,010 --> 01:23:18,670
And you can see
that those have been
1151
01:23:18,670 --> 01:23:22,240
transformed to have zero
correlations with each other.
1152
01:23:24,750 --> 01:23:31,540
One can plot the cumulative
principal component variables
1153
01:23:31,540 --> 01:23:35,570
over time to see how the
evolution of these underlying
1154
01:23:35,570 --> 01:23:38,820
factors has changed
over the time period.
1155
01:23:38,820 --> 01:23:42,300
And you'll recall that
we talked about the first
1156
01:23:42,300 --> 01:23:43,720
being the level shift.
1157
01:23:43,720 --> 01:23:49,750
Basically from 2001 to 2005, the
overall level of interest rates
1158
01:23:49,750 --> 01:23:51,150
went down and then went up.
1159
01:23:51,150 --> 01:23:54,030
And this is captured by this
first principal component
1160
01:23:54,030 --> 01:24:00,770
variable accumulating from 0
down to minus 8, back up to 0.
1161
01:24:06,170 --> 01:24:11,920
And the scale of this
change from 0 to minus 8
1162
01:24:11,920 --> 01:24:16,270
is the amount of
greatest variability.
1163
01:24:16,270 --> 01:24:19,340
The second principal
component variable
1164
01:24:19,340 --> 01:24:24,130
accumulates from 0 up to
less than 6, back down to 0.
1165
01:24:24,130 --> 01:24:27,092
So this is a measure
of the spread
1166
01:24:27,092 --> 01:24:28,300
between long and short rates.
1167
01:24:28,300 --> 01:24:31,470
So the spread
increased, and then it
1168
01:24:31,470 --> 01:24:33,805
decreased over the period.
1169
01:24:39,700 --> 01:24:46,560
And then the curvature, it
varies from 0 down to minus 1.5
1170
01:24:46,560 --> 01:24:47,560
back up to 0.
1171
01:24:47,560 --> 01:24:50,590
So how the curvature changed
over this entire period
1172
01:24:50,590 --> 01:24:57,260
was much, much less, which
is perhaps as it should be.
1173
01:24:57,260 --> 01:24:59,170
But these graphs
indicate basically
1174
01:24:59,170 --> 01:25:03,710
how these underlying factors
evolved over the time period.
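Those evolution plots are just running sums of the daily principal component scores. A minimal sketch with hypothetical values:

```python
import numpy as np

# Hypothetical daily first-PC scores; the running sum traces the factor's path.
p1 = np.array([-0.5, -1.0, 0.2, -0.3, 0.6])
cum_p1 = np.cumsum(p1)
```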
1175
01:25:03,710 --> 01:25:10,410
In the case note I go through
and fit a statistical factor
1176
01:25:10,410 --> 01:25:13,600
analysis model to
these same data
1177
01:25:13,600 --> 01:25:16,980
and look at identifying
the number of factors.
1178
01:25:16,980 --> 01:25:19,970
And also comparing the results
over this five year period
1179
01:25:19,970 --> 01:25:24,030
with the period
from 2009 to 2013,
1180
01:25:24,030 --> 01:25:27,330
and comparing those
different results.
1181
01:25:27,330 --> 01:25:29,970
They are different,
and so it really
1182
01:25:29,970 --> 01:25:33,890
matters over what period one
specifies these models to.
1183
01:25:33,890 --> 01:25:37,620
And fitting these models is
really just a starting point
1184
01:25:37,620 --> 01:25:41,490
where you want to
ultimately model
1185
01:25:41,490 --> 01:25:44,450
the dynamics in these
factors and their structural
1186
01:25:44,450 --> 01:25:47,150
relationships.
1187
01:25:47,150 --> 01:25:49,000
So we'll finish there.