1 00:00:00,060 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,330 To make a donation or view additional materials 6 00:00:13,330 --> 00:00:17,217 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,217 --> 00:00:17,842 at ocw.mit.edu. 8 00:00:22,370 --> 00:00:23,500 PROFESSOR: All right. 9 00:00:23,500 --> 00:00:24,000 Let's see. 10 00:00:24,000 --> 00:00:25,650 We're going to start today with a wrap 11 00:00:25,650 --> 00:00:29,130 up of our discussion of univariate time series 12 00:00:29,130 --> 00:00:29,990 analysis. 13 00:00:29,990 --> 00:00:35,140 And last time we went through the Wold representation 14 00:00:35,140 --> 00:00:37,830 theorem, which applies to covariance 15 00:00:37,830 --> 00:00:41,090 stationary processes, a very powerful theorem. 16 00:00:41,090 --> 00:00:44,880 And implementations of the covariance stationary 17 00:00:44,880 --> 00:00:47,700 processes with ARMA models. 18 00:00:47,700 --> 00:00:50,430 And we discussed estimation of those models 19 00:00:50,430 --> 00:00:54,840 with maximum likelihood. 20 00:00:54,840 --> 00:00:57,010 And here in this slide I just wanted 21 00:00:57,010 --> 00:01:01,070 to highlight how when we estimate models 22 00:01:01,070 --> 00:01:03,340 with maximum likelihood we need to have 23 00:01:03,340 --> 00:01:07,280 an assumption of a probability distribution for what's random, 24 00:01:07,280 --> 00:01:11,270 and in the ARMA structure we consider the simple case 25 00:01:11,270 --> 00:01:14,120 where the innovations, the eta_t, 26 00:01:14,120 --> 00:01:17,437 are normally distributed white noise. 27 00:01:17,437 --> 00:01:19,520 So they're independent and identically distributed 28 00:01:19,520 --> 00:01:21,500 normal random variables. 29 00:01:21,500 --> 00:01:24,000 And the likelihood function can be 30 00:01:24,000 --> 00:01:28,240 maximized at the maximum likelihood parameters. 31 00:01:28,240 --> 00:01:34,450 And it's simple to implement the limited information maximum 32 00:01:34,450 --> 00:01:38,200 likelihood where one conditions on the first few observations 33 00:01:38,200 --> 00:01:40,590 in the time series. 34 00:01:40,590 --> 00:01:46,870 If you look at the likelihood structure for ARMA models, 35 00:01:46,870 --> 00:01:50,520 the density of an outcome at a given time point 36 00:01:50,520 --> 00:01:53,700 depends on lags of that dependent variable. 37 00:01:53,700 --> 00:01:58,000 So if those are unavailable, then that can be a problem. 38 00:01:58,000 --> 00:02:01,930 One can implement limited information maximum likelihood 39 00:02:01,930 --> 00:02:04,680 where you're just conditioning on those initial values, 40 00:02:04,680 --> 00:02:07,740 or there are full information maximum likelihood methods 41 00:02:07,740 --> 00:02:09,750 that you can apply as well. 42 00:02:09,750 --> 00:02:13,170 Generally though the limited information case 43 00:02:13,170 --> 00:02:16,300 is what's applied. 44 00:02:16,300 --> 00:02:18,800 Then the issue is model selection. 45 00:02:18,800 --> 00:02:22,020 And with model selection the issues 46 00:02:22,020 --> 00:02:23,870 that arise with time series are issues 47 00:02:23,870 --> 00:02:27,480 that arise in fitting any kind of statistical model. 
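Before turning to model selection, here is a minimal sketch in R of the estimation step just described. The simulated series and the ARMA(1,1) orders are illustrative only; method = "CSS" conditions on the first few observations (the limited-information case), while method = "ML" maximizes the exact Gaussian likelihood.

    # Simulated ARMA(1,1) with Gaussian white noise innovations (illustration only)
    set.seed(1)
    x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 500)

    # Limited-information (conditional) ML: conditions on initial observations
    fit.css <- arima(x, order = c(1, 0, 1), method = "CSS")

    # Full-information (exact) Gaussian ML
    fit.ml <- arima(x, order = c(1, 0, 1), method = "ML")

    fit.css$coef   # AR, MA, and intercept estimates under the conditional likelihood
    fit.ml$coef    # estimates under the exact likelihood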
48 00:02:27,480 --> 00:02:30,040 Ordinarily one will have multiple candidates 49 00:02:30,040 --> 00:02:32,270 for the model you want to fit to data. 50 00:02:32,270 --> 00:02:34,240 And the issue is how do you judge which 51 00:02:34,240 --> 00:02:36,390 ones are better than others. 52 00:02:36,390 --> 00:02:38,700 Why would you prefer one over the other? 53 00:02:38,700 --> 00:02:43,230 And if we're considering a collection of different ARMA 54 00:02:43,230 --> 00:02:49,670 models then we could say, fit all ARMA models of order p,q 55 00:02:49,670 --> 00:02:53,100 with p and q varying over some range. 56 00:02:53,100 --> 00:02:57,130 p from 0 up to p_max, q from q up to q_max. 57 00:02:57,130 --> 00:03:02,270 And evaluate those p,q different models. 58 00:03:02,270 --> 00:03:05,570 And if we consider sigma tilde squared of p, 59 00:03:05,570 --> 00:03:09,270 q being the MLE of the error variance, 60 00:03:09,270 --> 00:03:12,490 then there are these model selection criteria 61 00:03:12,490 --> 00:03:14,150 that are very popular. 62 00:03:14,150 --> 00:03:16,450 Akaike information criterion, and Bayes information 63 00:03:16,450 --> 00:03:18,060 criterion, and Hannan-Quinn. 64 00:03:18,060 --> 00:03:22,850 Now these criteria all have the same term, 65 00:03:22,850 --> 00:03:27,120 log of the MLE of the error variance. 66 00:03:27,120 --> 00:03:30,520 So these criteria don't vary at all with that. 67 00:03:30,520 --> 00:03:32,260 They just vary with this second term, 68 00:03:32,260 --> 00:03:35,780 but let's focus first on the AIC criterion. 69 00:03:35,780 --> 00:03:37,970 A given model is going to be better 70 00:03:37,970 --> 00:03:44,760 if the log of the MLE for the error variance is smaller. 71 00:03:44,760 --> 00:03:48,130 Now is that a good thing? 72 00:03:48,130 --> 00:03:50,186 Meaning, what is the interpretation 73 00:03:50,186 --> 00:03:52,560 of that practically when you're fitting different models? 74 00:03:55,720 --> 00:03:58,060 Well, the practical interpretation 75 00:03:58,060 --> 00:04:02,980 is the variability of the model about where you're 76 00:04:02,980 --> 00:04:05,380 predicting things, our estimate of the error variance 77 00:04:05,380 --> 00:04:06,120 is smaller. 78 00:04:06,120 --> 00:04:09,940 So we have essentially a model with a smaller error 79 00:04:09,940 --> 00:04:12,300 variance is better. 80 00:04:12,300 --> 00:04:15,720 So we're trying to minimize the log of that variance. 81 00:04:15,720 --> 00:04:18,040 Minimizing that is a good thing. 82 00:04:18,040 --> 00:04:23,095 Now what happens when you have many sort 83 00:04:23,095 --> 00:04:26,120 of independent variables to include in a model? 84 00:04:26,120 --> 00:04:28,930 Well, if you were doing a Taylor series approximation 85 00:04:28,930 --> 00:04:30,990 of a continuous function, eventually you'd 86 00:04:30,990 --> 00:04:34,150 sort of get to probably the smooth function 87 00:04:34,150 --> 00:04:39,800 with enough terms, but suppose that the actual model, it does 88 00:04:39,800 --> 00:04:43,230 have a finite number of parameters. 89 00:04:43,230 --> 00:04:44,990 And you're considering new factors, 90 00:04:44,990 --> 00:04:46,990 new lags of independent variables 91 00:04:46,990 --> 00:04:49,200 in the autoregressions. 
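To make the criteria concrete before continuing, here is a minimal R sketch of the grid search described above. It assumes the standard textbook forms of the criteria, consistent with the verbal description here: the log of the error-variance MLE plus a per-parameter penalty of 2/n for AIC, log(n)/n for BIC, and 2 log(log n)/n for Hannan-Quinn. The simulated data and the small grid are purely illustrative.

    # Fit ARMA(p, q) over a grid and compute AIC, BIC, and Hannan-Quinn criteria
    set.seed(1)
    x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 500)
    n <- length(x)
    crit <- NULL
    for (p in 0:2) for (q in 0:2) {
      fit  <- arima(x, order = c(p, 0, q), method = "ML")
      sig2 <- fit$sigma2                     # MLE of the error variance
      crit <- rbind(crit, data.frame(p = p, q = q,
        AIC = log(sig2) + 2 * (p + q) / n,
        BIC = log(sig2) + log(n) * (p + q) / n,
        HQ  = log(sig2) + 2 * log(log(n)) * (p + q) / n))
    }
    crit[which.min(crit$BIC), ]              # model preferred by the BIC criterion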
92 00:04:49,200 --> 00:04:51,190 As you add more and more variables, 93 00:04:51,190 --> 00:04:55,480 well, there really should be a penalty 94 00:04:55,480 --> 00:05:00,140 for adding extra variables that aren't adding 95 00:05:00,140 --> 00:05:03,620 real value to the model in terms of reducing the error variance. 96 00:05:03,620 --> 00:05:06,190 So the Akaike information criterion 97 00:05:06,190 --> 00:05:11,020 is penalizing different models by a factor that 98 00:05:11,020 --> 00:05:14,490 depends on the size of the model in terms of the dimensionality 99 00:05:14,490 --> 00:05:17,380 of the model parameters. 100 00:05:17,380 --> 00:05:19,340 So p plus q is the dimensionality 101 00:05:19,340 --> 00:05:24,020 of the autoregression model. 102 00:05:24,020 --> 00:05:29,770 So let's see. 103 00:05:29,770 --> 00:05:34,270 With the BIC criterion the difference between that 104 00:05:34,270 --> 00:05:39,030 and the AIC criterion is that this factor two is 105 00:05:39,030 --> 00:05:43,090 replaced by log n. 106 00:05:43,090 --> 00:05:49,120 So rather than having a sort of unit increment of penalty 107 00:05:49,120 --> 00:05:54,460 for adding an extra parameter, the Bayes information criterion 108 00:05:54,460 --> 00:05:59,730 is adding a log n penalty times the number of parameters. 109 00:05:59,730 --> 00:06:03,700 And so as the sample size gets larger and larger, 110 00:06:03,700 --> 00:06:07,030 that penalty gets higher and higher. 111 00:06:07,030 --> 00:06:12,610 Now the practical interpretation of the Akaike information 112 00:06:12,610 --> 00:06:18,140 criterion is that it is very similar to applying 113 00:06:18,140 --> 00:06:21,520 a rule which says, we're going to include variables 114 00:06:21,520 --> 00:06:28,740 in our model if the square of the t statistic for estimating 115 00:06:28,740 --> 00:06:34,850 the additional parameter in the model is greater than 2 or not. 116 00:06:34,850 --> 00:06:41,710 So in terms of when does the Akaike information criterion 117 00:06:41,710 --> 00:06:45,670 become lower from adding additional terms to a model? 118 00:06:45,670 --> 00:06:49,150 If you're considering two models that differ by just one factor, 119 00:06:49,150 --> 00:06:52,530 it's basically if the t statistic for the model 120 00:06:52,530 --> 00:06:56,890 coefficient on that factor is a squared value greater than two 121 00:06:56,890 --> 00:06:57,830 or not. 122 00:06:57,830 --> 00:07:03,310 Now many of you who have seen regression models before 123 00:07:03,310 --> 00:07:05,560 and applied them, in particular applications 124 00:07:05,560 --> 00:07:08,270 would probably say, I really don't 125 00:07:08,270 --> 00:07:11,590 believe in the value of an additional factor 126 00:07:11,590 --> 00:07:15,590 unless the t statistic is greater than 1.96, 127 00:07:15,590 --> 00:07:18,410 or 2 or something. 128 00:07:18,410 --> 00:07:20,320 But the Akaike information criterion 129 00:07:20,320 --> 00:07:22,410 says the t statistic should be greater 130 00:07:22,410 --> 00:07:24,450 than the square root of 2. 131 00:07:24,450 --> 00:07:27,290 So it's sort of a weaker constraint for adding variables 132 00:07:27,290 --> 00:07:28,510 into the model. 133 00:07:28,510 --> 00:07:31,551 And now why is it called an information criterion? 134 00:07:31,551 --> 00:07:33,050 I won't go into this in the lecture. 
135 00:07:33,050 --> 00:07:34,940 I am happy to go into it during office hours, 136 00:07:34,940 --> 00:07:38,085 but there's notions of information theory 137 00:07:38,085 --> 00:07:41,480 and Kullback-Leibler information of the model 138 00:07:41,480 --> 00:07:43,500 versus the true model, and trying 139 00:07:43,500 --> 00:07:47,730 to basically maximize the closeness of our fitted model 140 00:07:47,730 --> 00:07:48,960 to that. 141 00:07:48,960 --> 00:07:50,770 Now the Hannan-Quinn criterion, let's 142 00:07:50,770 --> 00:07:52,270 just look at how that differs. 143 00:07:52,270 --> 00:07:57,480 Well, that basically has a penalty midway between the log 144 00:07:57,480 --> 00:07:58,460 n and two. 145 00:07:58,460 --> 00:08:01,320 It's 2*log(log n). 146 00:08:01,320 --> 00:08:03,890 So this has a penalty that's increasing with size n, 147 00:08:03,890 --> 00:08:07,680 but not as fast as log n. 148 00:08:07,680 --> 00:08:11,480 This becomes relevant when we have 149 00:08:11,480 --> 00:08:16,240 models that get to be very large because we have a lot of data. 150 00:08:16,240 --> 00:08:17,780 Basically the more data you have, 151 00:08:17,780 --> 00:08:19,860 the more parameters you should be 152 00:08:19,860 --> 00:08:22,220 able to incorporate in the model if they're 153 00:08:22,220 --> 00:08:27,360 sort of statistically valid factors, important factors. 154 00:08:27,360 --> 00:08:29,960 And the Hannan-Quinn criterion basically 155 00:08:29,960 --> 00:08:36,179 allows for modeling processes where really an infinite number 156 00:08:36,179 --> 00:08:38,870 of variables might be appropriate, 157 00:08:38,870 --> 00:08:40,860 but you need larger and larger sample sizes 158 00:08:40,860 --> 00:08:44,640 to effectively estimate those. 159 00:08:44,640 --> 00:08:51,530 So those are the criteria that can be applied with time series 160 00:08:51,530 --> 00:08:52,330 models. 161 00:08:52,330 --> 00:08:55,210 And I should point out that, let's see, 162 00:08:55,210 --> 00:08:59,720 if you took sort of this factor 2 over n 163 00:08:59,720 --> 00:09:03,530 and inverted it to n over two log sigma squared, 164 00:09:03,530 --> 00:09:07,560 that term is basically one of the terms in the likelihood 165 00:09:07,560 --> 00:09:09,190 function of the fitted model. 166 00:09:09,190 --> 00:09:11,320 So you can see how this criterion is basically 167 00:09:11,320 --> 00:09:16,280 manipulating the maximum likelihood 168 00:09:16,280 --> 00:09:21,845 value by adjusting it for a penalty for extra parameters. 169 00:09:28,080 --> 00:09:28,590 Let's see. 170 00:09:28,590 --> 00:09:29,360 OK. 171 00:09:29,360 --> 00:09:31,595 Next topic is just test for stationarity 172 00:09:31,595 --> 00:09:32,470 and non-stationarity. 173 00:09:35,540 --> 00:09:40,870 There's a famous test called the Dickey-Fuller test, which 174 00:09:40,870 --> 00:09:47,550 is essentially to evaluate the time series to see if it's 175 00:09:47,550 --> 00:09:49,010 consistent with a random walk. 176 00:09:49,010 --> 00:09:51,490 We know that we've been discussing sort of lecture 177 00:09:51,490 --> 00:09:56,880 after lecture how simple random walks are non-stationary. 178 00:09:56,880 --> 00:10:02,960 And the simple random walk is given by the model up here, 179 00:10:02,960 --> 00:10:05,920 x_t equals phi x_(t-1) plus eta_t. 180 00:10:05,920 --> 00:10:09,345 If phi is equal to 1, right, that 181 00:10:09,345 --> 00:10:11,740 is a non-stationary process. 
182 00:10:11,740 --> 00:10:14,190 Well, in the Dickey-Fuller test we 183 00:10:14,190 --> 00:10:17,640 want to test whether phi equals 1 or not. 184 00:10:17,640 --> 00:10:23,090 And so we can fit the AR(1) model by least squares 185 00:10:23,090 --> 00:10:29,110 and define the test statistic to be the estimate of phi minus 1 186 00:10:29,110 --> 00:10:33,667 over its standard error where phi is the least squares 187 00:10:33,667 --> 00:10:36,250 estimate and the standard error is the least squares estimate, 188 00:10:36,250 --> 00:10:39,630 the standard error of that. 189 00:10:39,630 --> 00:10:44,970 If our coefficient phi is less than 1 in modulus, 190 00:10:44,970 --> 00:10:47,430 so this really is a stationary series, 191 00:10:47,430 --> 00:10:56,080 then the estimate phi converges in distribution to a normal 0, 192 00:10:56,080 --> 00:10:58,950 1 minus phi squared. 193 00:10:58,950 --> 00:11:04,920 And let's see. 194 00:11:04,920 --> 00:11:09,990 But if phi is equal to 1, OK, so just 195 00:11:09,990 --> 00:11:12,450 to recap that second to last bullet point 196 00:11:12,450 --> 00:11:17,580 is basically the property that when norm phi is less than 1, 197 00:11:17,580 --> 00:11:22,510 then our least squares estimates are asymptotically 198 00:11:22,510 --> 00:11:26,420 normally distributed with mean 0 if we 199 00:11:26,420 --> 00:11:29,740 normalize by the true value, and 1 minus phi squared. 200 00:11:29,740 --> 00:11:34,220 If phi is equal to 1, then it turns out 201 00:11:34,220 --> 00:11:38,470 that phi hat is super-consistent with rate 1 over t. 202 00:11:38,470 --> 00:11:44,520 Now this super-consistency is related 203 00:11:44,520 --> 00:11:50,900 to statistics converging to some value, 204 00:11:50,900 --> 00:11:55,590 and what is the rate of convergence of those statistics 205 00:11:55,590 --> 00:11:56,490 to different values. 206 00:11:56,490 --> 00:12:04,840 So in normal samples we can estimate sort of the mean 207 00:12:04,840 --> 00:12:06,440 by the sample mean. 208 00:12:06,440 --> 00:12:13,960 And that will converge to the true mean at rate of 1 209 00:12:13,960 --> 00:12:14,570 over root n. 210 00:12:17,660 --> 00:12:23,070 When we have a non-stationary random walk, 211 00:12:23,070 --> 00:12:28,890 the independent variables matrix is such 212 00:12:28,890 --> 00:12:35,250 that X transpose X over n grows without bound. 213 00:12:35,250 --> 00:12:42,270 So if we have y is equal to X beta plus epsilon, 214 00:12:42,270 --> 00:12:46,750 and beta hat is equal to X transpose X inverse X 215 00:12:46,750 --> 00:12:55,940 transpose y, the problem is-- well, 216 00:12:55,940 --> 00:13:01,080 and beta hat is distributed as ultimately 217 00:13:01,080 --> 00:13:03,500 normal with mean beta and variance sigma 218 00:13:03,500 --> 00:13:07,430 squared, X transpose X inverse. 219 00:13:07,430 --> 00:13:10,160 This X transpose X inverse matrix, 220 00:13:10,160 --> 00:13:14,750 when the process is non-stationary, a random walk, 221 00:13:14,750 --> 00:13:15,650 it grows infinitely. 222 00:13:19,320 --> 00:13:23,980 X transpose X over n actually grows 223 00:13:23,980 --> 00:13:31,280 to infinity in magnitude just because it becomes unbounded. 224 00:13:31,280 --> 00:13:34,560 Whereas X transpose X over n, when it's stationary 225 00:13:34,560 --> 00:13:36,130 is bounded. 
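As a small illustration of the statistic just defined, the sketch below simulates a random walk, fits the AR(1) regression by least squares, and forms the estimate of phi minus 1 over its standard error. Under the unit-root null this statistic should be referred to the Dickey-Fuller distribution rather than to normal quantiles; adf.test in the tseries package provides a packaged (augmented) version. Everything here is simulated purely for illustration.

    # Simulate a random walk (phi = 1) and compute the Dickey-Fuller t-type statistic
    set.seed(1)
    n <- 500
    x <- cumsum(rnorm(n))               # random walk: x_t = x_{t-1} + eta_t
    fit <- lm(x[-1] ~ x[-n] - 1)        # AR(1) fit by least squares, no intercept (as in the model above)
    phi.hat <- coef(fit)[1]
    se.phi  <- sqrt(vcov(fit)[1, 1])
    df.stat <- (phi.hat - 1) / se.phi   # refer to the Dickey-Fuller distribution, not N(0,1)
    df.stat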
226 00:13:36,130 --> 00:13:39,210 So anyway, so that leads to the super-consistency, 227 00:13:39,210 --> 00:13:41,930 meaning that it converges to the value much faster 228 00:13:41,930 --> 00:13:44,590 and so this normal distribution isn't appropriate. 229 00:13:44,590 --> 00:13:47,730 And it turns out there's Dickey-Fuller distribution 230 00:13:47,730 --> 00:13:50,910 for this test statistic, which is based on integrals 231 00:13:50,910 --> 00:13:55,150 of diffusions and one can read about that 232 00:13:55,150 --> 00:14:01,990 in the literature on unit roots and test for non-stationarity. 233 00:14:01,990 --> 00:14:05,760 So there's a very rich literature on this problem. 234 00:14:05,760 --> 00:14:12,470 If you're into econometrics, basically a lot of time's 235 00:14:12,470 --> 00:14:15,870 been spent in that field on this topic. 236 00:14:15,870 --> 00:14:22,370 And the mathematics gets very, very involved, 237 00:14:22,370 --> 00:14:26,190 but good results are available. 238 00:14:26,190 --> 00:14:30,230 So let's see an application of some of these time series 239 00:14:30,230 --> 00:14:30,730 methods. 240 00:14:37,550 --> 00:14:40,740 Let me go to the desktop here if I can. 241 00:14:40,740 --> 00:14:46,690 In this supplemental material that'll be on the website, 242 00:14:46,690 --> 00:14:49,540 I just wanted you to be able to work 243 00:14:49,540 --> 00:14:51,150 with time series, real time series 244 00:14:51,150 --> 00:14:53,250 and implement these autoregressive moving 245 00:14:53,250 --> 00:14:58,220 average fits and understand basically how things work. 246 00:14:58,220 --> 00:15:03,440 So in this, it introduces loading the R libraries 247 00:15:03,440 --> 00:15:06,190 and Federal Reserve data into R, basically collecting it 248 00:15:06,190 --> 00:15:07,370 off the web. 249 00:15:07,370 --> 00:15:11,300 Creating weekly and monthly time series from a daily series, 250 00:15:11,300 --> 00:15:14,160 and it's a trivial thing to do, but when you sit down and try 251 00:15:14,160 --> 00:15:16,460 to do it gets involved. 252 00:15:16,460 --> 00:15:20,210 So there's some nice tools that are available. 253 00:15:20,210 --> 00:15:22,080 There's the ACF and the PACF, which 254 00:15:22,080 --> 00:15:25,710 is the auto-correlation function and the partial 255 00:15:25,710 --> 00:15:29,030 auto-correlation function, which are 256 00:15:29,030 --> 00:15:30,690 used for interpreting series. 257 00:15:30,690 --> 00:15:35,350 Then we conduct Dickey-Fuller test for unit roots 258 00:15:35,350 --> 00:15:39,920 and determine, evaluate stationarity, non-stationarity 259 00:15:39,920 --> 00:15:42,170 of the 10-year yield. 260 00:15:42,170 --> 00:15:48,010 And then we evaluate stationarity and cyclicality 261 00:15:48,010 --> 00:15:51,240 in the fitted autoregressive model of order 2 262 00:15:51,240 --> 00:15:53,350 to monthly data. 263 00:15:53,350 --> 00:15:56,940 And actually 1.7 there, that cyclicality issue, 264 00:15:56,940 --> 00:15:59,980 relates to one of the problems on the problem set 265 00:15:59,980 --> 00:16:02,850 for time series, which is looking at, 266 00:16:02,850 --> 00:16:06,030 with second order autoregressive models, 267 00:16:06,030 --> 00:16:10,510 is there cyclicality in the process? 268 00:16:10,510 --> 00:16:12,890 And then finally looking at identifying 269 00:16:12,890 --> 00:16:16,960 the best autoregressive model using the AIC criterion. 
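A minimal sketch of that workflow, assuming the quantmod and tseries packages; the FRED series code DGS10 for the 10-year constant-maturity Treasury yield, and the choice of month-end values, are assumptions of this sketch and may differ from the posted case study.

    library(quantmod)                  # getSymbols() pulls data from FRED
    library(tseries)                   # adf.test()

    getSymbols("DGS10", src = "FRED")  # daily 10-year Treasury yield
    y         <- na.omit(DGS10)
    y.monthly <- Cl(to.monthly(y))     # month-end values of the daily series

    acf(as.numeric(y.monthly))         # sample autocorrelation function
    pacf(as.numeric(y.monthly))        # sample partial autocorrelation function

    adf.test(as.numeric(y.monthly))    # augmented Dickey-Fuller unit-root test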
270 00:16:16,960 --> 00:16:21,500 So let me just page through and show you a couple of plots 271 00:16:21,500 --> 00:16:22,470 here. 272 00:16:22,470 --> 00:16:22,970 OK. 273 00:16:22,970 --> 00:16:26,379 Well, there's the original 10-year yield 274 00:16:26,379 --> 00:16:28,170 collected directly from the Federal Reserve 275 00:16:28,170 --> 00:16:32,360 website over a 10 year period. 276 00:16:32,360 --> 00:16:34,600 And, oh, here we go. 277 00:16:34,600 --> 00:16:35,540 This is nice. 278 00:16:35,540 --> 00:16:36,040 OK. 279 00:16:42,580 --> 00:16:43,730 OK. 280 00:16:43,730 --> 00:16:46,870 Let's see, this section 1.4 conducts 281 00:16:46,870 --> 00:16:49,930 the Dickey-Fuller test. 282 00:16:49,930 --> 00:17:03,080 And it basically determines that the p-value 283 00:17:03,080 --> 00:17:06,420 for non-stationarity is not rejected. 284 00:17:06,420 --> 00:17:12,819 And so, with the augmented Dickey-Fuller test, 285 00:17:12,819 --> 00:17:15,089 the test statistic is computed. 286 00:17:15,089 --> 00:17:19,849 Its significance is evaluated by the distribution 287 00:17:19,849 --> 00:17:21,760 for that statistic. 288 00:17:21,760 --> 00:17:24,790 And the p-value tells you how extreme the value 289 00:17:24,790 --> 00:17:28,910 of the statistic is, meaning how unusual is it. 290 00:17:28,910 --> 00:17:33,950 The smaller the p-value, the more unlikely the value is. 291 00:17:33,950 --> 00:17:35,910 The p-value is what's the likelihood of getting 292 00:17:35,910 --> 00:17:39,690 as extreme or more extreme a value of the test statistic, 293 00:17:39,690 --> 00:17:41,150 and the test statistic is evidence 294 00:17:41,150 --> 00:17:43,075 against the null hypothesis. 295 00:17:43,075 --> 00:17:48,850 So in this case the p-values range basically 0.2726 296 00:17:48,850 --> 00:18:00,760 for the monthly data, which says that basically there 297 00:18:00,760 --> 00:18:03,345 is evidence of a unit root in the process. 298 00:18:06,530 --> 00:18:08,980 Let's see. 299 00:18:08,980 --> 00:18:09,480 OK. 300 00:18:09,480 --> 00:18:10,896 There's a section on understanding 301 00:18:10,896 --> 00:18:12,815 partial auto-correlation coefficients. 302 00:18:16,740 --> 00:18:20,180 And let me just state what the partial correlation 303 00:18:20,180 --> 00:18:21,010 coefficients are. 304 00:18:21,010 --> 00:18:22,676 You have the auto-correlation functions, 305 00:18:22,676 --> 00:18:25,850 which are simply the correlations of the time 306 00:18:25,850 --> 00:18:28,190 series with lags of its values. 307 00:18:28,190 --> 00:18:30,020 The partial auto-correlation coefficient 308 00:18:30,020 --> 00:18:36,640 is the correlation that's between the time series 309 00:18:36,640 --> 00:18:42,180 and say, it's p-th lag that is not explained by all lags lower 310 00:18:42,180 --> 00:18:42,720 than p. 311 00:18:42,720 --> 00:18:45,690 So it's basically the incremental correlation 312 00:18:45,690 --> 00:18:50,460 of the time series variable with the p-th lag after controlling 313 00:18:50,460 --> 00:18:51,540 for the others. 314 00:18:55,650 --> 00:18:57,420 And then let's see. 315 00:18:57,420 --> 00:19:01,480 With this, in section eight here there's 316 00:19:01,480 --> 00:19:07,220 a function in R called ar, for autoregressive, which basically 317 00:19:07,220 --> 00:19:11,170 will fit all autoregressive models up to a given order 318 00:19:11,170 --> 00:19:14,230 and provide diagnostic statistics for that. 
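For example, a sketch of that function applied to the monthly yield series y.monthly from the sketch above; the maximum order of 12 is arbitrary. The aic component returned by ar is already expressed relative to the best-fitting order, which is exactly the relative AIC statistic plotted next.

    # Fit AR(p) models for p = 0, 1, ..., order.max and compare them by AIC
    fit.ar <- ar(as.numeric(y.monthly), order.max = 12, aic = TRUE)
    fit.ar$order      # order selected by minimum AIC
    fit.ar$aic        # AIC of each order relative to the minimum (0 at the best order)
    plot(0:12, fit.ar$aic, type = "h",
         xlab = "AR order p", ylab = "relative AIC")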
319 00:19:14,230 --> 00:19:18,110 And here is a plot of the relative AIC statistic 320 00:19:18,110 --> 00:19:20,640 for models of the monthly data. 321 00:19:20,640 --> 00:19:25,100 And you can see that basically it takes all the AIC statistics 322 00:19:25,100 --> 00:19:28,950 and subtracts the smallest one from all the others. 323 00:19:28,950 --> 00:19:33,495 So one can see that according to the AIC statistic 324 00:19:33,495 --> 00:19:40,110 a model of order seven is suggested for this treasury 325 00:19:40,110 --> 00:19:40,892 yield data. 326 00:19:43,670 --> 00:19:46,140 OK. 327 00:19:46,140 --> 00:19:49,500 Then finally because these autoregressive models 328 00:19:49,500 --> 00:19:52,920 are implemented with regression models, 329 00:19:52,920 --> 00:19:56,780 one can apply regression diagnostics 330 00:19:56,780 --> 00:20:02,180 that we had introduced earlier to look at those data as well. 331 00:20:02,180 --> 00:20:04,140 All right. 332 00:20:04,140 --> 00:20:07,495 So let's go down now. 333 00:20:14,978 --> 00:20:16,125 [INAUDIBLE] 334 00:20:16,125 --> 00:20:16,625 OK. 335 00:20:25,770 --> 00:20:27,970 [INAUDIBLE] 336 00:20:27,970 --> 00:20:28,660 Full screen. 337 00:20:28,660 --> 00:20:31,170 Here we go. 338 00:20:31,170 --> 00:20:31,670 All right. 339 00:20:36,700 --> 00:20:41,070 So let's move on to the topic of volatility modeling. 340 00:20:44,350 --> 00:20:50,290 The discussion in this section is 341 00:20:50,290 --> 00:20:53,640 going to begin with just defining volatility. 342 00:20:53,640 --> 00:20:56,450 So we know what we're talking about. 343 00:20:56,450 --> 00:21:01,740 And then measuring volatility with historical data 344 00:21:01,740 --> 00:21:05,190 where we don't really apply sort of statistical models so much, 345 00:21:05,190 --> 00:21:07,810 but we're concerned with just historical measures 346 00:21:07,810 --> 00:21:10,180 of volatility and their prediction. 347 00:21:10,180 --> 00:21:11,450 Then there are formal models. 348 00:21:11,450 --> 00:21:14,230 We'll introduce Geometric Brownian Motion, of course. 349 00:21:14,230 --> 00:21:17,080 That's one of the standard models in finance. 350 00:21:17,080 --> 00:21:18,710 But also Poisson jump-diffusions, 351 00:21:18,710 --> 00:21:22,240 which is an extension of Geometric Brownian Motion 352 00:21:22,240 --> 00:21:24,300 to allow for discontinuities. 353 00:21:24,300 --> 00:21:28,410 And then there's a property of these Brownian motion 354 00:21:28,410 --> 00:21:30,860 and jump-diffusion models which is models 355 00:21:30,860 --> 00:21:33,400 with independent increments. 356 00:21:33,400 --> 00:21:43,620 Basically you have disjoint increments of the process, 357 00:21:43,620 --> 00:21:45,750 basically are independent of each other, which 358 00:21:45,750 --> 00:21:51,270 is a key property when there's time dependence in the models. 359 00:21:51,270 --> 00:21:54,040 There can be time dependence actually in the volatility. 360 00:21:54,040 --> 00:21:55,980 And ARCH models were introduced initially 361 00:21:55,980 --> 00:21:57,084 to try and capture that. 362 00:21:57,084 --> 00:21:58,500 And were extended to GARCH models, 363 00:21:58,500 --> 00:22:00,910 and these are the sort of simplest cases 364 00:22:00,910 --> 00:22:03,530 of time-dependent volatility models 365 00:22:03,530 --> 00:22:06,680 that we can work with and introduce. 
366 00:22:06,680 --> 00:22:11,630 And in all of these the sort of mathematical framework 367 00:22:11,630 --> 00:22:14,820 for defining these models and the statistical framework 368 00:22:14,820 --> 00:22:18,050 for estimating their parameters is going to be highlighted. 369 00:22:18,050 --> 00:22:22,100 And while it's a very simple setting 370 00:22:22,100 --> 00:22:24,710 in terms of what these models are, 371 00:22:24,710 --> 00:22:28,090 these issues that we'll be covering 372 00:22:28,090 --> 00:22:33,200 relate to virtually all statistical modeling as well. 373 00:22:33,200 --> 00:22:36,120 So let's define volatility. 374 00:22:36,120 --> 00:22:36,620 OK. 375 00:22:36,620 --> 00:22:40,480 In finance it's defined as the annualized standard deviation 376 00:22:40,480 --> 00:22:43,380 of the change in price or value of a financial security, 377 00:22:43,380 --> 00:22:45,280 or an index. 378 00:22:45,280 --> 00:22:49,630 So we're interested in the variability 379 00:22:49,630 --> 00:22:55,220 of this process, a price process or a value process. 380 00:22:55,220 --> 00:22:59,240 And we consider it on an annualized time scale. 381 00:22:59,240 --> 00:23:03,910 Now because of that, when you talk about volatility 382 00:23:03,910 --> 00:23:10,550 it really is meaningful to communicate, levels of 10%. 383 00:23:10,550 --> 00:23:17,500 If you think of, at what level do sort of absolute bond yields 384 00:23:17,500 --> 00:23:19,480 vary over a year? 385 00:23:22,440 --> 00:23:25,120 It's probably less than 5%. 386 00:23:25,120 --> 00:23:26,242 Bond yields don't-- 387 00:23:26,242 --> 00:23:27,950 When you think of currencies, how much do 388 00:23:27,950 --> 00:23:30,860 those vary over a year. 389 00:23:30,860 --> 00:23:32,790 Maybe 10%. 390 00:23:32,790 --> 00:23:35,480 With equity markets, how do those vary? 391 00:23:35,480 --> 00:23:39,700 Well, maybe 30%, 40% or more. 392 00:23:39,700 --> 00:23:43,170 With the estimation and prediction approaches, 393 00:23:43,170 --> 00:23:46,030 OK, these are what we'll be discussing. 394 00:23:46,030 --> 00:23:47,930 There's different cases. 395 00:23:47,930 --> 00:23:52,830 So let's go on to historical volatility. 396 00:23:52,830 --> 00:23:56,270 In terms of computing the historical volatility 397 00:23:56,270 --> 00:23:59,350 we'll be considering basically a price 398 00:23:59,350 --> 00:24:02,080 series of T plus 1 points. 399 00:24:02,080 --> 00:24:06,811 And then we can get T period returns 400 00:24:06,811 --> 00:24:08,310 corresponding to those prices, which 401 00:24:08,310 --> 00:24:12,450 is the difference in the logs of the prices, 402 00:24:12,450 --> 00:24:14,300 or the log of the price relatives. 403 00:24:14,300 --> 00:24:18,370 So R_t is going to be the return for the asset. 404 00:24:18,370 --> 00:24:22,710 And one could use other definitions, 405 00:24:22,710 --> 00:24:26,340 like sort of the absolute return, not take logs. 406 00:24:26,340 --> 00:24:30,160 It's convenient in much empirical analysis, 407 00:24:30,160 --> 00:24:34,250 I guess, to work with the logs because if you sum 408 00:24:34,250 --> 00:24:37,990 logs you get sort of log of the product. 409 00:24:37,990 --> 00:24:41,830 And so total cumulative returns can be computed easily 410 00:24:41,830 --> 00:24:43,670 with sums of logs. 411 00:24:43,670 --> 00:24:47,140 But anyway, we'll work with that scale for now. 412 00:24:47,140 --> 00:24:47,640 OK. 
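For example, a quick check of that additivity with a hypothetical price path:

    # Log returns add across periods: their sum recovers the total log return
    p <- c(100, 102, 101, 105)            # hypothetical prices P_0, ..., P_3
    r <- diff(log(p))                     # one-period log returns R_t
    all.equal(sum(r), log(p[4] / p[1]))   # TRUE: cumulative return as a sum of logs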
413 00:24:47,640 --> 00:24:52,080 Now the process R_t, the return series process, 414 00:24:52,080 --> 00:24:55,400 is going to be assumed to be covariance stationary, 415 00:24:55,400 --> 00:24:59,820 meaning that it does have a finite variance. 416 00:24:59,820 --> 00:25:04,900 And the sample estimate of that is just 417 00:25:04,900 --> 00:25:10,730 given by the square root of the sample variance. 418 00:25:10,730 --> 00:25:13,445 And we're also considering an unbiased estimate of that. 419 00:25:16,360 --> 00:25:20,770 And if we want to basically convert these 420 00:25:20,770 --> 00:25:22,570 to annualized values so that we're 421 00:25:22,570 --> 00:25:24,410 dealing with a volatility, then if we 422 00:25:24,410 --> 00:25:28,672 have daily prices of which in financial markets 423 00:25:28,672 --> 00:25:30,130 they're usually-- in the US they're 424 00:25:30,130 --> 00:25:33,550 open roughly 252 days a year on average. 425 00:25:33,550 --> 00:25:37,580 We multiply that sigma hat by 252 square root. 426 00:25:37,580 --> 00:25:44,110 And for weekly, root 52, and root 12 for monthly data. 427 00:25:44,110 --> 00:25:48,870 So regardless of the periodicity of our original data 428 00:25:48,870 --> 00:25:51,700 we can get them onto that volatility scale. 429 00:25:56,410 --> 00:26:00,960 Now in terms of prediction methods 430 00:26:00,960 --> 00:26:05,980 that one can make with historical volatility, 431 00:26:05,980 --> 00:26:12,230 and there's a lot of work done in finance by people 432 00:26:12,230 --> 00:26:15,060 who aren't sort of trained as econometricians 433 00:26:15,060 --> 00:26:18,570 or statisticians, they basically just work with the data. 434 00:26:18,570 --> 00:26:23,840 And there's a standard for risk analysis called the risk 435 00:26:23,840 --> 00:26:30,780 metrics approach, where the approach defines volatility 436 00:26:30,780 --> 00:26:33,470 and volatility estimates, historical estimates, just 437 00:26:33,470 --> 00:26:35,750 using simple methodologies. 438 00:26:35,750 --> 00:26:39,870 And so that's just go through what those are here. 439 00:26:39,870 --> 00:26:46,940 One can-- basically for any period t, 440 00:26:46,940 --> 00:26:49,710 one can define the sample volatility, 441 00:26:49,710 --> 00:26:53,670 just to be the sample standard deviation of the period t 442 00:26:53,670 --> 00:26:55,100 returns. 443 00:26:55,100 --> 00:26:58,430 And so with daily data that might just 444 00:26:58,430 --> 00:27:00,800 be the square of that daily return. 445 00:27:00,800 --> 00:27:05,150 With monthly data it could be the sample standard deviation 446 00:27:05,150 --> 00:27:08,240 of the returns over the month and with yearly it 447 00:27:08,240 --> 00:27:10,860 would be the sample over the year. 448 00:27:10,860 --> 00:27:15,150 Also with intraday data, it could be the sample standard 449 00:27:15,150 --> 00:27:22,900 deviation over intraday periods of say, half hours or hours. 450 00:27:22,900 --> 00:27:26,810 And the historical average is simply 451 00:27:26,810 --> 00:27:30,320 the mean of those estimates, which 452 00:27:30,320 --> 00:27:32,490 uses all the available data. 453 00:27:32,490 --> 00:27:34,830 One can consider the simple moving average 454 00:27:34,830 --> 00:27:38,280 of these realized volatilities. 455 00:27:38,280 --> 00:27:44,330 And so that basically is using the last m, for some finite m, 456 00:27:44,330 --> 00:27:46,260 values to average. 
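A minimal sketch of these historical measures, assuming daily closing prices (simulated below just so the code runs), the 252-trading-day convention, and an arbitrary 21-day window for the simple moving average:

    # Simulated daily prices, for illustration only
    set.seed(1)
    price <- 100 * exp(cumsum(rnorm(1000, mean = 0, sd = 0.01)))

    r <- diff(log(price))                  # daily log returns
    vol.annual <- sqrt(252) * sd(r)        # annualized historical volatility

    # Trailing m-day simple moving average of the daily squared returns
    m <- 21
    var.sma <- stats::filter(r^2, rep(1/m, m), sides = 1)
    vol.sma <- sqrt(252 * var.sma)         # annualized rolling volatility estimate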
457 00:27:46,260 --> 00:27:53,296 And one could also consider an exponential moving average 458 00:27:53,296 --> 00:27:57,170 of these sample volatilities where 459 00:27:57,170 --> 00:28:02,060 we have-- our estimate of the volatility is 1 minus beta 460 00:28:02,060 --> 00:28:05,600 times the current period volatility 461 00:28:05,600 --> 00:28:08,550 plus beta times the previous estimate. 462 00:28:08,550 --> 00:28:10,780 And these exponential moving averages 463 00:28:10,780 --> 00:28:15,990 are really very nice ways to estimate 464 00:28:15,990 --> 00:28:19,740 processes that change over time. 465 00:28:19,740 --> 00:28:23,440 And they're able to track the changes quite well 466 00:28:23,440 --> 00:28:27,214 and they will tend to come up again and again. 467 00:28:27,214 --> 00:28:28,880 This exponential moving average actually 468 00:28:28,880 --> 00:28:31,990 uses all available data. 469 00:28:31,990 --> 00:28:34,770 And there can be discrete versions of those where 470 00:28:34,770 --> 00:28:37,615 you say, well let's use not an equal weighted average 471 00:28:37,615 --> 00:28:39,490 like the simple moving average, but let's use 472 00:28:39,490 --> 00:28:44,220 a geometric average of the last m values in an exponential way. 473 00:28:44,220 --> 00:28:46,790 And that's the exponential weighted moving average 474 00:28:46,790 --> 00:28:47,780 that uses the last m. 475 00:28:54,191 --> 00:28:54,690 OK. 476 00:28:54,690 --> 00:28:55,190 There we go. 477 00:29:03,109 --> 00:29:03,609 OK. 478 00:29:06,610 --> 00:29:11,870 Well, with these different measures of sample volatility, 479 00:29:11,870 --> 00:29:17,610 one can basically build models to estimate them 480 00:29:17,610 --> 00:29:23,990 with regression models and evaluate. 481 00:29:23,990 --> 00:29:26,650 And in terms of the risk metrics benchmark, 482 00:29:26,650 --> 00:29:30,140 they consider a variety of different methodologies 483 00:29:30,140 --> 00:29:32,080 for estimating volatility. 484 00:29:32,080 --> 00:29:35,000 And sort of determine what methods are best 485 00:29:35,000 --> 00:29:38,320 for different kinds of financial instruments. 486 00:29:38,320 --> 00:29:42,030 And different financial indexes. 487 00:29:42,030 --> 00:29:44,140 And there are different performance measures 488 00:29:44,140 --> 00:29:45,000 one can apply. 489 00:29:45,000 --> 00:29:47,740 Sort of mean squared error of prediction, 490 00:29:47,740 --> 00:29:51,020 mean absolute error of prediction, 491 00:29:51,020 --> 00:29:53,360 mean absolute prediction error, and so forth 492 00:29:53,360 --> 00:29:55,680 to evaluate different methodologies. 493 00:29:55,680 --> 00:30:00,640 And on the web you can actually look at the technical documents 494 00:30:00,640 --> 00:30:03,530 for risk metrics and they go through these analyses 495 00:30:03,530 --> 00:30:06,700 and if your interest is in a particular area of finance, 496 00:30:06,700 --> 00:30:09,810 whether it's fixed income or equities, commodities, 497 00:30:09,810 --> 00:30:13,220 or currencies, reviewing their work 498 00:30:13,220 --> 00:30:15,160 there is very interesting because it 499 00:30:15,160 --> 00:30:20,740 does highlight different aspects of those markets. 500 00:30:20,740 --> 00:30:25,690 And it turns out that basically the exponential moving average 501 00:30:25,690 --> 00:30:30,040 is generally a very good method for many instruments. 
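A sketch of the exponential moving average recursion just described, applied to squared daily returns (r as in the previous sketch). The smoothing parameter beta = 0.94 is only an illustrative value, not a recommendation from the lecture.

    # sigma2.hat[t] = (1 - beta) * r[t]^2 + beta * sigma2.hat[t-1]
    ewma.var <- function(r, beta = 0.94) {
      v <- numeric(length(r))
      v[1] <- r[1]^2                       # initialize with the first squared return
      for (t in 2:length(r))
        v[t] <- (1 - beta) * r[t]^2 + beta * v[t - 1]
      v
    }
    vol.ewma <- sqrt(252 * ewma.var(r))    # annualized EWMA volatility estimate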
502 00:30:30,040 --> 00:30:38,050 And the sort of discounting of the values over time 503 00:30:38,050 --> 00:30:41,340 corresponds to having roughly between, I guess, a 45 504 00:30:41,340 --> 00:30:45,910 and a 90 day period in estimating your volatility. 505 00:30:45,910 --> 00:30:50,690 And in these approaches which are, I guess, 506 00:30:50,690 --> 00:30:52,930 they're a bit ad hoc. 507 00:30:52,930 --> 00:30:54,250 There's the formalism. 508 00:30:54,250 --> 00:30:57,530 And defining them is basically just empirically 509 00:30:57,530 --> 00:30:58,750 what has worked in the past. 510 00:31:03,760 --> 00:31:04,260 Let's see. 511 00:31:08,610 --> 00:31:12,170 While these things are ad hoc, they actually 512 00:31:12,170 --> 00:31:13,840 have been very, very effective. 513 00:31:13,840 --> 00:31:23,970 So let's move on to formal statistical models 514 00:31:23,970 --> 00:31:25,940 of volatility. 515 00:31:25,940 --> 00:31:30,740 And the first class is-- model is the Geometric Brownian 516 00:31:30,740 --> 00:31:31,240 Motion. 517 00:31:31,240 --> 00:31:37,700 So here we have basically a stochastic differential 518 00:31:37,700 --> 00:31:41,960 equation defining the model for Geometric Brownian Motion. 519 00:31:41,960 --> 00:31:44,950 And Choongbum will be going in some detail 520 00:31:44,950 --> 00:31:49,360 about stochastic differential equations, 521 00:31:49,360 --> 00:31:52,300 and stochastic calculus for representing 522 00:31:52,300 --> 00:31:55,590 different processes, continuous processes. 523 00:31:55,590 --> 00:32:00,910 And the formulation is basically looking 524 00:32:00,910 --> 00:32:08,470 at increments of the price process S is equal to basically 525 00:32:08,470 --> 00:32:14,910 a mu S of t, sort of a drift term, plus a sigma S of t, 526 00:32:14,910 --> 00:32:18,930 a multiple of d W of t, where sigma 527 00:32:18,930 --> 00:32:21,130 is the volatility of the security price, 528 00:32:21,130 --> 00:32:25,810 mu is the mean return per unit time, d W of t 529 00:32:25,810 --> 00:32:29,830 is the increment of a standard Brownian motion processor, 530 00:32:29,830 --> 00:32:31,350 Wiener process. 531 00:32:31,350 --> 00:32:38,210 And this W process is such that it's increments, 532 00:32:38,210 --> 00:32:42,160 basically the change in value of the process between two time 533 00:32:42,160 --> 00:32:46,410 points is normally distributed, with mean 0 534 00:32:46,410 --> 00:32:51,720 and variance equal to the length of the interval. 535 00:32:54,354 --> 00:32:56,770 And increments on disjoint time intervals are independent. 536 00:33:01,690 --> 00:33:10,810 And well, if you divide both sides 537 00:33:10,810 --> 00:33:16,535 of that equation by S of t then you have d S of t over S of t 538 00:33:16,535 --> 00:33:20,120 is equal to mu dt plus sigma d W of t. 539 00:33:20,120 --> 00:33:25,495 And so the increments d S of t normalized by S of t 540 00:33:25,495 --> 00:33:29,600 are a standard Brownian motion with drift mu and volatility 541 00:33:29,600 --> 00:33:30,100 sigma. 542 00:33:36,200 --> 00:33:44,570 Now with sample data from this process, 543 00:33:44,570 --> 00:33:46,890 now suppose we have prices observed 544 00:33:46,890 --> 00:33:50,820 at times t_0 up to t_n. 545 00:33:50,820 --> 00:33:53,960 And for now we're not going to make any assumptions 546 00:33:53,960 --> 00:33:57,950 about what those time increments are, what those times are. 547 00:33:57,950 --> 00:33:59,724 They could be equally spaced. 
548 00:33:59,724 --> 00:34:01,015 They could be unequally spaced. 549 00:34:03,550 --> 00:34:10,420 The returns, the log of the relative price change from time 550 00:34:10,420 --> 00:34:15,880 t_(j-1) to t_j are independent random variables. 551 00:34:15,880 --> 00:34:19,610 And they are independent. 552 00:34:19,610 --> 00:34:21,800 Their distribution is normally distributed 553 00:34:21,800 --> 00:34:27,330 with mean given by mu times the length of the time increment, 554 00:34:27,330 --> 00:34:31,580 and variance sigma squared times the length of the increment. 555 00:34:31,580 --> 00:34:35,909 And these properties will be covered by Choongbum 556 00:34:35,909 --> 00:34:38,139 in some later lectures. 557 00:34:38,139 --> 00:34:41,750 So for now what we can just know that this is true 558 00:34:41,750 --> 00:34:46,420 and apply this result. If we fix various time 559 00:34:46,420 --> 00:34:49,130 points for the observation and compute returns this way. 560 00:34:49,130 --> 00:34:51,260 If it's a Geometric Brownian Motion 561 00:34:51,260 --> 00:34:55,610 we know that this is the distribution of the returns. 562 00:34:55,610 --> 00:34:58,190 Now knowing that distribution we can now 563 00:34:58,190 --> 00:35:01,620 engage in maximum likelihood estimation. 564 00:35:01,620 --> 00:35:02,120 OK. 565 00:35:02,120 --> 00:35:06,030 If the increments are all just equal to 1, 566 00:35:06,030 --> 00:35:09,140 so we're thinking of daily data, say. 567 00:35:09,140 --> 00:35:13,600 Then the maximum likelihood estimates are simple. 568 00:35:13,600 --> 00:35:17,570 It's basically the sample mean and the sample variance with 1 569 00:35:17,570 --> 00:35:20,340 over n instead of 1 over n minus 1 in the MLE's. 570 00:35:20,340 --> 00:35:26,520 If delta_j varies then, well, that's 571 00:35:26,520 --> 00:35:30,810 actually a case in the exercises. 572 00:35:30,810 --> 00:35:39,100 Now does anyone, in terms of, well, 573 00:35:39,100 --> 00:35:46,730 in the class exercise the issue that is important to think 574 00:35:46,730 --> 00:35:53,640 about is if you consider a given interval of time over which 575 00:35:53,640 --> 00:35:57,660 we're observing this Geometric Brownian Motion process, 576 00:35:57,660 --> 00:36:03,440 if we increase the sampling rate of prices over a given 577 00:36:03,440 --> 00:36:06,990 interval, how does that change the properties 578 00:36:06,990 --> 00:36:09,400 of our estimates? 579 00:36:09,400 --> 00:36:11,840 Basically, do we obtain more accurate estimates 580 00:36:11,840 --> 00:36:14,450 of the underlying parameters? 581 00:36:14,450 --> 00:36:19,420 And as you increase the sampling frequency, 582 00:36:19,420 --> 00:36:21,830 it turns out that some parameters are estimated much, 583 00:36:21,830 --> 00:36:26,190 much better and you get basically much 584 00:36:26,190 --> 00:36:28,730 lower standard errors on those estimates. 585 00:36:28,730 --> 00:36:31,900 With other parameters you don't necessarily. 586 00:36:31,900 --> 00:36:35,140 And the exercise is to evaluate that. 587 00:36:35,140 --> 00:36:37,350 Now another issue that's important 588 00:36:37,350 --> 00:36:42,550 is the issue of sort of what is the appropriate time scale 589 00:36:42,550 --> 00:36:46,910 for Geometric Brownian Motion. 590 00:36:46,910 --> 00:36:48,750 Right now we're thinking of, you collect 591 00:36:48,750 --> 00:36:52,055 data, whatever the periodicity is of the data 592 00:36:52,055 --> 00:36:54,430 is you think that's your period for your Brownian Motion. 
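A sketch of the maximum likelihood estimates in the equally spaced case, taking delta equal to one trading day and 252 trading days per year; the true mu and sigma used to simulate the returns are arbitrary.

    # Under GBM with unit spacing, the log returns are i.i.d. N(mu, sigma^2)
    set.seed(1)
    mu <- 0.0004; sigma <- 0.006; n <- 2500
    r  <- rnorm(n, mean = mu, sd = sigma)    # simulated daily log returns

    mu.hat     <- mean(r)                    # MLE of mu (per day)
    sigma2.hat <- mean((r - mu.hat)^2)       # MLE of sigma^2 uses 1/n, not 1/(n-1)

    c(mu.annual  = 252 * mu.hat,             # annualized drift estimate
      vol.annual = sqrt(252 * sigma2.hat))   # annualized volatility estimate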
593 00:36:54,430 --> 00:36:56,630 Let's evaluate that. 594 00:36:56,630 --> 00:37:01,655 Let me go to another example. 595 00:37:08,200 --> 00:37:09,060 Let's see here. 596 00:37:13,515 --> 00:37:15,350 Yep. 597 00:37:15,350 --> 00:37:15,850 OK. 598 00:37:15,850 --> 00:37:17,360 Let's go control-minus here. 599 00:37:24,830 --> 00:37:25,424 OK. 600 00:37:25,424 --> 00:37:25,924 All right. 601 00:37:31,026 --> 00:37:32,060 Let's see. 602 00:37:32,060 --> 00:37:33,640 With this second case study there 603 00:37:33,640 --> 00:37:41,200 was data on exchange rates, looking for regime changes 604 00:37:41,200 --> 00:37:43,920 in exchange rate relationships. 605 00:37:43,920 --> 00:37:46,560 And so we have data from that case study 606 00:37:46,560 --> 00:37:49,880 on different foreign exchange rates. 607 00:37:49,880 --> 00:37:57,390 And here in the top panel I've graphed the euro/dollar 608 00:37:57,390 --> 00:38:01,460 exchange rate from the beginning of 1999 609 00:38:01,460 --> 00:38:05,370 through just a few months ago. 610 00:38:05,370 --> 00:38:12,830 And the second panel is a plot of the daily returns 611 00:38:12,830 --> 00:38:14,730 for that series. 612 00:38:14,730 --> 00:38:21,860 And here is a histogram of those daily returns. 613 00:38:21,860 --> 00:38:28,990 And a fit of the Gaussian distribution for the daily 614 00:38:28,990 --> 00:38:33,270 returns if our sort of time scale is correct. 615 00:38:33,270 --> 00:38:37,350 Basically daily returns are normally distributed. 616 00:38:37,350 --> 00:38:41,630 Days are disjoint in terms of the price change. 617 00:38:41,630 --> 00:38:45,160 And so they're independent and identically distributed 618 00:38:45,160 --> 00:38:46,960 under the model. 619 00:38:46,960 --> 00:38:49,480 And they all have the same normal distribution 620 00:38:49,480 --> 00:38:52,330 with mean mu and variance sigma squared. 621 00:38:55,220 --> 00:38:55,870 OK. 622 00:38:55,870 --> 00:39:00,000 This analysis assumes basically that we're 623 00:39:00,000 --> 00:39:03,340 dealing with trading days for the appropriate time scale, 624 00:39:03,340 --> 00:39:04,630 the Geometric Brownian Motion. 625 00:39:09,640 --> 00:39:10,440 Let's see. 626 00:39:10,440 --> 00:39:15,300 One can ask, well, what if trading dates really 627 00:39:15,300 --> 00:39:19,240 isn't the right time scale, but it's more calendar time. 628 00:39:19,240 --> 00:39:22,060 The change in value over the weekends 629 00:39:22,060 --> 00:39:26,050 maybe correspond to price changes, or value changes 630 00:39:26,050 --> 00:39:28,150 over a longer period of time. 631 00:39:28,150 --> 00:39:30,980 And so this model really needs to be 632 00:39:30,980 --> 00:39:35,270 adjusted for that time scale. 633 00:39:35,270 --> 00:39:41,190 The exercise that allows you to consider 634 00:39:41,190 --> 00:39:45,660 different delta t's shows you what the maximum likelihood 635 00:39:45,660 --> 00:39:47,429 estimates-- you'll be deriving maximum 636 00:39:47,429 --> 00:39:49,470 likely estimates if we have different definitions 637 00:39:49,470 --> 00:39:52,180 of time scale there. 638 00:39:52,180 --> 00:40:02,992 But if you apply the calendar time scale to this euro, 639 00:40:02,992 --> 00:40:05,200 let me just show you what the different estimates are 640 00:40:05,200 --> 00:40:09,590 of the annualized mean return and the annualized volatility. 641 00:40:09,590 --> 00:40:16,030 So if we consider trading days for euro it's 10.25% or 0.1025. 
642 00:40:16,030 --> 00:40:22,390 If you consider clock time, it actually turns out to be 12.2%. 643 00:40:22,390 --> 00:40:25,070 So depending on how you specify the model 644 00:40:25,070 --> 00:40:28,640 you get a different definition of volatility here. 645 00:40:28,640 --> 00:40:36,170 And it's important to basically understand 646 00:40:36,170 --> 00:40:40,650 sort of what the assumptions are of your model 647 00:40:40,650 --> 00:40:47,480 and whether perhaps things ought to be different. 648 00:40:47,480 --> 00:40:53,700 In stochastic modeling, there's an area 649 00:40:53,700 --> 00:40:57,030 called subordinated stochastic processes. 650 00:40:57,030 --> 00:41:04,220 And basically the idea is, if you have a stochastic process 651 00:41:04,220 --> 00:41:08,770 like Geometric Brownian Motion of simple Brownian motion, 652 00:41:08,770 --> 00:41:14,005 maybe you're observing that on the wrong time scale. 653 00:41:14,005 --> 00:41:15,963 You may fit the Geometric Brownian Motion model 654 00:41:15,963 --> 00:41:17,560 and it doesn't look right. 655 00:41:17,560 --> 00:41:19,740 But it could be that there's a different time 656 00:41:19,740 --> 00:41:21,180 scale that's appropriate. 657 00:41:21,180 --> 00:41:24,990 And it's really Brownian motion on that time scale. 658 00:41:24,990 --> 00:41:29,830 And so formally it's called a subordinated stochastic 659 00:41:29,830 --> 00:41:30,330 process. 660 00:41:30,330 --> 00:41:32,160 You have a different time function 661 00:41:32,160 --> 00:41:35,970 for how to model the stochastic process. 662 00:41:35,970 --> 00:41:40,530 And the evaluation of subordinated stochastic 663 00:41:40,530 --> 00:41:43,750 processes leads to consideration of different time scales. 664 00:41:43,750 --> 00:41:48,320 With, say, equity markets, and futures markets, 665 00:41:48,320 --> 00:41:50,987 sort of the volume of trading, sort of cumulative volume 666 00:41:50,987 --> 00:41:53,070 of training might be really an appropriate measure 667 00:41:53,070 --> 00:41:54,880 of the real time scale. 668 00:41:54,880 --> 00:41:56,820 Because that's a measure of, in a sense, 669 00:41:56,820 --> 00:41:59,000 information flow coming into the market 670 00:41:59,000 --> 00:42:01,870 through the level of activity. 671 00:42:01,870 --> 00:42:06,720 So anyway I wanted to highlight how with different time scales 672 00:42:06,720 --> 00:42:08,320 you can get different results. 673 00:42:08,320 --> 00:42:11,660 And so that's something to be evaluated. 674 00:42:11,660 --> 00:42:13,620 In looking at these different models, 675 00:42:13,620 --> 00:42:15,420 OK, these first few graphs here show 676 00:42:15,420 --> 00:42:18,880 the fit of the normal model with the trading day time scale. 677 00:42:22,400 --> 00:42:22,966 Let's see. 678 00:42:22,966 --> 00:42:25,340 Those of you who've ever taken a statistics class before, 679 00:42:25,340 --> 00:42:29,780 or an applied statistics, may know about normal q-q plots. 680 00:42:29,780 --> 00:42:33,830 Basically if you want to evaluate 681 00:42:33,830 --> 00:42:37,960 the consistency of the returns here 682 00:42:37,960 --> 00:42:41,620 with a Gaussian distribution, what we can do 683 00:42:41,620 --> 00:42:49,410 is plot the observed ordered, sorted returns 684 00:42:49,410 --> 00:42:52,790 against what we would expect the sorted returns 685 00:42:52,790 --> 00:42:56,200 to be if it were from a Gaussian sample. 
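Such a normal q-q plot takes a couple of lines in R; here r is a vector of daily log returns as in the earlier sketches, standardized so that agreement with the Gaussian fit shows up as points along the reference line.

    z <- (r - mean(r)) / sd(r)    # standardized daily returns
    qqnorm(z)                     # sample quantiles vs. theoretical N(0,1) quantiles
    qqline(z)                     # reference line through the quartiles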
686 00:42:56,200 --> 00:42:58,940 So under the Geometric Brownian Motion model 687 00:42:58,940 --> 00:43:04,530 the daily returns are a sample, independent and identically 688 00:43:04,530 --> 00:43:06,740 distributed random variable sampled from a Gaussian 689 00:43:06,740 --> 00:43:07,750 distribution. 690 00:43:07,750 --> 00:43:11,510 So the smallest return should be consistent with the smallest 691 00:43:11,510 --> 00:43:14,080 of the sample size n. 692 00:43:14,080 --> 00:43:18,870 And what's being plotted here is the theoretical quantiles 693 00:43:18,870 --> 00:43:21,980 or percentiles versus the actual ones. 694 00:43:21,980 --> 00:43:24,930 And one would expect that to lie along a straight line 695 00:43:24,930 --> 00:43:30,100 if the theoretical quantiles were well-predicting 696 00:43:30,100 --> 00:43:32,670 the actual extreme values. 697 00:43:32,670 --> 00:43:37,760 What we see here is that as the theoretical quantiles get high, 698 00:43:37,760 --> 00:43:40,990 and it's in units of standard deviation units, 699 00:43:40,990 --> 00:43:45,080 the realized sample returns are in fact 700 00:43:45,080 --> 00:43:47,540 much higher than would be predicted by the Gaussian 701 00:43:47,540 --> 00:43:49,210 distribution. 702 00:43:49,210 --> 00:43:52,400 And similarly, on the low end side. 703 00:43:52,400 --> 00:43:54,550 So there's a normal q-q plot that's 704 00:43:54,550 --> 00:43:57,800 used often in the diagnostics of these models. 705 00:43:57,800 --> 00:44:04,910 Then down here I've actually plotted a fitted percentile 706 00:44:04,910 --> 00:44:06,160 distribution. 707 00:44:06,160 --> 00:44:12,470 Now what's been done here is if we modeled the series 708 00:44:12,470 --> 00:44:16,790 as a series of Gaussian random variables 709 00:44:16,790 --> 00:44:24,960 then we can evaluate the percentile 710 00:44:24,960 --> 00:44:27,130 of the fitted Gaussian distribution that 711 00:44:27,130 --> 00:44:29,430 was realized by every point. 712 00:44:29,430 --> 00:44:38,780 So if we have a return of say negative 2%, what percentile 713 00:44:38,780 --> 00:44:40,540 is the normal fit of that? 714 00:44:45,720 --> 00:44:50,410 And you can evaluate the cumulative distribution 715 00:44:50,410 --> 00:44:54,750 function of the fitted model at that value to get that point. 716 00:44:54,750 --> 00:44:59,100 And what should the distribution of percentiles 717 00:44:59,100 --> 00:45:04,410 be for fitted percentiles if we have a really good model? 718 00:45:04,410 --> 00:45:07,370 OK. 719 00:45:07,370 --> 00:45:08,180 Well, OK. 720 00:45:08,180 --> 00:45:09,860 Let's think. 721 00:45:09,860 --> 00:45:14,890 If you consider the 50th percentile you would expect, 722 00:45:14,890 --> 00:45:18,800 I guess, 50% of the data to lie above the 50th percentile 723 00:45:18,800 --> 00:45:21,930 and 50% to lie below the 50th percentile, right? 724 00:45:21,930 --> 00:45:22,530 OK. 725 00:45:22,530 --> 00:45:24,160 Let's consider, here I divided up 726 00:45:24,160 --> 00:45:27,840 into 100 bins between zero and one 727 00:45:27,840 --> 00:45:31,955 so this bin is the 99th percentile. 728 00:45:38,630 --> 00:45:40,460 How many observations would you expect 729 00:45:40,460 --> 00:45:45,590 to find in between the 99th and 100 percentile? 730 00:45:49,800 --> 00:45:51,170 This is an easy question. 731 00:45:51,170 --> 00:45:52,150 AUDIENCE: 1%. 732 00:45:52,150 --> 00:45:53,070 PROFESSOR: 1%. 733 00:45:53,070 --> 00:45:53,790 Right. 
734 00:45:53,790 --> 00:45:55,290 And so in any of these bins we would 735 00:45:55,290 --> 00:46:01,450 expect to see 1% if the Gaussian model were fitting. 736 00:46:01,450 --> 00:46:06,690 And what we see is that, well, at the extremes 737 00:46:06,690 --> 00:46:08,600 they're more extreme values. 738 00:46:08,600 --> 00:46:13,720 And actually inside there are some fewer values. 739 00:46:13,720 --> 00:46:17,660 And actually this is exhibiting a leptokurtic distribution 740 00:46:17,660 --> 00:46:20,070 for the actually realized samples; 741 00:46:20,070 --> 00:46:22,080 basically the middle of the distribution 742 00:46:22,080 --> 00:46:24,280 is a little thinner and it's compensated 743 00:46:24,280 --> 00:46:26,660 for by fatter tails. 744 00:46:26,660 --> 00:46:29,440 But with this particular model we 745 00:46:29,440 --> 00:46:33,900 can basically expect to see a uniform distribution 746 00:46:33,900 --> 00:46:39,690 of percentiles in this graph. 747 00:46:39,690 --> 00:46:46,990 If we compare this with a fit of the clock time 748 00:46:46,990 --> 00:46:51,770 we actually see that clock time does 749 00:46:51,770 --> 00:46:59,490 a bit of a better job at getting the extreme values closer 750 00:46:59,490 --> 00:47:01,110 to what we would expect them to be. 751 00:47:01,110 --> 00:47:07,506 So in terms of being a better model for the returns process, 752 00:47:07,506 --> 00:47:09,380 if we're concerned with these extreme values, 753 00:47:09,380 --> 00:47:12,320 we're actually getting a slightly better value 754 00:47:12,320 --> 00:47:13,720 with those. 755 00:47:13,720 --> 00:47:16,590 So all right. 756 00:47:16,590 --> 00:47:20,890 Let's move on back to the notes. 757 00:47:20,890 --> 00:47:28,410 And talk about the Garman-Klass Estimator. 758 00:47:28,410 --> 00:47:30,905 So let me do this. 759 00:47:34,625 --> 00:47:36,080 All right. 760 00:47:36,080 --> 00:47:37,123 View full screen. 761 00:47:43,040 --> 00:47:45,334 OK. 762 00:47:45,334 --> 00:47:46,120 All right. 763 00:47:46,120 --> 00:47:48,090 So, OK. 764 00:47:48,090 --> 00:47:50,990 The Garman-Klass Estimator is one 765 00:47:50,990 --> 00:47:55,410 where we consider the situation where we actually 766 00:47:55,410 --> 00:47:59,410 have much more information than simply sort of closing 767 00:47:59,410 --> 00:48:01,980 prices at different intervals. 768 00:48:01,980 --> 00:48:05,374 Basically all transaction data's collected 769 00:48:05,374 --> 00:48:06,290 in a financial market. 770 00:48:06,290 --> 00:48:08,150 So really we have virtually all of the data 771 00:48:08,150 --> 00:48:11,280 available if we want it, or can pay for it. 772 00:48:11,280 --> 00:48:14,270 But let's consider a case where we 773 00:48:14,270 --> 00:48:18,300 expand upon just having closing prices to having 774 00:48:18,300 --> 00:48:22,190 additional information over increments of time that 775 00:48:22,190 --> 00:48:27,230 include the open, high, and low price 776 00:48:27,230 --> 00:48:28,355 over the different periods. 777 00:48:33,500 --> 00:48:35,940 So those of you who are familiar with bar data 778 00:48:35,940 --> 00:48:41,000 graphs that you see whenever you plot stock prices over periods 779 00:48:41,000 --> 00:48:46,870 of weeks or months you'll be familiar with having 780 00:48:46,870 --> 00:48:48,610 seen those. 
Now the Garman-Klass paper addressed how we can exploit this additional information to improve upon close-to-close estimates. So let's set up some assumptions and notation. We'll assume that mu equals 0 in our Geometric Brownian Motion model, so we don't have to worry about the mean; we're just concerned with volatility. We'll take the time increments to be one, corresponding to daily data. And we'll let little f, between zero and one, be the fraction of the day at which the market opens.

So over a day, from day zero to day one, we assume the market opens at time f. The Geometric Brownian Motion process might have closed on day zero here, so this would be C_0, and it may have opened on day one at this value, which would be O_1. It might have gone up and down and then closed here. This value would correspond to the high on day one, this value to the low on day one, and the closing value here would be C_1. So the model is that the underlying Brownian Motion process is working in continuous time, but we only observe it while the market is open. It can move between the close and the open on any given day, and we have the additional information: instead of just the close, we also have the high and the low. So let's look at how we might exploit that information to estimate volatility.

Using data from the first period, as graphed here, let's first highlight the close-to-close return, which gives an estimate of the one-period variance. So sigma hat 0 squared is the single-period squared return, (C_1 minus C_0) squared. C_1 minus C_0 is normally distributed with mean 0 and variance sigma squared. And if we square that, what's the distribution?
That's the square of a normal random variable, which is chi-squared, or rather a multiple of a chi-squared: it's sigma squared times a chi-squared random variable with one degree of freedom. A chi-squared random variable with one degree of freedom has expected value 1 and variance 2. Knowing those facts tells us that we have an unbiased estimate of the variance parameter sigma squared, and the variance of that estimate is 2 sigma to the fourth. So that's the precision of the close-to-close estimate.

Let's look at two other estimates, built from the close-to-open and open-to-close returns, each squared and normalized by the length of its interval. I'll just write down a few facts and then you can see that the results are clear. O_1 minus C_0 is distributed normal with mean 0 and variance f sigma squared, and C_1 minus O_1 is distributed normal with mean 0 and variance (1 minus f) sigma squared. This simply uses the properties of the diffusion process over different lengths of time. So if we normalize the squared values by the lengths of the intervals, taking sigma hat 1 squared equal to (O_1 minus C_0) squared over f and sigma hat 2 squared equal to (C_1 minus O_1) squared over (1 minus f), we get estimates of the variance. What's particularly significant about estimates one and two is that they're independent. So we have two independent estimates of the same underlying parameter, and they have the same mean and the same variance. If we consider a new estimate that averages the two, then the new estimate is unbiased as well, but its variance is the variance of that weighted sum: 1/2 squared times this variance plus 1/2 squared times this variance, which is half the variance of each of them. So this estimate has lower variance than our close-to-close estimate.
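(A quick simulation sketch of the comparison just described, with made-up parameter values and the true sigma set to 1; the variable names are illustrative.)

```python
# Compare the close-to-close variance estimate with the average of the two
# independent pieces (O1 - C0)^2 / f and (C1 - O1)^2 / (1 - f).
import numpy as np

rng = np.random.default_rng(1)
sigma, f, n_days = 1.0, 0.3, 200_000

overnight = rng.normal(0.0, np.sqrt(f) * sigma, n_days)        # O1 - C0
intraday  = rng.normal(0.0, np.sqrt(1 - f) * sigma, n_days)    # C1 - O1
close_to_close = overnight + intraday                          # C1 - C0

est0 = close_to_close**2                  # variance of this estimate: 2 * sigma^4
est1 = overnight**2 / f                   # variance 2 * sigma^4
est2 = intraday**2 / (1 - f)              # variance 2 * sigma^4
est_avg = 0.5 * (est1 + est2)             # variance sigma^4 (independence)

print(est0.mean(), est_avg.mean())        # both near sigma^2 = 1
print(est0.var() / est_avg.var())         # efficiency ratio, near 2
```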
And we can define the efficiency of this particular estimate relative to the close-to-close estimate as 2; basically we get double the precision. Suppose you had the open, high, and close for one day. How many days of close-to-close data would you need to get the same variance as this estimate?

AUDIENCE: [INAUDIBLE]. Because of the three data points [INAUDIBLE].

PROFESSOR: No. Anyone else? One more guess? Four? OK, let's see. The ratio of the variances is two, and the variance of an average of n close-to-close estimates scales like 1/n, so it actually is two. I was thinking in standard deviation units instead of squared units, so I was trying to be clever there, but it is basically two days. Sampling with this information gives you as much as two days' worth of information. So what does that mean? Well, if you want something that's as efficient as the daily estimates, you'll only need to look back one day instead of two days to get the same efficiency with the estimate.

All right. The motivation for the Garman-Klass paper was actually a paper written by Parkinson in 1976, which dealt with using the extremes of a Brownian Motion to estimate the underlying parameters. When Choongbum talks about Brownian Motion a bit later, I don't know if you'll derive this result, but in courses on stochastic processes one does derive properties of the maximum and minimum of a Brownian Motion over a given interval. It turns out that the squared difference between the high and the low, divided by 4 log 2, is an estimate of the variance of the process. And the efficiency of this estimate turns out to be 5.2, which is better yet.
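(A minimal sketch of the Parkinson range estimator just mentioned, checked by simulation against the close-to-close estimate; the discretization of the path introduces a small bias, and all names and parameter values are illustrative.)

```python
# Parkinson (1976) range estimator: (log H - log L)^2 / (4 * log 2).
import numpy as np

rng = np.random.default_rng(2)
sigma, n_steps, n_days = 1.0, 2_000, 5_000

# Simulate n_days independent one-day log-price paths starting at 0.
increments = rng.normal(0.0, sigma / np.sqrt(n_steps), (n_days, n_steps))
paths = np.hstack([np.zeros((n_days, 1)), increments.cumsum(axis=1)])

day_range = paths.max(axis=1) - paths.min(axis=1)   # high minus low of the log path
parkinson = day_range**2 / (4 * np.log(2))          # range-based variance estimate
close_close = paths[:, -1]**2                       # squared full-day return

print(parkinson.mean(), close_close.mean())         # both near sigma^2 = 1
print(close_close.var() / parkinson.var())          # roughly 5 (continuous-time figure: 5.2)
```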
Well, Garman and Klass were excited by that and wanted to find even better ones. So they wrote a paper that evaluated all different kinds of estimates, and I encourage you to Google that paper and read it, because it's very accessible and it highlights the statistical and probability issues associated with these problems. What they did was derive the best analytic scale-invariant estimator, which is a somewhat bizarre-looking combination of terms, but it essentially uses the high, low, and close normalized by the open. They were able to get an efficiency of 7.4 with that combination.

Now, about scale-invariant estimates: in statistical theory there are different principles that guide the development of methodologies, and one of them is scale invariance. If you're estimating a scale parameter, and volatility is essentially telling you how large the variability of the process is, then if you multiply all of your original data by a given constant, a scale-invariant estimator should change only by that same scale factor. The estimator doesn't depend on how you scale the data. That's the notion of scale invariance. The Garman-Klass paper actually goes further and finds a particular estimator with an efficiency of 8.4, which is a very substantial gain. So if you're working with a modeling process where the underlying parameters can reasonably be assumed constant over short periods of time, then over those short periods these extended estimators give you much more precise measures of the underlying parameters than simple close-to-close data.

All right. Let's introduce Poisson Jump Diffusions.
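(Before the jump-diffusion discussion, a brief aside: the Garman-Klass combination alluded to above is not written out in the lecture. The form commonly quoted in the literature is sigma hat squared = 0.5 (log(H/L))^2 minus (2 log 2 minus 1) (log(C/O))^2, and the sketch below assumes that textbook form; the function name and prices are made up for illustration.)

```python
# Hedged sketch of the commonly quoted Garman-Klass variance estimator,
# built from high, low, and close normalized by the open.
import numpy as np

def garman_klass(open_, high, low, close):
    """Per-period variance estimates from OHLC price arrays."""
    hl = np.log(high / low)
    co = np.log(close / open_)
    return 0.5 * hl**2 - (2.0 * np.log(2.0) - 1.0) * co**2

# Hypothetical one-day example with made-up prices.
print(garman_klass(np.array([100.0]), np.array([101.5]),
                   np.array([99.2]), np.array([100.7])))
```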
With Poisson Jump Diffusions we have a stochastic differential equation representing the model, and it's just like the Geometric Brownian Motion model except for an additional term, gamma sigma Z d pi of t. That's a lot of different symbols, but essentially the idea is this. A Brownian Motion process is fully continuous over time; there are no jumps in it. To allow for jumps, we assume there is some process pi of t, which is a Poisson process. It's a counting process that counts when jumps occur and how many have occurred. It might start at the value 0; if there's a jump here it goes up by one, and if there's another jump here it goes up by one again, and so forth.

So the Poisson Jump Diffusion model says that the diffusion process is going to experience shocks, and those shocks arrive according to a Poisson process. If you've taken stochastic modeling you know that that's essentially a purely random process: shocks arrive with exponentially distributed interarrival times, and you can't predict them. When a shock occurs, d pi of t changes by a unit increment, so d pi of t is 1, and we then realize gamma sigma Z of t. So at this point we'd have a shock, gamma sigma Z_1; at this point, maybe a negative shock, gamma sigma Z_2; elsewhere it's 0. With this overall process we have shifts in the diffusion, up or down, according to these values. So the model allows the arrival times of the shocks to be random according to the Poisson process, and the magnitudes of the shocks to be random as well. And like the Geometric Brownian Motion model, this process has independent increments, which helps with the estimation.
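(A minimal simulation sketch of the jump-diffusion idea just described. The parameter names, the yearly jump intensity, and the GBM-style log-price drift are assumptions for illustration, not values from the lecture.)

```python
# Per small step dt the log price gets a Gaussian diffusion increment plus,
# for each Poisson-arriving jump, an extra gamma * sigma * Z shock.
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, lam, gamma = 0.05, 0.20, 10.0, 2.0       # lam: assumed jumps per year
dt, n_steps = 1.0 / 252, 252

diffusion = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * rng.normal(size=n_steps)
n_jumps = rng.poisson(lam * dt, size=n_steps)        # jump counts per step (d pi)
jumps = np.array([gamma * sigma * rng.normal(size=k).sum() for k in n_jumps])

log_price = np.cumsum(diffusion + jumps)             # log-price path with jumps
print(log_price[-1], n_jumps.sum())
```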
One could estimate this model by maximum likelihood, but it gets tricky, because over any increment of time the change in the process is the diffusion increment plus the sum of the jumps that occurred over that same increment. So the model is ultimately a Poisson mixture of Gaussian distributions. To evaluate the model's properties, moment generating functions can be computed rather directly, so one can understand how the moments of the process vary with the different model parameters. The likelihood function is a product of Poisson sums, and there's a closed form for the EM algorithm, which can be used to implement the estimation of the unknown parameters.

If you think about observing a Poisson Jump Diffusion process, then if you knew where the jumps occurred and how many there were per increment in your data, the maximum likelihood estimation would be very simple, because the estimation of the Gaussian parameters would separate from the estimation of the Poisson parameters. When you haven't observed those values, you need methods appropriate for missing data. The EM algorithm is a very famous algorithm developed by the people up at Harvard, Rubin, Laird, and Dempster. The idea is that if the problem would be much simpler were certain unobserved variables observed, then you expand the problem to include your observed data plus the missing data, in this case where the jumps have occurred. You then take conditional expectations to estimate those jump quantities, and then, treating the jumps as having occurred with those frequencies, you estimate the underlying parameters, iterating between the two steps. So the EM algorithm is very powerful and has extensive applications in all kinds of different models.
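(A hedged sketch of the "Poisson mixture of Gaussians" density for one increment, conditioning on the unobserved number of jumps k and truncating the Poisson sum. The parameterization, with jump mean zero and extra variance (gamma sigma)^2 per jump, is an assumption consistent with the gamma sigma Z description above; the full likelihood would be the product of this density over increments.)

```python
import numpy as np
from scipy import stats

def jump_diffusion_density(y, dt, mu, sigma, lam, gamma, k_max=20):
    """Mixture density of one increment y over time dt, truncated at k_max jumps."""
    ks = np.arange(k_max + 1)
    weights = stats.poisson.pmf(ks, lam * dt)              # P(k jumps in the increment)
    means = mu * dt * np.ones_like(ks, dtype=float)        # jumps assumed mean-zero
    variances = sigma**2 * dt + ks * (gamma * sigma)**2    # extra variance per jump
    # sum_k P(k) * Normal(y; mean_k, var_k)
    comps = stats.norm.pdf(np.asarray(y)[..., None], means, np.sqrt(variances))
    return comps @ weights

print(jump_diffusion_density(np.array([0.0, 0.05]), 1 / 252, 0.05, 0.2, 10.0, 2.0))
```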
I'll put up on the website a paper that I wrote with David Pickard and his student Arshad Zakaria, which goes through the maximum likelihood methodology for this. Looking at that, you can see how maximum likelihood gets implemented for an extended model, and I think that's useful to see.

All right. Let's turn next to ARCH models. Just as a bit of motivation, the Geometric Brownian Motion model and the Poisson Jump Diffusion model both assume that volatility is essentially stationary over time; with the independent increments of those processes, the volatility over different increments is essentially the same. The ARCH models were introduced to accommodate the possibility of time dependence in volatility. At the very end I'll go through an example showing that time dependence with our euro/dollar exchange rates.

The setup for this model is that we look at the log of the price relatives, y_t, and we model the residuals not as having constant volatility, but as sigma_t times white noise with mean 0 and variance 1, where sigma_t squared is given by the ARCH function, which says that the variance at a given period t is a weighted sum of the squared residuals over the last p lags. So if there's a large residual, that can persist and make the next observation have a large variance, and this accommodates some time dependence. Now, this model has parameter constraints, which are never a nice thing to have when you're fitting models. In this case the parameters alpha_1 through alpha_p all have to be positive. Why do they have to be positive?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Right. The variance has to be positive.
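(A minimal sketch of the ARCH recursion just described, simulated for an ARCH(1); the parameter values are made up. A negative alpha could push the conditional variance negative, which is the constraint being discussed.)

```python
# sigma_t^2 = alpha_0 + alpha_1 * eps_{t-1}^2, with eps_t = sigma_t * z_t.
import numpy as np

rng = np.random.default_rng(4)
alpha0, alpha1, n = 1e-5, 0.5, 1000

eps = np.zeros(n)
sigma2 = np.full(n, alpha0 / (1 - alpha1))        # start at the unconditional variance
for t in range(1, n):
    sigma2[t] = alpha0 + alpha1 * eps[t - 1]**2   # conditional variance
    eps[t] = np.sqrt(sigma2[t]) * rng.normal()    # residual = sigma_t * z_t

print(eps.var(), alpha0 / (1 - alpha1))           # sample vs. unconditional variance
```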
If any of those alphas were negative, then under this model there would be a possibility of negative variance, which you can't have. So when we estimate this model we estimate it with the constraint that all of these parameter values are non-negative, and that does complicate the estimation a bit.

In terms of understanding how this process works, one can see that the ARCH model implies an autoregressive model for the squared residuals, which turns out to be useful. The top line here is the ARCH model, saying that the variance of the period-t return is this weighted sum of the past squared residuals. If we then add a new variable u_t, which is the squared residual minus its variance, to both sides, we get the next line, which says that epsilon_t squared follows an autoregression on itself, with u_t being the disturbance in that autoregression. Now u_t, which is epsilon_t squared minus sigma_t squared, what is its mean? The mean is 0, so it's almost white noise, but its variance may change over time. So it's not standard white noise, though it has expectation 0 and is conditionally independent; there's just some extra variability. What this implies is that we essentially have an autoregressive model with time-varying variances in the underlying disturbances.

Because of that, one can quickly evaluate whether there's ARCH structure in data by simply fitting an autoregressive model to the squared residuals and testing whether that regression is significant or not. Formally, that is a Lagrange multiplier test; some of the original papers by Engle go through that analysis. The test statistic turns out to be a multiple of the R squared from that regression fit. Under a null hypothesis of no ARCH structure, this regression should have no predictability.
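(A minimal sketch of the regression-based check just described: regress the squared residuals on p of their own lags and form n times R squared; its chi-square calibration is discussed next. Plain least squares via numpy; the Gaussian stand-in residuals have no ARCH effects, so the statistic should be small.)

```python
import numpy as np

def arch_lm_stat(eps, p=5):
    """n * R^2 from regressing eps_t^2 on its first p lags (with an intercept)."""
    e2 = eps**2
    y = e2[p:]
    X = np.column_stack([np.ones(len(y))] + [e2[p - i:-i] for i in range(1, p + 1)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - resid.var() / y.var()
    return len(y) * r2                       # compare to chi-square(p) quantiles

eps = np.random.default_rng(5).normal(size=1000)   # stand-in residuals, no ARCH
print(arch_lm_stat(eps, p=5))
```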
If there's no time dependence in those squared residuals, that's evidence of an absence of ARCH structure, and so under the null hypothesis of no ARCH structure the R squared statistic should be small. It turns out that n times the R squared statistic from the regression on p lags is asymptotically chi-squared distributed with p degrees of freedom, and that's where the test statistic comes into play. In implementing this, we're applying least squares to the autoregression, implicitly invoking the Gauss-Markov assumptions, to carry out the Lagrange multiplier test. This corresponds to the notion of quasi-maximum likelihood estimates of the unknown parameters. Quasi-maximum likelihood estimates are used extensively in some stochastic volatility models: essentially situations where you use the normal, or second-order, approximation to get your estimates, and those estimates turn out to be consistent and reasonably good.

All right, let's go to maximum likelihood estimation. The hard part of maximum likelihood estimation is defining the likelihood function, which is the density of the data given the unknown parameters. In this case the data are conditionally independent: the joint density is the product over t of the density of y_t given the information through t minus 1. So the joint probability density is the density at each time point conditional on the past, times the density of the next time point conditional on the past, and so on. Those conditional densities are all normal, so these are normal PDFs coming into play here. What we want to do is maximize this likelihood function subject to the constraints, and we already went through the fact that the alpha_i's have to be non-negative.
And it turns out you also need the sum of the alphas to be less than one. What would happen if the sum of the alphas were not less than one?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Right. The process could start diverging; these autoregressions can explode.

In the remaining few minutes let me introduce the GARCH models. The GARCH model adds to the equation for the variance sigma_t squared a sum of q past squared volatilities, on top of the ARCH terms in the past squared residuals. It may be that very high order ARCH terms are found to be significant when you fit ARCH models, and much of that need can be explained instead by adding these GARCH terms. So let's consider a simple GARCH model with only a first-order ARCH term and a first-order GARCH term. It says that the current variance is a weighted combination of the previous conditional variance and the new squared residual. This is a very parsimonious representation that ends up fitting data quite well. There are various properties of this GARCH model which we'll go through next time, but I want to close this lecture by showing you fits of the ARCH models and of this GARCH model to the euro/dollar exchange rate process.

With the euro/dollar exchange rate, there's a graph here which shows the auto-correlation function and the partial auto-correlation function of the squared returns.
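(Plots of that kind can be reproduced with something like the following sketch, assuming statsmodels and matplotlib are installed; the return series here is a simulated stand-in, not the euro/dollar data.)

```python
# ACF and PACF of squared returns, the standard check for volatility clustering.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

r = np.random.default_rng(6).normal(size=1500) * 0.006   # stand-in daily returns
fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(r**2, lags=40, ax=axes[0], title="ACF of squared returns")
plot_pacf(r**2, lags=40, ax=axes[1], title="PACF of squared returns")
plt.tight_layout()
plt.show()
```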
So is there dependence in these daily volatilities? These blue lines are plus or minus two standard deviations for the correlation coefficients. We have highly significant auto-correlations and very highly significant partial auto-correlations, which suggests, if you're familiar with ARMA processes, that you would need a very high order ARMA process to fit the squared residuals. This highlights how, with these statistical tools, you can identify the time dependence quite quickly.

Here's a plot of the ARCH order-one model and the ARCH order-two model. On each of these I've drawn a solid line where the constant-variance model would be, so ARCH is saying that we have a lot of variability about that constant level. A property of these ARCH models is that they all have a minimum value for the volatility they estimate: if you look at the ARCH function, the constant term alpha_0 is essentially the minimum value the conditional variance can take, so there's a constraint on the lower value. Then here's an ARCH(10) fit, which doesn't have quite as uniform a lower bound, and one could go on with higher and higher order ARCH terms. But rather than doing that, one can fit just a GARCH(1,1) model, and this is what it looks like. The time-varying volatility in this process is captured really well with just this GARCH model with a couple of parameters, as compared with a high-order autoregressive model. It highlights the issue with the Wold decomposition, where a potentially infinite-order autoregressive model will effectively fit most time series: that's nice to know, but it's nicer to have a parsimonious way of defining that infinite collection of parameters, and with the GARCH model a couple of parameters do a good job.
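(A minimal sketch of the GARCH(1,1) recursion just discussed, with assumed notation sigma_t^2 = alpha_0 + alpha_1 eps_{t-1}^2 + beta_1 sigma_{t-1}^2. Given parameter values, it filters a conditional-volatility path from a residual series; actually estimating the parameters would wrap this recursion in a (quasi-)likelihood maximization. Parameter values and names are illustrative.)

```python
import numpy as np

def garch11_volatility(eps, alpha0, alpha1, beta1):
    """Conditional volatility path implied by GARCH(1,1) parameters."""
    sigma2 = np.empty_like(eps)
    sigma2[0] = eps.var()                         # simple starting value
    for t in range(1, len(eps)):
        sigma2[t] = alpha0 + alpha1 * eps[t - 1]**2 + beta1 * sigma2[t - 1]
    return np.sqrt(sigma2)

eps = np.random.default_rng(7).normal(size=1000) * 0.01   # stand-in residual series
vol = garch11_volatility(eps, alpha0=1e-6, alpha1=0.05, beta1=0.90)
print(vol[:5])
```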
And then finally, here's a simultaneous plot of all of those volatility estimates on the same graph, where one can see the increased flexibility of the GARCH models compared to the ARCH models for capturing time-varying volatility.

All right, I'll stop there for today. Next Tuesday is a presentation from Morgan Stanley, and today is the last day to sign up for the field trip.