1 00:00:00,060 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,330 To make a donation or view additional materials 6 00:00:13,330 --> 00:00:17,217 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,217 --> 00:00:17,842 at ocw.mit.edu. 8 00:00:21,280 --> 00:00:23,640 KENNETH ABBOTT: As I said, my name is Ken Abbott. 9 00:00:23,640 --> 00:00:26,220 I'm the operating officer for Firm Risk Management 10 00:00:26,220 --> 00:00:29,850 at Morgan Stanley, which means I'm the everything else guy. 11 00:00:29,850 --> 00:00:33,480 I'm like the normal stuff with a bar over it. 12 00:00:33,480 --> 00:00:37,640 The complement of normal-- I get all the odd stuff. 13 00:00:37,640 --> 00:00:41,065 I consider myself the Harvey Keitel character. 14 00:00:41,065 --> 00:00:43,954 You know, the fixer? 15 00:00:43,954 --> 00:00:45,870 And so I get a lot of interesting stuff to do. 16 00:00:45,870 --> 00:00:47,920 I've covered commodities, I've covered fixed income, 17 00:00:47,920 --> 00:00:49,900 I've covered equities, I've covered credit derivatives, 18 00:00:49,900 --> 00:00:51,120 I've covered mortgages. 19 00:00:51,120 --> 00:00:52,670 Now I'm also the Chief Risk Officer 20 00:00:52,670 --> 00:00:55,000 for the buy side of Morgan Stanley. 21 00:00:55,000 --> 00:00:58,170 The investment management business and the private equity 22 00:00:58,170 --> 00:01:00,550 holdings that we have. 23 00:01:00,550 --> 00:01:03,160 And I look after lot of that stuff 24 00:01:03,160 --> 00:01:05,237 and I sit on probably 40 different committees 25 00:01:05,237 --> 00:01:07,320 because it's become very, very, very bureaucratic. 26 00:01:07,320 --> 00:01:11,127 But that's the way it goes. 
27 00:01:11,127 --> 00:01:13,710 What I want to talk about today is some of the core approaches 28 00:01:13,710 --> 00:01:17,250 we use to measure risk in a market risk setting. 29 00:01:17,250 --> 00:01:22,380 This is part of a larger course I teach at a couple of places. 30 00:01:22,380 --> 00:01:26,400 I'm a triple alum at NYU-- no, I'm a double alum 31 00:01:26,400 --> 00:01:28,193 and now I'm on their faculty [INAUDIBLE]. 32 00:01:28,193 --> 00:01:32,370 I have a master's in economics from their arts and sciences 33 00:01:32,370 --> 00:01:33,115 program. 34 00:01:33,115 --> 00:01:35,460 I have a master's in statistics from Stern 35 00:01:35,460 --> 00:01:37,160 when Stern used to have a stat program. 36 00:01:37,160 --> 00:01:40,260 And now I teach at Courant. 37 00:01:40,260 --> 00:01:42,915 I also teach at Claremont and I teach at Baruch, 38 00:01:42,915 --> 00:01:43,790 part of that program. 39 00:01:43,790 --> 00:01:45,990 So I've been through this material many times. 40 00:01:45,990 --> 00:01:51,380 So what I want to do is lay the foundation for this notion 41 00:01:51,380 --> 00:01:54,510 that we call value at risk, this idea of VaR. 42 00:01:58,785 --> 00:02:01,526 [INAUDIBLE] put this back on. 43 00:02:01,526 --> 00:02:02,090 Got it. 44 00:02:02,090 --> 00:02:02,990 I'll make it work. 45 00:02:05,720 --> 00:02:08,300 I'll talk about it from a mathematical standpoint 46 00:02:08,300 --> 00:02:10,210 and from a statistical standpoint, 47 00:02:10,210 --> 00:02:12,960 but also give you some of the intuition behind what 48 00:02:12,960 --> 00:02:17,190 it is that we're trying to do when we measure this thing. 49 00:02:17,190 --> 00:02:20,377 First, a couple of words about risk management. 50 00:02:20,377 --> 00:02:21,210 What does risk do? 51 00:02:21,210 --> 00:02:25,106 25 years ago, maybe three firms had risk management groups.
52 00:02:25,106 --> 00:02:26,730 I was part of the first risk management 53 00:02:26,730 --> 00:02:31,864 group at Bankers Trust in 1986. 54 00:02:31,864 --> 00:02:33,780 No one else had a risk management group as far 55 00:02:33,780 --> 00:02:34,330 as I know. 56 00:02:37,040 --> 00:02:40,590 Market risk management really came to be in the late '80s. 57 00:02:40,590 --> 00:02:42,090 Credit risk management had obviously 58 00:02:42,090 --> 00:02:45,700 been around in large financial institutions the whole time. 59 00:02:45,700 --> 00:02:50,180 So our job is to make sure that management 60 00:02:50,180 --> 00:02:51,660 knows what's on the books. 61 00:02:51,660 --> 00:02:54,910 So step one is, what is the risk profile of the firm? 62 00:02:54,910 --> 00:02:57,582 How do I make sure that management 63 00:02:57,582 --> 00:02:58,540 is informed about this? 64 00:02:58,540 --> 00:03:00,160 So it requires two things. 65 00:03:00,160 --> 00:03:02,520 One, I have to know what the risk profile is 66 00:03:02,520 --> 00:03:04,000 because I have to know it in order 67 00:03:04,000 --> 00:03:05,460 to be able to communicate it. 68 00:03:05,460 --> 00:03:08,210 But the second thing, equally important, particularly 69 00:03:08,210 --> 00:03:11,880 important for you guys and girls, 70 00:03:11,880 --> 00:03:16,310 is that you need to be able to express 71 00:03:16,310 --> 00:03:19,860 relatively complex concepts in simple words 72 00:03:19,860 --> 00:03:21,190 and pretty pictures. 73 00:03:21,190 --> 00:03:22,290 All right? 74 00:03:22,290 --> 00:03:25,900 Chances are if you go to work for big firm, 75 00:03:25,900 --> 00:03:28,260 your boss won't be a quant. 76 00:03:28,260 --> 00:03:30,880 My boss happens to have a degree from Carnegie Mellon. 77 00:03:30,880 --> 00:03:33,770 He can count to 11 with his shoes on. 78 00:03:33,770 --> 00:03:34,790 His boss is a lawyer. 79 00:03:34,790 --> 00:03:37,500 His boss is the chairman. 
80 00:03:37,500 --> 00:03:41,220 Commonly, the most senior people are very, very intelligent, 81 00:03:41,220 --> 00:03:45,900 very, very articulate, very, very learned. 82 00:03:45,900 --> 00:03:47,432 But not necessarily quants. 83 00:03:47,432 --> 00:03:49,890 Many of them have had a year or two of calculus, maybe even 84 00:03:49,890 --> 00:03:51,860 linear algebra. 85 00:03:51,860 --> 00:03:54,800 You can't show them-- look, when you and I chat and we talk 86 00:03:54,800 --> 00:03:59,980 about regression analysis, I could say X transpose X inverse 87 00:03:59,980 --> 00:04:01,310 X transpose y. 88 00:04:01,310 --> 00:04:03,476 And those of you that have taken a regression course 89 00:04:03,476 --> 00:04:06,830 think, ah, that's beta hat. 90 00:04:06,830 --> 00:04:08,080 And we can just stop it there. 91 00:04:08,080 --> 00:04:11,010 I can just put this form up there and you may recognize it. 92 00:04:11,010 --> 00:04:13,760 I would have to spend 45 minutes explaining this 93 00:04:13,760 --> 00:04:15,733 to people on the top floor because this is not 94 00:04:15,733 --> 00:04:16,649 what they're studying. 95 00:04:16,649 --> 00:04:20,320 So we can talk the code amongst ourselves, 96 00:04:20,320 --> 00:04:25,372 but when we go outside our little group-- getting bigger-- 97 00:04:25,372 --> 00:04:27,830 we have to make sure that we can express ourselves clearly. 98 00:04:27,830 --> 00:04:30,950 That's done in clear, effective prose, and in graphs. 99 00:04:30,950 --> 00:04:33,800 And I'll show you some of that stuff as we go on. 100 00:04:33,800 --> 00:04:36,360 So step one, make sure management 101 00:04:36,360 --> 00:04:38,130 knows what the risk profile is. 102 00:04:38,130 --> 00:04:41,450 Step two, protect the firm against unacceptably large 103 00:04:41,450 --> 00:04:42,290 concentrations. 104 00:04:42,290 --> 00:04:44,090 This is the subjective part. 105 00:04:44,090 --> 00:04:46,690 I can know the risk, but how big is big? 
106 00:04:46,690 --> 00:04:48,045 How much is too much? 107 00:04:48,045 --> 00:04:50,430 How much is too concentrated? 108 00:04:50,430 --> 00:04:53,550 If I have $1 million of sensitivity per basis point, 109 00:04:53,550 --> 00:04:56,405 that's a 1/100th of 1% move in a rate. 110 00:04:56,405 --> 00:04:57,350 Is that big? 111 00:04:57,350 --> 00:04:58,920 Is that small? 112 00:04:58,920 --> 00:05:00,197 How do I know how much? 113 00:05:00,197 --> 00:05:02,280 How much of a particular stock issue should I own? 114 00:05:02,280 --> 00:05:05,100 How much of a bond issue? 115 00:05:05,100 --> 00:05:06,610 How much futures open interest? 116 00:05:06,610 --> 00:05:08,800 How big a limit should I have on this type of risk? 117 00:05:08,800 --> 00:05:12,121 That's where intuition and experience come into play. 118 00:05:12,121 --> 00:05:13,620 So that's the second part of our job 119 00:05:13,620 --> 00:05:17,040 is to protect against unacceptably large losses. 120 00:05:17,040 --> 00:05:22,150 So the third, no surprises, you can 121 00:05:22,150 --> 00:05:27,980 liken the trading business-- it's taking calculated risks. 122 00:05:27,980 --> 00:05:29,560 Sometimes you're going to lose. 123 00:05:29,560 --> 00:05:31,190 Many times you're going to lose. 124 00:05:31,190 --> 00:05:35,630 In fact, if you win 51% of the time, life is pretty good. 125 00:05:35,630 --> 00:05:37,740 So what you want to do is make sure you have 126 00:05:37,740 --> 00:05:41,090 the right information so you can estimate, if things get bad, 127 00:05:41,090 --> 00:05:43,240 how bad will they get? 128 00:05:43,240 --> 00:05:44,750 And to use that, we leverage a lot 129 00:05:44,750 --> 00:05:49,710 of relatively simple notions that we see in statistics. 130 00:05:49,710 --> 00:05:54,804 And so I should use a coloring mask here, not a spotlight. 131 00:05:54,804 --> 00:05:55,720 We do a couple things. 
132 00:05:55,720 --> 00:05:58,750 Just like the way when they talk about the press in your course 133 00:05:58,750 --> 00:06:01,400 about journalism, we can shine a light 134 00:06:01,400 --> 00:06:03,430 anywhere we want, and we do all the time. 135 00:06:03,430 --> 00:06:04,040 You know what? 136 00:06:04,040 --> 00:06:05,660 I'm going to think about this particular kind of risk. 137 00:06:05,660 --> 00:06:07,400 I'm going to point out that this is really important. 138 00:06:07,400 --> 00:06:08,733 You need to pay attention to it. 139 00:06:08,733 --> 00:06:09,924 And then I could shade it. 140 00:06:09,924 --> 00:06:12,340 I can make it blue, I can make a red, I can make it green. 141 00:06:12,340 --> 00:06:14,660 I'd say this is good, this is bad, this is too big, 142 00:06:14,660 --> 00:06:17,080 this is too small, this is perfectly fine. 143 00:06:17,080 --> 00:06:21,610 So that's just a little bit of quick background on what we do. 144 00:06:21,610 --> 00:06:25,035 So I'm going to go through as much of this as I can. 145 00:06:25,035 --> 00:06:26,760 I'm going to fly through the first part 146 00:06:26,760 --> 00:06:30,670 and I want to hit these because these are the ways that we 147 00:06:30,670 --> 00:06:31,990 actually estimate risk. 148 00:06:31,990 --> 00:06:34,590 Variance, covariance as a quadratic form. 149 00:06:34,590 --> 00:06:36,610 Monte Carlo simulation, the way I'll show you 150 00:06:36,610 --> 00:06:38,310 is based on a quadratic form. 151 00:06:38,310 --> 00:06:40,490 And historical simulation is Monte Carlo simulation 152 00:06:40,490 --> 00:06:42,130 without the Monte Carlo part. 153 00:06:42,130 --> 00:06:43,890 It's using historical data. 154 00:06:43,890 --> 00:06:47,080 And I'll go through that fairly quickly. 155 00:06:47,080 --> 00:06:50,320 Questions, comments? 156 00:06:50,320 --> 00:06:51,960 No? 157 00:06:51,960 --> 00:06:52,780 Excellent. 
158 00:06:52,780 --> 00:06:55,710 Stop me-- look, if any one of you 159 00:06:55,710 --> 00:06:58,750 doesn't understand something I say, probably many of you 160 00:06:58,750 --> 00:07:00,084 don't understand it. 161 00:07:00,084 --> 00:07:02,250 I don't know you guys, so I don't know what you know 162 00:07:02,250 --> 00:07:02,930 and what you don't know. 163 00:07:02,930 --> 00:07:05,096 So if there's a term that comes up, you're not sure, 164 00:07:05,096 --> 00:07:08,930 just say, Ken, I don't have a PhD. 165 00:07:08,930 --> 00:07:11,420 I work for a living. 166 00:07:11,420 --> 00:07:13,130 I make fun of academics. 167 00:07:13,130 --> 00:07:15,010 I know you work for a living too. 168 00:07:15,010 --> 00:07:16,780 All right. 169 00:07:16,780 --> 00:07:18,809 There's a guy I tease at Claremont [INAUDIBLE] 170 00:07:18,809 --> 00:07:21,100 in his class, I say, who is this pointy-headed academic 171 00:07:21,100 --> 00:07:22,760 [INAUDIBLE]. 172 00:07:22,760 --> 00:07:24,937 Only kidding. 173 00:07:24,937 --> 00:07:27,520 All right, so I'm going to talk about one-asset value at risk. 174 00:07:27,520 --> 00:07:29,200 First I'm going to introduce the notion of value at risk. 175 00:07:29,200 --> 00:07:30,780 I'm going to talk about one asset. 176 00:07:30,780 --> 00:07:32,780 I'm going to talk about price-based instruments. 177 00:07:32,780 --> 00:07:35,052 We're going to go into yield space, 178 00:07:35,052 --> 00:07:36,510 so we'll talk about the conversions 179 00:07:36,510 --> 00:07:38,300 we have to do there. 180 00:07:38,300 --> 00:07:40,481 One thing I'll do after this class is over, 181 00:07:40,481 --> 00:07:42,980 since I know I'm going to fly through some of the material-- 182 00:07:42,980 --> 00:07:44,563 and since this is MIT, I'm sure you're 183 00:07:44,563 --> 00:07:46,276 used to just flying through material. 
184 00:07:46,276 --> 00:07:48,150 And there's a lot of this, the proof of which 185 00:07:48,150 --> 00:07:50,066 is left to the reader as an exercise. 186 00:07:50,066 --> 00:07:51,690 I'm sure you get a fair amount of that. 187 00:07:51,690 --> 00:07:53,120 I will give you papers. 188 00:07:53,120 --> 00:07:57,070 If you have questions, my email is on the first page. 189 00:07:57,070 --> 00:07:58,610 I welcome your questions. 190 00:07:58,610 --> 00:08:00,540 I tell my students that every year. 191 00:08:00,540 --> 00:08:02,500 I'm OK with you sending me an email asking me 192 00:08:02,500 --> 00:08:05,450 for a reference, a citation, something. 193 00:08:05,450 --> 00:08:06,760 I'm perfectly fine with that. 194 00:08:06,760 --> 00:08:09,040 Don't worry, oh, he's too busy. 195 00:08:09,040 --> 00:08:10,245 I'm fine. 196 00:08:10,245 --> 00:08:12,950 If you've got a question, something is not clear, 197 00:08:12,950 --> 00:08:16,750 I've got access to thousands of papers. 198 00:08:16,750 --> 00:08:17,790 And I've screened them. 199 00:08:17,790 --> 00:08:18,860 I've read thousands of papers, I say 200 00:08:18,860 --> 00:08:20,780 this is a good one, that's a waste of time. 201 00:08:20,780 --> 00:08:22,475 But I can give you background material 202 00:08:22,475 --> 00:08:26,110 on regulation, on bond pricing, on derivative algorithms. 203 00:08:26,110 --> 00:08:26,680 Let me know. 204 00:08:26,680 --> 00:08:29,000 I'm happy to provide that at any point in time. 205 00:08:29,000 --> 00:08:30,536 You get that free with your tuition. 206 00:08:33,789 --> 00:08:35,120 A couple of key metrics. 207 00:08:35,120 --> 00:08:36,953 I don't want to spend too much time on this. 208 00:08:36,953 --> 00:08:38,620 Interest rate exposure, how sensitive 209 00:08:38,620 --> 00:08:40,850 am I to changes in interest rates, equity exposure, 210 00:08:40,850 --> 00:08:43,200 commodity exposure, credit spread exposure. 
211 00:08:43,200 --> 00:08:47,080 We'll talk about linearity, we won't talk too much 212 00:08:47,080 --> 00:08:48,512 about regularity of cash flow. 213 00:08:48,512 --> 00:08:49,970 We won't really get into that here. 214 00:08:49,970 --> 00:08:53,129 And we need to know correlation across different asset classes. 215 00:08:53,129 --> 00:08:54,545 And I'll show you what that means. 216 00:08:57,140 --> 00:09:00,790 At the heart of this notion of value at risk 217 00:09:00,790 --> 00:09:05,350 is this idea of a statistical order statistic. 218 00:09:05,350 --> 00:09:08,317 Who here has heard of order statistics? 219 00:09:08,317 --> 00:09:10,150 All right, I'm going to give you 30 seconds. 220 00:09:10,150 --> 00:09:15,165 The best simple description of an order statistic. 221 00:09:15,165 --> 00:09:16,710 PROFESSOR: The maximum or the minimum 222 00:09:16,710 --> 00:09:19,600 of a set of observations. 223 00:09:19,600 --> 00:09:21,230 KENNETH ABBOTT: All right? 224 00:09:21,230 --> 00:09:23,540 When we talk about value at risk, 225 00:09:23,540 --> 00:09:28,420 I want to know the worst 1% of the outcomes. 226 00:09:28,420 --> 00:09:32,030 And what's cool about order statistics 227 00:09:32,030 --> 00:09:34,610 is they're well established in the literature. 228 00:09:34,610 --> 00:09:36,840 Pretty well understood. 229 00:09:36,840 --> 00:09:39,430 And so people are familiar with it. 230 00:09:39,430 --> 00:09:42,180 Once we put our toe into the academic water 231 00:09:42,180 --> 00:09:44,160 and we start talking about this notion, 232 00:09:44,160 --> 00:09:46,071 there's a vast body of literature 233 00:09:46,071 --> 00:09:47,570 that says this is how this thing is. 234 00:09:47,570 --> 00:09:48,460 This is how it pays. 235 00:09:48,460 --> 00:09:50,680 This is what the distribution looks like. 236 00:09:50,680 --> 00:09:55,200 And so we can estimate these things. 
237 00:09:55,200 --> 00:09:57,830 And so what we're looking at in value at risk 238 00:09:57,830 --> 00:10:03,867 is my distribution of returns, how much I make. 239 00:10:03,867 --> 00:10:05,450 In particular, if I look historically, 240 00:10:05,450 --> 00:10:06,860 I have a position. 241 00:10:06,860 --> 00:10:10,630 How much would this position have earned me over the last n 242 00:10:10,630 --> 00:10:13,140 days, n weeks, n months? 243 00:10:13,140 --> 00:10:16,430 If I look at a frequency distribution of that, 244 00:10:16,430 --> 00:10:19,690 I'm likely-- don't have to-- I'm likely to get something that's 245 00:10:19,690 --> 00:10:20,417 symmetric. 246 00:10:20,417 --> 00:10:22,250 I'm likely to get something that's unimodal. 247 00:10:24,860 --> 00:10:26,610 It may or may not have fat tails. 248 00:10:26,610 --> 00:10:29,290 We'll talk about that a little later. 249 00:10:29,290 --> 00:10:32,040 If my return distribution were beautifully symmetric 250 00:10:32,040 --> 00:10:38,490 and beautifully normal and independent, then the risk-- 251 00:10:38,490 --> 00:10:40,790 I could measure this 1% order statistic. 252 00:10:40,790 --> 00:10:46,570 What's the 1% likely worst case outcome tomorrow? 253 00:10:46,570 --> 00:10:49,960 I might do that by integrating the normal function 254 00:10:49,960 --> 00:10:54,150 from negative infinity-- for all intents and purposes 255 00:10:54,150 --> 00:10:55,640 five or six standard deviations. 256 00:10:55,640 --> 00:10:59,970 Anyway, from negative infinity to negative 2.33 257 00:10:59,970 --> 00:11:01,080 standard deviations. 258 00:11:01,080 --> 00:11:01,710 Why? 259 00:11:01,710 --> 00:11:05,250 Because the area under the curve, that's 0.01. 260 00:11:05,250 --> 00:11:07,440 Now this is a one-sided confidence interval 261 00:11:07,440 --> 00:11:09,540 as opposed to a two-sided confidence interval.
262 00:11:09,540 --> 00:11:12,060 And this is one of these things that as an undergrad 263 00:11:12,060 --> 00:11:14,370 you learn two-sided, and then the first time someone 264 00:11:14,370 --> 00:11:15,670 shows you one-sided you're like, wait a minute. 265 00:11:15,670 --> 00:11:16,212 What is this? 266 00:11:16,212 --> 00:11:17,336 Then you say, oh, I get it. 267 00:11:17,336 --> 00:11:18,690 You're just looking at the area. 268 00:11:18,690 --> 00:11:21,210 I could build a gazillion two-sided confidence intervals. 269 00:11:21,210 --> 00:11:23,620 One-sided, it's got to stop at one place. 270 00:11:23,620 --> 00:11:25,870 All right, so this set of outcomes-- and this 271 00:11:25,870 --> 00:11:31,480 is standardized-- this is in standard deviation space-- 272 00:11:31,480 --> 00:11:34,150 negative infinity to negative 2.33. 273 00:11:34,150 --> 00:11:37,900 If I want 95%, or 5% likely loss, so I could say, 274 00:11:37,900 --> 00:11:39,910 tomorrow there's a 5% chance my loss is 275 00:11:39,910 --> 00:11:42,230 going to be x or greater, I would go 276 00:11:42,230 --> 00:11:45,950 to negative 1.645 standard deviations. 277 00:11:45,950 --> 00:11:48,120 Because the integral from negative infinity to negative 1.645 278 00:11:48,120 --> 00:11:52,970 standard deviations is about 0.05. 279 00:11:52,970 --> 00:11:55,650 It's not just a good idea, it's the law. 280 00:11:55,650 --> 00:11:58,560 Does that make sense? 281 00:11:58,560 --> 00:12:00,602 And again, I'm going to say assuming the normal. 282 00:12:00,602 --> 00:12:02,060 That's like the old economist joke, 283 00:12:02,060 --> 00:12:04,630 assume a can opener when he's on a desert island. 284 00:12:04,630 --> 00:12:05,900 You guys don't know that one. 285 00:12:05,900 --> 00:12:07,410 I got lots of economics jokes. 286 00:12:07,410 --> 00:12:10,970 I'll tell them later on maybe-- or after class.
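[Editorial sketch] The one-sided cutoffs quoted in the lecture (about 2.33 and 1.645 standard deviations into the left tail) can be checked with Python's standard library. The position size and daily volatility below are made-up numbers for illustration only, and the whole calculation assumes normally distributed returns with zero mean, which the speaker flags as a simplification.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)

# z-scores that cut off the worst 1% and worst 5% of outcomes (left tail).
z_99 = std_normal.inv_cdf(0.01)   # roughly -2.33
z_95 = std_normal.inv_cdf(0.05)   # roughly -1.645

# Going the other way: area under the curve from -infinity to each cutoff.
tail_1pct = std_normal.cdf(-2.33)
tail_5pct = std_normal.cdf(-1.645)

# Hypothetical position: $10 million with a 1.2% daily return volatility.
position_value = 10_000_000   # assumption, not from the lecture
daily_vol = 0.012             # assumption, not from the lecture

# One-day loss thresholds under the zero-mean normal assumption.
var_99 = -z_99 * daily_vol * position_value
var_95 = -z_95 * daily_vol * position_value

print(f"z(1%) = {z_99:.3f}, z(5%) = {z_95:.3f}")
print(f"left-tail areas: {tail_1pct:.4f}, {tail_5pct:.4f}")
print(f"99% one-day VaR: ${var_99:,.0f}; 95% one-day VaR: ${var_95:,.0f}")
```

The sign flip on the z-score is only a reporting convention so that the loss threshold reads as a positive dollar amount.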
287 00:12:10,970 --> 00:12:13,950 If I'm assuming normal distribution, and that's 288 00:12:13,950 --> 00:12:16,084 what I'm going to do, what I want to do 289 00:12:16,084 --> 00:12:19,600 is I'm going to set this thing up in a normal distribution 290 00:12:19,600 --> 00:12:20,140 framework. 291 00:12:20,140 --> 00:12:24,200 Now doing this approach and assuming normal distributions, 292 00:12:24,200 --> 00:12:28,720 I liken it to using Latin. 293 00:12:28,720 --> 00:12:32,240 Nobody really uses it anymore but everything we do 294 00:12:32,240 --> 00:12:33,872 is based upon it. 295 00:12:33,872 --> 00:12:35,080 So that's our starting point. 296 00:12:35,080 --> 00:12:37,710 And it's really easy to teach it this way 297 00:12:37,710 --> 00:12:39,560 and then we relax the assumptions 298 00:12:39,560 --> 00:12:41,210 like so many things in life. 299 00:12:41,210 --> 00:12:42,880 I teach you the strict case then we 300 00:12:42,880 --> 00:12:45,630 relax the assumptions to get to the way it's done now. 301 00:12:45,630 --> 00:12:48,360 So this makes sense? 302 00:12:48,360 --> 00:12:48,870 All right. 303 00:12:48,870 --> 00:12:51,290 So let's get there. 304 00:12:51,290 --> 00:12:53,670 This is way oversimplified-- but let's say 305 00:12:53,670 --> 00:12:55,117 I have something like this. 306 00:12:55,117 --> 00:12:56,700 Who has taken intermediate statistics? 307 00:13:01,530 --> 00:13:05,290 We have the notion of stationarity 308 00:13:05,290 --> 00:13:06,720 that we talk about all the time. 309 00:13:06,720 --> 00:13:10,022 The mean and variance constant is one simplistic way 310 00:13:10,022 --> 00:13:10,980 of thinking about this. 311 00:13:10,980 --> 00:13:13,147 Do you have a better way for me to put that to them? 312 00:13:13,147 --> 00:13:15,146 Because you know what their background would be. 313 00:13:15,146 --> 00:13:15,870 PROFESSOR: No. 314 00:13:15,870 --> 00:13:16,953 KENNETH ABBOTT: All right. 
315 00:13:16,953 --> 00:13:20,400 Just, mean and variance are constant. 316 00:13:20,400 --> 00:13:22,930 When I look at the time series itself, 317 00:13:22,930 --> 00:13:26,110 the time series mean and the time series variance 318 00:13:26,110 --> 00:13:28,520 are not constant. 319 00:13:28,520 --> 00:13:30,920 And there also could be other time series stuff going on. 320 00:13:30,920 --> 00:13:34,900 There could be seasonality, there could be autocorrelation. 321 00:13:34,900 --> 00:13:38,430 This looks something like a random walk 322 00:13:38,430 --> 00:13:40,560 but it's not stationary. 323 00:13:40,560 --> 00:13:43,371 It's hard for me to draw inference by looking at that 324 00:13:43,371 --> 00:13:43,870 alone. 325 00:13:43,870 --> 00:13:45,328 So we want to try to predict what's 326 00:13:45,328 --> 00:13:47,460 going to happen in the future, it's kind of hard. 327 00:13:47,460 --> 00:13:51,180 And the game, here, that we're playing, is we want to know 328 00:13:51,180 --> 00:13:55,510 how much money do I need to hold to support that position? 329 00:13:55,510 --> 00:13:58,146 Now, who here has taken an accounting course? 330 00:13:58,146 --> 00:14:00,780 All right, word to the wise-- there's 331 00:14:00,780 --> 00:14:03,020 two things I tell students in quant finance programs. 332 00:14:03,020 --> 00:14:05,686 First of all, I know you have to take a time series course-- I'm 333 00:14:05,686 --> 00:14:07,447 sure-- this is MIT. 334 00:14:07,447 --> 00:14:09,030 If you don't get a time series course, 335 00:14:09,030 --> 00:14:11,910 get your money back because you've got to take time series. 336 00:14:11,910 --> 00:14:13,540 Accounting is important. 337 00:14:13,540 --> 00:14:15,500 Accounting is important because so much 338 00:14:15,500 --> 00:14:17,350 of what we do, the way we think about things 339 00:14:17,350 --> 00:14:18,760 is predicated on the dollars. 340 00:14:18,760 --> 00:14:22,320 And you need to know how the dollars are recorded. 
341 00:14:22,320 --> 00:14:22,850 Quick aside. 342 00:14:28,650 --> 00:14:29,530 Balance sheet. 343 00:14:29,530 --> 00:14:31,405 I'll give you a 30 second accounting lecture. 344 00:14:34,800 --> 00:14:38,230 Assets, what we own. 345 00:14:38,230 --> 00:14:41,410 Everything we own-- we have stuff, it's assets. 346 00:14:41,410 --> 00:14:43,560 We came to that stuff one of two ways. 347 00:14:43,560 --> 00:14:46,750 We either pay for it out of our pocket, or we borrowed money. 348 00:14:46,750 --> 00:14:48,150 There's no third way. 349 00:14:48,150 --> 00:14:50,624 So everything we own, we either paid 350 00:14:50,624 --> 00:14:52,290 for out of our pocket or borrowed money. 351 00:14:52,290 --> 00:14:57,950 The amount we paid for out of our pocket is the equity. 352 00:14:57,950 --> 00:15:01,000 The ratio of this to this is called leverage 353 00:15:01,000 --> 00:15:02,510 among other things. 354 00:15:02,510 --> 00:15:04,040 All right? 355 00:15:04,040 --> 00:15:05,400 If I'm this company. 356 00:15:05,400 --> 00:15:08,230 I have this much stuff and I bought it 357 00:15:08,230 --> 00:15:10,630 with this much debt, and this much equity. 358 00:15:10,630 --> 00:15:13,760 Again, that's a gross oversimplification. 359 00:15:13,760 --> 00:15:16,710 When this gets down to zero, it's game over. 360 00:15:16,710 --> 00:15:17,870 Belly up. 361 00:15:17,870 --> 00:15:19,160 All right? 362 00:15:19,160 --> 00:15:20,520 Does that make sense? 363 00:15:20,520 --> 00:15:23,590 Now you've taken a semester of accounting. 364 00:15:23,590 --> 00:15:26,050 No, only kidding. 365 00:15:26,050 --> 00:15:30,970 But it's actually important to have a grip on how that works. 
366 00:15:30,970 --> 00:15:33,270 Because what we need to make sure of 367 00:15:33,270 --> 00:15:36,560 is that if we're going to take this position and hold it, 368 00:15:36,560 --> 00:15:40,690 we need to make sure that with some level of certainty-- 369 00:15:40,690 --> 00:15:43,310 every time we lose money this gets reduced. 370 00:15:43,310 --> 00:15:47,040 When this goes down to zero, I go bankrupt. 371 00:15:47,040 --> 00:15:48,650 So that's what we're trying to do. 372 00:15:48,650 --> 00:15:51,340 We need to protect this, and we do it 373 00:15:51,340 --> 00:15:55,220 by knowing how much of this could move against us. 374 00:15:55,220 --> 00:15:56,247 Everybody with me? 375 00:15:56,247 --> 00:15:57,080 Anybody not with me? 376 00:15:57,080 --> 00:16:00,624 It's OK to have questions, it really is. 377 00:16:00,624 --> 00:16:03,540 Excellent. 378 00:16:03,540 --> 00:16:05,680 All right, so if I do a frequency distribution 379 00:16:05,680 --> 00:16:10,980 of this time series, I just say, show me the frequency 380 00:16:10,980 --> 00:16:13,270 with which this thing shows. 381 00:16:13,270 --> 00:16:16,891 I get this thing, it's kind of trimodal. 382 00:16:16,891 --> 00:16:17,890 It's all over the place. 383 00:16:17,890 --> 00:16:19,480 It doesn't tell me anything. 384 00:16:19,480 --> 00:16:22,200 If I look at the levels-- the frequency distribution, 385 00:16:22,200 --> 00:16:24,290 the relative frequency distribution of the levels 386 00:16:24,290 --> 00:16:26,800 themselves, I don't get a whole lot of intuition. 387 00:16:26,800 --> 00:16:29,080 If I go into return space, which is either 388 00:16:29,080 --> 00:16:31,440 looking at the log differences from day to day, 389 00:16:31,440 --> 00:16:33,340 or the percentage changes from day to day, 390 00:16:33,340 --> 00:16:36,030 or perhaps the absolute changes from day to day-- 391 00:16:36,030 --> 00:16:38,810 it varies from market to market. 
392 00:16:38,810 --> 00:16:41,910 Oh, look, now we're in familiar territory. 393 00:16:41,910 --> 00:16:44,260 So what I'm doing here-- and this 394 00:16:44,260 --> 00:16:46,350 is why I started out with a normal distribution 395 00:16:46,350 --> 00:16:48,640 because this thing is unimodal. 396 00:16:48,640 --> 00:16:50,310 It's more or less symmetric. 397 00:16:50,310 --> 00:16:50,960 Right? 398 00:16:50,960 --> 00:16:52,940 Now is it a perfect measure? 399 00:16:52,940 --> 00:16:55,146 No, because it's probably got fat tails. 400 00:16:55,146 --> 00:16:56,520 So it's a little bit like looking 401 00:16:56,520 --> 00:16:59,500 for the glasses you lost up on 67th Street down on 59th Street 402 00:16:59,500 --> 00:17:01,140 because there's more light there. 403 00:17:01,140 --> 00:17:03,410 But it's a starting point. 404 00:17:03,410 --> 00:17:09,369 So what I'm saying to you is once I difference it-- no, 405 00:17:09,369 --> 00:17:10,660 I won't talk about [INAUDIBLE]. 406 00:17:10,660 --> 00:17:12,099 Once I difference the time series, 407 00:17:12,099 --> 00:17:14,682 once I take the time series and look at the percentage changes, 408 00:17:14,682 --> 00:17:17,140 and I look at the frequency distribution of those changes, 409 00:17:17,140 --> 00:17:20,109 I get this, which is far more amenable. 410 00:17:20,109 --> 00:17:22,140 And I can draw inference from that. 411 00:17:22,140 --> 00:17:24,950 I can say, ah, now if this thing is normal, 412 00:17:24,950 --> 00:17:28,630 then I know that x% of my observations 413 00:17:28,630 --> 00:17:30,410 will take place over here. 414 00:17:30,410 --> 00:17:33,280 Now I can start drawing inferences. 415 00:17:33,280 --> 00:17:35,310 And a thing to keep in mind here, 416 00:17:35,310 --> 00:17:39,180 one thing we do constantly in statistics 417 00:17:39,180 --> 00:17:43,900 is we do parameter estimates.
418 00:17:43,900 --> 00:17:46,410 And remember, every time you estimate something 419 00:17:46,410 --> 00:17:48,480 you estimate it with error. 420 00:17:48,480 --> 00:17:50,610 I think that may be the single most important thing 421 00:17:50,610 --> 00:17:52,951 I learned when I got my statistics degree. 422 00:17:52,951 --> 00:17:54,950 Everything you estimate you estimate with error. 423 00:17:54,950 --> 00:17:57,060 People do means, they say, oh, it's x. 424 00:17:57,060 --> 00:17:59,712 No, that's the average and that's an unbiased estimator, 425 00:17:59,712 --> 00:18:01,670 but guess what, there's a huge amount of noise. 426 00:18:01,670 --> 00:18:03,045 And there's a certain probability 427 00:18:03,045 --> 00:18:04,400 that you're wrong by x%. 428 00:18:04,400 --> 00:18:08,110 So every time we come up with a number, when somebody tells me 429 00:18:08,110 --> 00:18:11,300 the risk is 10, that means it's probably not 10,000, 430 00:18:11,300 --> 00:18:13,532 it's probably not zero. 431 00:18:13,532 --> 00:18:14,490 Just keep that in mind. 432 00:18:14,490 --> 00:18:17,200 Just sort of throw that in on the side for nothing. 433 00:18:17,200 --> 00:18:20,250 All right, so when I take the returns of this same time 434 00:18:20,250 --> 00:18:22,980 series, I get something that's unimodal, symmetric, 435 00:18:22,980 --> 00:18:24,560 may or may not have fat tails. 436 00:18:24,560 --> 00:18:26,750 That has important implications for whether or not 437 00:18:26,750 --> 00:18:28,550 my normal distribution underestimates 438 00:18:28,550 --> 00:18:30,300 the amount of risk I'm taking. 439 00:18:30,300 --> 00:18:32,990 Everybody with me on that more or less? 440 00:18:32,990 --> 00:18:33,744 Questions? 441 00:18:33,744 --> 00:18:34,660 Now would be the time. 442 00:18:38,000 --> 00:18:41,160 Good enough? 443 00:18:41,160 --> 00:18:42,660 He's lived this. 444 00:18:42,660 --> 00:18:44,170 All right.
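[Editorial sketch] The move from price levels into "return space" described in the lecture (log differences, percentage changes, or absolute changes from day to day) can be sketched as follows. The price series is invented for illustration, and the function names are hypothetical. The last lines attach a standard error to the mean return, as one simple way to see the point that every estimate comes with error.

```python
import math
import statistics

# Hypothetical daily closing levels (made-up numbers, not lecture data).
levels = [100.0, 101.0, 99.5, 100.5, 102.0, 101.0]

def absolute_changes(series):
    """Day-to-day differences in level."""
    return [b - a for a, b in zip(series, series[1:])]

def percentage_changes(series):
    """Day-to-day changes as a fraction of the previous level."""
    return [(b - a) / a for a, b in zip(series, series[1:])]

def log_differences(series):
    """Day-to-day differences of the natural log of the level."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]

diffs = absolute_changes(levels)
pct = percentage_changes(levels)
logs = log_differences(levels)

# The mean return is itself an estimate; its standard error gauges the noise.
mean_pct = statistics.mean(pct)
std_err = statistics.stdev(pct) / math.sqrt(len(pct))

print("absolute changes:  ", [round(d, 4) for d in diffs])
print("percentage changes:", [round(p, 4) for p in pct])
print("log differences:   ", [round(x, 4) for x in logs])
print(f"mean daily return = {mean_pct:.4%} +/- {std_err:.4%} (one standard error)")
```

Log differences add up across days to the log of the total price ratio, which is one reason they are a common choice; which of the three conventions is used varies by market, as the speaker notes.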
445 00:18:44,170 --> 00:18:47,610 So once I have my time series of returns, which I just 446 00:18:47,610 --> 00:18:50,170 plotted there, I can gauge their dispersion 447 00:18:50,170 --> 00:18:52,382 with this measure called variance. 448 00:18:52,382 --> 00:18:53,715 And you guys probably know this. 449 00:19:01,390 --> 00:19:07,360 Variance, the expected value of x_i minus x bar-- I 450 00:19:07,360 --> 00:19:10,960 love these thick chalks-- squared. 451 00:19:10,960 --> 00:19:19,640 And it's the sum of x_i minus x bar squared over n minus 1. 452 00:19:19,640 --> 00:19:22,680 It's a measure of dispersion. 453 00:19:22,680 --> 00:19:25,940 Variance has its-- Now, I should say 454 00:19:25,940 --> 00:19:31,840 that this is sigma squared hat. 455 00:19:31,840 --> 00:19:32,340 Right? 456 00:19:32,340 --> 00:19:33,660 Estimate-- parameter estimate. 457 00:19:33,660 --> 00:19:35,190 Parameter. 458 00:19:35,190 --> 00:19:37,600 Parameter estimate. 459 00:19:37,600 --> 00:19:41,990 This is measured with error. 460 00:19:41,990 --> 00:19:45,616 Anybody here know what the distribution of this is? 461 00:19:45,616 --> 00:19:46,750 Anyone? 462 00:19:46,750 --> 00:19:49,270 $5. 463 00:19:49,270 --> 00:19:50,530 Close. 464 00:19:50,530 --> 00:19:51,810 n chi-squared. 465 00:19:51,810 --> 00:19:52,550 Worth $2. 466 00:19:52,550 --> 00:19:56,010 Talk to me after class. 467 00:19:56,010 --> 00:19:57,487 It's a chi-squared distribution. 468 00:19:57,487 --> 00:19:58,320 What does that mean? 469 00:19:58,320 --> 00:20:04,177 That means that we know it can't be 0 or less than 0. 470 00:20:04,177 --> 00:20:06,510 If you figure out a way to get variances less than zero, 471 00:20:06,510 --> 00:20:08,750 let's talk. 472 00:20:08,750 --> 00:20:10,810 And it's got a long right tail, but that's 473 00:20:10,810 --> 00:20:12,020 because this is squared. 474 00:20:12,020 --> 00:20:14,660 [INAUDIBLE] one point can move it up. 
475 00:20:14,660 --> 00:20:18,100 Anyway, once I have my returns, I 476 00:20:18,100 --> 00:20:20,960 have a measure of the dispersion of these returns called 477 00:20:20,960 --> 00:20:22,290 variance. 478 00:20:22,290 --> 00:20:24,270 I take the square root of the variance, 479 00:20:24,270 --> 00:20:28,390 which is the standard deviation, or the volatility. 480 00:20:28,390 --> 00:20:29,890 When I'm doing it with a data set, 481 00:20:29,890 --> 00:20:32,220 I usually refer to it as the standard deviation. 482 00:20:32,220 --> 00:20:36,120 When I'm referring to the standard deviation 483 00:20:36,120 --> 00:20:39,500 of the distribution, I usually call it the standard error. 484 00:20:39,500 --> 00:20:42,072 Is that a law or is that just common parlance? 485 00:20:42,072 --> 00:20:43,052 PROFESSOR: Both. 486 00:20:43,052 --> 00:20:45,218 The standard error is typically for something that's 487 00:20:45,218 --> 00:20:47,264 random, like an estimate. 488 00:20:47,264 --> 00:20:50,087 Whereas the standard deviation is more like for sample-- 489 00:20:50,087 --> 00:20:51,170 KENNETH ABBOTT: Empirical. 490 00:20:51,170 --> 00:20:54,909 See, it's important because when you first learn this, 491 00:20:54,909 --> 00:20:55,950 they don't tell you that. 492 00:20:55,950 --> 00:20:57,420 And they flip them back and forth. 493 00:20:57,420 --> 00:20:59,420 And then when you take the intermediate courses, 494 00:20:59,420 --> 00:21:01,360 they say, no, don't use standard deviation 495 00:21:01,360 --> 00:21:03,460 when you mean standard error. 496 00:21:03,460 --> 00:21:06,350 And you'll get points off on your exam for that, right? 497 00:21:06,350 --> 00:21:09,710 All right, so, the standard deviation 498 00:21:09,710 --> 00:21:11,790 is the square root of the variance, 499 00:21:11,790 --> 00:21:13,790 also called the volatility. 
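[Editor's sketch] The variance estimator written on the board, the sum of squared deviations from the sample mean divided by n minus 1, and the volatility as its square root, can be sketched in a few lines of Python. The function names and the toy return series are illustrative assumptions, not anything taken from the lecture's spreadsheets.

```python
import math


def sample_variance(returns):
    """Unbiased estimate: sum((x_i - x_bar)^2) / (n - 1)."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(returns) / n
    return sum((x - x_bar) ** 2 for x in returns) / (n - 1)


def volatility(returns):
    """Standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(returns))


# Hypothetical daily returns, expressed as fractions (0.01 means 1%).
toy_returns = [0.01, -0.02, 0.015, -0.005, 0.0]
toy_vol = volatility(toy_returns)
```

Because the result is itself an estimate, a different sample from the same process would give a somewhat different number, which is the point about estimating with error.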
500 00:21:13,790 --> 00:21:16,470 In a normal distribution, 1% of the observations 501 00:21:16,470 --> 00:21:18,775 is outside of 2.33 standard deviations. 502 00:21:21,330 --> 00:21:27,380 For 95%, it's out past 1.64, 1.645 standard deviations. 503 00:21:27,380 --> 00:21:28,880 Now you're saying, wait a minute, 504 00:21:28,880 --> 00:21:32,000 where did my 1.96 go that I learned as an undergrad. 505 00:21:32,000 --> 00:21:33,740 Two-sided. 506 00:21:33,740 --> 00:21:37,740 So if I go from the mean to 1.96 standard deviations 507 00:21:37,740 --> 00:21:40,570 on either side, that encompasses 95% 508 00:21:40,570 --> 00:21:43,000 of the total area of the integral from negative infinity 509 00:21:43,000 --> 00:21:43,880 to positive infinity. 510 00:21:43,880 --> 00:21:44,963 Everybody with me on that? 511 00:21:44,963 --> 00:21:46,560 Does that make sense? 512 00:21:46,560 --> 00:21:48,020 The two-sided versus one-sided. 513 00:21:48,020 --> 00:21:49,270 That's confused me. 514 00:21:49,270 --> 00:21:51,310 When I was your age, it confused me a lot. 515 00:21:51,310 --> 00:21:53,950 But I got there. 516 00:21:53,950 --> 00:21:56,960 All right so this is how we do it. 517 00:21:56,960 --> 00:22:00,092 Excel functions are VAR and-- you don't need to know that. 518 00:22:00,092 --> 00:22:02,300 All right, so in this case, I'm estimating the variance 519 00:22:02,300 --> 00:22:04,340 of this particular time series. 520 00:22:04,340 --> 00:22:06,920 I took the standard deviation by taking the square root 521 00:22:06,920 --> 00:22:09,040 of the variance. 522 00:22:09,040 --> 00:22:10,610 It's in percentages. 523 00:22:10,610 --> 00:22:12,990 When you do this, I tell you, it's like physics, 524 00:22:12,990 --> 00:22:16,737 your units will screw you up every time. 525 00:22:16,737 --> 00:22:17,570 What am I measuring? 526 00:22:17,570 --> 00:22:18,740 What are my units? 527 00:22:18,740 --> 00:22:20,130 I still make units mistakes.
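[Editor's sketch] The one-sided versus two-sided multipliers quoted above (2.33, 1.645 and 1.96) can be checked against the standard normal quantile function. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper names are assumptions made for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def one_sided_z(confidence):
    """Multiplier z such that a fraction `confidence` of the mass lies below z."""
    return STD_NORMAL.inv_cdf(confidence)


def two_sided_z(confidence):
    """Multiplier z such that `confidence` of the mass lies within +/- z of the mean."""
    return STD_NORMAL.inv_cdf(0.5 + confidence / 2.0)


z_99_one_sided = one_sided_z(0.99)  # roughly 2.33
z_95_one_sided = one_sided_z(0.95)  # roughly 1.645
z_95_two_sided = two_sided_z(0.95)  # roughly 1.96
```

The two-sided 95% multiplier is the one-sided 97.5% multiplier, because the remaining 5% is split evenly between the two tails.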
528 00:22:20,130 --> 00:22:20,890 I want you to know that. 529 00:22:20,890 --> 00:22:21,880 And I'm in this business 30 years. 530 00:22:21,880 --> 00:22:23,400 I still make units mistakes. 531 00:22:23,400 --> 00:22:24,950 Just like physics. 532 00:22:24,950 --> 00:22:26,740 I'm in percentage change space, so I 533 00:22:26,740 --> 00:22:28,573 want to talk in terms of percentage changes. 534 00:22:28,573 --> 00:22:31,820 The standard deviation is 1.8% of that time series 535 00:22:31,820 --> 00:22:33,090 I showed you. 536 00:22:33,090 --> 00:22:36,410 So 2.33 standard deviations times the standard deviation 537 00:22:36,410 --> 00:22:38,720 is about 4.2%. 538 00:22:38,720 --> 00:22:46,280 What that says, given this data set-- one time series-- I'm 539 00:22:46,280 --> 00:22:50,430 saying, I expect to lose, on any given day, 540 00:22:50,430 --> 00:22:54,540 if I have that position, 99% of the time I'm going 541 00:22:54,540 --> 00:22:58,240 to lose 4.2% of it or less. 542 00:23:00,840 --> 00:23:01,472 Very important. 543 00:23:01,472 --> 00:23:02,180 Think about that. 544 00:23:02,180 --> 00:23:04,750 Is that clear? 545 00:23:04,750 --> 00:23:06,960 That's how I get there. 546 00:23:06,960 --> 00:23:09,880 I'm making a statement about the probability of loss. 547 00:23:09,880 --> 00:23:12,490 I'm saying there's a 1% probability, 548 00:23:12,490 --> 00:23:19,910 for that particular time series-- which is-- all right? 549 00:23:19,910 --> 00:23:21,530 If this is my historical data set 550 00:23:21,530 --> 00:23:24,640 and it's my only historical data set, and I own this, 551 00:23:24,640 --> 00:23:28,812 tomorrow I may be 4.2% lighter than I was today 552 00:23:28,812 --> 00:23:30,520 because the market could move against me. 553 00:23:30,520 --> 00:23:34,930 And I'm 99% sure, if the future's like the past, 554 00:23:34,930 --> 00:23:40,240 that my loss tomorrow is going to be 4.2% or less. 555 00:23:40,240 --> 00:23:42,670 That's VaR. 
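[Editor's sketch] The single-asset calculation just described, 2.33 times a 1.8% daily standard deviation giving roughly 4.2%, can be written as a one-line function. This is a minimal sketch assuming normally distributed returns with zero mean and a one-day horizon; the function name and the 100-unit position are illustrative.

```python
def parametric_var(position_value, daily_vol, z=2.33):
    """One-day closed-form VaR in currency units: position * z * volatility.

    daily_vol is a fraction (0.018 means 1.8%); z=2.33 corresponds to a
    one-sided 99% confidence level under the normal assumption.
    """
    return position_value * z * daily_vol


# Numbers from the lecture: 1.8% daily volatility on a position of 100.
var_99 = parametric_var(100.0, 0.018)  # about 4.2 per 100 held
```

Keeping the volatility as a fraction and the position in currency units makes the result a currency amount, which is one way to avoid the units mistakes mentioned above.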
556 00:23:42,670 --> 00:23:45,860 Simplest case, assuming normal distribution, 557 00:23:45,860 --> 00:23:50,500 single asset, not fixed income. 558 00:23:50,500 --> 00:23:51,290 Yes, no? 559 00:23:51,290 --> 00:23:52,180 Questions, comments? 560 00:23:52,180 --> 00:23:58,779 AUDIENCE: Yes, [INAUDIBLE] positive and [INAUDIBLE]. 561 00:23:58,779 --> 00:23:59,820 KENNETH ABBOTT: Yes, yes. 562 00:23:59,820 --> 00:24:02,960 Assuming my distribution is symmetric. 563 00:24:02,960 --> 00:24:08,210 Now that's the right assumption to point out. 564 00:24:08,210 --> 00:24:11,110 Because in the real world, it may not be symmetric. 565 00:24:11,110 --> 00:24:13,580 And when we go into historical simulation, 566 00:24:13,580 --> 00:24:15,081 we use empirical distributions where 567 00:24:15,081 --> 00:24:17,163 we don't care if it's symmetric because we're only 568 00:24:17,163 --> 00:24:18,310 looking at the downside. 569 00:24:18,310 --> 00:24:21,390 And whether I'm long or short, I might 570 00:24:21,390 --> 00:24:24,337 care about the downside or the putative upside. 571 00:24:24,337 --> 00:24:26,170 Because I'm short, and I care about how much 572 00:24:26,170 --> 00:24:27,220 is going to move up. 573 00:24:27,220 --> 00:24:27,750 Make sense? 574 00:24:27,750 --> 00:24:29,720 That's the right question to ask. 575 00:24:29,720 --> 00:24:30,350 Yes? 576 00:24:30,350 --> 00:24:34,172 AUDIENCE: [INAUDIBLE] if you're doing it for upside as well? 577 00:24:34,172 --> 00:24:35,005 KENNETH ABBOTT: Yes. 578 00:24:35,005 --> 00:24:36,000 AUDIENCE: Could it just be the same thing? 579 00:24:36,000 --> 00:24:36,833 KENNETH ABBOTT: Yes. 580 00:24:36,833 --> 00:24:38,970 In fact, in this case, in what we're doing here 581 00:24:38,970 --> 00:24:41,380 of variance/covariance or closed form VaR, 582 00:24:41,380 --> 00:24:43,954 it's for long or short. 583 00:24:43,954 --> 00:24:45,870 But getting your signs right, I'm telling you, 584 00:24:45,870 --> 00:24:47,380 it's like physics. 
585 00:24:47,380 --> 00:24:49,940 I still make that mistake. 586 00:24:49,940 --> 00:24:50,688 Yes? 587 00:24:50,688 --> 00:24:53,178 AUDIENCE: [INAUDIBLE] symmetric. 588 00:24:53,178 --> 00:24:55,600 Do you guys still use this process to say, OK-- 589 00:24:55,600 --> 00:24:59,300 KENNETH ABBOTT: I use it all the time as a heuristic. 590 00:24:59,300 --> 00:25:00,290 All right? 591 00:25:00,290 --> 00:25:05,530 Because let's say I've got-- and that's a very good question-- 592 00:25:05,530 --> 00:25:09,260 let's say I've got five years worth of data 593 00:25:09,260 --> 00:25:13,330 and I don't have time to do an empirical estimate. 594 00:25:13,330 --> 00:25:17,090 It could be lopsided. 595 00:25:17,090 --> 00:25:19,910 If you tell me a two standard deviation move 596 00:25:19,910 --> 00:25:22,880 is x, that means something to me. 597 00:25:22,880 --> 00:25:25,230 Now, there's a problem with that. 598 00:25:25,230 --> 00:25:28,432 And the problem is that people extrapolate that. 599 00:25:28,432 --> 00:25:30,890 Sometimes people talk to me and, oh, it's an eight standard 600 00:25:30,890 --> 00:25:32,610 deviation move. 601 00:25:32,610 --> 00:25:34,769 Eight standard deviation moves don't happen. 602 00:25:34,769 --> 00:25:36,935 I don't think we've seen an eight standard deviation 603 00:25:36,935 --> 00:25:40,120 move in the Cenozoic era. 604 00:25:40,120 --> 00:25:43,020 It just doesn't happen. 605 00:25:43,020 --> 00:25:45,500 Three standard deviation-- you will see a three standard 606 00:25:45,500 --> 00:25:50,160 deviation move once every 10,000 observations. 607 00:25:50,160 --> 00:25:53,874 Now, I learned this the hard way by just, 608 00:25:53,874 --> 00:25:55,540 see how many times do I have to do this? 609 00:25:55,540 --> 00:25:57,789 And then I looked it up in the table, oh, I was right. 
610 00:26:00,400 --> 00:26:02,950 When we oversimplify, and start to talk about everything 611 00:26:02,950 --> 00:26:04,491 in terms of that normal distribution, 612 00:26:04,491 --> 00:26:07,100 we really just lose our grip on reality. 613 00:26:07,100 --> 00:26:10,020 But I use it as a heuristic all the time. 614 00:26:10,020 --> 00:26:12,457 I'll do it even now, and I know better. 615 00:26:12,457 --> 00:26:14,290 But I'll go, what's two standard deviations? 616 00:26:14,290 --> 00:26:15,664 What's three standard deviations? 617 00:26:15,664 --> 00:26:22,270 Because by and large-- and I still do this, I get my data 618 00:26:22,270 --> 00:26:25,560 and I line it up and I do frequency distributions. 619 00:26:25,560 --> 00:26:29,510 Hold on, I do this all the time with my data. 620 00:26:29,510 --> 00:26:30,660 Is it symmetric? 621 00:26:30,660 --> 00:26:32,050 Is it fat tailed? 622 00:26:32,050 --> 00:26:33,000 Is it unimodal? 623 00:26:33,000 --> 00:26:35,435 So that's a very good question. 624 00:26:35,435 --> 00:26:36,270 Any other questions? 625 00:26:36,270 --> 00:26:37,770 AUDIENCE: [INAUDIBLE] have we talked 626 00:26:37,770 --> 00:26:39,830 about the [? standard t ?] distribution? 627 00:26:39,830 --> 00:26:43,511 PROFESSOR: We introduced it in the last lecture. 628 00:26:43,511 --> 00:26:48,540 And the problem set this week does relate to that. 629 00:26:48,540 --> 00:26:51,180 KENNETH ABBOTT: All right, perfect lead-in. 630 00:26:51,180 --> 00:26:54,090 So the statement I made, it's 1% of the time 631 00:26:54,090 --> 00:26:59,340 I'd expect to lose more than 4.2 pesos on a 100 peso position. 632 00:26:59,340 --> 00:27:03,870 That's my inferential statement. 633 00:27:03,870 --> 00:27:08,020 In fact, over the same time period 634 00:27:08,020 --> 00:27:14,760 I lost 4.2% 1.5% of the time instead of 1% of the time.
635 00:27:14,760 --> 00:27:19,220 What that tells me, what that suggests to me, is my data set 636 00:27:19,220 --> 00:27:20,860 has fat tails. 637 00:27:20,860 --> 00:27:24,820 What that means is the likelihood of a loss-- 638 00:27:24,820 --> 00:27:27,340 a simple way of thinking about it [INAUDIBLE] care 639 00:27:27,340 --> 00:27:30,590 whether what that means in a metaphysical sense, 640 00:27:30,590 --> 00:27:32,410 a way to interpret it. 641 00:27:32,410 --> 00:27:35,290 The likelihood of a loss is greater 642 00:27:35,290 --> 00:27:38,491 than would be implied by the normal distribution. 643 00:27:38,491 --> 00:27:38,990 All right? 644 00:27:38,990 --> 00:27:41,330 So when you hear people say fat tails, 645 00:27:41,330 --> 00:27:43,870 generally, that's what they're talking about. 646 00:27:43,870 --> 00:27:46,370 There are different ways you could interpret that statement, 647 00:27:46,370 --> 00:27:49,860 but when somebody is talking about a financial time series, 648 00:27:49,860 --> 00:27:51,252 it has fat tails. 649 00:27:51,252 --> 00:27:52,960 Roughly 3/4 of your financial time series 650 00:27:52,960 --> 00:27:53,829 will have fat tails. 651 00:27:53,829 --> 00:27:55,620 They will also have time series properties, 652 00:27:55,620 --> 00:27:57,110 they won't be true random walks. 653 00:27:57,110 --> 00:27:59,080 True random walks says that I don't 654 00:27:59,080 --> 00:28:01,840 know whether it's going to go up or down based on the data 655 00:28:01,840 --> 00:28:03,150 I have. 656 00:28:03,150 --> 00:28:05,659 The time series has no memory. 657 00:28:05,659 --> 00:28:07,950 When we start introducing time series properties, which 658 00:28:07,950 --> 00:28:12,950 many financial time series have, then there's seasonality, 659 00:28:12,950 --> 00:28:15,130 there's mean reversion, there's all kinds 660 00:28:15,130 --> 00:28:17,130 of other stuff, other ways that we have to think 661 00:28:17,130 --> 00:28:19,250 about modeling the data. 
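[Editor's sketch] The fat-tail check described above compares how often the realized loss exceeded the 4.2% threshold with the 1% the normal model predicts. A minimal sketch of that count follows; the function name and the synthetic 200-day series are assumptions made purely for illustration.

```python
def exceedance_rate(returns, var_fraction):
    """Fraction of observations whose loss is worse than the VaR threshold.

    returns are fractions (a 5% loss is -0.05); var_fraction is the VaR
    expressed as a positive fraction (4.2% is 0.042).
    """
    if not returns:
        raise ValueError("need at least one observation")
    breaches = sum(1 for r in returns if r < -var_fraction)
    return breaches / len(returns)


# Synthetic history: 3 bad days out of 200, everything else flat.
history = [-0.05] * 3 + [0.0] * 197
observed = exceedance_rate(history, 0.042)  # 3 / 200 = 0.015
expected = 0.01                             # what the 99% normal VaR implies
looks_fat_tailed = observed > expected
```

An observed rate above the model's 1% is the sense of "fat tails" used here: losses beyond 2.33 standard deviations happen more often than the normal distribution implies. With a short history the observed rate is itself a noisy estimate.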
662 00:28:19,250 --> 00:28:20,764 Make sense? 663 00:28:20,764 --> 00:28:25,930 AUDIENCE: [INAUDIBLE] higher standard deviation than 664 00:28:25,930 --> 00:28:26,430 [INAUDIBLE]. 665 00:28:26,430 --> 00:28:28,410 KENNETH ABBOTT: Say it once again. 666 00:28:28,410 --> 00:28:30,605 AUDIENCE: Better yield, does it mean 667 00:28:30,605 --> 00:28:33,030 that we have a higher standard deviation than [INAUDIBLE]? 668 00:28:33,030 --> 00:28:33,570 KENNETH ABBOTT: No. 669 00:28:33,570 --> 00:28:35,611 The standard deviation is the standard deviation. 670 00:28:37,820 --> 00:28:42,100 No matter what I do, this is standard deviation, that's it. 671 00:28:42,100 --> 00:28:44,790 Don't have a higher standard deviation. 672 00:28:44,790 --> 00:28:46,730 But the likelihood of-- put it 673 00:28:46,730 --> 00:28:52,740 this way-- the likelihood of a move of 2.33 674 00:28:52,740 --> 00:28:54,860 standard deviations is more than 1%. 675 00:28:57,890 --> 00:28:59,670 That's the way I think of it. 676 00:28:59,670 --> 00:29:02,585 Make sense? 677 00:29:02,585 --> 00:29:05,091 AUDIENCE: Is there any way for you to [INAUDIBLE] to-- 678 00:29:05,091 --> 00:29:05,966 KENNETH ABBOTT: What? 679 00:29:05,966 --> 00:29:07,340 AUDIENCE: Sorry, is there any way 680 00:29:07,340 --> 00:29:09,690 to put into that graph what a fatter tail looks like? 681 00:29:09,690 --> 00:29:11,890 KENNETH ABBOTT: Oh, well, be patient. 682 00:29:11,890 --> 00:29:13,365 If we have time. 683 00:29:13,365 --> 00:29:14,740 In fact, we do that all the time. 684 00:29:14,740 --> 00:29:16,700 And one of our techniques doesn't care. 685 00:29:16,700 --> 00:29:18,290 It goes to the empirical distribution. 686 00:29:18,290 --> 00:29:20,620 So it captures the fat tails completely.
687 00:29:20,620 --> 00:29:25,210 In fact, the homework assignment which I usually 688 00:29:25,210 --> 00:29:27,010 precede this lecture by has people 689 00:29:27,010 --> 00:29:28,740 graphing all kinds of distributions 690 00:29:28,740 --> 00:29:31,384 to see what these things look like. 691 00:29:31,384 --> 00:29:32,550 We won't have time for that. 692 00:29:32,550 --> 00:29:35,130 But if you have questions, send them to me. 693 00:29:35,130 --> 00:29:37,520 I'll send you some stuff to read about this. 694 00:29:37,520 --> 00:29:39,228 All right, so now you know one asset VaR, 695 00:29:39,228 --> 00:29:41,240 now you're qualified to go work for a big bank. 696 00:29:41,240 --> 00:29:42,470 All right? 697 00:29:42,470 --> 00:29:44,530 Get your data, calculate returns. 698 00:29:44,530 --> 00:29:50,100 Now I usually put in step 2b, graph your data and look at it. 699 00:29:50,100 --> 00:29:50,630 All right? 700 00:29:50,630 --> 00:29:54,350 Because everybody's data has dirt in it. 701 00:29:54,350 --> 00:29:55,410 Don't trust anyone else. 702 00:29:55,410 --> 00:29:56,850 If you're going to get fired, get 703 00:29:56,850 --> 00:29:58,266 fired for being incompetent, don't 704 00:29:58,266 --> 00:30:01,310 get fired for using someone else's bad data. 705 00:30:01,310 --> 00:30:02,260 Don't trust anyone. 706 00:30:02,260 --> 00:30:05,100 My mother gives me data, Mom, I'm graphing it. 707 00:30:05,100 --> 00:30:09,290 Because I think you let some poop slip into my data. 708 00:30:09,290 --> 00:30:15,430 Mother Teresa could come to me with a thumb drive: "Ken, S&P 709 00:30:15,430 --> 00:30:17,316 500." 710 00:30:17,316 --> 00:30:18,190 Sorry, Mother Teresa. 711 00:30:18,190 --> 00:30:20,500 I'm graphing it before I use it. 712 00:30:20,500 --> 00:30:21,070 All right? 713 00:30:21,070 --> 00:30:23,236 So I don't want to say that this is usually in here. 714 00:30:23,236 --> 00:30:24,855 We do extensive error testing. 
715 00:30:24,855 --> 00:30:27,670 Because there could be bad data, there could be missing data. 716 00:30:27,670 --> 00:30:31,210 And missing data is a whole other lecture that I give. 717 00:30:31,210 --> 00:30:32,723 You might be shocked at [INAUDIBLE]. 718 00:30:36,590 --> 00:30:40,590 So for one asset VaR, get my data, create my return series. 719 00:30:40,590 --> 00:30:42,010 Percentage changes, log changes. 720 00:30:42,010 --> 00:30:43,720 Sometimes that's absolute differences. 721 00:30:43,720 --> 00:30:46,053 Take the variance, take the square root of the variance, 722 00:30:46,053 --> 00:30:47,100 multiply by 2.33. 723 00:30:47,100 --> 00:30:47,800 Done and dusted. 724 00:30:47,800 --> 00:30:49,990 Go home, take your shoes off, relax. 725 00:30:53,730 --> 00:30:54,640 OK. 726 00:30:54,640 --> 00:30:57,290 Percentage changes versus log changes. 727 00:30:57,290 --> 00:31:00,530 For all intents and purposes, it doesn't really matter 728 00:31:00,530 --> 00:31:04,161 and I will often use one or the other. 729 00:31:04,161 --> 00:31:19,030 The way I think about this-- all right, 730 00:31:19,030 --> 00:31:21,110 there'll be a little bit of bias at the ends. 731 00:31:21,110 --> 00:31:23,240 But for the overwhelming bulk of the observations 732 00:31:23,240 --> 00:31:25,800 whether you use percentage changes or log changes 733 00:31:25,800 --> 00:31:27,200 doesn't matter. 734 00:31:27,200 --> 00:31:30,250 Generally, even though I know the data is closer 735 00:31:30,250 --> 00:31:32,850 to log-normally distributed than normally distributed, 736 00:31:32,850 --> 00:31:36,082 I'll use percentage changes just because it's easier. 737 00:31:36,082 --> 00:31:37,790 Why would we use log-normal distribution? 738 00:31:37,790 --> 00:31:42,587 Well, when we're doing simulation, 739 00:31:42,587 --> 00:31:44,920 the log-normal distribution has this very nifty property 740 00:31:44,920 --> 00:31:48,430 of keeping your yields from going negative. 
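[Editor's sketch] The claim that percentage changes and log changes are nearly interchangeable for the bulk of observations, with some divergence at the extremes, can be illustrated directly. The function names and the toy prices below are assumptions for illustration only.

```python
import math


def pct_changes(prices):
    """Simple returns: p_t / p_{t-1} - 1."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def log_changes(prices):
    """Log returns: ln(p_t / p_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


# A typical small daily move: the two measures almost coincide.
small_gap = pct_changes([100.0, 101.0])[0] - log_changes([100.0, 101.0])[0]

# An extreme move: the two measures separate noticeably.
large_gap = pct_changes([100.0, 150.0])[0] - log_changes([100.0, 150.0])[0]
```

For a 1% move the gap is about half a basis point, while for a 50% move it is several percentage points, which is where the bias at the ends comes from.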
741 00:31:48,430 --> 00:31:50,440 But, even that-- I can call that into 742 00:31:50,440 --> 00:31:52,510 question because there are instances 743 00:31:52,510 --> 00:31:54,436 of yields going negative. 744 00:31:54,436 --> 00:31:55,670 It's happened. 745 00:31:55,670 --> 00:31:58,180 Doesn't happen a lot, but it happens. 746 00:31:58,180 --> 00:31:58,824 All right. 747 00:31:58,824 --> 00:32:00,240 So I talked about bad data, talked 748 00:32:00,240 --> 00:32:02,420 about one-sided versus two-sided. 749 00:32:02,420 --> 00:32:05,727 I'll talk about longs and shorts a little bit later when 750 00:32:05,727 --> 00:32:06,810 we're talking multi-asset. 751 00:32:11,040 --> 00:32:14,280 I'm going to cover a fixed income piece. 752 00:32:14,280 --> 00:32:17,410 We use this thing called a PV01 because what I measure in fixed 753 00:32:17,410 --> 00:32:18,730 income markets isn't a price. 754 00:32:18,730 --> 00:32:20,290 I usually measure a yield. 755 00:32:20,290 --> 00:32:23,592 I have to get from a change of yield to a change of price. 756 00:32:23,592 --> 00:32:25,050 Hmm, sounds like a Jacobian, right? 757 00:32:25,050 --> 00:32:27,990 With kind of a poor man's Jacobian. 758 00:32:27,990 --> 00:32:30,560 It's a measure that captures the fact 759 00:32:30,560 --> 00:32:36,340 that my price-yield relationship-- 760 00:32:36,340 --> 00:32:40,180 price, yield-- is non-linear. 761 00:32:40,180 --> 00:32:44,700 For any small approximation I look at the tangent. 762 00:32:44,700 --> 00:32:49,020 And I use my PV01 which has a similar notion to duration, 763 00:32:49,020 --> 00:32:51,390 but PV01 is a little more practical. 764 00:32:51,390 --> 00:32:53,019 The slope of that tells me how much 765 00:32:53,019 --> 00:32:55,060 my price will change for a given change of yield. 766 00:32:55,060 --> 00:32:55,690 See, there it is. 767 00:32:55,690 --> 00:32:57,280 You knew you were going to use the calculus, right?
768 00:32:57,280 --> 00:32:58,730 You're always using the calculus. 769 00:32:58,730 --> 00:33:01,280 You can't escape it. 770 00:33:01,280 --> 00:33:03,380 But the price-yield line is non-linear. 771 00:33:03,380 --> 00:33:05,710 But for all intents and purposes, what I'm doing is 772 00:33:05,710 --> 00:33:07,940 I'm shifting the price-yield relationship-- 773 00:33:07,940 --> 00:33:10,120 I'm shifting my yield change into price change 774 00:33:10,120 --> 00:33:14,560 by multiplying my yield change by my PV01 which 775 00:33:14,560 --> 00:33:21,131 is my price sensitivity to 1/100th percent move in yields. 776 00:33:21,131 --> 00:33:22,380 Think about that for a second. 777 00:33:22,380 --> 00:33:23,880 We don't have time to-- I would love 778 00:33:23,880 --> 00:33:26,270 to spend an hour on this, and about trading strategies, 779 00:33:26,270 --> 00:33:28,524 and about bull steepeners and bear steepeners 780 00:33:28,524 --> 00:33:30,690 and barbell trades, but we don't have time for that. 781 00:33:30,690 --> 00:33:35,090 Suffice to say if I'm measuring yields the thing 782 00:33:35,090 --> 00:33:41,960 is going to trade as a 789 or a 622 or a 401 yield. 783 00:33:41,960 --> 00:33:43,880 How do I get that in the change in price? 784 00:33:43,880 --> 00:33:46,046 Because I can't tell my boss, hey, I had a good day. 785 00:33:46,046 --> 00:33:48,360 I bought it at 402 and sold it at 401. 786 00:33:48,360 --> 00:33:50,170 No, how much money did you make? 787 00:33:50,170 --> 00:33:51,920 Yield to coffee break yield to lunch time, 788 00:33:51,920 --> 00:33:53,880 yield to go home at the end of day. 789 00:33:53,880 --> 00:33:56,710 How do I get from change in yield to change in price? 790 00:33:56,710 --> 00:33:58,330 Usually PV01. 791 00:33:58,330 --> 00:34:00,446 I could use duration. 
792 00:34:00,446 --> 00:34:02,820 Bond traders who think in terms of yield to coffee break, 793 00:34:02,820 --> 00:34:04,515 yield to lunch time, yield to go home at the end of the day 794 00:34:04,515 --> 00:34:05,525 typically think in terms of PV01. 795 00:34:05,525 --> 00:34:07,130 Do you agree with that statement? 796 00:34:07,130 --> 00:34:08,420 AUDIENCE: [INAUDIBLE] 797 00:34:08,420 --> 00:34:10,503 KENNETH ABBOTT: How often on the fixed income desk 798 00:34:10,503 --> 00:34:11,957 did you use duration measures? 799 00:34:11,957 --> 00:34:15,880 AUDIENCE: Well, actually, [INAUDIBLE]. 800 00:34:15,880 --> 00:34:18,070 KENNETH ABBOTT: Because of the investor horizon? 801 00:34:18,070 --> 00:34:20,761 OK, the insurance companies. 802 00:34:20,761 --> 00:34:23,219 Very important point I want to reach here as a quick aside. 803 00:34:23,219 --> 00:34:25,389 You're going to hear this notion of PV01, 804 00:34:25,389 --> 00:34:28,239 which is also called PVBP or DV01. 805 00:34:28,239 --> 00:34:31,460 That's the price sensitivity to a one basis point move. 806 00:34:31,460 --> 00:34:35,840 One basis point is 1/100th of a percent in yield. 807 00:34:35,840 --> 00:34:41,360 Duration is the half life, essentially, of my cash flow. 808 00:34:41,360 --> 00:34:44,870 What's the weighted expected time to all my cash flows? 809 00:34:44,870 --> 00:34:51,070 If my duration is 7.9 years, my PV01 is probably about $790 810 00:34:51,070 --> 00:34:53,500 per million. 811 00:34:53,500 --> 00:34:56,069 In terms of significant digits, they're roughly the same 812 00:34:56,069 --> 00:34:58,610 but they have different meanings and the units are different. 813 00:34:58,610 --> 00:35:03,530 Duration is measured in years, PV01 is measured in dollars. 814 00:35:03,530 --> 00:35:06,280 In bond space I typically think in PV01.
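[Editor's sketch] The rule of thumb quoted above, a 7.9-year duration corresponding to about $790 of PV01 per million, follows from the first-order (tangent) approximation of the price-yield curve. The sketch below assumes the duration figure is a modified duration and that the position is valued at $1 million; both are assumptions, and the function names are illustrative.

```python
BASIS_POINT = 0.0001  # one basis point is 1/100th of a percent


def pv01_from_duration(modified_duration, market_value):
    """Approximate currency change in value for a one basis point yield move."""
    return modified_duration * market_value * BASIS_POINT


def price_change(pv01, yield_change_bp):
    """First-order P&L for a long position: value falls when yields rise."""
    return -pv01 * yield_change_bp


pv01 = pv01_from_duration(7.9, 1_000_000.0)  # about 790 per million
pnl = price_change(pv01, -1.0)               # yield falls 1 bp, e.g. 4.02% to 4.01%
```

Under these assumptions, buying at a 4.02% yield and selling at 4.01% on $1 million earns roughly $790, which is the change-in-yield to change-in-price conversion the trader needs. The approximation ignores convexity, so it degrades for large yield moves.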
815 00:35:06,280 --> 00:35:08,580 If I'm selling to long term investors 816 00:35:08,580 --> 00:35:11,530 they have particular demands because they've got cash flow 817 00:35:11,530 --> 00:35:13,310 payments they have to hedge. 818 00:35:13,310 --> 00:35:15,740 So they may think of it in terms of duration. 819 00:35:15,740 --> 00:35:20,560 For our purposes, we're talking DV01 or PV01 or PVBP, 820 00:35:20,560 --> 00:35:23,050 those three terms more or less equal. 821 00:35:23,050 --> 00:35:23,710 Make sense? 822 00:35:23,710 --> 00:35:24,553 Yes? 823 00:35:24,553 --> 00:35:27,018 AUDIENCE: [INAUDIBLE] in terms of [INAUDIBLE] versus 824 00:35:27,018 --> 00:35:28,259 [INAUDIBLE]? 825 00:35:28,259 --> 00:35:29,300 KENNETH ABBOTT: We could. 826 00:35:29,300 --> 00:35:32,850 In some instances, in some areas and options 827 00:35:32,850 --> 00:35:34,800 we might look at an overall 1% move. 828 00:35:34,800 --> 00:35:38,000 But we have to look at what trades in the market. 829 00:35:38,000 --> 00:35:40,970 What trades in the market is the yield. 830 00:35:40,970 --> 00:35:42,710 When we quote the yield, I'm going 831 00:35:42,710 --> 00:35:45,380 to quote it going from 702 to 701. 832 00:35:45,380 --> 00:35:47,840 I'm not going to have the calculator handy to say, 833 00:35:47,840 --> 00:35:49,410 a 702 move to a 701. 834 00:35:49,410 --> 00:35:52,990 What's 702 minus 701 divided by 702? 835 00:35:52,990 --> 00:35:54,000 Make sense? 836 00:35:54,000 --> 00:35:56,159 It's the path of least resistance. 837 00:35:56,159 --> 00:35:58,450 What's the difference between a bond and a bond trader? 838 00:35:58,450 --> 00:36:00,116 A bond matures. 839 00:36:00,116 --> 00:36:01,892 A little fixed income humor for you. 840 00:36:01,892 --> 00:36:02,850 Apparently very little. 841 00:36:06,659 --> 00:36:08,450 I don't want to spend too much time on this 842 00:36:08,450 --> 00:36:11,570 because we just don't have the time. 843 00:36:11,570 --> 00:36:12,930 I provide an example here. 
844 00:36:12,930 --> 00:36:15,141 If you guys want examples, contact me. 845 00:36:15,141 --> 00:36:17,390 I'll send you the spreadsheets I use for other classes 846 00:36:17,390 --> 00:36:19,056 if you just want to play around with it. 847 00:36:21,320 --> 00:36:26,360 When I talk about PV01, when I talk about yields, 848 00:36:26,360 --> 00:36:28,509 I usually have some kind of risk-free rate. 849 00:36:28,509 --> 00:36:30,800 Although this whole notion of the risk-free rate, which 850 00:36:30,800 --> 00:36:33,680 is-- so much of modern finance is predicated 851 00:36:33,680 --> 00:36:35,800 on this assumption that there is a risk-free rate, 852 00:36:35,800 --> 00:36:37,460 which used to be considered the US treasury. 853 00:36:37,460 --> 00:36:38,918 It used to be considered risk-free. 854 00:36:38,918 --> 00:36:43,390 Well, there's a credit spread out there for US Treasury. 855 00:36:43,390 --> 00:36:45,750 I don't mean to throw a monkey wrench into the works. 856 00:36:45,750 --> 00:36:47,739 But there's no such thing. 857 00:36:47,739 --> 00:36:50,030 I'm not going to question 75 years of academic finance. 858 00:36:50,030 --> 00:36:52,210 But it's troublesome. 859 00:36:52,210 --> 00:36:55,440 Just like when I was taking economics 30 years ago, 860 00:36:55,440 --> 00:36:57,850 inflation just mucked with everything. 861 00:36:57,850 --> 00:36:59,555 All of the models fell apart. 862 00:36:59,555 --> 00:37:01,260 There were appendices to every chapter 863 00:37:01,260 --> 00:37:04,610 on how you have to change this model to address inflation. 864 00:37:04,610 --> 00:37:07,500 And then inflation went away and everything was better. 865 00:37:07,500 --> 00:37:09,140 But this may not go away. 866 00:37:09,140 --> 00:37:10,650 I've got two components here. 867 00:37:10,650 --> 00:37:17,530 If the yield is 6%, I might have a 450 treasury rate and 150 868 00:37:17,530 --> 00:37:19,820 basis point credit spread. 
869 00:37:19,820 --> 00:37:22,430 The credit spread reflects the probability of default. 870 00:37:22,430 --> 00:37:24,770 And I don't want to get into measures of risk neutrality 871 00:37:24,770 --> 00:37:25,830 here. 872 00:37:25,830 --> 00:37:29,840 But if I'm an issuer and I have a chance of default, 873 00:37:29,840 --> 00:37:33,010 I have to pay my investors more. 874 00:37:33,010 --> 00:37:40,767 Usually when we measure sensitivity 875 00:37:40,767 --> 00:37:42,600 we talk about that credit spread sensitivity 876 00:37:42,600 --> 00:37:45,340 and the risk-free sensitivity. 877 00:37:45,340 --> 00:37:47,492 We say, well, how could they possibly be different? 878 00:37:47,492 --> 00:37:49,200 And I don't want to get into detail here, 879 00:37:49,200 --> 00:37:52,380 but the notion is, when credit spreads start getting high, 880 00:37:52,380 --> 00:37:56,964 it implies a higher probability of default. 881 00:37:56,964 --> 00:37:59,380 You have to think about credit spread sensitivity a little 882 00:37:59,380 --> 00:37:59,750 differently. 883 00:37:59,750 --> 00:38:01,541 Because when you get to 1,000 basis points, 884 00:38:01,541 --> 00:38:03,450 1,500 basis points credit spread, 885 00:38:03,450 --> 00:38:04,920 it's a high probability of default. 886 00:38:04,920 --> 00:38:07,230 And your credit models will think differently. 887 00:38:07,230 --> 00:38:09,270 Your credit models will say, ah, that means 888 00:38:09,270 --> 00:38:11,260 I'm not going to get my next three payments. 889 00:38:11,260 --> 00:38:13,690 There's an expected, there's a probability of default, 890 00:38:13,690 --> 00:38:16,400 there's a loss given default, and there's recovery. 891 00:38:16,400 --> 00:38:18,660 A bunch of other stochastic measures come into play. 892 00:38:18,660 --> 00:38:20,180 I don't want to spend any more time on it because it's just 893 00:38:20,180 --> 00:38:21,820 going to confuse you now. 
894 00:38:21,820 --> 00:38:25,720 Suffice to say we have these yields and yields 895 00:38:25,720 --> 00:38:29,390 are composed of risk-free rates and credit spreads. 896 00:38:29,390 --> 00:38:31,597 And I apologize for rushing through that, 897 00:38:31,597 --> 00:38:32,930 but we don't have time to do it. 898 00:38:36,290 --> 00:38:38,450 Typically you have more than one asset. 899 00:38:38,450 --> 00:38:43,030 So in this framework where I take 2.33 standard deviations 900 00:38:43,030 --> 00:38:46,370 times my dollar investment, or my renminbi investment 901 00:38:46,370 --> 00:38:47,705 or my sterling investment. 902 00:38:50,520 --> 00:38:52,060 That example was with one asset. 903 00:38:52,060 --> 00:38:55,010 If I want to expand this, I can expand 904 00:38:55,010 --> 00:39:01,276 this using this notion of covariance and correlation. 905 00:39:01,276 --> 00:39:03,650 You guys covered correlation and covariance at some point 906 00:39:03,650 --> 00:39:04,940 in your careers? 907 00:39:04,940 --> 00:39:05,780 Yes, no? 908 00:39:05,780 --> 00:39:08,460 All right? 909 00:39:08,460 --> 00:39:12,820 Both of them measure the way one asset moves vis-à-vis another 910 00:39:12,820 --> 00:39:14,310 asset. 911 00:39:14,310 --> 00:39:19,040 Correlation is scaled between negative 1 and positive 1. 912 00:39:19,040 --> 00:39:23,280 So I think of correlation as an index of linearity. 913 00:39:23,280 --> 00:39:24,650 Covariance is not scaled. 914 00:39:24,650 --> 00:39:26,290 I'll give you an example of the difference between covariance 915 00:39:26,290 --> 00:39:27,240 and correlation. 916 00:39:27,240 --> 00:39:31,430 What if I have 50 years of data on crop yields 917 00:39:31,430 --> 00:39:35,900 and that same 50 years of data on tons of fertilizer used? 918 00:39:35,900 --> 00:39:38,550 I would expect a positive correlation 919 00:39:38,550 --> 00:39:41,709 between tons of fertilizer used and crop yields. 
920 00:39:41,709 --> 00:39:43,750 So the correlation would exist between negative 1 921 00:39:43,750 --> 00:39:45,360 and positive 1. 922 00:39:45,360 --> 00:39:47,072 The covariance could be any number, 923 00:39:47,072 --> 00:39:48,780 and that covariance will change depending 924 00:39:48,780 --> 00:39:50,670 on whether I measure my fertilizer 925 00:39:50,670 --> 00:39:56,950 in tons, or in pounds, or in ounces, or in kilos. 926 00:39:56,950 --> 00:39:59,500 The correlation will always be exactly the same. 927 00:39:59,500 --> 00:40:03,630 The linear relationship is captured by the correlation. 928 00:40:03,630 --> 00:40:08,440 But the units-- in covariance, the units count. 929 00:40:08,440 --> 00:40:14,100 If I have covariance-- here it is. 930 00:40:23,050 --> 00:40:25,450 Covariance matrices are symmetric. 931 00:40:25,450 --> 00:40:28,940 They have the variance along the diagonal. 932 00:40:28,940 --> 00:40:31,500 And the covariance is on the off-diagonal. 933 00:40:31,500 --> 00:40:33,980 Which is to say that the variance is the covariance 934 00:40:33,980 --> 00:40:36,510 of an item with itself. 935 00:40:36,510 --> 00:40:39,750 The correlation matrix, also symmetric, 936 00:40:39,750 --> 00:40:44,800 is the same thing scaled, with correlations, 937 00:40:44,800 --> 00:40:47,900 where the diagonal is 1.0. 938 00:40:47,900 --> 00:40:51,430 If I have covariance-- because correlation 939 00:40:51,430 --> 00:41:01,560 is covariance-- covariance divided 940 00:41:01,560 --> 00:41:04,100 by the product of the standard deviations 941 00:41:04,100 --> 00:41:07,650 gets me-- sorry-- correlation hat. 942 00:41:12,280 --> 00:41:15,410 This is like the apostrophe in French. 943 00:41:15,410 --> 00:41:16,980 You forget it all the time. 944 00:41:16,980 --> 00:41:20,060 But the one time you really need it, you won't do it 945 00:41:20,060 --> 00:41:22,820 and you'll be in trouble. 
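To make the units point concrete, here is a minimal sketch in Python. The fertilizer and crop numbers are made up for illustration, and the helper names (`covariance`, `correlation`) are assumptions, not anything taken from the lecture's spreadsheets. It converts fertilizer from tons to pounds and shows that the covariance moves with the units, while the correlation (covariance divided by the product of the standard deviations) stays put.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance with the usual n - 1 denominator."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: fertilizer used (tons) and crop yield (bushels).
fertilizer_tons = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
crop_yield = [40.0, 46.0, 49.0, 57.0, 60.0, 66.0]

# The same fertilizer series in pounds (2,000 pounds per short ton).
fertilizer_pounds = [t * 2000.0 for t in fertilizer_tons]

cov_tons = covariance(fertilizer_tons, crop_yield)
cov_pounds = covariance(fertilizer_pounds, crop_yield)
corr_tons = correlation(fertilizer_tons, crop_yield)
corr_pounds = correlation(fertilizer_pounds, crop_yield)

print(cov_tons, cov_pounds)    # covariance scales with the unit of measure
print(corr_tons, corr_pounds)  # correlation is the same either way
```

Going the other direction, from correlation back to covariance, needs the two standard deviations as extra inputs, which is the mid-term point made next.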
946 00:41:22,820 --> 00:41:25,650 If you have the covariances, you can get to the correlations. 947 00:41:25,650 --> 00:41:27,090 If you have the correlations, you 948 00:41:27,090 --> 00:41:29,700 can't get to the covariances unless you know the variances. 949 00:41:29,700 --> 00:41:31,750 That's a classic mid-term question. 950 00:41:31,750 --> 00:41:37,300 I give that almost-- not every year, maybe every other year. 951 00:41:37,300 --> 00:41:39,350 Don't have time to spend much more time on it. 952 00:41:39,350 --> 00:41:41,500 Suffice to say this measure of covariance 953 00:41:41,500 --> 00:41:43,870 says when x is a certain distance from its mean, 954 00:41:43,870 --> 00:41:47,131 how far is y from its mean and in what direction? 955 00:41:47,131 --> 00:41:47,630 Yes? 956 00:41:50,520 --> 00:41:52,320 Now this is just a little empirical stuff 957 00:41:52,320 --> 00:41:54,400 because I'm not as clever as you guys. 958 00:41:54,400 --> 00:41:56,530 And I don't trust anyone. 959 00:41:56,530 --> 00:42:00,550 I read it in the textbook, I don't trust anyone. 960 00:42:00,550 --> 00:42:03,707 a, b, here's a plus b. 961 00:42:03,707 --> 00:42:06,040 Variance of a plus b is variance of a plus variance of b 962 00:42:06,040 --> 00:42:07,340 plus 2 times covariance a b. 963 00:42:07,340 --> 00:42:10,180 It's not just a good idea, it's the law. 964 00:42:10,180 --> 00:42:12,350 I saw it in a thousand statistics textbooks, 965 00:42:12,350 --> 00:42:13,762 I tested it anyway. 966 00:42:13,762 --> 00:42:15,220 Because if I want to get fired, I'm 967 00:42:15,220 --> 00:42:17,170 going to get fired for making my own mistake, 968 00:42:17,170 --> 00:42:18,590 not making someone else's mistake. 969 00:42:18,590 --> 00:42:20,280 I do this all the time. 970 00:42:20,280 --> 00:42:22,610 And I just prove it empirically here. 971 00:42:22,610 --> 00:42:26,752 The proof of which will be left to the reader as an exercise. 
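The empirical check of that identity could look something like the sketch below. This is not the spreadsheet used in the lecture: the two series are simulated with an arbitrary seed, and `b` is deliberately built to co-move with `a` so the covariance term is not negligible.

```python
import random

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance; covariance(xs, xs) is the sample variance."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)

random.seed(0)  # arbitrary seed, only so the run is repeatable
a = [random.gauss(0.0, 1.0) for _ in range(1000)]
b = [0.5 * x + random.gauss(0.0, 1.0) for x in a]  # correlated with a by construction
a_plus_b = [x + y for x, y in zip(a, b)]

lhs = covariance(a_plus_b, a_plus_b)                                 # Var(a + b)
rhs = covariance(a, a) + covariance(b, b) + 2.0 * covariance(a, b)   # Var(a) + Var(b) + 2 Cov(a, b)

print(lhs, rhs)
```

The two numbers should agree up to floating-point rounding, because the identity holds for sample moments as well as population moments.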
972 00:42:26,752 --> 00:42:28,313 I hated when books said that. 973 00:42:28,313 --> 00:42:29,771 PROFESSOR: I actually kind of think 974 00:42:29,771 --> 00:42:31,385 that's a proven point, that you really 975 00:42:31,385 --> 00:42:34,719 should never trust output from computer programs or packages-- 976 00:42:34,719 --> 00:42:36,760 KENNETH ABBOTT: Or your mother, or Mother Teresa. 977 00:42:36,760 --> 00:42:38,218 PROFESSOR: It's good to check them. 978 00:42:38,218 --> 00:42:39,344 Check all the calculations. 979 00:42:39,344 --> 00:42:41,717 KENNETH ABBOTT: Mother Teresa will slip you some bad data 980 00:42:41,717 --> 00:42:42,335 if she can. 981 00:42:42,335 --> 00:42:43,550 I'm telling you, she will. 982 00:42:43,550 --> 00:42:45,420 She's tricky that way. 983 00:42:45,420 --> 00:42:47,370 Don't trust anyone. 984 00:42:47,370 --> 00:42:51,010 I've caught mistakes in software, all right? 985 00:42:51,010 --> 00:42:56,930 I had a programmer-- it's one of my favorite stories-- 986 00:42:56,930 --> 00:42:59,260 we're doing one of our first Monte Carlo simulations, 987 00:42:59,260 --> 00:43:00,990 and we're factoring a matrix. 988 00:43:00,990 --> 00:43:05,640 If we have time, we'll get-- so I factor a covariance matrix 989 00:43:05,640 --> 00:43:13,900 into E transpose lambda E. It's our friend the quadratic form. 990 00:43:13,900 --> 00:43:16,330 We're going to see this again. 991 00:43:16,330 --> 00:43:20,872 And this is a diagonal matrix of eigenvalues. 992 00:43:20,872 --> 00:43:22,330 And I take the square root of that. 993 00:43:22,330 --> 00:43:28,060 So I can say this is E transpose lambda to the 1/2 lambda 994 00:43:28,060 --> 00:43:30,860 to the 1/2 E. 995 00:43:30,860 --> 00:43:33,850 And so my programmer had gotten this, and I said, 996 00:43:33,850 --> 00:43:34,910 do me a favor. 997 00:43:34,910 --> 00:43:38,040 I said, take this, and transpose and multiply by itself. 
998 00:43:38,040 --> 00:43:39,730 So take the square root and multiply it 999 00:43:39,730 --> 00:43:43,650 by the other square root, and show me that you get this. 1000 00:43:43,650 --> 00:43:44,410 Just show me. 1001 00:43:44,410 --> 00:43:45,130 He said I got it. 1002 00:43:45,130 --> 00:43:45,880 I said you got it? 1003 00:43:45,880 --> 00:43:48,450 He said out to 16 decimals. 1004 00:43:48,450 --> 00:43:50,510 I said stop. 1005 00:43:50,510 --> 00:43:56,075 On my block, the square root of 2 times the square root of 2 1006 00:43:56,075 --> 00:43:58,820 equals 2.0. 1007 00:43:58,820 --> 00:43:59,860 All right? 1008 00:43:59,860 --> 00:44:04,470 2.0000000, what do you mean out to 16 decimal places? 1009 00:44:04,470 --> 00:44:06,980 What planet are you on? 1010 00:44:06,980 --> 00:44:09,640 And I scratched the surface, and I dug, 1011 00:44:09,640 --> 00:44:11,570 and I asked a bunch of questions. 1012 00:44:11,570 --> 00:44:13,385 And it turned out in this code he 1013 00:44:13,385 --> 00:44:16,390 was passing a float to a fixed. 1014 00:44:16,390 --> 00:44:17,760 All right? 1015 00:44:17,760 --> 00:44:20,480 Don't trust anyone's software. 1016 00:44:20,480 --> 00:44:21,330 Check it yourself. 1017 00:44:24,690 --> 00:44:27,610 Someday when I'm dead and you guys are in my position, 1018 00:44:27,610 --> 00:44:29,880 you'll be thanking me for that. 1019 00:44:29,880 --> 00:44:31,657 Put a stone on my grave or something. 1020 00:44:34,500 --> 00:44:36,154 All right so covariance. 1021 00:44:36,154 --> 00:44:37,820 Covariance tells me some measure of when 1022 00:44:37,820 --> 00:44:41,250 x moves, how far does y move? [? Or ?] for any other asset? 1023 00:44:41,250 --> 00:44:42,750 Could I have a piece of your cookie? 1024 00:44:42,750 --> 00:44:43,670 I hardly had lunch. 1025 00:44:43,670 --> 00:44:45,461 You want me to have a piece of this, right? 1026 00:44:48,143 --> 00:44:49,559 It's just looking very good there. 1027 00:44:49,559 --> 00:44:50,059 Thank you. 
1028 00:44:54,250 --> 00:44:55,140 It's foraging. 1029 00:44:55,140 --> 00:44:57,940 I'm convinced 10 million years ago, 1030 00:44:57,940 --> 00:44:59,620 my ape ancestors were the first ones 1031 00:44:59,620 --> 00:45:03,350 at the dead antelope on the plains. 1032 00:45:03,350 --> 00:45:05,314 All right. 1033 00:45:05,314 --> 00:45:08,960 So we're talking about correlation, covariance. 1034 00:45:08,960 --> 00:45:10,190 Covariance is not unit free. 1035 00:45:12,840 --> 00:45:16,420 I can use either, but I have to make sure I get my units right. 1036 00:45:16,420 --> 00:45:18,130 Units screw me up every time. 1037 00:45:18,130 --> 00:45:21,610 They still screw me up. 1038 00:45:21,610 --> 00:45:23,570 That was a good cookie. 1039 00:45:23,570 --> 00:45:26,030 All right. 1040 00:45:26,030 --> 00:45:28,220 So more facts. 1041 00:45:28,220 --> 00:45:31,030 Variance of xa plus yb: x squared 1042 00:45:31,030 --> 00:45:35,467 variance a, plus y squared variance b, plus 2xy covariance ab. 1043 00:45:35,467 --> 00:45:36,550 You guys seen this before? 1044 00:45:36,550 --> 00:45:37,530 I assume you have. 1045 00:45:40,240 --> 00:45:43,390 Now I can get pretty silly with this if I want. 1046 00:45:43,390 --> 00:45:46,970 x, a, y, b, you get the picture, right? 1047 00:45:46,970 --> 00:45:51,160 But what you should be thinking, this 1048 00:45:51,160 --> 00:45:58,860 is a covariance matrix, sigma squared, sigma squared, 1049 00:45:58,860 --> 00:46:00,840 sigma squared. 1050 00:46:00,840 --> 00:46:04,050 It's the sum of the variances plus 2 times the sum 1051 00:46:04,050 --> 00:46:05,500 of the covariances. 1052 00:46:05,500 --> 00:46:09,100 So if I have one unit of every asset, I've got n assets, 1053 00:46:09,100 --> 00:46:12,300 all I have to do to get the portfolio variance is sum up 1054 00:46:12,300 --> 00:46:13,890 the whole covariance matrix. 1055 00:46:13,890 --> 00:46:16,700 Now, you never get only one unit, but just saying.
1056 00:46:16,700 --> 00:46:20,840 But you notice that this is kind of a regular pattern 1057 00:46:20,840 --> 00:46:22,770 that we see here. 1058 00:46:22,770 --> 00:46:24,210 And so what I can do is I can use 1059 00:46:24,210 --> 00:46:27,230 a combination of my correlation matrix 1060 00:46:27,230 --> 00:46:31,180 and a little bit of linear algebra legerdemain, 1061 00:46:31,180 --> 00:46:33,330 to do some very convenient calculations. 1062 00:46:33,330 --> 00:46:35,580 And here I just give an example of a covariance matrix 1063 00:46:35,580 --> 00:46:37,060 and a correlation matrix. 1064 00:46:37,060 --> 00:46:39,607 Note the correlation matrix is between negative 1 1065 00:46:39,607 --> 00:46:41,430 and positive 1. 1066 00:46:41,430 --> 00:46:41,930 All right. 1067 00:46:44,932 --> 00:46:46,140 Let me cut to the chase here. 1068 00:46:51,724 --> 00:46:53,140 I'll draw it here because I really 1069 00:46:53,140 --> 00:46:57,180 want to get into some of the other stuff. 1070 00:46:57,180 --> 00:47:04,160 What this means: say I have a covariance structure sigma. 1071 00:47:04,160 --> 00:47:06,830 And I have a vector of positions, 1072 00:47:06,830 --> 00:47:10,430 x dollars in dollar/yen, y dollars in gold, 1073 00:47:10,430 --> 00:47:13,800 z dollars in oil. 1074 00:47:13,800 --> 00:47:17,140 And let's say I've got a position vector, 1075 00:47:17,140 --> 00:47:22,560 x_1, x_2, x_3, up to x_n. 1076 00:47:28,730 --> 00:47:31,580 If I have all my positions recorded 1077 00:47:31,580 --> 00:47:34,450 as a vector-- this is asset one, asset two, and this 1078 00:47:34,450 --> 00:47:41,000 is in dollars-- and I have the covariance structure, 1079 00:47:41,000 --> 00:47:44,490 the variance of this portfolio that 1080 00:47:44,490 --> 00:47:47,460 has these assets and this covariance structure-- this 1081 00:47:47,460 --> 00:48:01,780 is where the magic happens-- is x transpose sigma x equals 1082 00:48:01,780 --> 00:48:04,260 sigma squared hat portfolio.
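As a rough sketch of that quadratic form, the Python below computes x transpose sigma x for three assets. The covariance numbers and the dollar positions are hypothetical, the function name `portfolio_variance` is an assumption, and sign conventions for the positions are ignored here (everything is entered as a plain positive dollar amount). The last step scales the portfolio standard deviation by the 2.33 multiplier from the one-asset example.

```python
import math

def portfolio_variance(positions, cov):
    """Quadratic form x' * Sigma * x, written out as a double sum."""
    n = len(positions)
    return sum(positions[i] * cov[i][j] * positions[j]
               for i in range(n) for j in range(n))

# Hypothetical covariance matrix of daily returns for dollar/yen, gold, oil.
# Variances sit on the diagonal, covariances off the diagonal, and it is symmetric.
cov = [
    [0.000049, 0.000010, 0.000014],
    [0.000010, 0.000100, 0.000060],
    [0.000014, 0.000060, 0.000400],
]

positions = [100.0, 25.0, 50.0]  # hypothetical dollar positions, same asset order

variance = portfolio_variance(positions, cov)  # in dollars squared
std_dev = math.sqrt(variance)                  # in dollars
var_99 = 2.33 * std_dev                        # 2.33 standard deviations

# With one unit of every asset, the quadratic form is just the sum of the matrix.
unit_variance = portfolio_variance([1.0, 1.0, 1.0], cov)
matrix_sum = sum(sum(row) for row in cov)

print(variance, std_dev, var_99)
print(unit_variance, matrix_sum)
```

Because the positions are in dollars and the covariance matrix is in return units, the result comes out in dollars, which is why the units of the two inputs have to line up.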
1083 00:48:07,500 --> 00:48:10,900 Now you really could go work for a bank. 1084 00:48:10,900 --> 00:48:14,580 This is how portfolio variance, using 1085 00:48:14,580 --> 00:48:17,640 the variance/covariance method, is done. 1086 00:48:17,640 --> 00:48:22,740 In fact, when we were doing it this way 20 years ago, 1087 00:48:22,740 --> 00:48:25,380 spreadsheets only have 256 columns. 1088 00:48:25,380 --> 00:48:27,600 So we tried to simplify everything into 256-- 1089 00:48:27,600 --> 00:48:29,270 or sometimes you had to sum it up 1090 00:48:29,270 --> 00:48:31,560 using two different spreadsheets. 1091 00:48:31,560 --> 00:48:33,620 We didn't have multitab spreadsheets. 1092 00:48:33,620 --> 00:48:36,740 That was a dream, multitab spreadsheets. 1093 00:48:36,740 --> 00:48:40,530 This was Lotus 1-2-3 we're talking about here, OK? 1094 00:48:40,530 --> 00:48:42,600 You guys don't even know what Lotus 1-2-3 is. 1095 00:48:42,600 --> 00:48:45,121 It's like an abacus but on the screen. 1096 00:48:45,121 --> 00:48:45,620 Yes? 1097 00:48:45,620 --> 00:48:47,107 AUDIENCE: What's x again in this? 1098 00:48:47,107 --> 00:48:48,440 KENNETH ABBOTT: Position vector. 1099 00:48:48,440 --> 00:48:52,050 Let's say I tell you that you've got dollar/yen, gold, and oil. 1100 00:48:52,050 --> 00:48:58,490 You've got $100 of dollar/yen, $50 of oil, and $25 of gold. 1101 00:48:58,490 --> 00:49:01,370 It would be 100, 50, 25. 1102 00:49:01,370 --> 00:49:05,189 Now, I should say $100 of dollar/yen, 1103 00:49:05,189 --> 00:49:06,980 your position vector would actually show up 1104 00:49:06,980 --> 00:49:10,240 as negative 100, 50, 25. 1105 00:49:10,240 --> 00:49:11,150 Why is that? 1106 00:49:11,150 --> 00:49:14,330 Because if I'm measuring my dollar/yen-- 1107 00:49:14,330 --> 00:49:18,830 and this is just a little aside-- typically, 1108 00:49:18,830 --> 00:49:22,740 I measure dollar/yen in yen per dollar. 1109 00:49:22,740 --> 00:49:25,800 So dollar/yen might be 95. 
1110 00:49:25,800 --> 00:49:30,640 If I own yen and I'm a dollar investor and I own yen, 1111 00:49:30,640 --> 00:49:34,340 and yen go from 95 per dollar to 100 per dollar, 1112 00:49:34,340 --> 00:49:37,270 do I make or lose money? 1113 00:49:37,270 --> 00:49:38,540 I lose money. 1114 00:49:38,540 --> 00:49:42,090 Negative 100. 1115 00:49:42,090 --> 00:49:43,840 Just store that. 1116 00:49:43,840 --> 00:49:47,699 You won't be tested on that, but we think about that 1117 00:49:47,699 --> 00:49:48,240 all the time. 1118 00:49:48,240 --> 00:49:49,410 Same thing with yields. 1119 00:49:49,410 --> 00:49:51,694 Typically, when I record my PV01-- 1120 00:49:51,694 --> 00:49:53,860 and I'll record some version, something like my PV01 1121 00:49:53,860 --> 00:49:55,630 in that vector, my interest rate sensitivity, 1122 00:49:55,630 --> 00:49:57,171 I'm going to record it as a negative. 1123 00:49:57,171 --> 00:50:01,190 Because when yields go up and I own the bond, I lose money. 1124 00:50:01,190 --> 00:50:04,920 Signs, very important. 1125 00:50:04,920 --> 00:50:07,030 And, again, we've covered-- usually 1126 00:50:07,030 --> 00:50:10,220 I do this in a two hour lecture. 1127 00:50:10,220 --> 00:50:13,180 And we've covered it in less than an hour, so pretty good. 1128 00:50:13,180 --> 00:50:14,437 All right. 1129 00:50:14,437 --> 00:50:16,270 I spent a lot more time on the fixed income. 1130 00:50:16,270 --> 00:50:17,230 [STUDENT COUGHING] 1131 00:50:17,230 --> 00:50:18,966 Are you taking something for that? 1132 00:50:18,966 --> 00:50:21,691 That does not sound healthy. 1133 00:50:21,691 --> 00:50:22,940 I don't mean to embarrass you. 1134 00:50:22,940 --> 00:50:25,106 But I just want to make sure that you're taking care 1135 00:50:25,106 --> 00:50:27,468 of yourself because grad students don't-- I was a grad 1136 00:50:27,468 --> 00:50:29,468 student, I didn't take care of myself very well. 1137 00:50:29,468 --> 00:50:29,967 I worry. 
1138 00:50:33,600 --> 00:50:36,350 All right. 1139 00:50:36,350 --> 00:50:38,150 Big picture, variance/covariance. 1140 00:50:38,150 --> 00:50:40,400 Collect data, calculate returns, test 1141 00:50:40,400 --> 00:50:45,680 the data, matrix construction, get my position vector, 1142 00:50:45,680 --> 00:50:48,230 multiply my matrices. 1143 00:50:48,230 --> 00:50:48,830 All right? 1144 00:50:48,830 --> 00:50:51,630 Quick and dirty, that's how we do it. 1145 00:50:51,630 --> 00:50:54,030 That's the simplified approach to measuring 1146 00:50:54,030 --> 00:50:57,469 this order statistic called value at risk using 1147 00:50:57,469 --> 00:50:58,552 this particular technique. 1148 00:51:01,930 --> 00:51:03,890 Questions, comments? 1149 00:51:03,890 --> 00:51:06,004 Anyone? 1150 00:51:06,004 --> 00:51:08,536 Anything you think I need to elucidate on that? 1151 00:51:11,320 --> 00:51:18,070 And this is, in fact, how we did this up until the late '90s. 1152 00:51:18,070 --> 00:51:19,860 Firms used variance/covariance. 1153 00:51:19,860 --> 00:51:24,210 I heard a statistic in Europe in 1996 1154 00:51:24,210 --> 00:51:27,367 that 80% of the European banks were using this technique 1155 00:51:27,367 --> 00:51:28,450 to do their value at risk. 1156 00:51:28,450 --> 00:51:31,780 It was no more complicated than this. 1157 00:51:31,780 --> 00:51:33,165 I use a little flow diagram. 1158 00:51:35,740 --> 00:51:38,210 Get your data returns, graph your data 1159 00:51:38,210 --> 00:51:40,670 to make sure you don't screw it up. 1160 00:51:40,670 --> 00:51:44,280 Get your covariance matrix, multiply your matrices out. 1161 00:51:44,280 --> 00:51:47,230 x transpose sigma x. 1162 00:51:47,230 --> 00:51:50,340 Using the position vectors and then you can do your analysis. 
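The flow just described (data, returns, covariance matrix, position vector, multiply) might be sketched end to end as follows. Everything concrete here is an assumption for illustration: the two short price histories are invented, the function names are hypothetical, and the "graph your data" check is left out because it is a visual step.

```python
import math

def pct_returns(prices):
    """Simple percentage changes between consecutive prices."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def covariance(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)

def covariance_matrix(return_series):
    """Variances on the diagonal, covariances off the diagonal."""
    return [[covariance(a, b) for b in return_series] for a in return_series]

def parametric_var(positions, cov, multiplier=2.33):
    """multiplier * sqrt(x' Sigma x), in the currency of the positions."""
    n = len(positions)
    variance = sum(positions[i] * cov[i][j] * positions[j]
                   for i in range(n) for j in range(n))
    return multiplier * math.sqrt(variance)

# Step 1: collect data (hypothetical daily closes; a real run needs far more history).
gold_prices = [1600.0, 1612.0, 1605.0, 1620.0, 1598.0, 1610.0, 1625.0, 1618.0]
oil_prices = [95.0, 96.5, 94.8, 97.2, 93.9, 95.5, 97.0, 96.1]

# Step 2: calculate returns.
returns = [pct_returns(gold_prices), pct_returns(oil_prices)]

# Step 3: build the covariance matrix.
cov = covariance_matrix(returns)

# Step 4: position vector in dollars (long both assets in this toy case).
positions = [25.0, 50.0]

# Step 5: multiply the matrices and scale by 2.33.
var_99 = parametric_var(positions, cov)
print(var_99)
```

Percentage changes are used for the returns here; log changes would be an equally reasonable choice for the sketch.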
1163 00:51:50,340 --> 00:51:52,010 Normally I would spend some more time 1164 00:51:52,010 --> 00:51:54,468 on that bottom row and different things you can do with it, 1165 00:51:54,468 --> 00:51:58,566 but that will have to suffice for now. 1166 00:51:58,566 --> 00:52:01,044 A couple of points I want to make before we move on 1167 00:52:01,044 --> 00:52:01,960 about the assumptions. 1168 00:52:04,147 --> 00:52:06,230 Actually, I'll fly through this here so we can get 1169 00:52:06,230 --> 00:52:09,310 into Monte Carlo simulation. 1170 00:52:09,310 --> 00:52:10,790 Where am I going to get my data? 1171 00:52:10,790 --> 00:52:11,748 Where do I get my data? 1172 00:52:11,748 --> 00:52:13,660 I often get a lot of my data from Bloomberg, 1173 00:52:13,660 --> 00:52:16,036 I get it from public sources, I get it from the internet. 1174 00:52:16,036 --> 00:52:17,660 Especially when you get it from-- look, 1175 00:52:17,660 --> 00:52:19,730 if it says so on the internet, it must be true. 1176 00:52:19,730 --> 00:52:20,861 Right? 1177 00:52:20,861 --> 00:52:24,530 Didn't Abe Lincoln say, don't believe everything 1178 00:52:24,530 --> 00:52:26,230 you read on the internet? 1179 00:52:26,230 --> 00:52:29,760 That was a quote, I saw that some place. 1180 00:52:29,760 --> 00:52:32,367 You get data from people, you check it. 1181 00:52:32,367 --> 00:52:34,200 There's some sources that are very reliable. 1182 00:52:34,200 --> 00:52:37,190 If you're looking for yield data or foreign exchange data, 1183 00:52:37,190 --> 00:52:38,590 the Federal Reserve has it. 1184 00:52:38,590 --> 00:52:40,910 And they have it back 20 years, daily data. 1185 00:52:40,910 --> 00:52:45,860 It's the H.15 and the H.10. 1186 00:52:45,860 --> 00:52:48,490 It's there, it's free, it's easy to download, just 1187 00:52:48,490 --> 00:52:49,175 be aware of it. 
1188 00:52:49,175 --> 00:52:49,675 Exchange-- 1189 00:52:49,675 --> 00:52:52,630 PROFESSOR: [INAUDIBLE] study posted 1190 00:52:52,630 --> 00:52:58,720 on the website that goes through computations 1191 00:52:58,720 --> 00:53:01,840 for regression analysis and asset pricing models 1192 00:53:01,840 --> 00:53:04,490 and the data that's used there is from the Federal 1193 00:53:04,490 --> 00:53:05,560 Reserve for yields. 1194 00:53:05,560 --> 00:53:06,610 KENNETH ABBOTT: It's H.15. If it's for yields, 1195 00:53:06,610 --> 00:53:07,690 it's probably from the H.15. 1196 00:53:07,690 --> 00:53:07,930 [INTERPOSING VOICES] 1197 00:53:07,930 --> 00:53:09,670 PROFESSOR: Those files, you can see how to actually get 1198 00:53:09,670 --> 00:53:10,750 that data for yourselves. 1199 00:53:10,750 --> 00:53:12,970 KENNETH ABBOTT: Now, another great source of data 1200 00:53:12,970 --> 00:53:15,080 is Bloomberg. 1201 00:53:15,080 --> 00:53:16,890 Now the good thing about Bloomberg data 1202 00:53:16,890 --> 00:53:19,660 is everybody uses it, so it's clean. 1203 00:53:19,660 --> 00:53:20,840 Relatively clean. 1204 00:53:20,840 --> 00:53:22,990 I still find errors in it from time to time. 1205 00:53:22,990 --> 00:53:25,490 But what happens is when you find an error in your Bloomberg 1206 00:53:25,490 --> 00:53:27,200 data, you get on the phone to Bloomberg right away 1207 00:53:27,200 --> 00:53:28,270 and say, I found an error in your data. 1208 00:53:28,270 --> 00:53:29,270 They say, oh, what date? 1209 00:53:29,270 --> 00:53:32,320 June 14, you know, 2012. 1210 00:53:32,320 --> 00:53:34,370 And they'll say, OK, we'll fix it. 1211 00:53:34,370 --> 00:53:34,900 All right? 1212 00:53:34,900 --> 00:53:37,340 So everybody does that, and the data set is pretty clean. 1213 00:53:37,340 --> 00:53:40,430 I found consistently that Bloomberg data is 1214 00:53:40,430 --> 00:53:43,497 the cleanest in my experience. 1215 00:53:43,497 --> 00:53:45,080 How much data do we use in doing this?
1216 00:53:45,080 --> 00:53:47,460 I could use one year of data, I can use two weeks of data. 1217 00:53:47,460 --> 00:53:49,626 Now, with time series, we usually want 100 observations. 1218 00:53:49,626 --> 00:53:52,020 That's always been my rule of thumb. 1219 00:53:52,020 --> 00:53:53,320 I can use one year of data. 1220 00:53:53,320 --> 00:53:54,880 There are regulators that require you 1221 00:53:54,880 --> 00:53:57,110 to use at least a year of data. 1222 00:53:57,110 --> 00:53:58,560 You could use two years of data. 1223 00:53:58,560 --> 00:54:00,790 In fact, some firms use one year of data. 1224 00:54:00,790 --> 00:54:04,410 There's one firm that uses five years of data. 1225 00:54:04,410 --> 00:54:07,000 And there, we could say, well, am I going to weight it? 1226 00:54:07,000 --> 00:54:09,060 Am I going to weight my more recent data heavily? 1227 00:54:09,060 --> 00:54:11,480 I could do that with exponential smoothing, which we 1228 00:54:11,480 --> 00:54:12,730 won't have time to talk about. 1229 00:54:12,730 --> 00:54:17,550 It's a technique I can use to lend more credence to the more 1230 00:54:17,550 --> 00:54:18,680 recent data. 1231 00:54:18,680 --> 00:54:22,450 Now, I'm a relatively simple guy. 1232 00:54:22,450 --> 00:54:24,000 I tend to use equally weighted data 1233 00:54:24,000 --> 00:54:26,050 because I believe in Occam's razor, which 1234 00:54:26,050 --> 00:54:29,920 is, the simplest explanation is usually the best. 1235 00:54:29,920 --> 00:54:31,319 I think we get too clever by half 1236 00:54:31,319 --> 00:54:32,485 when we try to parameterize 1237 00:54:36,140 --> 00:54:39,110 how much more of an impact last week's data 1238 00:54:39,110 --> 00:54:41,550 has than data from two weeks ago, three weeks ago. 1239 00:54:41,550 --> 00:54:44,680 I'm not saying that it doesn't, what I am saying is, 1240 00:54:44,680 --> 00:54:48,980 I'm not smart enough to know exactly how much it does.
1241 00:54:48,980 --> 00:54:51,120 And assuming that everything's equally 1242 00:54:51,120 --> 00:54:55,300 weighted throughout time is just as strong an assumption. 1243 00:54:55,300 --> 00:54:59,050 But it's a very simple assumption, and I love simple. 1244 00:54:59,050 --> 00:55:00,002 Yes? 1245 00:55:00,002 --> 00:55:04,140 AUDIENCE: [INAUDIBLE] calculate covariance matrix? 1246 00:55:04,140 --> 00:55:05,132 KENNETH ABBOTT: Yes. 1247 00:55:05,132 --> 00:55:07,317 All right, quickly. 1248 00:55:07,317 --> 00:55:09,150 Actually I think I have some slides on that. 1249 00:55:11,770 --> 00:55:14,850 Let me just finish this and I'll get to that. 1250 00:55:14,850 --> 00:55:15,640 Gaps in data. 1251 00:55:15,640 --> 00:55:17,011 Missing data is a problem. 1252 00:55:17,011 --> 00:55:18,260 How do I fill in missing data? 1253 00:55:18,260 --> 00:55:22,420 I can do a linear interpolation, I can use the prior day's data. 1254 00:55:22,420 --> 00:55:24,780 I can do a Brownian bridge, which is I just 1255 00:55:24,780 --> 00:55:26,800 do a Monte Carlo between them. 1256 00:55:26,800 --> 00:55:28,880 I can do a regression based, I can use regression 1257 00:55:28,880 --> 00:55:31,901 to project changes from one onto changes in another. 1258 00:55:31,901 --> 00:55:33,400 That's usually a whole other lecture 1259 00:55:33,400 --> 00:55:34,860 I gave on how to do missing data. 1260 00:55:34,860 --> 00:55:37,670 Now you've got that lecture for free. 1261 00:55:37,670 --> 00:55:39,320 That's all you need to know. 1262 00:55:39,320 --> 00:55:42,320 It's not only a lecture, it's a very hard homework assignment. 1263 00:55:42,320 --> 00:55:47,292 But how frequently do I update my data? 1264 00:55:47,292 --> 00:55:49,500 Some people update their covariance structures daily. 1265 00:55:49,500 --> 00:55:50,830 I think that's an overkill. 1266 00:55:50,830 --> 00:55:54,760 We update our data set weekly. 1267 00:55:54,760 --> 00:55:55,520 That's what we do. 
1268 00:55:55,520 --> 00:56:00,650 And I think that's overkill, but tell that to my regulators. 1269 00:56:00,650 --> 00:56:02,850 And we use daily data, weekly data, monthly data. 1270 00:56:02,850 --> 00:56:04,400 We typically use daily data. 1271 00:56:04,400 --> 00:56:05,957 Some firms may do it differently. 1272 00:56:12,160 --> 00:56:13,770 All right. 1273 00:56:13,770 --> 00:56:17,060 Here's your exponential smoothing. 1274 00:56:17,060 --> 00:56:22,210 Remember, I usually measure covariance as the sum of x_i 1275 00:56:22,210 --> 00:56:27,640 minus x bar times y_i minus y bar divided by n minus 1. 1276 00:56:27,640 --> 00:56:31,670 What if I stuck an omega in there? 1277 00:56:31,670 --> 00:56:34,390 And I use this calculation instead, 1278 00:56:34,390 --> 00:56:38,310 where the denominator is the sum of all the omegas-- you should 1279 00:56:38,310 --> 00:56:39,470 be thinking finite series. 1280 00:56:42,310 --> 00:56:44,841 You have to realize, I was a decent math student, 1281 00:56:44,841 --> 00:56:46,090 I wasn't a great math student. 1282 00:56:46,090 --> 00:56:49,360 And what I found when I was studying this, I was like, wow, 1283 00:56:49,360 --> 00:56:51,340 all that stuff that I learned, it actually-- 1284 00:56:51,340 --> 00:56:52,890 finite series, who knew? 1285 00:56:52,890 --> 00:56:55,130 Who knew that I'd actually use it? 1286 00:56:55,130 --> 00:56:58,470 So I take this, and let's say I'm working backwards in time. 1287 00:56:58,470 --> 00:57:02,070 So today's observation is t_0. 1288 00:57:02,070 --> 00:57:05,880 Yesterday's observation is t_1, then t_2, t_3. 1289 00:57:05,880 --> 00:57:10,630 So today's observation would get-- and let's 1290 00:57:10,630 --> 00:57:12,820 assume for the time being that this omega 1291 00:57:12,820 --> 00:57:15,320 is on the order of 0.95. 1292 00:57:15,320 --> 00:57:17,090 It could be anything.
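One way to read that weighted calculation is sketched below in Python. Several things here are assumptions, since the slide itself is not in the transcript: observation k periods back gets weight omega to the power k, the weights are divided by their sum, and the plain unweighted means are used for x bar and y bar. The return series are invented and listed most recent first.

```python
def normalized_weights(n, omega=0.95):
    """Weight omega**k for the observation k periods back, divided by the sum of all weights."""
    raw = [omega ** k for k in range(n)]
    total = sum(raw)  # a finite geometric series
    return [w / total for w in raw]

def ew_covariance(xs, ys, omega=0.95):
    """Exponentially weighted covariance; index 0 is today's observation.

    Assumption: deviations are taken from the simple (unweighted) means.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    weights = normalized_weights(n, omega)
    return sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))

# Hypothetical daily returns for two assets, most recent observation first.
x = [0.012, -0.004, 0.007, -0.010, 0.003, 0.001]
y = [0.009, -0.006, 0.004, -0.008, 0.005, -0.002]

weights = normalized_weights(len(x), 0.95)
print(weights)                       # declining weights that sum to 1
print(ew_covariance(x, y, 0.95))     # recent observations count for more
print(ew_covariance(x, y, 1.0))      # omega = 1 weights every observation equally
```

With omega set to 1 every observation gets weight 1/n, so the result matches the ordinary covariance except that the divisor is n rather than n minus 1.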
1293 00:57:17,090 --> 00:57:21,850 So today would be 0.95 to the 0 divided 1294 00:57:21,850 --> 00:57:23,400 by the sum of all the omegas. 1295 00:57:23,400 --> 00:57:26,900 Tomorrow it will be 0.95 divided by the sum of the omegas. 1296 00:57:26,900 --> 00:57:29,270 The next would be 0.95 squared divided 1297 00:57:29,270 --> 00:57:31,950 by the sum of the omegas. 1298 00:57:31,950 --> 00:57:33,780 0.95 cubed and get smaller and smaller. 1299 00:57:36,960 --> 00:57:42,760 For example, if you use 0.94, 99% of your weight 1300 00:57:42,760 --> 00:57:46,220 will be in the last 76 days. 1301 00:57:46,220 --> 00:57:48,390 76 observations, I shouldn't say 76 days. 1302 00:57:48,390 --> 00:57:49,800 76 observations. 1303 00:57:49,800 --> 00:57:55,270 So there's this notion that the impact declines exponentially. 1304 00:57:55,270 --> 00:57:57,580 Does that make sense? 1305 00:57:57,580 --> 00:58:01,530 People use this pretty commonly, but what scares me about it-- 1306 00:58:01,530 --> 00:58:03,275 somebody stuck these fancy transitions 1307 00:58:03,275 --> 00:58:04,656 in between these slides. 1308 00:58:04,656 --> 00:58:07,940 Anyway, is that here's my standard deviation 1309 00:58:07,940 --> 00:58:09,960 with a rolling six-month window. 1310 00:58:09,960 --> 00:58:13,910 And here's my standard deviation using different weights. 1311 00:58:13,910 --> 00:58:15,660 The point I want to make here, and it's 1312 00:58:15,660 --> 00:58:20,840 an important point, my assumption about my weighting 1313 00:58:20,840 --> 00:58:24,520 coefficient has a material impact 1314 00:58:24,520 --> 00:58:29,110 on the size of my measured volatility. 1315 00:58:29,110 --> 00:58:31,400 Now when I see this, and this is just me. 
1316 00:58:31,400 --> 00:58:33,510 There's no finance or statistics theory 1317 00:58:33,510 --> 00:58:38,470 behind this. Any time the choice-- any time an assumption 1318 00:58:38,470 --> 00:58:41,900 has this material an impact, bells and whistles go off 1319 00:58:41,900 --> 00:58:43,040 and sirens. 1320 00:58:43,040 --> 00:58:44,840 All right, and red lights flash. 1321 00:58:44,840 --> 00:58:47,190 Be very, very careful. 1322 00:58:47,190 --> 00:58:49,660 Now, lies, damn lies, and statistics. 1323 00:58:49,660 --> 00:58:51,610 You tell me the outcome you want, 1324 00:58:51,610 --> 00:58:54,280 and I'll tell you what statistics to use. 1325 00:58:54,280 --> 00:58:55,720 That's where this could be abused. 1326 00:58:55,720 --> 00:58:57,480 Oh, you want to show high volatility? 1327 00:58:57,480 --> 00:58:59,510 Well, let's use this. 1328 00:58:59,510 --> 00:59:01,710 You want to show low volatility? Let's use this. 1329 00:59:01,710 --> 00:59:07,780 See, I choose to just take the simplest approach. 1330 00:59:07,780 --> 00:59:09,890 And that's me. 1331 00:59:09,890 --> 00:59:12,460 That's not a terribly scientific opinion, 1332 00:59:12,460 --> 00:59:13,613 but that's what I think. 1333 00:59:17,570 --> 00:59:21,470 Daily versus weekly, percentage changes versus log changes. 1334 00:59:21,470 --> 00:59:23,130 Units. 1335 00:59:23,130 --> 00:59:27,290 Just like dollar/yen, interest rates. 1336 00:59:27,290 --> 00:59:30,080 Am I long or am I short? 1337 00:59:30,080 --> 00:59:32,490 If I'm long gold, I show it as a positive number. 1338 00:59:32,490 --> 00:59:34,600 And if I'm short gold, in my position vector, 1339 00:59:34,600 --> 00:59:36,550 I show it as a negative number. 1340 00:59:36,550 --> 00:59:40,970 If I'm long yen, and yen is measured in yen per dollar, 1341 00:59:40,970 --> 00:59:42,470 then I show it as a negative number.
1342 00:59:42,470 --> 00:59:46,370 If I'm long yen, but my covariance matrix measures yen 1343 00:59:46,370 --> 00:59:51,740 as dollars per yen-- 0.000094, whatever-- 1344 00:59:51,740 --> 00:59:54,620 then I show it as a positive number. 1345 00:59:54,620 --> 00:59:58,766 It's just like physics only worse 1346 00:59:58,766 --> 01:00:00,140 because it'll cost you real-- no, 1347 01:00:00,140 --> 01:00:01,312 I guess physics would be worse because if you 1348 01:00:01,312 --> 01:00:03,500 get the units wrong, you blow up, right? 1349 01:00:03,500 --> 01:00:06,282 This will just cost you money. 1350 01:00:06,282 --> 01:00:07,240 I've made this mistake. 1351 01:00:07,240 --> 01:00:10,000 I've made the units mistake. 1352 01:00:10,000 --> 01:00:12,283 All right, we talked about fixed income. 1353 01:00:17,830 --> 01:00:21,770 So that's what I want to cover from the bare bones 1354 01:00:21,770 --> 01:00:22,870 setup for VaR. 1355 01:00:22,870 --> 01:00:24,850 Now I'm going to skip the historical simulation 1356 01:00:24,850 --> 01:00:26,110 and go right to the Monte Carlo because I 1357 01:00:26,110 --> 01:00:28,693 want to show you another way we can use covariance structures. 1358 01:00:35,555 --> 01:00:37,052 [POWERPOINT SOUND EFFECT] 1359 01:00:38,380 --> 01:00:42,040 That's going to happen two or three more times. 1360 01:00:42,040 --> 01:00:44,710 Somebody did this, somebody made my presentation cute 1361 01:00:44,710 --> 01:00:45,460 some years ago. 1362 01:00:45,460 --> 01:00:47,309 And I just-- I apologize. 1363 01:00:50,620 --> 01:00:53,185 All right, see, there's a lot of meat in this presentation 1364 01:00:53,185 --> 01:00:56,670 that we don't have time to get to. 1365 01:00:56,670 --> 01:01:01,120 Another approach to doing value at risk 1366 01:01:01,120 --> 01:01:05,620 is rather than use this parametric approach, 1367 01:01:05,620 --> 01:01:09,270 is to simulate the outcomes.
1368 01:01:09,270 --> 01:01:13,950 Simulate the outcomes 100 times, 1,000 times, 10,000 times, 1369 01:01:13,950 --> 01:01:16,920 a million times, and say, these are all the possible outcomes 1370 01:01:16,920 --> 01:01:19,640 based on my simulation assumptions. 1371 01:01:19,640 --> 01:01:23,309 And let's say I simulate 10,000 times, 1372 01:01:23,309 --> 01:01:25,350 and I have 10,000 possible outcomes for tomorrow. 1373 01:01:25,350 --> 01:01:29,840 And I wanted to measure my value at risk at the 1% significance 1374 01:01:29,840 --> 01:01:31,270 level. 1375 01:01:31,270 --> 01:01:34,790 All I would do is take my 10,000 outcomes 1376 01:01:34,790 --> 01:01:39,170 and I would sort them and take my hundredth worst. 1377 01:01:39,170 --> 01:01:41,366 Put it in your pocket, go home. 1378 01:01:41,366 --> 01:01:43,310 That's it. 1379 01:01:43,310 --> 01:01:45,970 This is a different way of getting 1380 01:01:45,970 --> 01:01:47,010 to that order statistic. 1381 01:01:50,330 --> 01:01:52,026 Lends a lot more flexibility. 1382 01:01:52,026 --> 01:01:54,400 So I can go and I can tweak the way I do that simulation, 1383 01:01:54,400 --> 01:01:56,145 I can relax my assumptions of normality. 1384 01:01:58,304 --> 01:01:59,970 I don't have to use normal distribution, 1385 01:01:59,970 --> 01:02:01,740 I could use a t distribution, I could do lots, 1386 01:02:01,740 --> 01:02:03,906 I could tweak my distribution, I could customize it. 1387 01:02:03,906 --> 01:02:06,150 I could put mean reversion in there, 1388 01:02:06,150 --> 01:02:08,410 I could do all kinds of stuff. 1389 01:02:08,410 --> 01:02:12,300 So another way we do value at risk is we 1390 01:02:12,300 --> 01:02:15,110 simulate possible outcomes. 1391 01:02:15,110 --> 01:02:19,670 We rank the outcomes, and we just count them. 
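The sort-and-count step just described can be sketched in a few lines of Python. The simulated P&L here is a stand-in (normal, zero mean, a hypothetical $1 million standard deviation); in practice the outcomes would come from whatever simulation assumptions are in use, and the function name is only illustrative.

```python
import random

def simulated_var(outcomes, significance):
    """Sort the simulated P&L and count in from the worst outcome."""
    ranked = sorted(outcomes)                           # most negative P&L first
    k = max(1, round(len(ranked) * significance))       # 10,000 * 0.01 -> 100
    return -ranked[k - 1]                               # k-th worst, quoted as a positive loss

rng = random.Random(42)                                 # fixed seed so the run is repeatable
# Hypothetical one-day P&L outcomes for tomorrow.
pnl = [rng.gauss(0.0, 1_000_000.0) for _ in range(10_000)]

var_1pct = simulated_var(pnl, 0.01)                     # hundredth worst of 10,000
var_5pct = simulated_var(pnl, 0.05)                     # five-hundredth worst of 10,000
print(round(var_1pct), round(var_5pct))
```

Swapping `rng.gauss` for a fat-tailed or mean-reverting generator changes the outcomes but not the counting step, which is where the flexibility comes from.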
1392 01:02:19,670 --> 01:02:21,763 If I've got the 10,000 observations 1393 01:02:21,763 --> 01:02:23,240 and I want my 5% order statistic, 1394 01:02:23,240 --> 01:02:25,230 well I just take my 500th. 1395 01:02:27,830 --> 01:02:28,820 Make sense? 1396 01:02:28,820 --> 01:02:30,890 It's that simple. 1397 01:02:30,890 --> 01:02:32,306 Well, I don't want to make it seem 1398 01:02:32,306 --> 01:02:33,200 like it's that simple because it actually 1399 01:02:33,200 --> 01:02:35,140 gets a little messy in here. 1400 01:02:35,140 --> 01:02:37,680 But when we do Monte Carlo simulation, 1401 01:02:37,680 --> 01:02:40,449 we're simulating what we think is going to happen 1402 01:02:40,449 --> 01:02:41,740 all subject to our assumptions. 1403 01:02:44,640 --> 01:02:46,880 And we run through this Monte Carlo simulation. 1404 01:02:46,880 --> 01:02:49,830 Simulation of method using sequences of random numbers. 1405 01:02:49,830 --> 01:02:51,550 Coined during the Manhattan Project, 1406 01:02:51,550 --> 01:02:54,430 similar to games of chance. 1407 01:02:54,430 --> 01:02:57,610 You need to describe your system in terms of probability density 1408 01:02:57,610 --> 01:02:58,480 functions. 1409 01:02:58,480 --> 01:03:00,900 What type of distribution? 1410 01:03:00,900 --> 01:03:02,040 Is this normal? 1411 01:03:02,040 --> 01:03:02,540 Is it t? 1412 01:03:02,540 --> 01:03:03,470 Is it chi squared? 1413 01:03:03,470 --> 01:03:04,500 Is it F? 1414 01:03:04,500 --> 01:03:05,360 All right? 1415 01:03:05,360 --> 01:03:07,800 That's the way we do it. 1416 01:03:07,800 --> 01:03:09,720 So quickly, how do I do that? 1417 01:03:14,570 --> 01:03:16,200 I have to have random numbers. 1418 01:03:16,200 --> 01:03:19,760 Now they're truly random numbers. 1419 01:03:19,760 --> 01:03:22,457 Somewhere at MIT you could buy-- I used to say tape, 1420 01:03:22,457 --> 01:03:23,540 but people don't use tape. 
1421 01:03:23,540 --> 01:03:28,840 They'll give you a website where you can get the atomic decay. 1422 01:03:28,840 --> 01:03:30,450 That's random. 1423 01:03:30,450 --> 01:03:31,706 All right? 1424 01:03:31,706 --> 01:03:36,480 Anything else is pseudo-random. 1425 01:03:36,480 --> 01:03:39,190 What you see when you go into MATLAB, 1426 01:03:39,190 --> 01:03:41,570 you have a random number generator, it's an algorithm. 1427 01:03:41,570 --> 01:03:45,110 It probably takes some number and takes the square root 1428 01:03:45,110 --> 01:03:47,910 of that number and then goes 54 decimal places to the right 1429 01:03:47,910 --> 01:03:51,690 and takes the 55 decimal places to the right, 1430 01:03:51,690 --> 01:03:55,170 multiplies those two numbers together 1431 01:03:55,170 --> 01:03:58,110 and then takes the fifth root, and then goes 16 decimal places 1432 01:03:58,110 --> 01:04:01,610 to the right to get that-- it's some algorithm. 1433 01:04:01,610 --> 01:04:05,590 True story, before I came to appreciate that these were all 1434 01:04:05,590 --> 01:04:09,225 highly algorithmically driven, I was in my 20s, I 1435 01:04:09,225 --> 01:04:11,390 was taking a computer class, I saw two computers, 1436 01:04:11,390 --> 01:04:13,830 they were both running random number generators 1437 01:04:13,830 --> 01:04:17,190 and they were generating the same random numbers. 1438 01:04:17,190 --> 01:04:19,870 And I thought I was at the event horizon. 1439 01:04:19,870 --> 01:04:22,760 I thought that light was bending and the world 1440 01:04:22,760 --> 01:04:24,610 was coming to an end, all right? 1441 01:04:24,610 --> 01:04:27,767 Because this stuff can't happen, all right? 1442 01:04:27,767 --> 01:04:29,350 It was happening right in front of me. 1443 01:04:29,350 --> 01:04:31,860 It was a pseudo-random number generator. 1444 01:04:31,860 --> 01:04:34,860 I didn't know, I was 24. 1445 01:04:34,860 --> 01:04:35,360 Anyway.
1446 01:04:38,320 --> 01:04:42,810 Quasi-random numbers, it's sort of a way of imposing some order 1447 01:04:42,810 --> 01:04:43,790 on your random numbers. 1448 01:04:43,790 --> 01:04:47,020 Your random numbers, one particular set of draws 1449 01:04:47,020 --> 01:04:49,895 may not have enough draws in a particular area 1450 01:04:49,895 --> 01:04:51,270 to give you the numbers you want. 1451 01:04:51,270 --> 01:04:53,767 I can impose some conditions upon that. 1452 01:04:53,767 --> 01:04:56,100 I don't want to get into a discussion of random numbers. 1453 01:04:58,850 --> 01:05:02,250 How do I get from random uniform-- 1454 01:05:02,250 --> 01:05:05,610 most random number generators give you a random uniform number 1455 01:05:05,610 --> 01:05:07,280 between 0 and 1. 1456 01:05:07,280 --> 01:05:10,080 What you'll typically do is you'll take that random uniform 1457 01:05:10,080 --> 01:05:13,190 number, you'll map it over to the cumulative distribution 1458 01:05:13,190 --> 01:05:16,340 function, and map it down. 1459 01:05:16,340 --> 01:05:19,720 So this gets you from random uniform space 1460 01:05:19,720 --> 01:05:23,150 into standard deviation space. 1461 01:05:23,150 --> 01:05:24,860 We used to worry about how we did this, 1462 01:05:24,860 --> 01:05:27,534 now your software does it for you. 1463 01:05:27,534 --> 01:05:29,450 I've gotten comfortable enough, truth be told. 1464 01:05:29,450 --> 01:05:36,040 I usually trust my random number generators in Excel, in MATLAB. 1465 01:05:36,040 --> 01:05:38,980 So I kind of violate my own rules, I don't check. 1466 01:05:38,980 --> 01:05:43,940 But I think most of your standard random number 1467 01:05:43,940 --> 01:05:46,074 generators are decent enough now. 1468 01:05:46,074 --> 01:05:47,490 And you can go straight to normal, 1469 01:05:47,490 --> 01:05:50,150 you don't have to do random uniform and back 1470 01:05:50,150 --> 01:05:51,280 into random normal.
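For anyone who wants to see the uniform-to-normal mapping done by hand, here is a small Python sketch using only the standard library. It draws uniforms on (0, 1) and pushes each one through the inverse of the standard normal cumulative distribution function; the seed and sample size are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

std_normal = NormalDist(mu=0.0, sigma=1.0)

def uniform_to_normal(u):
    """Map a uniform draw in (0, 1) to a standard normal draw via the inverse CDF."""
    return std_normal.inv_cdf(u)

rng = random.Random(7)                       # pseudo-random: same seed, same numbers
uniforms = [rng.random() for _ in range(20_000)]
# random() can return exactly 0.0, where the inverse CDF is undefined; skip such draws.
normals = [uniform_to_normal(u) for u in uniforms if 0.0 < u < 1.0]

print(round(mean(normals), 3), round(stdev(normals), 3))
print(uniform_to_normal(0.5), round(uniform_to_normal(0.01), 3))
```

Running it twice gives the same numbers both times, which is the two-computers story in miniature: the generator is an algorithm, and the seed fixes the sequence.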
1471 01:05:51,280 --> 01:05:53,910 You can get it distributed in any way you want. 1472 01:05:59,550 --> 01:06:05,524 What I do when I do a Monte Carlo simulation-- and this 1473 01:06:05,524 --> 01:06:08,065 is going to be rushed because we've only got like 20 minutes. 1474 01:06:10,800 --> 01:06:14,165 If I take a covariance matrix-- you're 1475 01:06:14,165 --> 01:06:15,540 going to have to trust me on this 1476 01:06:15,540 --> 01:06:18,460 because again, I'm covering like eight hours of lecture 1477 01:06:18,460 --> 01:06:19,907 in an hour and a half. 1478 01:06:19,907 --> 01:06:21,740 You guys go to MIT so I have no doubt you're 1479 01:06:21,740 --> 01:06:22,823 going to be all over this. 1480 01:06:48,134 --> 01:06:52,080 Let's take this out of here for a second. 1481 01:06:52,080 --> 01:06:56,250 I can factor my covariance structure. 1482 01:06:59,070 --> 01:07:02,716 I can factor my covariance structure like this. 1483 01:07:02,716 --> 01:07:05,300 And this is the transpose of this. 1484 01:07:05,300 --> 01:07:08,920 I didn't realize that the first time we did this commercially 1485 01:07:08,920 --> 01:07:12,690 I saw this instead of this and I thought we had 1486 01:07:12,690 --> 01:07:13,940 sent bad data to the customer. 1487 01:07:13,940 --> 01:07:15,280 I got physically sick. 1488 01:07:15,280 --> 01:07:21,190 And then I remembered AB transpose 1489 01:07:21,190 --> 01:07:25,840 equals B transpose A-- these things keep happening. 1490 01:07:25,840 --> 01:07:28,720 My high school math keeps coming back to me. 1491 01:07:28,720 --> 01:07:31,114 But I had forgotten this and I got physically sick 1492 01:07:31,114 --> 01:07:33,030 because I thought we'd sent bad data because I 1493 01:07:33,030 --> 01:07:36,400 was looking at this when it's just the transpose of this. 1494 01:07:36,400 --> 01:07:39,900 Anyway, I can factor this into this where this 1495 01:07:39,900 --> 01:07:44,550 is a matrix of eigenvectors. 
1496 01:07:44,550 --> 01:07:48,195 This is a diagonal matrix of the eigenvalues. 1497 01:07:48,195 --> 01:07:50,640 All right? 1498 01:07:50,640 --> 01:07:56,100 This is the vaunted Gaussian copula. 1499 01:07:56,100 --> 01:07:57,930 This is it. 1500 01:07:57,930 --> 01:07:59,560 Most people view it as a black box. 1501 01:07:59,560 --> 01:08:01,890 If you've had any more than introductory statistics, 1502 01:08:01,890 --> 01:08:03,360 this should be a glass box to you. 1503 01:08:03,360 --> 01:08:04,765 That's why I wanted to go through this even though I'd 1504 01:08:04,765 --> 01:08:07,872 love to spend another hour and a half and do about 50 examples. 1505 01:08:07,872 --> 01:08:09,330 Because this is how I learned this, 1506 01:08:09,330 --> 01:08:12,880 I didn't learn it from looking at this equation and saying, 1507 01:08:12,880 --> 01:08:13,530 oh, I get it. 1508 01:08:13,530 --> 01:08:15,870 I learned it from actually doing it about 1,000 times 1509 01:08:15,870 --> 01:08:20,020 in a spreadsheet, and it sunk in like water into a stone. 1510 01:08:20,020 --> 01:08:23,649 So I factor this matrix, and then 1511 01:08:23,649 --> 01:08:33,250 I take this, which is the square root matrix, which 1512 01:08:33,250 --> 01:08:36,020 is my transpose of my eigenvector matrix 1513 01:08:36,020 --> 01:08:39,490 and a diagonal matrix containing the square roots of my eigenvalues. 1514 01:08:39,490 --> 01:08:42,750 Now, could this ever be negative and take me 1515 01:08:42,750 --> 01:08:44,910 into imaginary root land? 1516 01:08:44,910 --> 01:08:50,140 Well, if my variances are positive or zero, 1517 01:08:50,140 --> 01:08:52,176 then that won't be a problem. 1518 01:08:52,176 --> 01:08:53,800 So here we get into this-- remember you 1519 01:08:53,800 --> 01:08:55,899 guys studied positive semidefinite, 1520 01:08:55,899 --> 01:08:56,649 positive definite. 1521 01:08:56,649 --> 01:08:59,232 Once again, it's another one of these high school math things.
1522 01:08:59,232 --> 01:09:00,130 Like, here it is. 1523 01:09:00,130 --> 01:09:02,035 I had to know this. 1524 01:09:02,035 --> 01:09:04,160 Suddenly I care whether it's positive semidefinite. 1525 01:09:04,160 --> 01:09:08,270 Covariance structures have to be positive semidefinite. 1526 01:09:08,270 --> 01:09:10,450 If you don't have a complete data set, 1527 01:09:10,450 --> 01:09:13,439 let's say you've got 100 observations, 100 observations, 1528 01:09:13,439 --> 01:09:17,740 100 observations, 25 observations, 100 observations, 1529 01:09:17,740 --> 01:09:19,486 you may have a negative eigenvalue. 1530 01:09:19,486 --> 01:09:21,569 If you just measure the covariance with the amount 1531 01:09:21,569 --> 01:09:23,160 of data that you have. 1532 01:09:23,160 --> 01:09:26,430 My intuition-- and I doubt this is the [INAUDIBLE]-- 1533 01:09:26,430 --> 01:09:28,880 is that you're measuring with error and you have fewer 1534 01:09:28,880 --> 01:09:31,470 observations you measure with more error. 1535 01:09:31,470 --> 01:09:35,160 So it's possible if some of your covariance measures 1536 01:09:35,160 --> 01:09:38,000 have 25 observations and some of them 1537 01:09:38,000 --> 01:09:41,590 have 100 observations that there's more error in some 1538 01:09:41,590 --> 01:09:42,229 than in others. 1539 01:09:42,229 --> 01:09:44,529 And so there's the theoretical possibility 1540 01:09:44,529 --> 01:09:46,550 for negative variance. 1541 01:09:46,550 --> 01:09:51,060 True story, we didn't know this in the '90s. 1542 01:09:51,060 --> 01:09:54,075 I took this problem to the chairman of the statistics 1543 01:09:54,075 --> 01:09:58,290 department at NYU said, I'm getting negative eigenvalues. 1544 01:09:58,290 --> 01:10:00,140 And he didn't know. 1545 01:10:00,140 --> 01:10:03,370 He had no idea, he's a smart guy. 1546 01:10:03,370 --> 01:10:06,260 You have to fill in your missing data. 1547 01:10:06,260 --> 01:10:08,500 You have to fill in your missing data. 
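A quick way to see the problem is to check the smallest eigenvalue. The Python sketch below assumes NumPy is available and uses a made-up 3-by-3 matrix in which every entry is a legal correlation but the three are mutually inconsistent, the kind of thing that can come out when each pair is measured on a different stretch of data. It is compared with a covariance matrix computed from one complete, aligned data set. The helper name is hypothetical.

```python
import numpy as np

def min_eigenvalue(matrix):
    """Smallest eigenvalue of a symmetric matrix; negative means not positive semidefinite."""
    return float(np.linalg.eigvalsh(matrix).min())

# Hypothetical pairwise estimates: a and b move together, b and c move together,
# yet a and c are estimated to move in opposite directions.
pairwise = np.array([
    [ 1.0,  0.9, -0.9],
    [ 0.9,  1.0,  0.9],
    [-0.9,  0.9,  1.0],
])

# Covariance from a single complete data set: 100 aligned observations of 3 series.
rng = np.random.default_rng(0)
complete = np.cov(rng.standard_normal((100, 3)), rowvar=False)

print(min_eigenvalue(pairwise))   # negative, so there is no real square root matrix
print(min_eigenvalue(complete))   # positive
```

Filling in the missing observations so that every entry is computed from the same aligned sample is what rules out the first case.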
1548 01:10:08,500 --> 01:10:10,500 If you've got 1,000 observations, 1,000 1549 01:10:10,500 --> 01:10:12,654 observations, 1,000 observations, 200 observations, 1550 01:10:12,654 --> 01:10:14,320 and you want to make sure you won't have 1551 01:10:14,320 --> 01:10:16,300 a negative eigenvalue, you've got 1552 01:10:16,300 --> 01:10:17,810 to fill in those observations. 1553 01:10:17,810 --> 01:10:20,510 Which is why missing data is a whole other thing 1554 01:10:20,510 --> 01:10:22,999 we talk about. 1555 01:10:22,999 --> 01:10:24,790 Again, I could spend a lot of time on that. 1556 01:10:24,790 --> 01:10:26,123 And I learned that the hard way. 1557 01:10:28,690 --> 01:10:32,640 But anyway, so I take this square root matrix, 1558 01:10:32,640 --> 01:10:40,240 if I pre-multiply that square root matrix by row after row 1559 01:10:40,240 --> 01:10:46,700 of normals, I will get out an array 1560 01:10:46,700 --> 01:10:49,580 that has the same covariance structure as that 1561 01:10:49,580 --> 01:10:50,606 with which I started. 1562 01:10:54,140 --> 01:10:56,130 Another story here, I've been using 1563 01:10:56,130 --> 01:10:58,710 the same eigenvalue-- I believe in full attribution, 1564 01:10:58,710 --> 01:10:59,660 I'm not a clever guy. 1565 01:10:59,660 --> 01:11:02,660 I have not an original thought in my head. 1566 01:11:02,660 --> 01:11:04,580 And whenever I use someone else's stuff, 1567 01:11:04,580 --> 01:11:06,830 I give them credit for it. 1568 01:11:06,830 --> 01:11:08,606 And the guy who wrote the code that 1569 01:11:08,606 --> 01:11:10,230 did the eigenvalue decomposition-- this 1570 01:11:10,230 --> 01:11:13,740 is something that was translated from Fortran IV. 1571 01:11:13,740 --> 01:11:15,820 It wasn't even [INAUDIBLE], there's 1572 01:11:15,820 --> 01:11:18,570 a dichotomy in the world. 1573 01:11:18,570 --> 01:11:21,100 There are people that have written Fortran, 1574 01:11:21,100 --> 01:11:22,100 and people that haven't. 
1575 01:11:22,100 --> 01:11:24,730 I'm guessing that there are two people in this room that have 1576 01:11:24,730 --> 01:11:27,020 ever written a line of Fortran. 1577 01:11:27,020 --> 01:11:29,580 Anyone here? 1578 01:11:29,580 --> 01:11:31,530 Just saying. 1579 01:11:31,530 --> 01:11:34,882 Yeah, with cards or without cards? 1580 01:11:34,882 --> 01:11:36,740 PROFESSOR: [INAUDIBLE]. 1581 01:11:36,740 --> 01:11:38,410 KENNETH ABBOTT: I didn't use cards. 1582 01:11:38,410 --> 01:11:40,410 See, you're an old-timer because you used cards. 1583 01:11:44,226 --> 01:11:47,701 The punch line is, I've been using this guy's code. 1584 01:11:47,701 --> 01:11:48,950 And I could show you the code. 1585 01:11:48,950 --> 01:11:50,650 It's like the Lone Ranger, I didn't even 1586 01:11:50,650 --> 01:11:52,370 get a chance to thank him. 1587 01:11:52,370 --> 01:11:55,360 Because he didn't put his name on the code. 1588 01:11:55,360 --> 01:11:57,430 On the internet now, if you do something clever 1589 01:11:57,430 --> 01:11:58,610 on the quant newsgroups, you're going 1590 01:11:58,610 --> 01:11:59,980 to post your name all over it. 1591 01:11:59,980 --> 01:12:02,840 I've been wanting to thank this guy for like 20 years 1592 01:12:02,840 --> 01:12:04,040 and I haven't been able to. 1593 01:12:04,040 --> 01:12:06,100 Anyway, eigenvalue code that's been translated. 1594 01:12:06,100 --> 01:12:07,660 Let me show you what this means. 1595 01:12:10,841 --> 01:12:11,840 Here's some source data. 1596 01:12:14,770 --> 01:12:17,446 Here's some percentage changes. 1597 01:12:17,446 --> 01:12:20,120 Just like we talked about. 1598 01:12:20,120 --> 01:12:23,880 Here is the empirical correlation 1599 01:12:23,880 --> 01:12:26,970 of those percentage changes. 1600 01:12:26,970 --> 01:12:30,570 So the correlation of my government 10 year to my AAA 10 1601 01:12:30,570 --> 01:12:32,590 year is 0.83. 1602 01:12:32,590 --> 01:12:33,852 To my AA, 0.84. 
1603 01:12:33,852 --> 01:12:36,310 All right, you see this. 1604 01:12:36,310 --> 01:12:38,690 And I have this covariance matrix 1605 01:12:38,690 --> 01:12:42,320 which is the-- the correlation matrix is a scaled version 1606 01:12:42,320 --> 01:12:44,470 of the covariance matrix. 1607 01:12:44,470 --> 01:12:46,720 And I do a little bit of statistical legerdemain. 1608 01:12:51,582 --> 01:12:52,790 Eigenvalues and eigenvectors. 1609 01:12:55,320 --> 01:12:56,985 Take the square root of that. 1610 01:12:56,985 --> 01:12:59,660 And again, I'd love to spend a lot more time on this, 1611 01:12:59,660 --> 01:13:03,090 but we just don't-- suffice to say, 1612 01:13:03,090 --> 01:13:05,820 I call this a transformation matrix, that's my term. 1613 01:13:05,820 --> 01:13:12,235 This matrix here is this. 1614 01:13:12,235 --> 01:13:13,760 If we had another hour and a half 1615 01:13:13,760 --> 01:13:15,710 I'd take the step by step to get you there. 1616 01:13:15,710 --> 01:13:18,400 The proof of which is left to the reader as an exercise. 1617 01:13:18,400 --> 01:13:20,930 I'll leave this spreadsheet for you, I'll send it to you. 1618 01:13:20,930 --> 01:13:22,600 I have this matrix. 1619 01:13:22,600 --> 01:13:24,672 This matrix is like a prism. 1620 01:13:24,672 --> 01:13:26,380 I'm going to pass white light through it, 1621 01:13:26,380 --> 01:13:29,390 I'm going to get a beautiful rainbow. 1622 01:13:29,390 --> 01:13:32,120 Let me show you what I mean. 1623 01:13:32,120 --> 01:13:34,770 So remember that matrix, this matrix I'm calling t. 1624 01:13:39,710 --> 01:13:41,410 Remember my matrix is 10 by 10. 1625 01:13:41,410 --> 01:13:47,095 One, two, three, four, five, six, seven, eight, nine, ten. 1626 01:13:47,095 --> 01:13:50,410 10 columns of data. 1627 01:13:50,410 --> 01:13:53,614 10 by 10 correlation matrix. 1628 01:13:53,614 --> 01:13:54,548 Let's check. 
1629 01:14:01,490 --> 01:14:08,980 Now I've got row vectors of-- sorry-- uncorrelated 1630 01:14:08,980 --> 01:14:09,760 random normals. 1631 01:14:15,840 --> 01:14:20,710 So what I'm doing then is I'm pre-multiplying 1632 01:14:20,710 --> 01:14:26,860 that transformation matrix row by row by each row 1633 01:14:26,860 --> 01:14:28,960 of uncorrelated random normals. 1634 01:14:28,960 --> 01:14:33,940 And what I get is correlated random normals. 1635 01:14:33,940 --> 01:14:37,110 So what I'm telling you here is this array 1636 01:14:37,110 --> 01:14:44,080 happens to be 10 wide and 1,000 long. 1637 01:14:44,080 --> 01:14:45,810 And I'm telling you that I started 1638 01:14:45,810 --> 01:14:53,200 with my historical data-- let me see how much data I have there. 1639 01:14:53,200 --> 01:14:57,230 A couple hundred observations of historical data. 1640 01:14:57,230 --> 01:15:03,940 And what I've done is once I have that covariance structure, 1641 01:15:03,940 --> 01:15:17,210 I can create a data set here which 1642 01:15:17,210 --> 01:15:25,300 has the same statistical properties as this. 1643 01:15:32,530 --> 01:15:35,280 Not quite the same. 1644 01:15:35,280 --> 01:15:39,545 It can have the same means and the same variances. 1645 01:15:39,545 --> 01:15:41,430 This is what Monte Carlo simulation is about. 1646 01:15:41,430 --> 01:15:43,804 I wish we had another hour because I'd like to spend time 1647 01:15:43,804 --> 01:15:45,360 and-- this is one of these things, 1648 01:15:45,360 --> 01:15:49,000 and again, when I first saw this, I was like, oh my god. 1649 01:15:49,000 --> 01:15:51,176 I felt like I got the keys to the kingdom. 1650 01:15:51,176 --> 01:15:53,550 And I did, this is manually, did it all on a spreadsheet. 1651 01:15:53,550 --> 01:15:54,966 Didn't believe anyone else's code, 1652 01:15:54,966 --> 01:15:57,080 did it all on a spreadsheet.
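The spreadsheet steps can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the code from the lecture: the 3-by-3 target matrix is made up (the spreadsheet uses a 10-by-10 matrix), and `transformation_matrix` is a hypothetical helper name. It factors the matrix into eigenvectors and eigenvalues, builds the square root matrix T, and multiplies rows of uncorrelated normals by T.

```python
import numpy as np

def transformation_matrix(cov):
    """Return T with T.T @ T == cov, built from the eigen-decomposition of cov."""
    eigenvalues, eigenvectors = np.linalg.eigh(cov)       # cov = E @ diag(lambda) @ E.T
    if eigenvalues.min() < -1e-10:
        raise ValueError("matrix is not positive semidefinite")
    eigenvalues = np.clip(eigenvalues, 0.0, None)         # drop tiny negative rounding noise
    return np.diag(np.sqrt(eigenvalues)) @ eigenvectors.T

# Hypothetical target correlation matrix for three series.
target = np.array([
    [1.00, 0.83, 0.50],
    [0.83, 1.00, 0.60],
    [0.50, 0.60, 1.00],
])

T = transformation_matrix(target)

rng = np.random.default_rng(1)
uncorrelated = rng.standard_normal((1000, 3))             # 1,000 rows of independent normals
correlated = uncorrelated @ T                             # each row times T

print(np.round(np.corrcoef(correlated, rowvar=False), 2))
```

The reason it works is one line of algebra: if each row z has identity covariance, then z T has covariance T' T, which is the eigenvector matrix times the eigenvalues times its transpose, the matrix that was factored in the first place.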
1653 01:15:57,080 --> 01:16:03,149 But what that means-- quickly, let me just go back over here 1654 01:16:03,149 --> 01:16:03,690 for a second. 1655 01:16:07,660 --> 01:16:09,630 I happen to have about 800 observations here. 1656 01:16:12,230 --> 01:16:14,450 Historical observations. 1657 01:16:14,450 --> 01:16:22,370 What I did was I happened to generate 1,000 samples here. 1658 01:16:22,370 --> 01:16:25,530 But I could generate 10,000 or 100,000, 1659 01:16:25,530 --> 01:16:27,740 or a million or 10 million or a billion 1660 01:16:27,740 --> 01:16:29,570 just by doing more random normals. 1661 01:16:29,570 --> 01:16:31,640 I could generate-- in effect, what 1662 01:16:31,640 --> 01:16:36,695 I'm generating here is synthetic time series that 1663 01:16:36,695 --> 01:16:39,680 have properties similar to my underlying data. 1664 01:16:42,250 --> 01:16:44,430 That's what Monte Carlo simulation is about. 1665 01:16:44,430 --> 01:16:48,100 The means and the variances and the covariances of this data 1666 01:16:48,100 --> 01:16:50,470 set are just like that. 1667 01:16:50,470 --> 01:16:53,700 Now, again, true story, when somebody first showed me this 1668 01:16:53,700 --> 01:16:54,680 I did not believe them. 1669 01:16:54,680 --> 01:16:58,346 So I developed a bunch of little tests. 1670 01:16:58,346 --> 01:17:02,210 And I said, let me just look at the correlation of my Monte 1671 01:17:02,210 --> 01:17:07,430 Carlo data versus my original correlation matrix. 1672 01:17:07,430 --> 01:17:13,270 So 0.83, 0.84, 0.85, 0.85, 0.67, 0.81. 1673 01:17:13,270 --> 01:17:16,510 You look at the corresponding ones of the random numbers I 1674 01:17:16,510 --> 01:17:21,890 just generated, 0.81, 0.82, 0.84, 0.84, 0.64, 0.52. 1675 01:17:21,890 --> 01:17:24,641 0.54 versus 0.52. 1676 01:17:24,641 --> 01:17:26,580 0.18 versus 0.12. 1677 01:17:26,580 --> 01:17:28,600 0.51 versus 0.47. 1678 01:17:28,600 --> 01:17:32,270 Somebody want to tell me why they're not spot on?
1679 01:17:32,270 --> 01:17:34,150 Sampling error. 1680 01:17:34,150 --> 01:17:37,270 The more data I use the closer it will get to that. 1681 01:17:37,270 --> 01:17:41,290 If I do 1 million, I'd better get right on top of that. 1682 01:17:41,290 --> 01:17:43,450 Does that make sense? 1683 01:17:43,450 --> 01:17:46,490 So what I'm telling you here is that I can 1684 01:17:46,490 --> 01:17:48,080 generate synthetic time series. 1685 01:17:48,080 --> 01:17:49,559 Now, why would I generate so many? 1686 01:17:49,559 --> 01:17:51,100 Well because, remember, I care what's 1687 01:17:51,100 --> 01:17:53,260 going on out in that tail. 1688 01:17:53,260 --> 01:17:56,220 If I only have 100 observations and I'm looking empirically 1689 01:17:56,220 --> 01:18:01,040 at my tail, I've only got one observation out in the 1% tail. 1690 01:18:01,040 --> 01:18:04,040 And that doesn't tell me a whole lot about what's going on. 1691 01:18:04,040 --> 01:18:06,100 If I can simulate that distribution exactly, 1692 01:18:06,100 --> 01:18:08,635 I can say, you know what, I want a billion observations 1693 01:18:08,635 --> 01:18:10,740 in that tail. 1694 01:18:10,740 --> 01:18:12,000 Now we can look at that tail. 1695 01:18:18,506 --> 01:18:19,880 If I have 1 billion observations, 1696 01:18:19,880 --> 01:18:22,750 let's say I'm looking at some kind of normal distribution. 1697 01:18:22,750 --> 01:18:27,200 I'm circling it out here, I'm seeing-- 1698 01:18:27,200 --> 01:18:29,930 I can really dig in and see what the properties of this thing 1699 01:18:29,930 --> 01:18:30,560 are. 1700 01:18:30,560 --> 01:18:32,980 In fact, this can really only take two distributions, 1701 01:18:32,980 --> 01:18:34,063 and really, it's only one. 1702 01:18:34,063 --> 01:18:36,870 But that's another story. 
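The sampling-error point is easy to check numerically. The Python sketch below assumes NumPy and a made-up two-asset target correlation of 0.83; it repeats the eigenvalue construction for several sample sizes and reports how far the simulated correlation lands from the target. The function name and seed are arbitrary.

```python
import numpy as np

def max_correlation_error(target, n_samples, seed):
    """Largest absolute gap between the simulated and the target correlation."""
    eigenvalues, eigenvectors = np.linalg.eigh(target)
    T = np.diag(np.sqrt(eigenvalues)) @ eigenvectors.T    # square root matrix
    rng = np.random.default_rng(seed)
    simulated = rng.standard_normal((n_samples, target.shape[0])) @ T
    return float(np.abs(np.corrcoef(simulated, rowvar=False) - target).max())

target = np.array([[1.0, 0.83], [0.83, 1.0]])             # hypothetical two-asset example
for n in (100, 1_000, 10_000, 1_000_000):
    print(n, round(max_correlation_error(target, n, seed=3), 4))
```

Any single run can bounce around, but the gap shrinks roughly with the square root of the number of draws, so a million draws should sit very close to the target.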
1703 01:18:36,870 --> 01:18:39,210 So what I do in Monte Carlo simulations, 1704 01:18:39,210 --> 01:18:45,800 I'm simulating these outcomes so we can get a lot more meat 1705 01:18:45,800 --> 01:18:48,419 in this tail to understand what's happening out there. 1706 01:18:48,419 --> 01:18:49,460 Does it drop off quickly? 1707 01:18:49,460 --> 01:18:50,970 Does it not drop off quickly? 1708 01:18:50,970 --> 01:18:53,920 That's kind of what it's about. 1709 01:18:53,920 --> 01:18:56,850 So we're about out of time. 1710 01:18:56,850 --> 01:18:59,850 We just covered like four weeks of material, all right? 1711 01:18:59,850 --> 01:19:00,990 But you guys are from MIT. 1712 01:19:00,990 --> 01:19:02,680 I have complete confidence in you. 1713 01:19:02,680 --> 01:19:03,950 I say that to the people who work for me. 1714 01:19:03,950 --> 01:19:05,700 I have complete confidence in your ability 1715 01:19:05,700 --> 01:19:08,270 to get that done by tomorrow morning. 1716 01:19:11,560 --> 01:19:13,020 Questions or comments? 1717 01:19:13,020 --> 01:19:16,050 I know you're sipping from the fire hose here. 1718 01:19:16,050 --> 01:19:17,160 I fully appreciate that. 1719 01:19:20,350 --> 01:19:21,610 So those are examples. 1720 01:19:21,610 --> 01:19:24,210 When I do this with historical simulation 1721 01:19:24,210 --> 01:19:29,870 I won't generate these Monte Carlo trials, 1722 01:19:29,870 --> 01:19:32,070 I'll just use historical data. 1723 01:19:32,070 --> 01:19:35,740 And my fat tails are built into it. 1724 01:19:35,740 --> 01:19:38,250 But what I've shown you today is what 1725 01:19:38,250 --> 01:19:42,030 we developed a one-asset VaR model, 1726 01:19:42,030 --> 01:19:47,530 then we developed a multi-asset variance/covariance model. 
1727 01:19:47,530 --> 01:19:50,520 And then I showed you quickly, and in far less time 1728 01:19:50,520 --> 01:19:52,050 than I would like to have shown you, 1729 01:19:52,050 --> 01:19:57,060 is how I can use another statistical technique, which 1730 01:19:57,060 --> 01:20:01,950 is called the Gaussian copula, to generate synthetic data 1731 01:20:01,950 --> 01:20:05,470 sets that will have the same properties as my source 1732 01:20:05,470 --> 01:20:07,470 historical data. 1733 01:20:07,470 --> 01:20:08,060 All right? 1734 01:20:10,982 --> 01:20:12,444 There you have it. 1735 01:20:12,444 --> 01:20:12,944 [APPLAUSE] 1736 01:20:12,944 --> 01:20:16,304 Oh you don't have to-- please, please, please. 1737 01:20:16,304 --> 01:20:18,470 And I'll tell you, for me, one of the coolest things 1738 01:20:18,470 --> 01:20:20,240 was actually being able to apply so much 1739 01:20:20,240 --> 01:20:22,410 of the math I learned in high school and in college 1740 01:20:22,410 --> 01:20:23,826 and never thought I'd apply again. 1741 01:20:23,826 --> 01:20:25,860 One of my best moments was actually 1742 01:20:25,860 --> 01:20:27,355 finding a use for trigonometry. 1743 01:20:30,110 --> 01:20:34,030 If you're not an engineer, where are you going to use it? 1744 01:20:34,030 --> 01:20:35,030 Where do you use it? 1745 01:20:35,030 --> 01:20:36,090 Seasonals. 1746 01:20:36,090 --> 01:20:38,210 You do seasonal estimation. 1747 01:20:38,210 --> 01:20:41,489 And what you do is you do fast Fourier transform. 1748 01:20:41,489 --> 01:20:43,280 Because I can describe any seasonal pattern 1749 01:20:43,280 --> 01:20:45,990 with a linear combination of sine and cosine functions. 1750 01:20:45,990 --> 01:20:47,475 And it actually works. 1751 01:20:47,475 --> 01:20:49,600 I have my students do it as an exercise every year. 1752 01:20:49,600 --> 01:20:52,080 I say, go get New York city temperature data. 
1753 01:20:52,080 --> 01:20:54,650 And show me some linear combination 1754 01:20:54,650 --> 01:20:57,230 of sine and cosine functions that 1755 01:20:57,230 --> 01:21:00,740 will show me the seasonal pattern of temperature data. 1756 01:21:00,740 --> 01:21:04,952 And when I first realized I could use trigonometry, yes! 1757 01:21:04,952 --> 01:21:07,460 It wasn't a waste of time. 1758 01:21:07,460 --> 01:21:08,960 I still-- polar coordinates, I still 1759 01:21:08,960 --> 01:21:11,165 haven't found a use for that one. 1760 01:21:11,165 --> 01:21:11,790 But it's there. 1761 01:21:11,790 --> 01:21:12,890 I know it's there. 1762 01:21:12,890 --> 01:21:13,790 All right? 1763 01:21:13,790 --> 01:21:15,340 Go home.
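As a rough illustration of the seasonal exercise, here is a Python sketch assuming NumPy. It uses ordinary least squares on one sine and one cosine term rather than a fast Fourier transform, and the temperatures are synthetic (a made-up mean, swing and noise level), not real New York City data.

```python
import numpy as np

# Hypothetical monthly average temperatures (deg F) over two years.
months = np.arange(24)
true_mean, true_amp, peak_month = 55.0, 22.0, 6.5
rng = np.random.default_rng(5)
seasonal = true_amp * np.cos(2 * np.pi * (months - peak_month) / 12)
temps = true_mean + seasonal + rng.normal(0.0, 1.5, size=months.size)

# Regress on a constant, a sine and a cosine with a 12-month period.
angle = 2 * np.pi * months / 12
X = np.column_stack([np.ones_like(angle), np.sin(angle), np.cos(angle)])
coef, *_ = np.linalg.lstsq(X, temps, rcond=None)

fitted = X @ coef                                  # the estimated seasonal pattern
amplitude = float(np.hypot(coef[1], coef[2]))      # size of the seasonal swing
print(np.round(coef, 2), round(amplitude, 2))
```

One sine and one cosine at the same frequency are enough to capture both the size of the swing and where in the year it peaks; more complicated seasonal shapes would need additional harmonics.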