1 00:00:01,540 --> 00:00:03,910 The following content is provided under a Creative 2 00:00:03,910 --> 00:00:05,300 Commons license. 3 00:00:05,300 --> 00:00:07,510 Your support will help MIT OpenCourseWare 4 00:00:07,510 --> 00:00:11,600 continue to offer high quality educational resources for free. 5 00:00:11,600 --> 00:00:14,140 To make a donation or to view additional materials 6 00:00:14,140 --> 00:00:17,885 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,885 --> 00:00:18,510 at ocw.mit.edu. 8 00:00:18,510 --> 00:00:23,510 W 9 00:00:23,510 --> 00:00:25,010 WILLIAM GREEN: So let's get started. 10 00:00:28,260 --> 00:00:30,470 This is my last lecture of the class, 11 00:00:30,470 --> 00:00:31,850 and I want to thank you guys. 12 00:00:31,850 --> 00:00:33,290 This has been a really good class. 13 00:00:33,290 --> 00:00:35,930 I really enjoyed the good questions on the forum 14 00:00:35,930 --> 00:00:37,790 especially. 15 00:00:37,790 --> 00:00:41,100 So I don't know if you guys enjoyed it, 16 00:00:41,100 --> 00:00:44,120 but I liked it anyway. 17 00:00:44,120 --> 00:00:45,484 So I wanted to give-- 18 00:00:45,484 --> 00:00:47,150 my lecture here is going to be a wrap up 19 00:00:47,150 --> 00:00:49,610 on the stochastic methods, and then professor Swan's 20 00:00:49,610 --> 00:00:51,260 going to give a review on Wednesday. 21 00:00:51,260 --> 00:00:54,140 It will be the last lecture of this class. 22 00:00:54,140 --> 00:00:58,121 And then the final is a week from today, I think. 23 00:00:58,121 --> 00:00:59,950 Morning? 24 00:00:59,950 --> 00:01:01,200 Yeah, morning. 25 00:01:01,200 --> 00:01:01,700 All right. 26 00:01:05,470 --> 00:01:06,040 All right. 27 00:01:06,040 --> 00:01:10,820 So as you did in the homework and we talked about, 28 00:01:10,820 --> 00:01:14,980 there is a lot of multidimensional integrals 29 00:01:14,980 --> 00:01:17,020 we'd like to be able to evaluate. 
30 00:01:17,020 --> 00:01:19,930 And a lot of them have this form that 31 00:01:19,930 --> 00:01:22,270 is a probability density times some f 32 00:01:22,270 --> 00:01:24,580 that we're trying to evaluate. 33 00:01:24,580 --> 00:01:25,824 That's the integrand. 34 00:01:25,824 --> 00:01:28,240 And then I didn't draw in the millions of integral symbols 35 00:01:28,240 --> 00:01:29,050 around it. 36 00:01:29,050 --> 00:01:31,670 But usually x is a very high dimensional thing, 37 00:01:31,670 --> 00:01:34,780 so you have to integrate a lot of things. 38 00:01:34,780 --> 00:01:37,930 And often we actually don't really know p. 39 00:01:37,930 --> 00:01:41,470 We know a weighting factor, w. 40 00:01:41,470 --> 00:01:43,330 So I just drew three of the integrals there, 41 00:01:43,330 --> 00:01:46,750 but there might be 10 to the 23rd integrals there. 42 00:01:46,750 --> 00:01:49,090 So w is a lot easier for us to deal with than anything 43 00:01:49,090 --> 00:01:51,740 with all those integral signs. 44 00:01:51,740 --> 00:01:53,490 So for example, the Boltzmann distribution 45 00:01:53,490 --> 00:01:54,670 is one of these things. 46 00:01:54,670 --> 00:01:56,860 And also, just recall back earlier, 47 00:01:56,860 --> 00:01:59,980 we were talking about models versus data. 48 00:01:59,980 --> 00:02:01,750 The Bayesian analysis of experiments 49 00:02:01,750 --> 00:02:07,519 says that you take your likelihood function, 50 00:02:07,519 --> 00:02:09,310 the likelihood that you would have observed 51 00:02:09,310 --> 00:02:11,680 the data you did if the true parameters had 52 00:02:11,680 --> 00:02:14,590 a certain value, theta.
53 00:02:14,590 --> 00:02:16,900 And then you multiply that times the prior knowledge 54 00:02:16,900 --> 00:02:20,110 of the parameter values, and that gives you the weighting 55 00:02:20,110 --> 00:02:23,680 factor really for any integrals involving parameters, 56 00:02:23,680 --> 00:02:29,980 because it's really giving you the w of data given everything 57 00:02:29,980 --> 00:02:30,760 you know. 58 00:02:30,760 --> 00:02:35,240 Prior information, and you also know the new data you measured. 59 00:02:35,240 --> 00:02:38,309 And put them all together, you get the weighting factor 60 00:02:38,309 --> 00:02:39,100 for the parameters. 61 00:02:39,100 --> 00:02:41,740 And so you can evaluate all kinds 62 00:02:41,740 --> 00:02:44,745 of multidimensional integrals over all the parameters 63 00:02:44,745 --> 00:02:45,370 in the problem. 64 00:02:45,370 --> 00:02:47,530 So very often you might do experiments 65 00:02:47,530 --> 00:02:49,930 where you maybe have four or five or six 66 00:02:49,930 --> 00:02:52,630 adjustable parameters, theta, some of which 67 00:02:52,630 --> 00:02:54,910 you know something about ahead of time, 68 00:02:54,910 --> 00:02:57,430 and you want to put them together this way. 69 00:02:57,430 --> 00:02:59,710 And that should be a summary of everything you know. 70 00:02:59,710 --> 00:03:01,210 But it's sort of a goofball summary, 71 00:03:01,210 --> 00:03:03,670 because it's a multidimensional function, right, 72 00:03:03,670 --> 00:03:05,679 a function with a lot of variables. 73 00:03:05,679 --> 00:03:07,720 But now you know how to do integrals of functions 74 00:03:07,720 --> 00:03:08,678 with lots of variables. 75 00:03:08,678 --> 00:03:14,577 So you can compute things like some linear combination 76 00:03:14,577 --> 00:03:16,660 of-- what's the expectation value of some function 77 00:03:16,660 --> 00:03:18,050 of those parameters. 
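Written out, the weighting factor and the expectation value described here could take the following form. This is only a sketch: the symbols D (the measured data), L (the likelihood function) and p_prior (the prior knowledge of the parameters) are notation assumed for illustration, while w, f and theta are the names used in the lecture.

```latex
w(\theta) = L(D \mid \theta)\, p_{\mathrm{prior}}(\theta),
\qquad
\langle f \rangle = \frac{\int f(\theta)\, w(\theta)\, d\theta}{\int w(\theta)\, d\theta}
```

The denominator normalizes w, which is why only the weighting factor, and not the normalized probability density p, has to be known.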
78 00:03:18,050 --> 00:03:21,300 So if f, f here was a function of theta-- 79 00:03:21,300 --> 00:03:22,353 why is it bouncing? 80 00:03:25,740 --> 00:03:26,420 OK. 81 00:03:26,420 --> 00:03:28,790 I wonder why you guys are always look like this. 82 00:03:31,831 --> 00:03:32,330 All right. 83 00:03:32,330 --> 00:03:37,094 So f of theta, of any function you want, you can evaluate-- 84 00:03:37,094 --> 00:03:38,510 it's a function of the parameters, 85 00:03:38,510 --> 00:03:41,000 like prediction of what will happen in some new experiment 86 00:03:41,000 --> 00:03:43,372 that depends on the parameters. 87 00:03:43,372 --> 00:03:45,080 You'd have it as f of theta, and then you 88 00:03:45,080 --> 00:03:46,850 could integrate it over w of theta 89 00:03:46,850 --> 00:03:52,400 to get the expected value of the result of the new experiment. 90 00:03:52,400 --> 00:03:54,090 And also actually the distribution 91 00:03:54,090 --> 00:03:56,740 you have to get that way. 92 00:03:56,740 --> 00:03:59,475 Does this make any sense? 93 00:03:59,475 --> 00:04:01,100 I see one person thinks it makes sense. 94 00:04:03,990 --> 00:04:05,350 OK. 95 00:04:05,350 --> 00:04:08,230 So just because we did Monte Carlo using Boltzmann, 96 00:04:08,230 --> 00:04:10,081 which is a common one to use, there's 97 00:04:10,081 --> 00:04:11,080 other weighting factors. 98 00:04:11,080 --> 00:04:12,190 I guess that's my comment. 99 00:04:12,190 --> 00:04:13,731 And anytime you end up something that 100 00:04:13,731 --> 00:04:16,660 has some integrals with weighting factors, 101 00:04:16,660 --> 00:04:21,399 then you should think, oh, I can use Metropolis Monte Carlo. 102 00:04:21,399 --> 00:04:24,820 And also, anytime you have experiments, a lot of times 103 00:04:24,820 --> 00:04:27,460 you think of the, OK, I want to get the best fit least squares 104 00:04:27,460 --> 00:04:29,126 approach, which is what you already knew 105 00:04:29,126 --> 00:04:31,480 before I taught you anything. 
106 00:04:31,480 --> 00:04:33,970 But it's very important to remember that you can also 107 00:04:33,970 --> 00:04:36,550 write the actual probability density, 108 00:04:36,550 --> 00:04:38,717 the shape of the probability density of the parameters. 109 00:04:38,717 --> 00:04:40,591 And that way you can get all the correlations 110 00:04:40,591 --> 00:04:42,190 between the parameters and include 111 00:04:42,190 --> 00:04:44,190 all the previous knowledge of the parameters 112 00:04:44,190 --> 00:04:45,260 and stuff like that. 113 00:04:45,260 --> 00:04:48,500 So a very important thing to remember from this course 114 00:04:48,500 --> 00:04:50,590 is this Bayesian formula. 115 00:04:50,590 --> 00:04:51,110 All right. 116 00:04:54,889 --> 00:04:57,180 But Metropolis Monte Carlo, as you saw in the homework, 117 00:04:57,180 --> 00:05:00,750 is not always that easy. 118 00:05:00,750 --> 00:05:05,730 So you need to choose the step length in every dimension. 119 00:05:05,730 --> 00:05:07,230 It's sort of the factor you're going 120 00:05:07,230 --> 00:05:10,740 to multiply times random numbers to figure out how far to step 121 00:05:10,740 --> 00:05:12,610 from one state to the next. 122 00:05:12,610 --> 00:05:18,234 And it's not so obvious how to choose that before you start 123 00:05:18,234 --> 00:05:19,650 because you may not know that much 124 00:05:19,650 --> 00:05:23,791 about the shape of the integrand, right? 125 00:05:23,791 --> 00:05:26,290 Actually, even with the shape of the p, the probability 126 00:05:26,290 --> 00:05:28,880 distribution, you might not be that sure about it.
127 00:05:28,880 --> 00:05:33,710 And so if you accidentally choose it too large, 128 00:05:33,710 --> 00:05:36,190 what will happen is it tries to take big steps away 129 00:05:36,190 --> 00:05:39,640 from a probable state and ends up at really improbable states, 130 00:05:39,640 --> 00:05:42,290 and then most of the time it won't accept that transition, 131 00:05:42,290 --> 00:05:42,790 right? 132 00:05:42,790 --> 00:05:44,770 Metropolis Monte Carlo will just repeat the same state 133 00:05:44,770 --> 00:05:45,603 over and over again. 134 00:05:45,603 --> 00:05:48,490 So if you see you just keep sitting there getting 135 00:05:48,490 --> 00:05:50,551 the same value over and over and over again, 136 00:05:50,551 --> 00:05:52,300 then that's warning you that you must have 137 00:05:52,300 --> 00:05:55,020 chosen your stepsize too large. 138 00:05:55,020 --> 00:05:57,330 Alternatively, if you choose it too small, 139 00:05:57,330 --> 00:05:59,820 all your steps will be accepted because you're not 140 00:05:59,820 --> 00:06:00,795 moving anywhere. 141 00:06:00,795 --> 00:06:02,670 So you're basically staying at the same point 142 00:06:02,670 --> 00:06:05,336 over and over again, except it's slightly different, because you 143 00:06:05,336 --> 00:06:07,230 took a little tiny step. 144 00:06:07,230 --> 00:06:08,970 But that way you won't necessarily 145 00:06:08,970 --> 00:06:10,566 sample the whole range. 146 00:06:10,566 --> 00:06:11,940 And you can see that by plotting, 147 00:06:14,540 --> 00:06:19,840 say, the distance from the initial state position. 148 00:06:19,840 --> 00:06:24,270 You initiated the Monte Carlo Markov chain at some state.
149 00:06:24,270 --> 00:06:28,530 And if you plot the probability density of 150 00:06:28,530 --> 00:06:31,810 the distance from all the new states to the original state, 151 00:06:31,810 --> 00:06:33,884 and if that number is super teeny tiny, 152 00:06:33,884 --> 00:06:36,050 then you might be a little concerned that you really 153 00:06:36,050 --> 00:06:37,902 didn't cover the whole space. 154 00:06:37,902 --> 00:06:40,110 But again, you have to know what is super teeny tiny, 155 00:06:40,110 --> 00:06:41,330 so you have to have a range in mind. 156 00:06:41,330 --> 00:06:41,829 Yeah? 157 00:06:41,829 --> 00:06:43,640 AUDIENCE: Can you do this adaptively? 158 00:06:43,640 --> 00:06:44,973 WILLIAM GREEN: Yeah, yeah, yeah. 159 00:06:44,973 --> 00:06:47,180 So good algorithms for this try to pick up 160 00:06:47,180 --> 00:06:48,596 that there's a problem, like if it 161 00:06:48,596 --> 00:06:50,485 keeps on not accepting any states, 162 00:06:50,485 --> 00:06:53,340 it would try to shrink the step, the delta. 163 00:06:53,340 --> 00:06:55,320 Or if it's always accepting the states, 164 00:06:55,320 --> 00:06:57,910 then it might try to increase it. 165 00:06:57,910 --> 00:06:59,910 You want to accept actually most of the transitions 166 00:06:59,910 --> 00:07:01,410 because it makes it more efficient. 167 00:07:01,410 --> 00:07:02,410 But you don't accept them all. 168 00:07:02,410 --> 00:07:03,150 So you want to do some-- 169 00:07:03,150 --> 00:07:05,540 I don't know-- keep it to 0.9 probability or something 170 00:07:05,540 --> 00:07:07,599 like that, or 0.8. 171 00:07:07,599 --> 00:07:09,390 And there is probably a whole journal paper 172 00:07:09,390 --> 00:07:13,110 about what's the optimal way to do the adaptive stepsizing 173 00:07:13,110 --> 00:07:14,580 to get it to converge the best. 174 00:07:14,580 --> 00:07:15,270 Yeah? 175 00:07:15,270 --> 00:07:18,270 AUDIENCE: How did you prevent your initial guess [INAUDIBLE]?
176 00:07:22,102 --> 00:07:23,560 WILLIAM GREEN: Well, to me, I don't 177 00:07:23,560 --> 00:07:25,070 think that throwing them out is a good idea. 178 00:07:25,070 --> 00:07:25,690 I don't know. 179 00:07:25,690 --> 00:07:27,045 People do it. 180 00:07:27,045 --> 00:07:28,670 I think it's that you really just need 181 00:07:28,670 --> 00:07:33,370 to make enough steps that you are sampling everywhere, 182 00:07:33,370 --> 00:07:34,660 and then it shouldn't matter. 183 00:07:34,660 --> 00:07:36,100 And a good thing to do is actually 184 00:07:36,100 --> 00:07:38,692 start from a completely different initial state 185 00:07:38,692 --> 00:07:40,900 and make sure you get the same value of the integral, 186 00:07:40,900 --> 00:07:43,457 and that will give you a lot more confidence 187 00:07:43,457 --> 00:07:44,790 that you're not sensitive to it. 188 00:07:44,790 --> 00:07:46,870 If you're sensitive to it, you're in trouble. 189 00:07:46,870 --> 00:07:50,181 Because you really won't know how many to throw away. 190 00:07:50,181 --> 00:07:52,180 And you don't know if you actually even achieved 191 00:07:52,180 --> 00:07:54,170 the real integral. 192 00:07:54,170 --> 00:07:56,140 I think in the hydrogen peroxide example, 193 00:07:56,140 --> 00:07:58,000 actually it was symmetrical, so it 194 00:07:58,000 --> 00:07:58,958 didn't matter too much. 195 00:07:58,958 --> 00:08:01,147 But if I had given you an asymmetrical one, 196 00:08:01,147 --> 00:08:01,730 there's like-- 197 00:08:05,675 --> 00:08:07,360 here's like the dihedral angle. 198 00:08:07,360 --> 00:08:10,310 It has to do with the orientation of the H, 199 00:08:10,310 --> 00:08:12,240 like sort of out of the plane.
200 00:08:12,240 --> 00:08:13,840 And it should be symmetrical that you 201 00:08:13,840 --> 00:08:17,200 can sort of plus and minus, and there's 202 00:08:17,200 --> 00:08:18,774 a high probability region over here 203 00:08:18,774 --> 00:08:20,440 and a high probability region over here. 204 00:08:20,440 --> 00:08:21,700 And if you start one of them, say, over here, 205 00:08:21,700 --> 00:08:22,880 and you just sample around here, you 206 00:08:22,880 --> 00:08:24,463 get a lot of points that all have basically 207 00:08:24,463 --> 00:08:26,460 the same value of the dihedral angle. 208 00:08:26,460 --> 00:08:29,110 And if you never see any of these, then it's no good, right? 209 00:08:29,110 --> 00:08:30,190 Now in this particular case, you might still 210 00:08:30,190 --> 00:08:32,414 get the right answer by luck, because they're exactly 211 00:08:32,414 --> 00:08:32,914 symmetrical. 212 00:08:32,914 --> 00:08:34,630 If this one is a little bit asymmetrical, 213 00:08:34,630 --> 00:08:36,159 then you'd get the wrong answer. 214 00:08:36,159 --> 00:08:37,809 Does that make sense? 215 00:08:37,809 --> 00:08:40,390 So you really want to make sure you're really 216 00:08:40,390 --> 00:08:44,660 sampling over the whole physically accessible region. 217 00:08:44,660 --> 00:08:47,720 But again, this is a common problem 218 00:08:47,720 --> 00:08:50,260 for us all the time, is that when you're 219 00:08:50,260 --> 00:08:52,010 doing a calculation, you want to know what 220 00:08:52,010 --> 00:08:54,390 the answer is before you start. 221 00:08:54,390 --> 00:08:56,790 This is an absolutely critical thing. 222 00:08:56,790 --> 00:08:59,440 Because how the heck are you going to know if you have a bug? 223 00:08:59,440 --> 00:08:59,940 Right? 224 00:08:59,940 --> 00:09:02,990 The computer's going to give you some number at the end. 225 00:09:02,990 --> 00:09:06,459 You have to know actually what the answer is before you start. 226 00:09:06,459 --> 00:09:07,750 I know this seems weird, right?
227 00:09:07,750 --> 00:09:08,927 But anyway, it's the truth. 228 00:09:08,927 --> 00:09:10,510 If you're going to calculate anything, 229 00:09:10,510 --> 00:09:12,940 you really want to know what the answer is roughly. 230 00:09:12,940 --> 00:09:16,840 What the units are, what the order of magnitude is. 231 00:09:16,840 --> 00:09:18,192 Some things about it. 232 00:09:18,192 --> 00:09:20,650 And ideally you should try to think of some simple test you 233 00:09:20,650 --> 00:09:23,890 can do to try to check whether your code's right, 234 00:09:23,890 --> 00:09:25,570 whether things are reasonable. 235 00:09:25,570 --> 00:09:28,300 The reasonableness test, is it reasonable 236 00:09:28,300 --> 00:09:29,690 that what which you get. 237 00:09:29,690 --> 00:09:31,390 So this is the same kind of thing. 238 00:09:31,390 --> 00:09:33,980 Like you draw the H2O2 and say, oh, I think 239 00:09:33,980 --> 00:09:35,980 the dipole moment should be about such and such, 240 00:09:35,980 --> 00:09:38,520 because I know the typical charges on an H atom and O 241 00:09:38,520 --> 00:09:39,021 atom. 242 00:09:39,021 --> 00:09:41,186 And if you get some number that's way off from that, 243 00:09:41,186 --> 00:09:43,480 then obviously you made a mistake somewhere, right? 244 00:09:43,480 --> 00:09:44,938 It could be a mistake in your code, 245 00:09:44,938 --> 00:09:47,640 or it could just be you just didn't sample correctly 246 00:09:47,640 --> 00:09:50,970 because your delta was wrong, for example. 247 00:09:50,970 --> 00:09:52,919 All right? 248 00:09:52,919 --> 00:09:54,960 Now I know I've told this to students every time, 249 00:09:54,960 --> 00:09:56,010 and every time I tell this to students, 250 00:09:56,010 --> 00:09:59,040 they totally reject this idea that you should know the answer 251 00:09:59,040 --> 00:10:00,090 before you start. 252 00:10:00,090 --> 00:10:02,490 But it's absolutely critical. 
253 00:10:02,490 --> 00:10:04,665 I mean, my whole job as a professor 254 00:10:04,665 --> 00:10:06,540 is I tell my students, please calculate this. 255 00:10:06,540 --> 00:10:09,540 But before I tell them that, I know what the answer is, right, 256 00:10:09,540 --> 00:10:10,400 roughly. 257 00:10:10,400 --> 00:10:12,650 And so that way when they come back and show it to me, 258 00:10:12,650 --> 00:10:14,210 I can say, oh, it's reasonable, OK, I believe them. 259 00:10:14,210 --> 00:10:14,724 Yeah. 260 00:10:14,724 --> 00:10:16,140 So I think the number should be 20 261 00:10:16,140 --> 00:10:18,360 and they come back with 14.3, I say, 20, 14.3, 262 00:10:18,360 --> 00:10:19,800 they're pretty close. 263 00:10:19,800 --> 00:10:21,250 OK, so I believe them. 264 00:10:21,250 --> 00:10:22,625 And when I think it's 20 and they 265 00:10:22,625 --> 00:10:25,850 come back, it's 10 to the minus 6, then I'm pretty confident, 266 00:10:25,850 --> 00:10:27,020 I tell them, I think you must have made a mistake. 267 00:10:27,020 --> 00:10:28,550 And they're like, no, no, no, I did a great job. 268 00:10:28,550 --> 00:10:30,320 I'm sure my calculus is perfect. 269 00:10:30,320 --> 00:10:31,610 And then I'm like, now we're really probing, 270 00:10:31,610 --> 00:10:32,526 and I think I'm like-- 271 00:10:32,526 --> 00:10:33,130 I don't know-- 272 00:10:33,130 --> 00:10:33,830 I have it in for them. 273 00:10:33,830 --> 00:10:35,621 But no, actually I knew what the answer is, 274 00:10:35,621 --> 00:10:37,460 so obviously it can't be right. 275 00:10:37,460 --> 00:10:39,200 So they must have made a mistake. 276 00:10:39,200 --> 00:10:41,210 Anyway, you should be like it, too. 277 00:10:41,210 --> 00:10:43,550 You want to be able to be self-critical about what 278 00:10:43,550 --> 00:10:46,530 do you expect your answers to be before you do them. 279 00:10:46,530 --> 00:10:47,030 All right. 280 00:10:49,690 --> 00:10:54,410 What else is problematic with Monte Carlo? 
281 00:10:54,410 --> 00:10:56,549 This is very problematic. 282 00:10:56,549 --> 00:10:58,090 If you want to achieve high accuracy, 283 00:10:58,090 --> 00:11:00,280 you need a really large number of states. 284 00:11:00,280 --> 00:11:01,180 And it's funny. 285 00:11:01,180 --> 00:11:04,120 It's like you can get a pretty good accuracy with not very 286 00:11:04,120 --> 00:11:06,620 many samples because of the behavior of one 287 00:11:06,620 --> 00:11:09,120 over square root of n. 288 00:11:09,120 --> 00:11:12,587 So it has very big changes at small values of n. 289 00:11:12,587 --> 00:11:14,670 And with just a relatively small number of samples, you 290 00:11:14,670 --> 00:11:16,950 get something that's halfway reasonable. 291 00:11:16,950 --> 00:11:19,830 But then if you have three significant figures 292 00:11:19,830 --> 00:11:21,482 and you want to get the fourth one, 293 00:11:21,482 --> 00:11:23,940 it's a killer because you have to do 100 times more effort, 294 00:11:23,940 --> 00:11:27,950 100 times more samples to get one more significant figure. 295 00:11:27,950 --> 00:11:29,870 And so you've already done 100,000, 296 00:11:29,870 --> 00:11:31,470 and now all of a sudden you have to do 10 million samples. 297 00:11:31,470 --> 00:11:33,490 And then you only get one more [INAUDIBLE] of that. 298 00:11:33,490 --> 00:11:35,031 Now you have to do a billion samples, 299 00:11:35,031 --> 00:11:37,900 and it's like, this gets really out of hand. 300 00:11:37,900 --> 00:11:40,032 So that's a really unfortunate thing about it.
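The arithmetic behind the 100,000, 10 million, 1 billion progression can be sketched in a few lines. This assumes only that the error scales as one over the square root of n; the function names are hypothetical and chosen for illustration.

```python
import math


def relative_error_scale(n: int) -> float:
    """Scale of the Monte Carlo error for n samples, assuming a 1/sqrt(n) law."""
    return 1.0 / math.sqrt(n)


def samples_for_extra_digits(n_current: int, extra_digits: int) -> int:
    """Samples needed to gain `extra_digits` more significant figures.

    Each extra digit needs a 10x smaller error, and under the 1/sqrt(n)
    law that costs 10**2 = 100 times more samples.
    """
    return n_current * 100 ** extra_digits


# Starting from the 100,000 samples mentioned in the lecture.
for digits in range(3):
    n = samples_for_extra_digits(100_000, digits)
    print(f"{digits} extra digit(s): n = {n:,}, 1/sqrt(n) = {relative_error_scale(n):.1e}")
```

The cost grows geometrically with the number of digits, which is why the first few significant figures are cheap and later ones quickly get out of hand.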
301 00:11:40,032 --> 00:11:42,490 But the nice behavior at the beginning is you get roughly-- 302 00:11:42,490 --> 00:11:44,160 if you only want a few significant figures, 303 00:11:44,160 --> 00:11:46,350 you can get them pretty darn fast with Monte Carlo, 304 00:11:46,350 --> 00:11:47,766 certainly a lot better than trying 305 00:11:47,766 --> 00:11:51,450 to do the trapezoid rule in multiple dimensions or something 306 00:11:51,450 --> 00:11:53,990 like that. 307 00:11:53,990 --> 00:11:56,490 All right. 308 00:11:56,490 --> 00:11:58,230 So you guys just did Monte Carlo. 309 00:11:58,230 --> 00:11:59,480 Are there things I should tell you about 310 00:11:59,480 --> 00:12:01,438 or we should talk about, problems you ran into? 311 00:12:06,145 --> 00:12:08,520 Now you see, when I set that homework problem up for you, 312 00:12:08,520 --> 00:12:09,478 I helped you out a lot. 313 00:12:09,478 --> 00:12:10,870 I don't know if you noticed that. 314 00:12:10,870 --> 00:12:14,490 So the original problem had, it's four atoms, 315 00:12:14,490 --> 00:12:17,350 they each have three xyz positions. 316 00:12:17,350 --> 00:12:19,500 So there's 12 coordinates. 317 00:12:19,500 --> 00:12:22,080 And then I did clever tricks to get it down to, 318 00:12:22,080 --> 00:12:23,670 I think, six, maybe. 319 00:12:23,670 --> 00:12:24,270 OK. 320 00:12:24,270 --> 00:12:26,560 So going from six dimensions to 12 dimensions, 321 00:12:26,560 --> 00:12:28,300 that's actually a pretty big deal. 322 00:12:28,300 --> 00:12:29,940 And so if I had given you the original 12-dimensional 323 00:12:29,940 --> 00:12:31,500 problem, you could still compute it and actually 324 00:12:31,500 --> 00:12:33,320 use the same kind of code as you wrote. 325 00:12:33,320 --> 00:12:35,790 But the number of samples you'd need to get a good number 326 00:12:35,790 --> 00:12:40,550 gets really wildly different.
327 00:12:40,550 --> 00:12:43,130 Right, and so again, this has to do with knowing 328 00:12:43,130 --> 00:12:44,870 the answer ahead of time. 329 00:12:44,870 --> 00:12:47,420 You know, I knew, OK, that the magnitude of the dipole moment 330 00:12:47,420 --> 00:12:48,880 doesn't depend on the orientation of the molecule, 331 00:12:48,880 --> 00:12:49,730 so therefore, I'm going to get rid 332 00:12:49,730 --> 00:12:51,200 of all the rotational degrees of freedom. 333 00:12:51,200 --> 00:12:53,408 I know it doesn't depend on the translational position 334 00:12:53,408 --> 00:12:56,660 of the molecule, so therefore, I get rid of all those. 335 00:12:56,660 --> 00:12:58,479 And so before I even did any calculation, 336 00:12:58,479 --> 00:13:00,770 I can tell you I can get rid of six degrees of freedom, 337 00:13:00,770 --> 00:13:01,885 so that'll help you a lot. 338 00:13:01,885 --> 00:13:03,260 But it's a similar kind of thing. 339 00:13:03,260 --> 00:13:05,270 You want to do that all the time yourself, is try to think 340 00:13:05,270 --> 00:13:07,340 what can I do that's easier than just doing the brute force 341 00:13:07,340 --> 00:13:07,970 problem? 342 00:13:07,970 --> 00:13:08,615 Yeah? 343 00:13:08,615 --> 00:13:11,585 AUDIENCE: So on that problem, changing the max stepsize, 344 00:13:11,585 --> 00:13:14,427 you could change the percentage of step-- or [INAUDIBLE]. 345 00:13:14,427 --> 00:13:15,260 WILLIAM GREEN: Yeah. 346 00:13:15,260 --> 00:13:18,634 AUDIENCE: And I think the mean stayed roughly around the same, 347 00:13:18,634 --> 00:13:22,020 but the shape of the distribution actually changed a little bit. 348 00:13:22,020 --> 00:13:22,770 WILLIAM GREEN: OK. 349 00:13:22,770 --> 00:13:24,410 AUDIENCE: So how is-- 350 00:13:24,410 --> 00:13:25,820 WILLIAM GREEN: So I think it might be this kind of thing 351 00:13:25,820 --> 00:13:26,570 is one part of it.
352 00:13:26,570 --> 00:13:30,846 It's like if you only integrate one lobe of the dihedral, 353 00:13:30,846 --> 00:13:32,720 you actually get a dipole moment value that's 354 00:13:32,720 --> 00:13:35,460 pretty reasonable, even if you totally missed this other lobe. 355 00:13:35,460 --> 00:13:36,959 But if you did a different stepsize, 356 00:13:36,959 --> 00:13:39,320 you might start getting some samples over here, too. 357 00:13:39,320 --> 00:13:41,360 So the distribution would look a lot different, 358 00:13:41,360 --> 00:13:43,276 but you still end up getting roughly the mean. 359 00:13:43,276 --> 00:13:44,969 But also it's partly that the-- 360 00:13:44,969 --> 00:13:46,760 the good thing about the Monte Carlo method 361 00:13:46,760 --> 00:13:48,800 is that, the first few samples, no matter what they are, 362 00:13:48,800 --> 00:13:50,660 give you something that's on the order of magnitude of what 363 00:13:50,660 --> 00:13:52,310 the value of the thing is. 364 00:13:52,310 --> 00:13:54,770 So even a really lousy sampling at the beginning 365 00:13:54,770 --> 00:13:55,880 still gives you something that's halfway 366 00:13:55,880 --> 00:13:56,870 reasonable for the average. 367 00:13:56,870 --> 00:13:58,578 AUDIENCE: So if you're trying to recreate 368 00:13:58,578 --> 00:14:02,250 the overall distribution histogram, 369 00:14:02,250 --> 00:14:06,010 then how do you know which max stepsize to choose, because 370 00:14:06,010 --> 00:14:07,774 [INAUDIBLE]. 371 00:14:07,774 --> 00:14:09,940 WILLIAM GREEN: Yeah, so the rule of thumb I've heard 372 00:14:09,940 --> 00:14:11,860 is that people try to get an acceptance 373 00:14:11,860 --> 00:14:14,730 ratio between 0.2 and 0.8. 374 00:14:14,730 --> 00:14:18,600 So it means you want-- 375 00:14:18,600 --> 00:14:20,100 Yeah. 376 00:14:20,100 --> 00:14:21,210 But I really don't know. 377 00:14:21,210 --> 00:14:22,170 I'm not an expert in this field. 
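A minimal sketch of how the acceptance ratio responds to the step size, under stated assumptions: the one-dimensional two-lobe weighting factor below is a made-up stand-in for the dihedral-angle distribution (not the homework's actual potential), and `weight`, `metropolis` and the chosen delta values are hypothetical names and numbers for illustration only.

```python
import math
import random


def weight(x: float) -> float:
    """Unnormalized weighting factor with two symmetric lobes at x = -1 and x = +1.

    Illustrative only; not the hydrogen peroxide potential from the homework.
    """
    return math.exp(-((x * x - 1.0) ** 2) / 0.1)


def metropolis(x0: float, delta: float, n_steps: int, seed: int = 0):
    """Metropolis sampling of `weight`; returns (samples, acceptance ratio)."""
    rng = random.Random(seed)
    x, w_x = x0, weight(x0)
    accepted = 0
    samples = []
    for _ in range(n_steps):
        # Propose a step: delta times a uniform random number in [-1, 1].
        x_new = x + delta * rng.uniform(-1.0, 1.0)
        w_new = weight(x_new)
        # Accept with probability min(1, w_new / w_x); otherwise repeat the old state.
        if rng.random() < w_new / w_x:
            x, w_x = x_new, w_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


# Too small, moderate, and too large step sizes, all started in the lobe at x = +1.
for delta in (0.01, 0.5, 50.0):
    samples, ratio = metropolis(x0=1.0, delta=delta, n_steps=5000)
    in_other_lobe = sum(1 for s in samples if s < 0.0) / len(samples)
    print(f"delta={delta:6.2f}  acceptance={ratio:.2f}  fraction in other lobe={in_other_lobe:.2f}")
```

With a tiny delta nearly every step is accepted but the chain stays in the lobe where it started; with a huge delta almost nothing is accepted. Rerunning from `x0=-1.0` and comparing averages is one way to apply the "start from a completely different initial state" check.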
378 00:14:22,170 --> 00:14:23,700 But I'm sure you read papers and people have 379 00:14:23,700 --> 00:14:25,950 big discussions about what the acceptance ratio should 380 00:14:25,950 --> 00:14:29,700 be in order to be taking big enough steps 381 00:14:29,700 --> 00:14:32,390 that you have a chance to get the weird stuff. 382 00:14:32,390 --> 00:14:34,140 In this kind of one, it might be a problem 383 00:14:34,140 --> 00:14:35,880 if you're doing steps, say, in the dihedral, 384 00:14:35,880 --> 00:14:37,630 you may have to step quite a long distance 385 00:14:37,630 --> 00:14:39,180 to find the other lobe. 386 00:14:39,180 --> 00:14:40,821 So sometimes having some pre-knowledge 387 00:14:40,821 --> 00:14:43,320 of what you think the shape of the thing is is a big help. 388 00:14:46,588 --> 00:14:47,088 Yeah? 389 00:14:47,088 --> 00:14:49,450 No? 390 00:14:49,450 --> 00:14:51,866 Sorry, maybe I didn't answer your question. 391 00:14:51,866 --> 00:14:52,366 Is that-- 392 00:14:52,366 --> 00:14:53,590 AUDIENCE: Yeah. 393 00:14:53,590 --> 00:14:54,423 WILLIAM GREEN: Yeah. 394 00:14:57,100 --> 00:14:58,535 All right. 395 00:14:58,535 --> 00:14:59,910 All right, so that's Monte Carlo. 396 00:14:59,910 --> 00:15:06,500 Then we talked about the more difficult problem 397 00:15:06,500 --> 00:15:09,650 of where the probability distribution is not stationary, 398 00:15:09,650 --> 00:15:11,900 and in fact, we don't even have the w written out. 399 00:15:11,900 --> 00:15:14,210 All we have is a differential equation 400 00:15:14,210 --> 00:15:16,600 that defines the probability distribution. 401 00:15:16,600 --> 00:15:20,885 And we did it here for the case of discrete states, 402 00:15:20,885 --> 00:15:22,760 and that's the right one to use, for example, 403 00:15:22,760 --> 00:15:25,530 for the kinetic equations, if you 404 00:15:25,530 --> 00:15:28,520 want to get them exactly right.
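For reference, the differential equation for the discrete-state probabilities is described in the lecture as a linear system with a matrix m acting on the probability vector p. A sketch of that system and its textbook diagonalization solution follows; the symbols V and Lambda (eigenvectors and eigenvalues of the matrix, written M here) are notation assumed for illustration.

```latex
\frac{d\mathbf{p}}{dt} = M\,\mathbf{p},
\qquad
M = V \Lambda V^{-1}
\;\Longrightarrow\;
\mathbf{p}(t) = V\, e^{\Lambda t}\, V^{-1}\, \mathbf{p}(0)
```

The form is simple; the difficulty is only that p has one entry per state, so M becomes far too large to store or diagonalize once the number of states is large.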
405 00:15:28,520 --> 00:15:31,000 Now that differential equation looks really easy, right? 406 00:15:31,000 --> 00:15:32,519 Just m times p. 407 00:15:32,519 --> 00:15:34,310 It's a linear differential equation system. 408 00:15:34,310 --> 00:15:36,110 You knew how to do that one probably 409 00:15:36,110 --> 00:15:37,979 in your second semester of undergraduate 410 00:15:37,979 --> 00:15:39,770 taking a differential equations class, right? 411 00:15:39,770 --> 00:15:41,725 You diagonalize the matrix. 412 00:15:41,725 --> 00:15:42,225 Yeah? 413 00:15:42,225 --> 00:15:44,484 Remember this? 414 00:15:44,484 --> 00:15:45,650 So that looks really simple. 415 00:15:45,650 --> 00:15:47,941 But the problem is that the number of states is so big. 416 00:15:47,941 --> 00:15:54,070 So have any of you tried problem two on the optional homework? 417 00:15:54,070 --> 00:15:56,100 You haven't tried? 418 00:15:56,100 --> 00:15:57,830 Has anyone even looked at it? 419 00:15:57,830 --> 00:15:59,413 I know somebody looked at it because I 420 00:15:59,413 --> 00:16:00,750 got some question about it. 421 00:16:00,750 --> 00:16:01,260 All right. 422 00:16:01,260 --> 00:16:02,634 At least one person looked at it. 423 00:16:02,634 --> 00:16:06,028 Let me tell you guys briefly what this problem is. 424 00:16:06,028 --> 00:16:07,620 It's a really relevant problem. 425 00:16:07,620 --> 00:16:12,660 If you go work for Professor Jensen, Professor Braatz, maybe 426 00:16:12,660 --> 00:16:16,150 Professor Roman, you might end up doing this calculation. 427 00:16:16,150 --> 00:16:19,590 This is not a crazy calculation. 428 00:16:19,590 --> 00:16:20,430 It's very simple. 429 00:16:20,430 --> 00:16:23,550 All they have is like a chess board of sites 430 00:16:23,550 --> 00:16:25,650 on a catalytic surface. 431 00:16:25,650 --> 00:16:28,260 And they have some number of sites.
432 00:16:28,260 --> 00:16:30,990 And this is what-- if you take a crystalline catalyst 433 00:16:30,990 --> 00:16:33,840 and cleave it, you'll have a whole repetitive pattern 434 00:16:33,840 --> 00:16:35,650 on the surface, and somewhere along there 435 00:16:35,650 --> 00:16:37,317 is some active site that does something. 436 00:16:37,317 --> 00:16:39,150 You can tell that experimentally because you 437 00:16:39,150 --> 00:16:41,280 stick that catalyst in the presence of some gases, 438 00:16:41,280 --> 00:16:42,180 you make products that you didn't 439 00:16:42,180 --> 00:16:43,530 make when it wasn't there. 440 00:16:43,530 --> 00:16:44,890 So it did something. 441 00:16:44,890 --> 00:16:46,680 So the way people model this is they 442 00:16:46,680 --> 00:16:48,610 say, OK, there's some active sites on here, 443 00:16:48,610 --> 00:16:52,435 and there's probably one of them per unit cell maybe. 444 00:16:52,435 --> 00:16:54,120 And the site can either be empty, 445 00:16:54,120 --> 00:16:56,400 like if I ultra high vacuum, I pump on it and heat it, 446 00:16:56,400 --> 00:16:57,840 should be nothing there. 447 00:16:57,840 --> 00:17:00,840 And then if I expose it to a little bit of some gas A, 448 00:17:00,840 --> 00:17:04,020 some of the sites might have A molecules sticking on them now. 449 00:17:04,020 --> 00:17:07,109 And I also exposed to some sites gas B. Maybe one of these sites 450 00:17:07,109 --> 00:17:10,190 might have a B molecule on it, too. 451 00:17:10,190 --> 00:17:12,491 And if A and B can react and make my product, 452 00:17:12,491 --> 00:17:14,490 C, then maybe they're sitting next to each other 453 00:17:14,490 --> 00:17:16,031 and they might react with each other. 454 00:17:16,031 --> 00:17:18,970 OK, so this is the whole way you look at this problem. 455 00:17:18,970 --> 00:17:23,520 And so all I want to know is, if I have some A's and B's sitting 456 00:17:23,520 --> 00:17:26,489 on the surface, sort of how often are they going to react? 
457 00:17:26,489 --> 00:17:28,530 And the problem is a little bit more complicated. 458 00:17:28,530 --> 00:17:30,190 It says, well, suppose I also know 459 00:17:30,190 --> 00:17:32,773 if I put a whole lot of B on the surface, what I end up seeing 460 00:17:32,773 --> 00:17:33,300 is coke. 461 00:17:33,300 --> 00:17:35,430 My whole surface gets totally coked up. 462 00:17:35,430 --> 00:17:37,260 So let's model that by saying, well, 463 00:17:37,260 --> 00:17:39,840 suppose two B's are next to each other, then there's some probability 464 00:17:39,840 --> 00:17:42,140 of them reacting to turn into some coke product that 465 00:17:42,140 --> 00:17:44,614 sticks there permanently and just poisons the catalyst. 466 00:17:44,614 --> 00:17:46,104 OK? 467 00:17:46,104 --> 00:17:47,520 And this is real life, too, right? 468 00:17:47,520 --> 00:17:49,820 So if you're trying to run a catalytic process, 469 00:17:49,820 --> 00:17:51,820 a lot of times, you have to run with one reagent 470 00:17:51,820 --> 00:17:55,790 in great excess to keep the surface kind of clean, 471 00:17:55,790 --> 00:17:59,737 keep it covered by the unharmful reactant. 472 00:17:59,737 --> 00:18:01,820 And then you let little bits of the other reactant 473 00:18:01,820 --> 00:18:04,540 come down and react with the A really quickly. 474 00:18:04,540 --> 00:18:07,040 But you don't let too much of the second reactant come down, 475 00:18:07,040 --> 00:18:09,590 because it might cause a problem like dimerize or coke 476 00:18:09,590 --> 00:18:10,700 or something. 477 00:18:10,700 --> 00:18:12,530 OK, so that's the model. 478 00:18:12,530 --> 00:18:15,230 Some of these guys have A's, some of them have B's. 479 00:18:15,230 --> 00:18:18,680 And in the unlikely case that two of these guys react, 480 00:18:18,680 --> 00:18:21,400 they both turn around and turn into S's. 481 00:18:21,400 --> 00:18:22,044 Coke. 482 00:18:22,044 --> 00:18:23,780 Soot.
483 00:18:23,780 --> 00:18:26,840 And then that part's dead forever after that. 484 00:18:26,840 --> 00:18:28,790 OK, so that's the model. 485 00:18:28,790 --> 00:18:32,330 So in the case that they have, they had 100 sites. 486 00:18:32,330 --> 00:18:35,010 I think it was 10 by 10 if I remember correctly. 487 00:18:35,010 --> 00:18:37,010 So it's 100 sites, and each site can either 488 00:18:37,010 --> 00:18:41,570 be empty, or have an A molecule, or have a B molecule, 489 00:18:41,570 --> 00:18:44,180 or have a coke molecule, S. So there's 490 00:18:44,180 --> 00:18:48,160 four different states on each of 100 different sites. 491 00:18:48,160 --> 00:18:51,364 So how many states are there altogether? 492 00:18:51,364 --> 00:18:54,450 It's four to the what? 493 00:18:54,450 --> 00:18:58,690 100. 494 00:18:58,690 --> 00:18:59,270 OK. 495 00:18:59,270 --> 00:19:01,340 So that's a pretty big number. 496 00:19:01,340 --> 00:19:02,440 So that's how many states. 497 00:19:02,440 --> 00:19:06,260 It's about 10 to the 60th, I think. 498 00:19:06,260 --> 00:19:06,760 All right. 499 00:19:06,760 --> 00:19:07,840 So this is a very large number. 500 00:19:07,840 --> 00:19:09,490 This is bigger than Avogadro's number. 501 00:19:09,490 --> 00:19:10,830 This is really a lot of states. 502 00:19:10,830 --> 00:19:13,300 You wouldn't have thought that just a stupid little problem 503 00:19:13,300 --> 00:19:17,530 with just a 10 by 10 piece of catalyst, 504 00:19:17,530 --> 00:19:19,540 and three different things that can stick there, 505 00:19:19,540 --> 00:19:21,730 would give you so many states, but it does. 506 00:19:21,730 --> 00:19:26,800 So now I have a problem that if I have-- 507 00:19:26,800 --> 00:19:29,680 my p vector is now a probability that 508 00:19:29,680 --> 00:19:34,410 has a number for each of the 10 to the 60th possible states. 509 00:19:34,410 --> 00:19:38,790 And then the matrix m is that number squared, right? 510 00:19:38,790 --> 00:19:39,940 Dimension.
511 00:19:39,940 --> 00:19:44,700 So it's 10 to the 120th power elements inside the matrix m. 512 00:19:44,700 --> 00:19:47,290 So even though the matrix m is very sparse, 513 00:19:47,290 --> 00:19:50,040 this might still be a problem, because 10 to the 120th 514 00:19:50,040 --> 00:19:51,166 is really big, OK? 515 00:19:51,166 --> 00:19:53,290 And your computer can only hold 10 to the 9th or 10 516 00:19:53,290 --> 00:19:57,270 to the 10th numbers in it, so this is going to be a problem. 517 00:19:59,780 --> 00:20:02,750 And also that's not just what p is. 518 00:20:02,750 --> 00:20:05,060 That's what p is at this instant, 519 00:20:05,060 --> 00:20:07,290 say, 1 millisecond after t naught. 520 00:20:07,290 --> 00:20:09,660 And then if I wait to 1.1 milliseconds after t naught, 521 00:20:09,660 --> 00:20:11,060 p will be different. 522 00:20:11,060 --> 00:20:16,370 So I have 10 to the 60th numbers that change with time 523 00:20:16,370 --> 00:20:18,885 in some way that I don't know. 524 00:20:18,885 --> 00:20:20,790 OK? 525 00:20:20,790 --> 00:20:26,070 So this is actually a really hard problem to solve. 526 00:20:26,070 --> 00:20:29,392 And so everybody uses Kinetic Monte Carlo to do it, 527 00:20:29,392 --> 00:20:31,350 because there's no way I can possibly even list 528 00:20:31,350 --> 00:20:33,340 all the elements of p. 529 00:20:33,340 --> 00:20:37,977 And in fact, if I sample, if there's 10 to the 60th states, 530 00:20:37,977 --> 00:20:39,810 no matter how good my sampling algorithm is, 531 00:20:39,810 --> 00:20:41,570 I'm never going to sample 10 to the 60th states. 532 00:20:41,570 --> 00:20:44,070 So there's a lot of states I'm never going to get-- not even 533 00:20:44,070 --> 00:20:45,637 get one sample of. 534 00:20:45,637 --> 00:20:47,220 I'm just never going to encounter them 535 00:20:47,220 --> 00:20:50,817 at all, because there's just too many states. 536 00:20:50,817 --> 00:20:52,150 But anyway, people do it anyway.
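The counting argument above can be checked with exact integer arithmetic. This is only a small sketch: the helper name `order_of_magnitude` is made up for illustration, and the "dense matrix" figure ignores the sparsity mentioned in the lecture.

```python
# Count lattice configurations: 4 possible states (empty, A, B, S) on each of 100 sites.
n_sites = 100
n_states = 4 ** n_sites          # exact integer; Python integers do not overflow
n_matrix_entries = n_states ** 2  # entries in a dense M, ignoring sparsity

def order_of_magnitude(n: int) -> int:
    """Return the power of ten of a positive integer (number of digits minus 1)."""
    return len(str(n)) - 1

print(order_of_magnitude(n_states))          # power of ten for the length of p
print(order_of_magnitude(n_matrix_entries))  # power of ten for a dense M
```

Even at one byte per entry, a vector with that many elements is far beyond the 10 to the 9th or 10 to the 10th numbers a computer can hold, which is the reason for sampling trajectories instead of storing p.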
537 00:20:52,150 --> 00:20:54,490 So we go ahead and we'll do the Gillespie algorithm, 538 00:20:54,490 --> 00:20:55,530 which you guys-- 539 00:20:55,530 --> 00:20:58,120 we talked about in class. 540 00:20:58,120 --> 00:21:00,880 Now to compute a Gillespie trajectory, 541 00:21:00,880 --> 00:21:03,312 we have to compute two random numbers, right? 542 00:21:03,312 --> 00:21:05,020 We compute one random number, it tells us 543 00:21:05,020 --> 00:21:07,410 how long we wait until the next thing happens. 544 00:21:07,410 --> 00:21:09,250 And then we compute a second random number 545 00:21:09,250 --> 00:21:11,416 that tells us which of the many possible things that 546 00:21:11,416 --> 00:21:13,060 could happen actually happened. 547 00:21:13,060 --> 00:21:15,463 And so if I have a case like this where I 548 00:21:15,463 --> 00:21:19,120 had the-- to the original case. 549 00:21:19,120 --> 00:21:21,690 What can happen? The A could react with the B, 550 00:21:21,690 --> 00:21:24,910 the B could react with the B. The A could come back off. 551 00:21:24,910 --> 00:21:27,035 The A can move to the next adjacent site. 552 00:21:27,035 --> 00:21:29,430 The B can move to the next adjacent site. 553 00:21:29,430 --> 00:21:31,410 A lot of things can happen, right? 554 00:21:31,410 --> 00:21:34,995 So quite a variety of things can happen from this initial state. 555 00:21:34,995 --> 00:21:37,120 And so you have to, if you're solving this problem, 556 00:21:37,120 --> 00:21:38,286 you have to write that down. 557 00:21:38,286 --> 00:21:39,668 Yeah, John Paul? 558 00:21:39,668 --> 00:21:41,644 AUDIENCE: So the special waiting times 559 00:21:41,644 --> 00:21:47,325 are [INAUDIBLE] process where we haven't decided 560 00:21:47,325 --> 00:21:50,042 there's going to be an arrival, [INAUDIBLE] arrival 561 00:21:50,042 --> 00:21:51,417 times by that [INAUDIBLE]. 562 00:21:51,417 --> 00:21:52,250 WILLIAM GREEN: Yeah.
563 00:21:52,250 --> 00:21:53,360 AUDIENCE: And then we make something happen. 564 00:21:53,360 --> 00:21:55,943 But I mean I feel like you could run the exact same simulation 565 00:21:55,943 --> 00:21:57,743 without counting the times. 566 00:21:57,743 --> 00:22:00,274 Get the same answer and then after the fact come 567 00:22:00,274 --> 00:22:04,164 and [INAUDIBLE] into the system. 568 00:22:04,164 --> 00:22:07,570 [INAUDIBLE] doesn't seem to be [INAUDIBLE] 569 00:22:07,570 --> 00:22:08,980 WILLIAM GREEN: Yes, that's right. 570 00:22:08,980 --> 00:22:10,270 Yeah, that's true. 571 00:22:10,270 --> 00:22:12,110 So you only do the time calculation 572 00:22:12,110 --> 00:22:13,815 because you care about the time. 573 00:22:13,815 --> 00:22:15,190 If you don't care about the time, 574 00:22:15,190 --> 00:22:16,630 you only care about the steady-state solution, 575 00:22:16,630 --> 00:22:18,331 then you can probably do it some other way. 576 00:22:18,331 --> 00:22:19,872 AUDIENCE: You don't need to calculate 577 00:22:19,872 --> 00:22:21,038 the times inside that. 578 00:22:21,038 --> 00:22:25,374 You could calculate the entire state space trajectory 579 00:22:25,374 --> 00:22:28,580 without any reference to the time, yeah? 580 00:22:28,580 --> 00:22:29,756 WILLIAM GREEN: Yes. 581 00:22:29,756 --> 00:22:31,130 Yeah, so you can get the sequence 582 00:22:31,130 --> 00:22:32,630 of all things that happened if it didn't give you 583 00:22:32,630 --> 00:22:33,920 the time, because it doesn't matter, right? 584 00:22:33,920 --> 00:22:34,640 For the time. 585 00:22:34,640 --> 00:22:36,245 It just means the sequence of what happened. 586 00:22:36,245 --> 00:22:36,560 AUDIENCE: Yeah. 587 00:22:36,560 --> 00:22:38,490 WILLIAM GREEN: The sequence of states is all that matters. 
588 00:22:38,490 --> 00:22:40,070 But if you want the kinetic information, 589 00:22:40,070 --> 00:22:40,960 then you also want the time, because it 590 00:22:40,960 --> 00:22:43,346 might be it sat in one state for a million years, 591 00:22:43,346 --> 00:22:44,220 and the other state-- 592 00:22:44,220 --> 00:22:46,607 AUDIENCE: [INAUDIBLE] 593 00:22:46,607 --> 00:22:47,440 WILLIAM GREEN: Yeah. 594 00:22:47,440 --> 00:22:48,295 Yeah, that's right. 595 00:22:48,295 --> 00:22:48,920 Let's go ahead. 596 00:22:48,920 --> 00:22:50,211 OK, so maybe make this cheaper. 597 00:22:50,211 --> 00:22:52,311 Make it one random number for j. 598 00:22:52,311 --> 00:22:52,810 No? 599 00:22:52,810 --> 00:22:53,310 OK. 600 00:22:57,760 --> 00:22:58,260 All right. 601 00:22:58,260 --> 00:23:02,110 So things to just keep in mind about this. 602 00:23:02,110 --> 00:23:04,920 So the cost here is to compute N random numbers. 603 00:23:04,920 --> 00:23:06,860 We have to compute at least-- 604 00:23:06,860 --> 00:23:08,810 well, some number of random numbers. 605 00:23:08,810 --> 00:23:10,950 I don't know how many. 606 00:23:10,950 --> 00:23:13,140 And that number depends on the length of time 607 00:23:13,140 --> 00:23:17,650 I want to simulate and what my delta t is. 608 00:23:17,650 --> 00:23:22,192 Where the delta t is sort of like the average time 609 00:23:22,192 --> 00:23:23,400 between the events happening. 610 00:23:23,400 --> 00:23:28,160 It's like one over a in the [INAUDIBLE]. 611 00:23:28,160 --> 00:23:34,030 And so if that delta t is very small, 612 00:23:34,030 --> 00:23:36,124 and my time I want to simulate is very large, 613 00:23:36,124 --> 00:23:37,540 that means that each trajectory is 614 00:23:37,540 --> 00:23:38,790 going to have a lot of events. 615 00:23:38,790 --> 00:23:40,623 And that means I'm going to have to generate 616 00:23:40,623 --> 00:23:41,816 a lot of random numbers.
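A single Gillespie step, with its two random numbers, might look like the sketch below. The function name `gillespie_step`, the helper `count_events`, and the rate values are assumptions for illustration, and the rates are held fixed here, whereas in a real lattice simulation they would be recomputed from the new state after every event.

```python
import math
import random

def gillespie_step(rates, rng):
    """One kinetic Monte Carlo step: return (waiting_time, index_of_event)."""
    a_total = sum(rates)
    if a_total <= 0.0:
        raise ValueError("no event can happen from this state")
    # First random number: exponentially distributed waiting time, mean 1/a_total.
    u1 = 1.0 - rng.random()            # lies in (0, 1], so log(u1) is finite
    wait = -math.log(u1) / a_total
    # Second random number: pick event j with probability rates[j] / a_total.
    target = rng.random() * a_total
    cumulative = 0.0
    for j, rate in enumerate(rates):
        cumulative += rate
        if target < cumulative:
            return wait, j
    return wait, len(rates) - 1        # guard against floating-point round-off

def count_events(rates, t_end, rng):
    """Run one trajectory up to t_end and return how many events occurred."""
    t, n_events = 0.0, 0
    while True:
        wait, _ = gillespie_step(rates, rng)
        if t + wait > t_end:
            return n_events
        t += wait
        n_events += 1

# Hypothetical rates (per millisecond): a slow reaction and a fast diffusion hop.
rng = random.Random(1)
print(count_events([1.0, 1000.0], t_end=1.0, rng=rng))
```

The number of events per trajectory is roughly the simulated time divided by 1 over the total rate, so adding one fast process multiplies the count of random numbers needed.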
617 00:23:41,816 --> 00:23:43,190 And so generating one trajectory is 618 00:23:43,190 --> 00:23:45,250 going to be really expensive. 619 00:23:45,250 --> 00:23:48,580 On the other hand, if the delta t were about the same size 620 00:23:48,580 --> 00:23:50,170 as the total time I'm simulating, 621 00:23:50,170 --> 00:23:52,090 then I might only get one or two events, 622 00:23:52,090 --> 00:23:54,640 and so I only have to generate four random numbers 623 00:23:54,640 --> 00:23:56,370 for each sample. 624 00:23:56,370 --> 00:23:58,410 All right? 625 00:23:58,410 --> 00:24:00,165 And I just have to keep in mind that I'm 626 00:24:00,165 --> 00:24:01,290 going to have a lot of low probability states 627 00:24:01,290 --> 00:24:02,400 that I'm not going to sample at all. 628 00:24:02,400 --> 00:24:04,025 I might even have some high probability 629 00:24:04,025 --> 00:24:06,852 states I don't sample, because I just can't do enough samples. 630 00:24:06,852 --> 00:24:09,310 And I have the same exact problem I had with the Metropolis 631 00:24:09,310 --> 00:24:11,580 Monte Carlo and with all these stochastic methods 632 00:24:11,580 --> 00:24:14,100 that the scaling is one over square root of n. 633 00:24:14,100 --> 00:24:16,170 So I can get a rough idea pretty easily, 634 00:24:16,170 --> 00:24:18,937 but if I want a really precise number for anything, 635 00:24:18,937 --> 00:24:20,770 then I'm going to have a problem because I'm 636 00:24:20,770 --> 00:24:23,340 going to have trouble doing enough samples to really refine 637 00:24:23,340 --> 00:24:25,750 the number. 638 00:24:25,750 --> 00:24:30,080 The key thing in this is that with anything 639 00:24:30,080 --> 00:24:33,880 like this where I'm going to have so many states, 640 00:24:33,880 --> 00:24:36,160 I'm going to run a lot of trajectories 641 00:24:36,160 --> 00:24:39,610 to get at all-- sample all the things that can happen.
642 00:24:39,610 --> 00:24:41,860 And so I'm probably not going to be 643 00:24:41,860 --> 00:24:43,300 able to store all the trajectories 644 00:24:43,300 --> 00:24:47,370 I ran on my computer, because I'm going to run so many. 645 00:24:47,370 --> 00:24:50,340 And so I really want to compute things 646 00:24:50,340 --> 00:24:53,010 on the fly, which means I should decide ahead of time 647 00:24:53,010 --> 00:24:56,270 all the averages, all the f's I'm trying to compute. 648 00:24:56,270 --> 00:24:58,480 So I might only compute f and f squared. 649 00:24:58,480 --> 00:25:00,930 And I don't know, whatever else I want to compute. 650 00:25:00,930 --> 00:25:03,360 The average value of the number of A's on the surface. 651 00:25:03,360 --> 00:25:05,735 I mean, there might be a lot of diagnostics I can compute 652 00:25:05,735 --> 00:25:07,040 to just check everything's OK. 653 00:25:07,040 --> 00:25:08,998 Think of that beforehand, code it to compute them 654 00:25:08,998 --> 00:25:11,556 as it runs, and then I only have to store the running 655 00:25:11,556 --> 00:25:12,930 averages of all those quantities, 656 00:25:12,930 --> 00:25:15,179 and I don't have to store anything about exactly which 657 00:25:15,179 --> 00:25:16,922 state sequence I hit. 658 00:25:16,922 --> 00:25:19,377 Is that OK? 659 00:25:19,377 --> 00:25:20,359 All right. 660 00:25:23,810 --> 00:25:25,630 Now if you remember, in this equation, 661 00:25:25,630 --> 00:25:28,130 there was an initial condition on the right-hand side, which 662 00:25:28,130 --> 00:25:30,950 we totally ignored so far. 663 00:25:30,950 --> 00:25:33,240 But that's actually pretty important. 664 00:25:33,240 --> 00:25:36,905 And in a lot of real cases, from your macroscopic continuum 665 00:25:36,905 --> 00:25:40,720 view of things, you have an idea of the average values.
666 00:25:40,720 --> 00:25:42,850 And what you don't know is about the discreteness 667 00:25:42,850 --> 00:25:44,230 and you don't know about the correlations. 668 00:25:44,230 --> 00:25:45,688 All you know is sort of an average. 669 00:25:45,688 --> 00:25:49,060 I expect to have two A's and three B's on the surface 670 00:25:49,060 --> 00:25:52,420 or something from the partial pressure of A and B 671 00:25:52,420 --> 00:25:56,170 and the binding constants of A and B. So I have some idea. 672 00:25:56,170 --> 00:25:58,210 So I think I know these averages. 673 00:25:58,210 --> 00:26:04,210 And so what people usually do is they use a Poisson distribution 674 00:26:04,210 --> 00:26:07,130 for the probability of the exact number of A. 675 00:26:07,130 --> 00:26:09,850 So suppose I think on average I'll have two A's. Well, 676 00:26:09,850 --> 00:26:14,150 I might have one A, I might have three A's on the surface. 677 00:26:14,150 --> 00:26:18,210 And so I'll just use the Poisson distribution of n 678 00:26:18,210 --> 00:26:21,760 to get an estimate of sort of the expected 679 00:26:21,760 --> 00:26:25,139 width of distributions. 680 00:26:25,139 --> 00:26:27,430 And so that's the formula for the Poisson distribution. 681 00:26:27,430 --> 00:26:31,210 You have to know the average values of Na, Nb, Nc. 682 00:26:31,210 --> 00:26:39,190 And then you can see that the N appears in the-- 683 00:26:39,190 --> 00:26:41,480 on average, N appears in the exponent 684 00:26:41,480 --> 00:26:43,786 and in the factorial [INAUDIBLE]. 685 00:26:43,786 --> 00:26:44,286 Right? 686 00:26:47,240 --> 00:26:48,860 And so really every time you start 687 00:26:48,860 --> 00:26:51,318 a new trajectory in Kinetic Monte Carlo, what you should do 688 00:26:51,318 --> 00:26:53,180 is sample from the Poisson distribution 689 00:26:53,180 --> 00:26:55,562 to get a new initial condition as well. 690 00:26:55,562 --> 00:26:57,127 OK?
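Drawing a fresh initial condition for each trajectory could be sketched as follows. The function names are illustrative, the sampler is Knuth's multiplication method (chosen here only because it needs nothing beyond the standard library), and the means of two A's and three B's are the example values from the lecture. The sketch draws only the counts; placing the molecules on lattice sites is a separate step.

```python
import math
import random

def poisson_pmf(n, mean):
    """Poisson probability P(n) = mean**n * exp(-mean) / n!."""
    return mean ** n * math.exp(-mean) / math.factorial(n)

def sample_poisson(mean, rng):
    """Draw one Poisson-distributed integer using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

# Average surface populations, e.g. estimated from partial pressures and binding constants.
mean_counts = {"A": 2.0, "B": 3.0}

rng = random.Random(42)
# A new set of initial counts for each of five trajectories.
for trajectory in range(5):
    initial = {species: sample_poisson(mean, rng) for species, mean in mean_counts.items()}
    print(trajectory, initial)
```

Averaging over trajectories started this way covers both the spread of initial conditions and the spread of what happens afterwards.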
691 00:26:57,127 --> 00:26:58,710 And that way you're sampling over both 692 00:26:58,710 --> 00:27:01,830 all the possible initial conditions and all 693 00:27:01,830 --> 00:27:03,420 the possible things that can happen 694 00:27:03,420 --> 00:27:06,850 to them to really get the real average of what you might want. 695 00:27:11,250 --> 00:27:12,050 All right. 696 00:27:12,050 --> 00:27:15,970 Now we just realized how horrible this problem is 697 00:27:15,970 --> 00:27:17,900 because it has so many states. 698 00:27:17,900 --> 00:27:20,108 And so then we're going to have to think immediately, 699 00:27:20,108 --> 00:27:22,040 what can I do to make this faster and easier? 700 00:27:22,040 --> 00:27:24,830 And so here are some things people do. 701 00:27:24,830 --> 00:27:28,280 One thing is that people try to figure out 702 00:27:28,280 --> 00:27:30,260 what are the really fast processes, 703 00:27:30,260 --> 00:27:32,134 and do I care about them. 704 00:27:32,134 --> 00:27:34,300 And some of the fast processes, you might say, well, 705 00:27:34,300 --> 00:27:37,824 like diffusion of A moving to here, 706 00:27:37,824 --> 00:27:40,240 and then it moves to here, and then it moves back to here, 707 00:27:40,240 --> 00:27:41,474 then it moves back to here. 708 00:27:41,474 --> 00:27:42,640 That's of no interest to me. 709 00:27:42,640 --> 00:27:44,932 I don't care where the A is bound on the surface. 710 00:27:44,932 --> 00:27:45,630 All right? 711 00:27:45,630 --> 00:27:48,740 The only way I care about it is whether it reacts with the B 712 00:27:48,740 --> 00:27:50,040 when it's sitting next to it. 713 00:27:50,040 --> 00:27:52,498 In other respects, I really don't care about the diffusion. 714 00:27:52,498 --> 00:27:54,560 So having the time constant for the diffusion 715 00:27:54,560 --> 00:27:56,890 be the real time constant might not be that important. 716 00:27:56,890 --> 00:27:58,390 So then you can do different things.
717 00:27:58,390 --> 00:28:00,480 One thing is you can assume it's infinitely fast, 718 00:28:00,480 --> 00:28:05,310 and you say, well, every time I look at a site, 719 00:28:05,310 --> 00:28:08,790 I assume that this A has a 1/10 chance of being here or here 720 00:28:08,790 --> 00:28:11,700 or here or here or here, all these different spots, 721 00:28:11,700 --> 00:28:14,154 as if it's equilibrated around all the empty sites. 722 00:28:14,154 --> 00:28:15,820 That would be one possible way to do it. 723 00:28:15,820 --> 00:28:18,850 That's the infinitely fast diffusion idea. 724 00:28:18,850 --> 00:28:21,490 Another idea is to say, well, let's slow down 725 00:28:21,490 --> 00:28:24,370 the diffusion to make it slower just 726 00:28:24,370 --> 00:28:25,900 to help out our computation. 727 00:28:25,900 --> 00:28:27,790 So I say, well, on average, I might 728 00:28:27,790 --> 00:28:29,680 get one reaction a millisecond. 729 00:28:29,680 --> 00:28:32,530 In reality, the diffusion time is a nanosecond. 730 00:28:32,530 --> 00:28:34,690 But I don't care about all that nanosecond stuff. 731 00:28:34,690 --> 00:28:37,291 Let's pretend that it's a tenth of a millisecond. 732 00:28:37,291 --> 00:28:39,040 So it'll still be pretty much equilibrated 733 00:28:39,040 --> 00:28:40,840 on the time scale of the reactions, 734 00:28:40,840 --> 00:28:43,810 but that way I'll be able to accelerate my calculation 735 00:28:43,810 --> 00:28:48,040 by five orders of magnitude by going from a nanosecond time 736 00:28:48,040 --> 00:28:50,252 scale to a tenth of a millisecond time scale. 737 00:28:50,252 --> 00:28:52,210 So there's a lot of different tricks like that. 738 00:28:52,210 --> 00:28:54,293 If you read a paper that does Kinetic Monte Carlo, 739 00:28:54,293 --> 00:28:56,002 you've got to read exactly what they did.
740 00:28:56,002 --> 00:28:57,668 But they usually do something like this, 741 00:28:57,668 --> 00:28:59,580 because it's just totally out of hand 742 00:28:59,580 --> 00:29:03,460 if you try to model everything perfectly. 743 00:29:03,460 --> 00:29:05,494 Also, the low probability events, 744 00:29:05,494 --> 00:29:06,910 the really low probability events, 745 00:29:06,910 --> 00:29:08,129 you're never going to sample. 746 00:29:08,129 --> 00:29:10,420 So if I have a process that happens on my catalyst that 747 00:29:10,420 --> 00:29:14,290 takes 10 hours to happen, and my main reaction 748 00:29:14,290 --> 00:29:15,790 happens on a millisecond time scale, 749 00:29:15,790 --> 00:29:18,064 I'm never going to be able to run out to 10 hours. 750 00:29:18,064 --> 00:29:19,480 So I might as well just forget it. 751 00:29:19,480 --> 00:29:21,910 So if I know experimentally that the coking thing only 752 00:29:21,910 --> 00:29:24,960 happens on 10-hour time scale, I'd just take it out of there. 753 00:29:24,960 --> 00:29:27,460 And I don't have to clutter up my calculation with all these 754 00:29:27,460 --> 00:29:28,360 S's. 755 00:29:28,360 --> 00:29:31,080 I'm not going to form any of them in my time scale anyway. 756 00:29:31,080 --> 00:29:32,470 And so therefore, I cut the number of states. 757 00:29:32,470 --> 00:29:34,803 Instead of being 4 to the 100th, now it's 3 to the 100th 758 00:29:34,803 --> 00:29:37,370 because I got rid of all the coke. 759 00:29:37,370 --> 00:29:41,300 And so now I've drastically reduced the size of my problem. 760 00:29:41,300 --> 00:29:43,490 Not often you can do a reduction like this. 761 00:29:43,490 --> 00:29:48,240 That's a pretty big reduction in the size of a problem. 762 00:29:48,240 --> 00:29:50,250 All right? 763 00:29:50,250 --> 00:29:51,660 And then you have to have an idea 764 00:29:51,660 --> 00:29:54,180 of what adequate sampling is. 
765 00:29:54,180 --> 00:29:56,160 So you're going to see some lower probability 766 00:29:56,160 --> 00:29:58,620 processes compared to some higher probability ones. 767 00:29:58,620 --> 00:30:01,080 And you have to know, when are these low ones so 768 00:30:01,080 --> 00:30:04,010 small that they're statistical noise. 769 00:30:04,010 --> 00:30:07,435 And this is the margin of error problem. 770 00:30:07,435 --> 00:30:09,810 Do you ever see the polling, like in the political polls, 771 00:30:09,810 --> 00:30:14,100 they always say, so-and-so many people believe in evolution 772 00:30:14,100 --> 00:30:16,770 plus or minus something, right? 773 00:30:16,770 --> 00:30:18,720 Well, the way they get the plus or minus 774 00:30:18,720 --> 00:30:21,630 is from the square root of the number of samples 775 00:30:21,630 --> 00:30:24,990 with a positive result. So the margin 776 00:30:24,990 --> 00:30:27,715 of error on low probability things is much larger. 777 00:30:27,715 --> 00:30:30,090 So the number of people who believe that Professor Green 778 00:30:30,090 --> 00:30:32,440 is God is a very small number. 779 00:30:32,440 --> 00:30:33,990 So if you find somebody like that, 780 00:30:33,990 --> 00:30:36,240 you have to figure they're within the margin of error, 781 00:30:36,240 --> 00:30:36,820 right? 782 00:30:36,820 --> 00:30:37,320 OK? 783 00:30:39,900 --> 00:30:43,200 But the number of people who are going to attend 10.34 784 00:30:43,200 --> 00:30:45,690 this morning is a big enough number 785 00:30:45,690 --> 00:30:48,900 that the margin of error in that is maybe two or three people. 786 00:30:48,900 --> 00:30:50,710 It's not going to be 100 people. 787 00:30:50,710 --> 00:30:51,930 That make sense? 788 00:30:51,930 --> 00:30:52,860 All right.
789 00:30:52,860 --> 00:30:55,560 So the more likely the event is, the more likely 790 00:30:55,560 --> 00:30:57,120 you'll get a lot of samples of it, 791 00:30:57,120 --> 00:30:59,661 and if you count the error as roughly the square root of the number 792 00:30:59,661 --> 00:31:01,680 of samples you got of that event, 793 00:31:01,680 --> 00:31:08,310 then the statistical error in your sample 794 00:31:08,310 --> 00:31:10,830 of the high probability events won't be that large. 795 00:31:10,830 --> 00:31:12,486 Whereas for the low probability events, 796 00:31:12,486 --> 00:31:13,652 it could be really gigantic. 797 00:31:16,114 --> 00:31:17,530 All right, so just for an example. 798 00:31:17,530 --> 00:31:20,900 In the main reaction, A plus B is really fast 799 00:31:20,900 --> 00:31:22,490 because it's a good catalyst. 800 00:31:22,490 --> 00:31:24,198 So I'm going to get a lot of trajectories 801 00:31:24,198 --> 00:31:25,650 that are going to show that reaction. 802 00:31:25,650 --> 00:31:26,860 So that's good. 803 00:31:26,860 --> 00:31:28,820 And I should get good samples [INAUDIBLE] 804 00:31:28,820 --> 00:31:30,530 But the coking reaction is really slow. 805 00:31:30,530 --> 00:31:32,030 At least I hope it's slow, otherwise 806 00:31:32,030 --> 00:31:33,380 the catalyst is no good. 807 00:31:33,380 --> 00:31:35,005 And so if it's really slow, I might not 808 00:31:35,005 --> 00:31:37,490 get very many of those guys. 809 00:31:37,490 --> 00:31:39,890 And so even if I left it in the calculation, 810 00:31:39,890 --> 00:31:42,556 I might not be able to reach any conclusion because I might only 811 00:31:42,556 --> 00:31:45,050 see one coking event out of all of the 100,000 812 00:31:45,050 --> 00:31:45,900 trajectories I run. 813 00:31:45,900 --> 00:31:47,900 And so then I won't know whether to say anything 814 00:31:47,900 --> 00:31:50,030 about that or not.
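The square-root rule can be turned into a quick check of whether a counted event is statistically meaningful. The sketch below assumes simple counting statistics; the function name `counting_error` and the figure of 40,000 main-reaction trajectories are made up for illustration, while the 100,000 trajectories and the single coking event are the numbers used in the lecture.

```python
import math

def counting_error(n_hits, n_total):
    """Return (fraction, absolute error on the fraction, relative error)
    for an event seen n_hits times in n_total trajectories,
    taking the uncertainty on the count as sqrt(n_hits)."""
    fraction = n_hits / n_total
    abs_error = math.sqrt(n_hits) / n_total
    rel_error = 1.0 / math.sqrt(n_hits) if n_hits > 0 else float("inf")
    return fraction, abs_error, rel_error

n_trajectories = 100_000
for label, hits in [("A + B reaction", 40_000), ("coking", 1)]:
    fraction, abs_err, rel_err = counting_error(hits, n_trajectories)
    print(f"{label}: {fraction:.5f} +/- {abs_err:.5f} (relative error {rel_err:.1%})")
```

With one observed event the relative error is as large as the estimate itself, which is why no conclusion can be drawn about the coking rate from such a run.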
815 00:31:50,030 --> 00:31:52,244 And then if I let the diffusion in and it's too fast, 816 00:31:52,244 --> 00:31:53,660 then my delta t is going to be too 817 00:31:53,660 --> 00:31:55,970 small, which means that my CPU time 818 00:31:55,970 --> 00:31:58,770 to compute a single trajectory is going to be too large, 819 00:31:58,770 --> 00:32:00,260 which means that I won't be able to get good sampling because I 820 00:32:00,260 --> 00:32:02,192 won't be able to run very many trajectories. 821 00:32:02,192 --> 00:32:04,316 So I might want to do something to get rid of that. 822 00:32:06,950 --> 00:32:07,640 All right. 823 00:32:07,640 --> 00:32:09,950 Now there's another method that a lot of people use. 824 00:32:09,950 --> 00:32:11,260 I think actually Professor Swan uses this sometimes. 825 00:32:11,260 --> 00:32:12,260 Is that correct? 826 00:32:12,260 --> 00:32:13,250 Yes. 827 00:32:13,250 --> 00:32:14,780 So this is another method. 828 00:32:14,780 --> 00:32:18,290 And what it is, is you solve the equations 829 00:32:18,290 --> 00:32:21,980 of motion of the atoms or clumps of atoms typically using 830 00:32:21,980 --> 00:32:24,020 Newton's equations of motion. 831 00:32:24,020 --> 00:32:27,740 And typically people do it using force fields that 832 00:32:27,740 --> 00:32:29,650 were fitted to some experimental data, 833 00:32:29,650 --> 00:32:31,400 and maybe with some quantum chemistry data 834 00:32:31,400 --> 00:32:34,575 as well, to get some idea of the forces between the atoms 835 00:32:34,575 --> 00:32:36,200 and the force with which the molecules, 836 00:32:36,200 --> 00:32:40,580 when they bump into each other, how they interact. 837 00:32:40,580 --> 00:32:44,840 And typically, it's done classically 838 00:32:44,840 --> 00:32:47,300 using Newton's equations of motion.
839 00:32:47,300 --> 00:32:48,949 But if you don't like that, you can 840 00:32:48,949 --> 00:32:50,490 put in the quantum mechanical effects 841 00:32:50,490 --> 00:32:52,670 in a couple of different ways, and my group's 842 00:32:52,670 --> 00:32:54,650 worked on one way called RPMD. 843 00:32:54,650 --> 00:32:56,540 And you can get pretty good agreement 844 00:32:56,540 --> 00:32:58,070 with the quantum chemical results 845 00:32:58,070 --> 00:33:01,262 by doing this fancier version of molecular dynamics. 846 00:33:01,262 --> 00:33:02,720 So there's some different equations 847 00:33:02,720 --> 00:33:05,620 you solve, but basically the same. 848 00:33:05,620 --> 00:33:07,702 And so you can do it. 849 00:33:07,702 --> 00:33:09,160 And there's a nice algorithm called 850 00:33:09,160 --> 00:33:13,300 the velocity Verlet algorithm that almost everybody uses 851 00:33:13,300 --> 00:33:14,410 in this field. 852 00:33:14,410 --> 00:33:16,630 And what's nice about that one is 853 00:33:16,630 --> 00:33:19,400 that it can do a lot of steps. 854 00:33:19,400 --> 00:33:21,810 A lot of steps of moving the atoms around. 855 00:33:21,810 --> 00:33:24,670 And after you do a million steps like that, 856 00:33:24,670 --> 00:33:26,590 you compute the energy by adding up 857 00:33:26,590 --> 00:33:28,173 all the potential energies and kinetic 858 00:33:28,173 --> 00:33:29,530 energies of all the atoms. 859 00:33:29,530 --> 00:33:31,196 And it'll still be about the same energy 860 00:33:31,196 --> 00:33:32,290 as you started from. 861 00:33:32,290 --> 00:33:35,424 Whereas a lot of methods, if you do the integration like that 862 00:33:35,424 --> 00:33:37,090 and you calculate the energy at the end, 863 00:33:37,090 --> 00:33:38,964 it won't be the same as the energy you calculated at the start 864 00:33:38,964 --> 00:33:41,110 because the methods have little round-off errors, 865 00:33:41,110 --> 00:33:45,560 and they kind of accumulate in a way that messes up the energy.
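The energy behavior described above can be seen on a toy problem. The sketch below applies velocity Verlet to a single one-dimensional harmonic oscillator; the oscillator, the unit mass and spring constant, the step size, and the comparison against forward Euler are all illustrative choices rather than anything taken from the lecture.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation m x'' = force(x) with velocity Verlet."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half kick with the old force
        x = x + dt * v_half                # full drift
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # half kick with the new force
    return x, v

def forward_euler(x, v, force, mass, dt, n_steps):
    """Same problem with the simplest explicit method, for comparison."""
    for _ in range(n_steps):
        f = force(x)
        x, v = x + dt * v, v + dt * f / mass
    return x, v

# Toy system: harmonic oscillator with unit mass and unit spring constant.
k, m = 1.0, 1.0

def spring_force(x):
    return -k * x

def total_energy(x, v):
    return 0.5 * m * v * v + 0.5 * k * x * x

x0, v0, dt, n_steps = 1.0, 0.0, 0.01, 100_000
e0 = total_energy(x0, v0)
e_verlet = total_energy(*velocity_verlet(x0, v0, spring_force, m, dt, n_steps))
e_euler = total_energy(*forward_euler(x0, v0, spring_force, m, dt, n_steps))
print("initial energy:", e0)
print("velocity Verlet:", e_verlet)
print("forward Euler:", e_euler)
```

In this example the velocity Verlet energy stays close to its starting value while the forward Euler energy grows steadily, which is the kind of accumulation of per-step error the lecture warns about.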
866 00:33:45,560 --> 00:33:48,880 And so this particular algorithm is 867 00:33:48,880 --> 00:33:53,080 a nice ODE solver method that is good for-- 868 00:33:53,080 --> 00:33:56,140 has a property that is good for conserving the energy. 869 00:33:59,510 --> 00:34:01,420 And then a lot of people use a thermostat, 870 00:34:01,420 --> 00:34:03,190 because they care about samples that are 871 00:34:03,190 --> 00:34:04,780 in contact with a thermal bath. 872 00:34:04,780 --> 00:34:08,500 So you're sampling a few molecules, maybe 100 molecules 873 00:34:08,500 --> 00:34:11,900 or 1,000 molecules, but not 10 to the 23rd molecules. 874 00:34:11,900 --> 00:34:14,260 So you have your 100 molecules or your 1,000 molecules, 875 00:34:14,260 --> 00:34:16,551 and you pretend that they're in contact with some bath, 876 00:34:16,551 --> 00:34:18,909 and you're watching those 100 molecules wiggle around, 877 00:34:18,909 --> 00:34:20,620 and their energy is not exactly conserved 878 00:34:20,620 --> 00:34:23,929 because they're exchanging energy with a thermal bath. 879 00:34:23,929 --> 00:34:26,290 And so there's different things called thermostats 880 00:34:26,290 --> 00:34:29,050 which are computer ways of adjusting 881 00:34:29,050 --> 00:34:31,300 the velocities of the atoms periodically 882 00:34:31,300 --> 00:34:33,310 as if they got a kick from the thermal bath that 883 00:34:33,310 --> 00:34:35,159 makes them go up or down. 884 00:34:35,159 --> 00:34:39,000 And that's very important if you're trying to do, say, chemical kinetics, 885 00:34:39,000 --> 00:34:42,587 because the reactions are so slow that most of the time 886 00:34:42,587 --> 00:34:44,920 you'll watch the atoms wiggling around, nothing happens. 887 00:34:44,920 --> 00:34:48,427 You need the unusual case when you get a big kick somewhere 888 00:34:48,427 --> 00:34:50,760 that gives you enough energy that you overcome a barrier 889 00:34:50,760 --> 00:34:51,992 to make a reaction happen.
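One simple way to mimic kicks from a thermal bath is an Andersen-style thermostat, where each velocity is occasionally replaced by a fresh draw from the Maxwell-Boltzmann distribution. The lecture does not say which thermostat is meant, so this is only one possible illustration; the function name `andersen_kick`, the one-dimensional velocities, and the reduced units (kT and mass as plain numbers) are assumptions.

```python
import math
import random

def andersen_kick(velocities, mass, kT, collision_prob, rng):
    """Return new 1-D velocities: each one is, with probability collision_prob,
    replaced by a Gaussian draw of variance kT/mass (a 'collision' with the bath);
    otherwise it is left unchanged."""
    sigma = math.sqrt(kT / mass)
    return [
        rng.gauss(0.0, sigma) if rng.random() < collision_prob else v
        for v in velocities
    ]

# 1,000 particles that start at rest, in reduced units.
rng = random.Random(7)
velocities = [0.0] * 1000
mass, kT = 1.0, 2.0
for step in range(200):
    # In a real simulation, a velocity Verlet step would go here between kicks.
    velocities = andersen_kick(velocities, mass, kT, collision_prob=0.05, rng=rng)

mean_kinetic = sum(0.5 * mass * v * v for v in velocities) / len(velocities)
print("mean kinetic energy per particle:", mean_kinetic, "target kT/2:", 0.5 * kT)
```

The occasional large draw from the tail of the Gaussian plays the role of the rare big kick that carries a molecule over a reaction barrier.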
890 00:34:54,660 --> 00:34:55,159 All right. 891 00:34:55,159 --> 00:34:57,170 So that's what molecular dynamics is. 892 00:34:57,170 --> 00:35:00,960 And numerous people in the world and in this department 893 00:35:00,960 --> 00:35:05,030 and on the campus do these kinds of calculations. 894 00:35:05,030 --> 00:35:08,060 And there's two ways that they're used. 895 00:35:08,060 --> 00:35:09,950 One way is used as an alternative 896 00:35:09,950 --> 00:35:12,290 to Metropolis Monte Carlo. 897 00:35:12,290 --> 00:35:14,540 So you're trying to compute basically multidimensional 898 00:35:14,540 --> 00:35:16,890 integrals, basically statistical mechanics 899 00:35:16,890 --> 00:35:18,470 integrals, more or less. 900 00:35:18,470 --> 00:35:20,589 And instead of using Metropolis Monte Carlo, 901 00:35:20,589 --> 00:35:22,130 you decide to use molecular dynamics. 902 00:35:24,650 --> 00:35:29,000 This is a tricky choice between those two options. 903 00:35:29,000 --> 00:35:31,610 Nice thing about the molecular dynamics 904 00:35:31,610 --> 00:35:34,639 is I didn't have to choose any step size, basically, right? 905 00:35:34,639 --> 00:35:37,180 It's like the time scale was set by the real physical motions 906 00:35:37,180 --> 00:35:38,207 of the atoms. 907 00:35:38,207 --> 00:35:39,790 And if I don't want to think about it, 908 00:35:39,790 --> 00:35:42,520 I can just put in, what's the real physical time scale 909 00:35:42,520 --> 00:35:44,184 of the vibration of some atoms. 910 00:35:44,184 --> 00:35:46,600 And I don't have to think about it at all and just run it. 911 00:35:51,070 --> 00:35:53,250 But something you should think a little bit about 912 00:35:53,250 --> 00:35:55,333 is the molecular dynamics equations we're solving, 913 00:35:55,333 --> 00:35:58,360 it's really a time accurate method. 914 00:35:58,360 --> 00:35:59,950 We actually get real time dependences.
915 00:35:59,950 --> 00:36:02,074 We're using the real physical time, which turns out 916 00:36:02,074 --> 00:36:03,040 to be pretty darn fast. 917 00:36:03,040 --> 00:36:04,940 So molecular vibration time scales 918 00:36:04,940 --> 00:36:07,630 are like tens of femtoseconds. 919 00:36:07,630 --> 00:36:09,827 And so that's really fast. 920 00:36:09,827 --> 00:36:11,410 And so you have a really tiny delta t. 921 00:36:16,679 --> 00:36:18,970 But if you're trying to actually compute a steady state 922 00:36:18,970 --> 00:36:22,900 property, then maybe this isn't-- it's not necessarily 923 00:36:22,900 --> 00:36:25,177 the best way to do it, OK? 924 00:36:25,177 --> 00:36:27,760 Because you're doing something where you're doing extra effort 925 00:36:27,760 --> 00:36:28,990 to keep the time accuracy. 926 00:36:28,990 --> 00:36:31,270 It's sort of along the lines of John Paul's question. 927 00:36:31,270 --> 00:36:33,799 If you're doing Kinetic Monte Carlo, if you didn't really 928 00:36:33,799 --> 00:36:36,340 care about the time, then you don't have to spend time computing 929 00:36:36,340 --> 00:36:37,424 the time, right? 930 00:36:37,424 --> 00:36:39,340 And the same thing here is if you don't really 931 00:36:39,340 --> 00:36:40,798 care about the time, then you might 932 00:36:40,798 --> 00:36:44,140 want to use the Metropolis Monte Carlo instead, 933 00:36:44,140 --> 00:36:47,200 because if time's not really in your problem, 934 00:36:47,200 --> 00:36:49,930 then you can tailor it, take steps that don't really 935 00:36:49,930 --> 00:36:53,272 have to be physical or related to physical amounts of time, 936 00:36:53,272 --> 00:36:55,260 and you can still get the right integral. 937 00:36:55,260 --> 00:36:57,010 Whereas here, it's going to necessarily do 938 00:36:57,010 --> 00:36:59,050 things that are exactly the physical amount of time.
939 00:36:59,050 --> 00:37:00,550 And some processes in the real world 940 00:37:00,550 --> 00:37:04,030 are pretty darn slow, at least compared to 100 femtoseconds. 941 00:37:04,030 --> 00:37:06,880 So you might have problems trying to compute 942 00:37:06,880 --> 00:37:09,200 by a time accurate method. 943 00:37:09,200 --> 00:37:10,780 Now on the other hand, sometimes you 944 00:37:10,780 --> 00:37:13,150 want to compute time dependent properties. 945 00:37:13,150 --> 00:37:16,660 And this is more or less an exact simulation 946 00:37:16,660 --> 00:37:19,460 of what the molecules really do on the time scale 947 00:37:19,460 --> 00:37:20,620 that you're simulating. 948 00:37:20,620 --> 00:37:22,030 So if that's what you care about, 949 00:37:22,030 --> 00:37:24,480 like what's happening on the time scale of picoseconds 950 00:37:24,480 --> 00:37:26,782 and nanoseconds, this might be exactly right, 951 00:37:26,782 --> 00:37:28,990 because you're actually simulating with a tool that's 952 00:37:28,990 --> 00:37:31,198 time accurate on the time scale of what you're really 953 00:37:31,198 --> 00:37:32,509 trying to measure. 954 00:37:32,509 --> 00:37:35,050 And so some chemical reactions, some kinds of energy transfer 955 00:37:35,050 --> 00:37:36,758 processes, like Professor Tisdale's group 956 00:37:36,758 --> 00:37:40,220 has exciton transfers that are happening on picosecond time 957 00:37:40,220 --> 00:37:40,890 scales. 958 00:37:40,890 --> 00:37:43,030 It's tailored to that kind of problem. 959 00:37:45,550 --> 00:37:47,460 OK, but this is the limitation. 960 00:37:47,460 --> 00:37:50,200 So you have to use a very small delta t. 961 00:37:50,200 --> 00:37:51,940 Therefore, your total time is typically 962 00:37:51,940 --> 00:37:54,670 limited to nanoseconds as far as you can integrate, 963 00:37:54,670 --> 00:37:56,062 because you have to-- 964 00:37:56,062 --> 00:37:58,270 if you just count how many time steps you would need. 
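The step-counting argument above can be made concrete with a few lines of arithmetic. This is only a sketch: the 1 femtosecond time step is an assumed, illustrative value (the lecture only says vibrations happen on tens of femtoseconds, so delta t must be smaller than that), and `steps_needed` is a hypothetical helper name.

```python
FEMTOSECOND = 1e-15  # seconds


def steps_needed(total_time_s, dt_s):
    """Number of integration steps needed to cover total_time_s with step dt_s."""
    return round(total_time_s / dt_s)


if __name__ == "__main__":
    dt = 1.0 * FEMTOSECOND  # assumed time step, small compared to a vibration period
    for label, total in [("1 nanosecond", 1e-9), ("1 millisecond", 1e-3), ("1 second", 1.0)]:
        print(f"{label}: about {steps_needed(total, dt):.0e} steps")
```

With this assumed step, a nanosecond costs about a million force evaluations, and a millisecond costs a million times more than that, which is why slow processes are out of reach for a time accurate method.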
965 00:38:02,080 --> 00:38:04,870 So if you're trying to determine some kind of static equilibrium 966 00:38:04,870 --> 00:38:08,070 property, you start from some initial guess 967 00:38:08,070 --> 00:38:11,325 of the positions of the atoms and the initial velocities, 968 00:38:11,325 --> 00:38:13,200 and then physically, it takes some time scale 969 00:38:13,200 --> 00:38:16,239 for that initial guess to relax to the real equilibrium, 970 00:38:16,239 --> 00:38:18,030 because you don't know the real equilibrium 971 00:38:18,030 --> 00:38:20,100 situation of the system. 972 00:38:20,100 --> 00:38:22,620 And that time scale, if it's longer than nanoseconds, 973 00:38:22,620 --> 00:38:24,078 you're in trouble, because it's not 974 00:38:24,078 --> 00:38:27,150 going to be done before you've done the calculation. 975 00:38:27,150 --> 00:38:30,206 And you've never even achieved the equilibrium situation. 976 00:38:30,206 --> 00:38:32,580 Also, if you have a situation like that hydrogen peroxide 977 00:38:32,580 --> 00:38:34,470 case we talked about, suppose it takes 978 00:38:34,470 --> 00:38:36,590 a millisecond for the hydrogen peroxide 979 00:38:36,590 --> 00:38:39,090 to change conformation from the dihedral angle being one way 980 00:38:39,090 --> 00:38:41,010 to being the other way. 981 00:38:41,010 --> 00:38:43,770 If I can only, say, simulate for a nanosecond, 982 00:38:43,770 --> 00:38:45,510 then I'm never going to see that happen. 983 00:38:45,510 --> 00:38:48,050 So I'm never going to jump from the one conformer to the other. 984 00:38:48,050 --> 00:38:50,330 On the other hand, if I care what happens inside the one 985 00:38:50,330 --> 00:38:51,630 conformer, everything inside there 986 00:38:51,630 --> 00:38:53,421 probably happens on nanosecond time scales, 987 00:38:53,421 --> 00:38:55,620 and so I'll get the really good sampling 988 00:38:55,620 --> 00:38:59,420 of what's happening inside the one conformer.
989 00:38:59,420 --> 00:39:01,482 And similarly for the dynamic processes, 990 00:39:01,482 --> 00:39:03,690 if you have a process that happens on nanosecond time 991 00:39:03,690 --> 00:39:06,080 scale, this is really the ideal way to do it. 992 00:39:06,080 --> 00:39:08,430 If you have a process that's happening on millisecond 993 00:39:08,430 --> 00:39:10,003 or seconds or hours time scales, then 994 00:39:10,003 --> 00:39:12,680 this is not really the way to do it at all. 995 00:39:12,680 --> 00:39:15,610 And then there's an initial condition problem with this. 996 00:39:15,610 --> 00:39:17,710 It's kind of related to the Kinetic Monte Carlo 997 00:39:17,710 --> 00:39:19,280 initial condition problem. 998 00:39:19,280 --> 00:39:21,940 So in the Kinetic Monte Carlo, we 999 00:39:21,940 --> 00:39:24,910 had to sample over all the possible initial conditions. 1000 00:39:24,910 --> 00:39:26,851 We did it with the Poisson distribution. 1001 00:39:26,851 --> 00:39:28,600 There is a similar issue here, is how do I 1002 00:39:28,600 --> 00:39:29,808 start the initial conditions? 1003 00:39:29,808 --> 00:39:32,941 Where do I arrange the atoms to be to start out? 1004 00:39:32,941 --> 00:39:35,440 I really want to sample over all the different possible ways 1005 00:39:35,440 --> 00:39:37,810 the molecules could be arranged. 1006 00:39:37,810 --> 00:39:39,760 And particularly if I have some-- 1007 00:39:39,760 --> 00:39:41,230 say I have a protein. 1008 00:39:41,230 --> 00:39:45,280 And the protein has some conformation it likes to sit in, 1009 00:39:45,280 --> 00:39:47,866 and then it can unfold and go to some other conformation. 1010 00:39:47,866 --> 00:39:49,990 Maybe there's two or three conformations like this. 1011 00:39:49,990 --> 00:39:51,406 The time scales for those changes, 1012 00:39:51,406 --> 00:39:52,750 again, might be milliseconds.
1013 00:39:52,750 --> 00:39:53,800 I'm not going to be able to follow them 1014 00:39:53,800 --> 00:39:55,341 with the molecular dynamics, so I need 1015 00:39:55,341 --> 00:39:57,330 to have some sampling method to set me up 1016 00:39:57,330 --> 00:39:58,225 in each of the different conformations 1017 00:39:58,225 --> 00:39:59,470 that I want to sample. 1018 00:39:59,470 --> 00:40:01,180 And then I can follow very accurately 1019 00:40:01,180 --> 00:40:03,370 what would happen over a couple nanoseconds 1020 00:40:03,370 --> 00:40:05,110 after it's in that conformation. 1021 00:40:05,110 --> 00:40:07,330 But I'll hardly ever see it actually achieve 1022 00:40:07,330 --> 00:40:11,090 the other conformation on the time scale. 1023 00:40:11,090 --> 00:40:15,050 So I guess what I want to say to you 1024 00:40:15,050 --> 00:40:17,690 guys is these are different tools for really 1025 00:40:17,690 --> 00:40:18,980 different purposes. 1026 00:40:18,980 --> 00:40:22,280 The Metropolis Monte Carlo, the Kinetic Monte Carlo, 1027 00:40:22,280 --> 00:40:24,320 and the molecular dynamics, they're 1028 00:40:24,320 --> 00:40:26,190 all good for some kinds of problems. 1029 00:40:26,190 --> 00:40:28,400 But none of them is good for all problems. 1030 00:40:28,400 --> 00:40:35,120 And I often get journal papers where 1031 00:40:35,120 --> 00:40:38,130 somebody uses the wrong method for the problem. 1032 00:40:38,130 --> 00:40:39,950 And so I have to reject it. 1033 00:40:39,950 --> 00:40:42,730 I've often had people come that want to do postdocs for me. 1034 00:40:42,730 --> 00:40:43,760 And they're talking about their thesis, 1035 00:40:43,760 --> 00:40:46,310 and I see the poor kid has spent five years of his life using 1036 00:40:46,310 --> 00:40:48,560 the wrong tool for the problem he's doing. 1037 00:40:48,560 --> 00:40:49,760 It's very sad. 1038 00:40:49,760 --> 00:40:52,190 So don't let that be you, OK?
1039 00:40:52,190 --> 00:40:55,180 So don't just use a tool because you know it. 1040 00:40:55,180 --> 00:40:56,930 Say, does this tool work for this problem? 1041 00:40:56,930 --> 00:40:58,719 If not, I've got to find a new tool. 1042 00:40:58,719 --> 00:41:01,260 And just make sure you're using tools that match the problems 1043 00:41:01,260 --> 00:41:04,334 that you're trying to solve. 1044 00:41:04,334 --> 00:41:05,655 I think that's all I got. 1045 00:41:05,655 --> 00:41:07,410 So I have 10 more minutes left in class. 1046 00:41:07,410 --> 00:41:08,576 Any questions you guys have? 1047 00:41:12,930 --> 00:41:13,741 Yeah? 1048 00:41:13,741 --> 00:41:16,046 AUDIENCE: Just have one on weighting functions 1049 00:41:16,046 --> 00:41:20,290 for the Monte Carlo simulations. 1050 00:41:20,290 --> 00:41:23,310 I'm still a little shaky on how you always determine them. 1051 00:41:23,310 --> 00:41:26,395 It seems like it was given to us on the problem we 1052 00:41:26,395 --> 00:41:29,430 did on the homework and then in the example in class, 1053 00:41:29,430 --> 00:41:32,310 you just set it to an even distribution, 1054 00:41:32,310 --> 00:41:33,642 or a uniform distribution? 1055 00:41:33,642 --> 00:41:34,975 WILLIAM GREEN: Yeah, yeah, yeah. 1056 00:41:34,975 --> 00:41:37,731 AUDIENCE: How do you choose what's best, how do you know 1057 00:41:37,731 --> 00:41:38,230 [INAUDIBLE]? 1058 00:41:38,230 --> 00:41:39,063 WILLIAM GREEN: Yeah. 1059 00:41:42,330 --> 00:41:43,684 Well, yes. 1060 00:41:43,684 --> 00:41:44,725 So you get some integral. 1061 00:41:48,618 --> 00:41:50,921 This integral of g of x, something. 1062 00:41:54,010 --> 00:41:55,590 All right? 1063 00:41:55,590 --> 00:41:58,429 Where this is a lot. 1064 00:41:58,429 --> 00:41:59,970 And then you have to figure out, what 1065 00:41:59,970 --> 00:42:02,670 am I going to do so I can solve this thing?
1066 00:42:02,670 --> 00:42:07,170 And the clever thing is to figure out, hmm, 1067 00:42:07,170 --> 00:42:12,810 can I rewrite this as p of x times f of x, because I 1068 00:42:12,810 --> 00:42:15,090 know how to solve those kinds. 1069 00:42:15,090 --> 00:42:16,301 And actually, I don't even-- 1070 00:42:16,301 --> 00:42:18,300 Yeah, and even if I don't know what p is really, 1071 00:42:18,300 --> 00:42:25,110 maybe I can write it as w of x over some number of integrals 1072 00:42:25,110 --> 00:42:27,840 here. 1073 00:42:27,840 --> 00:42:30,570 That will be my p of x. 1074 00:42:30,570 --> 00:42:34,470 So most of these things you can do with uniform distributions 1075 00:42:34,470 --> 00:42:36,350 if you want. 1076 00:42:36,350 --> 00:42:38,567 But then your sampling can be extremely inefficient, 1077 00:42:38,567 --> 00:42:40,150 because you'll sample a lot of regions 1078 00:42:40,150 --> 00:42:44,272 that have very low probability of being really there. 1079 00:42:44,272 --> 00:42:45,730 But this is like a cleverness thing 1080 00:42:45,730 --> 00:42:48,610 about can I figure out what's that p times f 1081 00:42:48,610 --> 00:42:53,600 that's equal to g that's going to work the best for me. 1082 00:42:53,600 --> 00:42:57,530 And work the best, that's a good question. 1083 00:42:57,530 --> 00:43:04,080 So one way of working the best is, if f is mostly constant, 1084 00:43:04,080 --> 00:43:06,390 because if f is perfectly constant, 1085 00:43:06,390 --> 00:43:09,230 then I get the right answer in the first sample. 1086 00:43:09,230 --> 00:43:10,269 OK? 1087 00:43:10,269 --> 00:43:12,060 So that's one thing is trying to figure out 1088 00:43:12,060 --> 00:43:13,710 f to be really pretty constant. 1089 00:43:13,710 --> 00:43:15,470 The second one is to try to figure out 1090 00:43:15,470 --> 00:43:22,100 p that's very sharply focused. 1091 00:43:22,100 --> 00:43:25,040 So I don't need to sample a lot of x values to get it.
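The idea of splitting an integrand g(x) into p(x) times f(x) can be tried on a one-dimensional toy problem. This is a minimal sketch under assumptions chosen for illustration, none of which come from the lecture: the integrand g(x) is x squared times a standard normal density (exact integral 1), and the function names `estimate_uniform` and `estimate_gaussian` are hypothetical. The first estimator takes p to be uniform on a wide interval, the second takes p to be the normal density so that f is just x squared.

```python
import math
import random


def g(x):
    """Toy integrand: x^2 times the standard normal density. Exact integral is 1."""
    return x * x * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def estimate_uniform(n, half_width, rng):
    """Choose p uniform on [-half_width, half_width], so f = g / p = 2 * half_width * g."""
    total = 0.0
    for _ in range(n):
        x = rng.uniform(-half_width, half_width)
        total += 2.0 * half_width * g(x)
    return total / n


def estimate_gaussian(n, rng):
    """Choose p as the standard normal density, so f = g / p = x^2."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        total += x * x
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    n = 20_000
    print("uniform p on [-50, 50]:", estimate_uniform(n, 50.0, rng))
    print("standard normal p:     ", estimate_gaussian(n, rng))
```

Both estimators target the same integral, but the uniform choice spends most of its samples where g is essentially zero, so its estimate is noisier for the same number of samples. In the limiting case where p matches g up to a constant factor, f is constant and a single sample already gives the answer.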
1092 00:43:25,040 --> 00:43:27,290 Now these two are kind of at odds with each other, 1093 00:43:27,290 --> 00:43:31,250 the sharp p and the flat f. 1094 00:43:31,250 --> 00:43:33,020 Probably again, probably people 1095 00:43:33,020 --> 00:43:34,894 wrote their whole PhD theses in applied mathematics 1096 00:43:34,894 --> 00:43:38,750 about what's the optimal choice of p and f. 1097 00:43:38,750 --> 00:43:41,494 A lot of times, I'm not that smart, 1098 00:43:41,494 --> 00:43:43,910 so I just like, if I have a problem in stat mech, I just-- 1099 00:43:43,910 --> 00:43:45,270 I'll always do a Boltzmann factor. 1100 00:43:45,270 --> 00:43:46,895 It may not really be the best thing to do, 1101 00:43:46,895 --> 00:43:48,740 but that's what I would do, right? 1102 00:43:48,740 --> 00:43:51,650 And if I was doing a Bayesian problem, 1103 00:43:51,650 --> 00:43:54,729 it's sort of like the w is given to you, right? 1104 00:43:54,729 --> 00:43:55,270 Where's that? 1105 00:43:58,410 --> 00:44:02,372 That's the formula I know, so I'm going to use that w. 1106 00:44:02,372 --> 00:44:04,080 Maybe there's a more clever way to do it, 1107 00:44:04,080 --> 00:44:06,197 but that's what I normally would do. 1108 00:44:09,286 --> 00:44:11,410 Actually, one thing that I don't know if we've ever 1109 00:44:11,410 --> 00:44:16,570 talked about explicitly, but very important to know, 1110 00:44:16,570 --> 00:44:23,130 is that this kind of formula, these formulas have w of theta. 1111 00:44:23,130 --> 00:44:26,145 That's the joint probability that theta one is something, 1112 00:44:26,145 --> 00:44:28,520 and theta two is something, and theta three is something, 1113 00:44:28,520 --> 00:44:30,560 and theta four is something. 1114 00:44:30,560 --> 00:44:32,704 It's that probability density. 1115 00:44:32,704 --> 00:44:34,120 But a lot of times, you don't care 1116 00:44:34,120 --> 00:44:35,721 about that level of detail.
1117 00:44:35,721 --> 00:44:37,720 Like you may have only a few of those parameters 1118 00:44:37,720 --> 00:44:39,100 that matter to you. 1119 00:44:39,100 --> 00:44:41,680 So you really care, you would like to know p of-- 1120 00:44:41,680 --> 00:44:43,697 I don't know, theta one, theta two. 1121 00:44:43,697 --> 00:44:46,030 And you'd like to get rid of all the rest of the thetas, 1122 00:44:46,030 --> 00:44:48,156 because these are two thetas you really care about. 1123 00:44:48,156 --> 00:44:49,571 Maybe these are the ones you think 1124 00:44:49,571 --> 00:44:51,170 you're controlling in your experiment, 1125 00:44:51,170 --> 00:44:52,762 you're trying to determine. 1126 00:44:52,762 --> 00:44:54,220 And the other ones, somebody already 1127 00:44:54,220 --> 00:44:55,780 measured the mass of the proton, so you really 1128 00:44:55,780 --> 00:44:57,430 don't want to determine the mass of the proton again, 1129 00:44:57,430 --> 00:44:58,750 and you're not going to really say anything about it 1130 00:44:58,750 --> 00:45:00,100 even if you did. 1131 00:45:00,100 --> 00:45:02,655 If your calculation says that it made 1132 00:45:02,655 --> 00:45:04,780 the mass of the proton a little bit different than what 1133 00:45:04,780 --> 00:45:06,812 the standard book says, you might believe that 1134 00:45:06,812 --> 00:45:09,270 in your heart that it's true, but you're probably not going 1135 00:45:09,270 --> 00:45:10,980 to say it, because you're like, I better double 1136 00:45:10,980 --> 00:45:11,938 check before I do that. 1137 00:45:11,938 --> 00:45:14,910 But I'm sure that I determined the length of my reactor, 1138 00:45:14,910 --> 00:45:16,050 because I measured that with a meter stick, 1139 00:45:16,050 --> 00:45:17,466 and nobody else knows that number, 1140 00:45:17,466 --> 00:45:20,340 so I'm sure that's the theta I got that I'm really 1141 00:45:20,340 --> 00:45:21,990 going to be in control of here. 1142 00:45:21,990 --> 00:45:26,790 So oftentimes you do this.
1143 00:45:26,790 --> 00:45:29,520 You want theta one, theta two, but you actually 1144 00:45:29,520 --> 00:45:35,560 know theta one, theta two, theta three, quite a few of them 1145 00:45:35,560 --> 00:45:37,680 from my formula. 1146 00:45:37,680 --> 00:45:41,310 And so you can do what's called a marginal integral. 1147 00:45:41,310 --> 00:45:45,246 Suppose I have-- I don't know-- theta three, theta four. 1148 00:45:45,246 --> 00:45:48,090 I could do d theta three, d theta four. 1149 00:45:50,680 --> 00:45:53,620 And this is like integrating out these degrees of freedom 1150 00:45:53,620 --> 00:45:59,360 to get the probability density that I want, all right? 1151 00:45:59,360 --> 00:46:02,540 So if you have some case where all you care about is 1152 00:46:02,540 --> 00:46:06,574 the variance of something, and you 1153 00:46:06,574 --> 00:46:07,990 don't care about all the rest, you 1154 00:46:07,990 --> 00:46:09,460 can kind of integrate them out. 1155 00:46:09,460 --> 00:46:11,172 That's a very handy trick to do. 1156 00:46:15,524 --> 00:46:17,940 Well, you can do similar things with the Boltzmann things, 1157 00:46:17,940 --> 00:46:20,231 like for example, a lot of the Boltzmann distributions, 1158 00:46:20,231 --> 00:46:22,197 we don't actually care about the momenta, 1159 00:46:22,197 --> 00:46:24,780 because what we measure, say, is like a crystallography thing. 1160 00:46:24,780 --> 00:46:25,863 We see positions of atoms. 1161 00:46:25,863 --> 00:46:27,510 We don't see velocities anyway. 1162 00:46:27,510 --> 00:46:28,180 So a lot of times, people will just 1163 00:46:28,180 --> 00:46:30,390 integrate all the velocities out of the problem 1164 00:46:30,390 --> 00:46:31,462 right at the beginning. 1165 00:46:34,342 --> 00:46:35,800 It depends on what you want, right? 1166 00:46:35,800 --> 00:46:39,714 You're in charge of your problem. 1167 00:46:39,714 --> 00:46:41,172 Any more questions? 1168 00:46:44,580 --> 00:46:45,235 Yes, Kristen.
1169 00:46:45,235 --> 00:46:47,460 AUDIENCE: OK, well, I just have an announcement. 1170 00:46:47,460 --> 00:46:47,650 WILLIAM GREEN: OK. 1171 00:46:47,650 --> 00:46:49,540 AUDIENCE: [INAUDIBLE] today at office hours, 1172 00:46:49,540 --> 00:46:52,405 we'll talk about KMC, probably from about 5:00 to 5:30, 1173 00:46:52,405 --> 00:46:54,830 but come any other time [INAUDIBLE] questions. 1174 00:46:54,830 --> 00:46:57,401 And we just posted a poll for the final review session. 1175 00:46:57,401 --> 00:46:59,526 It's either going to be Wednesday evening or Friday 1176 00:46:59,526 --> 00:47:02,472 morning, so if you could vote on that as soon as possible, 1177 00:47:02,472 --> 00:47:03,804 that would be [INAUDIBLE]. 1178 00:47:03,804 --> 00:47:04,970 WILLIAM GREEN: OK, got that? 1179 00:47:04,970 --> 00:47:08,520 So final review is either Wednesday evening or Friday 1180 00:47:08,520 --> 00:47:10,290 morning, you have to vote. 1181 00:47:10,290 --> 00:47:11,790 And if you come today, 5:00 to 5:30, 1182 00:47:11,790 --> 00:47:13,934 it'll be all about Kinetic Monte Carlo. 1183 00:47:13,934 --> 00:47:15,600 And other times today are about anything 1184 00:47:15,600 --> 00:47:18,210 you want to talk about. 1185 00:47:18,210 --> 00:47:21,030 And the homework solution will be posted shortly, 1186 00:47:21,030 --> 00:47:25,590 I think, for the last homework that was graded. 1187 00:47:25,590 --> 00:47:27,170 Anything else? 1188 00:47:27,170 --> 00:47:29,380 All right, good luck as you study and good luck 1189 00:47:29,380 --> 00:47:31,140 on the exam.