The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation, or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: OK, we're moving on. No more linear algebra. We're going to try solving some more difficult problems. Of course, all those problems will just be turned into linear algebra as we move on, so your expertise now with different techniques from linear algebra is going to come in handy. In the next section of this course, we're talking about systems of nonlinear equations, and we'll transition into problems in optimization, which, it turns out, look a lot like systems of nonlinear equations as well. And we'll try to leverage what we learn in the next few lectures to solve different optimization problems.

Before going on, I just want to recap. Last time we talked about singular value decomposition, which is like an eigenvalue decomposition for any matrix.
And associated with that matrix are singular vectors, left and right singular vectors, and the singular values of that matrix. Your TA, Kristen, reminded me that you can actually define a condition number for any matrix as well. The condition number we gave originally was associated with solving square systems of equations that actually have a solution. But there is a condition number associated with any matrix, and it's defined in terms of its singular values. If you go back and look at the definition of the two-norm, and you think about the condition number associated with the two-norm, you'll see that the condition number of any matrix can be defined as the ratio of the biggest to the smallest singular value of that matrix. So there's a condition number associated with any matrix. The condition number as an entity makes the most sense when we're thinking about how we amplify errors, numerical errors, in solving systems of equations. It's most easily applied to square systems that actually have unique solutions, but you can apply it to any system of equations you want.
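As a quick sanity check, the two-norm condition number of an arbitrary, even non-square, matrix can be computed from its singular values. A minimal sketch assuming NumPy, with an illustrative matrix of my own choosing:

```python
import numpy as np

# Any matrix, square or not, has singular values, and hence a 2-norm
# condition number: the ratio of the largest to the smallest singular value.
A = np.array([[2.0, 0.0],
              [0.0, 0.5],
              [1.0, 1.0]])  # an illustrative 3x2 (non-square) matrix

s = np.linalg.svd(A, compute_uv=False)  # singular values, sorted descending
cond = s[0] / s[-1]

# NumPy's built-in condition number uses the same definition for the 2-norm.
assert np.isclose(cond, np.linalg.cond(A, 2))
print(cond)
```

For a square invertible matrix this reduces to the familiar condition number from solving linear systems; for rectangular matrices it plays the same error-amplification role in least-squares problems.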
And the singular value decomposition is one way to tap into that.

OK, the last thing we did in discussing linear algebra was to talk about iterative solutions to systems of linear equations. And that's actually our hook into solutions to systems of nonlinear equations. It's going to turn out that exact solutions are hard to come by for anything but linear systems of equations. So we're always going to have these iterative algorithms, where we refine initial guesses for solutions until we converge to something that's a solution to the problem we wanted to solve. And one question you should ask, when you do these sorts of iterations, is: when do I stop? I don't know the exact solution to this problem. I can't say I'm close enough; what does close enough even mean? So how do I decide to stop? Do you have any ideas or suggestions for how you might do that? You've done some of this already in a homework assignment. But what do you think? How do you decide to stop? Yeah?

AUDIENCE: [INAUDIBLE]
PROFESSOR: OK, so this is one suggestion: look at my current iteration and my next iteration, and ask, how far apart are these two numbers? If they're sufficiently far apart, it seems like I've got some more steps to make before I converge to my solution. And if they're sufficiently close together, the steps I'm taking are small enough that I might be happy accepting this solution as a good approximation. That's called the step norm criterion; I'll give you a formalization of it later. How big are the steps that I'm taking? And are they sufficiently small that I don't care about any future steps?

Another suggestion?

AUDIENCE: I've got a question.

PROFESSOR: Yeah?

AUDIENCE: [INAUDIBLE] absolute [INAUDIBLE] when we did that for homework, I tried to do it [INAUDIBLE], and [INAUDIBLE].

PROFESSOR: Yes. I will show you a definition of the step norm criterion that integrates both relative and absolute error into the definition. And we'll see why, OK?
One problem may be: what if the solution I'm trying to converge to is 0? How do you define the relative error with respect to the number 0? There isn't one; there is only absolute error when you're trying to converge to 0. So you may want to have some measure of both absolute and relative step size in order to determine whether you've converged. Is that the only way to do it, though? Any ideas, alternative proposals for deciding convergence?

AUDIENCE: [INAUDIBLE] the residual.

PROFESSOR: The residual. OK, what's the residual?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Good, OK. We can ask how good a solution this value we've converged to is by putting it back into the original system of equations, and asking: how far out of balance are we? I take my best guess for the solution x, multiply it by A, and subtract b; we call that the residual. Is the residual sufficiently converged or not? If it's small enough in magnitude, then we would say, OK, maybe we're sufficiently close to the solution.
If it's too big, then we say, OK, let's iterate some more until we get there. That is called the function norm criterion.

We're going to talk about these in detail as applied to systems of nonlinear equations. But these same criteria apply to all iterative processes. Neither is preferred over the other. You don't know the exact solution, so you have no way of measuring how close or far away you are from it. So usually you use as many tools as possible to try to judge how good your approximation is. But you don't know for certain.

What do you do when that's the case? We haven't really talked about this in this class. You have a problem, you program it into a computer, you get a solution. Is that the end of the story? We just stop? You get a number back out, and that's the answer? How do you know you're right? How do you know? We talked about numerical error; every calculation has a numerical error. How do you know you got the right answer? What do you think?
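The two stopping tests just described, the step norm with absolute and relative tolerances and the function norm on the residual, can be sketched on a toy problem. The fixed-point iteration and the tolerance values below are illustrative choices, not examples from the lecture:

```python
import math

# Illustrative iteration: x_{k+1} = cos(x_k), which converges to the
# fixed point x* = cos(x*). The residual function is f(x) = x - cos(x).

def f(x):
    return x - math.cos(x)  # zero exactly at the fixed point

x_old = 1.0                      # arbitrary initial guess
atol, rtol, ftol = 1e-10, 1e-10, 1e-10   # illustrative tolerances

for k in range(1000):
    x_new = math.cos(x_old)
    # Step norm test: is the step small in both absolute and relative terms?
    step_small = abs(x_new - x_old) <= atol + rtol * abs(x_new)
    # Function norm test: is the residual close enough to zero?
    residual_small = abs(f(x_new)) <= ftol
    if step_small and residual_small:
        break
    x_old = x_new

print(x_new, f(x_new))
```

Mixing `atol` into the step test is what saves you when the solution itself is 0, where relative error alone is undefined; requiring both tests guards against an iteration that stalls (tiny steps) far from an actual root.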
AUDIENCE: [INAUDIBLE]

PROFESSOR: OK, yeah, this is true. So you plug your solution back into the equation and ask, how good a job does it do satisfying that? But maybe this equation is relatively insensitive to the solution you provide. Maybe many solutions nearby look like they also satisfy the equation, but those solutions are actually far apart from one another. So how do you--

AUDIENCE: [INAUDIBLE]

PROFESSOR: OK, so that sort of physical reasoning is a good one. In your transfer class, you'll talk about doing asymptotic expansions, or asymptotic solutions to problems: solve this complicated problem in a limit where it has some analytical solution, and figure out how the solution scales with respect to different parameters. So you can have an analytical solution that you compare against your numerical solution in certain limits. Or you have experiments that you've done. Experiment, that's the reality; the computer is a fiction that's trying to model reality, so you can compare your solution to experiments.
You could also solve the problem a bunch of different ways, and see if all those answers converge to the same place. We're going to talk about solving nonlinear equations; we're going to need initial guesses for our iterative methods. We might try several different initial guesses and see what solutions we come up with. Maybe we all converge to the same solution, or maybe this problem has some weird sensitivity in it, and we get lots of different solutions that aren't coordinated with each other.

One of the duties of someone who's using numerical methods to solve problems is to try to validate their result: by solving it multiple times or multiple ways, or comparing against experiment, or against known analytical results in certain limits where the answer should be exact. You can't just accept what the computer tells you; you have to validate it against some sort of external solution that you can compare with. Sometimes it's hard to come up with that solution, but it's immensely important. We know every calculation can be in error.
And as we go on to more complicated problems, it's even more important to validate things.

So, systems of nonlinear equations. These are problems of the type f of x equals 0, where x is some vector of unknowns of dimension N, and f is a function that takes as input vectors of dimension N and gives as output a vector of dimension N. It's a map from R^N to R^N. But it's no longer necessarily a linear map; it can be some nonlinear function of all the elements of this x. And the solutions of this equation, the particular values of x for which f of x equals 0, are called the roots of this vector-valued function.

Linear equations are just represented in this form as A x minus b equals 0; that's the same as the linear equations we were solving before. So we're searching for the roots of these functions. Common chemical engineering examples include equations of state, which are often nonlinear in the variables we're interested in; energy balances, which have lots of nonlinearities introduced into them; and mass balances with nonlinear reactions.
Or reactions that are non-isothermal, so their kinetics are sensitive to temperature, and temperature is a variable we want to know. These sorts of nonlinear equations crop up all over the place, and you want to be able to solve them reliably.

Here's a simple example: the Van der Waals equation of state. Here I've written it in terms of reduced pressure, temperature, and molar volume; nonetheless, this is the Van der Waals equation of state. And somebody told you once that if I plot pressure versus molar volume for different temperatures, I may see that, at a given pressure, there could be just one root, one possible molar volume that satisfies the equation of state. Or there can be one, two, or three potential roots. It turns out we don't know, with nonlinear equations, how many possible solutions there are. We knew, for linear equations, that we either had zero, one, or an infinite number of solutions. But with nonlinear equations, in general, there's no way to predict it.
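For the Van der Waals case specifically, the one-versus-three-roots picture can be checked numerically. A sketch assuming NumPy; the algebra below (clearing denominators in the reduced equation of state) is my own rearrangement, so check it against the slides:

```python
import numpy as np

# Reduced Van der Waals EOS: P = 8T/(3V - 1) - 3/V^2.
# Multiplying through by V^2 (3V - 1) gives a cubic in the molar volume V:
#     3 P V^3 - (P + 8 T) V^2 + 9 V - 3 = 0,
# so every candidate molar volume comes out of a polynomial root finder.

def molar_volumes(P, T):
    """All physical molar volumes satisfying the reduced VdW EOS."""
    roots = np.roots([3.0 * P, -(P + 8.0 * T), 9.0, -3.0])
    real = roots[np.abs(roots.imag) < 1e-8].real  # discard complex pairs
    return np.sort(real[real > 1.0 / 3.0])        # need 3V - 1 > 0 in reduced units

print(molar_volumes(1.0, 1.2))  # above the critical temperature: one root
print(molar_volumes(0.6, 0.9))  # below it, at this pressure: three roots
```

This is exactly the sense in which polynomials are special among nonlinear equations: the degree bounds the number of roots (here, at most three), where for a general nonlinear system no such bound exists.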
This problem can be transformed into a polynomial, and polynomials are one of the few kinds of nonlinear equations where we know how to bound the possible number of solutions. So we can transform this nonlinear equation into the form I showed you before; we just move 8/3 T to the other side of the equation. We want to find the roots of this equation. So possibly, given pressure and temperature, find all the molar volumes that satisfy the equation of state.

This is actually overly simplified for a particular physical problem: looking at vapor-liquid coexistence of the Van der Waals fluid. You can't specify pressure and temperature independently; the saturation pressure, the coexistence pressure, depends on the temperature per the Gibbs phase rule. So actually, phase equilibrium is made up of three parts. There's thermal equilibrium: if I have two phases, a gas and a liquid, they have to have the same temperature, otherwise they're not in equilibrium. There's got to be mechanical equilibrium of the two phases, the gas and the liquid.
They'd better have the same pressure; otherwise one is going to be pushing harder on the other one, they'll be in motion, and that's not equilibrium. And they've got to have the same chemical potential; they have to be in chemical equilibrium. There can't be any net mass flux from one phase to another, otherwise one phase is going to grow and the other is going to shrink, and they're not in equilibrium with each other.

So actually, the problem of determining vapor-liquid coexistence in this Van der Waals fluid involves satisfying a number of different equations, some of which are nonlinear, and which are constrained by the equation of state. Given the temperature, there are three unknowns: the pressure, and the molar volumes of the gas and the liquid. And there are three nonlinear equations we have to solve. Two of those are the equation of state in the gas and in the liquid, and I'll show them to you in a second. The other is the Maxwell equal-area construction, which essentially says that the chemical potential in the two phases is equal; this is one way of representing that.
So we have to solve this system of nonlinear equations for the saturation pressure, the molar volume of the gas or vapor, and the molar volume of the liquid. And these are those equations: here's the equation of state in the gas, here's the equation of state in the liquid, and here's the Maxwell equal-area construction. We want to find the values of P sat, V G, and V L that satisfy all three equations. There's not going to be an analytical way to do this; it has to be done numerically.

Here's a simplification I can make, though. I can take that equal-area construction and solve for P sat in terms of V G and V L. That reduces the dimensionality of these equations from three to two. And when it's two-dimensional, I can plot these things, so that's helpful. So let's plot f 1, the equation of state in the gas, as a function of V G and V L; where that's equal to 0 is this black curve here. Let's plot f 2, the equation of state in the liquid, as a function of V G and V L; where that's equal to 0 is this blue curve here. And the solutions are where these curves intersect.
So we're seeking out, graphically, the specific points where these curves intersect. First, this solution and that solution aren't the solutions we're interested in at all. Those would say that the molar volume of the gas and the liquid is the same; that's not really phase separation. We want these heterogeneous solutions out here. So we need some methodology that can reliably take us to those solutions. We'll see that that methodology, the most reliable one, and one of the ones that converges fastest to the solutions, is called the Newton-Raphson method. But even before we do that, let's talk more about the structure of systems of nonlinear equations, and what sorts of solutions we can expect. Does this example make sense to everyone? Have you thought about this before, maybe? Yeah.

So, given a function, which is a map from R^N to R^N, find the special solution, x star, such that f of x star equals 0. That's our task. And there could be no solutions.
There can be anywhere from one to an infinite number of locally unique solutions, or there can be an infinite number of solutions that aren't locally unique. A solution is said to be locally unique if I can wrap that solution in some ball of points in which there are no other solutions. That ball can be very, very small, but as long as I can wrap the solution in some ball of points that are not solutions, we term it locally unique.

So consider a simple system: a function f 1, which depends on two variables, x1 and x2, equals 0, and f 2, which depends on x1 and x2, equals 0. And I'm going to plot, in the x1, x2 plane, where f 1 and f 2 are equal to 0; those are these curves here. Here, we have a locally unique solution: we see the curves cross at exactly one point. Here, you can see these two curves are tangent to each other. They could be coincident with each other over some finite distance, in which case there are a lot of solutions that live on some locally tangent segment. Or they could just touch at one point.
So they may be tangent, and the solutions there are not locally unique; or they may touch at one point, and then there's one solution, and it's locally unique there.

The reason we talk about locally unique solutions is that it's going to be hard for a numerical method to find anything that's not locally unique in a reliable way. Locally unique solutions, numerical methods can find very reliably. But if they're not locally unique? My iterative method could converge to any one of the solutions that lives on this line, any of these tangent points, and I'm going to have a hard time predicting which one it's going to go to. That's a problem if you're trying to solve something reliably over and over again. If I converge to one of these solutions, or another solution, or another solution, the data that comes out of that process isn't going to be easy to interpret.
There's something called the inverse function theorem, which says that if f of x star is equal to 0, and the determinant of this matrix J, which we call the Jacobian, evaluated at x star is not equal to 0, then x star is necessarily a locally unique solution. The Jacobian is the matrix of partial derivatives of the elements of f with respect to the different elements of x. So the first row of the Jacobian is the derivatives of the first element of f with respect to all the elements of x, and the other rows proceed accordingly. If the determinant of this matrix, evaluated at the solution, is not equal to 0, then that solution is necessarily locally unique. That's the inverse function theorem.

The Jacobian describes the rate of change of this vector-valued function with respect to all of its independent variables. And you may find that, for some solutions, the determinant of the Jacobian is equal to 0. We can't really say what's going on there; the solution may be locally unique, or it may not be.
I'm going to give you some examples in a second. And most numerical methods are only going to find one of these locally unique solutions at a time. If we have some non-locally-unique solutions, that'll cause us problems. So we tend to want to work with functions that have locally unique solutions to begin with.

OK, here's an example. Oh, you have your notes, so you know the formula for the Jacobian. Compute the Jacobian of this function.

So this function has a root at x1 equals 0, x2 equals 0. If you think graphically about what each of these little functions represents, you would agree that that root is locally unique; it's just one point where both elements of this vector-valued function are equal to 0. Here's what the Jacobian of this function should be: take the derivative of the first element with respect to x1 and then x2, and take the derivative of the second element with respect to x1 and then x2. At the solution, at the root of this function, where x1 is 0 and x2 is 0, the Jacobian is a matrix of zeros. Its determinant is 0. But the solution is locally unique.
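The slide's example function isn't reproduced in the transcript; as a hypothetical stand-in with the same property (a locally unique root at the origin where the Jacobian determinant vanishes), consider f(x) = (x1 squared, x2 squared). A sketch assuming NumPy, with the Jacobian approximated by finite differences:

```python
import numpy as np

# Hypothetical stand-in function (not the one from the slides):
# f(x) = (x1^2, x2^2). Its only root is the origin, which is locally
# unique, yet the Jacobian there is the zero matrix.

def f(x):
    return np.array([x[0] ** 2, x[1] ** 2])

def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian: J[i, j] = d f_i / d x_j."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2.0 * h)
    return J

J = jacobian(f, np.zeros(2))
print(np.linalg.det(J))  # -> 0.0: the theorem is silent here, yet the root is unique
```

This shows the one-way nature of the theorem: a nonzero determinant guarantees local uniqueness, but a zero determinant guarantees nothing either way.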
452 00:20:16,840 --> 00:20:19,330 The inverse function theorem only 453 00:20:19,330 --> 00:20:22,690 tells us about what happens when the determinant's not 454 00:20:22,690 --> 00:20:24,610 equal to 0. 455 00:20:24,610 --> 00:20:27,220 If the determinant's not 0, then we 456 00:20:27,220 --> 00:20:30,380 have a locally unique solution. 457 00:20:30,380 --> 00:20:32,470 A solution may be locally unique while its determinant 458 00:20:32,470 --> 00:20:33,790 is equal to 0. 459 00:20:33,790 --> 00:20:34,770 Does that make sense? 460 00:20:34,770 --> 00:20:37,260 You see how that plays out? 461 00:20:37,260 --> 00:20:39,240 OK. 462 00:20:39,240 --> 00:20:41,069 There's a physical way to think about-- 463 00:20:41,069 --> 00:20:43,360 or a geometric way to think about this inverse function 464 00:20:43,360 --> 00:20:43,859 theorem. 465 00:20:43,859 --> 00:20:45,740 So think about the linear equation, 466 00:20:45,740 --> 00:20:48,754 f of x is A x minus b. 467 00:20:48,754 --> 00:20:51,170 You can show-- and you should actually work through this-- 468 00:20:51,170 --> 00:20:52,820 that the Jacobian of this function 469 00:20:52,820 --> 00:20:58,180 is just the matrix A. It says how the function changes, 470 00:20:58,180 --> 00:21:00,620 with respect to small changes in x. 471 00:21:00,620 --> 00:21:01,840 Well, that's just A-- 472 00:21:01,840 --> 00:21:05,090 this is a linear function. 473 00:21:05,090 --> 00:21:07,000 So the equation, f of x equals 0, 474 00:21:07,000 --> 00:21:09,310 has a locally unique solution when the determinant 475 00:21:09,310 --> 00:21:12,400 of the Jacobian-- which is the determinant of A-- 476 00:21:12,400 --> 00:21:13,666 is not equal to 0. 477 00:21:13,666 --> 00:21:15,040 But you already knew that, right? 478 00:21:15,040 --> 00:21:18,530 We already talked through linear algebra. 
479 00:21:18,530 --> 00:21:22,370 And so you know when this matrix A is singular, 480 00:21:22,370 --> 00:21:24,280 then we can't invert this system of equations 481 00:21:24,280 --> 00:21:27,190 and find a unique solution in the first place. 482 00:21:27,190 --> 00:21:29,890 So the inverse function theorem is nothing more 483 00:21:29,890 --> 00:21:32,950 than an extension of what we learned about when functions 484 00:21:32,950 --> 00:21:34,720 are and aren't invertible. 485 00:21:34,720 --> 00:21:37,390 Because there's a locally unique solution when A is invertible. 486 00:21:39,940 --> 00:21:43,480 In the neighborhood of f of x, in the neighborhood 487 00:21:43,480 --> 00:21:47,410 of a root of f of x, we can often approximate the function 488 00:21:47,410 --> 00:21:49,360 as being linear. 489 00:21:49,360 --> 00:21:52,120 We can treat it as though it's a system of linear equations, 490 00:21:52,120 --> 00:21:54,989 very close to that root. 491 00:21:54,989 --> 00:21:57,280 And then the things that we learned from linear algebra 492 00:21:57,280 --> 00:22:01,540 are inherited by these linearized solutions. 493 00:22:01,540 --> 00:22:05,680 So here's this set of curves that I showed you before. 494 00:22:05,680 --> 00:22:08,920 Near this root, let's zoom in-- 495 00:22:08,920 --> 00:22:10,510 let's zoom in. 496 00:22:10,510 --> 00:22:11,980 These lines look mostly straight. 497 00:22:11,980 --> 00:22:16,855 It's like the place where two planes intersect-- 498 00:22:16,855 --> 00:22:19,920 intersect this x1, x2 plane-- they each intersect at a line, 499 00:22:19,920 --> 00:22:23,010 and the crossing of those lines is the solution. 500 00:22:23,010 --> 00:22:24,010 And it's locally unique. 501 00:22:24,010 --> 00:22:29,880 Because these two planes span different subspaces. 502 00:22:29,880 --> 00:22:31,770 Here's the case where we may have 503 00:22:31,770 --> 00:22:33,180 non-locally unique solutions. 
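The claim that the Jacobian of f(x) = A x - b is just A is easy to verify numerically. This sketch uses a concrete 2-by-2 A and b of my own choosing and a finite-difference check:

```python
# Verifying that the Jacobian of f(x) = A x - b is A itself, by finite
# differences on an arbitrary 2x2 example (A, b are my own choices).
A = [[3.0, 1.0], [2.0, 4.0]]
b = [1.0, 2.0]

def f(x):
    return [A[0][0] * x[0] + A[0][1] * x[1] - b[0],
            A[1][0] * x[0] + A[1][1] * x[1] - b[1]]

h = 1e-6
x0 = [0.7, -0.3]        # any base point works: the function is linear
f0 = f(x0)
J = [[0.0, 0.0], [0.0, 0.0]]
for j in range(2):       # perturb each independent variable in turn
    xp = list(x0)
    xp[j] += h
    fp = f(xp)
    for i in range(2):
        J[i][j] = (fp[i] - f0[i]) / h
# J now matches A up to roundoff, at any base point x0
```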
504 00:22:33,180 --> 00:22:36,090 Zoom in on this root, and very close to this root, 505 00:22:36,090 --> 00:22:37,650 well, it's hard to tell. 506 00:22:37,650 --> 00:22:40,620 Maybe these two planes are coincident with each other, 507 00:22:40,620 --> 00:22:42,970 and they intersect and form the same line-- 508 00:22:42,970 --> 00:22:45,360 in which case they may not be locally unique. 509 00:22:45,360 --> 00:22:47,970 Maybe if I zoom in close enough, I see, no, actually, they 510 00:22:47,970 --> 00:22:50,053 have a slightly different orientation with respect 511 00:22:50,053 --> 00:22:53,340 to each other, and there is a locally unique solution there. 512 00:22:53,340 --> 00:22:56,760 It's difficult to tell here. 513 00:22:56,760 --> 00:22:59,490 So these cases where the curves cross are easy to determine. 514 00:22:59,490 --> 00:23:00,960 These are the ones that the inverse function 515 00:23:00,960 --> 00:23:01,967 theorem tells us about. 516 00:23:01,967 --> 00:23:03,800 These ones are a little harder to work with. 517 00:23:07,630 --> 00:23:11,950 So I mentioned that you can zoom in, and look close to a root, 518 00:23:11,950 --> 00:23:14,090 and approximate the function as linear. 519 00:23:14,090 --> 00:23:16,680 This is a process called linearization. 520 00:23:16,680 --> 00:23:19,450 You've seen this for 1-D functions-- 521 00:23:19,450 --> 00:23:22,510 f of x, at a point x plus delta x, 522 00:23:22,510 --> 00:23:26,815 is f of x plus its derivative times delta x. 523 00:23:29,480 --> 00:23:32,460 And this'll typically be valid for reasonably well-behaved 524 00:23:32,460 --> 00:23:34,200 functions-- this sort of a linearization 525 00:23:34,200 --> 00:23:36,660 is going to be valid as delta x goes to 0. 526 00:23:36,660 --> 00:23:38,640 So as long as I haven't moved too far away 527 00:23:38,640 --> 00:23:42,870 from the point x, I can approximate my function 528 00:23:42,870 --> 00:23:45,894 in the neighborhood of x using this linearization. 
529 00:23:45,894 --> 00:23:47,310 You know, turns out the same thing 530 00:23:47,310 --> 00:23:49,050 is true for vector-valued functions. 531 00:23:49,050 --> 00:23:53,070 So f of x plus delta x is f of x plus-- 532 00:23:53,070 --> 00:23:56,580 well, I need the derivatives of my function with respect to x, 533 00:23:56,580 --> 00:23:59,020 and those derivatives are partial derivatives now, 534 00:23:59,020 --> 00:24:01,440 because we're in higher dimensional spaces. 535 00:24:01,440 --> 00:24:04,290 That's the Jacobian multiplied by this vector 536 00:24:04,290 --> 00:24:06,300 of displacements, delta x. 537 00:24:06,300 --> 00:24:08,040 And this will typically be valid as long 538 00:24:08,040 --> 00:24:11,250 as the length of this delta x is not too big-- 539 00:24:11,250 --> 00:24:14,130 as long as I haven't moved too far away from the point 540 00:24:14,130 --> 00:24:16,860 I'm interested in, f of x, this will be a reasonably good 541 00:24:16,860 --> 00:24:17,580 approximation. 542 00:24:17,580 --> 00:24:21,330 As long as our functions are well-behaved. 543 00:24:21,330 --> 00:24:25,762 There's an error that's incurred in making these approximations. 544 00:24:25,762 --> 00:24:27,720 And for a general function that's well-behaved, 545 00:24:27,720 --> 00:24:30,480 that error is going to be order delta x squared-- 546 00:24:30,480 --> 00:24:33,470 in either the 1-D or the multi-dimensional case. 547 00:24:35,872 --> 00:24:37,330 And this sort of an expansion, it's 548 00:24:37,330 --> 00:24:40,990 just part of a Taylor expansion for each component of f of x. 549 00:24:40,990 --> 00:24:43,270 So we take element i of f. 550 00:24:43,270 --> 00:24:45,870 I want to know its value at x plus delta x. 551 00:24:45,870 --> 00:24:47,650 That's its value at x. 
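The order delta-x-squared error can be observed numerically. This sketch uses e-to-the-x as a stand-in smooth function (my choice, not from the lecture); halving delta x should roughly quarter the linearization error:

```python
import math

def f(x):
    # A smooth test function (my own choice, not from the lecture)
    return math.exp(x)

x0 = 1.0
errors = []
for dx in (1e-1, 5e-2, 2.5e-2):
    exact = f(x0 + dx)
    linear = f(x0) + f(x0) * dx   # f'(x) = e^x for this function
    errors.append(abs(exact - linear))

# Each halving of dx drops the linearization error by roughly 4x,
# consistent with an O(dx^2) error term.
ratios = [errors[i] / errors[i + 1] for i in range(2)]
```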
552 00:24:47,650 --> 00:24:50,620 Plus the sum of partial derivatives of that element, 553 00:24:50,620 --> 00:24:54,400 with respect to each of the elements of x, times delta x-- 554 00:24:54,400 --> 00:24:56,320 each of those delta x's. 555 00:24:56,320 --> 00:24:57,820 Plus, there are some higher-order terms 556 00:24:57,820 --> 00:25:01,090 in this Taylor expansion, which are quadratic in delta x 557 00:25:01,090 --> 00:25:01,610 instead. 558 00:25:01,610 --> 00:25:03,440 There's a cubic term, and so on. 559 00:25:03,440 --> 00:25:05,950 These quadratic terms are what give rise to this order, 560 00:25:05,950 --> 00:25:09,100 delta x squared error. 561 00:25:09,100 --> 00:25:10,570 In higher dimensions, we typically 562 00:25:10,570 --> 00:25:12,730 don't worry about these quadratic terms. 563 00:25:12,730 --> 00:25:15,250 We're pretty satisfied with linearization 564 00:25:15,250 --> 00:25:17,740 of our system of equations. 565 00:25:17,740 --> 00:25:20,680 Sometimes for 1-D nonlinear functions, 566 00:25:20,680 --> 00:25:22,210 you can take advantage of knowing 567 00:25:22,210 --> 00:25:24,970 what these quadratic terms are to do some funny things. 568 00:25:24,970 --> 00:25:27,730 But in many dimensions, you usually don't use that. 569 00:25:27,730 --> 00:25:30,640 Usually you just think about linearizing the solution. 570 00:25:30,640 --> 00:25:33,440 So if I know where the solution is, if I know it's close, 571 00:25:33,440 --> 00:25:36,010 if I can figure out points that are close to the solution, 572 00:25:36,010 --> 00:25:38,760 then I can linearize the function in that neighborhood-- 573 00:25:38,760 --> 00:25:41,440 I can find the solution to the linearized equation, instead. 574 00:25:41,440 --> 00:25:44,394 That's going to be suitably close to the exact solution. 575 00:25:44,394 --> 00:25:45,310 Does that make sense? 576 00:25:49,970 --> 00:25:54,200 Nonlinear equations, like I said, are solved iteratively. 
577 00:25:54,200 --> 00:25:58,310 Which means we make a map-- an algorithmic map-- 578 00:25:58,310 --> 00:26:01,350 which takes some value x i and generates 579 00:26:01,350 --> 00:26:05,390 some new value x i plus 1, which is a better approximation 580 00:26:05,390 --> 00:26:07,580 for the solution we're after. 581 00:26:07,580 --> 00:26:11,420 And we design the map so that the root, x star, 582 00:26:11,420 --> 00:26:13,230 is what's called a fixed point of the map. 583 00:26:13,230 --> 00:26:15,430 So if I put x star in on this side, 584 00:26:15,430 --> 00:26:18,530 I get x star out on the other side. 585 00:26:18,530 --> 00:26:23,420 By design, the root is a fixed point of the map. 586 00:26:23,420 --> 00:26:26,420 The map may converge, or it may not converge, 587 00:26:26,420 --> 00:26:28,820 but the root is a fixed point. 588 00:26:28,820 --> 00:26:31,040 And we'll stop iterating when the map is sufficiently 589 00:26:31,040 --> 00:26:32,090 converged. 590 00:26:32,090 --> 00:26:34,580 You guys came up with two different criteria 591 00:26:34,580 --> 00:26:35,840 for stopping. 592 00:26:35,840 --> 00:26:38,470 One is called the function norm criterion. 593 00:26:38,470 --> 00:26:41,204 I look at how big my function is-- 594 00:26:41,204 --> 00:26:42,620 I'm trying to find the place where 595 00:26:42,620 --> 00:26:43,910 the function is equal to 0. 596 00:26:43,910 --> 00:26:47,030 So I look at how big, in norm, 597 00:26:47,030 --> 00:26:50,090 my function is for my current best solution, 598 00:26:50,090 --> 00:26:52,395 and ask if it's smaller than some tolerance epsilon. 599 00:26:52,395 --> 00:26:54,020 If it is, then I say, well, my function 600 00:26:54,020 --> 00:26:56,480 is sufficiently close to 0-- 601 00:26:56,480 --> 00:26:58,790 I'm happy with this solution. 602 00:26:58,790 --> 00:27:00,650 The solution is close enough to satisfying 603 00:27:00,650 --> 00:27:04,899 the original equation that I'll accept it, and I stop. 
604 00:27:04,899 --> 00:27:07,190 The other criterion is called the step norm criterion. 605 00:27:07,190 --> 00:27:09,470 I look at two successive approximations 606 00:27:09,470 --> 00:27:10,180 for the solution. 607 00:27:10,180 --> 00:27:13,190 I take their difference, and ask, 608 00:27:13,190 --> 00:27:14,990 is the norm of that difference smaller 609 00:27:14,990 --> 00:27:18,680 than either some absolute tolerance, 610 00:27:18,680 --> 00:27:22,070 or some relative tolerance, multiplied by the norm 611 00:27:22,070 --> 00:27:24,650 of my current solution? 612 00:27:24,650 --> 00:27:31,850 So suppose x is a large number, that spacing between these x's 613 00:27:31,850 --> 00:27:35,360 may be quite big, but the relative spacing may actually 614 00:27:35,360 --> 00:27:37,410 be quite small. 615 00:27:37,410 --> 00:27:39,620 And if the relative spacing is small enough, 616 00:27:39,620 --> 00:27:42,350 you might say, well, this is sufficiently converged. 617 00:27:42,350 --> 00:27:46,100 And so that's where this relative error, relative error 618 00:27:46,100 --> 00:27:48,650 tolerance, comes into play in the step norm criterion. 619 00:27:48,650 --> 00:27:52,310 Suppose x is a small number, close to 0 instead. 620 00:27:52,310 --> 00:27:56,180 These steps may be very tiny-- 621 00:27:56,180 --> 00:27:59,180 these steps may be quite tiny. 622 00:27:59,180 --> 00:28:03,140 They may satisfy this relative criterion quite well, 623 00:28:03,140 --> 00:28:05,270 but you may want to put some absolute tolerance 624 00:28:05,270 --> 00:28:08,780 on how far these steps are before you stop instead. 625 00:28:08,780 --> 00:28:12,032 Because these x's may be small in and of themselves. 626 00:28:12,032 --> 00:28:13,490 And so this one is easy to satisfy, 627 00:28:13,490 --> 00:28:14,656 but this one becomes harder. 628 00:28:14,656 --> 00:28:16,316 So you usually use both of these. 
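The two stopping tests can be combined in a small helper. The tolerance names and default values here are illustrative choices, not values from the lecture:

```python
import math

def norm(v):
    # Euclidean (two) norm of a vector
    return math.sqrt(sum(c * c for c in v))

def converged(f_new, x_new, x_old, eps_f=1e-8, eps_abs=1e-10, eps_rel=1e-8):
    """Combine the lecture's two tests: the function norm criterion,
    and the step norm criterion with either an absolute tolerance or a
    relative tolerance times the norm of the current solution.
    Tolerance names and defaults are my own illustrative choices."""
    function_ok = norm(f_new) < eps_f
    step = norm([a - b for a, b in zip(x_new, x_old)])
    step_ok = step < max(eps_abs, eps_rel * norm(x_new))
    return function_ok and step_ok
```

Requiring both tests at once mirrors the lecture's point that either criterion alone can be fooled.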
629 00:28:16,316 --> 00:28:17,690 Sometimes you have solutions that 630 00:28:17,690 --> 00:28:19,670 are converging toward small numbers, 631 00:28:19,670 --> 00:28:22,050 and then the absolute error tolerance becomes important. 632 00:28:22,050 --> 00:28:23,425 Sometimes you have solutions that 633 00:28:23,425 --> 00:28:25,070 are converging towards large numbers, 634 00:28:25,070 --> 00:28:27,650 and so the relative error tolerance becomes important. 635 00:28:27,650 --> 00:28:29,480 Does that make sense? 636 00:28:29,480 --> 00:28:32,030 Of course, you can't just use this one, or just use that one. 637 00:28:32,030 --> 00:28:35,780 You typically like to use all of these. 638 00:28:35,780 --> 00:28:38,540 Because they can fail. 639 00:28:38,540 --> 00:28:41,474 The function norm criterion can fail-- 640 00:28:41,474 --> 00:28:42,890 here's an example where I'm taking 641 00:28:42,890 --> 00:28:45,830 some iterations, some approximate solutions 642 00:28:45,830 --> 00:28:49,460 that are headed towards the actual root of this function. 643 00:28:49,460 --> 00:28:53,240 And at some point, I find that this solution 644 00:28:53,240 --> 00:28:55,820 is within epsilon of 0. 645 00:28:55,820 --> 00:28:57,974 And so I'd like to accept this solution, 646 00:28:57,974 --> 00:28:59,390 but graphically it looks like it's 647 00:28:59,390 --> 00:29:00,860 very far away from the root. 648 00:29:00,860 --> 00:29:02,360 So this is a case where the function 649 00:29:02,360 --> 00:29:05,690 has a very shallow slope. 650 00:29:05,690 --> 00:29:08,690 It's a very shallow slope, and the function norm criterion is 651 00:29:08,690 --> 00:29:10,670 not so good, really. 652 00:29:10,670 --> 00:29:14,425 I call this a solution, but it's quite a ways away from x star. 653 00:29:14,425 --> 00:29:16,550 So sometimes it's going to work, but sometimes it's 654 00:29:16,550 --> 00:29:18,520 not going to work. 
655 00:29:18,520 --> 00:29:20,230 Here's the step norm criterion-- here, 656 00:29:20,230 --> 00:29:23,519 I have a function nowhere near a root-- 657 00:29:23,519 --> 00:29:25,310 I have no idea where I am on this function, 658 00:29:25,310 --> 00:29:27,060 I don't know what value this function has, 659 00:29:27,060 --> 00:29:30,527 but my steps suddenly got small enough 660 00:29:30,527 --> 00:29:32,610 that they're smaller than this absolute tolerance, 661 00:29:32,610 --> 00:29:34,818 or they're smaller than the relative error tolerance. 662 00:29:34,818 --> 00:29:36,340 I might say, OK, let's stop. 663 00:29:36,340 --> 00:29:39,239 I'm not taking very large steps anymore, 664 00:29:39,239 --> 00:29:40,780 this seems like a good place to quit. 665 00:29:40,780 --> 00:29:42,792 But actually, my function just after this 666 00:29:42,792 --> 00:29:45,250 didn't go to 0 at all, it curved up and went the other way. 667 00:29:45,250 --> 00:29:47,770 There's not even a solution nearby. 668 00:29:47,770 --> 00:29:49,810 So both of these things can fail, 669 00:29:49,810 --> 00:29:51,550 and we try to use both of them instead 670 00:29:51,550 --> 00:29:54,700 to evaluate whether we have a reasonable solution 671 00:29:54,700 --> 00:29:57,000 to our nonlinear equation or not. 672 00:29:57,000 --> 00:29:57,610 Make sense? 673 00:30:00,750 --> 00:30:04,114 Are there any questions about that before I go on? 674 00:30:04,114 --> 00:30:05,998 No. 675 00:30:05,998 --> 00:30:07,420 OK. 676 00:30:07,420 --> 00:30:09,580 We also talk oftentimes about the rate 677 00:30:09,580 --> 00:30:13,780 of convergence of the iterative process that we're using. 678 00:30:13,780 --> 00:30:18,239 We might say it converges linearly, or quadratically. 
679 00:30:18,239 --> 00:30:19,780 And the rate of convergence is always 680 00:30:19,780 --> 00:30:24,310 assessed by looking at the difference 681 00:30:24,310 --> 00:30:28,140 between successive-- well, we look 682 00:30:28,140 --> 00:30:31,550 at the ratio of differences for successive approximations. 683 00:30:31,550 --> 00:30:35,025 So here's the difference between my best approximation, step 684 00:30:35,025 --> 00:30:37,950 i plus 1 minus the exact solution, 685 00:30:37,950 --> 00:30:41,880 normed, divided by my best approximation at step i, 686 00:30:41,880 --> 00:30:46,770 minus the exact solution normed, and raised to some power, q. 687 00:30:46,770 --> 00:30:51,650 And as I go to very large numbers of iterations, i-- 688 00:30:51,650 --> 00:30:54,240 this limit should be over i, I apologize. 689 00:30:54,240 --> 00:30:55,845 I'll fix that in the notes online, 690 00:30:55,845 --> 00:30:58,580 but this limit should be as i, the number of steps, 691 00:30:58,580 --> 00:31:00,380 gets very large. 692 00:31:00,380 --> 00:31:04,470 This ratio should converge to some constant. 693 00:31:04,470 --> 00:31:06,254 And the ratio will converge to a constant 694 00:31:06,254 --> 00:31:07,920 when I choose the right power of q here. 695 00:31:10,750 --> 00:31:14,491 So when this limit exists, and it doesn't go to 0, 696 00:31:14,491 --> 00:31:16,490 we can identify what sort of convergence we get. 697 00:31:16,490 --> 00:31:19,370 So if q equals 1, and C is smaller 698 00:31:19,370 --> 00:31:21,544 than 1, we say that convergence is linear-- 699 00:31:21,544 --> 00:31:22,710 what is that saying, really? 700 00:31:22,710 --> 00:31:25,460 This top step here is the absolute error, 701 00:31:25,460 --> 00:31:27,680 in approximation i plus 1. 702 00:31:27,680 --> 00:31:31,660 This bottom step here is the absolute error in step i-- 703 00:31:31,660 --> 00:31:34,110 remember q is 1 for linear convergence. 
704 00:31:34,110 --> 00:31:38,870 So the ratio of absolute errors, as long as that's less than 1-- 705 00:31:38,870 --> 00:31:42,570 I'm converging, I'm moving my way towards the solution. 706 00:31:42,570 --> 00:31:45,020 And we say that rate is linear. 707 00:31:45,020 --> 00:31:48,140 If C is 10 to the minus 1, then each iteration 708 00:31:48,140 --> 00:31:52,070 will be one digit more accurate than the previous one. 709 00:31:52,070 --> 00:31:54,440 The absolute error will be 10 times smaller 710 00:31:54,440 --> 00:31:56,720 in the next iteration versus the previous one. 711 00:31:56,720 --> 00:31:57,710 That would be great-- 712 00:31:57,710 --> 00:31:59,150 usually C isn't that small. 713 00:32:02,210 --> 00:32:05,850 If this power, for which this limit exists, q, 714 00:32:05,850 --> 00:32:10,140 is bigger than 1, we say the convergence is superlinear. 715 00:32:10,140 --> 00:32:12,750 If q is 2, which we'll see is something 716 00:32:12,750 --> 00:32:15,540 that results from the Newton-Raphson method, 717 00:32:15,540 --> 00:32:18,960 then we say convergence is quadratic. 718 00:32:18,960 --> 00:32:22,170 What that means is the number of accurate digits in my solution 719 00:32:22,170 --> 00:32:25,060 will actually double with each iteration. 720 00:32:25,060 --> 00:32:28,320 Linear, with C equal to 10 to the minus 1, 721 00:32:28,320 --> 00:32:30,910 I get one digit per iteration. 722 00:32:30,910 --> 00:32:33,910 Quadratic, I double the number of digits per iteration-- 723 00:32:33,910 --> 00:32:35,430 I have one digit on one iteration, 724 00:32:35,430 --> 00:32:37,740 I get two the next one, and four the next one, 725 00:32:37,740 --> 00:32:39,120 and eight the next one. 726 00:32:39,120 --> 00:32:44,130 So quadratic convergence is marvelous. 727 00:32:44,130 --> 00:32:46,320 Linear convergence, that's OK. 728 00:32:46,320 --> 00:32:49,210 That's about the minimum you'd be willing to accept. 
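The convergence-order definition above can be checked numerically with toy error sequences (my own construction, not from the lecture):

```python
# Toy error sequences illustrating the convergence-order definition.
# For linear convergence with C = 1e-1, each error is 10x smaller;
# for quadratic convergence, each error is the square of the last.
lin = [1e-1]
quad = [1e-1]
for _ in range(4):
    lin.append(1e-1 * lin[-1])      # linear: e_{i+1} = C * e_i, q = 1
    quad.append(quad[-1] ** 2)      # quadratic: e_{i+1} = e_i^2, q = 2

# The ratio e_{i+1} / e_i^q settles to a constant for the right q:
lin_ratios = [lin[i + 1] / lin[i] for i in range(4)]          # ~0.1 each
quad_ratios = [quad[i + 1] / quad[i] ** 2 for i in range(4)]  # ~1.0 each
```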
729 00:32:49,210 --> 00:32:50,910 Quadratic convergence is great, so we 730 00:32:50,910 --> 00:32:54,030 aim for methods that try to have these higher order 731 00:32:54,030 --> 00:32:54,780 convergences. 732 00:32:54,780 --> 00:32:57,787 So you really quickly get highly accurate solutions. 733 00:32:57,787 --> 00:32:59,370 You can go back and look at your notes 734 00:32:59,370 --> 00:33:00,980 and see that the Jacobi method 735 00:33:00,980 --> 00:33:04,950 and the Gauss-Seidel method both show linear convergence rates. 736 00:33:04,950 --> 00:33:07,902 They're linear methods. 737 00:33:07,902 --> 00:33:09,710 Does this make sense? 738 00:33:09,710 --> 00:33:11,145 OK. 739 00:33:11,145 --> 00:33:12,770 So I mentioned Newton-Raphson. 740 00:33:12,770 --> 00:33:14,450 Hopefully, somebody at some point 741 00:33:14,450 --> 00:33:16,220 told you about the Newton-Raphson method 742 00:33:16,220 --> 00:33:19,790 for solving at least one-dimensional, nonlinear 743 00:33:19,790 --> 00:33:20,450 equations. 744 00:33:20,450 --> 00:33:22,700 It goes like this. 745 00:33:22,700 --> 00:33:27,440 You say, I guess my solution is close to this green point here. 746 00:33:27,440 --> 00:33:31,650 Let me linearize my function at that point, 747 00:33:31,650 --> 00:33:34,830 and find where that linear approximation has a root-- 748 00:33:34,830 --> 00:33:36,710 which is this next green point. 749 00:33:36,710 --> 00:33:39,020 And then repeat that process here. 750 00:33:39,020 --> 00:33:42,320 I find the linearization of my function, this pink arrow. 751 00:33:42,320 --> 00:33:45,620 I look for where that linear function has a root, 752 00:33:45,620 --> 00:33:47,390 and that's my next best approximation. 753 00:33:47,390 --> 00:33:49,850 And I repeat this process over and over, 754 00:33:49,850 --> 00:33:51,890 and it will reliably-- 755 00:33:51,890 --> 00:33:56,320 under certain circumstances-- converge to the root. 
756 00:33:56,320 --> 00:33:58,610 What does that look like, in terms of the equations? 757 00:33:58,610 --> 00:34:02,420 So I linearized my function, so I 758 00:34:02,420 --> 00:34:06,200 want to approximate f of x at i plus 1, 759 00:34:06,200 --> 00:34:09,050 in terms of f of x at i-- so it's f of x at i, 760 00:34:09,050 --> 00:34:12,210 plus the derivative, multiplied by the difference between x 761 00:34:12,210 --> 00:34:14,050 i plus 1 and x i. 762 00:34:14,050 --> 00:34:16,820 And I say, find the place where this approximation 763 00:34:16,820 --> 00:34:20,270 is equal to 0, and determine what 764 00:34:20,270 --> 00:34:23,679 the next point that I'm going to use to approximate my solution 765 00:34:23,679 --> 00:34:24,179 is. 766 00:34:24,179 --> 00:34:27,710 So I solve for x i plus 1, in terms of x i. 767 00:34:27,710 --> 00:34:30,840 How big of a step do I take from x i to x i plus 1? 768 00:34:30,840 --> 00:34:34,460 It's this big, so the ratio of the function to its derivative 769 00:34:34,460 --> 00:34:36,839 at x i. 770 00:34:36,839 --> 00:34:42,110 And the derivative does the job of telling me which direction I 771 00:34:42,110 --> 00:34:43,850 should step in. 772 00:34:43,850 --> 00:34:45,889 Derivative gives me directionality, 773 00:34:45,889 --> 00:34:50,120 and this ratio here tells me the magnitude of the step. 774 00:34:50,120 --> 00:34:52,010 The magnitudes, you know, they're 775 00:34:52,010 --> 00:34:54,719 not very good oftentimes, because these functions 776 00:34:54,719 --> 00:34:57,260 that we're trying to solve aren't very linear. 777 00:34:57,260 --> 00:34:58,940 Usually they're highly nonlinear. 778 00:34:58,940 --> 00:35:01,457 What's really helpful is getting the direction right. 779 00:35:01,457 --> 00:35:04,040 You could go right, you could go left-- only one of those ways 780 00:35:04,040 --> 00:35:05,480 is getting you to the root. 
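The 1-D update just described, x i plus 1 equals x i minus f over f prime, can be sketched as a short loop. The function, tolerance, and parameter names are my own illustrative choices:

```python
def newton_1d(f, fprime, x0, tol=1e-12, max_iter=50):
    """Textbook 1-D Newton-Raphson: step by -f(x)/f'(x) until the
    function norm test passes. Tolerance and names are my own choices."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x -= fx / fprime(x)   # the derivative sets both sign and size of the step
    return x

# Find sqrt(2) as the root of f(x) = x^2 - 2, starting from x = 1
root = newton_1d(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
```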
781 00:35:05,480 --> 00:35:06,980 Newton-Raphson has this advantage-- 782 00:35:06,980 --> 00:35:09,360 it always points you in the right direction. 783 00:35:09,360 --> 00:35:10,880 OK? 784 00:35:10,880 --> 00:35:13,340 Of course, you can do this in any number of dimensions, not 785 00:35:13,340 --> 00:35:16,040 just one dimension. 786 00:35:16,040 --> 00:35:17,540 So you can approximate your function 787 00:35:17,540 --> 00:35:21,140 as linear-- set f of x i plus 1 approximately equal to 0. 788 00:35:21,140 --> 00:35:23,120 And then let's take our linearized version 789 00:35:23,120 --> 00:35:27,740 of the function, and let's find where it's equal to 0. 790 00:35:27,740 --> 00:35:30,170 Sometimes what's done is to replace 791 00:35:30,170 --> 00:35:33,200 this difference, x i plus 1, minus x i, 792 00:35:33,200 --> 00:35:35,750 with an unknown vector, d i-- 793 00:35:35,750 --> 00:35:37,530 which is the step size. 794 00:35:37,530 --> 00:35:41,540 How big a step am I going to take from x i to x i plus 1? 795 00:35:41,540 --> 00:35:44,720 And so we solve this equation for the displacement, d i. 796 00:35:44,720 --> 00:35:46,860 Move f to the other side, so you have 797 00:35:46,860 --> 00:35:50,410 Jacobian times d i is minus f. 798 00:35:50,410 --> 00:35:53,270 And solve-- d i is minus Jacobian inverse times f. 799 00:35:53,270 --> 00:35:56,010 It's just a system of linear equations. 800 00:35:56,010 --> 00:35:58,550 Now we know our step size. 801 00:35:58,550 --> 00:36:02,620 So x i plus 1 is x i plus d i, or x i plus 1 802 00:36:02,620 --> 00:36:07,600 is x i minus Jacobian inverse times f. 803 00:36:07,600 --> 00:36:09,700 The inverse of the Jacobian plays the same role 804 00:36:09,700 --> 00:36:11,620 as 1 over the derivative. 805 00:36:11,620 --> 00:36:13,960 It's telling us what direction to step in, 806 00:36:13,960 --> 00:36:16,240 in this multi-dimensional space. 
807 00:36:16,240 --> 00:36:19,420 And this solution to the system of equations 808 00:36:19,420 --> 00:36:23,530 is giving us a magnitude of the step that's good, not great, 809 00:36:23,530 --> 00:36:26,980 but is taking us closer and closer to the root. 810 00:36:26,980 --> 00:36:28,480 So this is the Newton-Raphson method 811 00:36:28,480 --> 00:36:30,834 applied to the system of nonlinear equations. 812 00:36:30,834 --> 00:36:32,500 This is really the way you want to solve 813 00:36:32,500 --> 00:36:35,430 these sorts of problems. 814 00:36:35,430 --> 00:36:36,920 It doesn't always work-- 815 00:36:36,920 --> 00:36:38,732 things can go wrong. 816 00:36:38,732 --> 00:36:40,190 What sorts of things go wrong here? 817 00:36:40,190 --> 00:36:40,773 Can you guess? 818 00:36:44,010 --> 00:36:44,648 Yeah? 819 00:36:44,648 --> 00:36:47,516 AUDIENCE: [INAUDIBLE] 820 00:36:47,516 --> 00:36:48,860 PROFESSOR: OK, this is good. 821 00:36:48,860 --> 00:36:52,550 So in the 1-D problem, sometimes the Newton-Raphson method 822 00:36:52,550 --> 00:36:54,470 can get stuck. 823 00:36:54,470 --> 00:36:58,190 So it won't have good necessarily global convergence 824 00:36:58,190 --> 00:36:58,700 properties. 825 00:36:58,700 --> 00:37:01,580 If you have a bad initial guess, it might get stuck someplace, 826 00:37:01,580 --> 00:37:02,930 and the iterates won't converge. 827 00:37:02,930 --> 00:37:03,920 That can be true. 828 00:37:03,920 --> 00:37:04,878 What else can go wrong? 829 00:37:04,878 --> 00:37:06,424 AUDIENCE: [INAUDIBLE] 830 00:37:06,424 --> 00:37:08,090 PROFESSOR: Good, so if your derivative 831 00:37:08,090 --> 00:37:10,290 is 0, that's going to be problematic. 832 00:37:10,290 --> 00:37:11,930 What's the multi-dimensional equivalent 833 00:37:11,930 --> 00:37:13,340 of the derivative being 0? 834 00:37:13,340 --> 00:37:15,030 AUDIENCE: [INAUDIBLE] 835 00:37:15,030 --> 00:37:16,470 PROFESSOR: What's that? 836 00:37:16,470 --> 00:37:17,810 Singular Jacobian, right? 
837 00:37:17,810 --> 00:37:22,190 So if this J, the Jacobian, has some null space associated 838 00:37:22,190 --> 00:37:24,110 with it, how am I supposed to figure out 839 00:37:24,110 --> 00:37:26,700 which direction to step in? 840 00:37:26,700 --> 00:37:29,720 There's some arbitrariness associated with the solution 841 00:37:29,720 --> 00:37:31,820 of this system of equations. 842 00:37:31,820 --> 00:37:35,240 So the derivative is 0 in the 1-D example, that's a problem. 843 00:37:35,240 --> 00:37:37,070 That problem gets a little fuzzier, 844 00:37:37,070 --> 00:37:38,630 but it's still a big problem when 845 00:37:38,630 --> 00:37:42,020 we try to solve for the step size, or the step-- 846 00:37:42,020 --> 00:37:43,100 the Newton-Raphson step. 847 00:37:43,100 --> 00:37:44,660 We may not be able to do this. 848 00:37:47,400 --> 00:37:49,040 You don't run into this very often, 849 00:37:49,040 --> 00:37:51,105 but you can, from time to time. 850 00:37:51,105 --> 00:37:52,730 One place where this is going to happen 851 00:37:52,730 --> 00:37:56,137 is if we have a non-locally unique solution. 852 00:37:56,137 --> 00:37:58,220 When we have one of those, we know the determinant 853 00:37:58,220 --> 00:38:02,480 of the Jacobian at that point is going to be 0. 854 00:38:02,480 --> 00:38:04,370 If we're close to those solutions, 855 00:38:04,370 --> 00:38:05,870 well the determinant of the Jacobian 856 00:38:05,870 --> 00:38:08,720 is going to be close to 0-- 857 00:38:08,720 --> 00:38:11,420 you might expect that the system of equations you have to solve 858 00:38:11,420 --> 00:38:14,120 becomes ill-conditioned. 859 00:38:14,120 --> 00:38:16,280 So even though there may be an exact solution 860 00:38:16,280 --> 00:38:18,562 for all the steps leading up to that point, 861 00:38:18,562 --> 00:38:20,270 the equations may become ill-conditioned. 
862 00:38:20,270 --> 00:38:22,760 You may not be able to reliably find those solutions 863 00:38:22,760 --> 00:38:24,290 with your computer, either. 864 00:38:24,290 --> 00:38:26,330 So then these steps you take, well, 865 00:38:26,330 --> 00:38:29,380 who knows where they're going at that point. 866 00:38:29,380 --> 00:38:31,577 It's going to be crazy. 867 00:38:31,577 --> 00:38:33,410 There are ways of fixing all these problems. 868 00:38:33,410 --> 00:38:36,442 Let's do an example. 869 00:38:36,442 --> 00:38:37,900 This is a geometry example, but you 870 00:38:37,900 --> 00:38:40,910 can write it as a system of nonlinear equations, as well. 871 00:38:40,910 --> 00:38:43,450 So we have two circles-- 872 00:38:43,450 --> 00:38:47,320 circle f 1, circle f 2 in the x1, x2 plane. 873 00:38:47,320 --> 00:38:49,450 They satisfy-- these are the locus of points 874 00:38:49,450 --> 00:38:54,115 that satisfy the equation f 1 of x 1 and x 2 equals 0-- 875 00:38:54,115 --> 00:38:55,800 this is the equation for one circle, 876 00:38:55,800 --> 00:38:57,770 and this is the equation for the other circle. 877 00:38:57,770 --> 00:38:59,145 And we want the solution, we want 878 00:38:59,145 --> 00:39:02,920 the roots of this vector-valued function, f, 879 00:39:02,920 --> 00:39:04,330 for vector-valued x. 880 00:39:04,330 --> 00:39:09,110 And those are the intersections of these two circles. 881 00:39:09,110 --> 00:39:12,110 You can do it using Newton-Raphson, 882 00:39:12,110 --> 00:39:14,990 so you're going to need to know the Jacobian. 883 00:39:14,990 --> 00:39:17,010 So compute the Jacobian of this function. 884 00:39:17,010 --> 00:39:18,840 This is practice-- maybe most of you 885 00:39:18,840 --> 00:39:20,370 know how to compute a Jacobian, but some people 886 00:39:20,370 --> 00:39:21,060 haven't done it before. 
887 00:39:21,060 --> 00:39:23,100 So it's always good to make sure you remember 888 00:39:23,100 --> 00:39:25,260 that the first row of the Jacobian 889 00:39:25,260 --> 00:39:28,530 is the derivatives of the first element of f. 890 00:39:28,530 --> 00:39:30,885 And later rows are later elements. 891 00:39:30,885 --> 00:39:32,760 You don't want the transpose of the Jacobian. 892 00:39:32,760 --> 00:39:33,551 Then it won't work. 893 00:40:02,400 --> 00:40:05,130 OK, so it should look something like this. 894 00:40:05,130 --> 00:40:08,260 There's your Jacobian. 895 00:40:08,260 --> 00:40:10,360 The Newton-Raphson process tells us 896 00:40:10,360 --> 00:40:14,620 how to take steps from one approximation to the next. 897 00:40:14,620 --> 00:40:21,060 The step is equal to minus the Jacobian inverse, evaluated 898 00:40:21,060 --> 00:40:23,980 at my best guess for the solution, multiplied 899 00:40:23,980 --> 00:40:29,080 by the function evaluated at my best guess of the solution. 900 00:40:29,080 --> 00:40:31,810 You're never going to compute the Jacobian inverse explicitly-- 901 00:40:31,810 --> 00:40:35,170 that's just code for "solve this system of linear equations." 902 00:40:35,170 --> 00:40:38,922 So use the backslash operator in MATLAB, for example. 903 00:40:38,922 --> 00:40:39,630 And here you go-- 904 00:40:39,630 --> 00:40:42,880 I had an initial guess for the solution, iterate 0, 905 00:40:42,880 --> 00:40:44,190 at minus 1 and 3. 906 00:40:44,190 --> 00:40:45,884 This is somewhere outside the circles-- 907 00:40:45,884 --> 00:40:47,550 it's pretty far away from the solutions. 908 00:40:47,550 --> 00:40:51,180 But I do my Newton-Raphson steps, I iterate on and on. 909 00:40:51,180 --> 00:40:54,180 And after four steps, you can see 910 00:40:54,180 --> 00:40:56,130 that the step size in absolute value 911 00:40:56,130 --> 00:40:57,540 is order 10 to the minus 3. 
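The iteration described here can be sketched in a few lines of Python. The circles below (centers (0,0) and (2,0), radius 2) are stand-ins chosen for illustration, not necessarily the ones from the lecture slides; the structure of the method is the same.

```python
def f(x1, x2):
    """Residuals: each component vanishes on one circle (hypothetical circles)."""
    return (x1**2 + x2**2 - 4.0,
            (x1 - 2.0)**2 + x2**2 - 4.0)

def jacobian(x1, x2):
    """Rows are the gradients of f1 and f2 (not the transpose!)."""
    return ((2.0 * x1, 2.0 * x2),
            (2.0 * (x1 - 2.0), 2.0 * x2))

def newton_raphson(x1, x2, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        f1, f2 = f(x1, x2)
        (a, b), (c, d) = jacobian(x1, x2)
        det = a * d - b * c
        if abs(det) < 1e-14:          # singular Jacobian: the step is undefined
            raise RuntimeError("Jacobian is (nearly) singular")
        # Solve J s = -f; here by Cramer's rule for the 2x2 case.
        # In MATLAB this whole solve is just s = -J\f.
        s1 = (b * f2 - d * f1) / det
        s2 = (c * f1 - a * f2) / det
        x1, x2 = x1 + s1, x2 + s2
        if max(abs(s1), abs(s2)) < tol:   # step-norm stopping criterion
            return x1, x2
    raise RuntimeError("did not converge")

# Initial guess outside both circles, like the (-1, 3) guess in the lecture.
root = newton_raphson(-1.0, 3.0)
```

For these particular circles the intersections are at (1, ±√3), and the iteration starting at (-1, 3) lands on the upper one.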
912 00:40:57,540 --> 00:41:00,637 The function norm is order 10 to the minus 3, 913 00:41:00,637 --> 00:41:02,470 as well-- or maybe order 10 to the minus 2, 914 00:41:02,470 --> 00:41:03,800 but it's getting down there. 915 00:41:03,800 --> 00:41:05,980 These things are decreasing pretty quickly. 916 00:41:05,980 --> 00:41:07,740 And I move to a point that you'll 917 00:41:07,740 --> 00:41:09,240 see is pretty close to the solution. 918 00:41:13,110 --> 00:41:16,070 Here are some things you need to know about Newton-Raphson, 919 00:41:16,070 --> 00:41:19,160 that you'll want to think carefully about as we go forward. 920 00:41:19,160 --> 00:41:24,990 So it possesses a local convergence property. 921 00:41:24,990 --> 00:41:28,260 I'm going to illustrate that graphically for you. 922 00:41:28,260 --> 00:41:31,070 So here, I didn't solve the problem once, I solved it-- 923 00:41:31,070 --> 00:41:32,600 I don't know, 10,000 times. 924 00:41:32,600 --> 00:41:36,610 And I chose different initial points to start iterating with. 925 00:41:36,610 --> 00:41:39,240 Here, minus 1, 3, that was one point. 926 00:41:39,240 --> 00:41:41,300 But I chose a whole bunch of them. 927 00:41:41,300 --> 00:41:43,970 And I asked, how many iterations-- 928 00:41:43,970 --> 00:41:46,100 how many steps did my Newton-Raphson method 929 00:41:46,100 --> 00:41:48,440 have to take before I got sufficiently 930 00:41:48,440 --> 00:41:53,540 close to either this root here, or this root there? 931 00:41:53,540 --> 00:41:55,992 I don't remember what that convergence criterion was-- 932 00:41:55,992 --> 00:41:57,950 it doesn't really matter, but there was some 10 933 00:41:57,950 --> 00:42:00,366 to the minus 3, or 10 to the minus 5, or 10 to the minus 8 934 00:42:00,366 --> 00:42:02,000 convergence criterion that I made 935 00:42:02,000 --> 00:42:05,180 sure the Newton-Raphson method hit, in both the function norm 936 00:42:05,180 --> 00:42:08,450 and step norm cases. 
937 00:42:08,450 --> 00:42:11,620 And then, if the color on this map is blue-- 938 00:42:11,620 --> 00:42:14,240 the solution converged to this star in the blue zone-- 939 00:42:14,240 --> 00:42:17,390 if the color's orange, the solution converged to the star 940 00:42:17,390 --> 00:42:19,180 in the orange zone. 941 00:42:19,180 --> 00:42:21,920 And if the color is light, it didn't take so many iterations 942 00:42:21,920 --> 00:42:22,544 to converge. 943 00:42:22,544 --> 00:42:24,710 And if the color gets darker, it takes more and more 944 00:42:24,710 --> 00:42:27,080 iterations to converge. 945 00:42:27,080 --> 00:42:28,130 So that's the picture-- 946 00:42:28,130 --> 00:42:29,104 that's this map. 947 00:42:29,104 --> 00:42:30,770 I solved it a bunch of times, and then I 948 00:42:30,770 --> 00:42:33,124 mapped out how many iterations it 949 00:42:33,124 --> 00:42:35,040 took me to converge to the different solutions. 950 00:42:35,040 --> 00:42:37,307 So you can see if I start close to the solution, 951 00:42:37,307 --> 00:42:38,390 the color is really light. 952 00:42:38,390 --> 00:42:41,120 It doesn't take very many iterations to get there. 953 00:42:41,120 --> 00:42:43,335 And the further away I move in this direction-- 954 00:42:43,335 --> 00:42:45,710 it still doesn't seem like it takes so many iterations to get 955 00:42:45,710 --> 00:42:46,880 there, either. 956 00:42:46,880 --> 00:42:49,100 I need a good initial guess-- 957 00:42:49,100 --> 00:42:51,910 I want to be close to where I think the solution is. 958 00:42:51,910 --> 00:42:54,590 Because once I'm over here somewhere, I do pretty well. 959 00:42:54,590 --> 00:42:56,215 And the same is true on the other side, 960 00:42:56,215 --> 00:42:59,000 because this problem is symmetric. 961 00:42:59,000 --> 00:43:03,170 There's a line down the middle here, and along this line, 962 00:43:03,170 --> 00:43:05,585 the determinant of the Jacobian is equal to 0. 
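A rough, miniature version of this map experiment can be reconstructed in code: run Newton-Raphson from different initial guesses and count the iterations each one needs. The circles here (centers (0,0) and (2,0), radius 2) are the same hypothetical stand-ins as before; for them, det J = 8·x2, so the Jacobian is singular along the line x2 = 0, and guesses near that line behave badly.

```python
def newton_count(x1, x2, tol=1e-8, max_iter=100):
    """Return (which root was reached, iterations taken), or (None, k) on failure."""
    for k in range(1, max_iter + 1):
        f1 = x1**2 + x2**2 - 4.0            # hypothetical circle 1
        f2 = (x1 - 2.0)**2 + x2**2 - 4.0    # hypothetical circle 2
        a, b = 2.0 * x1, 2.0 * x2           # first row of J
        c, d = 2.0 * (x1 - 2.0), 2.0 * x2   # second row of J
        det = a * d - b * c                 # = 8 * x2 for these circles
        if abs(det) < 1e-14:                # singular Jacobian: no step defined
            return None, k
        s1 = (b * f2 - d * f1) / det        # Cramer's rule for J s = -f
        s2 = (c * f1 - a * f2) / det
        x1, x2 = x1 + s1, x2 + s2
        if max(abs(s1), abs(s2)) < tol:
            return ("upper" if x2 > 0 else "lower"), k
    return None, max_iter

# A guess far from the singular line converges in a handful of iterations;
# a guess near x2 = 0 gets thrown far away first and needs noticeably more --
# the "darker color" on the lecturer's map.
far = newton_count(-1.0, 3.0)
near = newton_count(-1.0, 0.05)
```

Sweeping `newton_count` over a grid of initial guesses and coloring by the returned pair is exactly the kind of picture described in the lecture.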
963 00:43:08,580 --> 00:43:11,970 So we talked about these points with the Newton-Raphson method. 964 00:43:11,970 --> 00:43:14,520 And if I pick initial guesses sufficiently close 965 00:43:14,520 --> 00:43:17,970 to this line, you can see the color gets darker and darker. 966 00:43:17,970 --> 00:43:20,100 The number of iterations required to converge 967 00:43:20,100 --> 00:43:23,640 to the solution goes way up. 968 00:43:23,640 --> 00:43:28,190 Now, the Newton-Raphson method possesses a local convergence 969 00:43:28,190 --> 00:43:28,740 property. 970 00:43:28,740 --> 00:43:32,750 Which means, if I have a locally unique solution, 971 00:43:32,750 --> 00:43:34,910 there's always going to be some neighborhood 972 00:43:34,910 --> 00:43:38,090 around that solution for which the determinant of the Jacobian 973 00:43:38,090 --> 00:43:39,560 is not equal to 0. 974 00:43:39,560 --> 00:43:42,710 And in that neighborhood, I can guarantee 975 00:43:42,710 --> 00:43:46,010 that this iterative process will eventually reach the solution. 976 00:43:46,010 --> 00:43:48,070 That's pretty good. 977 00:43:48,070 --> 00:43:48,770 That's handy. 978 00:43:48,770 --> 00:43:50,270 These iterates can go anywhere-- how 979 00:43:50,270 --> 00:43:51,700 do you know you're getting to the solution? 980 00:43:51,700 --> 00:43:52,720 Are you going to waste your time, 981 00:43:52,720 --> 00:43:53,300 or are you going to get there? 982 00:43:53,300 --> 00:43:55,625 So that's this local convergence property associated 983 00:43:55,625 --> 00:43:58,690 with it, which is nice. 984 00:43:58,690 --> 00:44:00,700 But it'll break down as I get to places 985 00:44:00,700 --> 00:44:04,450 where the determinant of the Jacobian is equal to 0. 
986 00:44:04,450 --> 00:44:06,820 So there could be a zone-- 987 00:44:06,820 --> 00:44:09,730 it's not in this one-- there could be a zone, for example, 988 00:44:09,730 --> 00:44:11,967 like a ring on which the determinant of the Jacobian 989 00:44:11,967 --> 00:44:12,550 is 0. 990 00:44:12,550 --> 00:44:14,410 And if I take a guess inside that ring, who 991 00:44:14,410 --> 00:44:16,276 knows where that iteration is going to go. 992 00:44:16,276 --> 00:44:17,650 Could be something like Sam said, 993 00:44:17,650 --> 00:44:21,580 where the iterative method just bounces around inside that ring 994 00:44:21,580 --> 00:44:24,070 and never converges. 995 00:44:24,070 --> 00:44:26,800 But when I have roots that are locally unique, 996 00:44:26,800 --> 00:44:29,830 and I start with good guesses close to those roots, 997 00:44:29,830 --> 00:44:31,660 I can guarantee the Newton-Raphson method 998 00:44:31,660 --> 00:44:32,620 will converge. 999 00:44:32,620 --> 00:44:35,290 I'll show you next time that not only does it converge, 1000 00:44:35,290 --> 00:44:37,060 but it also converges quadratically. 1001 00:44:37,060 --> 00:44:39,452 So if you start sufficiently close to the solution, 1002 00:44:39,452 --> 00:44:41,410 you get to double the number of accurate digits 1003 00:44:41,410 --> 00:44:42,340 in each iteration. 1004 00:44:42,340 --> 00:44:44,200 You can see that happening here. 1005 00:44:44,200 --> 00:44:47,620 OK, so I have one accurate digit, now I have two. 1006 00:44:47,620 --> 00:44:49,967 The next iteration I'll have four, and so on. 1007 00:44:49,967 --> 00:44:51,550 The number of accurate digits is going 1008 00:44:51,550 --> 00:44:56,030 to double at each iteration. 1009 00:44:56,030 --> 00:44:57,870 So that's going to conclude for today. 1010 00:44:57,870 --> 00:45:01,160 Next time, we'll talk about how to fix these problems 1011 00:45:01,160 --> 00:45:02,720 with the Newton-Raphson method. 
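The digit-doubling behavior is easiest to see in one dimension. Here is a small sketch (my own illustration, not from the lecture) of Newton's method applied to f(x) = x² - 2, whose root is √2:

```python
import math

x = 1.5                                  # a good initial guess, near sqrt(2)
errors = []
for _ in range(5):
    x = x - (x**2 - 2.0) / (2.0 * x)     # Newton step: x - f(x)/f'(x)
    errors.append(abs(x - math.sqrt(2.0)))
# Each error is roughly the square of the previous one, so the number of
# accurate digits roughly doubles per step, until floating point runs out.
```

Printing `errors` shows them falling from about 1e-3 to 1e-6 to 1e-12 in successive steps: exactly the quadratic convergence being described.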
1012 00:45:02,720 --> 00:45:04,220 So there are going to be cases where 1013 00:45:04,220 --> 00:45:07,345 the convergence isn't ideal, where we can improve things. 1014 00:45:07,345 --> 00:45:08,720 There are going to be cases where 1015 00:45:08,720 --> 00:45:10,525 we don't want to compute the Jacobian 1016 00:45:10,525 --> 00:45:11,720 or the Jacobian inverse. 1017 00:45:11,720 --> 00:45:13,840 And we can improve the method. 1018 00:45:13,840 --> 00:45:15,390 Thanks.