The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation, or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: OK, we're moving on. No more linear algebra. We're going to try solving some more difficult problems. Of course, all those problems will just be turned into linear algebra as we move on, so your expertise now with different techniques from linear algebra is going to come in handy. In the next section of this course, we're talking about systems of nonlinear equations, and we'll transition into problems in optimization, which, it turns out, look a lot like systems of nonlinear equations as well. And we'll try to leverage what we learn in the next few lectures to solve different optimization problems.

Before going on, I just want to recap. Last time we talked about singular value decomposition, which is like an eigenvalue decomposition for any matrix.
And associated with that matrix are singular vectors, left and right singular vectors, and the singular values of that matrix. Your TA, Kristen, reminded me that you can actually define a condition number for any matrix as well. The condition number we gave originally was associated with solving square systems of equations that actually have a solution. But there is a condition number associated with any matrix, and it's defined in terms of its singular values. If you go back and look at the definition of the two-norm, and you think about the condition number associated with the two-norm, you'll see that the condition number of any matrix can be defined as the ratio of the biggest to the smallest singular value of that matrix. So there's a condition number associated with any matrix. The condition number as an entity makes the most sense when we're thinking about how we amplify errors, numerical errors, in solving systems of equations. It's most easily applied to square systems that actually have unique solutions, but you can apply it to any system of equations you want.
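As a quick sanity check, the two-norm condition number of an arbitrary, even non-square, matrix can be computed from its singular values. A minimal sketch assuming NumPy, with an illustrative matrix of my own choosing:

```python
import numpy as np

# Any matrix, square or not, has singular values, and hence a 2-norm
# condition number: the ratio of the largest to the smallest singular value.
A = np.array([[2.0, 0.0],
              [0.0, 0.5],
              [1.0, 1.0]])  # an illustrative 3x2 (non-square) matrix

s = np.linalg.svd(A, compute_uv=False)  # singular values, sorted descending
cond = s[0] / s[-1]

# NumPy's built-in condition number uses the same definition for the 2-norm.
assert np.isclose(cond, np.linalg.cond(A, 2))
print(cond)
```

For a square invertible matrix this reduces to the familiar condition number from solving linear systems; for rectangular matrices it plays the same error-amplification role in least-squares problems.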
And the singular value decomposition is one way to tap into that.

OK, the last thing we did in discussing linear algebra was to talk about iterative solutions to systems of linear equations. And that's actually our hook into solutions to systems of nonlinear equations. It's going to turn out that exact solutions are hard to come by for anything but linear systems of equations. So we're always going to have these iterative algorithms, where we refine initial guesses for solutions until we converge to something that's a solution to the problem we wanted to solve. And one question you should ask, when you do these sorts of iterations, is: when do I stop? I don't know the exact solution to this problem. I can't say I'm close enough; what does close enough even mean? So how do I decide to stop? Do you have any ideas or suggestions for how you might do that? You've done some of this already in a homework assignment. But what do you think? How do you decide to stop? Yeah?

AUDIENCE: [INAUDIBLE]
PROFESSOR: OK, so this is one suggestion: look at my current iteration and my next iteration, and ask, how far apart are these two numbers? If they're sufficiently far apart, it seems like I've got some more steps to make before I converge to my solution. And if they're sufficiently close together, the steps I'm taking are small enough that I might be happy accepting this solution as a good approximation. That's called the step norm criterion; I'll give you a formalization of it later. How big are the steps that I'm taking? And are they sufficiently small that I don't care about any future steps?

Another suggestion?

AUDIENCE: I've got a question.

PROFESSOR: Yeah?

AUDIENCE: [INAUDIBLE] absolute [INAUDIBLE] when we did that for homework, I tried to do it [INAUDIBLE], and [INAUDIBLE].

PROFESSOR: Yes. I will show you a definition of the step norm criterion that integrates both relative and absolute error into the definition. And we'll see why, OK?
One problem may be: what if the solution I'm trying to converge to is 0? How do you define the relative error with respect to the number 0? There isn't one; there is only absolute error when you're trying to converge to 0. So you may want to have some measure of both absolute and relative step size in order to determine whether you've converged. Is that the only way to do it, though? Any ideas, alternative proposals for deciding convergence?

AUDIENCE: [INAUDIBLE] the residual.

PROFESSOR: The residual. OK, what's the residual?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Good, OK. We can ask how good a solution this value we've converged to is by putting it back into the original system of equations, and asking: how far out of balance are we? I take my best guess for the solution x, multiply it by A, and subtract b; we call that the residual. Is the residual sufficiently converged or not? If it's small enough in magnitude, then we would say, OK, maybe we're sufficiently close to the solution.
If it's too big, then we say, OK, let's iterate some more until we get there. That is called the function norm criterion.

We're going to talk about these in detail as applied to systems of nonlinear equations. But these same criteria apply to all iterative processes. Neither is preferred over the other. You don't know the exact solution, so you have no way of measuring how close or far away you are from it. So usually you use as many tools as possible to try to judge how good your approximation is. But you don't know for certain.

What do you do when that's the case? We haven't really talked about this in this class. You have a problem, you program it into a computer, you get a solution. Is that the end of the story? We just stop? You get a number back out, and that's the answer? How do you know you're right? How do you know? We talked about numerical error; every calculation has a numerical error. How do you know you got the right answer? What do you think?
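The two stopping tests just described, the step norm with absolute and relative tolerances and the function norm on the residual, can be sketched on a toy problem. The fixed-point iteration and the tolerance values below are illustrative choices, not examples from the lecture:

```python
import math

# Illustrative iteration: x_{k+1} = cos(x_k), which converges to the
# fixed point x* = cos(x*). The residual function is f(x) = x - cos(x).

def f(x):
    return x - math.cos(x)  # zero exactly at the fixed point

x_old = 1.0                      # arbitrary initial guess
atol, rtol, ftol = 1e-10, 1e-10, 1e-10   # illustrative tolerances

for k in range(1000):
    x_new = math.cos(x_old)
    # Step norm test: is the step small in both absolute and relative terms?
    step_small = abs(x_new - x_old) <= atol + rtol * abs(x_new)
    # Function norm test: is the residual close enough to zero?
    residual_small = abs(f(x_new)) <= ftol
    if step_small and residual_small:
        break
    x_old = x_new

print(x_new, f(x_new))
```

Mixing `atol` into the step test is what saves you when the solution itself is 0, where relative error alone is undefined; requiring both tests guards against an iteration that stalls (tiny steps) far from an actual root.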
AUDIENCE: [INAUDIBLE]

PROFESSOR: OK, yeah, this is true. So you plug your solution back into the equation and ask, how good a job does it do satisfying that? But maybe this equation is relatively insensitive to the solution you provide. Maybe many solutions nearby look like they also satisfy the equation, but those solutions are actually far apart from one another. So how do you--

AUDIENCE: [INAUDIBLE]

PROFESSOR: OK, so that sort of physical reasoning is a good one. In your transfer class, you'll talk about doing asymptotic expansions, or asymptotic solutions to problems: solve this complicated problem in a limit where it has some analytical solution, and figure out how the solution scales with respect to different parameters. So you can have an analytical solution that you compare against your numerical solution in certain limits. Or you have experiments that you've done. Experiment, that's the reality; the computer is a fiction that's trying to model reality, so you can compare your solution to experiments.
You could also solve the problem a bunch of different ways, and see if all those answers converge to the same place. We're going to talk about solving nonlinear equations; we're going to need initial guesses for our iterative methods. We might try several different initial guesses and see what solutions we come up with. Maybe we all converge to the same solution, or maybe this problem has some weird sensitivity in it, and we get lots of different solutions that aren't coordinated with each other.

One of the duties of someone who's using numerical methods to solve problems is to try to validate their result: by solving it multiple times or multiple ways, or comparing against experiment, or against known analytical results in certain limits where the answer should be exact. You can't just accept what the computer tells you; you have to validate it against some sort of external solution that you can compare with. Sometimes it's hard to come up with that solution, but it's immensely important. We know every calculation can be in error.
And as we go on to more complicated problems, it's even more important to validate things.

So, systems of nonlinear equations. These are problems of the type f of x equals 0, where x is some vector of unknowns of dimension N, and f is a function that takes as input vectors of dimension N and gives as output a vector of dimension N. It's a map from R^N to R^N. But it's no longer necessarily a linear map; it can be some nonlinear function of all the elements of this x. And the solutions of this equation, the particular values of x for which f of x equals 0, are called the roots of this vector-valued function.

Linear equations are just represented in this form as A x minus b equals 0; that's the same as the linear equations we were solving before. So we're searching for the roots of these functions. Common chemical engineering examples include equations of state, which are often nonlinear in the variables we're interested in; energy balances, which have lots of nonlinearities introduced into them; and mass balances with nonlinear reactions.
Or reactions that are non-isothermal, so their kinetics are sensitive to temperature, and temperature is a variable we want to know. These sorts of nonlinear equations crop up all over the place, and you want to be able to solve them reliably.

Here's a simple example: the Van der Waals equation of state. Here I've written it in terms of reduced pressure, temperature, and molar volume; nonetheless, this is the Van der Waals equation of state. And somebody told you once that if I plot pressure versus molar volume for different temperatures, I may see that, at a given pressure, there could be just one root, one possible molar volume that satisfies the equation of state. Or there can be one, two, or three potential roots. It turns out we don't know, with nonlinear equations, how many possible solutions there are. We knew, for linear equations, that we either had zero, one, or an infinite number of solutions. But with nonlinear equations, in general, there's no way to predict it.
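For the Van der Waals case specifically, the one-versus-three-roots picture can be checked numerically. A sketch assuming NumPy; the algebra below (clearing denominators in the reduced equation of state) is my own rearrangement, so check it against the slides:

```python
import numpy as np

# Reduced Van der Waals EOS: P = 8T/(3V - 1) - 3/V^2.
# Multiplying through by V^2 (3V - 1) gives a cubic in the molar volume V:
#     3 P V^3 - (P + 8 T) V^2 + 9 V - 3 = 0,
# so every candidate molar volume comes out of a polynomial root finder.

def molar_volumes(P, T):
    """All physical molar volumes satisfying the reduced VdW EOS."""
    roots = np.roots([3.0 * P, -(P + 8.0 * T), 9.0, -3.0])
    real = roots[np.abs(roots.imag) < 1e-8].real  # discard complex pairs
    return np.sort(real[real > 1.0 / 3.0])        # need 3V - 1 > 0 in reduced units

print(molar_volumes(1.0, 1.2))  # above the critical temperature: one root
print(molar_volumes(0.6, 0.9))  # below it, at this pressure: three roots
```

This is exactly the sense in which polynomials are special among nonlinear equations: the degree bounds the number of roots (here, at most three), where for a general nonlinear system no such bound exists.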
This problem can be transformed into a polynomial, and polynomials are one of the few kinds of nonlinear equations where we know how to bound the possible number of solutions. So we can transform this nonlinear equation into the form I showed you before; we just move 8/3 T to the other side of the equation. We want to find the roots of this equation. So possibly, given pressure and temperature, find all the molar volumes that satisfy the equation of state.

This is actually overly simplified for a particular physical problem: looking at vapor-liquid coexistence of the Van der Waals fluid. You can't specify pressure and temperature independently; the saturation pressure, the coexistence pressure, depends on the temperature per the Gibbs phase rule. So actually, phase equilibrium is made up of three parts. There's thermal equilibrium: if I have two phases, a gas and a liquid, they have to have the same temperature, otherwise they're not in equilibrium. There's got to be mechanical equilibrium of the two phases, the gas and the liquid.
They'd better have the same pressure; otherwise one is going to be pushing harder on the other one, they'll be in motion, and that's not equilibrium. And they've got to have the same chemical potential; they have to be in chemical equilibrium. There can't be any net mass flux from one phase to another, otherwise one phase is going to grow and the other is going to shrink, and they're not in equilibrium with each other.

So actually, the problem of determining vapor-liquid coexistence in this Van der Waals fluid involves satisfying a number of different equations, some of which are nonlinear, and which are constrained by the equation of state. Given the temperature, there are three unknowns: the pressure, and the molar volumes of the gas and the liquid. And there are three nonlinear equations we have to solve. Two of those are the equation of state in the gas and in the liquid, and I'll show them to you in a second. The other is the Maxwell equal-area construction, which essentially says that the chemical potential in the two phases is equal; this is one way of representing that.
So we have to solve this system of nonlinear equations for the saturation pressure, the molar volume of the gas or vapor, and the molar volume of the liquid. And these are those equations: here's the equation of state in the gas, here's the equation of state in the liquid, and here's the Maxwell equal-area construction. We want to find the values of P sat, V G, and V L that satisfy all three equations. There's not going to be an analytical way to do this; it has to be done numerically.

Here's a simplification I can make, though. I can take that equal-area construction and solve for P sat in terms of V G and V L. That reduces the dimensionality of these equations from three to two. And when it's two-dimensional, I can plot these things, so that's helpful. So let's plot f 1, the equation of state in the gas, as a function of V G and V L; where that's equal to 0 is this black curve here. Let's plot f 2, the equation of state in the liquid, as a function of V G and V L; where that's equal to 0 is this blue curve here. And the solutions are where these curves intersect.
So we're seeking out, graphically, the specific points where these curves intersect. First, this solution and that solution aren't the solutions we're interested in at all. Those would say that the molar volume of the gas and the liquid is the same; that's not really phase separation. We want these heterogeneous solutions out here. So we need some methodology that can reliably take us to those solutions. We'll see that that methodology, the most reliable one, and one of the ones that converges fastest to the solutions, is called the Newton-Raphson method. But even before we do that, let's talk more about the structure of systems of nonlinear equations, and what sorts of solutions we can expect. Does this example make sense to everyone? Have you thought about this before, maybe? Yeah.

So, given a function, which is a map from R^N to R^N, find the special solution, x star, such that f of x star equals 0. That's our task. And there could be no solutions.
There can be anywhere from one to an infinite number of locally unique solutions, or there can be an infinite number of solutions that aren't locally unique. A solution is said to be locally unique if I can wrap that solution in some ball of points in which there are no other solutions. That ball can be very, very small, but as long as I can wrap the solution in some ball of points that are not solutions, we term it locally unique.

So consider a simple system: a function f 1, which depends on two variables, x1 and x2, equals 0, and f 2, which depends on x1 and x2, equals 0. And I'm going to plot, in the x1, x2 plane, where f 1 and f 2 are equal to 0; those are these curves here. Here, we have a locally unique solution: we see the curves cross at exactly one point. Here, you can see these two curves are tangent to each other. They could be coincident with each other over some finite distance, in which case there are a lot of solutions that live on some locally tangent segment. Or they could just touch at one point.
So they may be tangent, and the solutions there are not locally unique; or they may touch at one point, and then there's one solution, and it's locally unique there.

The reason we talk about locally unique solutions is that it's going to be hard for a numerical method to find anything that's not locally unique in a reliable way. Locally unique solutions, numerical methods can find very reliably. But if they're not locally unique? My iterative method could converge to any one of the solutions that lives on this line, any of these tangent points, and I'm going to have a hard time predicting which one it's going to go to. That's a problem if you're trying to solve something reliably over and over again. If I converge to one of these solutions, or another solution, or another solution, the data that comes out of that process isn't going to be easy to interpret.
There's something called the inverse function theorem, which says that if f of x star is equal to 0, and the determinant of this matrix J, which we call the Jacobian, evaluated at x star is not equal to 0, then x star is necessarily a locally unique solution. The Jacobian is the matrix of partial derivatives of the elements of f with respect to the different elements of x. So the first row of the Jacobian is the derivatives of the first element of f with respect to all the elements of x, and the other rows proceed accordingly. If the determinant of this matrix, evaluated at the solution, is not equal to 0, then that solution is necessarily locally unique. That's the inverse function theorem.

The Jacobian describes the rate of change of this vector-valued function with respect to all of its independent variables. And you may find that, for some solutions, the determinant of the Jacobian is equal to 0. We can't really say what's going on there; the solution may be locally unique, or it may not be.
I'm going to give you some examples in a second. And most numerical methods are only going to find one of these locally unique solutions at a time. If we have some non-locally-unique solutions, that'll cause us problems. So we tend to want to work with functions that have locally unique solutions to begin with.

OK, here's an example. Oh, you have your notes, so you know the formula for the Jacobian. Compute the Jacobian of this function.

So this function has a root at x1 equals 0, x2 equals 0. If you think graphically about what each of these little functions represents, you would agree that that root is locally unique; it's just one point where both elements of this vector-valued function are equal to 0. Here's what the Jacobian of this function should be: take the derivative of the first element with respect to x1 and then x2, and take the derivative of the second element with respect to x1 and then x2. At the solution, at the root of this function, where x1 is 0 and x2 is 0, the Jacobian is a matrix of zeros. Its determinant is 0. But the solution is locally unique.
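The slide's example function isn't reproduced in the transcript; as a hypothetical stand-in with the same property (a locally unique root at the origin where the Jacobian determinant vanishes), consider f(x) = (x1 squared, x2 squared). A sketch assuming NumPy, with the Jacobian approximated by finite differences:

```python
import numpy as np

# Hypothetical stand-in function (not the one from the slides):
# f(x) = (x1^2, x2^2). Its only root is the origin, which is locally
# unique, yet the Jacobian there is the zero matrix.

def f(x):
    return np.array([x[0] ** 2, x[1] ** 2])

def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian: J[i, j] = d f_i / d x_j."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2.0 * h)
    return J

J = jacobian(f, np.zeros(2))
print(np.linalg.det(J))  # -> 0.0: the theorem is silent here, yet the root is unique
```

This shows the one-way nature of the theorem: a nonzero determinant guarantees local uniqueness, but a zero determinant guarantees nothing either way.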
452 00:20:16,840 --> 00:20:19,330 The inverse function theorem only 453 00:20:19,330 --> 00:20:22,690 tells us about what happens when the determinant's not 454 00:20:22,690 --> 00:20:24,610 equal to 0. 455 00:20:24,610 --> 00:20:27,220 If the determinant's not 0, then we 456 00:20:27,220 --> 00:20:30,380 have a locally unique solution. 457 00:20:30,380 --> 00:20:32,470 A solution may be locally unique while its determinant 458 00:20:32,470 --> 00:20:33,790 is equal to 0. 459 00:20:33,790 --> 00:20:34,770 Does that make sense? 460 00:20:34,770 --> 00:20:37,260 You see how that plays out? 461 00:20:37,260 --> 00:20:39,240 OK. 462 00:20:39,240 --> 00:20:41,069 There's a physical way to think about-- 463 00:20:41,069 --> 00:20:43,360 or a geometric way to think about this inverse function 464 00:20:43,360 --> 00:20:43,859 theorem. 465 00:20:43,859 --> 00:20:45,740 So think about the linear equation, 466 00:20:45,740 --> 00:20:48,754 f of x is A x minus b. 467 00:20:48,754 --> 00:20:51,170 You can show-- and you should actually work through this-- 468 00:20:51,170 --> 00:20:52,820 that the Jacobian of this function 469 00:20:52,820 --> 00:20:58,180 is just the matrix A. It says how the function changes, 470 00:20:58,180 --> 00:21:00,620 with respect to small changes in x. 471 00:21:00,620 --> 00:21:01,840 Well, that's just A-- 472 00:21:01,840 --> 00:21:05,090 this is a linear function. 473 00:21:05,090 --> 00:21:07,000 So the equation, f of x equals 0, 474 00:21:07,000 --> 00:21:09,310 has a locally unique solution when the determinant 475 00:21:09,310 --> 00:21:12,400 of the Jacobian-- which is the determinant of A-- 476 00:21:12,400 --> 00:21:13,666 is not equal to 0. 477 00:21:13,666 --> 00:21:15,040 But you already knew that, right? 478 00:21:15,040 --> 00:21:18,530 We already talked through linear algebra. 
479 00:21:18,530 --> 00:21:22,370 And so you know when this matrix A is singular, 480 00:21:22,370 --> 00:21:24,280 then we can't invert this system of equations 481 00:21:24,280 --> 00:21:27,190 and find a unique solution in the first place. 482 00:21:27,190 --> 00:21:29,890 So the inverse function theorem is nothing more 483 00:21:29,890 --> 00:21:32,950 than an extension of what we learned about when functions 484 00:21:32,950 --> 00:21:34,720 are and aren't invertible. 485 00:21:34,720 --> 00:21:37,390 Because there's a locally unique solution when A is invertible. 486 00:21:39,940 --> 00:21:43,480 In the neighborhood of f of x, in the neighborhood 487 00:21:43,480 --> 00:21:47,410 of a root of f of x, we can often approximate the function 488 00:21:47,410 --> 00:21:49,360 as being linear. 489 00:21:49,360 --> 00:21:52,120 We can treat it as though it's a system of linear equations, 490 00:21:52,120 --> 00:21:54,989 very close to that root. 491 00:21:54,989 --> 00:21:57,280 And then the things that we learned from linear algebra 492 00:21:57,280 --> 00:22:01,540 are inherited by these linearized solutions. 493 00:22:01,540 --> 00:22:05,680 So here's this set of curves that I showed you before. 494 00:22:05,680 --> 00:22:08,920 Near this root, let's zoom in-- 495 00:22:08,920 --> 00:22:10,510 let's zoom in. 496 00:22:10,510 --> 00:22:11,980 These lines look mostly straight. 497 00:22:11,980 --> 00:22:16,855 It's like the place where two planes intersect-- 498 00:22:16,855 --> 00:22:19,920 intersect this x1, x2 plane-- they each intersect at a line, 499 00:22:19,920 --> 00:22:23,010 and the crossing of those lines is the solution. 500 00:22:23,010 --> 00:22:24,010 And it's locally unique. 501 00:22:24,010 --> 00:22:29,880 Because these two planes span different subspaces. 502 00:22:29,880 --> 00:22:31,770 Here's the case where we may have 503 00:22:31,770 --> 00:22:33,180 non-locally unique solutions. 
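The claim that the Jacobian of f(x) = A x - b is just A is easy to verify numerically. This sketch uses a concrete 2-by-2 A and b of my own choosing and a finite-difference check:

```python
# Verifying that the Jacobian of f(x) = A x - b is A itself, by finite
# differences on an arbitrary 2x2 example (A, b are my own choices).
A = [[3.0, 1.0], [2.0, 4.0]]
b = [1.0, 2.0]

def f(x):
    return [A[0][0] * x[0] + A[0][1] * x[1] - b[0],
            A[1][0] * x[0] + A[1][1] * x[1] - b[1]]

h = 1e-6
x0 = [0.7, -0.3]        # any base point works: the function is linear
f0 = f(x0)
J = [[0.0, 0.0], [0.0, 0.0]]
for j in range(2):       # perturb each independent variable in turn
    xp = list(x0)
    xp[j] += h
    fp = f(xp)
    for i in range(2):
        J[i][j] = (fp[i] - f0[i]) / h
# J now matches A up to roundoff, at any base point x0
```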
504 00:22:33,180 --> 00:22:36,090 Zoom in on this root, and very close to this root, 505 00:22:36,090 --> 00:22:37,650 well, it's hard to tell. 506 00:22:37,650 --> 00:22:40,620 Maybe these two planes are coincident with each other, 507 00:22:40,620 --> 00:22:42,970 and they intersect and form the same line-- 508 00:22:42,970 --> 00:22:45,360 in which case they may not be locally unique. 509 00:22:45,360 --> 00:22:47,970 Maybe if I zoom in close enough, I see, no, actually, they 510 00:22:47,970 --> 00:22:50,053 have a slightly different orientation with respect 511 00:22:50,053 --> 00:22:53,340 to each other, and there is a locally unique solution there. 512 00:22:53,340 --> 00:22:56,760 It's difficult to tell here. 513 00:22:56,760 --> 00:22:59,490 So these cases where the curves cross are easy to determine. 514 00:22:59,490 --> 00:23:00,960 These are the ones that the inverse function 515 00:23:00,960 --> 00:23:01,967 theorem tells us about. 516 00:23:01,967 --> 00:23:03,800 These ones are a little harder to work with. 517 00:23:07,630 --> 00:23:11,950 So I mentioned that you can zoom in, and look close to a root, 518 00:23:11,950 --> 00:23:14,090 and approximate the function as linear. 519 00:23:14,090 --> 00:23:16,680 This is a process called linearization. 520 00:23:16,680 --> 00:23:19,450 You've seen this for 1-D functions-- 521 00:23:19,450 --> 00:23:22,510 f of x, at a point x plus delta x, 522 00:23:22,510 --> 00:23:26,815 is f of x plus its derivative times delta x. 523 00:23:29,480 --> 00:23:32,460 And this'll typically be valid for reasonably well-behaved 524 00:23:32,460 --> 00:23:34,200 functions-- this sort of a linearization 525 00:23:34,200 --> 00:23:36,660 is going to be valid as delta x goes to 0. 526 00:23:36,660 --> 00:23:38,640 So as long as I haven't moved too far away 527 00:23:38,640 --> 00:23:42,870 from the point x, I can approximate my function 528 00:23:42,870 --> 00:23:45,894 in the neighborhood of x using this linearization. 
529 00:23:45,894 --> 00:23:47,310 You know, turns out the same thing 530 00:23:47,310 --> 00:23:49,050 is true for vector-valued functions. 531 00:23:49,050 --> 00:23:53,070 So f of x plus delta x is f of x plus-- 532 00:23:53,070 --> 00:23:56,580 well, I need the derivatives of my function with respect to x, 533 00:23:56,580 --> 00:23:59,020 and those derivatives are partial derivatives now, 534 00:23:59,020 --> 00:24:01,440 because we're in higher dimensional spaces. 535 00:24:01,440 --> 00:24:04,290 That's the Jacobian multiplied by this vector 536 00:24:04,290 --> 00:24:06,300 of displacements, delta x. 537 00:24:06,300 --> 00:24:08,040 And this will typically be valid as long 538 00:24:08,040 --> 00:24:11,250 as the length of this delta x is not too big-- 539 00:24:11,250 --> 00:24:14,130 as long as I haven't moved too far away from the point 540 00:24:14,130 --> 00:24:16,860 I'm interested in, f of x, this will be a reasonably good 541 00:24:16,860 --> 00:24:17,580 approximation. 542 00:24:17,580 --> 00:24:21,330 As long as our functions are well-behaved. 543 00:24:21,330 --> 00:24:25,762 There's an error that's incurred in making these approximations. 544 00:24:25,762 --> 00:24:27,720 And for a general function that's well-behaved, 545 00:24:27,720 --> 00:24:30,480 that error is going to be order delta x squared-- 546 00:24:30,480 --> 00:24:33,470 in either the 1-D or the multi-dimensional case. 547 00:24:35,872 --> 00:24:37,330 And this sort of an expansion, it's 548 00:24:37,330 --> 00:24:40,990 just part of a Taylor expansion for each component of f of x. 549 00:24:40,990 --> 00:24:43,270 So we take element i of f. 550 00:24:43,270 --> 00:24:45,870 I want to know its value at x plus delta x. 551 00:24:45,870 --> 00:24:47,650 That's its value at x. 
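The order delta-x-squared error can be observed numerically. This sketch uses e-to-the-x as a stand-in smooth function (my choice, not from the lecture); halving delta x should roughly quarter the linearization error:

```python
import math

def f(x):
    # A smooth test function (my own choice, not from the lecture)
    return math.exp(x)

x0 = 1.0
errors = []
for dx in (1e-1, 5e-2, 2.5e-2):
    exact = f(x0 + dx)
    linear = f(x0) + f(x0) * dx   # f'(x) = e^x for this function
    errors.append(abs(exact - linear))

# Each halving of dx drops the linearization error by roughly 4x,
# consistent with an O(dx^2) error term.
ratios = [errors[i] / errors[i + 1] for i in range(2)]
```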
552 00:24:47,650 --> 00:24:50,620 Plus the sum of partial derivatives of that element, 553 00:24:50,620 --> 00:24:54,400 with respect to each of the elements of x, times delta x-- 554 00:24:54,400 --> 00:24:56,320 each of those delta x's. 555 00:24:56,320 --> 00:24:57,820 Plus, there are some higher-order terms 556 00:24:57,820 --> 00:25:01,090 in this Taylor expansion, which are quadratic in delta x 557 00:25:01,090 --> 00:25:01,610 instead. 558 00:25:01,610 --> 00:25:03,440 There's a cubic term, and so on. 559 00:25:03,440 --> 00:25:05,950 These quadratic terms are what give rise to this order, 560 00:25:05,950 --> 00:25:09,100 delta x squared error. 561 00:25:09,100 --> 00:25:10,570 In higher dimensions, we typically 562 00:25:10,570 --> 00:25:12,730 don't worry about these quadratic terms. 563 00:25:12,730 --> 00:25:15,250 We're pretty satisfied with linearization 564 00:25:15,250 --> 00:25:17,740 of our system of equations. 565 00:25:17,740 --> 00:25:20,680 Sometimes for 1-D nonlinear functions, 566 00:25:20,680 --> 00:25:22,210 you can take advantage of knowing 567 00:25:22,210 --> 00:25:24,970 what these quadratic terms are to do some funny things. 568 00:25:24,970 --> 00:25:27,730 But in many dimensions, you usually don't use that. 569 00:25:27,730 --> 00:25:30,640 Usually you just think about linearizing the solution. 570 00:25:30,640 --> 00:25:33,440 So if I know where the solution is, if I know it's close, 571 00:25:33,440 --> 00:25:36,010 if I can figure out points that are close to the solution, 572 00:25:36,010 --> 00:25:38,760 then I can linearize the function in that neighborhood-- 573 00:25:38,760 --> 00:25:41,440 I can find the solution to the linearized equation, instead. 574 00:25:41,440 --> 00:25:44,394 That's going to be suitably close to the exact solution. 575 00:25:44,394 --> 00:25:45,310 Does that make sense? 576 00:25:49,970 --> 00:25:54,200 Nonlinear equations, like I said, are solved iteratively. 
577 00:25:54,200 --> 00:25:58,310 Which means we make a map-- an algorithmic map-- 578 00:25:58,310 --> 00:26:01,350 which takes some value x i and generates 579 00:26:01,350 --> 00:26:05,390 some new value x i plus 1, which is a better approximation 580 00:26:05,390 --> 00:26:07,580 for the solution we're after. 581 00:26:07,580 --> 00:26:11,420 And we design the map so that the root, x star, 582 00:26:11,420 --> 00:26:13,230 is what's called a fixed point of the map. 583 00:26:13,230 --> 00:26:15,430 So if I put x star in on this side, 584 00:26:15,430 --> 00:26:18,530 I get x star out on the other side. 585 00:26:18,530 --> 00:26:23,420 By design, the root is a fixed point of the map. 586 00:26:23,420 --> 00:26:26,420 The map may converge, or it may not converge, 587 00:26:26,420 --> 00:26:28,820 but the root is a fixed point. 588 00:26:28,820 --> 00:26:31,040 And we'll stop iterating when the map is sufficiently 589 00:26:31,040 --> 00:26:32,090 converged. 590 00:26:32,090 --> 00:26:34,580 You guys came up with two different criteria 591 00:26:34,580 --> 00:26:35,840 for stopping. 592 00:26:35,840 --> 00:26:38,470 One is called the function norm criterion. 593 00:26:38,470 --> 00:26:41,204 I look at how big my function is-- 594 00:26:41,204 --> 00:26:42,620 I'm trying to find the place where 595 00:26:42,620 --> 00:26:43,910 the function is equal to 0. 596 00:26:43,910 --> 00:26:47,030 So I look at how big, in norm, 597 00:26:47,030 --> 00:26:50,090 my function is for my current best solution, 598 00:26:50,090 --> 00:26:52,395 and ask if it's smaller than some tolerance epsilon. 599 00:26:52,395 --> 00:26:54,020 If it is, then I say, well, my function 600 00:26:54,020 --> 00:26:56,480 is sufficiently close to 0-- 601 00:26:56,480 --> 00:26:58,790 I'm happy with this solution. 602 00:26:58,790 --> 00:27:00,650 The solution is close enough to satisfying 603 00:27:00,650 --> 00:27:04,899 the original equation that I'll accept it, and I stop. 
604 00:27:04,899 --> 00:27:07,190 The other criterion is called the step norm criterion. 605 00:27:07,190 --> 00:27:09,470 I look at two successive approximations 606 00:27:09,470 --> 00:27:10,180 for the solution. 607 00:27:10,180 --> 00:27:13,190 I take their difference, and ask, 608 00:27:13,190 --> 00:27:14,990 is the norm of that difference smaller 609 00:27:14,990 --> 00:27:18,680 than either some absolute tolerance, 610 00:27:18,680 --> 00:27:22,070 or some relative tolerance, multiplied by the norm 611 00:27:22,070 --> 00:27:24,650 of my current solution? 612 00:27:24,650 --> 00:27:31,850 So suppose x is a large number, that spacing between these x's 613 00:27:31,850 --> 00:27:35,360 may be quite big, but the relative spacing may actually 614 00:27:35,360 --> 00:27:37,410 be quite small. 615 00:27:37,410 --> 00:27:39,620 And if the relative spacing is small enough, 616 00:27:39,620 --> 00:27:42,350 you might say, well, this is sufficiently converged. 617 00:27:42,350 --> 00:27:46,100 And so that's where this relative error, relative error 618 00:27:46,100 --> 00:27:48,650 tolerance, comes into play in the step norm criterion. 619 00:27:48,650 --> 00:27:52,310 Suppose x is a small number, close to 0 instead. 620 00:27:52,310 --> 00:27:56,180 These steps may be very tiny-- 621 00:27:56,180 --> 00:27:59,180 these steps may be quite tiny. 622 00:27:59,180 --> 00:28:03,140 They may satisfy this relative criterion quite well, 623 00:28:03,140 --> 00:28:05,270 but you may want to put some absolute tolerance 624 00:28:05,270 --> 00:28:08,780 on how far these steps are before you stop instead. 625 00:28:08,780 --> 00:28:12,032 Because these x's may be small in and of themselves. 626 00:28:12,032 --> 00:28:13,490 And so this one is easy to satisfy, 627 00:28:13,490 --> 00:28:14,656 but this one becomes harder. 628 00:28:14,656 --> 00:28:16,316 So you usually use both of these. 
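The two stopping tests can be combined in a small helper. The tolerance names and default values here are illustrative choices, not values from the lecture:

```python
import math

def norm(v):
    # Euclidean (two) norm of a vector
    return math.sqrt(sum(c * c for c in v))

def converged(f_new, x_new, x_old, eps_f=1e-8, eps_abs=1e-10, eps_rel=1e-8):
    """Combine the lecture's two tests: the function norm criterion,
    and the step norm criterion with either an absolute tolerance or a
    relative tolerance times the norm of the current solution.
    Tolerance names and defaults are my own illustrative choices."""
    function_ok = norm(f_new) < eps_f
    step = norm([a - b for a, b in zip(x_new, x_old)])
    step_ok = step < max(eps_abs, eps_rel * norm(x_new))
    return function_ok and step_ok
```

Requiring both tests at once mirrors the lecture's point that either criterion alone can be fooled.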
629 00:28:16,316 --> 00:28:17,690 Sometimes you have solutions that 630 00:28:17,690 --> 00:28:19,670 are converging toward small numbers, 631 00:28:19,670 --> 00:28:22,050 and then the absolute error tolerance becomes important. 632 00:28:22,050 --> 00:28:23,425 Sometimes you have solutions that 633 00:28:23,425 --> 00:28:25,070 are converging towards large numbers, 634 00:28:25,070 --> 00:28:27,650 and so the relative error tolerance becomes important. 635 00:28:27,650 --> 00:28:29,480 Does that make sense? 636 00:28:29,480 --> 00:28:32,030 Of course, you can't just use this one, or just use that one. 637 00:28:32,030 --> 00:28:35,780 You typically like to use all of these. 638 00:28:35,780 --> 00:28:38,540 Because they can fail. 639 00:28:38,540 --> 00:28:41,474 The function norm criterion can fail-- 640 00:28:41,474 --> 00:28:42,890 here's an example where I'm taking 641 00:28:42,890 --> 00:28:45,830 some iterations, some approximate solutions 642 00:28:45,830 --> 00:28:49,460 that are headed towards the actual root of this function. 643 00:28:49,460 --> 00:28:53,240 And at some point, I find that this solution 644 00:28:53,240 --> 00:28:55,820 is within epsilon of 0. 645 00:28:55,820 --> 00:28:57,974 And so I'd like to accept this solution, 646 00:28:57,974 --> 00:28:59,390 but graphically it looks like it's 647 00:28:59,390 --> 00:29:00,860 very far away from the root. 648 00:29:00,860 --> 00:29:02,360 So this is a case where the function 649 00:29:02,360 --> 00:29:05,690 has a very shallow slope. 650 00:29:05,690 --> 00:29:08,690 It's a very shallow slope, and the function norm criterion is 651 00:29:08,690 --> 00:29:10,670 not so good, really. 652 00:29:10,670 --> 00:29:14,425 I call this a solution, but it's quite a ways away from x star. 653 00:29:14,425 --> 00:29:16,550 So sometimes it's going to work, but sometimes it's 654 00:29:16,550 --> 00:29:18,520 not going to work. 
655 00:29:18,520 --> 00:29:20,230 Here's the step norm criterion-- here, 656 00:29:20,230 --> 00:29:23,519 I have a function nowhere near a root-- 657 00:29:23,519 --> 00:29:25,310 I have no idea where I am on this function, 658 00:29:25,310 --> 00:29:27,060 I don't know what value this function has, 659 00:29:27,060 --> 00:29:30,527 but my steps suddenly got small enough 660 00:29:30,527 --> 00:29:32,610 that they're smaller than this absolute tolerance, 661 00:29:32,610 --> 00:29:34,818 or they're smaller than the relative error tolerance. 662 00:29:34,818 --> 00:29:36,340 I might say, OK, let's stop. 663 00:29:36,340 --> 00:29:39,239 I'm not taking very large steps anymore, 664 00:29:39,239 --> 00:29:40,780 this seems like a good place to quit. 665 00:29:40,780 --> 00:29:42,792 But actually, my function just after this 666 00:29:42,792 --> 00:29:45,250 didn't go to 0 at all, it curved up and went the other way. 667 00:29:45,250 --> 00:29:47,770 There's not even a solution nearby. 668 00:29:47,770 --> 00:29:49,810 So both of these things can fail, 669 00:29:49,810 --> 00:29:51,550 and we try to use both of them instead 670 00:29:51,550 --> 00:29:54,700 to evaluate whether we have a reasonable solution 671 00:29:54,700 --> 00:29:57,000 to our nonlinear equation or not. 672 00:29:57,000 --> 00:29:57,610 Make sense? 673 00:30:00,750 --> 00:30:04,114 Are there any questions about that before I go on? 674 00:30:04,114 --> 00:30:05,998 No. 675 00:30:05,998 --> 00:30:07,420 OK. 676 00:30:07,420 --> 00:30:09,580 We also talk oftentimes about the rate 677 00:30:09,580 --> 00:30:13,780 of convergence of the iterative process that we're using. 678 00:30:13,780 --> 00:30:18,239 We might say it converges linearly, or quadratically. 
679 00:30:18,239 --> 00:30:19,780 And the rate of convergence is always 680 00:30:19,780 --> 00:30:24,310 assessed by looking at the difference 681 00:30:24,310 --> 00:30:28,140 between successive-- well, we look 682 00:30:28,140 --> 00:30:31,550 at the ratio of differences for successive approximations. 683 00:30:31,550 --> 00:30:35,025 So here's the difference between my best approximation, step 684 00:30:35,025 --> 00:30:37,950 i plus 1 minus the exact solution, 685 00:30:37,950 --> 00:30:41,880 normed, divided by my best approximation at step i, 686 00:30:41,880 --> 00:30:46,770 minus the exact solution normed, and raised to some power, q. 687 00:30:46,770 --> 00:30:51,650 And as I go to very large numbers of iterations, i-- 688 00:30:51,650 --> 00:30:54,240 this limit should be over i, I apologize. 689 00:30:54,240 --> 00:30:55,845 I'll fix that in the notes online, 690 00:30:55,845 --> 00:30:58,580 but this limit should be as i, the number of steps, 691 00:30:58,580 --> 00:31:00,380 gets very large. 692 00:31:00,380 --> 00:31:04,470 This ratio should converge to some constant. 693 00:31:04,470 --> 00:31:06,254 And the ratio will converge to a constant 694 00:31:06,254 --> 00:31:07,920 when I choose the right power of q here. 695 00:31:10,750 --> 00:31:14,491 So when this limit exists, and it doesn't go to 0, 696 00:31:14,491 --> 00:31:16,490 we can identify what sort of convergence we get. 697 00:31:16,490 --> 00:31:19,370 So if q equals 1, and C is smaller 698 00:31:19,370 --> 00:31:21,544 than 1, we say that convergence is linear-- 699 00:31:21,544 --> 00:31:22,710 what is that saying, really? 700 00:31:22,710 --> 00:31:25,460 This top step here is the absolute error, 701 00:31:25,460 --> 00:31:27,680 in approximation i plus 1. 702 00:31:27,680 --> 00:31:31,660 This bottom step here is the absolute error in step i-- 703 00:31:31,660 --> 00:31:34,110 remember q is 1 for linear convergence. 
704 00:31:34,110 --> 00:31:38,870 So the ratio of absolute errors, as long as that's less than 1-- 705 00:31:38,870 --> 00:31:42,570 I'm converging, I'm moving my way towards the solution. 706 00:31:42,570 --> 00:31:45,020 And we say that rate is linear. 707 00:31:45,020 --> 00:31:48,140 If C is 10 to the minus 1, then each iteration 708 00:31:48,140 --> 00:31:52,070 will be one digit more accurate than the previous one. 709 00:31:52,070 --> 00:31:54,440 The absolute error will be 10 times smaller 710 00:31:54,440 --> 00:31:56,720 in the next iteration versus the previous one. 711 00:31:56,720 --> 00:31:57,710 That would be great-- 712 00:31:57,710 --> 00:31:59,150 usually C isn't that small. 713 00:32:02,210 --> 00:32:05,850 If this power, for which this limit exists, q, 714 00:32:05,850 --> 00:32:10,140 is bigger than 1, we say the convergence is superlinear. 715 00:32:10,140 --> 00:32:12,750 If q is 2, which we'll see is something 716 00:32:12,750 --> 00:32:15,540 that results from the Newton-Raphson method, 717 00:32:15,540 --> 00:32:18,960 then we say convergence is quadratic. 718 00:32:18,960 --> 00:32:22,170 What that means is the number of accurate digits in my solution 719 00:32:22,170 --> 00:32:25,060 will actually double with each iteration. 720 00:32:25,060 --> 00:32:28,320 Linear, with C equal to 10 to the minus 1, 721 00:32:28,320 --> 00:32:30,910 I get one digit per iteration. 722 00:32:30,910 --> 00:32:33,910 Quadratic, I double the number of digits per iteration-- 723 00:32:33,910 --> 00:32:35,430 I have one digit on one iteration, 724 00:32:35,430 --> 00:32:37,740 I get two the next one, and four the next one, 725 00:32:37,740 --> 00:32:39,120 and eight the next one. 726 00:32:39,120 --> 00:32:44,130 So quadratic convergence is marvelous. 727 00:32:44,130 --> 00:32:46,320 Linear convergence, that's OK. 728 00:32:46,320 --> 00:32:49,210 That's about the minimum you'd be willing to accept. 
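The convergence-order definition above can be checked numerically with toy error sequences (my own construction, not from the lecture):

```python
# Toy error sequences illustrating the convergence-order definition.
# For linear convergence with C = 1e-1, each error is 10x smaller;
# for quadratic convergence, each error is the square of the last.
lin = [1e-1]
quad = [1e-1]
for _ in range(4):
    lin.append(1e-1 * lin[-1])      # linear: e_{i+1} = C * e_i, q = 1
    quad.append(quad[-1] ** 2)      # quadratic: e_{i+1} = e_i^2, q = 2

# The ratio e_{i+1} / e_i^q settles to a constant for the right q:
lin_ratios = [lin[i + 1] / lin[i] for i in range(4)]          # ~0.1 each
quad_ratios = [quad[i + 1] / quad[i] ** 2 for i in range(4)]  # ~1.0 each
```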
729 00:32:49,210 --> 00:32:50,910 Quadratic convergence is great, so we 730 00:32:50,910 --> 00:32:54,030 aim for methods that try to have these higher order 731 00:32:54,030 --> 00:32:54,780 convergences. 732 00:32:54,780 --> 00:32:57,787 So you really quickly get highly accurate solutions. 733 00:32:57,787 --> 00:32:59,370 You can go back and look at your notes 734 00:32:59,370 --> 00:33:00,980 and see that the Jacobi method 735 00:33:00,980 --> 00:33:04,950 and the Gauss-Seidel method both show linear convergence rates. 736 00:33:04,950 --> 00:33:07,902 They're linear methods. 737 00:33:07,902 --> 00:33:09,710 Does this make sense? 738 00:33:09,710 --> 00:33:11,145 OK. 739 00:33:11,145 --> 00:33:12,770 So I mentioned Newton-Raphson. 740 00:33:12,770 --> 00:33:14,450 Hopefully, somebody at some point 741 00:33:14,450 --> 00:33:16,220 told you about the Newton-Raphson method 742 00:33:16,220 --> 00:33:19,790 for solving at least one-dimensional, nonlinear 743 00:33:19,790 --> 00:33:20,450 equations. 744 00:33:20,450 --> 00:33:22,700 It goes like this. 745 00:33:22,700 --> 00:33:27,440 You say, I guess my solution is close to this green point here. 746 00:33:27,440 --> 00:33:31,650 Let me linearize my function at that point, 747 00:33:31,650 --> 00:33:34,830 and find where that linear approximation has a root-- 748 00:33:34,830 --> 00:33:36,710 which is this next green point. 749 00:33:36,710 --> 00:33:39,020 And then repeat that process here. 750 00:33:39,020 --> 00:33:42,320 I find the linearization of my function, this pink arrow. 751 00:33:42,320 --> 00:33:45,620 I look for where that linear function has a root, 752 00:33:45,620 --> 00:33:47,390 and that's my next best approximation. 753 00:33:47,390 --> 00:33:49,850 And I repeat this process over and over, 754 00:33:49,850 --> 00:33:51,890 and it will reliably-- 755 00:33:51,890 --> 00:33:56,320 under certain circumstances-- converge to the root. 
756 00:33:56,320 --> 00:33:58,610 What does that look like, in terms of the equations? 757 00:33:58,610 --> 00:34:02,420 So I linearized my function, so I 758 00:34:02,420 --> 00:34:06,200 want to approximate f of x at i plus 1, 759 00:34:06,200 --> 00:34:09,050 in terms of f of x at i-- so it's f of x at i, 760 00:34:09,050 --> 00:34:12,210 plus the derivative, multiplied by the difference between x 761 00:34:12,210 --> 00:34:14,050 i plus 1 and x i. 762 00:34:14,050 --> 00:34:16,820 And I say, find the place where this approximation 763 00:34:16,820 --> 00:34:20,270 is equal to 0, and determine what 764 00:34:20,270 --> 00:34:23,679 the next point that I'm going to use to approximate my solution 765 00:34:23,679 --> 00:34:24,179 is. 766 00:34:24,179 --> 00:34:27,710 So I solve for x i plus 1, in terms of x i. 767 00:34:27,710 --> 00:34:30,840 How big of a step do I take from x i to x i plus 1? 768 00:34:30,840 --> 00:34:34,460 It's this big, so the ratio of the function to its derivative 769 00:34:34,460 --> 00:34:36,839 at x i. 770 00:34:36,839 --> 00:34:42,110 And the derivative does the job of telling me which direction I 771 00:34:42,110 --> 00:34:43,850 should step in. 772 00:34:43,850 --> 00:34:45,889 Derivative gives me directionality, 773 00:34:45,889 --> 00:34:50,120 and this ratio here tells me the magnitude of the step. 774 00:34:50,120 --> 00:34:52,010 The magnitudes, you know, they're 775 00:34:52,010 --> 00:34:54,719 not very good oftentimes, because these functions 776 00:34:54,719 --> 00:34:57,260 that we're trying to solve aren't very linear. 777 00:34:57,260 --> 00:34:58,940 Usually they're highly nonlinear. 778 00:34:58,940 --> 00:35:01,457 What's really helpful is getting the direction right. 779 00:35:01,457 --> 00:35:04,040 You could go right, you could go left-- only one of those ways 780 00:35:04,040 --> 00:35:05,480 is getting you to the root. 
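The 1-D update just described, x i plus 1 equals x i minus f over f prime, can be sketched as a short loop. The function, tolerance, and parameter names are my own illustrative choices:

```python
def newton_1d(f, fprime, x0, tol=1e-12, max_iter=50):
    """Textbook 1-D Newton-Raphson: step by -f(x)/f'(x) until the
    function norm test passes. Tolerance and names are my own choices."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x -= fx / fprime(x)   # the derivative sets both sign and size of the step
    return x

# Find sqrt(2) as the root of f(x) = x^2 - 2, starting from x = 1
root = newton_1d(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
```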
781 00:35:05,480 --> 00:35:06,980 Newton-Raphson has this advantage-- 782 00:35:06,980 --> 00:35:09,360 it always points you in the right direction. 783 00:35:09,360 --> 00:35:10,880 OK? 784 00:35:10,880 --> 00:35:13,340 Of course, you can do this in any number of dimensions, not 785 00:35:13,340 --> 00:35:16,040 just one dimension. 786 00:35:16,040 --> 00:35:17,540 So you can approximate your function 787 00:35:17,540 --> 00:35:21,140 as linear-- set f of x i plus 1 approximately equal to 0. 788 00:35:21,140 --> 00:35:23,120 And then let's take our linearized version 789 00:35:23,120 --> 00:35:27,740 of the function, and let's find where it's equal to 0. 790 00:35:27,740 --> 00:35:30,170 Sometimes what's done is to replace 791 00:35:30,170 --> 00:35:33,200 this difference, x i plus 1, minus x i, 792 00:35:33,200 --> 00:35:35,750 with an unknown vector, d i-- 793 00:35:35,750 --> 00:35:37,530 which is the step size. 794 00:35:37,530 --> 00:35:41,540 How big a step am I going to take from x i to x i plus 1? 795 00:35:41,540 --> 00:35:44,720 And so we solve this equation for the displacement, d i. 796 00:35:44,720 --> 00:35:46,860 Move f to the other side, so you have 797 00:35:46,860 --> 00:35:50,410 Jacobian times d i is minus f. 798 00:35:50,410 --> 00:35:53,270 And solve-- d i is minus Jacobian inverse times f. 799 00:35:53,270 --> 00:35:56,010 It's just a system of linear equations. 800 00:35:56,010 --> 00:35:58,550 Now we know our step size. 801 00:35:58,550 --> 00:36:02,620 So x i plus 1 is x i plus d i, or x i plus 1 802 00:36:02,620 --> 00:36:07,600 is x i minus Jacobian inverse times f. 803 00:36:07,600 --> 00:36:09,700 The inverse of the Jacobian plays the same role 804 00:36:09,700 --> 00:36:11,620 as 1 over the derivative. 805 00:36:11,620 --> 00:36:13,960 It's telling us what direction to step in, 806 00:36:13,960 --> 00:36:16,240 in this multi-dimensional space. 
807 00:36:16,240 --> 00:36:19,420 And this solution to the system of equations 808 00:36:19,420 --> 00:36:23,530 is giving us a magnitude of the step that's good, not great, 809 00:36:23,530 --> 00:36:26,980 but is taking us closer and closer to the root. 810 00:36:26,980 --> 00:36:28,480 So this is the Newton-Raphson method 811 00:36:28,480 --> 00:36:30,834 applied to the system of nonlinear equations. 812 00:36:30,834 --> 00:36:32,500 This is really the way you want to solve 813 00:36:32,500 --> 00:36:35,430 these sorts of problems. 814 00:36:35,430 --> 00:36:36,920 It doesn't always work-- 815 00:36:36,920 --> 00:36:38,732 things can go wrong. 816 00:36:38,732 --> 00:36:40,190 What sorts of things go wrong here? 817 00:36:40,190 --> 00:36:40,773 Can you guess? 818 00:36:44,010 --> 00:36:44,648 Yeah? 819 00:36:44,648 --> 00:36:47,516 AUDIENCE: [INAUDIBLE] 820 00:36:47,516 --> 00:36:48,860 PROFESSOR: OK, this is good. 821 00:36:48,860 --> 00:36:52,550 So in the 1-D problem, sometimes the Newton-Raphson method 822 00:36:52,550 --> 00:36:54,470 can get stuck. 823 00:36:54,470 --> 00:36:58,190 So it won't have good necessarily global convergence 824 00:36:58,190 --> 00:36:58,700 properties. 825 00:36:58,700 --> 00:37:01,580 If you have a bad initial guess, it might get stuck someplace, 826 00:37:01,580 --> 00:37:02,930 and the iterates won't converge. 827 00:37:02,930 --> 00:37:03,920 That can be true. 828 00:37:03,920 --> 00:37:04,878 What else can go wrong? 829 00:37:04,878 --> 00:37:06,424 AUDIENCE: [INAUDIBLE] 830 00:37:06,424 --> 00:37:08,090 PROFESSOR: Good, so if your derivative 831 00:37:08,090 --> 00:37:10,290 is 0, that's going to be problematic. 832 00:37:10,290 --> 00:37:11,930 What's the multi-dimensional equivalent 833 00:37:11,930 --> 00:37:13,340 of the derivative being 0? 834 00:37:13,340 --> 00:37:15,030 AUDIENCE: [INAUDIBLE] 835 00:37:15,030 --> 00:37:16,470 PROFESSOR: What's that? 836 00:37:16,470 --> 00:37:17,810 Singular Jacobian, right? 
837 00:37:17,810 --> 00:37:22,190 So if this J, the Jacobian, has some null space associated 838 00:37:22,190 --> 00:37:24,110 with it, how am I supposed to figure out 839 00:37:24,110 --> 00:37:26,700 which direction to step in? 840 00:37:26,700 --> 00:37:29,720 There's some arbitrariness associated with the solution 841 00:37:29,720 --> 00:37:31,820 of this system of equations. 842 00:37:31,820 --> 00:37:35,240 So the derivative is 0 in the 1-D example, that's a problem. 843 00:37:35,240 --> 00:37:37,070 That problem gets a little fuzzier, 844 00:37:37,070 --> 00:37:38,630 but it's still a big problem when 845 00:37:38,630 --> 00:37:42,020 we try to solve for the step size, or the step-- 846 00:37:42,020 --> 00:37:43,100 the Newton-Raphson step. 847 00:37:43,100 --> 00:37:44,660 We may not be able to do this. 848 00:37:47,400 --> 00:37:49,040 You don't run into this very often, 849 00:37:49,040 --> 00:37:51,105 but you can, from time to time. 850 00:37:51,105 --> 00:37:52,730 One place where this is going to happen 851 00:37:52,730 --> 00:37:56,137 is if we have a non-locally unique solution. 852 00:37:56,137 --> 00:37:58,220 When we have one of those, we know the determinant 853 00:37:58,220 --> 00:38:02,480 of the Jacobian at that point is going to be 0. 854 00:38:02,480 --> 00:38:04,370 If we're close to those solutions, 855 00:38:04,370 --> 00:38:05,870 well the determinant of the Jacobian 856 00:38:05,870 --> 00:38:08,720 is going to be close to 0-- 857 00:38:08,720 --> 00:38:11,420 you might expect that the system of equations you have to solve 858 00:38:11,420 --> 00:38:14,120 becomes ill-conditioned. 859 00:38:14,120 --> 00:38:16,280 So even though there may be an exact solution 860 00:38:16,280 --> 00:38:18,562 for all the steps leading up to that point, 861 00:38:18,562 --> 00:38:20,270 the equations may become ill-conditioned. 
862 00:38:20,270 --> 00:38:22,760 You may not be able to reliably find those solutions 863 00:38:22,760 --> 00:38:24,290 with your computer, either. 864 00:38:24,290 --> 00:38:26,330 So then these steps you take, well, 865 00:38:26,330 --> 00:38:29,380 who knows where they're going at that point. 866 00:38:29,380 --> 00:38:31,577 It's going to be crazy. 867 00:38:31,577 --> 00:38:33,410 There are ways of fixing all these problems. 868 00:38:33,410 --> 00:38:36,442 Let's do an example. 869 00:38:36,442 --> 00:38:37,900 This is a geometry example, but you 870 00:38:37,900 --> 00:38:40,910 can write it as a system of nonlinear equations, as well. 871 00:38:40,910 --> 00:38:43,450 So we have two circles-- 872 00:38:43,450 --> 00:38:47,320 circle f 1, circle f 2 in the x1, x2 plane. 873 00:38:47,320 --> 00:38:49,450 They satisfy-- these are the locus of points 874 00:38:49,450 --> 00:38:54,115 that satisfy the equation f 1 of x 1 and x 2 equals 0-- 875 00:38:54,115 --> 00:38:55,800 this is the equation for one circle, 876 00:38:55,800 --> 00:38:57,770 and this is the equation for the other circle. 877 00:38:57,770 --> 00:38:59,145 And we want the solution, we want 878 00:38:59,145 --> 00:39:02,920 the roots of this vector-valued function, f, 879 00:39:02,920 --> 00:39:04,330 for vector-valued x. 880 00:39:04,330 --> 00:39:09,110 And those are the intersections of these two circles. 881 00:39:09,110 --> 00:39:12,110 You can do it using Newton-Raphson, 882 00:39:12,110 --> 00:39:14,990 so you're going to need to know the Jacobian. 883 00:39:14,990 --> 00:39:17,010 So compute the Jacobian of this function. 884 00:39:17,010 --> 00:39:18,840 This is practice-- maybe most of you 885 00:39:18,840 --> 00:39:20,370 know how to compute a Jacobian, but some people 886 00:39:20,370 --> 00:39:21,060 haven't done it before. 
887 00:39:21,060 --> 00:39:23,100 So it's always good to make sure you remember 888 00:39:23,100 --> 00:39:25,260 that the first row of the Jacobian 889 00:39:25,260 --> 00:39:28,530 is the derivatives of the first element of f. 890 00:39:28,530 --> 00:39:30,885 And later rows are later elements. 891 00:39:30,885 --> 00:39:32,760 You don't want the transpose of the Jacobian. 892 00:39:32,760 --> 00:39:33,551 Then it won't work. 893 00:40:02,400 --> 00:40:05,130 OK, so it should look something like this. 894 00:40:05,130 --> 00:40:08,260 There's your Jacobian. 895 00:40:08,260 --> 00:40:10,360 The Newton-Raphson process tells us 896 00:40:10,360 --> 00:40:14,620 how to take steps from one approximation to the next. 897 00:40:14,620 --> 00:40:21,060 The step is equal to minus the Jacobian inverse, evaluated 898 00:40:21,060 --> 00:40:23,980 at my best guess for the solution, multiplied 899 00:40:23,980 --> 00:40:29,080 by the function evaluated at my best guess of the solution. 900 00:40:29,080 --> 00:40:31,810 You're never going to compute the Jacobian inverse explicitly-- 901 00:40:31,810 --> 00:40:35,170 that's just code for "solve this system of linear equations." 902 00:40:35,170 --> 00:40:38,922 So use the backslash operator in MATLAB, for example. 903 00:40:38,922 --> 00:40:39,630 And here you go-- 904 00:40:39,630 --> 00:40:42,880 I had an initial guess for the solution, iterate 0, 905 00:40:42,880 --> 00:40:44,190 at minus 1 and 3. 906 00:40:44,190 --> 00:40:45,884 This is somewhere outside the circles-- 907 00:40:45,884 --> 00:40:47,550 it's pretty far away from the solutions. 908 00:40:47,550 --> 00:40:51,180 But I do my Newton-Raphson steps, I iterate on and on. 909 00:40:51,180 --> 00:40:54,180 And after four steps, you can see 910 00:40:54,180 --> 00:40:56,130 that the step size in absolute value 911 00:40:56,130 --> 00:40:57,540 is order 10 to the minus 3. 
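The iteration described here can be sketched in a few lines of Python. The circles below (centers (0,0) and (2,0), radius 2) are stand-ins chosen for illustration, not necessarily the ones from the lecture slides; the structure of the method is the same.

```python
def f(x1, x2):
    """Residuals: each component vanishes on one circle (hypothetical circles)."""
    return (x1**2 + x2**2 - 4.0,
            (x1 - 2.0)**2 + x2**2 - 4.0)

def jacobian(x1, x2):
    """Rows are the gradients of f1 and f2 (not the transpose!)."""
    return ((2.0 * x1, 2.0 * x2),
            (2.0 * (x1 - 2.0), 2.0 * x2))

def newton_raphson(x1, x2, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        f1, f2 = f(x1, x2)
        (a, b), (c, d) = jacobian(x1, x2)
        det = a * d - b * c
        if abs(det) < 1e-14:          # singular Jacobian: the step is undefined
            raise RuntimeError("Jacobian is (nearly) singular")
        # Solve J s = -f; here by Cramer's rule for the 2x2 case.
        # In MATLAB this whole solve is just s = -J\f.
        s1 = (b * f2 - d * f1) / det
        s2 = (c * f1 - a * f2) / det
        x1, x2 = x1 + s1, x2 + s2
        if max(abs(s1), abs(s2)) < tol:   # step-norm stopping criterion
            return x1, x2
    raise RuntimeError("did not converge")

# Initial guess outside both circles, like the (-1, 3) guess in the lecture.
root = newton_raphson(-1.0, 3.0)
```

For these particular circles the intersections are at (1, ±√3), and the iteration starting at (-1, 3) lands on the upper one.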
912 00:40:57,540 --> 00:41:00,637 The function norm is order 10 to the minus 3, 913 00:41:00,637 --> 00:41:02,470 as well-- or maybe order 10 to the minus 2, 914 00:41:02,470 --> 00:41:03,800 but it's getting down there. 915 00:41:03,800 --> 00:41:05,980 These things are decreasing pretty quickly. 916 00:41:05,980 --> 00:41:07,740 And I move to a point that you'll 917 00:41:07,740 --> 00:41:09,240 see is pretty close to the solution. 918 00:41:13,110 --> 00:41:16,070 Here are some things you need to know about Newton-Raphson, 919 00:41:16,070 --> 00:41:19,160 that you'll want to think carefully about as we go forward. 920 00:41:19,160 --> 00:41:24,990 So it possesses a local convergence property. 921 00:41:24,990 --> 00:41:28,260 I'm going to illustrate that graphically for you. 922 00:41:28,260 --> 00:41:31,070 So here, I didn't solve the problem once, I solved it-- 923 00:41:31,070 --> 00:41:32,600 I don't know, 10,000 times. 924 00:41:32,600 --> 00:41:36,610 And I chose different initial points to start iterating with. 925 00:41:36,610 --> 00:41:39,240 Here, minus 1, 3, that was one point. 926 00:41:39,240 --> 00:41:41,300 But I chose a whole bunch of them. 927 00:41:41,300 --> 00:41:43,970 And I asked, how many iterations-- 928 00:41:43,970 --> 00:41:46,100 how many steps did my Newton-Raphson method 929 00:41:46,100 --> 00:41:48,440 have to take before I got sufficiently 930 00:41:48,440 --> 00:41:53,540 close to either this root here, or this root there? 931 00:41:53,540 --> 00:41:55,992 I don't remember what that convergence criterion was-- 932 00:41:55,992 --> 00:41:57,950 it doesn't really matter, but there was some 10 933 00:41:57,950 --> 00:42:00,366 to the minus 3, or 10 to the minus 5, or 10 to the minus 8 934 00:42:00,366 --> 00:42:02,000 convergence criterion that I made 935 00:42:02,000 --> 00:42:05,180 sure the Newton-Raphson method hit, in both the function norm 936 00:42:05,180 --> 00:42:08,450 and step norm cases. 
937 00:42:08,450 --> 00:42:11,620 And then, if the color on this map is blue-- 938 00:42:11,620 --> 00:42:14,240 the solution converged to this star in the blue zone-- 939 00:42:14,240 --> 00:42:17,390 if the color's orange, the solution converged to the star 940 00:42:17,390 --> 00:42:19,180 in the orange zone. 941 00:42:19,180 --> 00:42:21,920 And if the color is light, it didn't take so many iterations 942 00:42:21,920 --> 00:42:22,544 to converge. 943 00:42:22,544 --> 00:42:24,710 And if the color gets darker, it takes more and more 944 00:42:24,710 --> 00:42:27,080 iterations to converge. 945 00:42:27,080 --> 00:42:28,130 So that's the picture-- 946 00:42:28,130 --> 00:42:29,104 that's this map. 947 00:42:29,104 --> 00:42:30,770 I solved it a bunch of times, and then I 948 00:42:30,770 --> 00:42:33,124 mapped out how many iterations it 949 00:42:33,124 --> 00:42:35,040 took me to converge to the different solutions. 950 00:42:35,040 --> 00:42:37,307 So you can see if I start close to the solution, 951 00:42:37,307 --> 00:42:38,390 the color is really light. 952 00:42:38,390 --> 00:42:41,120 It doesn't take very many iterations to get there. 953 00:42:41,120 --> 00:42:43,335 And the further away I move in this direction-- 954 00:42:43,335 --> 00:42:45,710 it still doesn't seem like it takes so many iterations to get 955 00:42:45,710 --> 00:42:46,880 there, either. 956 00:42:46,880 --> 00:42:49,100 I need a good initial guess-- 957 00:42:49,100 --> 00:42:51,910 I want to be close to where I think the solution is. 958 00:42:51,910 --> 00:42:54,590 Because once I'm over here somewhere, I do pretty well. 959 00:42:54,590 --> 00:42:56,215 And the same is true on the other side, 960 00:42:56,215 --> 00:42:59,000 because this problem is symmetric. 961 00:42:59,000 --> 00:43:03,170 There's a line down the middle here, and along this line, 962 00:43:03,170 --> 00:43:05,585 the determinant of the Jacobian is equal to 0. 
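A rough, miniature version of this map experiment can be reconstructed in code: run Newton-Raphson from different initial guesses and count the iterations each one needs. The circles here (centers (0,0) and (2,0), radius 2) are the same hypothetical stand-ins as before; for them, det J = 8·x2, so the Jacobian is singular along the line x2 = 0, and guesses near that line behave badly.

```python
def newton_count(x1, x2, tol=1e-8, max_iter=100):
    """Return (which root was reached, iterations taken), or (None, k) on failure."""
    for k in range(1, max_iter + 1):
        f1 = x1**2 + x2**2 - 4.0            # hypothetical circle 1
        f2 = (x1 - 2.0)**2 + x2**2 - 4.0    # hypothetical circle 2
        a, b = 2.0 * x1, 2.0 * x2           # first row of J
        c, d = 2.0 * (x1 - 2.0), 2.0 * x2   # second row of J
        det = a * d - b * c                 # = 8 * x2 for these circles
        if abs(det) < 1e-14:                # singular Jacobian: no step defined
            return None, k
        s1 = (b * f2 - d * f1) / det        # Cramer's rule for J s = -f
        s2 = (c * f1 - a * f2) / det
        x1, x2 = x1 + s1, x2 + s2
        if max(abs(s1), abs(s2)) < tol:
            return ("upper" if x2 > 0 else "lower"), k
    return None, max_iter

# A guess far from the singular line converges in a handful of iterations;
# a guess near x2 = 0 gets thrown far away first and needs noticeably more --
# the "darker color" on the lecturer's map.
far = newton_count(-1.0, 3.0)
near = newton_count(-1.0, 0.05)
```

Sweeping `newton_count` over a grid of initial guesses and coloring by the returned pair is exactly the kind of picture described in the lecture.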
963 00:43:08,580 --> 00:43:11,970 So we talked about these points with the Newton-Raphson method. 964 00:43:11,970 --> 00:43:14,520 And if I pick initial guesses sufficiently close 965 00:43:14,520 --> 00:43:17,970 to this line, you can see the color gets darker and darker. 966 00:43:17,970 --> 00:43:20,100 The number of iterations required to converge 967 00:43:20,100 --> 00:43:23,640 to the solution goes way up. 968 00:43:23,640 --> 00:43:28,190 Now, the Newton-Raphson method possesses a local convergence 969 00:43:28,190 --> 00:43:28,740 property. 970 00:43:28,740 --> 00:43:32,750 Which means, if I have a locally unique solution, 971 00:43:32,750 --> 00:43:34,910 there's always going to be some neighborhood 972 00:43:34,910 --> 00:43:38,090 around that solution for which the determinant of the Jacobian 973 00:43:38,090 --> 00:43:39,560 is not equal to 0. 974 00:43:39,560 --> 00:43:42,710 And in that neighborhood, I can guarantee 975 00:43:42,710 --> 00:43:46,010 that this iterative process will eventually reach the solution. 976 00:43:46,010 --> 00:43:48,070 That's pretty good. 977 00:43:48,070 --> 00:43:48,770 That's handy. 978 00:43:48,770 --> 00:43:50,270 These iterates can go anywhere-- how 979 00:43:50,270 --> 00:43:51,700 do you know you're getting to the solution? 980 00:43:51,700 --> 00:43:52,720 Are you going to waste your time, 981 00:43:52,720 --> 00:43:53,300 or are you going to get there? 982 00:43:53,300 --> 00:43:55,625 So that's this local convergence property associated 983 00:43:55,625 --> 00:43:58,690 with it, which is nice. 984 00:43:58,690 --> 00:44:00,700 But it'll break down as I get to places 985 00:44:00,700 --> 00:44:04,450 where the determinant of the Jacobian is equal to 0. 
986 00:44:04,450 --> 00:44:06,820 So there could be a zone-- 987 00:44:06,820 --> 00:44:09,730 it's not in this one-- there could be a zone, for example, 988 00:44:09,730 --> 00:44:11,967 like a ring on which the determinant of the Jacobian 989 00:44:11,967 --> 00:44:12,550 is 0. 990 00:44:12,550 --> 00:44:14,410 And if I take a guess inside that ring, who 991 00:44:14,410 --> 00:44:16,276 knows where that iteration is going to go. 992 00:44:16,276 --> 00:44:17,650 Could be something like Sam said, 993 00:44:17,650 --> 00:44:21,580 where the iterative method just bounces around inside that ring 994 00:44:21,580 --> 00:44:24,070 and never converges. 995 00:44:24,070 --> 00:44:26,800 But when I have roots that are locally unique, 996 00:44:26,800 --> 00:44:29,830 and I start with good guesses close to those roots, 997 00:44:29,830 --> 00:44:31,660 I can guarantee the Newton-Raphson method 998 00:44:31,660 --> 00:44:32,620 will converge. 999 00:44:32,620 --> 00:44:35,290 I'll show you next time that not only does it converge, 1000 00:44:35,290 --> 00:44:37,060 but it also converges quadratically. 1001 00:44:37,060 --> 00:44:39,452 So if you start sufficiently close to the solution, 1002 00:44:39,452 --> 00:44:41,410 you get to double the number of accurate digits 1003 00:44:41,410 --> 00:44:42,340 in each iteration. 1004 00:44:42,340 --> 00:44:44,200 You can see that happening here. 1005 00:44:44,200 --> 00:44:47,620 OK, so I have one accurate digit, now I have two. 1006 00:44:47,620 --> 00:44:49,967 The next iteration I'll have four, and so on. 1007 00:44:49,967 --> 00:44:51,550 The number of accurate digits is going 1008 00:44:51,550 --> 00:44:56,030 to double at each iteration. 1009 00:44:56,030 --> 00:44:57,870 So that's going to conclude for today. 1010 00:44:57,870 --> 00:45:01,160 Next time, we'll talk about how to fix these problems 1011 00:45:01,160 --> 00:45:02,720 with the Newton-Raphson method. 
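The digit-doubling behavior is easiest to see in one dimension. Here is a small sketch (my own illustration, not from the lecture) of Newton's method applied to f(x) = x² - 2, whose root is √2:

```python
import math

x = 1.5                                  # a good initial guess, near sqrt(2)
errors = []
for _ in range(5):
    x = x - (x**2 - 2.0) / (2.0 * x)     # Newton step: x - f(x)/f'(x)
    errors.append(abs(x - math.sqrt(2.0)))
# Each error is roughly the square of the previous one, so the number of
# accurate digits roughly doubles per step, until floating point runs out.
```

Printing `errors` shows them falling from about 1e-3 to 1e-6 to 1e-12 in successive steps: exactly the quadratic convergence being described.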
1012 00:45:02,720 --> 00:45:04,220 So there are going to be cases where 1013 00:45:04,220 --> 00:45:07,345 the convergence isn't ideal, where we can improve things. 1014 00:45:07,345 --> 00:45:08,720 There are going to be cases where 1015 00:45:08,720 --> 00:45:10,525 we don't want to compute the Jacobian 1016 00:45:10,525 --> 00:45:11,720 or the Jacobian inverse. 1017 00:45:11,720 --> 00:45:13,840 And we can improve the method. 1018 00:45:13,840 --> 00:45:15,390 Thanks.