WEBVTT
00:00:01.540 --> 00:00:03.910
The following content is
provided under a Creative
00:00:03.910 --> 00:00:05.300
Commons license.
00:00:05.300 --> 00:00:07.510
Your support will help
MIT OpenCourseWare
00:00:07.510 --> 00:00:11.600
continue to offer high-quality
educational resources for free.
00:00:11.600 --> 00:00:14.140
To make a donation, or to
view additional materials
00:00:14.140 --> 00:00:18.100
from hundreds of MIT courses,
visit MIT OpenCourseWare
00:00:18.100 --> 00:00:18.980
at ocw.mit.edu.
00:00:23.180 --> 00:00:25.060
OK, we're moving on.
00:00:25.060 --> 00:00:27.340
No more linear algebra.
00:00:27.340 --> 00:00:29.980
We're going to try solving
some more difficult problems--
00:00:29.980 --> 00:00:30.970
of course, all those
problems will just
00:00:30.970 --> 00:00:32.980
be turned into linear
algebra as we move on,
00:00:32.980 --> 00:00:37.750
so your expertise now
with different techniques
00:00:37.750 --> 00:00:40.687
from linear algebra is
going to come in handy.
00:00:40.687 --> 00:00:42.270
The next section of
this course, we're
00:00:42.270 --> 00:00:45.320
talking about systems
of nonlinear equations,
00:00:45.320 --> 00:00:48.490
and we'll transition into
problems in optimization--
00:00:48.490 --> 00:00:50.860
which, turns out, look a lot
like systems of nonlinear
00:00:50.860 --> 00:00:51.880
equations, as well.
00:00:51.880 --> 00:00:53.710
And we'll try to
leverage what we
00:00:53.710 --> 00:00:56.541
learn in the next few lectures
to solve different optimization
00:00:56.541 --> 00:00:57.040
problems.
00:00:59.457 --> 00:01:01.040
Before going on, I
just want to recap.
00:01:01.040 --> 00:01:03.165
All right, last time we
talked about singular value
00:01:03.165 --> 00:01:06.940
decomposition-- which is like
an eigenvalue decomposition
00:01:06.940 --> 00:01:09.540
for any matrix.
00:01:09.540 --> 00:01:11.920
And associated with that
matrix are singular vectors,
00:01:11.920 --> 00:01:13.930
left and right singular vectors.
00:01:13.930 --> 00:01:17.690
And the singular
values of that matrix.
00:01:17.690 --> 00:01:22.570
Your TA, Kristen, reminded
me that you can actually
00:01:22.570 --> 00:01:25.390
define a condition number
for any matrix, as well.
00:01:25.390 --> 00:01:26.950
The condition number we
gave originally was
00:01:26.950 --> 00:01:30.235
associated with solving
square systems of equations
00:01:30.235 --> 00:01:31.730
that actually have a solution.
00:01:31.730 --> 00:01:34.188
But there is a condition number
associated with any matrix,
00:01:34.188 --> 00:01:36.680
and it's defined in terms
of its singular values.
00:01:36.680 --> 00:01:39.430
So if you go back and look at
the definition of the two norm,
00:01:39.430 --> 00:01:42.190
you try to think about the
condition number associated
00:01:42.190 --> 00:01:44.700
with the two norm, you'll
see that the condition number
00:01:44.700 --> 00:01:46.840
of any matrix
can be defined
00:01:46.840 --> 00:01:51.130
as the ratio
of the biggest to the smallest
00:01:51.130 --> 00:01:53.650
singular values of that matrix.
00:01:53.650 --> 00:01:56.740
So there's a condition number
associated with any matrix.
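
A minimal sketch of that definition, assuming the usual 2-norm convention in which the condition number is the largest singular value divided by the smallest. It handles only 2-by-2 matrices, using the fact that the singular values are the square roots of the eigenvalues of A^T A. The function names are illustrative and do not come from the lecture.

```python
import math

def singular_values_2x2(a):
    """Singular values of a 2x2 matrix: square roots of the eigenvalues of A^T A."""
    (p, q), (r, s) = a
    # Entries of the symmetric matrix A^T A.
    m11 = p * p + r * r
    m12 = p * q + r * s
    m22 = q * q + s * s
    trace = m11 + m22
    det = m11 * m22 - m12 * m12
    # Eigenvalues of a symmetric 2x2 matrix from its trace and determinant.
    disc = math.sqrt(max(trace * trace / 4.0 - det, 0.0))
    lam_max = trace / 2.0 + disc
    lam_min = max(trace / 2.0 - disc, 0.0)
    return math.sqrt(lam_max), math.sqrt(lam_min)

def condition_number_2x2(a):
    """Ratio of the largest to the smallest singular value (infinite if singular)."""
    s_max, s_min = singular_values_2x2(a)
    return math.inf if s_min == 0.0 else s_max / s_min

well_conditioned = [[2.0, 0.0], [0.0, 1.0]]
nearly_singular = [[1.0, 1.0], [1.0, 1.0001]]
print(condition_number_2x2(well_conditioned))
print(condition_number_2x2(nearly_singular))
```

The second matrix has nearly parallel rows, so its smallest singular value is tiny and the ratio is large.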
00:01:56.740 --> 00:01:58.180
The condition
number as an entity
00:01:58.180 --> 00:01:59.680
makes most sense
when we're thinking
00:01:59.680 --> 00:02:01.990
about how we amplify errors--
00:02:01.990 --> 00:02:05.130
numerical errors, solving
systems of equations.
00:02:05.130 --> 00:02:07.609
It's most easily applied to
square systems that actually
00:02:07.609 --> 00:02:09.400
have unique solutions,
but you can apply it
00:02:09.400 --> 00:02:11.320
to any system of
equations you want.
00:02:11.320 --> 00:02:12.820
And the singular
value decomposition
00:02:12.820 --> 00:02:16.560
is one way to tap into that.
00:02:16.560 --> 00:02:20.110
OK, the last thing we did
discussing linear algebra
00:02:20.110 --> 00:02:23.680
was to talk about iterative
solutions to systems
00:02:23.680 --> 00:02:26.130
of linear equations.
00:02:26.130 --> 00:02:28.980
And that's actually
our hook into solutions
00:02:28.980 --> 00:02:30.810
to systems of
nonlinear equations.
00:02:30.810 --> 00:02:33.420
It's going to turn out
that exact solutions are
00:02:33.420 --> 00:02:37.085
hard to come by for anything
but linear systems of equations.
00:02:37.085 --> 00:02:38.460
And so we're always
going to have
00:02:38.460 --> 00:02:42.960
these iterative algorithms,
where we refine initial guesses
00:02:42.960 --> 00:02:45.240
for solutions until we
converge to something
00:02:45.240 --> 00:02:48.079
that's a solution to the
problem that we wanted before.
00:02:48.079 --> 00:02:49.620
And one question
you should ask, when
00:02:49.620 --> 00:02:55.830
you do these sorts of
iterations, is when do I stop?
00:02:55.830 --> 00:02:57.930
I don't know the exact
solution to this problem.
00:02:57.930 --> 00:03:00.660
I can't say I'm
close enough-- what
00:03:00.660 --> 00:03:02.370
does close enough even mean?
00:03:02.370 --> 00:03:04.770
So how do I decide to stop?
00:03:04.770 --> 00:03:06.570
Do you have any
ideas or suggestions
00:03:06.570 --> 00:03:08.050
for how you might do that?
00:03:08.050 --> 00:03:09.790
You've done some of this already
in a homework assignment.
00:03:09.790 --> 00:03:10.706
But what do you think?
00:03:10.706 --> 00:03:12.030
How do you decide to stop?
00:03:12.030 --> 00:03:12.824
Yeah?
00:03:12.824 --> 00:03:14.116
AUDIENCE: [INAUDIBLE].
00:03:18.580 --> 00:03:20.650
PROFESSOR: OK, so this
is one suggestion--
00:03:20.650 --> 00:03:24.930
look at my current iteration,
and my next iteration,
00:03:24.930 --> 00:03:28.090
ask how far apart are
these two numbers?
00:03:28.090 --> 00:03:30.220
If they're
sufficiently far apart,
00:03:30.220 --> 00:03:34.540
seems like I've got some more
steps to make before I converge
00:03:34.540 --> 00:03:35.620
to my solution.
00:03:35.620 --> 00:03:37.660
And if they're sufficiently
close together,
00:03:37.660 --> 00:03:39.460
the steps I'm taking
are small enough
00:03:39.460 --> 00:03:42.310
that I might be happy
accepting this solution
00:03:42.310 --> 00:03:44.040
as a good approximation.
00:03:44.650 --> 00:03:46.660
That's called the
step norm criterion--
00:03:46.660 --> 00:03:48.880
I'll give you a formalization
of that later-- it's
00:03:48.880 --> 00:03:50.530
called the step norm criterion.
00:03:50.530 --> 00:03:53.680
How big are the steps
that I'm taking?
00:03:53.680 --> 00:03:56.920
And are they sufficiently
small that I don't
00:03:56.920 --> 00:03:59.260
care about any future steps?
00:03:59.260 --> 00:04:00.070
Another suggestion?
00:04:00.070 --> 00:04:01.320
AUDIENCE: I've got a question.
00:04:01.320 --> 00:04:01.986
PROFESSOR: Yeah?
00:04:01.986 --> 00:04:05.478
AUDIENCE: [INAUDIBLE]
absolute [INAUDIBLE]
00:04:05.478 --> 00:04:09.205
when we did that for homework,
I tried to do it [INAUDIBLE],
00:04:09.205 --> 00:04:11.170
and [INAUDIBLE].
00:04:11.170 --> 00:04:13.290
PROFESSOR: Yes.
00:04:13.290 --> 00:04:16.180
I will show you a definition
of the step norm criterion
00:04:16.180 --> 00:04:19.540
that integrates both
relative and absolute error
00:04:19.540 --> 00:04:20.500
into the definition.
00:04:20.500 --> 00:04:23.710
And we'll see why, OK?
00:04:23.710 --> 00:04:25.370
One problem may be--
00:04:25.370 --> 00:04:28.839
what if the solution I'm
trying to converge to is 0?
00:04:28.839 --> 00:04:30.880
How do you define the
relative error with respect
00:04:30.880 --> 00:04:33.196
to the number 0?
00:04:33.196 --> 00:04:35.320
There isn't one-- there is
only absolute error when
00:04:35.320 --> 00:04:36.611
you're trying to converge to 0.
00:04:36.611 --> 00:04:39.260
So you may want to
have some measure
00:04:39.260 --> 00:04:41.882
of both absolute and
relative step size
00:04:41.882 --> 00:04:43.840
in order to determine
whether you're converged.
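
One way such a combined test might look, as a sketch only: the step is accepted as small when its norm is below a relative tolerance times the size of the current iterate plus an absolute tolerance. The names `rel_tol` and `abs_tol` and their default values are assumptions for illustration, not the definition given later in the course.

```python
import math

def norm(v):
    """Euclidean (2-) norm of a list of numbers."""
    return math.sqrt(sum(c * c for c in v))

def step_norm_converged(x_new, x_old, rel_tol=1e-8, abs_tol=1e-12):
    """True when the last step is small in a mixed relative/absolute sense."""
    step = norm([a - b for a, b in zip(x_new, x_old)])
    # The absolute term keeps the test meaningful when the solution is near 0,
    # where a purely relative measure has nothing to scale against.
    return step <= rel_tol * norm(x_new) + abs_tol

print(step_norm_converged([1.0, 2.0], [1.0, 2.0 + 1e-10]))
print(step_norm_converged([0.0], [1e-13]))
print(step_norm_converged([0.0], [1e-13], abs_tol=0.0))
```

With `abs_tol=0.0` the last call can never succeed for an iterate that is exactly zero, which is the problem raised above.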
00:04:43.840 --> 00:04:46.970
Is that the only way
to do it, though?
00:04:46.970 --> 00:04:49.600
Have any ideas, alternative
proposals for deciding
00:04:49.600 --> 00:04:51.472
convergence?
00:04:51.472 --> 00:04:52.930
AUDIENCE: [INAUDIBLE]
the residual.
00:04:52.930 --> 00:04:53.710
PROFESSOR: The residual.
00:04:53.710 --> 00:04:54.850
OK, what's the residual?
00:04:54.850 --> 00:04:56.840
AUDIENCE: [INAUDIBLE]
00:04:56.840 --> 00:04:59.560
PROFESSOR: Good, OK, so we
asked, how good a solution--
00:04:59.560 --> 00:05:03.760
we can ask how good a solution
is this value that we've
00:05:03.760 --> 00:05:07.660
converged to by putting it
back into the original system
00:05:07.660 --> 00:05:10.930
of equations, and asking, how
far out of balance are we?
00:05:10.930 --> 00:05:13.930
I take my best guess for
solution x, and multiply it by
00:05:13.930 --> 00:05:15.940
A, and I subtract b--
00:05:15.940 --> 00:05:18.310
we call that the residual.
00:05:18.310 --> 00:05:21.280
And is the residual
sufficiently converged, or not?
00:05:21.280 --> 00:05:23.200
If it's small
enough in magnitude,
00:05:23.200 --> 00:05:24.970
then we would say,
OK, maybe we're
00:05:24.970 --> 00:05:27.130
sufficiently close
to the solution.
00:05:27.130 --> 00:05:29.170
If it's too big, then we
say, OK, let's iterate
00:05:29.170 --> 00:05:30.670
some more until we get there.
00:05:30.670 --> 00:05:33.520
That is called the
function norm criterion.
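
A small sketch of the function norm test for a linear system, assuming a plain absolute tolerance `tol` (a hypothetical parameter). The residual is A x minus b, and the guess is accepted when the norm of the residual is small.

```python
import math

def norm(v):
    """Euclidean (2-) norm of a list of numbers."""
    return math.sqrt(sum(c * c for c in v))

def residual(a, x, b):
    """Residual A x - b for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) - bi
            for row, bi in zip(a, b)]

def function_norm_converged(a, x, b, tol=1e-8):
    """True when the guess x leaves the equations nearly in balance."""
    return norm(residual(a, x, b)) <= tol

A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
print(function_norm_converged(A, [0.8, 1.4], b))  # the exact solution of this system
print(function_norm_converged(A, [1.0, 1.0], b))  # second equation is off by 1
```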
00:05:35.950 --> 00:05:37.700
We're going to talk
about these in detail,
00:05:37.700 --> 00:05:40.100
as applied to systems
of nonlinear equations.
00:05:40.100 --> 00:05:44.840
But these same criteria apply
to all iterative processes.
00:05:44.840 --> 00:05:46.922
Neither is preferred
over the other.
00:05:46.922 --> 00:05:48.380
You don't know the
exact solution--
00:05:48.380 --> 00:05:50.840
you have no way of measuring
how close or far away
00:05:50.840 --> 00:05:52.190
you are from the exact solution.
00:05:52.190 --> 00:05:54.740
So usually you use as
many tools as possible
00:05:54.740 --> 00:05:57.830
to try to judge how good
your approximation is.
00:05:57.830 --> 00:06:01.209
But you don't know for certain.
00:06:01.209 --> 00:06:02.750
What do you do, when
that's the case?
00:06:02.750 --> 00:06:05.170
We haven't really talked
about this in this class.
00:06:05.170 --> 00:06:07.630
You have a problem, you
program it into a computer,
00:06:07.630 --> 00:06:08.950
you get a solution.
00:06:08.950 --> 00:06:11.860
Is that the end of the story?
00:06:11.860 --> 00:06:13.100
We just stop?
00:06:13.100 --> 00:06:16.702
You get a number back out,
and that's the answer?
00:06:16.702 --> 00:06:17.910
How do you know you're right?
00:06:21.902 --> 00:06:22.900
How do you know?
00:06:26.400 --> 00:06:27.840
We talked about
numerical error--
00:06:27.840 --> 00:06:29.490
every calculation has
a numerical error-- how
00:06:29.490 --> 00:06:31.031
do you know you got
the right answer?
00:06:35.950 --> 00:06:38.795
What do you think?
00:06:38.795 --> 00:06:40.222
AUDIENCE: [INAUDIBLE]
00:06:40.222 --> 00:06:41.680
PROFESSOR: OK,
yeah, this is true--
00:06:41.680 --> 00:06:44.240
so you plug your solution
back into the equation,
00:06:44.240 --> 00:06:47.140
you ask, how good a job
does it do satisfying that?
00:06:47.140 --> 00:06:48.970
But maybe this
equation is relatively
00:06:48.970 --> 00:06:51.040
insensitive to the
solution you provide.
00:06:51.040 --> 00:06:54.160
Maybe many solutions
nearby look like they also
00:06:54.160 --> 00:06:56.020
satisfy the equation,
but those solutions
00:06:56.020 --> 00:06:57.460
are sufficiently far apart.
00:06:57.460 --> 00:06:59.329
So how do you--
00:06:59.329 --> 00:07:01.700
AUDIENCE: [INAUDIBLE]
00:07:01.700 --> 00:07:05.560
PROFESSOR: OK, so that
sort of physical reasoning
00:07:05.560 --> 00:07:06.560
is a good one.
00:07:06.560 --> 00:07:09.520
In your transfer
class, you'll talk
00:07:09.520 --> 00:07:11.800
about doing
asymptotic expansions,
00:07:11.800 --> 00:07:14.380
or asymptotic
solutions to problems.
00:07:14.380 --> 00:07:17.110
Solve this complicated
problem in a limit
00:07:17.110 --> 00:07:19.240
where it has some
analytical solution,
00:07:19.240 --> 00:07:20.800
and figure how the
solution scales,
00:07:20.800 --> 00:07:22.960
with respect to
different parameters.
00:07:22.960 --> 00:07:24.550
So you can have an
analytical solution
00:07:24.550 --> 00:07:26.550
that you compare against
your numerical solution
00:07:26.550 --> 00:07:27.820
in certain limits.
00:07:27.820 --> 00:07:30.490
You have experiments
that you've done.
00:07:30.490 --> 00:07:32.650
Experiment-- that's the reality.
00:07:32.650 --> 00:07:35.590
The computer is a fiction
that's trying to model reality,
00:07:35.590 --> 00:07:38.050
so you can compare your
solution to experiments.
00:07:38.050 --> 00:07:41.650
You could also solve the problem
a bunch of different ways,
00:07:41.650 --> 00:07:44.950
and see if all these answers
converge in the same place.
00:07:44.950 --> 00:07:47.875
So we're going to talk about
solving nonlinear equations--
00:07:47.875 --> 00:07:49.750
we're going to need
different initial guesses
00:07:49.750 --> 00:07:52.022
for our iterative methods.
00:07:52.022 --> 00:07:53.980
We might try several
different initial guesses,
00:07:53.980 --> 00:07:56.160
and see what solutions
we come up with.
00:07:56.160 --> 00:07:58.660
Maybe we converge all
to the same solution,
00:07:58.660 --> 00:08:00.970
or maybe this problem has
some weird sensitivity in it,
00:08:00.970 --> 00:08:03.040
and we get lots of
different solutions
00:08:03.040 --> 00:08:05.320
that aren't coordinated
with each other.
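
To make the idea of trying several initial guesses concrete, here is a sketch that uses a scalar Newton iteration as a stand-in iterative method. That choice is an assumption, since no specific method has been introduced at this point, and the test function x^3 - x is a made-up example with three known roots. The stopping test combines a step size check with a function value check.

```python
def newton_scalar(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Scalar Newton iteration; stops when both step and function value are small."""
    x = x0
    for _ in range(max_iter):
        step = -f(x) / dfdx(x)
        x += step
        if abs(step) <= tol * abs(x) + tol and abs(f(x)) <= tol:
            return x
    raise RuntimeError("no convergence within max_iter iterations")

def f(x):
    return x**3 - x          # hypothetical test function, roots at -1, 0, 1

def dfdx(x):
    return 3.0 * x**2 - 1.0

guesses = (-2.0, 0.1, 2.0)
roots = [newton_scalar(f, dfdx, x0) for x0 in guesses]
for x0, r in zip(guesses, roots):
    print(f"start {x0:5.1f} -> root {r: .6f}")
```

Each starting point lands on a different root here, so a single run would give no hint that the other two exist.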
00:08:05.320 --> 00:08:07.720
One of the duties
of someone who's
00:08:07.720 --> 00:08:11.170
using numerical methods to
solve problems is to try
00:08:11.170 --> 00:08:13.060
to validate their result--
00:08:13.060 --> 00:08:15.940
by solving it multiple
times or multiple ways,
00:08:15.940 --> 00:08:18.490
or comparing against
experiment, or against known
00:08:18.490 --> 00:08:20.560
analytical results
in certain limits
00:08:20.560 --> 00:08:23.290
where the answer
should be exact.
00:08:23.290 --> 00:08:26.840
But you can't just accept
what the computer tells you--
00:08:26.840 --> 00:08:31.547
you have to validate it against
some sort of external solution
00:08:31.547 --> 00:08:32.630
that you can compare with.
00:08:32.630 --> 00:08:34.713
Sometimes it's hard to
come up with that solution,
00:08:34.713 --> 00:08:36.669
but it's immensely important.
00:08:36.669 --> 00:08:38.664
We know every calculation
can be in error.
00:08:38.664 --> 00:08:40.539
And as we go on to more
complicated problems,
00:08:40.539 --> 00:08:42.549
it's even more important
to validate things.
00:08:45.332 --> 00:08:51.750
So, systems of
nonlinear equations--
00:08:51.750 --> 00:08:55.390
so these are
problems of a type f
00:08:55.390 --> 00:08:59.770
of x equals 0, where x is
some vector of unknowns,
00:08:59.770 --> 00:09:03.790
and has dimension N.
And f is a function that
00:09:03.790 --> 00:09:06.865
takes as input vectors
of dimension N,
00:09:06.865 --> 00:09:09.790
and gives an output a
vector of dimension N.
00:09:09.790 --> 00:09:13.390
It's a map from R N to
R N. But it's no longer
00:09:13.390 --> 00:09:16.690
necessarily a linear map-- it
can be some nonlinear function
00:09:16.690 --> 00:09:20.180
of all the elements of this x.
00:09:20.180 --> 00:09:22.370
And the solution
to this equation,
00:09:22.370 --> 00:09:24.650
the particular solution
of this equation, x--
00:09:24.650 --> 00:09:26.360
for which f of x
equals 0-- are called
00:09:26.360 --> 00:09:30.730
the roots of this
vector-valued function.
00:09:30.730 --> 00:09:33.290
The linear equations, then, are
just represented in this form
00:09:33.290 --> 00:09:37.200
as A x minus b, A x
minus b equals 0--
00:09:37.200 --> 00:09:38.750
it's the same as
the linear equations
00:09:38.750 --> 00:09:39.708
we were solving before.
00:09:39.708 --> 00:09:43.700
So we're searching for the
roots of these functions.
00:09:43.700 --> 00:09:45.200
Common chemical
engineering examples
00:09:45.200 --> 00:09:48.140
include equations of
state, often nonlinear,
00:09:48.140 --> 00:09:50.150
in terms of the variables
we're interested in.
00:09:50.150 --> 00:09:52.310
Energy balances have
lots of non-linearities
00:09:52.310 --> 00:09:54.220
introduced into them.
00:09:54.220 --> 00:09:57.890
Mass balances with
nonlinear reactions.
00:09:57.890 --> 00:10:00.200
Or reactions that
are non-isothermal,
00:10:00.200 --> 00:10:02.210
so their kinetics are
sensitive to temperature,
00:10:02.210 --> 00:10:04.464
and temperature is a
variable we want to know.
00:10:04.464 --> 00:10:06.380
These sorts of nonlinear
equations crop up all
00:10:06.380 --> 00:10:10.450
over the place, and you want to
be able to solve them reliably.
00:10:10.450 --> 00:10:12.690
Here's a simple, simple example.
00:10:12.690 --> 00:10:14.590
The Van der Waals
equation of state-- here
00:10:14.590 --> 00:10:18.400
I've written it in terms of
reduced pressure, temperature,
00:10:18.400 --> 00:10:19.390
and molar volume.
00:10:19.390 --> 00:10:23.680
Nonetheless, this is the Van
der Waals equation of state.
00:10:23.680 --> 00:10:26.500
And somebody told you once
that if I plot pressure
00:10:26.500 --> 00:10:30.190
versus molar volume for
different temperatures,
00:10:30.190 --> 00:10:32.590
I may see that, at
a given pressure,
00:10:32.590 --> 00:10:35.380
there could be just one root--
00:10:35.380 --> 00:10:38.150
one possible molar volume
that satisfies the equation
00:10:38.150 --> 00:10:38.650
of state.
00:10:38.650 --> 00:10:44.212
Or there can be one, two,
three potential roots.
00:10:44.212 --> 00:10:46.420
It turns out we don't know,
with nonlinear equations,
00:10:46.420 --> 00:10:48.460
how many possible
solutions there are.
00:10:48.460 --> 00:10:51.190
We knew, for linear
equations, I either
00:10:51.190 --> 00:10:53.562
had zero, one, or an
infinite number of solutions.
00:10:53.562 --> 00:10:55.270
But with nonlinear
equations, in general,
00:10:55.270 --> 00:10:57.130
there's no way to predict them.
00:10:57.130 --> 00:11:00.100
This problem can be
transformed into a polynomial,
00:11:00.100 --> 00:11:02.650
and polynomials are one of the
few nonlinear equations where
00:11:02.650 --> 00:11:06.110
we know how to bound the
possible number of solutions.
00:11:09.220 --> 00:11:11.132
So we can transform
this nonlinear equation
00:11:11.132 --> 00:11:12.840
to the form I showed
you before-- we just
00:11:12.840 --> 00:11:15.900
move 8/3 T to the other
side of this equation.
00:11:15.900 --> 00:11:19.540
We want to find the
roots of this equation.
00:11:19.540 --> 00:11:22.080
So possibly, given pressure
and temperature-- pressure
00:11:22.080 --> 00:11:24.840
and temperature-- find
all the molar volumes that
00:11:24.840 --> 00:11:28.560
satisfy the equation of state.
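
A sketch of that root search, assuming the standard reduced form of the Van der Waals equation, (P + 3/V^2)(V - 1/3) = (8/3) T, since the slide itself is not part of this transcript. It scans a grid of molar volumes above 1/3 for sign changes and refines each one by bisection. The grid limits and function names are illustrative choices.

```python
def vdw_residual(v, p, t):
    """f(V) = (P + 3/V^2)(V - 1/3) - (8/3) T in reduced variables."""
    return (p + 3.0 / v**2) * (v - 1.0 / 3.0) - 8.0 * t / 3.0

def bisect(f, lo, hi, tol=1e-12):
    """Bisection on an interval [lo, hi] where f changes sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def molar_volume_roots(p, t, v_min=0.34, v_max=20.0, n=2000):
    """All sign-change roots of the residual on a uniform grid in V."""
    f = lambda v: vdw_residual(v, p, t)
    grid = [v_min + (v_max - v_min) * i / n for i in range(n + 1)]
    roots = []
    for a, b in zip(grid, grid[1:]):
        if f(a) * f(b) < 0.0:
            roots.append(bisect(f, a, b))
    return roots

print(molar_volume_roots(p=0.65, t=0.9))  # below the critical temperature
print(molar_volume_roots(p=1.0, t=1.2))   # above the critical temperature
```

Multiplying the residual by 3 V^2 gives a cubic in V, which is why the number of roots found can never exceed three.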
00:11:28.560 --> 00:11:30.240
This is actually
overly simplified
00:11:30.240 --> 00:11:32.850
for a particular
physical problem,
00:11:32.850 --> 00:11:35.910
of looking at vapor liquid
coexistence of the Van der
00:11:35.910 --> 00:11:37.650
Waals fluid.
00:11:37.650 --> 00:11:41.490
You can't specify pressure and
temperature independently--
00:11:41.490 --> 00:11:43.830
the saturation pressure,
the coexistence pressure
00:11:43.830 --> 00:11:47.400
depends on the temperature,
per the Gibbs phase rule.
00:11:47.400 --> 00:11:51.300
So actually, phase equilibria
is made up of three parts--
00:11:51.300 --> 00:11:52.740
there's thermal equilibrium.
00:11:52.740 --> 00:11:55.260
I'm going to add two
phases, a gas and a liquid,
00:11:55.260 --> 00:11:56.295
and they have to have
the same temperature,
00:11:56.295 --> 00:11:57.680
otherwise they're
not in equilibrium.
00:11:57.680 --> 00:11:59.400
There's got to be mechanical
equilibrium of two
00:11:59.400 --> 00:12:00.852
phases, the gas and the liquid.
00:12:00.852 --> 00:12:02.310
They better have
the same pressure,
00:12:02.310 --> 00:12:03.750
otherwise one is going
to be pushing harder
00:12:03.750 --> 00:12:04.690
on the other one.
00:12:04.690 --> 00:12:06.980
They'll be in motion--
that's not in equilibrium.
00:12:06.980 --> 00:12:09.480
They've got to have the same
chemical potential-- they have
00:12:09.480 --> 00:12:11.190
to be in chemical equilibrium.
00:12:11.190 --> 00:12:14.590
So there can't be any net mass
flux from one phase to another,
00:12:14.590 --> 00:12:16.132
otherwise, one phase
is going to grow
00:12:16.132 --> 00:12:17.506
and the other is
going to shrink.
00:12:17.506 --> 00:12:19.320
They're not in equilibrium
with each other.
00:12:19.320 --> 00:12:22.860
So actually, the
problem of determining
00:12:22.860 --> 00:12:26.610
vapor liquid coexistence,
in this Van der Waals fluid,
00:12:26.610 --> 00:12:30.240
involves satisfying a number
of different equations, some
00:12:30.240 --> 00:12:32.370
of which are nonlinear,
and are constrained
00:12:32.370 --> 00:12:34.050
by the equation of state.
00:12:36.790 --> 00:12:40.330
Given the temperature, there are
three unknowns-- the pressure,
00:12:40.330 --> 00:12:43.000
and the molar volumes of
the gas and the liquid.
00:12:43.000 --> 00:12:45.660
And there are three nonlinear
equations we have to solve.
00:12:45.660 --> 00:12:47.366
Two of those are the
equation of state,
00:12:47.366 --> 00:12:48.490
in the gas and the liquid--
00:12:48.490 --> 00:12:49.948
I'll show them to
you in a second--
00:12:49.948 --> 00:12:53.059
and the other is this Maxwell
equal area construction,
00:12:53.059 --> 00:12:55.600
which essentially says that the
chemical potential in the two
00:12:55.600 --> 00:12:56.810
phases is equal.
00:12:56.810 --> 00:12:59.770
This is one way of
representing that.
00:12:59.770 --> 00:13:03.520
So we have to solve this
system of nonlinear equations
00:13:03.520 --> 00:13:05.950
for the saturation
pressure, the molar
00:13:05.950 --> 00:13:10.530
volume of the gas or vapor, and
the molar volume of the liquid.
00:13:10.530 --> 00:13:11.780
And these are those equations.
00:13:11.780 --> 00:13:13.770
Here's the equation
of state in the gas,
00:13:13.770 --> 00:13:15.970
here's the equation of
state in the liquid,
00:13:15.970 --> 00:13:19.140
and here's the Maxwell
equal area construction.
00:13:19.140 --> 00:13:22.840
We need to find values of
P sat, V G and V L
00:13:22.840 --> 00:13:25.050
that satisfy all
three equations.
00:13:25.050 --> 00:13:27.780
There's not going to be an
analytical way to do this--
00:13:27.780 --> 00:13:29.638
it has to be done numerically.
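
As a sketch of what those three equations might look like in code, assuming the reduced Van der Waals form P = 8T/(3V - 1) - 3/V^2 and writing the equal area construction as the integral of P dV from V L to V G minus P sat times (V G - V L). The function name and argument order are hypothetical.

```python
import math

def coexistence_residuals(p_sat, v_g, v_l, t):
    """Residuals [f1, f2, f3] of the coexistence problem at reduced temperature t."""
    # f1, f2: equation of state evaluated in the gas and in the liquid.
    f1 = (p_sat + 3.0 / v_g**2) * (v_g - 1.0 / 3.0) - 8.0 * t / 3.0
    f2 = (p_sat + 3.0 / v_l**2) * (v_l - 1.0 / 3.0) - 8.0 * t / 3.0
    # f3: Maxwell equal area construction. The integral of P dV between v_l
    # and v_g has the closed form used below for this equation of state.
    area = ((8.0 * t / 3.0) * math.log((3.0 * v_g - 1.0) / (3.0 * v_l - 1.0))
            + 3.0 / v_g - 3.0 / v_l)
    f3 = area - p_sat * (v_g - v_l)
    return [f1, f2, f3]

t = 0.9
v = 1.0
p = 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2  # pressure on the isotherm at v
print(coexistence_residuals(p, v, v, t))     # equal volumes: all three vanish
print(coexistence_residuals(0.6, 2.0, 0.7, t))  # an arbitrary guess: nonzero
```

Setting the two molar volumes equal satisfies all three equations trivially, so a solver started from a poor guess can end up at a point like that rather than at a true two-phase solution.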
00:13:33.760 --> 00:13:36.070
Here's a simplification
that I can make, though.
00:13:36.070 --> 00:13:37.900
So I can take that
equal area construction,
00:13:37.900 --> 00:13:41.117
and I can solve for P sat
in terms of V G and V L.
00:13:41.117 --> 00:13:42.700
And so that reduces
the dimensionality
00:13:42.700 --> 00:13:45.299
of these equations
from three to two.
00:13:45.299 --> 00:13:47.590
And when it's two dimensional,
I can plot these things,
00:13:47.590 --> 00:13:48.920
so that's helpful.
00:13:48.920 --> 00:13:52.170
So let's plot f 1--
00:13:52.170 --> 00:13:57.190
the equation of state in the gas
as a function of V G and V L.
00:13:57.190 --> 00:14:00.870
Where that's equal to 0,
that's this black curve here.
00:14:00.870 --> 00:14:02.710
Let's plot f 2--
00:14:02.710 --> 00:14:05.350
the equation of
state in the liquid
00:14:05.350 --> 00:14:08.080
as a function of V G and V
L, where that's equal to 0,
00:14:08.080 --> 00:14:10.030
that's this blue curve here.
00:14:10.030 --> 00:14:12.700
And the solutions are where
these curves intersect.
00:14:12.700 --> 00:14:16.150
So we're seeking out the
specific points graphically
00:14:16.150 --> 00:14:17.660
where these curves intersect.
00:14:17.660 --> 00:14:19.430
First, this solution,
and that solution
00:14:19.430 --> 00:14:21.430
aren't the solutions we're
interested in at all.
00:14:21.430 --> 00:14:23.950
That would say that the
molar volume of the gas
00:14:23.950 --> 00:14:25.362
and the liquid is the same--
00:14:25.362 --> 00:14:26.820
that's not really
phase separation.
00:14:26.820 --> 00:14:32.530
We want these heterogeneous
solutions out here.
00:14:32.530 --> 00:14:34.990
So we need some methodology
that can reliably
00:14:34.990 --> 00:14:36.700
take us to those solutions.
00:14:36.700 --> 00:14:38.560
We'll see that
that methodology--
00:14:38.560 --> 00:14:40.720
the most reliable methodology,
and one of the ones
00:14:40.720 --> 00:14:42.690
that converges fastest
to the solutions--
00:14:42.690 --> 00:14:45.289
is called the
Newton-Raphson method.
00:14:45.289 --> 00:14:47.080
But even before we do
that, let's talk more
00:14:47.080 --> 00:14:50.175
about the structure of systems
of nonlinear equations,
00:14:50.175 --> 00:14:52.260
and what sort of
solutions we can expect.
00:14:52.260 --> 00:14:55.075
Does this example make
sense to everyone?
00:14:55.075 --> 00:14:58.150
Have you thought about
this before, maybe?
00:14:58.150 --> 00:15:00.560
Yeah.
00:15:00.560 --> 00:15:04.010
So given a function, which
is a map from R N to R N,
00:15:04.010 --> 00:15:08.630
find the special solution, x
star, such that f of x star
00:15:08.630 --> 00:15:09.200
equals 0.
00:15:09.200 --> 00:15:10.040
That's our task.
00:15:10.040 --> 00:15:12.267
And there could be no solutions.
00:15:12.267 --> 00:15:13.850
There can be one to
an infinite number
00:15:13.850 --> 00:15:16.580
of locally unique
solutions, and there
00:15:16.580 --> 00:15:19.960
can be an infinite
number of solutions.
00:15:19.960 --> 00:15:22.360
A solution is said to
be locally unique
00:15:22.360 --> 00:15:27.610
if I can wrap that solution by
some ball of points, in which
00:15:27.610 --> 00:15:28.930
there are no other solutions.
00:15:28.930 --> 00:15:31.210
That ball can be very,
very small, but as long
00:15:31.210 --> 00:15:33.430
as I can wrap that solution
in some ball of points,
00:15:33.430 --> 00:15:36.220
which are not solutions, we
term that locally unique.
00:15:39.490 --> 00:15:41.910
So consider the
simple function, f 1,
00:15:41.910 --> 00:15:45.240
which depends on two
variables-- x1 and x2,
00:15:45.240 --> 00:15:49.410
and f 2, which depends
on x1 and x2, equals 0.
00:15:49.410 --> 00:15:52.602
And I'm going to plot, in
the x1, x2 plane, where f 1
00:15:52.602 --> 00:15:55.490
and f 2 are equal to 0,
so that's these curves here.
00:15:58.280 --> 00:16:00.250
And here, we have a
locally unique solution--
00:16:00.250 --> 00:16:03.064
we see the curves cross
at exactly one point.
00:16:03.064 --> 00:16:04.480
Here, you can see
these two curves
00:16:04.480 --> 00:16:07.930
are tangent with each other.
00:16:07.930 --> 00:16:09.760
They could be coincident
with each other,
00:16:09.760 --> 00:16:11.800
over some finite distance.
00:16:11.800 --> 00:16:14.210
In which case there's a
lot of solutions that live
00:16:14.210 --> 00:16:18.370
on some locally tangent area.
00:16:18.370 --> 00:16:20.440
Or they could just
touch in one point.
00:16:20.440 --> 00:16:22.630
So they may be tangent,
and the solutions there
00:16:22.630 --> 00:16:25.300
are not locally unique, or
they may touch at one point,
00:16:25.300 --> 00:16:27.450
and the solutions are--
00:16:27.450 --> 00:16:29.780
there's one solution, and
it's locally unique there.
00:16:33.430 --> 00:16:37.220
The reason why we talk about
locally unique solutions is
00:16:37.220 --> 00:16:40.790
it's going to be hard
for a numerical method
00:16:40.790 --> 00:16:43.370
to find anything
that's not locally
00:16:43.370 --> 00:16:45.470
unique in a reliable way.
00:16:45.470 --> 00:16:47.360
Locally unique solutions,
numerical methods
00:16:47.360 --> 00:16:48.660
can find very reliably.
00:16:48.660 --> 00:16:51.710
But if they're not
locally unique?
00:16:51.710 --> 00:16:53.420
My iterative method
could converge
00:16:53.420 --> 00:16:56.690
to any one of these solutions
that lives on this line, any
00:16:56.690 --> 00:16:57.800
of these tangent points.
00:16:57.800 --> 00:16:59.633
And I'm going to have
a hard time predicting
00:16:59.633 --> 00:17:01.280
which one it's going to go to.
00:17:01.280 --> 00:17:04.069
That's a problem if you're
trying to solve something
00:17:04.069 --> 00:17:06.240
reliably over and
over again-- if I
00:17:06.240 --> 00:17:08.750
converge to one of these
solutions, or another solution,
00:17:08.750 --> 00:17:11.150
or another solution,
the data that
00:17:11.150 --> 00:17:15.790
comes out of that process isn't
going to be easy to interpret.
00:17:15.790 --> 00:17:18.640
There's something called the
inverse function theorem, which
00:17:18.640 --> 00:17:22.829
says if f of x
star is equal to 0,
00:17:22.829 --> 00:17:26.550
and the determinant
of this matrix, J,
00:17:26.550 --> 00:17:29.400
which we call the Jacobian,
evaluated at x star
00:17:29.400 --> 00:17:32.700
is not equal to 0,
then x star necessarily
00:17:32.700 --> 00:17:35.050
is a locally unique solution.
00:17:35.050 --> 00:17:37.400
So it's the inverse
function theorem.
00:17:37.400 --> 00:17:42.795
The Jacobian is a matrix of
the partial derivatives of f,
00:17:42.795 --> 00:17:44.940
of the elements of
f, with respect
00:17:44.940 --> 00:17:48.030
to different elements of x.
00:17:48.030 --> 00:17:51.780
So the first row of the
Jacobian is the derivatives
00:17:51.780 --> 00:17:54.180
of the first element
of f, with respect
00:17:54.180 --> 00:17:56.940
to all the elements of
x, and the other rows
00:17:56.940 --> 00:17:58.890
proceed accordingly.
00:17:58.890 --> 00:18:01.110
If the determinant
of this matrix,
00:18:01.110 --> 00:18:03.750
evaluated at the solution,
is not equal to 0,
00:18:03.750 --> 00:18:06.000
then this solution is
necessarily locally unique.
00:18:06.000 --> 00:18:07.499
That's the inverse
function theorem.
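
A sketch of how that check might be carried out numerically: approximate the Jacobian by central differences and look at its determinant at a known root. The example function, the step size `h`, and the helper names are assumptions made for illustration.

```python
def jacobian_fd(f, x, h=1e-6):
    """Central-difference approximation of J[i][j] = d f_i / d x_j."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        cols.append([(a - b) / (2.0 * h) for a, b in zip(fp, fm)])
    # cols[j][i] holds d f_i / d x_j; transpose so that row i belongs to f_i.
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def f(x):
    """Hypothetical example: a circle of radius sqrt(2) and the line x1 = x2."""
    x1, x2 = x
    return [x1**2 + x2**2 - 2.0, x1 - x2]

root = [1.0, 1.0]
J = jacobian_fd(f, root)
print(f(root))   # both components vanish at the root
print(det2(J))   # approximately -4, nonzero, so the root is locally unique
```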
00:18:11.430 --> 00:18:13.280
The Jacobian describes
the rate of change
00:18:13.280 --> 00:18:15.240
of this vector-valued
function with respect
00:18:15.240 --> 00:18:18.920
to all of its
independent variables.
00:18:18.920 --> 00:18:20.930
And you may find that,
for some solutions,
00:18:20.930 --> 00:18:23.459
the determinant of the
Jacobian is equal to 0.
00:18:23.459 --> 00:18:25.250
We can't really say
what's going on there--
00:18:25.250 --> 00:18:26.900
the solution may
be locally unique,
00:18:26.900 --> 00:18:28.700
it may not be locally unique.
00:18:28.700 --> 00:18:31.602
I'm going to provide you
some examples in a second.
00:18:31.602 --> 00:18:33.060
And most numerical
methods are only
00:18:33.060 --> 00:18:36.550
going to find one of these locally
unique solutions at a time.
00:18:36.550 --> 00:18:39.930
If we have some
solutions that aren't locally unique,
00:18:39.930 --> 00:18:41.370
that'll cause us problems.
00:18:41.370 --> 00:18:44.200
So we tend to want to work with
functions that have locally
00:18:44.200 --> 00:18:47.970
unique solutions to begin with.
00:18:47.970 --> 00:18:48.887
OK, here's an example.
00:18:48.887 --> 00:18:50.386
Oh, you have your
notes, so you know
00:18:50.386 --> 00:18:51.800
the formula for the Jacobian.
00:18:51.800 --> 00:18:53.383
Compute the Jacobian
of this function.
00:19:34.400 --> 00:19:39.790
So this function has a root
at x1 equals 0, x2 equals 0.
00:19:39.790 --> 00:19:41.440
If you think
graphically about what
00:19:41.440 --> 00:19:43.330
each of these little
functions represents,
00:19:43.330 --> 00:19:46.240
you would agree that that
root is locally unique-- it's
00:19:46.240 --> 00:19:49.420
just one point
where both elements
00:19:49.420 --> 00:19:52.420
of this vector-valued
function are equal to 0.
00:19:52.420 --> 00:19:55.441
Here's what the Jacobian
of this function should be.
00:19:55.441 --> 00:19:57.940
If you take the derivative of
the first element with respect
00:19:57.940 --> 00:20:00.683
to x1, and then x2,
take the derivative
00:20:00.683 --> 00:20:03.950
of the second element with
respect to x1 and then x2--
00:20:03.950 --> 00:20:08.290
at the solution, at the root of
this function, where x1 is 0,
00:20:08.290 --> 00:20:12.100
and x2 is 0, the Jacobian
is a matrix of zeros.
00:20:12.100 --> 00:20:14.010
Its determinant is 0.
00:20:14.010 --> 00:20:16.840
But the solution
is locally unique.
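The function on the slide is not reproduced in these captions. As an illustration only, here is one assumed function (not necessarily the one shown in lecture) that behaves the same way: its only root is the origin, yet its Jacobian vanishes there.

```latex
f(x) = \begin{pmatrix} x_1^2 \\ x_2^2 \end{pmatrix},
\qquad
J(x) = \begin{pmatrix} 2x_1 & 0 \\ 0 & 2x_2 \end{pmatrix},
\qquad
J(0,0) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},
\quad \det J(0,0) = 0 .
```

Both components are zero only at x1 = 0, x2 = 0, so the root is locally unique even though the determinant test says nothing.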
00:20:16.840 --> 00:20:19.330
The inverse function
theorem only
00:20:19.330 --> 00:20:22.690
tells us about what happens
when the determinant's not
00:20:22.690 --> 00:20:24.610
equal to 0.
00:20:24.610 --> 00:20:27.220
If the determinant's
not 0, then we
00:20:27.220 --> 00:20:30.380
have a locally unique solution.
00:20:30.380 --> 00:20:32.470
But a solution may be locally
unique even if the determinant
00:20:32.470 --> 00:20:33.790
is equal to 0.
00:20:33.790 --> 00:20:34.770
Does that make sense?
00:20:34.770 --> 00:20:37.260
You see how that plays out?
00:20:37.260 --> 00:20:39.240
OK.
00:20:39.240 --> 00:20:41.069
There's a physical
way to think about--
00:20:41.069 --> 00:20:43.360
or a geometric way to think
about this inverse function
00:20:43.360 --> 00:20:43.859
theorem.
00:20:43.859 --> 00:20:45.740
So think about the
linear equation,
00:20:45.740 --> 00:20:48.754
f of x is A x minus b.
00:20:48.754 --> 00:20:51.170
You can show-- and you should
actually work through this--
00:20:51.170 --> 00:20:52.820
that the Jacobian
of this function
00:20:52.820 --> 00:20:58.180
is just the matrix A. It says
how the function changes,
00:20:58.180 --> 00:21:00.620
with respect to
small changes in x.
00:21:00.620 --> 00:21:01.840
Well, that's just A--
00:21:01.840 --> 00:21:05.090
this is a linear function.
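One way to work through this claim is numerically. The sketch below is an illustration under stated assumptions: the helper `numerical_jacobian`, the matrix `A`, the vector `b`, and the evaluation point are all made up for the example. It estimates the Jacobian of f(x) = A x - b by forward differences and compares it with A.

```python
def numerical_jacobian(f, x, h=1e-6):
    """Forward-difference estimate of J[i][j] = d f_i / d x_j at the point x."""
    fx = f(x)
    n_rows, n_cols = len(fx), len(x)
    J = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        x_shift = list(x)
        x_shift[j] += h              # perturb one coordinate at a time
        f_shift = f(x_shift)
        for i in range(n_rows):
            J[i][j] = (f_shift[i] - fx[i]) / h
    return J


# Hypothetical 2x2 linear system, chosen only for illustration.
A = [[2.0, 1.0], [-1.0, 3.0]]
b = [1.0, 4.0]


def f_linear(x):
    """f(x) = A x - b."""
    return [sum(A[i][j] * x[j] for j in range(2)) - b[i] for i in range(2)]


J = numerical_jacobian(f_linear, [0.3, -0.7])
```

For a linear function the estimate matches A up to round-off, and it does so at any evaluation point, since the Jacobian does not depend on x.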
00:21:05.090 --> 00:21:07.000
So the equation,
f of x equals 0,
00:21:07.000 --> 00:21:09.310
has a locally unique
solution when the determinant
00:21:09.310 --> 00:21:12.400
of the Jacobian-- which
is the determinant of A--
00:21:12.400 --> 00:21:13.666
is not equal to 0.
00:21:13.666 --> 00:21:15.040
But you already
knew that, right?
00:21:15.040 --> 00:21:18.530
We already talked
through linear algebra.
00:21:18.530 --> 00:21:22.370
And so you know when this
matrix A is singular,
00:21:22.370 --> 00:21:24.280
then we can't invert
this system of equations
00:21:24.280 --> 00:21:27.190
and find a unique solution
in the first place.
00:21:27.190 --> 00:21:29.890
So the inverse function
theorem is nothing more
00:21:29.890 --> 00:21:32.950
than an extension of what we
learned about when functions
00:21:32.950 --> 00:21:34.720
are and aren't invertible.
00:21:34.720 --> 00:21:37.390
We get a locally unique
solution when A is invertible.
00:21:39.940 --> 00:21:43.480
In the neighborhood of f
of x, in the neighborhood
00:21:43.480 --> 00:21:47.410
of a root of f of x, we can
often approximate the function
00:21:47.410 --> 00:21:49.360
as being linear.
00:21:49.360 --> 00:21:52.120
We can treat it as though it's
a system of linear equations,
00:21:52.120 --> 00:21:54.989
very close to that root.
00:21:54.989 --> 00:21:57.280
And then the things that we
learned from linear algebra
00:21:57.280 --> 00:22:01.540
are inherited by these
linearized solutions.
00:22:01.540 --> 00:22:05.680
So here's this set of curves
that I showed you before.
00:22:05.680 --> 00:22:08.920
Near this root, let's zoom in--
00:22:08.920 --> 00:22:10.510
let's zoom in.
00:22:10.510 --> 00:22:11.980
These lines look
mostly straight.
00:22:11.980 --> 00:22:16.855
It's like the place where
two planes intersect--
00:22:16.855 --> 00:22:19.920
intersect this x1, x2 plane--
they each intersect at a line,
00:22:19.920 --> 00:22:23.010
and the crossing of those
lines is the solution.
00:22:23.010 --> 00:22:24.010
And it's locally unique.
00:22:24.010 --> 00:22:29.880
Because these two planes
span different subspaces.
00:22:29.880 --> 00:22:31.770
Here's the case
where we may have
00:22:31.770 --> 00:22:33.180
non-locally unique [INAUDIBLE].
00:22:33.180 --> 00:22:36.090
Zoom in on this root, and
very close to this root,
00:22:36.090 --> 00:22:37.650
well, it's hard to tell.
00:22:37.650 --> 00:22:40.620
Maybe these two planes are
coincident with each other,
00:22:40.620 --> 00:22:42.970
and they intersect and
form the same line--
00:22:42.970 --> 00:22:45.360
in which case they may
not be locally unique.
00:22:45.360 --> 00:22:47.970
Maybe if I zoom in close enough,
I see, no, actually, they
00:22:47.970 --> 00:22:50.053
have a slightly different
orientation with respect
00:22:50.053 --> 00:22:53.340
to each other, and there is a
locally unique solution there.
00:22:53.340 --> 00:22:56.760
It's difficult to tell here.
00:22:56.760 --> 00:22:59.490
So these cases where the curves
cross are easy to determine.
00:22:59.490 --> 00:23:00.960
These are the ones that
the inverse function
00:23:00.960 --> 00:23:01.967
theorem tells us about.
00:23:01.967 --> 00:23:03.800
These ones are a little
harder to work with.
00:23:07.630 --> 00:23:11.950
So I mentioned that you can zoom
in, and look close to a root,
00:23:11.950 --> 00:23:14.090
and approximate the
function as linear.
00:23:14.090 --> 00:23:16.680
This is a process
called linearization.
00:23:16.680 --> 00:23:19.450
You've seen this
for 1-D functions--
00:23:19.450 --> 00:23:22.510
f of x, at a point
x plus delta x,
00:23:22.510 --> 00:23:26.815
is f of x plus its
derivative times delta x.
00:23:29.480 --> 00:23:32.460
And this'll typically be valid
for reasonably well-behaved
00:23:32.460 --> 00:23:34.200
functions-- this sort
of a linearization
00:23:34.200 --> 00:23:36.660
is going to be valid
as delta x goes to 0.
00:23:36.660 --> 00:23:38.640
So as long as I haven't
moved too far away
00:23:38.640 --> 00:23:42.870
from the point x, I can
approximate my function
00:23:42.870 --> 00:23:45.894
in the neighborhood of x
using this linearization.
00:23:45.894 --> 00:23:47.310
You know, turns
out the same thing
00:23:47.310 --> 00:23:49.050
is true for nonlinear functions.
00:23:49.050 --> 00:23:53.070
So f of x plus delta
x is f of x plus--
00:23:53.070 --> 00:23:56.580
well, I need the derivatives of
my function with respect to x,
00:23:56.580 --> 00:23:59.020
and those derivatives are
partial derivatives now,
00:23:59.020 --> 00:24:01.440
because we're in higher
dimensional spaces.
00:24:01.440 --> 00:24:04.290
That's the Jacobian
multiplied by this vector
00:24:04.290 --> 00:24:06.300
of displacements, delta x.
00:24:06.300 --> 00:24:08.040
And this will typically
be valid as long
00:24:08.040 --> 00:24:11.250
as the length of this
delta x is not too big--
00:24:11.250 --> 00:24:14.130
as long as I haven't moved
too far away from the point
00:24:14.130 --> 00:24:16.860
I'm interested in, f of x,
this will be a reasonably good
00:24:16.860 --> 00:24:17.580
approximation.
00:24:17.580 --> 00:24:21.330
As long as our functions
are well-behaved.
00:24:21.330 --> 00:24:25.762
There's an error that's incurred
in making these approximations.
00:24:25.762 --> 00:24:27.720
And for a general function
that's well-behaved,
00:24:27.720 --> 00:24:30.480
that error is going to be
order delta x squared--
00:24:30.480 --> 00:24:33.470
in either the 1-D or
the multi-dimensional case.
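A quick 1-D check of the order delta x squared claim, as a sketch. The test function (the exponential), the point `x0`, and the two step sizes are assumptions picked for illustration. If the error really scales like delta x squared, halving delta x should cut the error by roughly a factor of 4.

```python
import math


def linearization_error(f, dfdx, x, dx):
    """Absolute error of the linear model f(x) + f'(x) * dx at the point x + dx."""
    linear_model = f(x) + dfdx(x) * dx
    return abs(f(x + dx) - linear_model)


x0 = 1.0  # illustrative expansion point
err_big = linearization_error(math.exp, math.exp, x0, 1e-2)
err_small = linearization_error(math.exp, math.exp, x0, 5e-3)

# Second-order error: halving dx should shrink the error about fourfold.
ratio = err_big / err_small
```

The ratio is not exactly 4 because the cubic and higher terms of the Taylor series still contribute a little at these step sizes.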
00:24:35.872 --> 00:24:37.330
And this sort of
an expansion, it's
00:24:37.330 --> 00:24:40.990
just part of a Taylor expansion
for each component of f of x.
00:24:40.990 --> 00:24:43.270
So we take element i of f.
00:24:43.270 --> 00:24:45.870
I want to know its
value at x plus delta x.
00:24:45.870 --> 00:24:47.650
That's its value at x.
00:24:47.650 --> 00:24:50.620
Plus the sum of partial
derivatives of that element,
00:24:50.620 --> 00:24:54.400
with respect to each of the
elements of x, times delta x--
00:24:54.400 --> 00:24:56.320
each of those delta x's.
00:24:56.320 --> 00:24:57.820
Plus, there's some
higher-order term
00:24:57.820 --> 00:25:01.090
in this Taylor expansion,
which is quadratic in delta x
00:25:01.090 --> 00:25:01.610
instead.
00:25:01.610 --> 00:25:03.440
There's a cubic term, and so on.
00:25:03.440 --> 00:25:05.950
These quadratic terms are
what give rise to this order,
00:25:05.950 --> 00:25:09.100
delta x squared error.
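Written out, the expansion described above looks like this. The symbols are the ones used in the lecture (f, x, delta x, the Jacobian J); N for the number of unknowns is an assumed label.

```latex
f_i(x + \Delta x) = f_i(x)
  + \sum_{j=1}^{N} \frac{\partial f_i}{\partial x_j}(x)\, \Delta x_j
  + O\!\left(\lVert \Delta x \rVert^2\right),
\qquad
f(x + \Delta x) = f(x) + J(x)\, \Delta x + O\!\left(\lVert \Delta x \rVert^2\right),
\quad
J_{ij}(x) = \frac{\partial f_i}{\partial x_j}(x).
```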
00:25:09.100 --> 00:25:10.570
In higher dimensions,
we typically
00:25:10.570 --> 00:25:12.730
don't worry about
these quadratic terms.
00:25:12.730 --> 00:25:15.250
We're pretty satisfied
with linearization
00:25:15.250 --> 00:25:17.740
of our system of equations.
00:25:17.740 --> 00:25:20.680
Sometimes for 1-D
nonlinear functions,
00:25:20.680 --> 00:25:22.210
you can take
advantage of knowing
00:25:22.210 --> 00:25:24.970
what these quadratic terms
are to do some funny things.
00:25:24.970 --> 00:25:27.730
But in many dimensions,
you usually don't use that.
00:25:27.730 --> 00:25:30.640
Usually you just think about
linearizing the solution.
00:25:30.640 --> 00:25:33.440
So if I know where the solution
is, if I know it's close,
00:25:33.440 --> 00:25:36.010
if I can figure out points
that are close to the solution,
00:25:36.010 --> 00:25:38.760
then I can linearize the
function in that neighborhood--
00:25:38.760 --> 00:25:41.440
I can find the solution to the
linearized equation, instead.
00:25:41.440 --> 00:25:44.394
That's going to be suitably
close to the exact solution.
00:25:44.394 --> 00:25:45.310
Does that make sense?
00:25:49.970 --> 00:25:54.200
Nonlinear equations, like I
said, are solved iteratively.
00:25:54.200 --> 00:25:58.310
Which means we make a map,
an algorithmic map,
00:25:58.310 --> 00:26:01.350
which takes some
value x i and generates
00:26:01.350 --> 00:26:05.390
some new value x i plus 1, which
is a better approximation
00:26:05.390 --> 00:26:07.580
for the solution we're after.
00:26:07.580 --> 00:26:11.420
And we designed the map
so that the root, x star,
00:26:11.420 --> 00:26:13.230
is what's called a
fixed point of the map.
00:26:13.230 --> 00:26:15.430
So if I put x star
in on this side,
00:26:15.430 --> 00:26:18.530
I get x star out
on the other side.
00:26:18.530 --> 00:26:23.420
By design, the root is a
fixed point of the map.
00:26:23.420 --> 00:26:26.420
The map may converge,
or it may not converge,
00:26:26.420 --> 00:26:28.820
but the root is a fixed point.
00:26:28.820 --> 00:26:31.040
And we'll stop iterating
when the map is sufficiently
00:26:31.040 --> 00:26:32.090
converged.
00:26:32.090 --> 00:26:34.580
You guys came up with
two different criteria
00:26:34.580 --> 00:26:35.840
for stopping.
00:26:35.840 --> 00:26:38.470
One is called the
function norm criteria.
00:26:38.470 --> 00:26:41.204
I look at how big
my function is--
00:26:41.204 --> 00:26:42.620
I'm trying to find
the place where
00:26:42.620 --> 00:26:43.910
the function is equal to 0.
00:26:43.910 --> 00:26:47.030
So I look at how big,
in the norm space,
00:26:47.030 --> 00:26:50.090
my function is for my
current best solution,
00:26:50.090 --> 00:26:52.395
and ask if it's smaller
than some tolerance epsilon.
00:26:52.395 --> 00:26:54.020
If it is, then I say,
well, my function
00:26:54.020 --> 00:26:56.480
is sufficiently close to 0--
00:26:56.480 --> 00:26:58.790
I'm happy with this solution.
00:26:58.790 --> 00:27:00.650
The solution is close
enough to satisfying
00:27:00.650 --> 00:27:04.899
the original equation that
I'll accept it, and I stop.
00:27:04.899 --> 00:27:07.190
The other criteria, it's
called the step norm criteria.
00:27:07.190 --> 00:27:09.470
I look at two successive
approximations
00:27:09.470 --> 00:27:10.180
for the solution.
00:27:10.180 --> 00:27:13.190
I take their
difference, and ask,
00:27:13.190 --> 00:27:14.990
is the norm of that
difference smaller
00:27:14.990 --> 00:27:18.680
than either some
absolute tolerance,
00:27:18.680 --> 00:27:22.070
or some relative tolerance,
multiplied by the norm
00:27:22.070 --> 00:27:24.650
of my current solution?
00:27:24.650 --> 00:27:31.850
So suppose x is a large number,
that spacing between these x's
00:27:31.850 --> 00:27:35.360
may be quite big, but the
relative spacing may actually
00:27:35.360 --> 00:27:37.410
be quite small.
00:27:37.410 --> 00:27:39.620
And if the relative
spacing is small enough,
00:27:39.620 --> 00:27:42.350
you might say, well, this
is sufficiently converged.
00:27:42.350 --> 00:27:46.100
And so that's where this
relative error, relative error
00:27:46.100 --> 00:27:48.650
tolerance, comes into play
in the step norm criteria.
00:27:48.650 --> 00:27:52.310
Suppose x is a small
number, close to 0 instead.
00:27:52.310 --> 00:27:56.180
These steps may be very tiny--
00:27:56.180 --> 00:27:59.180
these steps may be quite tiny.
00:27:59.180 --> 00:28:03.140
They may satisfy this
relative criteria quite well,
00:28:03.140 --> 00:28:05.270
but you may want to put
some absolute tolerance
00:28:05.270 --> 00:28:08.780
on how far these steps are
before you stop instead.
00:28:08.780 --> 00:28:12.032
Because these x's may be
small in and of themselves.
00:28:12.032 --> 00:28:13.490
And so this one is
easy to satisfy,
00:28:13.490 --> 00:28:14.656
but this one becomes harder.
00:28:14.656 --> 00:28:16.316
So you usually
use both of these.
00:28:16.316 --> 00:28:17.690
Sometimes you have
solutions that
00:28:17.690 --> 00:28:19.670
are converging
toward small numbers,
00:28:19.670 --> 00:28:22.050
and then the absolute error
tolerance becomes important.
00:28:22.050 --> 00:28:23.425
Sometimes you have
solutions that
00:28:23.425 --> 00:28:25.070
are converging
towards large numbers,
00:28:25.070 --> 00:28:27.650
and so the relative error
tolerance becomes important.
00:28:27.650 --> 00:28:29.480
Does that make sense?
00:28:29.480 --> 00:28:32.030
Of course, you can't just use
this one, or just use that one.
00:28:32.030 --> 00:28:35.780
You typically like
to use all of these.
00:28:35.780 --> 00:28:38.540
Because they can fail.
00:28:38.540 --> 00:28:41.474
And the function
norm criterion can fail--
00:28:41.474 --> 00:28:42.890
here's an example
where I'm taking
00:28:42.890 --> 00:28:45.830
some iterations, some
approximate solutions
00:28:45.830 --> 00:28:49.460
that are headed towards the
actual root of this function.
00:28:49.460 --> 00:28:53.240
And at some point, I
find that this solution
00:28:53.240 --> 00:28:55.820
is within epsilon of 0.
00:28:55.820 --> 00:28:57.974
And so I'd like to
accept this solution,
00:28:57.974 --> 00:28:59.390
but graphically
it looks like it's
00:28:59.390 --> 00:29:00.860
very far away from the root.
00:29:00.860 --> 00:29:02.360
So this is a case
where the function
00:29:02.360 --> 00:29:05.690
has a very shallow slope.
00:29:05.690 --> 00:29:08.690
It's a very shallow slope,
and the functional criteria,
00:29:08.690 --> 00:29:10.670
not so good, really.
00:29:10.670 --> 00:29:14.425
I call this a solution, but it's
quite a ways away from x star.
00:29:14.425 --> 00:29:16.550
So sometimes it's going to
work, but sometimes it's
00:29:16.550 --> 00:29:18.520
not going to work.
00:29:18.520 --> 00:29:20.230
Here's the step norm
criteria-- here,
00:29:20.230 --> 00:29:23.519
I have a function
nowhere near a root--
00:29:23.519 --> 00:29:25.310
I have no idea where
I am on this function,
00:29:25.310 --> 00:29:27.060
I don't know what value
this function has,
00:29:27.060 --> 00:29:30.527
but my steps suddenly
got small enough
00:29:30.527 --> 00:29:32.610
that they're smaller than
this absolute tolerance,
00:29:32.610 --> 00:29:34.818
or they're smaller than the
relative error tolerance.
00:29:34.818 --> 00:29:36.340
I might say, OK, let's stop.
00:29:36.340 --> 00:29:39.239
I'm not taking very
large steps anymore,
00:29:39.239 --> 00:29:40.780
this seems like a
good place to quit.
00:29:40.780 --> 00:29:42.792
But actually, my
function just after this
00:29:42.792 --> 00:29:45.250
didn't go to 0 at all, it curved
up and went the other way.
00:29:45.250 --> 00:29:47.770
There's not even
a solution nearby.
00:29:47.770 --> 00:29:49.810
So both of these
things can fail,
00:29:49.810 --> 00:29:51.550
and we try to use
both of them instead
00:29:51.550 --> 00:29:54.700
to evaluate whether we
have a reasonable solution
00:29:54.700 --> 00:29:57.000
to our nonlinear
equation or not.
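A minimal sketch of using both stopping criteria together. The function name `is_converged`, the default tolerances, and the choice to accept a step that passes either the absolute or the relative test are assumptions for illustration, not something prescribed in the lecture.

```python
import math


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def is_converged(x_new, x_old, f_new, f_tol=1e-8, abs_tol=1e-10, rel_tol=1e-8):
    """Accept only if BOTH the function norm and the step norm are small."""
    # Function norm criterion: is f(x_new) close enough to zero?
    small_function = norm(f_new) <= f_tol
    # Step norm criterion: is the step below an absolute tolerance, or below
    # a relative tolerance scaled by the size of the current iterate?
    step = norm([a - b for a, b in zip(x_new, x_old)])
    small_step = step <= max(abs_tol, rel_tol * norm(x_new))
    return small_function and small_step
```

Requiring both tests guards against the two failure modes described above: a shallow function that is nearly zero far from the root, and steps that stall where the function is nowhere near zero.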
00:29:57.000 --> 00:29:57.610
Make sense?
00:30:00.750 --> 00:30:04.114
Are there any questions about
that before I go on?
00:30:04.114 --> 00:30:05.998
No.
00:30:05.998 --> 00:30:07.420
OK.
00:30:07.420 --> 00:30:09.580
We also talk oftentimes
about the rate
00:30:09.580 --> 00:30:13.780
of convergence of the iterative
process that we're using.
00:30:13.780 --> 00:30:18.239
We might say it converges
linearly, or quadratically.
00:30:18.239 --> 00:30:19.780
And the rate of
convergence is always
00:30:19.780 --> 00:30:24.310
assessed by looking
at the difference
00:30:24.310 --> 00:30:28.140
between successive--
well, we look
00:30:28.140 --> 00:30:31.550
at the ratio of differences
for successive approximations.
00:30:31.550 --> 00:30:35.025
So here's the difference between
my best approximation, step
00:30:35.025 --> 00:30:37.950
i plus 1 minus the
exact solution,
00:30:37.950 --> 00:30:41.880
normed, divided by my best
approximation at step i,
00:30:41.880 --> 00:30:46.770
minus the exact solution normed,
and raised to some power, q.
00:30:46.770 --> 00:30:51.650
And as I go to very large
numbers of iterations, i--
00:30:51.650 --> 00:30:54.240
this limit should
be i, I apologize.
00:30:54.240 --> 00:30:55.845
I'll fix that in
the notes online,
00:30:55.845 --> 00:30:58.580
but this limit should be i--
as i, the number of steps,
00:30:58.580 --> 00:31:00.380
gets very large.
00:31:00.380 --> 00:31:04.470
This ratio should
converge to some constant.
00:31:04.470 --> 00:31:06.254
And the ratio will
converge to a constant
00:31:06.254 --> 00:31:07.920
when I choose the
right power of q here.
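In symbols, the ratio just described is the following, with x star the exact solution, q the power, and C the limiting constant as named in the lecture.

```latex
\lim_{i \to \infty}
\frac{\lVert x_{i+1} - x^{*} \rVert}{\lVert x_{i} - x^{*} \rVert^{\,q}} = C
```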
00:31:10.750 --> 00:31:14.491
So when this limit exists,
and it doesn't go to 0,
00:31:14.491 --> 00:31:16.490
we can identify what sort
of convergence we get.
00:31:16.490 --> 00:31:19.370
So if q equals 1,
and C is smaller
00:31:19.370 --> 00:31:21.544
than 1, we say that
convergence is linear,
00:31:21.544 --> 00:31:22.710
what is that saying, really?
00:31:22.710 --> 00:31:25.460
This top step here is
the absolute error,
00:31:25.460 --> 00:31:27.680
in approximation i plus 1.
00:31:27.680 --> 00:31:31.660
This bottom step here is the
absolute error in step i--
00:31:31.660 --> 00:31:34.110
remember q is one
for linear convergence.
00:31:34.110 --> 00:31:38.870
So the ratio of absolute errors,
as long as that's less than 1--
00:31:38.870 --> 00:31:42.570
I'm converging, I'm moving
my way towards the solution.
00:31:42.570 --> 00:31:45.020
And we say that rate is linear.
00:31:45.020 --> 00:31:48.140
If C is 10 to the minus
1, then each iteration
00:31:48.140 --> 00:31:52.070
will be one digit more
accurate than the previous one.
00:31:52.070 --> 00:31:54.440
The absolute error will
be 10 times smaller
00:31:54.440 --> 00:31:56.720
in the next iteration
versus the previous one.
00:31:56.720 --> 00:31:57.710
That would be great--
00:31:57.710 --> 00:31:59.150
usually C isn't that small.
00:32:02.210 --> 00:32:05.850
If this power, for which
this limit exists, q,
00:32:05.850 --> 00:32:10.140
is bigger than 1, we say the
convergence is super linear.
00:32:10.140 --> 00:32:12.750
If q is 2, which
we'll see is something
00:32:12.750 --> 00:32:15.540
that results from the
Newton-Raphson method,
00:32:15.540 --> 00:32:18.960
then we say convergence
is quadratic.
00:32:18.960 --> 00:32:22.170
What that means is the number of
accurate digits in my solution
00:32:22.170 --> 00:32:25.060
will actually double
with each iteration.
00:32:25.060 --> 00:32:28.320
Linear, with C equal to
10 to the minus 1,
00:32:28.320 --> 00:32:30.910
I get one digit per iteration.
00:32:30.910 --> 00:32:33.910
Quadratic, I double the number
of digits per iteration--
00:32:33.910 --> 00:32:35.430
I have one digit
on one iteration,
00:32:35.430 --> 00:32:37.740
I get two the next one,
and four the next one,
00:32:37.740 --> 00:32:39.120
and eight the next one.
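The digit counts above can be reproduced with a toy error model. This is a sketch under assumptions: the errors are taken to obey error_next = C * error^q exactly, the starting error is 0.1, and C is 0.1 for the linear case and 1 for the quadratic case. Real iterations only approach this behavior in the limit.

```python
import math


def accurate_digits(error):
    """Roughly how many decimal digits are correct for a given absolute error."""
    return round(-math.log10(error))


def simulate(q, C, e0=0.1, n_iter=4):
    """Idealized error sequence e_{i+1} = C * e_i**q, reported as digit counts."""
    errors = [e0]
    for _ in range(n_iter):
        errors.append(C * errors[-1] ** q)
    return [accurate_digits(e) for e in errors]


linear_digits = simulate(q=1, C=0.1)      # gains one digit per iteration
quadratic_digits = simulate(q=2, C=1.0)   # digit count doubles per iteration
```

Under these assumed constants the linear sequence adds one digit each time, while the quadratic sequence goes from one digit to two, four, eight, and sixteen.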
00:32:39.120 --> 00:32:44.130
So quadratic convergence
is marvelous.
00:32:44.130 --> 00:32:46.320
Linear convergence, that's OK.
00:32:46.320 --> 00:32:49.210
That's about the minimum
you'd be willing to accept.
00:32:49.210 --> 00:32:50.910
Quadratic convergence
is great, so we
00:32:50.910 --> 00:32:54.030
aim for methods that try
to have these higher order
00:32:54.030 --> 00:32:54.780
convergences.
00:32:54.780 --> 00:32:57.787
So you really quickly get
highly accurate solutions.
00:32:57.787 --> 00:32:59.370
You can go back and
look at your notes
00:32:59.370 --> 00:33:00.980
and see that the
Jacobi method,
00:33:00.980 --> 00:33:04.950
and the Gauss-Seidel method both
show linear convergence rates.
00:33:04.950 --> 00:33:07.902
They're linear methods.
00:33:07.902 --> 00:33:09.710
Does this make sense?
00:33:09.710 --> 00:33:11.145
OK.
00:33:11.145 --> 00:33:12.770
So I mentioned
Newton-Raphson.
00:33:12.770 --> 00:33:14.450
Hopefully, somebody
at some point
00:33:14.450 --> 00:33:16.220
told you about the
Newton-Raphson method
00:33:16.220 --> 00:33:19.790
for solving at least
one-dimensional, nonlinear
00:33:19.790 --> 00:33:20.450
equations.
00:33:20.450 --> 00:33:22.700
It goes like this, though.
00:33:22.700 --> 00:33:27.440
You say, I guess my solution is
close to this green point here.
00:33:27.440 --> 00:33:31.650
Let me linearize my
function at that point,
00:33:31.650 --> 00:33:34.830
and find where that linear
approximation has a root--
00:33:34.830 --> 00:33:36.710
which is this next green point.
00:33:36.710 --> 00:33:39.020
And then repeat
that process here.
00:33:39.020 --> 00:33:42.320
I find the linearization of
my function, this pink arrow.
00:33:42.320 --> 00:33:45.620
I look for where that
linear function has a root,
00:33:45.620 --> 00:33:47.390
and that's my next
best approximation.
00:33:47.390 --> 00:33:49.850
And I repeat this
process over and over,
00:33:49.850 --> 00:33:51.890
and it will reliably--
00:33:51.890 --> 00:33:56.320
under certain circumstances--
converge to the root.
00:33:56.320 --> 00:33:58.610
What does that look like,
in terms of the equations?
00:33:58.610 --> 00:34:02.420
So I linearized
my function, so I
00:34:02.420 --> 00:34:06.200
want to approximate
f of x at i plus 1,
00:34:06.200 --> 00:34:09.050
in terms of f of x at
i-- so it's f of x at i,
00:34:09.050 --> 00:34:12.210
plus the derivative, multiplied
by the difference between x
00:34:12.210 --> 00:34:14.050
i plus 1 and x i.
00:34:14.050 --> 00:34:16.820
And I say, find the place
where this approximation
00:34:16.820 --> 00:34:20.270
is equal to 0,
and determine what
00:34:20.270 --> 00:34:23.679
the next point that I'm going to
use to approximate my solution
00:34:23.679 --> 00:34:24.179
is.
00:34:24.179 --> 00:34:27.710
So I solve for x i plus
1, in terms of x i.
00:34:27.710 --> 00:34:30.840
How big of a step do I take
from x i to x i plus 1?
00:34:30.840 --> 00:34:34.460
It's this big, so the ratio of
the function to its derivative
00:34:34.460 --> 00:34:36.839
at x i.
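The 1-D update just derived, x i plus 1 equals x i minus f over its derivative, can be sketched as follows. The function name `newton_1d`, the tolerance, the iteration cap, and the test problem x squared minus 2 are assumptions for illustration.

```python
def newton_1d(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson in one dimension: x_{i+1} = x_i - f(x_i) / f'(x_i)."""
    x = x0
    for _ in range(max_iter):
        slope = dfdx(x)
        if slope == 0.0:
            # The linearization is flat and has no root, so no step exists.
            raise ZeroDivisionError("derivative is zero; Newton step undefined")
        step = -f(x) / slope
        x += step
        # Stop only when both the step and the function value are small.
        if abs(step) <= tol and abs(f(x)) <= tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")


# Illustrative problem: the positive root of x**2 - 2 is sqrt(2).
root = newton_1d(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```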
00:34:36.839 --> 00:34:42.110
And the derivative does the job
of telling me which direction I
00:34:42.110 --> 00:34:43.850
should step in.
00:34:43.850 --> 00:34:45.889
Derivative gives
me directionality,
00:34:45.889 --> 00:34:50.120
and this ratio here tells me
the magnitude of the step.
00:34:50.120 --> 00:34:52.010
The magnitudes,
you know, they're
00:34:52.010 --> 00:34:54.719
not very good oftentimes,
because these functions
00:34:54.719 --> 00:34:57.260
that we're trying to
solve aren't very linear.
00:34:57.260 --> 00:34:58.940
Usually they're
highly nonlinear.
00:34:58.940 --> 00:35:01.457
What's really helpful is
getting the direction right.
00:35:01.457 --> 00:35:04.040
You could go right, you could
go left-- only one of those ways
00:35:04.040 --> 00:35:05.480
is getting you to the root.
00:35:05.480 --> 00:35:06.980
Newton-Raphson has
this advantage--
00:35:06.980 --> 00:35:09.360
it always points you
in the right direction.
00:35:09.360 --> 00:35:10.880
OK?
00:35:10.880 --> 00:35:13.340
Of course, you can do this in
any number of dimensions, not
00:35:13.340 --> 00:35:16.040
just one dimension.
00:35:16.040 --> 00:35:17.540
So you can approximate
your function
00:35:17.540 --> 00:35:21.140
as linear-- f of x i plus
1 is approximately 0.
00:35:21.140 --> 00:35:23.120
And then let's take
our linearized version
00:35:23.120 --> 00:35:27.740
of the function, and let's
find where it's equal to 0.
00:35:27.740 --> 00:35:30.170
Sometimes what's
done is to replace
00:35:30.170 --> 00:35:33.200
this difference, x
i plus 1, minus x i,
00:35:33.200 --> 00:35:35.750
with an unknown vector, d i--
00:35:35.750 --> 00:35:37.530
which is the step size.
00:35:37.530 --> 00:35:41.540
How big a step am I going to
take from x i to x i plus 1?
00:35:41.540 --> 00:35:44.720
And so we solve this equation
for the displacement, d i.
00:35:44.720 --> 00:35:46.860
Move f to the other
side, so you have
00:35:46.860 --> 00:35:50.410
Jacobian times d i is minus f.
00:35:50.410 --> 00:35:53.270
And solve-- d i is minus
Jacobian inverse times f.
00:35:53.270 --> 00:35:56.010
It's just a system
of linear equations.
00:35:56.010 --> 00:35:58.550
Now we know our step size.
00:35:58.550 --> 00:36:02.620
So x i plus 1 is x i
plus d i, or x i plus 1
00:36:02.620 --> 00:36:07.600
is x i minus Jacobian
inverse times f.
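Collecting the steps just described into one place, with d i the displacement and J the Jacobian as in the lecture:

```latex
0 \approx f(x_{i+1}) \approx f(x_i) + J(x_i)\,(x_{i+1} - x_i),
\qquad
J(x_i)\, d_i = -f(x_i),
\qquad
x_{i+1} = x_i + d_i = x_i - J(x_i)^{-1} f(x_i).
```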
00:36:07.600 --> 00:36:09.700
The inverse of the Jacobian
plays the same role
00:36:09.700 --> 00:36:11.620
as 1 over the derivative.
00:36:11.620 --> 00:36:13.960
It's telling us what
direction to step in,
00:36:13.960 --> 00:36:16.240
in this multi-dimensional space.
00:36:16.240 --> 00:36:19.420
And this solution to
the system of equations
00:36:19.420 --> 00:36:23.530
is giving us a magnitude of the
step that's good, not great,
00:36:23.530 --> 00:36:26.980
but is taking us closer
and closer to the root.
00:36:26.980 --> 00:36:28.480
So this is the
Newton-Raphson method
00:36:28.480 --> 00:36:30.834
applied to the system
of nonlinear equations.
00:36:30.834 --> 00:36:32.500
This is really the
way you want to solve
00:36:32.500 --> 00:36:35.430
these sorts of problems.
00:36:35.430 --> 00:36:36.920
It doesn't always work--
00:36:36.920 --> 00:36:38.732
things can go wrong.
00:36:38.732 --> 00:36:40.190
What sorts of things
go wrong here?
00:36:40.190 --> 00:36:40.773
Can you guess?
00:36:44.010 --> 00:36:44.648
Yeah?
00:36:44.648 --> 00:36:47.516
AUDIENCE: [INAUDIBLE]
00:36:47.516 --> 00:36:48.860
PROFESSOR: OK, this is good.
00:36:48.860 --> 00:36:52.550
So in the 1-D problem, sometimes
the Newton-Raphson method
00:36:52.550 --> 00:36:54.470
can get stuck.
00:36:54.470 --> 00:36:58.190
So it won't have good
necessarily global convergence
00:36:58.190 --> 00:36:58.700
properties.
00:36:58.700 --> 00:37:01.580
If you have a bad initial guess,
it might get stuck someplace,
00:37:01.580 --> 00:37:02.930
and the iterates won't converge.
00:37:02.930 --> 00:37:03.920
That can be true.
00:37:03.920 --> 00:37:04.878
What else can go wrong?
00:37:04.878 --> 00:37:06.424
AUDIENCE: [INAUDIBLE]
00:37:06.424 --> 00:37:08.090
PROFESSOR: Good, so
if your derivative
00:37:08.090 --> 00:37:10.290
is 0, that's going
to be problematic.
00:37:10.290 --> 00:37:11.930
What's the
multi-dimensional equivalent
00:37:11.930 --> 00:37:13.340
of the derivative being 0?
00:37:13.340 --> 00:37:15.030
AUDIENCE: [INAUDIBLE]
00:37:15.030 --> 00:37:16.470
PROFESSOR: What's that?
00:37:16.470 --> 00:37:17.810
Singular Jacobian, right?
00:37:17.810 --> 00:37:22.190
So if this J, the Jacobian,
has some null space associated
00:37:22.190 --> 00:37:24.110
with it, how am I
supposed to figure out
00:37:24.110 --> 00:37:26.700
which direction to step in?
00:37:26.700 --> 00:37:29.720
There's some arbitrariness
associated with the solution
00:37:29.720 --> 00:37:31.820
of this system of equations.
00:37:31.820 --> 00:37:35.240
So when the derivative is 0 in the
1-D example, that's a problem.
00:37:35.240 --> 00:37:37.070
That problem gets
a little fuzzier,
00:37:37.070 --> 00:37:38.630
but it's still a
big problem when
00:37:38.630 --> 00:37:42.020
we try to solve for the
step size, or the step--
00:37:42.020 --> 00:37:43.100
the Newton-Raphson step.
00:37:43.100 --> 00:37:44.660
We may not be able to do this.
00:37:47.400 --> 00:37:49.040
You don't run into
this very often,
00:37:49.040 --> 00:37:51.105
but you can, from time to time.
00:37:51.105 --> 00:37:52.730
One place where this
is going to happen
00:37:52.730 --> 00:37:56.137
is if we have a non-locally
unique solution.
00:37:56.137 --> 00:37:58.220
When we have one of those,
we know the determinant
00:37:58.220 --> 00:38:02.480
of the Jacobian at that
point is going to be 0.
00:38:02.480 --> 00:38:04.370
If we're close to
those solutions,
00:38:04.370 --> 00:38:05.870
well the determinant
of the Jacobian
00:38:05.870 --> 00:38:08.720
is going to be close to 0--
00:38:08.720 --> 00:38:11.420
you might expect that the system
of equations you have to solve
00:38:11.420 --> 00:38:14.120
becomes ill-conditioned.
00:38:14.120 --> 00:38:16.280
So even though there
may be an exact solution
00:38:16.280 --> 00:38:18.562
for all the steps
leading up to that point,
00:38:18.562 --> 00:38:20.270
the equations may
become ill-conditioned.
00:38:20.270 --> 00:38:22.760
You may not be able to
reliably find those solutions
00:38:22.760 --> 00:38:24.290
with your computer, either.
00:38:24.290 --> 00:38:26.330
So then these steps
you take, well,
00:38:26.330 --> 00:38:29.380
who knows where they're
going at that point.
00:38:29.380 --> 00:38:31.577
It's going to be crazy.
00:38:31.577 --> 00:38:33.410
There are ways of fixing
all these problems.
00:38:33.410 --> 00:38:36.442
Let's do an example.
00:38:36.442 --> 00:38:37.900
This is a geometry
example, but you
00:38:37.900 --> 00:38:40.910
can write it as a system of
nonlinear equations, as well.
00:38:40.910 --> 00:38:43.450
So we have two circles--
00:38:43.450 --> 00:38:47.320
circle f 1, circle f
2 in the x1, x2 plane.
00:38:47.320 --> 00:38:49.450
They satisfy-- these
are the locus of points,
00:38:49.450 --> 00:38:54.115
that satisfy the equation f
1 of x 1 and x 2 equals 0--
00:38:54.115 --> 00:38:55.800
this is the equation
for one circle,
00:38:55.800 --> 00:38:57.770
and this is the equation
for the other circle.
00:38:57.770 --> 00:38:59.145
And we want the
solution, we want
00:38:59.145 --> 00:39:02.920
the roots of this
vector-valued function, f,
00:39:02.920 --> 00:39:04.330
for vector-valued x.
00:39:04.330 --> 00:39:09.110
And those are the intersections
between these two circles.
00:39:09.110 --> 00:39:12.110
You can do it using
Newton-Raphson,
00:39:12.110 --> 00:39:14.990
so you're going to need
to know the Jacobian.
00:39:14.990 --> 00:39:17.010
So compute the Jacobian
of this function.
00:39:17.010 --> 00:39:18.840
This is practice--
maybe most of you
00:39:18.840 --> 00:39:20.370
know how to compute a
Jacobian, but some people
00:39:20.370 --> 00:39:21.060
haven't done it before.
00:39:21.060 --> 00:39:23.100
So it's always good to
make sure you remember
00:39:23.100 --> 00:39:25.260
that the first row
of the Jacobian
00:39:25.260 --> 00:39:28.530
is the derivatives of
the first element of f.
00:39:28.530 --> 00:39:30.885
And later rows are
later elements.
00:39:30.885 --> 00:39:32.760
You don't want the
transpose of the Jacobian.
00:39:32.760 --> 00:39:33.551
Then it won't work.
00:40:02.400 --> 00:40:05.130
OK, so should look
something like this.
00:40:05.130 --> 00:40:08.260
There's your Jacobian.
00:40:08.260 --> 00:40:10.360
The Newton-Raphson
process tells us
00:40:10.360 --> 00:40:14.620
how to take steps from one
approximation to the next.
00:40:14.620 --> 00:40:21.060
The step is equal to minus the
Jacobian inverse evaluated
00:40:21.060 --> 00:40:23.980
at my best guess for
the solution, multiplied
00:40:23.980 --> 00:40:29.080
by the function-- evaluated at
my best guess of the solution.
00:40:29.080 --> 00:40:31.810
You're never going to compute
Jacobian inverse explicitly--
00:40:31.810 --> 00:40:35.170
that's just code for solve this
system of linear equations.
00:40:35.170 --> 00:40:38.922
So use the backslash operator
in MATLAB, for example.
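A sketch of the full iteration for two intersecting circles. The circle equations on the slide are not reproduced in these captions, so the two circles below (both radius 2, centered at (0, 0) and (2, 0)) are assumptions chosen for illustration; only the starting guess of minus 1, 3 comes from the lecture. The helper `solve_2x2` solves the linear system directly instead of forming the inverse; for larger systems a library linear solver would normally take its place.

```python
import math


def f(x):
    """Two hypothetical circles: both components are zero where they cross."""
    x1, x2 = x
    return [x1**2 + x2**2 - 4.0, (x1 - 2.0)**2 + x2**2 - 4.0]


def jacobian(x):
    """Row k holds the partial derivatives of component k of f."""
    x1, x2 = x
    return [[2.0 * x1, 2.0 * x2],
            [2.0 * (x1 - 2.0), 2.0 * x2]]


def solve_2x2(J, rhs):
    """Solve J d = rhs for a 2x2 system by Cramer's rule."""
    (a, b), (c, d) = J
    det = a * d - b * c
    if det == 0.0:
        raise ZeroDivisionError("singular Jacobian; Newton step undefined")
    return [(rhs[0] * d - b * rhs[1]) / det,
            (a * rhs[1] - c * rhs[0]) / det]


def newton_raphson(x0, tol=1e-12, max_iter=50):
    """Iterate x_{i+1} = x_i + d_i, where J(x_i) d_i = -f(x_i)."""
    x = list(x0)
    for _ in range(max_iter):
        fx = f(x)
        d = solve_2x2(jacobian(x), [-fx[0], -fx[1]])
        x = [x[0] + d[0], x[1] + d[1]]
        # Use both the step norm and the function norm before accepting.
        if math.hypot(*d) <= tol and math.hypot(*f(x)) <= tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")


root = newton_raphson([-1.0, 3.0])
```

For these assumed circles the determinant of the Jacobian works out to 8 times x2, so any iterate on the line x2 = 0 gives a singular system, which is the failure mode discussed earlier.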
00:40:38.922 --> 00:40:39.630
And here you go--
00:40:39.630 --> 00:40:42.880
I had an initial guess for
the solution, iterate 0
00:40:42.880 --> 00:40:44.190
at minus 1 and 3.
00:40:44.190 --> 00:40:45.884
This is somewhere
outside the circles--
00:40:45.884 --> 00:40:47.550
it's pretty far away
from the solutions.
00:40:47.550 --> 00:40:51.180
But I do my Newton-Raphson
steps, I iterate on and on.
00:40:51.180 --> 00:40:54.180
And after four
steps, you can see
00:40:54.180 --> 00:40:56.130
that the step size
in absolute value
00:40:56.130 --> 00:40:57.540
is order 10 to the minus 3.
00:40:57.540 --> 00:41:00.637
The function norm is
order 10 to the minus 3,
00:41:00.637 --> 00:41:02.470
as well-- or maybe
order 10 to the minus 2,
00:41:02.470 --> 00:41:03.800
but it's getting down there.
00:41:03.800 --> 00:41:05.980
These things are
decreasing pretty quickly.
00:41:05.980 --> 00:41:07.740
And I move to a
point that you'll
00:41:07.740 --> 00:41:09.240
see is pretty close
to the solution.
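The iteration just described can be sketched as follows. This is a rough illustration under assumptions, not the lecture's code: the circles (radius 2, centers at (1, 0) and (-1, 0)) are made up, so the printed numbers will not match the slide. Each step solves the linear system J d = -f rather than forming the inverse; the small `solve_2x2` helper is a hypothetical stand-in for a library linear solver.

```python
import math

def f(x1, x2):
    # Assumed circles: radius 2, centers (1, 0) and (-1, 0).
    return [(x1 - 1.0) ** 2 + x2 ** 2 - 4.0,
            (x1 + 1.0) ** 2 + x2 ** 2 - 4.0]

def jacobian(x1, x2):
    # Row i holds the derivatives of f_i.
    return [[2.0 * (x1 - 1.0), 2.0 * x2],
            [2.0 * (x1 + 1.0), 2.0 * x2]]

def solve_2x2(A, b):
    """Solve A d = b by Cramer's rule (stand-in for a library solver)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0.0:
        raise ZeroDivisionError("singular Jacobian")
    d1 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    d2 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return d1, d2

def newton_raphson(x1, x2, tol=1e-8, max_iter=50):
    """Iterate until both the step norm and the function norm fall below tol."""
    for k in range(1, max_iter + 1):
        r = f(x1, x2)
        d1, d2 = solve_2x2(jacobian(x1, x2), [-r[0], -r[1]])  # J d = -f
        x1, x2 = x1 + d1, x2 + d2
        step_norm = math.hypot(d1, d2)
        f_norm = math.hypot(*f(x1, x2))
        print(k, x1, x2, step_norm, f_norm)
        if step_norm < tol and f_norm < tol:
            return (x1, x2), k
    raise RuntimeError("Newton-Raphson did not converge")

# Same initial guess as in the lecture: iterate 0 at (-1, 3).
root, iterations = newton_raphson(-1.0, 3.0)
```

Checking both norms mirrors the two quantities tracked in the lecture, the step size and the function norm. The tolerance of 1e-8 is an arbitrary choice here.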
00:41:13.110 --> 00:41:16.070
Here are some things you need
to know about Newton-Raphson,
00:41:16.070 --> 00:41:19.160
things you'll want to think carefully
about as we go forward.
00:41:19.160 --> 00:41:24.990
So it possesses a local
convergence property.
00:41:24.990 --> 00:41:28.260
I'm going to illustrate
that graphically for you.
00:41:28.260 --> 00:41:31.070
So here, I didn't solve the
problem once, I solved it--
00:41:31.070 --> 00:41:32.600
I don't know, 10,000 times.
00:41:32.600 --> 00:41:36.610
And I chose different initial
points to start iterating with.
00:41:36.610 --> 00:41:39.240
Here, minus 1, 3,
that was one point.
00:41:39.240 --> 00:41:41.300
But I chose a whole
bunch of them.
00:41:41.300 --> 00:41:43.970
And I asked, how
many iterations--
00:41:43.970 --> 00:41:46.100
how many steps did my
Newton-Raphson method
00:41:46.100 --> 00:41:48.440
have to take before
I got sufficiently
00:41:48.440 --> 00:41:53.540
close to either this root
here, or this root there?
00:41:53.540 --> 00:41:55.992
I don't remember what that
convergence criterion was--
00:41:55.992 --> 00:41:57.950
it doesn't really matter,
but there was some tolerance, 10
00:41:57.950 --> 00:42:00.366
to the minus 3, or 10 to the
minus 5, or 10 to the minus 8,
00:42:00.366 --> 00:42:02.000
a convergence
criterion that I made
00:42:02.000 --> 00:42:05.180
sure the Newton-Raphson method
hit in both the function norm
00:42:05.180 --> 00:42:08.450
and the step norm.
00:42:08.450 --> 00:42:11.620
And then, if the color
on this map is blue--
00:42:11.620 --> 00:42:14.240
the solution converged to
this star in the blue zone--
00:42:14.240 --> 00:42:17.390
if the color's orange, the
solution converged to the star
00:42:17.390 --> 00:42:19.180
in the orange zone.
00:42:19.180 --> 00:42:21.920
And if the color is light, it
didn't take so many iterations
00:42:21.920 --> 00:42:22.544
to converge.
00:42:22.544 --> 00:42:24.710
And if the color gets darker,
it takes more and more
00:42:24.710 --> 00:42:27.080
iterations to converge.
00:42:27.080 --> 00:42:28.130
So that's the picture--
00:42:28.130 --> 00:42:29.104
that's this map.
00:42:29.104 --> 00:42:30.770
I solved it a bunch
of times, and then I
00:42:30.770 --> 00:42:33.124
mapped out how
many iterations it
00:42:33.124 --> 00:42:35.040
took me to converge
to the different solutions.
00:42:35.040 --> 00:42:37.307
So you can see if I start
close to the solution,
00:42:37.307 --> 00:42:38.390
the color is really light.
00:42:38.390 --> 00:42:41.120
It doesn't take very many
iterations to get there.
00:42:41.120 --> 00:42:43.335
And the farther away I
move in this direction--
00:42:43.335 --> 00:42:45.710
it still doesn't seem like it
takes so many iterations to get
00:42:45.710 --> 00:42:46.880
there, either.
00:42:46.880 --> 00:42:49.100
I need a good initial guess--
00:42:49.100 --> 00:42:51.910
I want to be close to where
I think the solution is.
00:42:51.910 --> 00:42:54.590
Because once I'm over here
somewhere, I do pretty well.
00:42:54.590 --> 00:42:56.215
And the same is true
on the other side,
00:42:56.215 --> 00:42:59.000
because this problem
is symmetric.
00:42:59.000 --> 00:43:03.170
There's a line down the middle
here, and along this line,
00:43:03.170 --> 00:43:05.585
the determinant of the
Jacobian is equal to 0.
00:43:08.580 --> 00:43:11.970
So we talked about these points
with the Newton-Raphson method.
00:43:11.970 --> 00:43:14.520
And if I pick initial
guesses sufficiently close
00:43:14.520 --> 00:43:17.970
to this line, you can see the
color gets darker and darker.
00:43:17.970 --> 00:43:20.100
The number of iterations
required to converge
00:43:20.100 --> 00:43:23.640
to the solution goes way up.
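A small experiment in the spirit of that map, again for assumed circles (radius 2, centers at (1, 0) and (-1, 0)) rather than the ones on the slide. For this assumed system the determinant of the Jacobian is -8 times x2, so the singular line is x2 = 0. The sketch counts Newton-Raphson iterations for starting points that approach that line and records which root each start reaches; the function name and tolerance are hypothetical.

```python
import math

def newton_iterations(x1, x2, tol=1e-8, max_iter=200):
    """Return (iteration count, final x2), or (None, None) if J is singular."""
    for k in range(1, max_iter + 1):
        f1 = (x1 - 1.0) ** 2 + x2 ** 2 - 4.0
        f2 = (x1 + 1.0) ** 2 + x2 ** 2 - 4.0
        a, b = 2.0 * (x1 - 1.0), 2.0 * x2  # first row of the Jacobian
        c, d = 2.0 * (x1 + 1.0), 2.0 * x2  # second row of the Jacobian
        det = a * d - b * c                # equals -8 * x2 for these circles
        if det == 0.0:
            return None, None              # the step is undefined on this line
        # Solve J [d1, d2] = [-f1, -f2] by Cramer's rule.
        d1 = (-f1 * d + b * f2) / det
        d2 = (-a * f2 + f1 * c) / det
        x1, x2 = x1 + d1, x2 + d2
        g1 = (x1 - 1.0) ** 2 + x2 ** 2 - 4.0
        g2 = (x1 + 1.0) ** 2 + x2 ** 2 - 4.0
        if math.hypot(d1, d2) < tol and math.hypot(g1, g2) < tol:
            return k, x2
    return max_iter, x2

# Start at x1 = -1 and move the initial x2 toward the singular line x2 = 0.
results = {}
for start_x2 in [3.0, 1.0, 0.1, 0.01, 0.001, -0.001]:
    results[start_x2] = newton_iterations(-1.0, start_x2)
    print(start_x2, results[start_x2])

on_line = newton_iterations(-1.0, 0.0)  # exactly on the singular line
```

Starts above the line end at the upper root and starts below it end at the lower root, which is the two-colored picture; the iteration count grows as the start approaches the line, which is the darkening.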
00:43:23.640 --> 00:43:28.190
Now, the Newton-Raphson method
possesses a local convergence
00:43:28.190 --> 00:43:28.740
property.
00:43:28.740 --> 00:43:32.750
Which means, if there's a
locally unique solution,
00:43:32.750 --> 00:43:34.910
there's always going
to be some neighborhood
00:43:34.910 --> 00:43:38.090
around that solution for which
the determinant of the Jacobian
00:43:38.090 --> 00:43:39.560
is not equal to 0.
00:43:39.560 --> 00:43:42.710
And in that neighborhood,
I can guarantee
00:43:42.710 --> 00:43:46.010
that this iterative process will
eventually reach the solution.
00:43:46.010 --> 00:43:48.070
That's pretty good.
00:43:48.070 --> 00:43:48.770
That's handy.
00:43:48.770 --> 00:43:50.270
These iterates can
go anywhere-- how
00:43:50.270 --> 00:43:51.700
do you know you're
getting to the solution?
00:43:51.700 --> 00:43:52.720
Are you going to
waste your time,
00:43:52.720 --> 00:43:53.300
or are you going to get there?
00:43:53.300 --> 00:43:55.625
So that's this local
convergence property associated
00:43:55.625 --> 00:43:58.690
with it, which is nice.
00:43:58.690 --> 00:44:00.700
But it'll break down
as I get to places
00:44:00.700 --> 00:44:04.450
where the determinant of the
Jacobian is equal to 0.
00:44:04.450 --> 00:44:06.820
So there could be a zone--
00:44:06.820 --> 00:44:09.730
it's not in this one-- there
could be a zone, for example,
00:44:09.730 --> 00:44:11.967
like a ring on which the
determinant of the Jacobian
00:44:11.967 --> 00:44:12.550
is equal to 0.
00:44:12.550 --> 00:44:14.410
And if I take a guess
inside that ring, who
00:44:14.410 --> 00:44:16.276
knows where that solution
is going to go to.
00:44:16.276 --> 00:44:17.650
Could be something
like Sam said,
00:44:17.650 --> 00:44:21.580
where the iterative method just
bounces around inside that ring
00:44:21.580 --> 00:44:24.070
and never converges.
00:44:24.070 --> 00:44:26.800
But when I have roots
that are locally unique,
00:44:26.800 --> 00:44:29.830
and I start with good
guesses close to those roots,
00:44:29.830 --> 00:44:31.660
I can guarantee the
Newton-Raphson method
00:44:31.660 --> 00:44:32.620
will converge.
00:44:32.620 --> 00:44:35.290
I'll show you next time that
not only does it converge,
00:44:35.290 --> 00:44:37.060
but it also converges
quadratically.
00:44:37.060 --> 00:44:39.452
So if you start sufficiently
close to the solution,
00:44:39.452 --> 00:44:41.410
you get to double the
number of accurate digits
00:44:41.410 --> 00:44:42.340
in each iteration.
00:44:42.340 --> 00:44:44.200
You can see that happening here.
00:44:44.200 --> 00:44:47.620
OK, so I have one accurate
digit, now I have two.
00:44:47.620 --> 00:44:49.967
The next iteration I'll
have four, and so on.
00:44:49.967 --> 00:44:51.550
The number of accurate
digits is going
00:44:51.550 --> 00:44:56.030
to double at each iteration.
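One way to see the doubling numerically is to track the error against a known root. The sketch below assumes the same made-up circles as before (radius 2, centers at (1, 0) and (-1, 0)), whose upper intersection is (0, sqrt(3)); it is an illustration, not a proof. If convergence is quadratic, each error is roughly a constant times the square of the previous one, so the ratio printed in the last column should stay bounded.

```python
import math

# Assumed circles: radius 2, centers (1, 0) and (-1, 0); upper root (0, sqrt(3)).
root = (0.0, math.sqrt(3.0))

x1, x2 = -1.0, 3.0  # initial guess from the lecture
errors = [math.hypot(x1 - root[0], x2 - root[1])]

for _ in range(5):
    f1 = (x1 - 1.0) ** 2 + x2 ** 2 - 4.0
    f2 = (x1 + 1.0) ** 2 + x2 ** 2 - 4.0
    a, b = 2.0 * (x1 - 1.0), 2.0 * x2  # first row of the Jacobian
    c, d = 2.0 * (x1 + 1.0), 2.0 * x2  # second row of the Jacobian
    det = a * d - b * c
    # Newton-Raphson step: solve J [d1, d2] = [-f1, -f2] by Cramer's rule.
    d1 = (-f1 * d + b * f2) / det
    d2 = (-a * f2 + f1 * c) / det
    x1, x2 = x1 + d1, x2 + d2
    errors.append(math.hypot(x1 - root[0], x2 - root[1]))

# Ratio of each error to the square of the previous error.
ratios = [errors[k + 1] / errors[k] ** 2 for k in range(len(errors) - 1)]

for k, e in enumerate(errors):
    ratio = ratios[k - 1] if k > 0 else float("nan")
    print(k, e, ratio)
```

The last ratio is computed from an error near machine precision, so it is dominated by rounding and should not be read too closely.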
00:44:56.030 --> 00:44:57.870
So that's going to
conclude for today.
00:44:57.870 --> 00:45:01.160
Next time, we'll talk about
how to fix these problems
00:45:01.160 --> 00:45:02.720
with the Newton-Raphson method.
00:45:02.720 --> 00:45:04.220
So there are going
to be cases where
00:45:04.220 --> 00:45:07.345
the convergence isn't ideal,
where we can improve things.
00:45:07.345 --> 00:45:08.720
There are going
to be cases where
00:45:08.720 --> 00:45:10.525
we don't want to
compute the Jacobian
00:45:10.525 --> 00:45:11.720
or the Jacobian inverse.
00:45:11.720 --> 00:45:13.840
And we can improve the method.
00:45:13.840 --> 00:45:15.390
Thanks.