WEBVTT
00:00:00.500 --> 00:00:02.756
The following content is
provided under a Creative
00:00:02.756 --> 00:00:03.610
Commons license.
00:00:03.610 --> 00:00:05.770
Your support will help
MIT OpenCourseWare
00:00:05.770 --> 00:00:09.910
continue to offer high quality
educational resources for free.
00:00:09.910 --> 00:00:12.530
To make a donation or to
view additional materials
00:00:12.530 --> 00:00:15.610
from hundreds of MIT courses,
visit MIT OpenCourseWare
00:00:15.610 --> 00:00:20.835
at ocw.mit.edu.
00:00:20.835 --> 00:00:21.710
PROFESSOR STRANG: OK.
00:00:21.710 --> 00:00:25.840
So this is Lecture 14,
it's also the lecture
00:00:25.840 --> 00:00:29.300
before the exam on
Tuesday evening.
00:00:29.300 --> 00:00:35.150
I thought I would just go ahead
and tell you what questions
00:00:35.150 --> 00:00:37.800
there are, so you could see.
00:00:37.800 --> 00:00:39.580
I haven't filled
in all the numbers,
00:00:39.580 --> 00:00:44.630
but this will tell
you, it's a way
00:00:44.630 --> 00:00:47.680
of reviewing of
course, to sort of see
00:00:47.680 --> 00:00:50.010
the things that we've done.
00:00:50.010 --> 00:00:52.330
Which is quite a bit, really.
00:00:52.330 --> 00:00:58.380
And also of course topics
that will not be in the exam.
00:00:58.380 --> 00:01:01.510
You can see that they're not
in the exam, for example.
00:01:01.510 --> 00:01:06.240
Topics from 1.7 on
condition number, or 2.3
00:01:06.240 --> 00:01:09.440
on Gram-Schmidt, that I can
speak about a little bit.
00:01:09.440 --> 00:01:12.740
So those are the,
only thing I didn't
00:01:12.740 --> 00:01:16.830
fill in is the end time there.
00:01:16.830 --> 00:01:19.360
So I don't usually
ask four questions,
00:01:19.360 --> 00:01:20.870
three is more typical.
00:01:20.870 --> 00:01:25.060
So it's a little bit longer
but it's not a difficult exam.
00:01:25.060 --> 00:01:33.620
And I never want time to be
the essential ingredient.
00:01:33.620 --> 00:01:37.270
So 7:30 to 9 is
the nominal time,
00:01:37.270 --> 00:01:42.110
but it would not be a
surprise if you're still
00:01:42.110 --> 00:01:45.030
going on after 9
o'clock a little bit.
00:01:45.030 --> 00:01:50.890
And I'll try not to, like,
tear a paper away from you.
00:01:50.890 --> 00:01:57.090
So, just figure that if you move
along at a reasonable speed,
00:01:57.090 --> 00:02:04.470
and so you can bring papers,
the book, anything with you.
00:02:04.470 --> 00:02:09.240
And I'm open for questions
now about the exam.
00:02:09.240 --> 00:02:13.060
You know where 54-100 is, it's
a classroom unfortunately.
00:02:13.060 --> 00:02:17.260
Not a, sometimes we'll have the
top of Walker where you have
00:02:17.260 --> 00:02:21.030
a whole table to work
on, that's-- This 54-100,
00:02:21.030 --> 00:02:26.600
in the tallest building of MIT
out in the middle of the middle
00:02:26.600 --> 00:02:31.690
space out there, is
a large classroom.
00:02:31.690 --> 00:02:38.380
So if you kind of spread
out your papers we'll be OK.
00:02:38.380 --> 00:02:40.730
It's a pretty big room.
00:02:40.730 --> 00:02:42.810
Questions.
00:02:42.810 --> 00:02:47.090
And, of course, don't forget
this afternoon in 1-190
00:02:47.090 --> 00:02:48.530
rather than here.
00:02:48.530 --> 00:02:58.620
So, review is in
1-190, 4 to 5 today.
00:02:58.620 --> 00:03:04.470
But if you're not
free at that time
00:03:04.470 --> 00:03:07.990
don't feel that you've
missed anything essential.
00:03:07.990 --> 00:03:11.810
I thought this would
be the right way
00:03:11.810 --> 00:03:18.070
to tell you what the exam is and
guide your preparation for it.
00:03:18.070 --> 00:03:19.010
Questions.
00:03:19.010 --> 00:03:19.680
Yes, thanks.
00:03:19.680 --> 00:03:26.870
AUDIENCE: [INAUDIBLE].
00:03:26.870 --> 00:03:30.380
PROFESSOR STRANG: So these
are four different questions.
00:03:30.380 --> 00:03:36.080
So this would be a question
about getting to this matrix
00:03:36.080 --> 00:03:40.150
and what it's about,
A transpose C A.
00:03:40.150 --> 00:03:49.430
AUDIENCE: [INAUDIBLE]
00:03:49.430 --> 00:03:52.060
PROFESSOR STRANG: OK, I don't
use beam bending for that.
00:03:52.060 --> 00:03:53.810
I'm thinking of elastic bar.
00:03:53.810 --> 00:03:57.090
Yeah, stretching equation, yeah.
00:03:57.090 --> 00:04:02.720
So the stretching equation,
so the first one we ever saw,
00:04:02.720 --> 00:04:08.740
the stretching equation, was
just u'', or rather -u'', equal to 1.
00:04:08.740 --> 00:04:12.460
And now that allows
a C to sneak in
00:04:12.460 --> 00:04:15.980
and that allows a matrix
C to sneak in there,
00:04:15.980 --> 00:04:21.880
but I think you'll see what
they should look like, but yeah.
00:04:21.880 --> 00:04:24.480
AUDIENCE: [INAUDIBLE]
00:04:24.480 --> 00:04:26.250
PROFESSOR STRANG: Yeah, yeah.
00:04:26.250 --> 00:04:30.500
So this is Section 2.2,
directly out of the book.
00:04:30.500 --> 00:04:33.920
M will be the mass matrix, K
will be the stiffness matrix,
00:04:33.920 --> 00:04:35.480
yep.
00:04:35.480 --> 00:04:38.825
AUDIENCE: [INAUDIBLE]
00:04:38.825 --> 00:04:39.700
PROFESSOR STRANG: OK.
00:04:39.700 --> 00:04:43.140
So, good point to say.
00:04:43.140 --> 00:04:46.630
In the very first
section, 1.1, we
00:04:46.630 --> 00:04:51.530
gave the name K to a very
special matrix, a specific one.
00:04:51.530 --> 00:04:57.120
But then later, now, I'm
using the same letter K
00:04:57.120 --> 00:04:59.450
for matrices of that type.
00:04:59.450 --> 00:05:03.750
That was the most special,
simplest, completely understood
00:05:03.750 --> 00:05:07.620
case, but now I'll use
K for stiffness matrices
00:05:07.620 --> 00:05:11.560
and when we're doing finite
elements in a few weeks
00:05:11.560 --> 00:05:15.100
again it'll be K.
Yeah, same name, right.
00:05:15.100 --> 00:05:18.110
So here you'll want
to create K and M
00:05:18.110 --> 00:05:24.650
and know how to deal
with, this was our only
00:05:24.650 --> 00:05:26.370
time-dependent thing.
00:05:26.370 --> 00:05:27.990
So I guess what
you're seeing here
00:05:27.990 --> 00:05:31.540
is not only what time-dependent
equation will be there
00:05:31.540 --> 00:05:35.160
but also that I'm
not going in detail
00:05:35.160 --> 00:05:42.450
into those trapezoidal
difference methods.
00:05:42.450 --> 00:05:45.660
Important as they are, we
can't do everything on the quiz
00:05:45.660 --> 00:05:55.360
so I'm really focusing on things
that are central to our course.
00:05:55.360 --> 00:05:56.030
Good.
00:05:56.030 --> 00:05:57.380
Other questions.
00:05:57.380 --> 00:06:01.271
I'm very open for more
questions this afternoon.
00:06:01.271 --> 00:06:01.770
Yep.
00:06:01.770 --> 00:06:04.039
AUDIENCE: [INAUDIBLE]
00:06:04.039 --> 00:06:05.080
PROFESSOR STRANG: Others.
00:06:05.080 --> 00:06:05.780
OK.
00:06:05.780 --> 00:06:12.280
So, let me, I don't want to
go on and do new material,
00:06:12.280 --> 00:06:15.410
because we're focused
on these things.
00:06:15.410 --> 00:06:17.940
And this course, the
name of this course
00:06:17.940 --> 00:06:20.620
is computational
science and engineering.
00:06:20.620 --> 00:06:23.680
And by the way I just
had an email last week
00:06:23.680 --> 00:06:26.240
from the Dean of
Engineering, or a bunch of us
00:06:26.240 --> 00:06:30.030
did, to say that the
School of Engineering
00:06:30.030 --> 00:06:36.800
is establishing Center for
Computational Engineering, CCE.
00:06:36.800 --> 00:06:40.170
Several faculty members
there and, like myself
00:06:40.170 --> 00:06:43.820
in the School of Science
and the Sloan School
00:06:43.820 --> 00:06:47.750
are involved with computation,
and this new center
00:06:47.750 --> 00:06:50.330
is going to organize that.
00:06:50.330 --> 00:06:53.270
So, it's a good development.
00:06:53.270 --> 00:06:58.620
And it's headed by people
in Course 2 and Course 16.
00:06:58.620 --> 00:06:59.420
So.
00:06:59.420 --> 00:07:03.090
If we're talking about
computations, and I do
00:07:03.090 --> 00:07:06.260
have to say something about
how you would actually
00:07:06.260 --> 00:07:08.310
do the computations.
00:07:08.310 --> 00:07:12.410
And what are the
issues about accuracy.
00:07:12.410 --> 00:07:16.880
Speed and accuracy
are what you're
00:07:16.880 --> 00:07:19.510
aiming for in the computations.
00:07:19.510 --> 00:07:22.270
Of course, the first step
is to know what problem
00:07:22.270 --> 00:07:23.520
it is you want to compute.
00:07:23.520 --> 00:07:26.230
What do you want to solve,
what's the equation?
00:07:26.230 --> 00:07:28.430
That's what we've
been doing all along.
00:07:28.430 --> 00:07:33.060
Now, I just take a
little time-out to say,
00:07:33.060 --> 00:07:35.920
suppose I have the equation.
00:07:35.920 --> 00:07:39.920
When I write K, I'm thinking of
a symmetric, positive definite
00:07:39.920 --> 00:07:42.150
or at least
semi-definite matrix.
00:07:42.150 --> 00:07:46.800
When I write A I'm thinking
of any general, usually
00:07:46.800 --> 00:07:48.290
tall, thin, matrix.
00:07:48.290 --> 00:07:49.440
Rectangular.
00:07:49.440 --> 00:07:54.720
So that I would need least
squares for this guy, where
00:07:54.720 --> 00:07:58.380
straightforward elimination
would work for that one.
00:07:58.380 --> 00:08:02.050
And so my first question
is-- Let's take this,
00:08:02.050 --> 00:08:05.180
so these are two
topics for today.
00:08:05.180 --> 00:08:10.340
This one would come out of 1.7,
that discussion with condition
00:08:10.340 --> 00:08:11.100
number.
00:08:11.100 --> 00:08:16.180
This one would come out of
2.3, the least squares section.
00:08:16.180 --> 00:08:21.440
OK, so if I give
you, and I'm thinking
00:08:21.440 --> 00:08:24.270
that the computational
questions emerge
00:08:24.270 --> 00:08:26.360
when the systems are large.
00:08:26.360 --> 00:08:28.970
So I'm thinking thousands
of unknowns here.
00:08:28.970 --> 00:08:31.310
Thousands of equations,
at the least.
00:08:31.310 --> 00:08:31.930
OK.
00:08:31.930 --> 00:08:40.500
So, and the question is, I
do Gaussian elimination here,
00:08:40.500 --> 00:08:42.500
ordinary elimination.
00:08:42.500 --> 00:08:43.870
Backslash.
00:08:43.870 --> 00:08:48.060
And how accurate is the answer?
00:08:48.060 --> 00:08:49.540
And how do you understand?
00:08:49.540 --> 00:08:51.870
I mean, the accuracy
of the answer
00:08:51.870 --> 00:08:53.850
is going to kind of
depend on two things.
00:08:53.850 --> 00:08:55.730
And it's good to separate them.
00:08:55.730 --> 00:09:01.050
One is the method you
use, like elimination.
00:09:01.050 --> 00:09:03.570
Whatever adjustments
you might make.
00:09:03.570 --> 00:09:04.640
Pivoting.
00:09:04.640 --> 00:09:07.330
Exchanging rows to
get larger pivots.
00:09:07.330 --> 00:09:10.620
All that is in the
algorithm, in the code.
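The elimination-with-pivoting idea mentioned here can be sketched in a few lines. This is a bare-bones illustration in Python rather than the MATLAB backslash the lecture refers to; the function name and the small test matrix (the 3x3 second-difference matrix with f = all ones) are my own choices.

```python
# A bare-bones Gaussian elimination with partial pivoting (the row
# exchanges for larger pivots mentioned above), solving Ku = f.
# Illustrative sketch only -- real codes like MATLAB's backslash do more.

def solve(K, f):
    n = len(K)
    A = [row[:] + [fi] for row, fi in zip(K, f)]  # augmented matrix [K | f]
    for col in range(n):
        # partial pivoting: swap up the row with the largest entry in this column
        p = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[p] = A[p], A[col]
        for r in range(col + 1, n):
            m = A[r][col] / A[col][col]
            A[r] = [arj - m * acj for arj, acj in zip(A[r], A[col])]
    u = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution on the triangular system
        u[i] = (A[i][n] - sum(A[i][k] * u[k] for k in range(i + 1, n))) / A[i][i]
    return u

K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
print(solve(K, [1.0, 1.0, 1.0]))  # close to [1.5, 2.0, 1.5]
```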
00:09:10.620 --> 00:09:13.290
And then the second,
very important aspect,
00:09:13.290 --> 00:09:16.360
is the matrix K itself.
00:09:16.360 --> 00:09:19.950
Is this a tough problem to solve
whatever method you're using,
00:09:19.950 --> 00:09:21.410
or is it a simple problem?
00:09:21.410 --> 00:09:24.520
Is the problem ill
conditioned, meaning K
00:09:24.520 --> 00:09:28.640
would be like nearly
singular, and then we
00:09:28.640 --> 00:09:30.550
would know we had
a tougher problem
00:09:30.550 --> 00:09:32.820
to solve, whatever method.
00:09:32.820 --> 00:09:35.410
Or is K quite well
conditioned, I
00:09:35.410 --> 00:09:37.990
mean the best
condition would be when
00:09:37.990 --> 00:09:42.480
all the columns are unit
vectors and all orthogonal
00:09:42.480 --> 00:09:43.560
to each other.
00:09:43.560 --> 00:09:46.860
Yeah, I mean that would be
the best conditioning of all.
00:09:46.860 --> 00:09:49.080
That condition
number would be one
00:09:49.080 --> 00:09:55.580
if this K, which, not too likely
is a matrix that I would call
00:09:55.580 --> 00:10:00.120
Q. Q, which is going
to show up over here,
00:10:00.120 --> 00:10:05.670
in the second problem is,
Q stands for a matrix which
00:10:05.670 --> 00:10:08.060
has orthonormal columns.
00:10:08.060 --> 00:10:17.610
So, you remember what
orthonormal means.
00:10:17.610 --> 00:10:19.900
Ortho is telling
us perpendicular,
00:10:19.900 --> 00:10:21.390
that's the key point.
00:10:21.390 --> 00:10:23.220
Normal is telling
us that they're
00:10:23.220 --> 00:10:25.460
unit vectors, lengths one.
00:10:25.460 --> 00:10:28.990
So that's the Q, and then
you might ask what's the R.
00:10:28.990 --> 00:10:35.790
And the R is upper triangular.
00:10:35.790 --> 00:10:38.130
OK.
00:10:38.130 --> 00:10:39.820
OK.
00:10:39.820 --> 00:10:42.220
So what I said
about this problem,
00:10:42.220 --> 00:10:46.690
that there's the method you
use, and also the sensitivity,
00:10:46.690 --> 00:10:49.230
the difficulty of the
problem in the first place,
00:10:49.230 --> 00:10:51.290
applies just the same here.
00:10:51.290 --> 00:10:53.940
There's the method
you use, do you
00:10:53.940 --> 00:10:59.630
use A transpose A to
find u hat, this is, now
00:10:59.630 --> 00:11:03.060
we're looking for u hat, of
course, the best solution.
00:11:03.060 --> 00:11:05.560
Do I use A transpose A?
00:11:05.560 --> 00:11:09.380
Well, you would say
of course, what else.
00:11:09.380 --> 00:11:11.720
That equation, that
least squares equation
00:11:11.720 --> 00:11:16.040
has A transpose A u hat equal A
transpose b, what's the choice?
00:11:16.040 --> 00:11:24.350
But, if you're interested
in high accuracy,
00:11:24.350 --> 00:11:27.120
and stability,
numerical stability,
00:11:27.120 --> 00:11:29.740
maybe you don't go
to A transpose A.
00:11:29.740 --> 00:11:34.160
Going to A transpose A kind of
squares the condition number.
00:11:34.160 --> 00:11:37.420
You get to an A transpose
A, that'll be our K,
00:11:37.420 --> 00:11:42.150
but its condition number
will somehow be squared
00:11:42.150 --> 00:11:46.680
and if the problem is
nice, you're OK with that.
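The squaring of the condition number is easy to see with a diagonal example, where the singular values sit right on the diagonal. A small sketch in Python (the matrix and the helper `cond_diagonal` are made up for illustration):

```python
# Illustration: forming A^T A squares the condition number.
# For a diagonal matrix the singular values are the diagonal entries,
# so conditioning can be read off directly.

def cond_diagonal(diag):
    """Condition number of a diagonal matrix: max/min of the magnitudes."""
    mags = [abs(d) for d in diag]
    return max(mags) / min(mags)

eps = 1e-3
A_diag = [1.0, eps]                 # A = diag(1, eps): one column much smaller
AtA_diag = [d * d for d in A_diag]  # A^T A = diag(1, eps^2)

print(cond_diagonal(A_diag))        # 1000.0
print(cond_diagonal(AtA_diag))      # about 10^6 -- the condition number squared
```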
00:11:46.680 --> 00:11:48.390
But if the problem is delicate?
00:11:48.390 --> 00:11:51.330
Now, what does
delicate mean for Au=b?
00:11:51.330 --> 00:11:55.910
I'm kind of giving you a
overview of the two problems
00:11:55.910 --> 00:11:57.690
before I start on this one.
00:11:57.690 --> 00:11:59.180
And then that one.
00:11:59.180 --> 00:12:01.610
So with this one
the problem was,
00:12:01.610 --> 00:12:03.520
is the matrix nearly singular.
00:12:03.520 --> 00:12:05.240
What does that mean?
00:12:05.240 --> 00:12:07.970
Does MATLAB tell you,
what does MATLAB tell you
00:12:07.970 --> 00:12:09.110
about the matrix?
00:12:09.110 --> 00:12:12.860
And that is measured by
the condition number.
00:12:12.860 --> 00:12:20.810
What's the issue here is, when
would this be a numerically
00:12:20.810 --> 00:12:22.600
difficult, sensitive problem?
00:12:22.600 --> 00:12:27.690
Well, the columns of
A are not orthonormal.
00:12:27.690 --> 00:12:30.980
If they are, then you're golden.
00:12:30.980 --> 00:12:33.100
If the columns of
A are orthonormal,
00:12:33.100 --> 00:12:36.600
then you're all set.
00:12:36.600 --> 00:12:39.150
So what's the opposite?
00:12:39.150 --> 00:12:42.680
Well, the extreme opposite
would be when the columns of A
00:12:42.680 --> 00:12:44.110
are dependent.
00:12:44.110 --> 00:12:46.440
If the columns of A
are linearly dependent
00:12:46.440 --> 00:12:50.520
and some column is a
combination of other columns,
00:12:50.520 --> 00:12:52.550
you're in trouble right away.
00:12:52.550 --> 00:12:56.060
So that's like big trouble,
that's like K being singular.
00:12:56.060 --> 00:12:57.640
Those are the cases.
00:12:57.640 --> 00:13:03.160
K singular here,
dependent columns here.
00:13:03.160 --> 00:13:05.470
Not full rank.
00:13:05.470 --> 00:13:11.160
So, again we're supposing
we're not facing disaster.
00:13:11.160 --> 00:13:12.825
Just near disaster.
00:13:12.825 --> 00:13:15.160
So we want to know,
is K near the singular
00:13:15.160 --> 00:13:17.490
and how to measure
that, and we want
00:13:17.490 --> 00:13:22.190
to know what to do
when the columns of A
00:13:22.190 --> 00:13:28.780
are independent
but maybe not very.
00:13:28.780 --> 00:13:31.470
And that would show up
in a large condition
00:13:31.470 --> 00:13:35.420
number for A transpose A. And
this happens all the time;
00:13:35.420 --> 00:13:39.410
if you don't set up
your problem well,
00:13:39.410 --> 00:13:43.920
your experimental
problem, you can easily
00:13:43.920 --> 00:13:49.390
get matrices A, whose columns
are not very independent.
00:13:49.390 --> 00:13:54.000
Measured by A transpose A
being close to singular.
00:13:54.000 --> 00:13:56.040
Right, everybody
here's got that idea.
00:13:56.040 --> 00:13:58.700
If the columns of
A are independent,
00:13:58.700 --> 00:14:01.170
A transpose A is non-singular.
00:14:01.170 --> 00:14:03.910
In fact, positive definite.
00:14:03.910 --> 00:14:09.980
Then, now we're talking about
when we have that property,
00:14:09.980 --> 00:14:16.760
but the columns of A
are not very independent
00:14:16.760 --> 00:14:20.530
and the matrix A transpose
A is not very invertible.
00:14:20.530 --> 00:14:23.350
OK, so that's what
the things are.
00:14:23.350 --> 00:14:27.910
And then just to,
because-- Say, on this one,
00:14:27.910 --> 00:14:30.350
what's the good thing to do?
00:14:30.350 --> 00:14:36.370
The good thing to do is
to call the qr code which
00:14:36.370 --> 00:14:43.750
gets its name because it takes
the matrix A and it factors it.
00:14:43.750 --> 00:14:49.150
Of course, we all know
that lu is the code here.
00:14:49.150 --> 00:14:56.490
It factors K. And qr is
the code that factors
00:14:56.490 --> 00:15:04.810
A into a very good guy, an
optimal Q. It couldn't be beat.
00:15:04.810 --> 00:15:08.430
And an R, that's upper
triangular, and therefore
00:15:08.430 --> 00:15:12.050
in the simplest form you
see exactly what you're
00:15:12.050 --> 00:15:15.040
dealing with.
00:15:15.040 --> 00:15:24.600
Let me continue this
least squares idea.
00:15:24.600 --> 00:15:27.870
Because Q and R are
probably not so familiar.
00:15:27.870 --> 00:15:30.730
Maybe the name
Gram-Schmidt is familiar?
00:15:30.730 --> 00:15:34.700
How many have seen Gram-Schmidt?
00:15:34.700 --> 00:15:37.500
The Gram-Schmidt idea
I'll describe quickly,
00:15:37.500 --> 00:15:40.530
but just do those words, do
those guys' names mean anything
00:15:40.530 --> 00:15:41.030
to?
00:15:41.030 --> 00:15:42.180
Yes, if they do.
00:15:42.180 --> 00:15:43.090
Quite a few.
00:15:43.090 --> 00:15:44.420
But not all.
00:15:44.420 --> 00:15:45.390
OK.
00:15:45.390 --> 00:15:47.220
OK.
00:15:47.220 --> 00:15:53.190
And can I just say,
also Gram-Schmidt
00:15:53.190 --> 00:15:57.390
is kind of our name for
getting these two factors.
00:15:57.390 --> 00:16:00.810
And you'll see why,
it's very cool to have,
00:16:00.810 --> 00:16:06.030
why this is a good first step.
00:16:06.030 --> 00:16:08.420
It costs a little
to take that step,
00:16:08.420 --> 00:16:12.550
but if you're interested
in safety, take it.
00:16:12.550 --> 00:16:16.470
It might cost twice as much
as solving the A transpose
00:16:16.470 --> 00:16:21.330
A equation, so you double the
cost by going this safer route.
00:16:21.330 --> 00:16:24.250
And double is not a
big deal, usually.
00:16:24.250 --> 00:16:26.750
OK.
00:16:26.750 --> 00:16:29.350
So, I was going to
say that Gram-Schmidt,
00:16:29.350 --> 00:16:31.270
that's the name everybody uses.
00:16:31.270 --> 00:16:37.090
But actually their method
is no longer the winner.
00:16:37.090 --> 00:16:41.210
And in Section 2.3 and
I'll try to describe
00:16:41.210 --> 00:16:46.540
a slightly better method
than the Gram-Schmidt idea
00:16:46.540 --> 00:16:50.720
to arrive at Q and R. But let's
suppose you got to Q and R.
00:16:50.720 --> 00:16:53.600
Then, what would be the
least squares equation?
00:16:53.600 --> 00:16:59.240
A transpose A u hat is
A transpose b, right?
00:16:59.240 --> 00:17:02.840
That's the equation
everybody knows.
00:17:02.840 --> 00:17:07.210
But now if we have A
factored into Q times R,
00:17:07.210 --> 00:17:09.790
let me see how that simplifies.
00:17:09.790 --> 00:17:14.710
So now A is QR, and A
transpose, of course,
00:17:14.710 --> 00:17:18.870
is R transpose Q
transpose, u hat, and this
00:17:18.870 --> 00:17:23.200
is R transpose Q transpose b.
00:17:23.200 --> 00:17:26.060
Same equation,
I'm just supposing
00:17:26.060 --> 00:17:29.350
that I've got A into
this nice form where
00:17:29.350 --> 00:17:35.580
Q has-- where I've taken these
columns that possibly lined up
00:17:35.580 --> 00:17:40.370
too close to each other like,
you know, angles of one degree.
00:17:40.370 --> 00:17:42.800
And I've got better angles.
00:17:42.800 --> 00:17:47.330
I've got-- These columns
of A are too close,
00:17:47.330 --> 00:17:52.690
so I spread them out, to columns
of Q that are at 90 degrees.
00:17:52.690 --> 00:17:54.430
Orthogonal columns.
00:17:54.430 --> 00:17:57.460
Now, what's the deal
with orthogonal columns?
00:17:57.460 --> 00:17:59.990
Let me just remember
the main point
00:17:59.990 --> 00:18:06.310
about Q. It has
orthogonal columns, right,
00:18:06.310 --> 00:18:11.360
and I'll call those
q's. q_1 to q_n.
00:18:11.360 --> 00:18:12.450
OK.
00:18:12.450 --> 00:18:14.340
And the good deal
is, what happens
00:18:14.340 --> 00:18:16.820
when I do Q transpose Q?
00:18:16.820 --> 00:18:23.610
So I do q_1 transpose, these
are now rows, to q_n transpose.
00:18:23.610 --> 00:18:27.460
q_1 columns to q_n.
00:18:27.460 --> 00:18:31.880
And what do I get when I
multiply those matrices?
00:18:31.880 --> 00:18:37.960
Q transpose times Q. I get
I. q_1 transpose q_1, that's
00:18:37.960 --> 00:18:40.680
the length of q_1
squared is one,
00:18:40.680 --> 00:18:44.090
and q_1 is orthogonal
to all the others.
00:18:44.090 --> 00:18:48.930
And then q_2, you
see I get the I. q_3,
00:18:48.930 --> 00:18:52.300
get an n by n-- I get
the identity matrix.
00:18:52.300 --> 00:18:58.280
Q transpose Q is I.
That's the beautiful,
00:18:58.280 --> 00:19:00.720
just remember that fact.
00:19:00.720 --> 00:19:02.760
And use it right away.
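That fact, Q transpose Q = I, is easy to check numerically. A quick sketch in Python using the simplest orthonormal columns around, the columns of a 2x2 rotation matrix (my own example, not one from the lecture):

```python
import math

# Checking Q^T Q = I for a matrix with orthonormal columns.
# A 2x2 rotation matrix is the simplest example: its columns are
# unit vectors and perpendicular to each other.

theta = 0.7
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

# Q^T Q: entry (i, j) is the dot product of column i with column j.
QtQ = [[sum(Q[k][i] * Q[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]

for i in range(2):
    for j in range(2):
        expected = 1.0 if i == j else 0.0
        assert abs(QtQ[i][j] - expected) < 1e-12
print("Q^T Q is the identity")
```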
00:19:02.760 --> 00:19:05.360
You see where it's
used, Q transpose Q
00:19:05.360 --> 00:19:08.460
is I in the middle of that.
00:19:08.460 --> 00:19:12.350
So I can just delete that,
I just have R transpose R
00:19:12.350 --> 00:19:14.410
and I can even
simplify this further.
00:19:14.410 --> 00:19:15.680
What can I do now?
00:19:15.680 --> 00:19:20.350
So that's the identity,
so I have R transpose R,
00:19:20.350 --> 00:19:25.640
but now I have an R transpose
over here, so what am I left with?
00:19:25.640 --> 00:19:28.760
I'll multiply both sides
by R transpose inverse
00:19:28.760 --> 00:19:35.360
and that will lead me to, R
u hat equals, knocking out
00:19:35.360 --> 00:19:40.690
R transpose inverse on
both sides, Q transpose b.
00:19:40.690 --> 00:19:45.280
Well, that's, our least
squares equation has become
00:19:45.280 --> 00:19:47.390
completely easy to solve.
00:19:47.390 --> 00:19:49.300
We've got a triangular
matrix here,
00:19:49.300 --> 00:19:52.030
I mean it's just
back substitution.
00:19:52.030 --> 00:19:57.800
It's just back substitution now,
and a Q transpose b over there.
00:19:57.800 --> 00:20:06.150
So a very simple solution for
our equation after the initial
00:20:06.150 --> 00:20:09.380
work of A=QR.
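The whole route just described, factor A = QR, then solve R u hat = Q transpose b by back substitution, can be sketched in Python. This uses classical Gram-Schmidt (the lecture notes a slightly better variant exists); the function names and the small line-fitting example are my own:

```python
import math

# Least squares via A = QR: Gram-Schmidt to get Q and R, then
# back substitution on R u = Q^T b.  Illustrative sketch only.

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt on the columns of A (A is a list of rows)."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    Q_cols, R = [], [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        v = a[:]
        for i, q in enumerate(Q_cols):
            R[i][j] = sum(qk * ak for qk, ak in zip(q, a))   # component along q_i
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]  # subtract it off
        R[j][j] = math.sqrt(sum(vk * vk for vk in v))
        Q_cols.append([vk / R[j][j] for vk in v])            # normalize to length 1
    return Q_cols, R

def lstsq_qr(A, b):
    """Solve min ||Au - b|| by R u = Q^T b with back substitution."""
    Q_cols, R = gram_schmidt_qr(A)
    n = len(R)
    rhs = [sum(q[i] * b[i] for i in range(len(b))) for q in Q_cols]  # Q^T b
    u = [0.0] * n
    for i in range(n - 1, -1, -1):  # R is upper triangular: back substitution
        u[i] = (rhs[i] - sum(R[i][k] * u[k] for k in range(i + 1, n))) / R[i][i]
    return u

A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]  # tall thin matrix: fit a line to 3 points
b = [1.0, 2.0, 4.0]
print(lstsq_qr(A, b))  # best-fit intercept and slope
```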
00:20:09.380 --> 00:20:10.990
OK.
00:20:10.990 --> 00:20:18.340
But very safe, Q is a
great matrix to work with.
00:20:18.340 --> 00:20:22.320
In fact people-- codes
are written so as
00:20:22.320 --> 00:20:27.920
to use orthogonal matrices
Q as often as they can.
00:20:27.920 --> 00:20:34.240
Alright, so you had a look
ahead of the computational side
00:20:34.240 --> 00:20:39.980
of 2.3, let me come back to
the most basic equations, just
00:20:39.980 --> 00:20:42.310
symmetric, positive
definite equations.
00:20:42.310 --> 00:20:53.050
Ku=f, and consider OK, how do
we measure whether K is nearly
00:20:53.050 --> 00:20:55.260
singular?
00:20:55.260 --> 00:20:57.400
OK, let me just
ask that question.
00:20:57.400 --> 00:21:00.190
That's the central question.
00:21:00.190 --> 00:21:07.060
How to measure,
when is K, which is,
00:21:07.060 --> 00:21:12.150
which we're assuming to be
symmetric positive definite,
00:21:12.150 --> 00:21:13.650
nearly singular?
00:21:13.650 --> 00:21:20.820
How to measure that?
00:21:20.820 --> 00:21:24.590
How to know whether we're
in any danger or not?
00:21:24.590 --> 00:21:25.480
OK.
00:21:25.480 --> 00:21:29.570
Well, first you
might think OK, if it
00:21:29.570 --> 00:21:33.200
is singular its
determinant is zero.
00:21:33.200 --> 00:21:36.310
So why not take its determinant?
00:21:36.310 --> 00:21:40.030
Well determinants,
as we've said,
00:21:40.030 --> 00:21:45.420
are not a good idea numerically.
00:21:45.420 --> 00:21:47.600
First, they're not
fun to compute.
00:21:47.600 --> 00:21:53.460
Second, they depend on the
number of unknowns, right?
00:21:53.460 --> 00:21:57.190
If I just have
twice the identity,
00:21:57.190 --> 00:22:01.140
suppose K is twice
the identity matrix.
00:22:01.140 --> 00:22:04.250
You could not get a better
problem than that, right?
00:22:04.250 --> 00:22:06.070
If K was twice the
identity matrix
00:22:06.070 --> 00:22:08.270
the whole thing's simple.
00:22:08.270 --> 00:22:12.330
Or if K is, suppose
K is one millionth
00:22:12.330 --> 00:22:15.190
of the identity matrix.
00:22:15.190 --> 00:22:18.970
OK, again, that's a
perfect problem, right?
00:22:18.970 --> 00:22:22.210
If K is one millionth
of the identity matrix,
00:22:22.210 --> 00:22:25.860
well to solve the problem you
just multiply by a million,
00:22:25.860 --> 00:22:27.330
you've got the answer.
00:22:27.330 --> 00:22:29.680
So those are good.
00:22:29.680 --> 00:22:34.490
And we have to have some
measure of bad or good
00:22:34.490 --> 00:22:37.160
that tells us those are good.
00:22:37.160 --> 00:22:40.030
OK.
00:22:40.030 --> 00:22:42.320
So the determinant won't do.
00:22:42.320 --> 00:22:44.850
Because the determinant
of 2 I would be two
00:22:44.850 --> 00:22:48.610
to the nth, the
size of the matrix.
00:22:48.610 --> 00:22:51.640
Or the determinant of
one millionth identity
00:22:51.640 --> 00:22:53.840
would be one millionth to the n.
00:22:53.840 --> 00:22:56.160
Those are not numbers we want.
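Those determinant values are easy to verify directly. A short Python sketch (the size n = 1000 is my own choice for illustration):

```python
# Why the determinant is a bad singularity measure: for K = c*I of size n
# it equals c**n, which explodes or underflows as n grows, even though the
# matrix is perfectly conditioned (all eigenvalues equal, ratio 1).

n = 1000
det_2I = 2.0 ** n          # determinant of 2*I: astronomically large
det_smallI = 1e-6 ** n     # determinant of (1e-6)*I: underflows to 0.0

print(det_2I > 1e300)      # True
print(det_smallI == 0.0)   # True -- yet this matrix is trivial to invert

# The condition number ignores the common scale factor:
cond = 2.0 / 2.0           # lambda_max / lambda_min for 2*I
print(cond)                # 1.0
```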
00:22:56.160 --> 00:22:57.310
What's a better number?
00:22:57.310 --> 00:22:59.950
Maybe you could
suggest a better number
00:22:59.950 --> 00:23:05.070
to measure how close the
matrix is to being singular.
00:23:05.070 --> 00:23:08.350
What would you say?
00:23:08.350 --> 00:23:10.370
I think if you think
about it a little,
00:23:10.370 --> 00:23:13.900
so what numbers do we know?
00:23:13.900 --> 00:23:17.330
Well eigenvalues jumps to mind.
00:23:17.330 --> 00:23:18.840
Eigenvalues jumps to mind.
00:23:18.840 --> 00:23:22.940
Because this matrix K, being
symmetric positive definite,
00:23:22.940 --> 00:23:31.520
has eigenvalues say lambda_1
less than lambda_2, so on.
00:23:31.520 --> 00:23:32.790
So on.
00:23:32.790 --> 00:23:40.020
Up to, so this is lambda_max
and that's lambda_min,
00:23:40.020 --> 00:23:42.840
and they're all positive.
00:23:42.840 --> 00:23:50.640
And so what's your idea of
whether the thing's nearly
00:23:50.640 --> 00:23:52.150
singular now?
00:23:52.150 --> 00:23:54.390
Look at lambda_1, right?
00:23:54.390 --> 00:23:56.920
If lambda_1 is near
zero, that somehow
00:23:56.920 --> 00:23:59.320
indicates near singular.
00:23:59.320 --> 00:24:02.600
So lambda_1 is sort
of a natural test.
00:24:02.600 --> 00:24:05.060
Not that I intend
to compute lambda_1,
00:24:05.060 --> 00:24:08.070
that would take longer
than solving the system.
00:24:08.070 --> 00:24:10.990
But an estimate of
lambda_1 would be enough.
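Such an estimate can come from a few steps of power iteration. A generic sketch in Python (this is not what MATLAB's condition estimator actually does, just the basic idea; applied to K inverse, the same iteration would estimate the smallest eigenvalue):

```python
import math

# Estimating an extreme eigenvalue cheaply: power iteration on the
# 3x3 -1, 2, -1 matrix converges to lambda_max.  Illustrative sketch.

K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

x = [1.0, 0.3, -0.5]          # arbitrary starting vector
for _ in range(50):
    y = matvec(K, x)
    norm = math.sqrt(sum(v * v for v in y))
    x = [v / norm for v in y]  # renormalize each step

lam_max = sum(xi * yi for xi, yi in zip(x, matvec(K, x)))  # Rayleigh quotient
print(lam_max)  # close to 2 + sqrt(2), the known largest eigenvalue
```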
00:24:10.990 --> 00:24:13.060
OK.
00:24:13.060 --> 00:24:17.270
But my answer is
not just lambda_1.
00:24:20.090 --> 00:24:22.510
And why is that?
00:24:22.510 --> 00:24:28.850
Because the examples I gave you,
when I had twice the identity,
00:24:28.850 --> 00:24:31.780
what would lambda_1
be there in that case?
00:24:31.780 --> 00:24:35.930
If my matrix K was beautiful,
twice the identity matrix,
00:24:35.930 --> 00:24:40.250
lambda_1 would be two.
00:24:40.250 --> 00:24:44.630
All the eigenvalues are two
for twice the identity matrix.
00:24:44.630 --> 00:24:48.340
Now if my matrix was one
millionth of the identity,
00:24:48.340 --> 00:24:50.450
again I have a
beautiful problem.
00:24:50.450 --> 00:24:52.540
Just as good, just
as beautiful problem.
00:24:52.540 --> 00:24:55.190
What's lambda_1 for that one?
00:24:55.190 --> 00:24:56.400
One millionth.
00:24:56.400 --> 00:24:59.770
It looks not as good, it
looks much more singular,
00:24:59.770 --> 00:25:05.220
but that's not really the case.
00:25:05.220 --> 00:25:11.320
So you could say, we'll
scale your matrix.
00:25:11.320 --> 00:25:15.740
And scaling the matrices, in
fact scaling individual rows
00:25:15.740 --> 00:25:19.350
and columns to get it,
you might have used,
00:25:19.350 --> 00:25:25.890
your unknowns might be
somehow in the wrong units.
00:25:25.890 --> 00:25:29.825
So one of the answers is way
big and the second component
00:25:29.825 --> 00:25:30.960
is way small.
00:25:30.960 --> 00:25:33.640
That's not good.
00:25:33.640 --> 00:25:36.630
So scaling is important.
00:25:36.630 --> 00:25:42.670
But even then you still
end up with a matrix K,
00:25:42.670 --> 00:25:47.990
some eigenvalues and I'll
tell you the condition number.
00:25:47.990 --> 00:25:55.670
The condition number of K is the
ratio of this guy to this one.
00:25:55.670 --> 00:26:00.170
In other words, two K, or a
million K, or one millionth K,
00:26:00.170 --> 00:26:02.320
all have the same
condition number.
00:26:02.320 --> 00:26:06.530
Because those problems
are identical problems.
00:26:06.530 --> 00:26:08.690
Multiplying by two,
multiplying by a million,
00:26:08.690 --> 00:26:12.150
dividing by a million
didn't change reality there.
00:26:12.150 --> 00:26:17.130
So if we're in floating
point, it just didn't change.
00:26:17.130 --> 00:26:19.210
So the condition
number is going to be
00:26:19.210 --> 00:26:20.880
lambda_max over lambda_min.
00:26:24.710 --> 00:26:31.310
And this is for symmetric
positive definite matrices.
00:26:31.310 --> 00:26:35.170
And MATLAB will print
out that number.
00:26:35.170 --> 00:26:37.230
Or print an estimate
for that number;
00:26:37.230 --> 00:26:39.740
as I said we don't want
to compute it exactly.
00:26:39.740 --> 00:26:41.720
Lambda_max over lambda_min.
00:26:41.720 --> 00:26:48.220
That measures how sensitive,
how tough your problem is.
00:26:48.220 --> 00:26:49.470
OK.
00:26:49.470 --> 00:26:55.450
And then I have to think,
how does that come in, why
00:26:55.450 --> 00:26:57.200
is that an appropriate number?
00:26:57.200 --> 00:26:59.790
I guess I've tried to
give an instinct for why
00:26:59.790 --> 00:27:05.410
it's appropriate, but we can
be pretty specific about it.
00:27:05.410 --> 00:27:08.770
In fact, let's do that now.
00:27:08.770 --> 00:27:13.510
So what would be the condition
number of twice the identity?
00:27:13.510 --> 00:27:15.720
It would be one.
00:27:15.720 --> 00:27:17.470
Perfectly conditioned problem.
00:27:17.470 --> 00:27:20.440
What would be the
condition, yeah, OK.
00:27:20.440 --> 00:27:24.020
What would be the condition
of a diagonal matrix?
00:27:24.020 --> 00:27:32.900
Suppose K was the diagonal
matrix two, three, four?
00:27:32.900 --> 00:27:36.650
The condition number of
that matrix is two, right?
00:27:36.650 --> 00:27:39.990
Lambda_max is sitting there,
lambda_min is sitting there,
00:27:39.990 --> 00:27:41.370
the ratio is two.
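For a diagonal matrix this is a one-line computation, since the eigenvalues sit right on the diagonal. In Python:

```python
# The diagonal example: K = diag(2, 3, 4) has its eigenvalues on the
# diagonal, so the condition number is just lambda_max / lambda_min.

eigenvalues = [2.0, 3.0, 4.0]
cond = max(eigenvalues) / min(eigenvalues)
print(cond)  # 2.0
```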
00:27:41.370 --> 00:27:44.950
Of course, any condition
number under 100 or 1000
00:27:44.950 --> 00:27:47.500
is no problem.
00:27:47.500 --> 00:27:52.509
Roughly the rule of
thumb is that the--
00:27:52.509 --> 00:27:53.550
What's the rule of thumb?
00:27:53.550 --> 00:27:55.700
I think that maybe
the number of digits
00:27:55.700 --> 00:28:03.020
in the condition number,
the number of digits-- Maybe
00:28:03.020 --> 00:28:06.660
if the condition number was
1000 you would be taking
00:28:06.660 --> 00:28:11.920
a chance that your last three
digits, single precision that
00:28:11.920 --> 00:28:18.160
would be three out of
six digits-- Somehow
00:28:18.160 --> 00:28:22.240
the log of the condition number,
the number of digits in it,
00:28:22.240 --> 00:28:26.340
is some measure of the
number of digits you'd lose.
00:28:26.340 --> 00:28:30.450
Because you're doing floating
point, of course, here.
00:28:30.450 --> 00:28:33.670
That's it, totally well
conditioned matrix.
00:28:33.670 --> 00:28:35.740
I wouldn't touch that one.
00:28:35.740 --> 00:28:37.630
I mean that's just fine.
00:28:37.630 --> 00:28:45.260
But we can figure out-- Here's
the point that I should make.
00:28:45.260 --> 00:28:48.490
Because here's the
computational science point.
00:28:48.490 --> 00:28:57.540
When this is our special
K, our -1, 2, -1 matrix.
00:28:57.540 --> 00:29:03.730
-1, 2, -1 matrix of size
n, the condition number
00:29:03.730 --> 00:29:09.430
goes like n squared.
00:29:09.430 --> 00:29:11.910
Because we know the
eigenvalues of that matrix,
00:29:11.910 --> 00:29:13.160
we could see it.
00:29:13.160 --> 00:29:18.340
The largest eigenvalue, when
n is big, say n is 1000.
00:29:18.340 --> 00:29:21.850
And we're dealing with our
standard second difference
00:29:21.850 --> 00:29:24.830
matrix, the most important
example I could possibly
00:29:24.830 --> 00:29:25.980
present.
00:29:25.980 --> 00:29:30.430
Then the largest eigenvalues--
those are actually there
00:29:30.430 --> 00:29:36.430
in Section 1.5, we
didn't do them in detail,
00:29:36.430 --> 00:29:39.280
we'll probably come back
to them when we need them.
00:29:39.280 --> 00:29:42.370
But the largest
one is about four.
00:29:42.370 --> 00:29:46.030
And the smallest
one is pretty small.
00:29:46.030 --> 00:29:50.120
The smallest one is
sort of like a sine
00:29:50.120 --> 00:29:54.390
squared of a small number.
00:29:54.390 --> 00:29:59.970
And so the smallest eigenvalue
is of order 1/n squared.
00:29:59.970 --> 00:30:04.840
And then when I do that
ratio of four, lambda_max
00:30:04.840 --> 00:30:09.620
is just like four, this
lambda_min is like 1/n squared,
00:30:09.620 --> 00:30:10.720
quite small.
00:30:10.720 --> 00:30:13.260
That ratio gives
me the n squared.
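[The n-squared growth can be checked directly from the known eigenvalues lambda_k = 2 - 2 cos(k*pi/(n+1)) of the -1, 2, -1 matrix; a Python sketch, with a function name of my own.]

```python
import math

def second_difference_condition(n):
    # Eigenvalues of the n-by-n -1, 2, -1 matrix K:
    # lambda_k = 2 - 2*cos(k*pi/(n+1)), for k = 1..n.
    eigs = [2 - 2 * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1)]
    return max(eigs) / min(eigs)

for n in (10, 100, 1000):
    # lambda_max is near 4, lambda_min is of order 1/n^2,
    # so the condition number grows like (2*(n+1)/pi)^2.
    print(n, second_difference_condition(n))
```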
00:30:13.260 --> 00:30:16.890
So there's an indication.
00:30:16.890 --> 00:30:19.680
Basically, that's not bad.
00:30:19.680 --> 00:30:26.330
That's not bad. If n is 1000, in
most engineering problems that
00:30:26.330 --> 00:30:29.160
gives you extremely,
extremely good accuracy.
00:30:29.160 --> 00:30:33.060
Condition number of a
million, you could live with.
00:30:33.060 --> 00:30:37.480
If n is 100, more typical,
condition number of 10,000
00:30:37.480 --> 00:30:44.330
is basically, I think OK.
00:30:44.330 --> 00:30:46.280
And I would go with it.
00:30:46.280 --> 00:30:53.820
But if the condition number is
way up, then I'd think again,
00:30:53.820 --> 00:30:55.970
did I model the problem well?
00:30:55.970 --> 00:30:56.950
OK.
00:30:56.950 --> 00:30:57.870
Alright.
00:30:57.870 --> 00:31:01.650
So that's-- Now I have to tell
you, why is this an appropriate measure?
00:31:01.650 --> 00:31:04.990
How do you look at the error?
00:31:04.990 --> 00:31:15.500
So can I write down a way
of approaching this Ku=f?
00:31:21.680 --> 00:31:24.950
So this is the first time I've
used the word round-off error.
00:31:24.950 --> 00:31:28.970
So in all the calculations,
in all the calculations
00:31:28.970 --> 00:31:36.710
that you have to do, to get
to u, those row operations,
00:31:36.710 --> 00:31:40.680
and you're doing them
to the right side too,
00:31:40.680 --> 00:31:42.960
so all those are floating
point operations in which
00:31:42.960 --> 00:31:44.910
small errors are sneaking in.
00:31:44.910 --> 00:31:50.520
And it was very unclear,
in the early years,
00:31:50.520 --> 00:31:54.090
whether the millions and
millions of operations that you
00:31:54.090 --> 00:31:59.150
do, additions, subtractions,
multiplications,
00:31:59.150 --> 00:32:06.480
in elimination, do those,
could those add up?
00:32:06.480 --> 00:32:10.960
If they don't cancel,
you've got problems, right?
00:32:10.960 --> 00:32:14.930
But in general you
would expect somehow
00:32:14.930 --> 00:32:17.120
that these are just
round off errors,
00:32:17.120 --> 00:32:19.760
you're making them millions
and millions of times,
00:32:19.760 --> 00:32:23.870
it would be pretty bad luck,
I mean like Red Sox twelfth
00:32:23.870 --> 00:32:29.790
inning bad luck to have
them pile up on you.
00:32:29.790 --> 00:32:32.050
So you don't expect that.
00:32:32.050 --> 00:32:37.120
Now, what you do solve, so
what you actually compute,
00:32:37.120 --> 00:32:41.110
so this is the exact.
00:32:41.110 --> 00:32:44.630
This would be the computed.
00:32:44.630 --> 00:32:48.610
Let me suppose that the
computed one is sort of a,
00:32:48.610 --> 00:32:49.900
there's an error.
00:32:49.900 --> 00:32:51.850
I'll call it delta u.
00:32:51.850 --> 00:32:54.580
That's our error.
00:32:54.580 --> 00:32:58.180
And it's equal to
an f plus delta f.
00:32:58.180 --> 00:33:08.740
And this is our round off
error, this is error we make,
00:33:08.740 --> 00:33:13.920
and this is error in the answer.
00:33:13.920 --> 00:33:18.620
In the final answer.
00:33:18.620 --> 00:33:21.570
OK, now I would like to
have an error equation.
00:33:21.570 --> 00:33:23.810
An equation for
that error, delta u,
00:33:23.810 --> 00:33:26.700
because that's what I'm
trying to get an idea of.
00:33:26.700 --> 00:33:28.050
No problem.
00:33:28.050 --> 00:33:32.140
If I subtract the exact
equation from this equation,
00:33:32.140 --> 00:33:35.290
I have a simple error equation.
00:33:35.290 --> 00:33:43.030
So this is my error equation.
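[Subtracting the two systems really does leave K(delta u) = delta f. A minimal numerical check, using a diagonal K of my own so each solve is one division:]

```python
# Exact system K u = f and perturbed system K (u + du) = f + df;
# subtracting one from the other gives the error equation K du = df.
K_diag = [4.0, 1.0, 0.01]   # a diagonal K: eigenvalues sit on the diagonal
f = [1.0, 1.0, 1.0]
df = [1e-6, -1e-6, 1e-6]    # small round-off-like perturbation of the right side

u = [fi / ki for fi, ki in zip(f, K_diag)]
u_perturbed = [(fi + dfi) / ki for fi, dfi, ki in zip(f, df, K_diag)]
du = [a - b for a, b in zip(u_perturbed, u)]

# Check the error equation K du = df, component by component.
print([ki * dui for ki, dui in zip(K_diag, du)])
```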
00:33:43.030 --> 00:33:45.550
OK.
00:33:45.550 --> 00:33:50.740
So I want to estimate
the size of that error,
00:33:50.740 --> 00:33:54.310
compared to the exact.
00:33:54.310 --> 00:33:58.010
You might say, and you
would be right, in saying,
00:33:58.010 --> 00:34:03.130
well wait a minute as you do
all these operations you're also
00:34:03.130 --> 00:34:09.050
creating errors in K. So I could
have a K plus delta K here,
00:34:09.050 --> 00:34:09.660
too.
00:34:09.660 --> 00:34:13.940
And actually it wouldn't
be difficult to deal with.
00:34:13.940 --> 00:34:18.960
And would certainly be there
in a proper error analysis.
00:34:18.960 --> 00:34:20.760
And it wouldn't make
a big difference,
00:34:20.760 --> 00:34:23.820
the condition number would
still be the right measure.
00:34:23.820 --> 00:34:28.020
So let me concentrate
here on the error
00:34:28.020 --> 00:34:32.260
in f, when subtracting
one from the other
00:34:32.260 --> 00:34:37.780
gives me this simple
error equation.
00:34:37.780 --> 00:34:46.990
So my question is, when is
that error, delta u, big?
00:34:46.990 --> 00:34:49.210
When do I get a large error?
00:34:49.210 --> 00:34:51.330
And delta f, I'm
not controlling.
00:34:51.330 --> 00:34:55.920
I might control the size, but
the details of it I can't know.
00:34:55.920 --> 00:35:03.730
So what delta f, now I'll
take worst possible here.
00:35:03.730 --> 00:35:07.620
Suppose this is of
some small size,
00:35:07.620 --> 00:35:13.150
ten to the minus something,
times some vector of errors,
00:35:13.150 --> 00:35:15.340
but I don't know anything
about that vector
00:35:15.340 --> 00:35:17.950
and therefore I'd better
take the worst possibility.
00:35:17.950 --> 00:35:19.800
What would be the
worst possibility?
00:35:19.800 --> 00:35:24.490
What right hand side would
give me the biggest delta u?
00:35:24.490 --> 00:35:24.990
Yeah.
00:35:24.990 --> 00:35:29.360
Maybe that's the
right question to ask.
00:35:29.360 --> 00:35:31.930
So now we're being a
little pessimistic.
00:35:31.930 --> 00:35:35.070
We're saying what
right hand side, what
00:35:35.070 --> 00:35:37.650
set of errors in
the measurements
00:35:37.650 --> 00:35:42.650
or from the calculations would
give me the largest delta u?
00:35:42.650 --> 00:35:49.380
Well, so let's see.
00:35:49.380 --> 00:35:51.710
I'm thinking the
worst case would
00:35:51.710 --> 00:35:58.890
be if delta f was an eigenvector
with the smallest eigenvalue,
00:35:58.890 --> 00:36:00.160
right?
00:36:00.160 --> 00:36:05.910
If delta f is an eigenvector,
is x_1, the eigenvector that
00:36:05.910 --> 00:36:11.340
goes with lambda_1,
the worst case
00:36:11.340 --> 00:36:17.330
would be for that to be
the first eigenvector.
00:36:17.330 --> 00:36:19.166
That would be the
worst direction.
00:36:19.166 --> 00:36:20.540
Of course, it
would be multiplied
00:36:20.540 --> 00:36:22.710
by some little number.
00:36:22.710 --> 00:36:26.080
Epsilon is every mathematician's
idea of a little number.
00:36:26.080 --> 00:36:26.590
OK.
00:36:26.590 --> 00:36:32.330
So epsilon x_1, then
what is delta u?
00:36:32.330 --> 00:36:39.610
Then the worst delta u is what?
00:36:39.610 --> 00:36:44.140
What would be the
solution to that equation?
00:36:44.140 --> 00:36:49.640
If the right-hand side was
epsilon times an eigenvector?
00:36:49.640 --> 00:36:51.780
This is the whole
point of eigenvectors.
00:36:51.780 --> 00:36:54.310
You can tell me what
the solution is.
00:36:54.310 --> 00:36:57.880
Is it a multiple of
that eigenvector?
00:36:57.880 --> 00:36:59.910
You bet.
00:36:59.910 --> 00:37:01.610
If this is an
eigenvector, then I
00:37:01.610 --> 00:37:03.540
can put in the same
eigenvector there,
00:37:03.540 --> 00:37:05.480
I just have to
scale it properly.
00:37:05.480 --> 00:37:14.380
So it'll be just this
right side, and what do I need?
00:37:14.380 --> 00:37:15.720
I think I just need lambda_1.
00:37:19.640 --> 00:37:24.830
The worst K inverse can
be is like 1/lambda_1.
00:37:24.830 --> 00:37:30.600
Right, if I claim that that's
the answer, if the right hand
00:37:30.600 --> 00:37:33.140
side is sort of in
the worst direction,
00:37:33.140 --> 00:37:36.380
then the answer is
that same right hand
00:37:36.380 --> 00:37:39.830
side divided by lambda_1.
00:37:39.830 --> 00:37:40.960
Let me just check.
00:37:40.960 --> 00:37:45.850
If I multiply both sides by
K, I have K delta u equals K,
00:37:45.850 --> 00:37:48.860
what's K*x_1?
00:37:48.860 --> 00:37:49.690
Everybody with me?
00:37:49.690 --> 00:37:51.470
What's K*x_1?
00:37:51.470 --> 00:37:54.280
Lambda_1*x_1, so the
lambda_1's cancel,
00:37:54.280 --> 00:37:56.750
then I got the
epsilon x_1 I want.
00:37:56.750 --> 00:37:59.360
So, no surprise.
00:37:59.360 --> 00:38:03.380
That's just telling us that
the worst error is an error
00:38:03.380 --> 00:38:06.310
in the direction of
the low eigenvector,
00:38:06.310 --> 00:38:10.530
and that error gets
amplified by 1/lambda_1.
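[With a diagonal K -- toy numbers of my own -- the eigenvectors are the coordinate axes, so the amplification by 1/lambda_1 is easy to watch:]

```python
# Worst direction for the perturbation: df = eps * x_1, where x_1 is
# the eigenvector belonging to lambda_min.
# Then du = (eps / lambda_min) * x_1 -- amplified by 1/lambda_min.
eps = 1e-8
K_diag = [4.0, 1.0, 0.001]   # lambda_min = 0.001, in the last slot
x1 = [0.0, 0.0, 1.0]         # its eigenvector (a coordinate axis)

df = [eps * c for c in x1]
du = [dfi / ki for dfi, ki in zip(df, K_diag)]   # solve K du = df

print(max(abs(c) for c in du))   # eps / lambda_min = 1e-05
```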
00:38:10.530 --> 00:38:11.240
OK.
00:38:11.240 --> 00:38:15.070
So that's brought lambda_1,
lambda_min into it,
00:38:15.070 --> 00:38:16.540
in the denominator.
00:38:16.540 --> 00:38:19.920
Now, here's another point.
00:38:19.920 --> 00:38:22.350
Second point now.
00:38:22.350 --> 00:38:25.790
So that would be
the absolute error.
00:38:25.790 --> 00:38:29.290
But we saw for those
factors of two and one
00:38:29.290 --> 00:38:33.980
millionth and so on, that
really it's the relative error.
00:38:33.980 --> 00:38:40.060
So I want to estimate not
the absolute error, delta u,
00:38:40.060 --> 00:38:45.050
but the error delta u
relative to u itself.
00:38:45.050 --> 00:38:47.460
So that if I scale
the whole problem,
00:38:47.460 --> 00:38:50.220
my relative error
wouldn't change.
00:38:50.220 --> 00:38:52.630
So, in other words,
what I want to do
00:38:52.630 --> 00:39:00.960
is ask in this case,
what's the right hand side?
00:39:00.960 --> 00:39:07.540
How big, yeah, I know, I want
to know how small u could be,
00:39:07.540 --> 00:39:08.280
right?
00:39:08.280 --> 00:39:10.410
I'm shooting for the worst.
00:39:10.410 --> 00:39:20.370
The relative error
is the size of u,
00:39:20.370 --> 00:39:21.925
maybe I should put it this way.
00:39:21.925 --> 00:39:26.320
The relative error is
the size of the error
00:39:26.320 --> 00:39:28.300
relative to the size of u.
00:39:28.300 --> 00:39:32.620
And I want to know
how big that could be.
00:39:32.620 --> 00:39:33.200
OK.
00:39:33.200 --> 00:39:37.280
So now I know how big delta u
could be, it could be that big.
00:39:37.280 --> 00:39:40.710
But u itself, how
big could u be?
00:39:40.710 --> 00:39:44.030
How small could u be, right?
u's in the denominator.
00:39:44.030 --> 00:39:45.940
So if I'm trying
to make this big,
00:39:45.940 --> 00:39:47.510
I'll try to make that small.
00:39:47.510 --> 00:39:50.990
So when is u the smallest?
00:39:50.990 --> 00:39:54.035
Over there I said when is delta
u the biggest, now I'm going
00:39:54.035 --> 00:39:56.730
to say when is u the smallest?
00:39:56.730 --> 00:40:03.260
What f would point
me in the direction
00:40:03.260 --> 00:40:07.390
in which u was the smallest?
00:40:07.390 --> 00:40:10.530
Got to be the other eigenvector.
00:40:10.530 --> 00:40:11.480
This end.
00:40:11.480 --> 00:40:17.620
The worst case
would be when this
00:40:17.620 --> 00:40:19.840
is in the direction of x_n.
00:40:19.840 --> 00:40:21.420
The top eigenvector.
00:40:21.420 --> 00:40:27.800
In that case, what is u?
00:40:27.800 --> 00:40:34.300
So I'm saying the worst f is
the one that makes u smallest,
00:40:34.300 --> 00:40:39.460
and the worst delta f is the
one that makes delta u biggest.
00:40:39.460 --> 00:40:41.540
I'm going for the
worst case here,
00:40:41.540 --> 00:40:52.040
so if the right side is x_n,
what is u? x_n over lambda_n.
00:40:52.040 --> 00:40:56.560
Because when I multiply by K, K
times x_n brings me a lambda_n,
00:40:56.560 --> 00:40:58.660
cancel that lambda_n
I get it right.
00:40:58.660 --> 00:41:04.980
So there is the smallest u, and
here is the largest delta u.
00:41:04.980 --> 00:41:09.220
And the epsilon is coming
from the method we use,
00:41:09.220 --> 00:41:13.930
so that's not involved
with the matrix K. So
00:41:13.930 --> 00:41:17.340
do you see on there, if
I'm trying to estimate
00:41:17.340 --> 00:41:23.360
delta u over u, that's big.
00:41:23.360 --> 00:41:27.610
The size of delta u over
this, the size of delta u
00:41:27.610 --> 00:41:31.400
over the size of u is what?
00:41:31.400 --> 00:41:36.190
Delta u has some epsilon
that measures the machine
00:41:36.190 --> 00:41:41.850
number of digits we're keeping,
the machine length, word length
00:41:41.850 --> 00:41:43.020
and so on.
00:41:43.020 --> 00:41:46.040
This is a unit
vector over lambda_1.
00:41:48.610 --> 00:41:50.850
That's delta u over
lambda_1, this u
00:41:50.850 --> 00:41:54.550
is in the denominator.
lambda_n-- u is down here,
00:41:54.550 --> 00:41:57.430
so the lambda_n flips up.
00:41:57.430 --> 00:42:00.940
Do you see it?
00:42:00.940 --> 00:42:07.550
By taking the worst case, I've
got the worst relative error.
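[Putting the two worst cases together, with eigenvalues of my own choosing: the relative error |du|/|u| comes out as eps times lambda_n over lambda_1, which is eps times the condition number.]

```python
# Worst f is along the top eigenvector x_n, so u is as small as x_n/lambda_n.
# Worst df is eps times the bottom eigenvector x_1, so du is as large as
# (eps/lambda_1) x_1.  The relative error is then eps * (lambda_n / lambda_1).
eps = 1e-8
lam_1, lam_n = 0.001, 4.0    # smallest and largest eigenvalues

u_size = 1.0 / lam_n         # ||u||  when f = x_n (a unit vector)
du_size = eps / lam_1        # ||du|| when df = eps * x_1

relative_error = du_size / u_size
condition_number = lam_n / lam_1
print(relative_error, eps * condition_number)
```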
00:42:07.550 --> 00:42:15.430
So that for other methods,
other f's and delta f's, they
00:42:15.430 --> 00:42:17.410
won't be the very worst ones.
00:42:17.410 --> 00:42:22.960
But here I've written
down what's the worst.
00:42:22.960 --> 00:42:27.570
And that's the reason that
this is the condition number.
00:42:27.570 --> 00:42:35.640
So I'm speaking about topics
that are there in 1.7, trying
00:42:35.640 --> 00:42:37.660
to give you the main point.
00:42:37.660 --> 00:42:40.950
The main point is,
look at relative error
00:42:40.950 --> 00:42:43.430
because that's the
right thing to look at.
00:42:43.430 --> 00:42:46.000
Look at the worst
cases, which are
00:42:46.000 --> 00:42:49.530
in the directions of the
top and bottom eigenvectors.
00:42:49.530 --> 00:42:52.670
In that case, that relative
error has this condition
00:42:52.670 --> 00:42:58.960
number, lambda_n/lambda_1, and
that's the good measure for how
00:42:58.960 --> 00:43:01.970
singular the matrix is.
00:43:01.970 --> 00:43:05.870
So one millionth of the identity
is not a nearly singular
00:43:05.870 --> 00:43:08.800
matrix, because lambda_max
and lambda_min are equal,
00:43:08.800 --> 00:43:11.040
that's a perfectly
conditioned matrix.
00:43:11.040 --> 00:43:14.990
This matrix has condition
number two, 4/2.
00:43:14.990 --> 00:43:16.360
It's quite good.
00:43:16.360 --> 00:43:21.670
This matrix is getting worse,
with an n squared in there,
00:43:21.670 --> 00:43:29.230
if n is big, and other matrices
could be worse than that.
00:43:29.230 --> 00:43:30.460
OK.
00:43:30.460 --> 00:43:37.560
So that's my discussion
of condition numbers.
00:43:37.560 --> 00:43:40.570
I'll add one more thing.
00:43:40.570 --> 00:43:43.750
These eigenvalues
were a good measure
00:43:43.750 --> 00:43:47.300
when my matrix was
symmetric positive definite.
00:43:47.300 --> 00:43:50.300
If I have a matrix
that I would never
00:43:50.300 --> 00:44:00.310
call K, a matrix like one, a million,
zero and one, OK, that,
00:44:00.310 --> 00:44:02.480
I would never write K for that.
00:44:02.480 --> 00:44:05.200
I would shoot myself
first before writing K.
00:44:05.200 --> 00:44:07.110
So what are the
eigenvalues lambda_min
00:44:07.110 --> 00:44:11.340
and lambda_max for that matrix?
00:44:11.340 --> 00:44:13.970
Are you up now on eigenvalues?
00:44:13.970 --> 00:44:15.490
We haven't done a
lot of eigenvalues
00:44:15.490 --> 00:44:19.190
but triangular matrices
are really easy.
00:44:19.190 --> 00:44:21.770
The eigenvalues of
that matrix are?
00:44:21.770 --> 00:44:23.470
One and one.
00:44:23.470 --> 00:44:28.370
The condition number should
not be 1/1, that would be bad.
00:44:28.370 --> 00:44:32.250
So instead, if
this was my matrix
00:44:32.250 --> 00:44:34.800
and I wanted to know
its condition number,
00:44:34.800 --> 00:44:36.530
what would I do?
00:44:36.530 --> 00:44:40.520
How would I define the
condition number of A?
00:44:40.520 --> 00:44:42.670
You know what I
do whenever I have
00:44:42.670 --> 00:44:46.770
a matrix that is not symmetric.
00:44:46.770 --> 00:44:50.700
I get to a symmetric matrix
by forming A transpose A,
00:44:50.700 --> 00:44:56.550
I get a K, I take
its condition number
00:44:56.550 --> 00:45:00.700
by my formula, which I
like in a symmetric case,
00:45:00.700 --> 00:45:05.950
and then I would
take the square root.
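[The recipe for a non-symmetric A, as a 2-by-2 Python sketch: form K = A^T A, take its lambda_max over lambda_min, then the square root. The example matrix below is mine, not the one on the board.]

```python
import math

def cond_via_ata(a, b, c, d):
    # cond(A) = sqrt(lambda_max(A^T A) / lambda_min(A^T A))
    # for the 2x2 matrix A = [[a, b], [c, d]].
    k11 = a * a + c * c          # entries of K = A^T A
    k12 = a * b + c * d
    k22 = b * b + d * d
    tr, det = k11 + k22, k11 * k22 - k12 * k12
    disc = math.sqrt(tr * tr - 4 * det)
    lam_max, lam_min = (tr + disc) / 2, (tr - disc) / 2
    return math.sqrt(lam_max / lam_min)

# A triangular A with eigenvalues 1 and 1, yet a big condition number:
print(cond_via_ata(1, 100, 0, 1))   # far above 1
```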
00:45:05.950 --> 00:45:08.840
So that would be a
pretty big number.
00:45:08.840 --> 00:45:13.770
For this, for that matrix A, the
condition number of that matrix
00:45:13.770 --> 00:45:16.470
A is up around 10^6.
00:45:16.470 --> 00:45:19.850
The condition number of that
matrix is up around 10^6 even
00:45:19.850 --> 00:45:24.070
though its eigenvalues are one
and one because when I form A
00:45:24.070 --> 00:45:29.880
transpose A, those eigenvalues
will jump all over.
00:45:29.880 --> 00:45:37.670
And then probably this thing
will have eigenvalues way up
00:45:37.670 --> 00:45:41.370
and the condition
number will be high.
00:45:41.370 --> 00:45:42.170
OK.
00:45:42.170 --> 00:45:45.070
So that's a little, you'll
meet condition number.
00:45:45.070 --> 00:45:49.410
MATLAB shows it, and you
naturally wonder what is this?
00:45:49.410 --> 00:45:52.050
Well, if it's a positive
definite matrix,
00:45:52.050 --> 00:45:54.640
it's just the ratio
lambda_max to lambda_min
00:45:54.640 --> 00:45:57.010
and it tells you
as it gets bigger
00:45:57.010 --> 00:46:00.320
that means the matrix
is tougher to work with.
00:46:00.320 --> 00:46:01.660
OK.
00:46:01.660 --> 00:46:07.020
We have just five minutes left
to say a few words about QR.
00:46:07.020 --> 00:46:11.970
OK, can I do that in
just a few minutes?
00:46:11.970 --> 00:46:16.140
And much more is in the codes.
00:46:16.140 --> 00:46:19.800
OK, what's the deal with QR?
00:46:19.800 --> 00:46:28.910
I'm starting with a
matrix A. Let's make
00:46:28.910 --> 00:46:32.540
it two by two, a two by two.
00:46:32.540 --> 00:46:34.750
OK, it's got a
couple of columns.
00:46:34.750 --> 00:46:36.100
Can I draw them?
00:46:36.100 --> 00:46:39.050
So it's got a column there,
that's its first column
00:46:39.050 --> 00:46:40.790
and it's got another
column there,
00:46:40.790 --> 00:46:44.320
maybe that's not a very
well conditioned matrix.
00:46:44.320 --> 00:46:47.840
Those are the columns of
A. Plotted in the plane,
00:46:47.840 --> 00:46:48.980
two-space.
00:46:48.980 --> 00:46:55.530
OK, so now the Gram-Schmidt
idea is: out of those columns,
00:46:55.530 --> 00:46:57.910
get orthonormal columns.
00:46:57.910 --> 00:47:02.560
Get from A to Q. So the
Gram-Schmidt idea is, out
00:47:02.560 --> 00:47:09.650
of these two vectors, two axes
that are not at 90 degrees,
00:47:09.650 --> 00:47:12.340
produce vectors that
are at 90 degrees.
00:47:12.340 --> 00:47:16.330
Actually, you can guess
how you're going to do it.
00:47:16.330 --> 00:47:19.090
Let me say, OK I'll settle
for that direction, that
00:47:19.090 --> 00:47:22.400
can be my first direction, q_1.
00:47:22.400 --> 00:47:26.870
What should be q_2?
00:47:26.870 --> 00:47:30.080
If that direction is
the right one for q_1
00:47:30.080 --> 00:47:36.270
I say OK I'll settle for
that, what's the q_2 guy?
00:47:36.270 --> 00:47:38.850
Well, what am I going to do?
00:47:38.850 --> 00:47:43.500
I mean, Gram thought of it
and Schmidt thought of it.
00:47:43.500 --> 00:47:46.210
Schmidt was a little
later, but it wasn't,
00:47:46.210 --> 00:47:49.830
like that hard to think of it.
00:47:49.830 --> 00:47:52.700
What do you do here?
00:47:52.700 --> 00:47:54.470
Well, we know how
to do projections
00:47:54.470 --> 00:47:59.010
from this least squares.
00:47:59.010 --> 00:48:01.600
What am I looking for?
00:48:01.600 --> 00:48:03.880
Subtract off the
projection, right.
00:48:03.880 --> 00:48:07.510
Take the projection
and subtract it off
00:48:07.510 --> 00:48:12.040
and be left with the component
that's perpendicular.
00:48:12.040 --> 00:48:17.500
So this will be the q_1
direction and this will be,
00:48:17.500 --> 00:48:20.550
this guy e, what we
called e, would tell me
00:48:20.550 --> 00:48:22.750
the q_2 direction.
00:48:22.750 --> 00:48:26.440
And then we could make
those into unit vectors
00:48:26.440 --> 00:48:29.750
and we'd be golden.
00:48:29.750 --> 00:48:34.750
And if we did that we would
discover that the original a_1,
00:48:34.750 --> 00:48:40.160
a_2-- so this is
the original first column,
00:48:40.160 --> 00:48:45.040
and the original second
column, would be the good q_1
00:48:45.040 --> 00:48:53.000
and the good q_2 times some
matrix R. So here's our A=QR.
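[The two-column picture translates into a few lines of code -- a pure-Python sketch of classical Gram-Schmidt, names mine: keep a_1's direction for q_1, subtract from a_2 its projection on q_1, and normalize the leftover e to get q_2.]

```python
import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def gram_schmidt_2(a1, a2):
    # Classical Gram-Schmidt on two columns: A = Q R with R triangular.
    r11 = math.sqrt(dot(a1, a1))
    q1 = [x / r11 for x in a1]                   # a1's direction, normalized
    r12 = dot(q1, a2)                            # component of a2 along q1
    e = [x - r12 * q for x, q in zip(a2, q1)]    # perpendicular leftover
    r22 = math.sqrt(dot(e, e))
    q2 = [x / r22 for x in e]
    return q1, q2, [[r11, r12], [0.0, r22]]

q1, q2, R = gram_schmidt_2([1.0, 1.0], [0.0, 1.0])
print(dot(q1, q2))   # essentially 0: the new columns are orthogonal
```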
00:48:53.000 --> 00:48:57.620
It's your chance to see this
second major factorization
00:48:57.620 --> 00:48:59.470
of linear algebra.
00:48:59.470 --> 00:49:03.520
LU being the first,
QR being the second.
00:49:03.520 --> 00:49:05.240
So what's up?
00:49:05.240 --> 00:49:09.380
Well, compare first columns.
00:49:09.380 --> 00:49:12.240
First columns, I didn't
change direction.
00:49:12.240 --> 00:49:17.580
So all I have here is
some scaling r_(1,1), zero.
00:49:17.580 --> 00:49:20.700
Some number times q_1 is a_1.
00:49:20.700 --> 00:49:23.880
That direction was fine.
00:49:23.880 --> 00:49:31.320
The second direction, q_2 and
a_2, those involve also a_1.
00:49:31.320 --> 00:49:37.550
So there's an r_(1,2)
and an r_(2,2) there.
00:49:37.550 --> 00:49:40.570
The point was, this
came out triangular.
00:49:40.570 --> 00:49:42.990
And that's what
makes things good.
00:49:42.990 --> 00:49:46.170
It came out triangular
because of the order
00:49:46.170 --> 00:49:48.300
that Gram and Schmidt worked.
00:49:48.300 --> 00:49:51.250
Gram and Schmidt
settled the first one
00:49:51.250 --> 00:49:53.070
in the first direction.
00:49:53.070 --> 00:49:58.180
Then they settled the first two
in the first two directions.
00:49:58.180 --> 00:50:00.050
If we were in three
dimensions, there'd
00:50:00.050 --> 00:50:04.350
be an a_3 somewhere here,
coming out of the board.
00:50:04.350 --> 00:50:09.580
And then the q_3 would come
straight out of the board.
00:50:09.580 --> 00:50:11.650
Right?
00:50:11.650 --> 00:50:15.910
If you just see that you've got
Gram-Schmidt completely. a_1
00:50:15.910 --> 00:50:17.190
is there.
00:50:17.190 --> 00:50:19.900
So is q_1. a_2 is there.
00:50:19.900 --> 00:50:25.720
I'm in the board still, in
the plane of a_1 and a_2,
00:50:25.720 --> 00:50:31.300
is the plane of q_1 and q_2, I'm
just getting right angles in.
00:50:31.300 --> 00:50:34.850
a_3, the third column in
a three by three case,
00:50:34.850 --> 00:50:37.760
is coming out at some angle.
00:50:37.760 --> 00:50:42.100
I want q_3 to come out
at a 90 degree angle.
00:50:42.100 --> 00:50:50.140
So that q_3 will involve some
combination of all the a's.
00:50:50.140 --> 00:50:52.380
So if it was three
by three, this
00:50:52.380 --> 00:50:57.450
would grow to q_1, q_2, q_3,
and this would then have
00:50:57.450 --> 00:50:59.530
three guys in its third column.
00:50:59.530 --> 00:51:02.970
But maybe you see that picture.
00:51:02.970 --> 00:51:06.670
So that's what
Gram-Schmidt achieves.
00:51:06.670 --> 00:51:09.940
And I just can't
let time run out
00:51:09.940 --> 00:51:14.800
without saying that this
is a pretty good way.
00:51:14.800 --> 00:51:19.330
Actually, nobody thought there
was a better one for centuries.
00:51:19.330 --> 00:51:23.330
But then a guy named
Householder came up
00:51:23.330 --> 00:51:29.480
with a different way, and a
numerically little better way.
00:51:29.480 --> 00:51:31.000
Numerically a little better way.
00:51:31.000 --> 00:51:33.440
So this is the Gram-Schmidt way.
00:51:33.440 --> 00:51:35.620
Can I just put
those words up here?
00:51:35.620 --> 00:51:41.760
So there's the Gram-Schmidt,
the classical Gram-Schmidt idea,
00:51:41.760 --> 00:51:47.150
which was what I described, was
the easy one, easy to describe.
00:51:47.150 --> 00:51:53.090
And then there's a method
called Householder, just named
00:51:53.090 --> 00:51:56.220
after him, that
MATLAB would follow.
00:51:56.220 --> 00:52:01.310
That every good QR code now
uses Householder matrices.
00:52:01.310 --> 00:52:04.440
It achieves the same results.
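[A glimpse of Householder's step, as a sketch of one reflection with an example vector of my own: instead of projecting, reflect the column onto a multiple of (1, 0) using H = I - 2 v v^T / (v^T v), which is exactly orthogonal.]

```python
import math

def householder_reflect(a):
    # Apply the Householder reflection that sends the 2-vector a
    # onto (+-|a|, 0), with the sign chosen to avoid cancellation.
    norm_a = math.hypot(a[0], a[1])
    s = -norm_a if a[0] >= 0 else norm_a
    v = [a[0] - s, a[1]]                          # reflection axis v = a - s*e1
    vv = v[0] * v[0] + v[1] * v[1]
    coef = 2 * (v[0] * a[0] + v[1] * a[1]) / vv
    return [a[0] - coef * v[0], a[1] - coef * v[1]]

print(householder_reflect([3.0, 4.0]))   # [-5.0, 0.0]: length 5 preserved
```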
00:52:04.440 --> 00:52:06.120
And if I had a
little bit more time
00:52:06.120 --> 00:52:08.400
I could draw a picture
of what it does.
00:52:08.400 --> 00:52:09.720
But there you go.
00:52:09.720 --> 00:52:15.560
So that's my lecture
on, my quick lecture
00:52:15.560 --> 00:52:17.180
on numerical linear algebra.
00:52:17.180 --> 00:52:23.010
These two essential points and
I'll see you this afternoon.
00:52:23.010 --> 00:52:26.270
Let me bring those quiz
questions down again,
00:52:26.270 --> 00:52:29.480
for any discussion
about the quiz.