# Lecture 3: Orthonormal Columns in Q Give Q’Q = I

## Description

This lecture focuses on orthogonal matrices and subspaces. Professor Strang reviews the four fundamental subspaces: column space C(A), row space C(Aᵀ), nullspace N(A), left nullspace N(Aᵀ).

## Summary

Examples:

• Rotations
• Reflections
• Haar wavelets
• Discrete Fourier Transform (DFT)
• Complex inner product

Related section in textbook: I.5

Instructor: Prof. Gilbert Strang

ANNOUNCER: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: So we're really moving along this review of the highlights of linear algebra. And today it's matrices Q. They get that name there. They have orthonormal columns. So that's what one looks like.

And then the key fact, orthonormal columns translates directly into that simple fact that you just keep remembering every time you see Q transpose Q, you've got the identity matrix. Let's just see why. So Q transpose would be-- I'll take those columns and make them into rows.

And then I multiply by Q with the columns. And what do I get? Well hopefully, I get the identity matrix. Why? Because-- oh yeah, the normal part tells me that the length of each vector-- that's the length squared Q transpose Q-- the length squared is one.

So that gives me the one in the identity matrix all along the diagonal. And then Q transpose times a different Q is zero. That's the ortho part. So that gives me the zeros. So that's a simple identity, but it translates from a lot of words into a simple expression.

Now does that mean that in the other order, Q, Q transpose, is that the identity? So that's a question to think about. Is Q, Q transpose equal the identity? Question, sometimes yes, sometimes no-- easy to tell which.

The answer is yes when Q is square. If Q is a square matrix, this is saying that a square matrix Q has that inverse on its left. But for square matrices, a left inverse, Q transpose, is also a right inverse. So for a square matrix, if you have an inverse that works on one side, it will work on the other side. So the answer is yes in that case.
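The difference between the rectangular and square cases can be checked numerically. The sketch below uses NumPy; the specific matrices `Q_rect` and `Q_sq` are illustrative choices, not matrices from the lecture.

```python
import numpy as np

# Illustrative 3x2 matrix with orthonormal columns (not from the lecture).
Q_rect = np.array([[1.0,  2.0],
                   [2.0,  1.0],
                   [2.0, -2.0]]) / 3.0

# Adding a third orthonormal column makes it square, hence orthogonal.
Q_sq = np.array([[1.0,  2.0,  2.0],
                 [2.0,  1.0, -2.0],
                 [2.0, -2.0,  1.0]]) / 3.0

# Q^T Q = I holds whenever the columns are orthonormal.
print(np.allclose(Q_rect.T @ Q_rect, np.eye(2)))  # True
# Q Q^T is 3x3 with rank 2 here, so it cannot be the identity.
print(np.allclose(Q_rect @ Q_rect.T, np.eye(3)))  # False
# In the square case both orders give the identity.
print(np.allclose(Q_sq.T @ Q_sq, np.eye(3)))      # True
print(np.allclose(Q_sq @ Q_sq.T, np.eye(3)))      # True
```

For the rectangular matrix, Q Qᵀ is a projection onto the column space rather than the identity.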

And then in that case, that's the case when we call Q-- well, we really-- I don't know what the right name would be, but here is the name everybody uses, an orthogonal matrix. And that's only in the square case. Q is an orthogonal matrix.

Do you want to just see an example of how that works? So if Q is rectangular-- let me do a rectangular Q and a square Q. So I think there must be a board up there somewhere. Here. OK, square.

All right. Good to see some orthogonal matrices, because my message is that they are really important in all kinds of applications. Let's start two by two. I can think of two different ways to get an orthogonal matrix. That's a two by two matrix.

And one of them you will know immediately, cos theta sine theta. So that's a unit vector. It's normalized. Cos squared plus sine squared is one. And this guy has to be orthogonal to it.

So I'll make that minus sine theta and cos theta. Those are both length one, they're orthogonal, then this is my Q. And the inverse of Q will be the transpose. The transpose would put the minus sign down here and would produce the inverse matrix.

And what does that particular matrix represent? Geometrically, where do we see that matrix? It's a rotation, thank you. It's a rotation of the whole plane by theta. Yeah.

So if I apply that to one, zero for example, I get the first column, which is cos theta sine theta. And that's just-- let me draw a picture. That vector one, zero has gotten rotated up to-- so there's the one, zero. And there, it got rotated through an angle theta.

And similarly, zero, one will get rotated through an angle theta to there. So the whole plane rotates. Oh, that makes me remember a highly important, very important property of Q. It doesn't change length. The length of any vector is the same after you rotate it.
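A small NumPy sketch of the rotation matrix described above. The function name `rotation` and the angle of 30 degrees are assumptions chosen for illustration.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation: columns (cos, sin) and (-sin, cos)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

theta = np.pi / 6                 # illustrative angle
Q = rotation(theta)
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

print(Q @ e1)                               # first column: (cos theta, sin theta)
print(Q @ e2)                               # second column: (-sin theta, cos theta)
print(np.allclose(Q.T @ Q, np.eye(2)))      # orthonormal columns
print(np.allclose(np.linalg.inv(Q), Q.T))   # the inverse is the transpose
```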

The length of any vector is the same after you multiply by Q. Can I just do that? I claim any x, any vector x, I want to look at the length of Qx. And I claim it has the same length as x.

Actually, that's the reason in computations that orthogonal matrices are so much loved, because no overflow can happen with orthogonal matrices. The lengths don't change. I can multiply by any number of orthogonal matrices and the lengths don't change. Can we just see why that's true?

So what do we have to go on? What we have to go on is Q transpose Q equal I. Whatever we're going to prove, it's got to come out of that, because that's all we know. So how do I use that to get that one?

Well, we haven't said a whole lot about length, but you'll see it all now. It'll be easier to prove that the squares are the same. So what's the matrix expression for the length squared? What's the right hand side of that equation? X transpose x, right?

X transpose x gives me the sum of the squares. Pythagoras says that's the length squared. So that right hand side is x transpose x. What's the left side?

It's the length squared of this, of Qx. So it must be the same as Qx transpose Qx. And the claim is that that equation holds. And do you see it? So any property from Q has to just come out directly from that.

Where is it here? Do I just like, push away a little bit at that left hand side and see it? Qx transpose is the same as x transpose Q transpose. And Qx is Qx.

And now I'm seeing-- well, you might say, wait a minute. The parentheses were there and there. But I say the most important law for matrix multiplication is you can move parentheses or throw them away.

Let's throw them away. So in here I'm seeing Q transpose Q, which is the identity. So it's true, yeah. So that means that you never underflow or overflow when you're multiplying by Q. Every numerical algorithm is written to use orthogonal matrices wherever it can.
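Written out as one line, the argument spoken above is the following chain of equalities. Only Qᵀ Q = I and the rule for transposing a product are used.

```latex
\|Qx\|^2 = (Qx)^{\mathsf T}(Qx)
         = x^{\mathsf T} Q^{\mathsf T} Q\, x
         = x^{\mathsf T} I\, x
         = x^{\mathsf T} x
         = \|x\|^2 .
```

Taking square roots gives ‖Qx‖ = ‖x‖ for every x. The derivation never uses squareness, so it holds for any Q with orthonormal columns.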

And here's the first example. I think it may be good for me to think of other examples or for us to think of other examples of orthogonal matrices. So I'm using that word orthogonal matrix. We should really be saying orthonormal.

And I'm really thinking mostly of square ones. So in this square case when Q transpose is Q inverse. Of course, that fact makes it easy to solve all equations that have Q as a coefficient matrix, because you want the inverse and you just use the transpose.

Let's just take some minutes to think of examples of Q's. If they're so important, there have to be interesting examples. And that was a first one.

Now there's one more two by two example that you should know. Do you know what that would be? This will be an example two, and it's also going to be only two by two and real. And what possibility have I got left here?

I'll use this same first column, cos theta sine theta, because that's more or less any unit vector in two dimensions, this has got that form. So what do you propose for the second column? Yes?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah, put the minus sign down here. So you think does that make any difference? So sine theta and minus cos theta. I don't know if you've ever looked at that matrix. We're trying to collect together a few matrices that are worth knowing, are worth looking at.

Now what's happened here? You may say that was a trivial change, which it kind of was. But it's a different matrix now. It's not a rotation anymore. That's not a rotation.

And yeah, somehow now it's symmetric. And yeah, its eigenvectors must be something or other. We'll get to those.

But what does that matrix do? I don't know if you've seen it. If you haven't, it doesn't jump out, but it's an important case. This is a reflection matrix.

Notice that its determinant is minus one. You have minus cos squared theta, minus sine squared theta. Its determinant is minus one. There's some eigenvalue coming up that's got a minus.

So what do I mean by a reflection matrix? Let me draw the plane. So one, zero, let's follow that, follow again.

One, zero, where does that go? That gives me the first column. So as before, it goes to cos theta sine theta. And when I say reflection, let me put the mirror into the picture so you see what reflection it is. The mirror is along here at angle theta over two, theta over two line.

So sure enough, one, zero at angle zero got reflected into a unit vector at angle theta, and halfway between was theta over two line. That's OK. Now what about the other guy, zero, one? Here's zero, one. I multiply that.

Can I put the zero, one up here, so your eye does the multiplication? Where is the result of zero, one? What's the output from Q applied to zero, one? Sine theta minus cos theta, right? It's the second column.

And so where is that? Well, it's perpendicular to that guy. That's what I know. That was the point, that the two columns are perpendicular.

So it must go down this way, right? Sine theta, minus cos theta. And it doesn't change the length. All these facts that we just learned are key. So there's zero, one. And it goes to this guy, which is whatever that second column is, sine theta minus cos theta.

And if you check that, actually, gosh! This is like plane geometry. I believe-- it never occurred to me before-- but I believe it, that this angle going down to there, that that goes straight through, and that the halfway one is that line. Yeah, I think that picture has got it. And I think it's in the notes.
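The reflection picture can be verified numerically. This is a sketch with NumPy; the names `reflection`, `mirror`, and `normal` and the angle of 60 degrees are assumptions for illustration. Vectors along the theta over two line should be unchanged, and vectors perpendicular to it should flip sign.

```python
import numpy as np

def reflection(theta):
    """2x2 reflection: columns (cos, sin) and (sin, -cos)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  s],
                     [s, -c]])

theta = np.pi / 3                                            # illustrative angle
Q = reflection(theta)
mirror = np.array([np.cos(theta / 2), np.sin(theta / 2)])    # along the mirror line
normal = np.array([-np.sin(theta / 2), np.cos(theta / 2)])   # perpendicular to it

print(np.allclose(Q, Q.T))                  # symmetric
print(np.allclose(Q.T @ Q, np.eye(2)))      # orthogonal
print(np.isclose(np.linalg.det(Q), -1.0))   # determinant is minus one
print(np.allclose(Q @ mirror, mirror))      # mirror line is fixed (eigenvalue 1)
print(np.allclose(Q @ normal, -normal))     # perpendicular direction flips (eigenvalue -1)
```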

So that's a reflection matrix. Well, that's a two by two reflection matrix. Would you like to see some other matrices like this one, but larger? They're named after a guy named Householder. So these are Householder reflections.

What am I doing here? I'm collecting together some orthogonal matrices that are useful and important. And Householder found a whole bunch of them. And his algorithm is a much used part of numerical linear algebra.

So he started with a unit vector. Start with a unit vector u transpose u equal one. So the length of the vector is one. And then he created-- let's name it after him-- H.

He created this matrix, the identity minus two u, u transpose. And I believe that that's a really useful matrix. I think this review is like going beyond 18.06, into what ones are really worth knowing, worth knowing individually.

Could we just check what are the properties of Householder's reflection, of that I minus two u, u transpose? You recognize here a column times a row. So that's a matrix. And what could you tell me about that matrix u, u transpose? It's yeah?

What can we say about H? So I guess I'm believing that H is an orthogonal matrix, otherwise it wouldn't be here today. So I believe that-- and that not only is it orthogonal, it is also-- have a look at it-- symmetric. It's also symmetric. The identity is symmetric. u, u transpose is symmetric.

So this is a family of symmetric orthogonal matrices. And that was one of them. That's a symmetric orthogonal matrix. These matrices are really great to have.

In using linear algebra, you just get a collection of useful matrices that you can call on. And these are definitely one family. Well, it's obviously symmetric. Shall we check that it's orthogonal?

So to check that it's orthogonal, so I'm going to check that H transpose H is the identity. Can I just check that? Well, H transpose is the same as H because it was symmetric. So I'm going to square this guy. This is really H times H.

I'm squaring it. And what do I get if I square-- So I get-- I hope I get the identity, but let's see it. What do I get when I square this? I get little-- multiply it out. So I times I is I.

And then I get some number of u, u transposes. How many do I get from that? So I'm squaring this thing because H transpose H is the same as H times H. So I'm squaring it.

So what do I put here? Four, thanks. And now I've got this guy squared with a plus. So that's four u, u transpose, u, u transpose.

Yeah, I'm totally realizing I've practiced for a lifetime doing these dinky little calculations. But they are dinky. And you'll get the hang of it quickly.

Now what am I hoping out of that bottom line? That it is I. We're hoping we're going to get I. Do we get I? Yes.

Who sees how to get I out of that thing? Yeah, u transpose u in here is a number. That was u-- that was column times row times column times row. And what I look at in the middle here is row times column. And that's the number one, right, because it's there.

So that's one, and then I have minus 4 of that plus 4 of that. They cancel each other, and I get I. So those are good matrices. We'll use them.
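A short NumPy check of the Householder properties just derived. The helper name `householder` and the vector `v` are hypothetical choices for illustration. The last lines also confirm that the 2 by 2 reflection from earlier is the case where u is the unit vector perpendicular to the theta over two mirror line.

```python
import numpy as np

def householder(v):
    """Return H = I - 2 u u^T, where u is v scaled to unit length."""
    u = v / np.linalg.norm(v)
    return np.eye(len(u)) - 2.0 * np.outer(u, u)

v = np.array([1.0, 2.0, 2.0, 4.0])   # arbitrary nonzero vector (its length is 5)
u = v / np.linalg.norm(v)
H = householder(v)

print(np.allclose(H, H.T))             # symmetric
print(np.allclose(H @ H, np.eye(4)))   # H^T H = H^2 = I, so H is orthogonal
print(np.allclose(H @ u, -u))          # u itself is reflected to -u

# The 2x2 reflection with columns (cos, sin), (sin, -cos) is a Householder matrix.
theta = np.pi / 3
n = np.array([-np.sin(theta / 2), np.cos(theta / 2)])
R = np.array([[np.cos(theta),  np.sin(theta)],
              [np.sin(theta), -np.cos(theta)]])
print(np.allclose(householder(n), R))
```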

We'll use them. Actually, they're better than Gram-Schmidt. So we'll use them in making things orthogonal.

So what other orthogonal matrices? Let's create some. Creating good orthogonal matrices is-- you know, it pays off. Let's think.

So there's a family named after this French guy who lived to 100. He was a real old timer. Well, MIT had a faculty member in math, when I came, Professor Struik, who lived to 106. And I heard him give a lecture at Brown University at age 100. And it was perfect.

You could not have done it better. So he's my inspiration. I'll keep going. I only have, like, n more years to get there. And then it's-- well, it's too many anyway. So Hadamard, he created-- well, that's the simplest, the smallest.

Now the next guy is going to be four by four. I'm going to put that-- so where I see a one, I'm going to put Hadamard one, one, one minus one, one, one, one minus one. And then when I see a minus, I'm going to put an H2 with a minus. You saw that picture? It was a picture of H2, H2, H2, and minus H2.

That's what I've got there. And I believe those columns are orthogonal. Right? Now what could I do? Well it's not quite an orthogonal matrix.

What do I have to do to make-- this isn't quite an orthogonal matrix either. What do I do to make that an orthogonal matrix? Divide by square root of two? I need unit vectors there. And here, those lengths are one squared, one squared, got four, square root of four is two.

So I better divide by the two there. And now here I'm up to-- yeah. So that was that one. Tell me the next one up.

What's that going to be? Eight by eight? What's that? So this is H4 here. Oops, four.

So tell me, what I should do for eight by eight. You know, it's simple. But that's a good thing to say about it. You know, they're in coding theory, all sorts of places you want matrices of ones and minus ones. What's H8?

I'm going to build it out of H4. So what's it going to be? I'm going to put an H4 there. What am I going to put here? Another H4.

And up here? Another H4. And finally, here? Minus H4. And I think I've got orthogonal columns again. Because the columns within these dot products with themselves give zero and zero.

The dot products from these columns and these columns obviously, have the minus. And the dot products in here are zero from that and zero from that. Yeah, it works. And we could keep going to 16 and 32 and 64. But then up comes a question.
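The doubling pattern can be sketched in a few lines of NumPy. The function name `hadamard` is an assumption for illustration, and it only covers sizes that are powers of two, which is exactly what the doubling construction produces.

```python
import numpy as np

def hadamard(n):
    """Build the n x n matrix of +1/-1 entries by repeated doubling [[H, H], [H, -H]]."""
    if n < 1 or n & (n - 1):
        raise ValueError("this doubling construction needs n to be a power of 2")
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H,  H],
                      [H, -H]])
    return H

H4 = hadamard(4)
H8 = hadamard(8)
print(H4)

# Columns are orthogonal with length sqrt(n): H^T H = n I.
print(np.array_equal(H8.T @ H8, 8 * np.eye(8, dtype=int)))

# Dividing by sqrt(n) gives an orthogonal matrix.
Q8 = H8 / np.sqrt(8)
print(np.allclose(Q8.T @ Q8, np.eye(8)))
```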

What about H12? Is there a ones and minus ones matrix of size 12, 12 by 12? It doesn't come directly from our little pattern, which is doubling size every time. But you could still hope. And it works.

I don't know what it is, but there's a matrix of ones and minus ones with 12 orthogonal columns. So the answer is yes. So we make an 18.065 conjecture. Every matrix size-- well, not every matrix size, because three by three is not going to work.

One by one is not going to work either. But H12 works, H8, H16-- What's our conjecture? We won't be the first to conjecture it, that there is a ones and minus ones orthogonal matrix with orthogonal columns of every size n.

So I'll start the conjecture, always possible if n, which is the size of the matrix, let's say. What would you guess? Just take a shot. You won't be asked for a proof, because nobody has a proof. Even?

Well, you could hope for even. You could try six, I guess, would be the first guy there. And I don't think it's possible. I think six is not possible. So every even size is, I think, not going to work, but a natural idea to try.

What's the next thought? Every multiple of four. If n over four is a whole number. But nobody has a systematic way to create these things. So like some of them, at this point, we're down to doing it one at a time.

And we're up to 668. But we haven't got that one yet. Isn't that crazy? So all this is coming from Wikipedia. My source of all that's good in mathematics is there on Wikipedia.

Anyway, this is the conjecture. Conjecture means you don't have any damn idea of whether it's true or not. Is that divisible by four? Yeah, I guess it would be. 600 certainly is, and 68 certainly is.

Yeah, OK. So I think that's the first one. If you find one of size 668, just skip the homework and tell us about that one. But I don't think I'll assign that.

Yeah, I must have searched for it online. But yeah. Anyway, so those are the Hadamard matrices.

Now where else do I remember orthogonal matrices coming from? Well, yeah really, the biggest source. So when I'm looking for orthogonal matrices, I'm looking for a basis of orthogonal vectors. And where in math am I going to find vectors that come out to be orthogonal? We haven't seen-- that's the next section of the notes, but maybe you are remembering.

Where will we sort of like, automatically show up with orthogonal vectors? They could be the eigenvectors of a symmetric matrix. And that's where the most important ones come from. Oh, I could tell you about wavelets though. Wavelets are more like this picture.

They're ones and minus ones, or the simplest wavelets are ones and minus ones. Before I go on to the eigenvector business, can I mention the wavelets matrices? Yeah, these are really important simple and important constructions. So wavelets- let me draw a picture of-- I'm going to come up with four-- I'll do the four by four case. And these are the orthogonal guys.

And then the next one is up and down and zero. And the last one is zero and up and down. So that's four things. But let me show you the matrix. So I'll call it W for wavelets.

So that guy, I'm thinking of was one, one, one, one. This guy I'm thinking of as one, one, minus one, minus one. It's looking sort of Hadamard's way. But there's a difference here. This guy is one minus one, zero, zero.

So the wavelets rescale. That's the difference between Hadamard and wavelets. Wavelets are self-scaling. And what's the last guy here? What's the fourth column?

From that fourth wavelet? Zero, zero, one minus one, yeah. So Haar came up with that. This is the Haar wavelet, which was many years before the word wavelets was invented.

He came up with this construction, the Haar matrix, the Haar functions. So they're very simple functions, but you know, that makes them usable, the fact that they're so simple. Now I don't know if you want to see the pattern in eight by eight but let me start the eight by eight so you'll know what wavelets are about. You'll know what these Haar wavelets are about, anyway. They're the ones that were kind of easy to visualize.

So if that's W4, let's just take a minute. It won't take long for W8. So the first column is going to be eight ones. And what's the next column going to be? Four ones and four minus ones, like so.

So four ones and four minus ones. And now the next column, two ones two minus ones and zeros. One, one, minus one, minus one and zero. And the fourth will be zeros and two ones and two minus ones. We got half a matrix now.

Now just tell me the fifth, what do you think? What do I put in the fifth one? So again, it's going to squeeze down and rescale.

And what's fifth column up here that's going to be ones and minus ones and zeros now? So it's not Hadamard, it's Haar. And what shall I put? One, one. Shall I start with one, one?

AUDIENCE: 1 minus 1.

PROFESSOR: Oh! And then all zeros? Oh yeah, thanks! Perfect! One minus and then all zeros.

And then the next three columns, we'll have the one minus one here and the one minus one here, and the one minus one here. And otherwise all zeros. Yeah. So you see the pattern. It's scaling at every step.

So that matrix has the advantage of being quite sparse, short of-- This, in my mind is a-- or four ones, that get involved with like, taking the average. Then this guy is like, taking the differences between those and those. And then this is like taking the difference at a smaller scale. And that also at a smaller scale. So that's what we keep doing.
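The column pattern described for W4 and W8 can be generated by a short loop. This is a sketch in NumPy; the function name `haar` is an assumption for illustration, and it assumes n is a power of two. The columns come out orthogonal but with different lengths, so each column is divided by its own length to get an orthogonal matrix.

```python
import numpy as np

def haar(n):
    """Unnormalized Haar matrix: one column of ones, then +1/-1 blocks at halving widths."""
    cols = [np.ones(n)]
    width = n
    while width >= 2:
        half = width // 2
        for start in range(0, n, width):
            col = np.zeros(n)
            col[start:start + half] = 1.0            # "up"
            col[start + half:start + width] = -1.0   # "down"
            cols.append(col)
        width = half                                  # rescale to the next finer level
    return np.column_stack(cols)

W4 = haar(4)
W8 = haar(8)
print(W4)

# Columns are orthogonal; squared lengths are 8, 8, 4, 4, 2, 2, 2, 2.
print(np.diag(W8.T @ W8))

# Divide each column by its own length to get orthonormal columns.
Q8 = W8 / np.linalg.norm(W8, axis=0)
print(np.allclose(Q8.T @ Q8, np.eye(8)))
```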

Yeah. Yeah. So that's wavelets. It looks so simple, right, but just a one minute history of wavelets, so Haar invented this in like, 1910. I mean, a long-- forever.

But then you wanted wavelets that were a little better, and not just ones and minus ones and zeros. And that took a lot of thinking. A lot of people were searching for it. And Ingrid Daubechies-- so I'll just put her name-- became famous for finding them.

So in about 1988 she found a whole lot of families of wavelets. And when I say wavelets, she found a whole lot of orthogonal matrices that had good properties. Yeah. So that's the wavelet picture. OK.

Now to close today and to connect with the next lecture on eigenvalues, eigenvectors, positive definite matrices-- we're really getting to the heart of things here. Let me follow through on that idea. So the eigenvectors of a symmetric matrix, but also of an orthogonal matrix are orthogonal.

And that is really where, people-- where you can invent-- because you don't have to work hard. You just find a symmetric matrix. Its eigenvectors are automatically orthogonal. That doesn't mean they're great for use, but some of them are really important. And that maybe the most important of all is the Fourier.

So you probably have seen Fourier series, sines and cosines. Those guys are orthogonal functions. But the discrete Fourier series is what everybody computes. And those are orthogonal vectors. So those are orthogonal vectors that go into the discrete Fourier transform.

And then they're done at high speed by the fast Fourier transform. Those are eigenvectors of Q, eigenvectors of the right Q. So let me just tell you the right Q-- whose eigenvectors-- So here we go. Eigenvectors of Q-- you will just be amazed by how simple this matrix is. It's just that matrix.

It's called a permutation matrix. It just puts the-- those are the four by four Fourier discrete-- let me put that word discrete up here-- transform. So I really meant to put discrete Fourier transform. Yeah. The eigenvectors of that matrix, first of all, they are orthogonal, and then second, and more important, they're tremendously useful.

They're the heart of signal processing. In signal processing, they just take a discrete Fourier transform of a vector before they even look at it. I mean, like that's the way to see it, is split it into its frequencies. And that's what the eigenvectors of this will do. So we're going to see the discrete Fourier transform.

But my point here is to know that they're orthogonal just comes out of this fact that the eigenvectors are orthogonal for any Q. And that is certainly a Q. Everybody can see that those columns are orthogonal. That's a permutation matrix.

You've taken the columns of the identity, which are totally orthogonal, and you just put them in a different order. So a permutation matrix is a reordering of the identity matrix. It's got to be a Q, and therefore, its eigenvectors are orthogonal. And they're just the winners. I mean, the matrix of the Fourier matrix with those four eigenvectors in it, I'll show you now.

We're finishing today by leading into Wednesday, eigenvectors and eigenvalues. And we happen to be doing eigenvectors and eigenvalues of a Q today, not of an S. Most of the time, it's a symmetric matrix whose eigenvectors we take, but here that happens to be a Q. Can I show you the eigenvectors, the four eigenvectors of that? Now, oh!

The complex number i is going to come in. You have to let it in here. Yeah, sorry!

If S is a real symmetric matrix, its eigenvectors are real. But if Q is an orthogonal matrix, its eigenvectors-- even though you couldn't ask for a more real matrix than that-- its eigenvectors, at least the good way, are complex. So can I just show you the eigenvectors of--

So again, overall the point today is to see orthogonal matrices. So I'll just repeat now while I can-- rotations here, reflections, wavelets, the Householder idea of reflections of big, large matrices that have this form I minus 2 u, u transpose are orthogonal. And now we're going to see the big guys, the eigenvectors of Q. Yes?

AUDIENCE: [INAUDIBLE]?

PROFESSOR: Ah yeah, we don't have orthonormal. That's right. We don't have orthonormal. I better divide by the square root of eight.

AUDIENCE: [INAUDIBLE].

PROFESSOR: Oh yes! Oh, you're right! Sorry. I just thought I'd get away with that, but I didn't. Yeah.

So these guys are square roots of eight. So are these. But these guys are square roots of four. These guys are square roots of two. Thank you.

Absolutely right. Absolutely. Yeah. OK, so what are the eigenvectors of a permutation. This is going to be nice to see.

And I'll use the matrix F for the eigenvector matrix of that Q up there, and F is for Fourier. And it'd be the four by four Fourier matrix. OK. What are the eigenvectors of Q? OK. So Q is a permutation.

So like, I'm going to ask you for one eigenvector of every permutation matrix. What vector can you tell me that actually the eigenvalue will be one? What vector can you tell me where if I permute it, I don't change it? One, one, one, one. Like, it's everywhere here, one, one, one, one.

So that's the zero frequency vector, the constant vector, all ones. Everybody sees that if I multiply by Q, it doesn't change. OK. Now the next one-- I'll show you the four now. The next one will be 1, i, i squared, and i cubed.

Of course, i squared-- I don't know how many Course 6 j people are in this audience, but this is a math building. We paid for it. It's 2-190. And it's i in this room.

Anyway, i is the first letter in what? Imaginary. Thank you, the first letter in imaginary. You can't say jimaginary. So that's it.

OK. And then the next one is 1, i squared, i to the fourth, i to the sixth. And the next one is 1, i cubed, i to the sixth, and i to the ninth. Isn't that just beautiful? And you could show that every one of those four columns, if you multiply them by Q, you would get the eigenvalues. And you would see that it's an eigenvector.

And this is just sort of like discrete Fourier stuff, instead of e to the ix, e to the 2ix, e to the 3ix and so on. We just have vectors. So those are the four eigenvectors of that permutation. And those are orthogonal.
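The eigenvector claim can be checked directly. The transcript does not show the permutation matrix that was on the board, so the cyclic shift `Q` below is an assumption; it is one permutation for which the four columns listed above are eigenvectors. The name `F` follows the lecture's name for the Fourier matrix.

```python
import numpy as np

# ASSUMPTION: a 4x4 cyclic-shift permutation (the board matrix is not in the transcript).
Q = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]])

# Fourier matrix: column k is (1, i^k, i^(2k), i^(3k)).
exponents = np.outer(np.arange(4), np.arange(4))
F = np.power(1j, exponents)

# Shifting column k multiplies it by i^k, so the eigenvalues are 1, i, -1, -i.
eigenvalues = np.power(1j, np.arange(4))
print(np.allclose(Q @ F, F * eigenvalues))   # Q F = F diag(eigenvalues)

# With the conjugate transpose the columns are orthogonal, each of length 2.
print(np.allclose(F.conj().T @ F, 4 * np.eye(4)))
```

Dividing F by 2 would make its columns orthonormal in the complex sense.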

Could I just check that? How do you know that this first column and the second-- well, I should really say zeroth column and first column, if I'm talking frequencies. Do you see that that's orthogonal to that? Well, why is one plus i plus i squared plus i cubed equal to zero?

That dot product is column one dot column two. Yeah, you're right. This happens to come out zero. Is that right?

AUDIENCE: Yeah, that will come out zero.

PROFESSOR: That will come out zero. But somebody mentioned that this isn't right. It's true that that came out zero, but when I have imaginary numbers anywhere around, this isn't a correct dot product to test orthogonality. If I have imaginary-- complex vectors-- if I have complex numbers, complex vectors, I should test column one conjugate dotted with column two-- column i conjugate-- well, let me see, which ones shall I take? Maybe that guy and that guy.

Many of them, you luck out here. But really, I should be taking that conjugate. So these ones-- But the thing is, the complex conjugate of one is one. So that was OK. But in general, if I wanted to take column two dotted with column maybe four would be a little dodgy-- yeah.

Look what happens. Take that column and that column. Take their dot product. Do it the wrong way. So what's the wrong way?

Forget about the complex conjugate and just do it the usual way. So one times one is one. i times i cubed is? One.

i squared times i to the sixth is? One. I'm getting all ones. I'm not getting orthogonality there. And that's because I forgot that I should take the complex conjugate-- well, of these guys-- one-- I should take minus i, minus i, minus i squared-- well, that's real. So it's OK.

Minus there. So minus i, squared, is still minus one. So now if I do it, it comes out zero. So let me repeat again.
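The wrong and right dot products for those two columns can be compared in NumPy. The variable names are illustrative; `np.vdot` is used because it conjugates its first argument, which is the complex inner product described above.

```python
import numpy as np

col2 = np.array([1, 1j, -1, -1j])   # 1, i, i^2, i^3
col4 = np.array([1, -1j, -1, 1j])   # 1, i^3, i^6, i^9

# Wrong for complex vectors: no conjugate, every term is 1, so the sum is 4.
wrong = np.sum(col2 * col4)

# Right: conjugate one of the vectors first; the terms are 1, -1, 1, -1.
right = np.vdot(col2, col4)

print(wrong)   # equals 4, so this test misses the orthogonality
print(right)   # equals 0, the columns are orthogonal
```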

Let me just make this statement. If Q transpose Q is I, and Qx is lambda x, and Qy is mu y with a different eigenvalue-- so I'm setting up the main fact. In the last minute, I'm just going to write down. So I have an orthogonal matrix. I have an eigenvector with eigenvalue lambda.

I have another eigenvector with a different eigenvalue, mu. Then the claim is that-- what is the claim about eigenvectors? So here, this has an eigenvalue.

This has a different eigenvalue. I need them to be different to really know that the x's and the y's can't be the same. So what is it that I want to show?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yes! That x-- and I have to remember to do that-- x bar transpose y is zero. That's orthogonality. That's orthogonality for complex vectors.

I have to remember to change every i to a minus i in one of the vectors. And I can prove that fact by playing with these. By starting from here, I can get to that.
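The lecture leaves the proof for later. One standard way to fill it in, assuming Q is real, is sketched here; it may not be the route taken in the course notes. Conjugating Qx = λx gives Q x̄ = λ̄ x̄, and then Qᵀ Q = I is used twice.

```latex
% Setup: Q real, Q^T Q = I, Qx = \lambda x, Qy = \mu y, \lambda \neq \mu.
\begin{aligned}
\bar{x}^{\mathsf T} x &= (Q\bar{x})^{\mathsf T}(Qx) = \bar{\lambda}\lambda\, \bar{x}^{\mathsf T} x
  &&\Longrightarrow\quad |\lambda|^2 = 1, \ \text{so } \bar{\lambda} = 1/\lambda, \\
\bar{x}^{\mathsf T} y &= \bar{x}^{\mathsf T} Q^{\mathsf T} Q\, y = (Q\bar{x})^{\mathsf T}(Qy) = \bar{\lambda}\mu\, \bar{x}^{\mathsf T} y
  &&\Longrightarrow\quad \bigl(1 - \mu/\lambda\bigr)\, \bar{x}^{\mathsf T} y = 0 .
\end{aligned}
```

Since μ ≠ λ, the factor 1 − μ/λ is nonzero, which forces x̄ᵀ y = 0.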

OK. That's it. We've done a lot today, a lot of stuff about orthogonal matrices. Important ones, and those sources of important ones, eigenvectors. And so it will be eigenvectors on Wednesday.