32.7 Other Notions of Diagonalizability


We have noted that our first question has a number of variants. We now describe how the answers change when these variants are used.

When we allow complex matrix elements, and complex vectors, we can diagonalize a wider class of matrices.

When a vector has complex valued entries, we still want to interpret its length as the square root of its dot product with itself. We want this to be positive.

Therefore we redefine the dot product to make this so: the dot product of a complex vector with itself is the sum of the squared absolute values of its entries.

We generalize this to the dot product of a row vector and a column vector by making it the sum of the products of the complex conjugate of each component of the row vector with the corresponding component of the column vector.

Thus the dot product of the column vector with entries $(a + i b, c + i d)$ with the row vector having the same entries is

$(a - i b)(a + i b) + (c - i d)(c + i d) = a^{2} + b^{2} + c^{2} + d^{2}$

The dot product of the same column vector with the row vector $(e + i f, g + i h)$ is instead

$(e - i f)(a + i b) + (g - i h)(c + i d)$

Notice that with this definition the dot product is no longer symmetric. However, it is unchanged if you interchange the row and the column and also take the complex conjugate of the result, since the asymmetry lies in taking the complex conjugate of the row and not the column.
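The definition above can be tried out numerically. The following is a minimal sketch using Python's built-in complex numbers; the function name `cdot` and the sample vectors are illustrative choices, not part of the notes.

```python
def cdot(row, col):
    """Dot product of a row with a column, conjugating the row's entries."""
    return sum(r.conjugate() * c for r, c in zip(row, col))

# Illustrative values: v plays the role of (a + ib, c + id),
# and w plays the role of (e + if, g + ih).
v = [1 + 2j, 3 - 1j]
w = [2 - 1j, 0 + 1j]

length_squared = cdot(v, v)   # 1^2 + 2^2 + 3^2 + (-1)^2
row_w_col_v = cdot(w, v)      # w as the row, v as the column
row_v_col_w = cdot(v, w)      # roles interchanged

print(length_squared)                           # (15+0j)
print(row_w_col_v, row_v_col_w)                 # (-1+2j) (-1-2j)
print(row_w_col_v == row_v_col_w.conjugate())   # True
```

The two mixed products differ, but each is the complex conjugate of the other, which is the sense in which the product survives interchanging row and column.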

With complex vectors we define an orthonormal basis to be one in which each column has length 1 and the dot product of each column with the complex conjugate of the entries of every other column is zero.

With this definition, a matrix that takes one orthonormal basis into another orthonormal basis has the property that its complex conjugate transpose is its inverse.

Such a matrix is called a unitary matrix, and the linear transformation which takes one orthonormal complex basis to another is called a unitary transformation.

The effect of a unitary transformation described by the unitary matrix $U$ on a matrix $M$ is now $U^{t*} M U$, where $U^{t*}$ denotes the complex conjugate transpose of $U$, as can be shown by the same argument as before. (Of course, real unitary matrices are orthogonal.)
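As a rough check of the defining property of a unitary matrix, the sketch below multiplies a matrix by its complex conjugate transpose and compares the result with the identity. The helper names, the sample matrices, and the tolerance `tol` are assumptions made for illustration.

```python
import math

def conj_transpose(m):
    """Complex conjugate transpose of a matrix stored as a list of rows."""
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]

def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_unitary(m, tol=1e-12):
    """True if (conjugate transpose of m) times m is the identity, up to tol."""
    p = matmul(conj_transpose(m), m)
    n = len(m)
    return all(abs(p[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
U = [[s, s * 1j], [s * 1j, s]]   # columns have length 1 and are orthogonal
shear = [[1, 1], [0, 1]]         # columns are not orthogonal

print(is_unitary(U))       # True
print(is_unitary(shear))   # False
```

A tolerance is used because the entries of `U` are floating-point approximations of $1/\sqrt{2}$.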

Again we can ask, what matrices can be diagonalized by a unitary transformation? A preliminary question is: which matrices can be diagonalized in this way so that their eigenvalues, which are what appear on the diagonal after diagonalization, are all real?

The answer now is that any matrix that equals its own complex conjugate transpose has this property: if $M$ is such an $n$ by $n$ matrix, then $M$ has $n$ real eigenvalues (counted with multiplicity) and an orthonormal basis of eigenvectors.

Such matrices are called Hermitian matrices.

Again the necessity of this condition follows from the fact that "Hermitivity" is preserved by unitary transformations and real diagonal matrices are Hermitian.

Hermitian matrices are of particular importance because they can represent measurable real-valued observables in physical systems. They do so in quantum mechanics.
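To make the Hermitian case concrete, here is a small sketch with a hand-picked 2 by 2 Hermitian matrix. The matrix `H` and its eigenpairs were worked out by hand for this illustration (its characteristic equation is $x^{2} - 5x + 4 = 0$); they do not come from the notes.

```python
def matvec(m, v):
    """Matrix times column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def cdot(row, col):
    """Dot product that conjugates the entries of the row."""
    return sum(r.conjugate() * c for r, c in zip(row, col))

H = [[2, 1 - 1j],
     [1 + 1j, 3]]

# H equals its own complex conjugate transpose.
is_hermitian = all(H[i][j] == H[j][i].conjugate()
                   for i in range(2) for j in range(2))

# Hand-computed eigenvalue/eigenvector pairs (unnormalized eigenvectors).
eigenpairs = [(1, [1 - 1j, -1]),
              (4, [1 - 1j, 2])]

# Each pair satisfies H v = lambda v, and both eigenvalues are real.
pairs_ok = all(matvec(H, v) == [lam * x for x in v] for lam, v in eigenpairs)

# Eigenvectors for different eigenvalues are orthogonal under the complex dot product.
overlap = cdot(eigenpairs[0][1], eigenpairs[1][1])

print(is_hermitian, pairs_ok, overlap == 0)   # True True True
```

Dividing each eigenvector by its length would turn this orthogonal pair into an orthonormal basis.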

The answer to the general question, without reference to real eigenvalues, is that the matrix must commute with its complex conjugate transpose. Such matrices are called normal matrices.

This condition is again preserved under unitary transformations, and it is a property of diagonal matrices, since all diagonal matrices commute with one another, so it is definitely necessary.
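The commutation condition is easy to test directly. The sketch below is one way to do so for matrices with exactly representable entries; the helper names and both sample matrices are illustrative assumptions.

```python
def conj_transpose(m):
    """Complex conjugate transpose of a matrix stored as a list of rows."""
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]

def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def commutes_with_conj_transpose(m):
    """Exact test; suitable for integer entries, not for rounded floats."""
    m_star = conj_transpose(m)
    return matmul(m, m_star) == matmul(m_star, m)

# Not Hermitian (its eigenvalues are 1 + i and 1 - i), yet it passes the test.
rotation_like = [[1, 1], [-1, 1]]

# Fails the test, so no unitary transformation diagonalizes it.
shear = [[1, 1], [0, 1]]

print(commutes_with_conj_transpose(rotation_like))   # True
print(commutes_with_conj_transpose(shear))           # False
```

The first example shows that the condition admits matrices with non-real eigenvalues, which the Hermitian condition excludes.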

Still another question is: when can a matrix be diagonalized by some change of basis, without any requirement of orthonormality? That is, when does there exist any kind of basis of eigenvectors for the matrix $M$?

There is a simple condition which again can easily be seen to be necessary. Suppose $a_{1}, a_{2}, \dots, a_{k}$ are the distinct eigenvalues of $M$.

Any vector can be written as a linear combination of basis vectors.

If each basis vector is an eigenvector of $M$, say corresponding to eigenvalue $a_{j}$, then $M - a_{j} I$ acting on it gives the zero vector.

On the other hand, $M - a_{h} I$ for $a_{h}$ different from $a_{j}$, acting on it, merely multiplies it by $a_{j} - a_{h}$.

Thus, if there is a basis consisting of eigenvectors of $M$, then the product over all $j$ from 1 to $k$ of $(M - a_{j} I)$ must be the zero matrix, since it must give 0 in acting on every basis vector.

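When this product is the zero matrix, the corresponding polynomial, the product over all $j$ of $(x - a_{j})$, is the minimal polynomial of $M$, and the equation stating that it gives the zero matrix when $M$ is substituted for $x$ is called the minimal equation for $M$. The argument above shows that this condition is necessary for $M$ to have a basis of eigenvectors; it is also sufficient. Thus $M$ has a basis of eigenvectors exactly when the product of $(M - a_{j} I)$ over its distinct eigenvalues is the zero matrix.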
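The product test can be sketched in a few lines. In the example below the distinct eigenvalues are supplied by hand rather than computed, and the function names and sample matrices are illustrative assumptions.

```python
def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def product_over_distinct_eigenvalues(m, eigenvalues):
    """Product of (M - a I) over the distinct values a in eigenvalues."""
    n = len(m)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for a in set(eigenvalues):
        shifted = [[m[i][j] - (a if i == j else 0) for j in range(n)]
                   for i in range(n)]
        result = matmul(result, shifted)
    return result

zero = [[0, 0], [0, 0]]

# Distinct eigenvalues 1 and 2: a basis of eigenvectors exists.
triangular = [[1, 1], [0, 2]]
print(product_over_distinct_eigenvalues(triangular, [1, 2]) == zero)   # True

# Only eigenvalue is 1, but M - I is not zero: no basis of eigenvectors.
shear = [[1, 1], [0, 1]]
print(product_over_distinct_eigenvalues(shear, [1]) == zero)           # False
```

The order of the factors does not matter here, since all the matrices $M - a_{j} I$ commute with one another.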

By the way, an interesting and curious fact is that every square matrix obeys its own characteristic equation (that is, if you substitute $M$ for the variable $x$ in it, you get the 0 matrix). This is known as the Cayley-Hamilton theorem.
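For 2 by 2 matrices this fact can be checked by hand or by machine, since the characteristic polynomial is $x^{2} - (\text{trace}) x + \text{determinant}$. The following sketch, with illustrative function names and sample matrices, evaluates that polynomial at the matrix itself.

```python
def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def characteristic_residual_2x2(m):
    """Return M^2 - trace(M) M + det(M) I for a 2-by-2 matrix M."""
    (p, q), (r, s) = m
    trace = p + s
    det = p * s - q * r
    m_squared = matmul(m, m)
    return [[m_squared[i][j] - trace * m[i][j] + (det if i == j else 0)
             for j in range(2)] for i in range(2)]

zero = [[0, 0], [0, 0]]

print(characteristic_residual_2x2([[1, 2], [3, 4]]) == zero)             # True
print(characteristic_residual_2x2([[1, 1], [0, 1]]) == zero)             # True
print(characteristic_residual_2x2([[2, 1 - 1j], [1 + 1j, 3]]) == zero)   # True
```

The second matrix has the single eigenvalue 1 and no basis of eigenvectors: it satisfies its characteristic equation $(x - 1)^{2} = 0$ even though $M - I$ alone is not the zero matrix.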