Linear Algebra: #21 Which Matrices can be Diagonalized?
Now we have seen that not all orthogonal matrices can be diagonalized over the real numbers. (Think about the rotations of ℜ2.) On the other hand, we can prove that all unitary, and also all Hermitian, matrices can be diagonalized.
Of course, a matrix M is only a representation of a linear mapping f : V → V with respect to a given basis {v1, . . . , vn} of the vector space V. So to say that the matrix can be diagonalized is to say that it is similar to a diagonal matrix. That is, there exists an invertible matrix S such that S−1MS is diagonal.
But this means that there must be a basis for V, consisting entirely of eigenvectors.
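As a concrete illustration, here is a minimal numerical sketch (assuming NumPy is available; the matrices below are just examples chosen for this purpose). The eigenvectors of M are collected as the columns of a matrix S, and S−1MS then comes out diagonal; by contrast, a rotation of ℜ2 has no real eigenvalues, so no real matrix S will do.

```python
import numpy as np

# Example matrix: real symmetric, so it certainly has a basis of eigenvectors.
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, S = np.linalg.eig(M)   # columns of S are eigenvectors of M
D = np.linalg.inv(S) @ M @ S        # the similar matrix S^{-1} M S

print(np.round(D, 10))              # diagonal, with the eigenvalues on the diagonal

# A rotation of the plane through 90 degrees: its eigenvalues are +i and -i,
# so it has no real eigenvectors and cannot be diagonalized over the reals.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(np.linalg.eigvals(R))
```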
In this section we will consider complex vector spaces — that is, V is a vector space over the complex numbers ℂ. The vector space V will be assumed to have a scalar product associated with it, and the bases we consider will be orthonormal. We begin with a definition.
Definition
Let W ⊂ V be a subspace of V. Let
W⊥ = {v ∈ V : <v, w> = 0, ∀w ∈ W}.
Then W⊥ is called the perpendicular space to W.
It is a rather trivial matter to verify that W⊥ is itself a subspace of V, and furthermore W ∩ W⊥ = {0}. In fact, we have:
Theorem 53
V = W ⊕ W⊥.
Proof
Let {w1, . . . , wm} be some orthonormal basis for the vector space W. This can be extended to a basis {w1, . . . , wm, wm+1, . . . , wn} of V. By applying the Gram-Schmidt process, we may assume that this extended basis is also orthonormal. The claim is then that {wm+1, . . . , wn} is a basis for W⊥.
Now clearly, since <wj, wk> = 0 for j ≠ k, each of the vectors wm+1, . . . , wn is orthogonal to W; that is, {wm+1, . . . , wn} ⊂ W⊥. If u ∈ W⊥ is some arbitrary vector in W⊥, then, expanding u with respect to the orthonormal basis of V, we have

u = <u, w1>w1 + · · · + <u, wn>wn = <u, wm+1>wm+1 + · · · + <u, wn>wn,

since <wj, u> = 0 if j ≤ m. (Remember, u ∈ W⊥.) Therefore, {wm+1, . . . , wn} is a linearly independent, orthonormal set which generates W⊥, so it is a basis. And so we have V = W ⊕ W⊥.
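Here is a small numerical illustration of this decomposition (a sketch, assuming NumPy is available; W is just a randomly chosen 2-dimensional subspace of ℜ4). If the columns of a matrix form an orthonormal basis of W, then that matrix times its transpose is the orthogonal projection onto W, and the remainder of any vector lies in W⊥.

```python
import numpy as np

rng = np.random.default_rng(0)

# An orthonormal basis of a 2-dimensional subspace W of R^4
# (np.linalg.qr performs, in effect, the Gram-Schmidt process).
W_basis, _ = np.linalg.qr(rng.standard_normal((4, 2)))

u = rng.standard_normal(4)          # an arbitrary vector of V = R^4

P = W_basis @ W_basis.T             # orthogonal projection onto W
u_W = P @ u                         # the component of u in W
u_perp = u - u_W                    # the component of u in W-perp

print(np.allclose(W_basis.T @ u_perp, 0))   # True: u_perp is orthogonal to all of W
print(np.allclose(u, u_W + u_perp))         # True: u = w + w', with w in W, w' in W-perp
```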
Theorem 54
Let f : V → V be a unitary mapping (V is a vector space over the complex numbers ℂ). Then there exists an orthonormal basis {v1, . . . , vn} for V consisting of eigenvectors under f. That is to say, the matrix of f with respect to this basis is a diagonal matrix.
Proof
If the dimension of V is zero or one, then obviously there is nothing to prove. So let us assume that the dimension n is at least two, and we prove things by induction on the number n. That is, we assume that the theorem is true for spaces of dimension less than n.
Now, according to the fundamental theorem of algebra, the characteristic polynomial of f has a zero, λ say, which is then an eigenvalue for f. So there must be some non-zero vector vn ∈ V, with f(vn) = λvn. By dividing by the norm of vn if necessary, we may assume that ||vn|| = 1.
Let W ⊂ V be the 1-dimensional subspace generated by the vector vn. Then W⊥ is an n−1 dimensional subspace. We have that W⊥ is invariant under f. That is, if u ∈ W⊥ is some arbitrary vector, then f(u) ∈ W⊥ as well. This follows since
λ<f(u), vn> = <f(u), λvn> = <f(u), f(vn)> = <u, vn> = 0.
But we have already seen that for an eigenvalue λ of a unitary mapping, we must have |λ| = 1. Therefore we must have <f(u), vn> = 0.
So we can consider f, restricted to W⊥, and using the inductive hypothesis, we obtain an orthonormal basis of eigenvectors {v1, . . . , vn-1} for W⊥. Therefore, adding in the last vector vn, we have an orthonormal basis of eigenvectors {v1, . . . , vn} for V.
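The theorem can also be checked numerically (a sketch, assuming NumPy is available; U is a randomly generated unitary matrix, taken as the Q factor of a random complex matrix). The eigenvalues all have absolute value 1, and the eigenvectors form an orthonormal basis; they come out orthogonal here because a random unitary matrix almost surely has distinct eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# A random unitary matrix: the Q factor of a random complex matrix.
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(Z)

eigenvalues, V = np.linalg.eig(U)

print(np.allclose(np.abs(eigenvalues), 1.0))     # True: |lambda| = 1 for every eigenvalue
print(np.allclose(V.conj().T @ V, np.eye(n)))    # True: the eigenvectors are orthonormal
print(np.allclose(V.conj().T @ U @ V,
                  np.diag(eigenvalues)))         # True: the matrix is diagonal in this basis
```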
Theorem 55
All Hermitian matrices can be diagonalized.
Proof
This is similar to the last one. Again, we use induction on n, the dimension of the vector space V. We have a self-adjoint mapping f : V → V. If n is zero or one, then we are finished. Therefore we assume that n ≥ 2.
Again, we observe that the characteristic polynomial of f must have a zero, hence there exists some eigenvalue λ, and an eigenvector vn of f, which has norm equal to one, where f(vn) = λvn. Again take W to be the one dimensional subspace of V generated by vn. Let W⊥ be the perpendicular subspace. It is only necessary to show that, again, W⊥ is invariant under f. But this is easy. Let u ∈ W⊥ be given. Then we have
<f(u), vn> = <u, f(vn)> = <u, λvn> = λ<u, vn> = λ · 0 = 0.
The rest of the proof follows as before.
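Again a quick numerical check (a sketch, assuming NumPy is available; H is a random Hermitian matrix). NumPy's eigh routine is designed for Hermitian matrices: it returns real eigenvalues together with an orthonormal basis of eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# A random Hermitian matrix: the "Hermitian part" of a random complex matrix.
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (Z + Z.conj().T) / 2

eigenvalues, V = np.linalg.eigh(H)

print(eigenvalues)                               # real numbers (eigh returns a real array)
print(np.allclose(V.conj().T @ V, np.eye(n)))    # True: orthonormal eigenvectors
print(np.allclose(V.conj().T @ H @ V,
                  np.diag(eigenvalues)))         # True: diagonal in this basis
```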
In the particular case where the matrix has only real entries (the real numbers being, of course, a subset of the complex numbers), the Hermitian condition is just the condition that the matrix is symmetric.
Corollary
All real symmetric matrices can be diagonalized.
Note furthermore that for a Hermitian matrix, the condition ajk = ākj implies that on the diagonal we have ajj = ājj for all j. That is, the diagonal elements are all real numbers. Moreover, since the diagonalizing basis is orthonormal, the change of basis matrix S is unitary, and therefore the diagonal matrix S−1MS is again Hermitian; its diagonal elements are precisely the eigenvalues. Therefore we have:
Corollary
The eigenvalues of a self-adjoint matrix — that is, a symmetric or a Hermitian matrix — are all real numbers.
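For a real symmetric matrix the same check can be carried out entirely over the real numbers (a sketch, assuming NumPy is available; the matrix is a random example): eigh returns real eigenvalues and a real orthogonal matrix of eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(3)

A = rng.standard_normal((4, 4))
S = (A + A.T) / 2                    # a random real symmetric matrix

eigenvalues, Q = np.linalg.eigh(S)

print(eigenvalues)                                       # real numbers
print(np.allclose(Q.T @ Q, np.eye(4)))                   # True: Q is a real orthogonal matrix
print(np.allclose(Q.T @ S @ Q, np.diag(eigenvalues)))    # True: diagonal form
```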
Orthogonal matrices revisited
Let A be an n × n orthogonal matrix. That is, it consists of real numbers, and we have At = A−1. In general, it cannot be diagonalized over the real numbers. But on the other hand, it can be brought into the following block-diagonal form by means of similarity transformations: along the diagonal there are a number of entries +1 or −1, together with a number of 2 × 2 rotation blocks of the form

( cos θ   −sin θ )
( sin θ    cos θ ),

and all other entries are zero.
To see this, start by imagining that A represents the orthogonal mapping f : ℜn → ℜn with respect to the canonical basis of ℜn. Now consider the symmetric matrix
B = A + At = A + A−1.
This matrix represents another linear mapping, call it g : ℜn → ℜn, again with respect to the canonical basis of ℜn.
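As a quick numerical sanity check of this construction (a sketch, assuming NumPy is available; A is a random orthogonal matrix obtained from a QR decomposition), B = A + At is indeed symmetric, and therefore has a real eigenvalue with a real eigenvector.

```python
import numpy as np

rng = np.random.default_rng(4)

# A random orthogonal matrix: the Q factor of a random real matrix.
A, _ = np.linalg.qr(rng.standard_normal((4, 4)))

B = A + A.T                          # equals A + A^{-1}, since A is orthogonal
print(np.allclose(B, B.T))           # True: B is symmetric

eigenvalues, V = np.linalg.eigh(B)   # real eigenvalues, since B is symmetric
v, lam = V[:, 0], eigenvalues[0]
print(np.allclose(B @ v, lam * v))   # True: g(v) = lambda v
```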
But, as we have just seen, B can be diagonalized. In particular, there exists some non-zero vector v ∈ ℜn with g(v) = λv, for some λ ∈ ℜ; as usual we may assume that ||v|| = 1. We now proceed by induction on the number n. There are two cases to consider:
- v is also an eigenvector for f, or
- it isn’t.
In the first case, since f preserves the norms of vectors, the eigenvalue of v must be ±1. Let W = [v] be the 1-dimensional subspace generated by v. Just as in the earlier proofs, W⊥ is then invariant under f, and with respect to an orthonormal basis of ℜn whose first vector is v, the matrix of f has the form

( ±1   0  )
(  0   A* ),

where A* is an (n−1) × (n−1) orthogonal matrix, which, according to the inductive hypothesis, can be transformed into the required form.
If v is not an eigenvector of f, then, still, we know it is an eigenvector of g, and furthermore g = f + f −1. In particular, g(v) = λv = f(v) + f −1(v). That is,
f(f(v)) = λf(v) − v.
So this time, let W = [v, f(v)]. This is a 2-dimensional subspace of V, since v is not an eigenvector of f, so that f(v) is not a multiple of v. Again, consider W⊥. We have V = W ⊕ W⊥. So we must show that W⊥ is invariant under f. Now we have another two cases to consider:
- λ = 0, and
- λ ≠ 0.
In the first case, λ = 0, we have f(v) + f −1(v) = g(v) = 0, so that f(f(v)) = −v. Then, for an arbitrary w ∈ W⊥, we have <f(w), f(v)> = <w, v> = 0, and also <f(w), v> = −<f(w), f(f(v))> = −<w, f(v)> = 0. Therefore f(w) ∈ W⊥.

On the other hand, if λ ≠ 0 then we have v = λf(v) − f(f(v)), so that <f(w), v> = λ<f(w), f(v)> − <f(w), f(f(v))>, and we have just seen that both of these scalar products are zero. Finally, we again have <f(w), f(v)> = <w, v> = 0, so that f(w) ∈ W⊥ in this case as well. Note also that W itself is invariant under f, since f(v) ∈ W and f(f(v)) = λf(v) − v ∈ W.
Therefore we have shown that V = W ⊕ W⊥, where both of these subspaces are invariant under the orthogonal mapping f. By our inductive hypothesis, there is an orthonormal basis of the (n−2)-dimensional subspace W⊥ such that the matrix of f restricted to W⊥ has the required form. As far as W is concerned, we are back in the simple situation of an orthogonal mapping ℜ2 → ℜ2; in fact it must be a rotation (a reflection of W would satisfy f(f(v)) = v, which is incompatible with f(f(v)) = λf(v) − v when v is not an eigenvector of f), so its matrix has the form of one of our 2 × 2 blocks.
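The block form described above can also be observed numerically (a sketch, assuming NumPy and SciPy are available; A is a random orthogonal matrix). One way to compute such a decomposition is the real Schur decomposition: because an orthogonal matrix is normal, its real Schur factor turns out to be block diagonal, with diagonal entries ±1 and 2 × 2 rotation blocks.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(5)

# A random 5 x 5 orthogonal matrix.
A, _ = np.linalg.qr(rng.standard_normal((5, 5)))

# Real Schur decomposition: A = Z T Z^t with Z orthogonal.
T, Z = schur(A, output='real')

print(np.allclose(Z @ T @ Z.T, A))   # True
print(np.round(T, 6))                # block diagonal: +/-1 entries and 2x2 rotation blocks
```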