# Linear Algebra: #21 Which Matrices can be Diagonalized?

Now we have seen that not all *orthogonal* matrices can be diagonalized. (Think about the rotations of ℝ².) On the other hand, we can prove that all *unitary*, and also all *Hermitian*, matrices can be diagonalized.

Of course, a matrix M is only a representation of a linear mapping f : **V** → **V** with respect to a given basis {**v**₁, . . . , **v**ₙ} of the vector space **V**. So the idea that the matrix can be diagonalized is that it is similar to a diagonal matrix. That is, there exists another matrix S such that S⁻¹MS is diagonal. But this means that there must be a basis for **V** consisting entirely of eigenvectors.

In this section we will consider complex vector spaces; that is, **V** is a vector space over the complex numbers ℂ. The vector space **V** will be assumed to have a scalar product associated with it, and the bases we consider will be orthonormal. We begin with a definition.

**Definition**

Let **W** ⊂ **V** be a subspace of **V**. Let

**W**^⊥ = {**v** ∈ **V** : <**v**, **w**> = 0, ∀ **w** ∈ **W**}.

Then **W**^⊥ is called the *perpendicular space* to **W**.

It is a rather trivial matter to verify that **W**^⊥ is itself a subspace of **V**, and furthermore **W** ∩ **W**^⊥ = {**0**}. In fact, we have:

**Theorem 53** **V** = **W** ⊕ **W**^⊥.

*Proof*

Let {**w**₁, . . . , **w**ₘ} be some orthonormal basis for the vector space **W**. This can be extended to a basis {**w**₁, . . . , **w**ₘ, **w**ₘ₊₁, . . . , **w**ₙ} of **V**. Assuming the Gram-Schmidt process has been used, we may assume that this is an orthonormal basis. The claim is then that {**w**ₘ₊₁, . . . , **w**ₙ} is a basis for **W**^⊥.

Now clearly, since <**w**ⱼ, **w**ₖ> = 0 for j ≠ k, we have {**w**ₘ₊₁, . . . , **w**ₙ} ⊂ **W**^⊥. If **u** ∈ **W**^⊥ is some arbitrary vector in **W**^⊥, then we have

**u** = <**w**₁, **u**>**w**₁ + · · · + <**w**ₙ, **u**>**w**ₙ = <**w**ₘ₊₁, **u**>**w**ₘ₊₁ + · · · + <**w**ₙ, **u**>**w**ₙ,

since <**w**ⱼ, **u**> = 0 if j ≤ m. (Remember, **u** ∈ **W**^⊥.) Therefore {**w**ₘ₊₁, . . . , **w**ₙ} is a linearly independent, orthonormal set which generates **W**^⊥, so it is a basis. And so we have **V** = **W** ⊕ **W**^⊥.
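The proof is constructive, and can be sketched numerically (the vectors in ℝ³ below are my own example, not from the text): extend an orthonormal basis of **W** to a basis of **V**, run Gram-Schmidt, and the tail of the resulting orthonormal basis spans **W**^⊥.

```python
import math

# W is spanned by w1 in R^3; extend {w1} to a basis of V = R^3, then
# Gram-Schmidt turns the extension {w2, w3} into an orthonormal basis of W-perp.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def scale(c, v):
    return [c * x for x in v]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def normalize(v):
    return scale(1.0 / math.sqrt(dot(v, v)), v)

w1 = normalize([1.0, 1.0, 0.0])                        # orthonormal basis of W
basis = [w1, [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]         # extended to a basis of V

# Gram-Schmidt: orthogonalize each vector against those already processed.
ortho = []
for v in basis:
    for u in ortho:
        v = sub(v, scale(dot(u, v), u))
    ortho.append(normalize(v))

w2, w3 = ortho[1], ortho[2]   # claimed orthonormal basis of W-perp
```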

**Theorem 54** Let f : **V** → **V** be a unitary mapping (**V** is a vector space over the complex numbers ℂ). Then there exists an orthonormal basis {**v**₁, . . . , **v**ₙ} for **V** consisting of eigenvectors under f. That is to say, the matrix of f with respect to this basis is a diagonal matrix.

*Proof*

If the dimension of **V** is zero or one, then obviously there is nothing to prove. So let us assume that the dimension n is at least two, and we prove things by induction on the number n. That is, we assume that the theorem is true for spaces of dimension less than n.

Now, according to the fundamental theorem of algebra, the characteristic polynomial of f has a zero, λ say, which is then an eigenvalue for f. So there must be some non-zero vector **v**ₙ ∈ **V** with f(**v**ₙ) = λ**v**ₙ. By dividing by the norm of **v**ₙ if necessary, we may assume that ||**v**ₙ|| = 1.

Let **W** ⊂ **V** be the 1-dimensional subspace generated by the vector **v**ₙ. Then **W**^⊥ is an (n−1)-dimensional subspace. We have that **W**^⊥ is invariant under f. That is, if **u** ∈ **W**^⊥ is some arbitrary vector, then f(**u**) ∈ **W**^⊥ as well. This follows since

λ<f(**u**), **v**ₙ> = <f(**u**), λ**v**ₙ> = <f(**u**), f(**v**ₙ)> = <**u**, **v**ₙ> = 0.

But we have already seen that for an eigenvalue λ of a unitary mapping, we must have |λ| = 1. Therefore we must have <f(**u**), **v**ₙ> = 0.

_{n}So we can consider f, restricted to

**W**

^{⊥}, and using the inductive hypothesis, we obtain an orthonormal basis of eigenvectors {

**v**, . . . ,

_{1}**v**} for

_{n-1}**W**

^{⊥}. Therefore, adding in the last vector

**v**, we have an orthonormal basis of eigenvectors {

_{n}**v**, . . . ,

_{1}**v**} for

_{n}**V**.
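Theorem 54 illuminates the opening remark: a rotation of ℝ², viewed as a unitary mapping of ℂ², does become diagonal in a basis of complex eigenvectors. A numerical sketch (the angle and matrices are my own example, not from the text):

```python
import cmath
import math

# Rotation of the plane by angle t; over C it has the orthonormal eigenvectors
# (1, -i)/sqrt(2) and (1, i)/sqrt(2), with eigenvalues e^{it} and e^{-it}.
t = 0.7
c, s = math.cos(t), math.sin(t)
R = [[c, -s],
     [s,  c]]

r = 1.0 / math.sqrt(2.0)
S = [[r,       r],        # columns are the complex eigenvectors
     [-1j * r, 1j * r]]
# S is unitary, so its inverse is the conjugate transpose S*.
Sinv = [[r, 1j * r],
        [r, -1j * r]]

def matmul(A, B):
    """Multiply two 2x2 complex matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

D = matmul(Sinv, matmul(R, S))   # should be diag(e^{it}, e^{-it})
```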

**Theorem 55** All Hermitian matrices can be diagonalized.

*Proof*

This is similar to the last one. Again, we use induction on n, the dimension of the vector space **V**. We have a self-adjoint mapping f : **V** → **V**. If n is zero or one, then we are finished. Therefore we assume that n ≥ 2.

Again, we observe that the characteristic polynomial of f must have a zero, hence there exists some eigenvalue λ, and an eigenvector **v**ₙ of f which has norm equal to one, where f(**v**ₙ) = λ**v**ₙ. Again take **W** to be the one dimensional subspace of **V** generated by **v**ₙ. Let **W**^⊥ be the perpendicular subspace. It is only necessary to show that, again, **W**^⊥ is invariant under f. But this is easy. Let **u** ∈ **W**^⊥ be given. Then we have

<f(**u**), **v**ₙ> = <**u**, f(**v**ₙ)> = <**u**, λ**v**ₙ> = λ<**u**, **v**ₙ> = λ · 0 = 0.

The rest of the proof follows as before.

In the particular case where we have only real numbers (which of course are a subset of the complex numbers), a self-adjoint matrix is simply a symmetric matrix.

**Corollary**

All real symmetric matrices can be diagonalized.

Note furthermore that even in the complex case of a Hermitian matrix, the symmetry condition, namely aⱼₖ = a̅ₖⱼ, implies that on the diagonal we have aⱼⱼ = a̅ⱼⱼ for all j. That is, the diagonal elements are all real numbers. But after diagonalization, the diagonal elements are precisely the eigenvalues. Therefore we have:

**Corollary**

The eigenvalues of a self-adjoint matrix — that is, a symmetric or a Hermitian matrix — are all real numbers.
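This corollary can be checked directly on a small example (the Hermitian matrix below is my own, not from the text): the roots of the characteristic polynomial x² − (trace)x + det come out real even though the entries are complex.

```python
import cmath

# A hypothetical 2x2 Hermitian matrix: a_jk equals the conjugate of a_kj,
# and the diagonal entries are real.
H = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]

# Eigenvalues via the quadratic formula on x^2 - (trace)x + det.
trace = H[0][0] + H[1][1]
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
disc = cmath.sqrt(trace * trace - 4 * det)
eig1 = (trace + disc) / 2
eig2 = (trace - disc) / 2
```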

**Orthogonal matrices revisited**

Let A be an n × n orthogonal matrix. That is, it consists of real numbers, and we have Aᵗ = A⁻¹. In general, it cannot be diagonalized. But on the other hand, it can be brought by means of similarity transformations into a block-diagonal form: some number of +1's and −1's on the diagonal, followed by a sequence of 2 × 2 rotation blocks of the form

( cos θ  −sin θ )
( sin θ   cos θ )

with all other entries zero.

To see this, start by imagining that A represents the orthogonal mapping f : ℝⁿ → ℝⁿ with respect to the canonical basis of ℝⁿ. Now consider the *symmetric matrix*

B = A + Aᵗ = A + A⁻¹.

This matrix represents another linear mapping, call it g : ℝⁿ → ℝⁿ, again with respect to the canonical basis of ℝⁿ.
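For a rotation of ℝ², this construction is degenerate in an instructive way (the angle below is my own example): B = A + Aᵗ collapses to 2 cos(t) times the identity, so *every* vector of ℝ² is an eigenvector of g, even though A itself has no real eigenvectors.

```python
import math

# A rotation A of R^2 by angle t; its transpose is its inverse.
t = 0.7
c, s = math.cos(t), math.sin(t)
A = [[c, -s],
     [s,  c]]
At = [[A[j][i] for j in range(2)] for i in range(2)]   # transpose of A

# B = A + A^t; for a rotation this is 2*cos(t) times the identity.
B = [[A[i][j] + At[i][j] for j in range(2)] for i in range(2)]
```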

But, as we have just seen, B *can* be diagonalized. In particular, there exists some vector **v** ∈ ℝⁿ with g(**v**) = λ**v**, for some λ ∈ ℝ. We now proceed by induction on the number n. There are two cases to consider:

- **v** is also an eigenvector for f, or
- it isn't.

In the first case, let **W** ⊂ **V** be simply **W** = [**v**], i.e. the set of all scalar multiples of **v**. Let **W**^⊥ be the perpendicular space to **W**. (That is, **w** ∈ **W**^⊥ means that <**w**, **v**> = 0.) But it is easy to see that **W**^⊥ is also invariant under f. This follows by observing first of all that f(**v**) = α**v**, with α = ±1. (Remember that the eigenvalues of orthogonal mappings have absolute value 1.) Now take **w** ∈ **W**^⊥. Then

<f(**w**), **v**> = α⁻¹<f(**w**), α**v**> = α⁻¹<f(**w**), f(**v**)> = α⁻¹<**w**, **v**> = α⁻¹ · 0 = 0.

Thus, by changing the basis of ℝⁿ to an orthonormal basis starting with **v** (which we can assume has been normalized), we obtain that the original matrix is similar to a matrix of the form

( ±1  0  )
( 0   A* )

where A* is an (n−1) × (n−1) orthogonal matrix, which, according to the inductive hypothesis, can be transformed into the required form.

If **v** is not an eigenvector of f, then, still, we know it is an eigenvector of g, and furthermore g = f + f⁻¹. In particular, g(**v**) = λ**v** = f(**v**) + f⁻¹(**v**). That is,

f(f(**v**)) = λf(**v**) − **v**.

So this time, let **W** = [**v**, f(**v**)]. This is a 2-dimensional subspace of **V**. Again, consider **W**^⊥. We have **V** = **W** ⊕ **W**^⊥. So we must show that **W**^⊥ is invariant under f. Now we have another two cases to consider:

- λ = 0, and
- λ ≠ 0.

In the first case, λ = 0 gives f(f(**v**)) = −**v**. Therefore, again taking **w** ∈ **W**^⊥, we have

<f(**w**), **v**> = <f(**w**), −f(f(**v**))> = −<**w**, f(**v**)> = 0.

(Remember that **w** ∈ **W**^⊥, so that <**w**, f(**v**)> = 0.) Of course we also have <f(**w**), f(**v**)> = <**w**, **v**> = 0.

On the other hand, if λ ≠ 0 then we have **v** = λf(**v**) − f(f(**v**)), so that

<f(**w**), **v**> = <f(**w**), λf(**v**) − f(f(**v**))> = λ<f(**w**), f(**v**)> − <f(**w**), f(f(**v**))>.

Here <f(**w**), f(f(**v**))> = <**w**, f(**v**)> = 0, since **w** ∈ **W**^⊥, and again <f(**w**), f(**v**)> = <**w**, **v**> = 0.

Therefore we have shown that **V** = **W** ⊕ **W**^⊥, where both of these subspaces are invariant under the orthogonal mapping f. By our inductive hypothesis, there is an orthonormal basis for f restricted to the (n−2)-dimensional subspace **W**^⊥ such that the matrix has the required form. As far as **W** is concerned, we are back in the simple situation of an orthogonal mapping ℝ² → ℝ², and the matrix for this has the form of one of our 2 × 2 blocks.
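For that 2 × 2 case, an orthogonal matrix is either a rotation (one of the 2 × 2 blocks) or a reflection, and a reflection is diagonalizable with eigenvalues ±1. A quick check (the reflection below is my own example, not from the text):

```python
import math

# Reflection of R^2 across the line at angle t/2: orthogonal, determinant -1,
# characteristic polynomial x^2 - (trace)x + det = x^2 - 1, so eigenvalues +-1.
t = 0.7
c, s = math.cos(t), math.sin(t)
F = [[c,  s],
     [s, -c]]

# Orthogonality check: F^t F should be the identity (F is symmetric here).
FtF = [[sum(F[k][i] * F[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]

trace = F[0][0] + F[1][1]                      # 0
det = F[0][0] * F[1][1] - F[0][1] * F[1][0]    # -1
eig1 = (trace + math.sqrt(trace * trace - 4 * det)) / 2
eig2 = (trace - math.sqrt(trace * trace - 4 * det)) / 2
```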
