
Thursday, June 26, 2014

Linear Algebra: #22 Dual Spaces


Again let V be a vector space over a field F (and, although it's not really necessary here, we continue to take F = ℜ or ℂ).

Definition
The dual space to V is the set of all linear mappings f : V → F. We denote the dual space by V*.

Examples
  • Let V = ℜn. Then let fi be the projection onto the i-th coordinate. That is, if ej is the j-th canonical basis vector, then

    fi(ej) = 1 if i = j, and fi(ej) = 0 otherwise.
    So each fi is a member of V*, for i = 1, . . . , n, and as we will see, these dual vectors form a basis for the dual space.


  • More generally, let V be any finite dimensional vector space, with some basis {v1, . . . , vn}. Let fi : V → F be defined as follows. For an arbitrary vector v ∈ V there is a unique linear combination

    v = a1v1 + · · · + anvn

    Then let fi(v) = ai. Again, fi ∈ V*, and we will see that the n vectors f1, . . . , fn form a basis of the dual space.


  • Let C0([0, 1]) be the space of continuous functions f : [0, 1] → ℜ. As we have seen, this is a real vector space, and it is not finite dimensional. For each f ∈ C0([0, 1]) let

    Λ(f) = ∫01 f(x) dx.
    This gives us a linear mapping Λ : C0([0, 1]) → ℜ. Thus it belongs to the dual space of C0([0, 1]).


  • Another vector in the dual space to C0([0, 1]) is given as follows. Let x ∈ [0, 1] be some fixed point. Then let Γx : C0([0, 1]) → ℜ be defined by Γx(f) = f(x), for all f ∈ C0([0, 1]).


  • For this last example, let us assume that V is a vector space with a scalar product. (Thus F = ℜ or ℂ.) For each v ∈ V, let φv(u) = <v, u>. Then φv ∈ V*.


Theorem 56
Let V be a finite dimensional vector space (over ℂ) and let V* be the dual space. For each v ∈ V, let φv : V → ℂ be given by φv(u) = <v, u>. Then given an orthonormal basis {v1, . . . , vn} of V, we have that {φv1, . . ., φvn} is a basis of V*. This is called the dual basis to {v1, . . . , vn}.

Proof
Let φ ∈ V* be an arbitrary linear mapping φ : V → ℂ. As always, we remember that φ is uniquely determined by its values φ(v1), . . . , φ(vn), which in this case are simply complex numbers. Say φ(vj) = cj ∈ ℂ, for each j. Now take some arbitrary vector v ∈ V. There is the unique expression

v = <v1, v>v1 + · · · + <vn, v>vn = φv1(v)v1 + · · · + φvn(v)vn,   so that   φ(v) = c1φv1(v) + · · · + cnφvn(v).

Therefore, φ = c1φv1 + · · · + cnφvn, and so {φv1, . . ., φvn} generates V*.

To show that {φv1, . . ., φvn} is linearly independent, let φ = c1φv1 + · · · + cnφvn be some linear combination, where cj ≠ 0, for at least one j. But then φ(vj) = cj ≠ 0, and thus φ ≠ 0 in V*.
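To make this concrete, here is a small numerical illustration (my own sketch, not part of Hemion's notes), using numpy and random test data. It checks that an arbitrary linear functional on ℂn really is the combination ∑j cjφvj with cj = φ(vj), once an orthonormal basis has been fixed.

```python
import numpy as np

# Sketch: verify Theorem 56 numerically for C^n with the standard scalar
# product <v, u> = conj(v) . u.  The orthonormal basis, the functional phi
# and the test vector are all randomly generated example data.

rng = np.random.default_rng(0)
n = 4

M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
V, _ = np.linalg.qr(M)                      # columns v_1,...,v_n: an orthonormal basis

w = rng.normal(size=n) + 1j * rng.normal(size=n)
phi = lambda u: w @ u                       # an arbitrary linear functional phi in V*

c = np.array([phi(V[:, j]) for j in range(n)])      # c_j = phi(v_j)
phi_dual = lambda u: sum(c[j] * np.vdot(V[:, j], u)  # sum_j c_j * phi_{v_j}(u)
                         for j in range(n))

u = rng.normal(size=n) + 1j * rng.normal(size=n)
print(np.isclose(phi(u), phi_dual(u)))      # True: the dual basis generates V*
```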

Corollary
dim(V*) = dim(V).

Corollary
More specifically, we have an isomorphism V → V*, such that v → φv for each v ∈ V.

But somehow, this isomorphism doesn’t seem to be very “natural”. It is defined in terms of some specific basis of V. What if V is not finite dimensional so that we have no basis to work with? For this reason, we do not think of V and V* as being “really” just the same vector space. [In case we have a scalar product, then there is a “natural” mapping V → V*, where v → φv, such that φv(u) = <v, u>, for all u ∈ V.]

On the other hand, let us look at the dual space of the dual space (V*)*. (Perhaps this is a slightly mind-boggling concept at first sight!) We imagine that “really” we just have (V*)* = V. For let Φ ∈ (V*)*. That means, for each φ ∈ V* we have Φ(φ) being some complex number. On the other hand, we also have φ(v) being some complex number, for each v ∈ V. Can we uniquely identify each v ∈ V with some Φ ∈ (V*)*, in the sense that both always give the same complex numbers, for all possible φ ∈ V*?

Let us say that there exists a v ∈ V such that Φ(φ) = φ(v), for all φ ∈ V*. In fact, if for a given v ∈ V we define Φv by Φv(φ) = φ(v), for each φ ∈ V*, then we certainly have a linear mapping Φv : V* → ℂ. On the other hand, given some arbitrary Φ ∈ (V*)*, do we have a unique v ∈ V such that Φ(φ) = φ(v), for all φ ∈ V*? At least in the case where V is finite dimensional, we can affirm that it is true by looking at the dual basis.


Dual mappings 
Let V and W be two vector spaces (where we again assume that the field is ℂ). Assume that we have a linear mapping f : V → W. Then we can define a linear mapping f* : W* → V* in a natural way as follows. For each φ ∈ W*, let f*(φ) = φ ◦ f. So it is obvious that f*(φ) : V → ℂ is a linear mapping. Now assume that V and W have scalar products, giving us the mappings s : V → V* and t : W → W*. So we can draw a little “diagram” to describe the situation.

    V --- f ---> W
    |            |
    s            t
    |            |
    V* <-- f* -- W*
The mappings s and t are isomorphisms, so we can go around the diagram, using the mapping f adj = s−1 ◦ f* ◦ t : W → V. This is the adjoint mapping to f. So we see that in the case V = W, a self-adjoint mapping f : V → V is one such that f adj = f.

Does this correspond with our earlier definition, namely that <u, f(v)> = <f(u), v> for all u and v ∈ V? To answer this question, look at the diagram, which now has the form

    V --- f ---> V
    |            |
    s            s
    |            |
    V* <-- f* -- V*
where s(v) ∈ V* is such that s(v)(u) = <v, u>, for all u ∈ V. Now f adj = s−1 ◦ f* ◦ s; that is, the condition f adj = f becomes s−1 ◦ f* ◦ s = f. Since s is an isomorphism, we can equally say that the condition is that f* ◦ s = s ◦ f. So let v be some arbitrary vector in V. We have s ◦ f(v) = f* ◦ s(v). However, remembering that this is an element of V*, we see that this means

(s ◦ f(v))(u) = (f* ◦ s)(v)(u), 

for all u ∈ V. But (s ◦ f(v))(u) = <f(v), u> and (f* ◦ s)(v)(u) = <v, f(u)>. Therefore we have

<f(v), u> = <v, f(u)>

for all v and u ∈ V, as expected.
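As a quick sanity check (a sketch with made-up matrices, not from the notes), one can verify numerically that for f(x) = Ax on ℂn with the standard scalar product, the adjoint mapping is given by the conjugate transpose of A, and that a Hermitian matrix is self-adjoint in the sense just derived.

```python
import numpy as np

# Sketch: for f(x) = A x on C^n with <v, u> = conj(v) . u, the adjoint
# satisfies <f(v), u> = <v, f_adj(u)>, and its matrix is the conjugate
# transpose of A.  The matrices and vectors below are random example data.

rng = np.random.default_rng(1)
n = 3
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A_adj = A.conj().T

v = rng.normal(size=n) + 1j * rng.normal(size=n)
u = rng.normal(size=n) + 1j * rng.normal(size=n)

print(np.isclose(np.vdot(A @ v, u), np.vdot(v, A_adj @ u)))   # True for any A

H = A + A.conj().T          # a Hermitian matrix built from A
print(np.isclose(np.vdot(H @ v, u), np.vdot(v, H @ u)))       # True: H is self-adjoint
```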




This is the last section for this series on Linear Algebra. But that is not to say that there is nothing more that you have to know about the subject. For example, when studying the theory of relativity you will encounter tensors, which are combinations of linear mappings and dual mappings. One speaks of “covariant” and “contravariant” tensors. That is, linear mappings and dual mappings.

But then, proceeding to the general theory of relativity, these tensors are used to describe differential geometry. That is, we no longer have a linear (that is, a vector) space. Instead, we imagine that space is curved, and in order to describe this curvature, we define a thing called the tangent vector space which you can think of as being a kind of linear approximation to the spatial structure near a given point. And so it goes on, leading to more and more complicated mathematical constructions, taking us away from the simple “linear” mathematics which we have seen in this semester.

After a few years of learning the mathematics of contemporary theoretical physics, perhaps you will begin to ask yourselves whether it really makes so much sense after all. Can it be that the physical world is best described by using all of the latest techniques which pure mathematicians happen to have been playing around with in the last few years — in algebraic topology, functional analysis, the theory of complex functions, and so on and so forth? Or, on the other hand, could it be that physics has been losing touch with reality, making constructions similar to the theory of epicycles of the medieval period, whose conclusions can never be verified using practical experiments in the real world?




IMPORTANT NOTE:
This series on Linear Algebra has been taken from the lecture notes prepared by Geoffrey Hemion. I used his notes when studying Linear Algebra for my physics course and it was really helpful. So, I thought that you could also benefit from his notes. The document can be found at his homepage.

Tuesday, June 24, 2014

Linear Algebra: #21 Which Matrices can be Diagonalized?


The complete answer to this question is a bit too complicated. It all has to do with a thing called the “minimal polynomial”.

Now we have seen that not all orthogonal matrices can be diagonalized. (Think about the rotations of ℜ2.) On the other hand, we can prove that all unitary, and also all Hermitian matrices can be diagonalized.

Of course, a matrix M is only a representation of a linear mapping f : V → V with respect to a given basis {v1, . . . , vn} of the vector space V. So to say that the matrix can be diagonalized is to say that it is similar to a diagonal matrix. That is, there exists another matrix S, such that S−1MS is diagonal.

S−1MS = diag(λ1, λ2, . . . , λn), a matrix whose only non-zero elements are the λj along the diagonal.

But this means that there must be a basis for V, consisting entirely of eigenvectors.

In this section we will consider complex vector spaces — that is, V is a vector space over the complex numbers ℂ. The vector space V will be assumed to have a scalar product associated with it, and the bases we consider will be orthonormal. We begin with a definition.

Definition
Let W ⊂ V be a subspace of V. Let

 W⊥ = {v ∈ V : <v, w> = 0, ∀w ∈ W}.

Then W⊥ is called the perpendicular space to W.

It is a rather trivial matter to verify that W⊥ is itself a subspace of V, and furthermore W ∩ W⊥ = {0}. In fact, we have:


Theorem 53
V = W ⊕ W⊥

Proof
Let {w1, . . . , wm} be some orthonormal basis for the vector space W. This can be extended to a basis {w1, . . . , wm, wm+1, . . . , wn} of V. Assuming the Gram-Schmidt process has been used, we may assume that this is an orthonormal basis. The claim is then that {wm+1, . . . , wn} is a basis for W⊥.

Now clearly, since <wj, wk> = 0, for j ≠ k, we have {wm+1, . . . , wn} ⊂ W⊥. If u ∈ W⊥ is some arbitrary vector in W⊥, then we have

u = <w1, u>w1 + · · · + <wn, u>wn = <wm+1, u>wm+1 + · · · + <wn, u>wn,

since <wj, u> = 0 if j ≤ m. (Remember, u ∈ W⊥.) Therefore, {wm+1, . . . , wn} is a linearly independent, orthonormal set which generates W⊥, so it is a basis. And so we have V = W ⊕ W⊥.
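A numerical sketch of Theorem 53 (illustrative only; the subspace below is random example data): starting from an orthonormal basis of W, an orthonormal basis of W⊥ can be read off, and together the two bases form an orthonormal basis of V.

```python
import numpy as np

# Sketch for V = R^5: given an orthonormal basis of a subspace W (columns of
# Wmat), the remaining columns of U in a full SVD give an orthonormal basis of
# the perpendicular space W_perp, so that V = W + W_perp.

rng = np.random.default_rng(2)
n, m = 5, 2
Wmat, _ = np.linalg.qr(rng.normal(size=(n, m)))      # orthonormal basis w1,...,wm of W

U, s, Vt = np.linalg.svd(Wmat, full_matrices=True)
W_perp = U[:, m:]                                    # orthonormal basis of W_perp

print(np.allclose(Wmat.T @ W_perp, 0))               # every w_j is orthogonal to W_perp
B = np.hstack([Wmat, W_perp])                        # combined basis of V
print(np.allclose(B.T @ B, np.eye(n)))               # it is an orthonormal basis of R^5
```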


Theorem 54
Let f : V → V be a unitary mapping (V is a vector space over the complex numbers ℂ). Then there exists an orthonormal basis {v1, . . . , vn} for V consisting of eigenvectors under f. That is to say, the matrix of f with respect to this basis is a diagonal matrix.

Proof
If the dimension of V is zero or one, then obviously there is nothing to prove. So let us assume that the dimension n is at least two, and we prove things by induction on the number n. That is, we assume that the theorem is true for spaces of dimension less than n.

Now, according to the fundamental theorem of algebra, the characteristic polynomial of f has a zero, λ say, which is then an eigenvalue for f. So there must be some non-zero vector vn ∈ V, with f(vn) = λvn. By dividing by the norm of vn if necessary, we may assume that ||vn|| = 1.

Let W ⊂ V be the 1-dimensional subspace generated by the vector vn. Then W⊥ is an (n−1)-dimensional subspace. We have that W⊥ is invariant under f. That is, if u ∈ W⊥ is some arbitrary vector, then f(u) ∈ W⊥ as well. This follows since

λ<f(u), vn> = <f(u), λvn> = <f(u), f(vn)> = <u, vn> = 0. 

But we have already seen that for an eigenvalue λ of a unitary mapping, we must have |λ| = 1. Therefore we must have <f(u), vn> = 0.

So we can consider f, restricted to W⊥, and using the inductive hypothesis, we obtain an orthonormal basis of eigenvectors {v1, . . . , vn-1} for W⊥. Therefore, adding in the last vector vn, we have an orthonormal basis of eigenvectors {v1, . . . , vn} for V.


Theorem 55
All Hermitian matrices can be diagonalized.

Proof
This is similar to the last one. Again, we use induction on n, the dimension of the vector space V. We have a self-adjoint mapping f : V → V. If n is zero or one, then we are finished. Therefore we assume that n ≥ 2.


Again, we observe that the characteristic polynomial of f must have a zero, hence there exists some eigenvalue λ, and an eigenvector vn of f, which has norm equal to one, where f(vn) = λvn. Again take W to be the one dimensional subspace of V generated by vn. Let W⊥ be the perpendicular space. It is only necessary to show that, again, W⊥ is invariant under f. But this is easy. Let u ∈ W⊥ be given. Then we have

 <f(u), vn> = <u, f(vn)> = <u, λvn> = λ<u, vn> = λ · 0 = 0.
The rest of the proof follows as before.

In the particular case where the matrix has only real entries (the real numbers being, of course, a subset of the complex numbers), a self-adjoint matrix is simply a symmetric matrix.

Corollary
All real symmetric matrices can be diagonalized.

Note furthermore that, even in the complex (Hermitian) case, the symmetry condition, namely ajk = ākj, implies that on the diagonal we have ajj = ājj for all j. That is, the diagonal elements are all real numbers. But after diagonalization, these diagonal elements are the eigenvalues. Therefore we have:

Corollary
The eigenvalues of a self-adjoint matrix — that is, a symmetric or a Hermitian matrix — are all real numbers.
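These statements are easy to check numerically. The sketch below (example data of my own) diagonalizes a randomly generated Hermitian matrix with numpy's eigh routine, which is intended for exactly this self-adjoint case, and confirms that the eigenvalues are real and the eigenvectors orthonormal.

```python
import numpy as np

# Sketch: a Hermitian matrix has real eigenvalues and can be diagonalized by
# a unitary matrix of eigenvectors (Theorem 55 and its corollaries).

rng = np.random.default_rng(3)
n = 4
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = B + B.conj().T                        # Hermitian by construction

print(np.allclose(np.linalg.eigvals(H).imag, 0))   # eigenvalues are (numerically) real

eigvals, Q = np.linalg.eigh(H)                     # routine for Hermitian matrices
print(np.allclose(Q.conj().T @ Q, np.eye(n)))              # Q is unitary
print(np.allclose(Q.conj().T @ H @ Q, np.diag(eigvals)))   # Q^-1 H Q is diagonal
```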


Orthogonal matrices revisited
Let A be an n × n orthogonal matrix. That is, it consists of real numbers, and we have At = A−1. In general, it cannot be diagonalized. But on the other hand, it can be brought into the following form by means of similarity transformations.

a block-diagonal matrix of the form diag(±1, . . . , ±1, R(θ1), . . . , R(θk)), where each 2 × 2 block

    R(θ) = ( cos θ   −sin θ )
           ( sin θ    cos θ )

is a rotation, and all other elements are zero.

To see this, start by imagining that A represents the orthogonal mapping f : ℜn → ℜn with respect to the canonical basis of ℜn. Now consider the symmetric matrix

B = A + At = A + A−1

This matrix represents another linear mapping, call it g : ℜn → ℜn, again with respect to the canonical basis of ℜn.

But, as we have just seen, B can be diagonalized. In particular, there exists some vector v ∈ ℜn with g(v) = λv, for some λ ∈ ℜ. We now proceed by induction on the number n. There are two cases to consider:
  • v is also an eigenvector for f, or 
  • it isn’t. 
The first case is easy. Let W ⊂ V be simply W = [v]; i.e. this is just the set of all scalar multiples of v. Let W⊥ be the perpendicular space to W. (That is, w ∈ W⊥ means that <w, v> = 0.) But it is easy to see that W⊥ is also invariant under f. This follows by observing first of all that f(v) = αv, with α = ±1. (Remember that the eigenvalues of orthogonal mappings have absolute value 1.) Now take w ∈ W⊥. Then <f(w), v> = α−1<f(w), αv> = α−1<f(w), f(v)> = α−1<w, v> = α−1 · 0 = 0. Thus, by changing the basis of ℜn to an orthonormal basis, starting with v (which we can assume has been normalized), we obtain that the original matrix is similar to the matrix

    ( α   0  )
    ( 0   A* )
where A* is an (n−1) ×(n−1) orthogonal matrix, which, according to the inductive hypothesis, can be transformed into the required form.

If v is not an eigenvector of f, then, still, we know it is an eigenvector of g, and furthermore g = f + f −1. In particular, g(v) = λv = f(v) + f −1(v). That is,

f(f(v)) = λf(v) − v.

So this time, let W = [v, f(v)]. This is a 2-dimensional subspace of V. Again, consider W⊥. We have V = W ⊕ W⊥. So we must show that W⊥ is invariant under f. Now we have another two cases to consider:
  • λ = 0, and 
  • λ ≠ 0. 
So if λ = 0 then we have f(f(v)) = −v. Therefore, again taking w ∈ W⊥, we have <f(w), v> = <f(w), −f(f(v))> = −<w, f(v)> = 0. (Remember that w ∈ W⊥, so that <w, f(v)> = 0.) Of course we also have <f(w), f(v)> = <w, v> = 0.

On the other hand, if λ ≠ 0 then we have v = λf(v) − f(f(v)) so that <f(w), v> = <f(w), λf(v) − f(f(v))> = λ<f(w), f(v)> − <f(w), f(f(v))>, and we have seen that both of these scalar products are zero. Finally, we again have <f(w), f(v)> = <w, v> = 0.

Therefore we have shown that V = W ⊕ W⊥, where both of these subspaces are invariant under the orthogonal mapping f. By our inductive hypothesis, there is an orthonormal basis for f restricted to the (n − 2)-dimensional subspace W⊥ such that the matrix has the required form. As far as W is concerned, we are back in the simple situation of an orthogonal mapping ℜ2 → ℜ2, and the matrix for this has the form of one of our 2 × 2 blocks.
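The block form can also be seen numerically. In the sketch below (a constructed example, not taken from the notes), the canonical block form D is built from ±1 entries and one 2 × 2 rotation block, hidden behind a random orthogonal change of basis, and the eigenvalues ±1 and e±iθ of the resulting orthogonal matrix are recovered.

```python
import numpy as np

# Sketch: an orthogonal matrix is orthogonally similar to a block-diagonal
# matrix with +1/-1 entries and 2x2 rotation blocks.  Here the block form D
# is built first and then conjugated by a random orthogonal matrix S.

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # one 2x2 rotation block

D = np.eye(4)
D[1, 1] = -1.0
D[2:, 2:] = R                                     # D = diag(+1, -1, R(theta))

rng = np.random.default_rng(4)
S, _ = np.linalg.qr(rng.normal(size=(4, 4)))      # a random orthogonal matrix
A = S @ D @ S.T                                   # similar to D, still orthogonal

print(np.allclose(A.T @ A, np.eye(4)))            # A is orthogonal
print(np.sort_complex(np.linalg.eigvals(A)))      # expect -1, 1 and exp(+-i*theta)
print(np.exp(1j * theta))                         # for comparison
```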

Sunday, June 22, 2014

Linear Algebra: #20 Characterizing Orthogonal, Unitary, and Hermitian Matrices


20.1 Orthogonal matrices
Let V be an n-dimensional real vector space (that is, over the real numbers ℜ), and let {v1, . . . , vn} be an orthonormal basis for V. Let f : V → V be an orthogonal mapping, and let A be its matrix with respect to the basis {v1, . . . , vn}. Then we say that A is an orthogonal matrix.


Theorem 50
The n × n matrix A is orthogonal ⇔ A−1 = At (Recall that if aij is the ij-th element of A, then the ij-th element of At is aji . That is, everything is “flipped over” the main diagonal in A.)

Proof
For an orthogonal mapping f, we have <vj, vk> = <f(vj), f(vk)>, for all j and k. But in the matrix notation, the scalar product becomes the inner product. That is, since the basis is orthonormal,

<vj, vk> = 1 if j = k, and <vj, vk> = 0 if j ≠ k.

In other words, the matrix whose jk-th element is always <vj , vk> is the n×n identity matrix In. On the other hand,

f(vj) = a1jv1 + a2jv2 + · · · + anjvn.

That is, we obtain the j-th column of the matrix A. Furthermore, since <vj , vk> = <f(vj), f(vk)>, we must have the matrix whose jk-th elements are <f(vj), f(vk)> being again the identity matrix. So

<f(vj), f(vk)> = a1ja1k + a2ja2k + · · · + anjank = 1 if j = k, and 0 otherwise.

But now, if you think about it, you see that this is just one part of the matrix multiplication AtA. All together, we have

AtA = In.

Thus we conclude that A−1 = At. (Note: this was only the proof that f orthogonal ⇒ A−1 = At. The proof in the other direction, going backwards through our argument, is easy, and is left as an exercise for you.)
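A quick numerical check of Theorem 50 (random example data, my own sketch): the Q factor of a QR decomposition is an orthogonal matrix, and indeed QtQ = In and Q−1 = Qt.

```python
import numpy as np

# Sketch: the matrix of an orthogonal mapping with respect to an orthonormal
# basis satisfies A^t A = I, i.e. A^{-1} = A^t.  The Q factor of a QR
# decomposition of a random real matrix is a convenient orthogonal example.

rng = np.random.default_rng(5)
A, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # A is orthogonal

print(np.allclose(A.T @ A, np.eye(4)))         # A^t A = I_n
print(np.allclose(np.linalg.inv(A), A.T))      # A^{-1} = A^t
# The columns are the images f(v_j); their pairwise scalar products vanish,
# exactly as in the proof above.
print(np.isclose(A[:, 0] @ A[:, 1], 0.0))
```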


20.2 Unitary matrices


Theorem 51
The n × n matrix A is unitary ⇔ A−1 = Āt. (The matrix Ā is obtained from A by taking the complex conjugates of all its elements.)

Proof
Entirely analogous with the case of orthogonal matrices. One must note however, that the inner product in the complex case is

<u, v> = ū1v1 + ū2v2 + · · · + ūnvn.

20.3 Hermitian and symmetric matrices


Theorem 52
The n × n matrix A is Hermitian ⇔ A = Āt.

Proof
This is again a matter of translating the condition <vj, f(vk)> = <f(vj), vk> into matrix notation, where f is the linear mapping which is represented by the matrix A, with respect to the orthonormal basis {v1, . . . , vn}. We have

<vj, f(vk)> = ajk   and   <f(vj), vk> = ākj,   so the condition becomes ajk = ākj for all j and k; that is, A = Āt.

In particular, we see that in the real case, self-adjoint matrices are symmetric.

Saturday, June 21, 2014

Linear Algebra: #19 "Classical Groups" often seen in Physics



  • The orthogonal group O(n): This is the set of all linear mappings f : ℜn → ℜn such that <u, v> = <f(u), f(v)>, for all u, v ∈ ℜn. We think of this as being all possible rotations and reflections of n-dimensional Euclidean space. 

  • The special orthogonal group SO(n): This is the subgroup of O(n), containing all orthogonal mappings whose matrices have determinant +1. 

  • The unitary group U(n): The analog of O(n), where the vector space is n-dimensional complex space ℂn. That is, <u, v> = <f(u), f(v)>, for all u, v ∈ ℂn

  • The special unitary group SU(n): Again, the subgroup of U(n) with determinant +1. 

Note that for orthogonal, or unitary mappings, all eigenvalues — if they exist — must have absolute value 1. To see this, let v be an eigenvector with eigenvalue λ. Then we have

<v, v> = <f(v), f(v)> = <λv, λv> = |λ|2<v, v>.

Since v is an eigenvector, and thus v ≠ 0, we have <v, v> > 0, and therefore we must have |λ| = 1.

We will prove that all unitary matrices can be diagonalized. That is, for every unitary mapping ℂn → ℂn, there exists a basis consisting of eigenvectors. On the other hand, as we have already seen in the case of simple rotations of 2-dimensional space, “most” orthogonal matrices cannot be diagonalized. Nevertheless, we can prove that every orthogonal mapping ℜn → ℜn, where n is an odd number, has at least one eigenvector. [For example, in our normal 3-dimensional space of physical reality, any rotating object — for example the Earth rotating in space — has an axis of rotation, which is an eigenvector.]

  • The self-adjoint mappings f (of ℜn → ℜn or ℂn → ℂn) are such that <u, f(v)> = <f(u), v>, for all u, v in ℜn or ℂn, respectively. As we will see, the matrices for such mappings are symmetric in the real case, and Hermitian in the complex case. In either case, the matrices can be diagonalized. Examples of Hermitian matrices are the Pauli spin matrices:

    σx = ( 0   1 )     σy = ( 0   −i )     σz = ( 1    0 )
         ( 1   0 )          ( i    0 )          ( 0   −1 )
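For readers who like to check such things by machine, here is a short numpy sketch (using the standard Pauli matrices written out above) confirming that each Pauli matrix is Hermitian, unitary, and has the real eigenvalues ±1.

```python
import numpy as np

# Sketch: the Pauli matrices are both Hermitian and unitary, so each can be
# diagonalized and has real eigenvalues of absolute value 1, namely +1 and -1.

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]])
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

for name, s in [("sigma_x", sigma_x), ("sigma_y", sigma_y), ("sigma_z", sigma_z)]:
    hermitian = np.allclose(s, s.conj().T)             # s equals its conjugate transpose
    unitary = np.allclose(s.conj().T @ s, np.eye(2))   # s^H s = I
    print(name, hermitian, unitary, np.linalg.eigvalsh(s))   # eigenvalues: [-1, 1]
```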

We also have the Lorentz group, which is important in the Theory of Relativity. Let us imagine that physical space is ℜ4, and a typical point is v = (tv, xv, yv, zv). Physicists call this Minkowski space, which they often denote by M4. A linear mapping f : M4 → M4 is called a Lorentz transformation if, for f(v) = (tv*, xv*, yv*, zv*), we have

  • − (tv*)2 + (xv*)2 + (yv*)2 +  (zv*)2 = − tv2 + xv2 + yv2 +  zv2 for  all v ∈ M4, and also the mapping is “time-preserving” in the sense that the unit vector in the time direction, (1, 0, 0, 0) is mapped to some vector (t*, x*, y*, z*), such that t* > 0.

The Poincaré group is obtained if we consider, in addition, translations of Minkowski space. But translations are not linear mappings, so I will not consider these things further in this lecture.

Friday, June 20, 2014

Linear Algebra: #18 Orthogonal Bases


Our vector space V is now assumed to be either Euclidean, or else unitary — that is, it is defined over either the real numbers ℜ, or else the complex numbers ℂ. In either case we have a scalar product <·,·> : V × V → F (here, F = ℜ or ℂ).

As always, we assume that V is finite dimensional, and thus it has a basis {v1, . . . , vn}. Thinking about the canonical basis for ℜn or ℂn, and the inner product as our scalar product, we see that it would be nice if we had
  • <vj , vj> = 1, for all j (that is, the basis vectors are normalized), and furthermore 
  • <vj , vk> = 0, for all j ≠ k (that is, the basis vectors are an orthogonal set in V). 
In other words, <vj, vk> = δjk, where δjk = 1 if j = k and δjk = 0 otherwise.


That is to say, {v1, . . . , vn} is an orthonormal basis of V. Unfortunately, most bases are not orthonormal. But this doesn’t really matter. For, starting from any given basis, we can successively alter the vectors in it, gradually changing it into an orthonormal basis. This process is often called the Gram-Schmidt orthonormalization process. But first, to show you why orthonormal bases are good, we have the following theorem.


Theorem 48
Let V have the orthonormal basis {v1, . . . , vn}, and let x ∈ V be arbitrary. Then

x = <v1, x>v1 + <v2, x>v2 + · · · + <vn, x>vn.

That is, the coefficients of x, with respect to the orthonormal basis, are simply the scalar products with the respective basis vectors.

Proof
This follows simply because if x = ∑ ajvj, with (j = 1, ... , n), then we have for each k,

<vk, x> = <vk, ∑j ajvj> = ∑j aj<vk, vj> = ak.
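Numerically, Theorem 48 says that if the orthonormal basis vectors are collected as the columns of a matrix V, the coefficient vector of x is simply Vtx. A small sketch with random data (my own example, not from the notes):

```python
import numpy as np

# Sketch: with respect to an orthonormal basis (columns of V), the
# coefficients of x are the scalar products <v_j, x>.

rng = np.random.default_rng(6)
n = 4
V, _ = np.linalg.qr(rng.normal(size=(n, n)))   # columns v_1,...,v_n: orthonormal basis
x = rng.normal(size=n)

a = V.T @ x                                    # a_j = <v_j, x>
print(np.allclose(V @ a, x))                   # x = sum_j a_j v_j
```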

So now to the Gram-Schmidt process. To begin with, if a non-zero vector v ∈ V is not normalized — that is, its norm is not one — then it is easy to multiply it by a scalar, changing it into a vector with norm one. For we have <v, v> > 0. Therefore ||v|| = √<v, v> > 0 and we have

|| (1/||v||) · v || = (1/||v||) · ||v|| = 1.

In other words, we simply multiply the vector by the inverse of its norm.


Theorem 49
Every finite dimensional vector space V which has a scalar product has an orthonormal basis.

Proof
The proof proceeds by constructing an orthonormal basis {u1, . . . , un} from a given, arbitrary basis {v1, . . . , vn}. To describe the construction, we use induction on the dimension, n. If n = 1 then there is almost nothing to prove. Any non-zero vector is a basis for V, and as we have seen, it can be normalized by dividing by the norm. (That is, scalar multiplication with the inverse of the norm.)

So now assume that n ≥ 2, and furthermore assume that the Gram-Schmidt process can be constructed for any n−1 dimensional space. Let U ⊂ V be the subspace spanned by the first n − 1 basis vectors {v1, . . . , vn−1}. Since U is only n − 1 dimensional, our assumption is that there exists an orthonormal basis {u1, . . . , un−1} for U. Clearly, adding in vn gives a new basis {u1, . . . , un−1, vn} for V.

[Since both {v1, . . . , vn−1} and {u1, . . . , un−1} are bases for U, we can write each vj as a linear combination of the uk’s. Therefore {u1, . . . , un−1, vn} spans V, and since the dimension is n, it must be a basis.]

Unfortunately, this last vector, vn, might disturb the nice orthonormal character of the other vectors. Therefore, we replace vn with a new vector, as follows. (A linearly independent set remains linearly independent if to one of the vectors we add some linear combination of the other vectors.)

un* = vn − <u1, vn>u1 − <u2, vn>u2 − · · · − <un−1, vn>un−1,   and then   un = un* / ||un*||.
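For completeness, here is a compact implementation of the Gram-Schmidt process just described (a sketch written for clarity rather than numerical robustness; the test basis is my own example):

```python
import numpy as np

# Sketch of the Gram-Schmidt process: subtract from each vector its components
# along the previously constructed orthonormal vectors, then normalize.

def gram_schmidt(vectors):
    """Turn a list of linearly independent real vectors into an orthonormal list."""
    ortho = []
    for v in vectors:
        u = np.array(v, dtype=float)
        for q in ortho:
            u = u - np.dot(q, u) * q          # remove the component along q
        ortho.append(u / np.linalg.norm(u))   # normalize
    return ortho

basis = [np.array([1.0, 1.0, 0.0]),
         np.array([1.0, 0.0, 1.0]),
         np.array([0.0, 1.0, 1.0])]
Q = np.column_stack(gram_schmidt(basis))
print(np.allclose(Q.T @ Q, np.eye(3)))        # the result is an orthonormal basis
```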

Wednesday, June 18, 2014

Linear Algebra: #17 Scalar Products, Norms, etc.


So now we have arrived at the subject matter which is usually taught in the second semester of the beginning lectures in mathematics — that is in Linear Algebra II— namely, the properties of (finite dimensional) real and complex vector spaces. Finally now, we are talking about geometry. That is, about vector spaces which have a distance function. (The word “geometry” obviously has to do with the measurement of physical distances on the earth.)

So let V be some finite dimensional vector space over ℜ, or ℂ. Let v ∈ V be some vector in V. Then, since V ≅ ℜn, or ℂn, we can write v = ∑ ajej, (with j ranging from 1, ... , n) where {e1, . . . , en} is the canonical basis for ℜn or ℂn and aj ∈ ℜ or ℂ, respectively, for all j. Then the length of v is defined to be the non-negative real number

||v|| = √(|a1|2 + · · · + |an|2).

Of course, as these things always are, we will not simply confine ourselves to measurements of normal physical things on the earth. We have already seen that the idea of a complex vector space defies our normal powers of geometric visualization. Also, we will not always restrict things to finite dimensional vector spaces. For example, spaces of functions — which are almost always infinite dimensional — are also very important in theoretical physics. Therefore, rather than saying that ||v|| is the “length” of the vector v, we use a new word, and we say that ||v|| is the norm of v. In order to define this concept in a way which is suitable for further developments, we will start with the idea of a scalar product of vectors.

Definition
Let F = ℜ or ℂ and let V, W be two vector spaces over F. A bilinear form is a mapping s : V × W → F satisfying the following conditions with respect to arbitrary elements v, v1 and v2 ∈ V, w, w1 and w2 ∈ W, and a ∈ F.
  1. s(v1 + v2, w) = s(v1, w) + s(v2, w), 
  2. s(av, w) = as(v, w), 
  3. s(v, w1 + w2) = s(v, w1) + s(v, w2) and 
  4. s(v, aw) = as(v, w). 

If V = W, then we say that a bilinear form s : V × V → F is symmetric, if we always have s(v1, v2) = s(v2, v1). Also the form is called positive definite if s(v, v) > 0 for all v ≠ 0.

On the other hand, if F = ℂ and f : V → W is such that we always have
  1. f(v1 + v2) = f(v1) + f(v2) and 
  2. f(av) = āf(v), where ā denotes the complex conjugate of a,
then f is a semi-linear (not a linear) mapping. (Note: if F = ℜ then semi-linear is the same as linear.)

A mapping s : V × W → F such that
  1. The mapping given by s(·, w) : V → F, where v → s(v, w), is semi-linear for all w ∈ W, whereas 
  2. The mapping given by s(v, ·) : W → F, where w → s(v, w), is linear for all v ∈ V 
is called a sesqui-linear form.

In the case V = W, we say that the sesqui-linear form is Hermitian (or Euclidean, if we only have F = ℜ), if we always have s(v1, v2) equal to the complex conjugate of s(v2, v1). (Therefore, if F = ℜ, an Hermitian form is symmetric.)

Finally, a scalar product is a positive definite Hermitian form s : V × V → F. Normally, one writes <v1, v2>, rather than s(v1, v2).

Well, these are a lot of new words. To be more concrete, we have the inner products, which are examples of scalar products.


Inner products
Let u = (u1, u2, . . . , un)t and v = (v1, v2, . . . , vn)t be vectors in ℂn.
Thus, we are considering these vectors as column vectors, defined with respect to the canonical basis of ℂn. Then define (using matrix multiplication)
<u, v> = ūtv = ū1v1 + ū2v2 + · · · + ūnvn.

It is easy to check that this gives a scalar product on ℂn. This particular scalar product is called the inner product.
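In numpy terms (a small sketch with made-up vectors), the inner product on ℂn is the sum ∑j ūjvj; note that numpy's vdot conjugates its first argument, matching the sesqui-linear convention used here.

```python
import numpy as np

# Sketch: the complex inner product, conjugating the first argument.

u = np.array([1 + 2j, 3 - 1j])
v = np.array([2 - 1j, 1j])

by_hand = np.sum(np.conj(u) * v)     # <u, v> = sum_j conj(u_j) v_j
print(by_hand, np.vdot(u, v))        # the two agree

# Positive definiteness: <u, u> is a non-negative real number, the squared norm.
print(np.vdot(u, u).real, np.linalg.norm(u) ** 2)
```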

Remark
One often writes u · v for the inner product. Thus, considering it to be a scalar product, we just have u · v = <u, v>.

This inner product notation is often used in classical physics; in particular in Maxwell’s equations. Maxwell’s equations also involve the “vector product” u × v. However the vector product of classical physics only makes sense in 3-dimensional space. Most physicists today prefer to imagine that physical space has 10, or even more — perhaps even a frothy, undefinable number of — dimensions. Therefore it appears to be the case that the vector product might have gone out of fashion in contemporary physics. Indeed, mathematicians can imagine many other possible vector-space structures as well. Thus I shall dismiss the vector product from further discussion here.

Definition
A real vector space (that is, over the field of the real numbers ℜ), together with a scalar product is called a Euclidean vector space. A complex vector space with scalar product is called a unitary vector space.

Now, the basic reason for making all these definitions is that we want to define the length — that is the norm — of the vectors in V. Given a scalar product, then the norm of vV — with respect to this scalar product — is the non-negative real number

||v|| = √<v, v> .

More generally, one defines a norm-function on a vector space in the following way.

Definition
Let V be a vector space over ℂ (and thus we automatically also include the case ℜ ⊂ ℂ as well). A function || · || : V → ℜ is called a norm on V if it satisfies the following conditions.
  1. ||av|| = |a| ||v|| for all vV and for all a ∈ ℂ, 
  2. ||v1 + v2|| ≤ ||v1|| + ||v2|| for all v1, v2 ∈ V (the triangle inequality), and 
  3. ||v|| = 0 ⇔ v = 0.


Theorem 46 (Cauchy-Schwarz inequality)
Let V be a Euclidean or a unitary vector space, and let ||v|| = √<v, v> for all v ∈ V. Then we have

|<u, v>| ≤ ||u|| · ||v||

for all u and v ∈ V. Furthermore, the equality |<u, v>| = ||u|| · ||v|| holds if, and only if, the set {u, v} is linearly dependent.

Proof
It suffices to show that |<u, v>|2 ≤ <u, u><v, v>. Now, if v = 0, then — using the properties of the scalar product — we have both <u, v> = 0 and <v, v> = 0. Therefore the theorem is true in this case, and we may assume that v ≠ 0. Thus <v, v> > 0. Let

a = <v, u>/<v, v>. Then

0 ≤ <u − av, u − av> = <u, u> − a<u, v> − ā<v, u> + āa<v, v> = <u, u> − |<u, v>|2/<v, v>,

so that |<u, v>|2 ≤ <u, u><v, v>,

which gives the Cauchy-Schwarz inequality. When do we have equality?

If v = 0 then, as we have already seen, the equality |<u, v>| = ||u|| · ||v|| is trivially true, and {u, v} is linearly dependent. On the other hand, when v ≠ 0, then equality holds when <u − av, u − av> = 0. But since the scalar product is positive definite, this holds when u − av = 0. So in this case as well, {u, v} is linearly dependent.
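A quick numerical check of the Cauchy-Schwarz inequality (random complex vectors; my own sketch): the inequality holds, and becomes an equality when one vector is a scalar multiple of the other.

```python
import numpy as np

# Sketch: |<u, v>| <= ||u|| ||v||, with equality when {u, v} is linearly dependent.

rng = np.random.default_rng(7)
u = rng.normal(size=5) + 1j * rng.normal(size=5)
v = rng.normal(size=5) + 1j * rng.normal(size=5)

lhs = abs(np.vdot(u, v))
rhs = np.linalg.norm(u) * np.linalg.norm(v)
print(lhs <= rhs + 1e-12)                          # the inequality holds

w = (2 - 3j) * u                                   # w is a scalar multiple of u
print(np.isclose(abs(np.vdot(u, w)), np.linalg.norm(u) * np.linalg.norm(w)))
```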


Theorem 47
Let V be a vector space with scalar product, and define the non-negative function || · || : V → ℜ by  ||v|| = √<v, v> . Then || · || is a norm function on V.

Proof
The first and third properties in our definition of norms are obviously satisfied. As far as the triangle inequality is concerned, begin by observing that for arbitrary complex numbers z = x + yi ∈ ℂ we have

z + z̄ = 2x ≤ 2|z|.

Therefore, using this observation together with the Cauchy-Schwarz inequality,

||v1 + v2||2 = <v1 + v2, v1 + v2> = <v1, v1> + <v1, v2> + <v2, v1> + <v2, v2> ≤ ||v1||2 + 2|<v1, v2>| + ||v2||2 ≤ ||v1||2 + 2||v1|| · ||v2|| + ||v2||2 = (||v1|| + ||v2||)2.

Taking square roots gives the triangle inequality.

Tuesday, June 17, 2014

Linear Algebra: #16 Complex Numbers



On the other hand, looking at the characteristic polynomial, namely x2 − 2x cos θ + 1 in the previous example, we see that in the case θ = ±π/2 this reduces to x2 + 1. And in the realm of the complex numbers ℂ, this equation does have zeros, namely ±i. Therefore we have the seemingly bizarre situation that a “complex” rotation through a quarter of a circle has vectors which are mapped back onto themselves (multiplied by plus or minus the “imaginary” number i). But there is no need for panic here! We need not follow the example of numerous famous physicists of the past, declaring the physical world to be “paradoxical”, “beyond human understanding”, etc. No. What we have here is a purely algebraic result using the abstract mathematical construction of the complex numbers which, in this form, has nothing to do with rotations of real physical space!

So let us forget physical intuition and simply enjoy thinking about the artificial mathematical game of extending the system of real numbers to the complex numbers. I assume that you all know that the set of complex numbers ℂ can be thought of as being the set of numbers of the form x + yi, where x and y are elements of the real numbers ℜ and i is an abstract symbol, introduced as a “solution” to the equation x2 + 1 = 0. Thus i2  = −1. Furthermore, the set of numbers of the form x + 0 · i can be identified simply with x, and so we have an embedding ℜ ⊂ ℂ. The rules of addition and multiplication in ℂ are

(x1 + y1i) + (x2 + y2i) = (x1 + x2) + (y1 + y2)i   and   (x1 + y1i) · (x2 + y2i) = (x1x2 − y1y2) + (x1y2 + x2y1)i.

Let z = x + yi be some complex number. Then the absolute value of z is defined to be the (non-negative) real number |z| = √(x2 + y2). The complex conjugate of z is z̄ = x − yi. Therefore |z| = √(zz̄).

It is a simple exercise to show that ℂ is a field. The main result — called (in German) the Hauptsatz der Algebra — is that ℂ is an algebraically closed field. That is, let ℂ[z] be the set of all polynomials with complex numbers as coefficients. Thus, for P(z) ∈ ℂ[z] we can write P(z) = cnzn + cn−1zn−1 + · · · + c1z + c0, where cj ∈ ℂ, for all j = 0, . . . , n. Then we have:


Theorem 44 (Hauptsatz der Algebra)
Let P(z) ∈ ℂ[z] be an arbitrary polynomial with complex coefficients. Then P has a zero in ℂ. That is, there exists some λ ∈ ℂ with P(λ) = 0.

The theory of complex functions (Funktionentheorie in German) is an extremely interesting and pleasant subject. Complex analysis is quite different from real analysis.


Theorem 45
Every complex polynomial can be completely factored into linear factors. That is, for each P(z) ∈ ℂ[z] of degree n, there exist n complex numbers (perhaps not all different) λ1, . . . , λn, and a further complex number c, such that
P(z) = c(λ1 − z) · · · (λn − z).


Proof
Given P(z), theorem 44 tells us that there exists some λ1 ∈ ℂ, such that P(λ1) = 0. Let us therefore divide the polynomial P(z) by the polynomial (λ1− z). We obtain

P(z) =  (λ1− z) · Q(z) + R(z), 

where both Q(z) and R(z) are polynomials in ℂ[z]. However, the degree of R(z) is less than the degree of the divisor, namely  (λ1− z), which is 1. That is, R(z) must be a polynomial of degree zero, i.e. R(z) = r ∈ ℂ, a constant. But what is r? If we put λ1 into our equation, we obtain

0 = P(λ1) = (λ1 − λ1)Q(λ1) + r = 0 + r.

Therefore r = 0, and so

P(z) = (λ1− z)Q(z), 

where Q(z) must be a polynomial of degree n−1. Therefore we apply our argument in turn to Q(z), again reducing the degree, and in the end, we obtain our factorization into linear factors.

So the consequence is: let V be a vector space over the field of complex numbers ℂ. Then every linear mapping f : VV has at least one eigenvalue, and thus at least one eigenvector.
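Numerically, the factorization of Theorem 45 can be carried out with numpy's roots function, which returns the n complex zeros of a polynomial. The sketch below (with a polynomial chosen just for the example) checks the factorization at a sample point.

```python
import numpy as np

# Sketch: a complex polynomial factors completely into linear factors.

coeffs = [1, -2, 4, -8]                 # P(z) = z^3 - 2z^2 + 4z - 8
roots = np.roots(coeffs)                # the zeros: 2, 2i, -2i
print(roots)

z = 1.5 + 0.7j                          # check the factorization at a sample point
P_direct = np.polyval(coeffs, z)
P_factored = np.prod([z - r for r in roots])   # P is monic, so P(z) = prod (z - lambda_j)
print(np.isclose(P_direct, P_factored))
```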

Monday, June 16, 2014

Linear Algebra: #15 Why is the Determinant Important?


I am sure there are many points which could be advanced in answer to this question. But here I will concentrate on only two special points.
  • The transformation formula for integrals in higher-dimensional spaces.
    This is a theorem which is usually dealt with in the Analysis III lecture. Let G ⊂ ℜn be some open region, and let f : G → ℜ be a continuous function. Then the integral
    ∫G f(x) dx
    has some particular value (assuming, of course, that the integral converges). Now assume that we have a continuously differentiable injective mapping φ : G → ℜn and a continuous function F : φ(G) → ℜ. Then we have the formula
    ∫φ(G) F(u) du = ∫G F(φ(x)) |det(D(φ(x)))| dx.
    Here, D(φ(x)) is the Jacobi matrix of φ at the point x.

    This formula reflects the geometric idea that the determinant measures the change of the volume of n-dimensional space under the mapping φ.

    If φ is a linear mapping, then take Q ⊂ ℜn to be the unit cube: Q = {(x1, . . . , xn) : 0 ≤ xi ≤ 1, ∀i}. Then the volume of Q, which we can denote by vol(Q) is simply 1. On the other hand, we have vol(φ(Q)) = det(A), where A is the matrix representing φ with respect to the canonical coordinates for ℜn. (A negative determinant — giving a negative volume — represents an orientation-reversing mapping.)

  • The characteristic polynomial.
    Let f : V → V be a linear mapping, and let v be an eigenvector of f with f(v) = λv. That means that (f − λ·id)(v) = 0, where id is the identity mapping; therefore the mapping f − λ·id : V → V is singular. Now consider the matrix A, representing f with respect to some particular basis of V. Since λIn is the matrix representing the mapping λ·id, we must have that the difference A − λIn is a singular matrix. In particular, we have det(A − λIn) = 0.

    Another way of looking at this is to take a “variable” x, and then calculate (for example, using the Leibniz formula) the polynomial in x

    P(x) = det(A − xIn). 

    This polynomial is called the characteristic polynomial for the matrix A. Therefore we have the theorem:


    Theorem 41
    The zeros of the characteristic polynomial of A are the eigenvalues of the linear mapping f : V → V which A represents.

    Obviously the degree of the polynomial is n for an n × n matrix A. So let us write the characteristic polynomial in the standard form

    P(x) = cnxn + cn−1xn−1 + · · · + c1x + c0

    The coefficients c0, . . . , cn are all elements of our field F.

    Now the matrix A represents the mapping f with respect to a particular choice of basis for the vector space V. With respect to some other basis, f is represented by some other matrix A', which is similar to A. That is, there exists some C ∈ GL(n, F) with A' = C−1AC. But we have

    det(A' − xIn) = det(C−1AC − xIn) = det(C−1(A − xIn)C) = det(C−1) det(A − xIn) det(C) = det(A − xIn).
    Therefore we have:


    Theorem 42
    The characteristic polynomial is invariant under a change of basis; that is, under a similarity transformation of the matrix.

    In particular, each of the coefficients ci of the characteristic polynomial P(x) = cnxn + cn−1xn−1 + · · · + c1x + c0 remains unchanged after a similarity transformation of the matrix A.

    What is the coefficient cn? Looking at the Leibniz formula, we see that the term xn can only occur in the product

    (a11 − x)(a22 − x) · · · (ann − x) = (−1)nxn + (−1)n−1(a11 + a22 + · · · + ann)xn−1 + · · · . 

    Therefore cn = 1 if n is even, and cn = −1 if n is odd. This is not particularly interesting.

    So let us go one term lower and look at the coefficient cn−1. Where does xn−1 occur in the Leibniz formula? Well, as we have just seen, there certainly is the term

    (−1)n−1(a11 + a22 + · · · + ann)xn−1

    which comes from the product of the diagonal elements in the matrix A − xIn. Do any other terms also involve the power xn−1? Let us look at Leibniz formula more carefully in this situation. We have

    det(A − xIn) = ∑σ∈Sn sign(σ) (a1σ(1) − δ1σ(1)x)(a2σ(2) − δ2σ(2)x) · · · (anσ(n) − δnσ(n)x).

    Here, δij = 1 if i = j. Otherwise, δij = 0. Now if σ is a non-trivial permutation — not just the identity mapping — then obviously we must have two different numbers i1 and i2, with σ(i1) ≠ i1 and also σ(i2) ≠ i2. Therefore we see that these further terms in the sum can only contribute at most n − 2 powers of x. So we conclude that the (n − 1)-st coefficient is

    cn−1 = (−1)n−1(a11 + a22 + · · · + ann).

    Definition

    Let A be an n × n matrix. The trace of A (in German, the spur of A) is the sum of the diagonal elements:

    tr(A) = a11 + a22 + · · · + ann


    Theorem 43
    tr(A) remains unchanged under a similarity transformation.

An example
Let f : ℜ2 → ℜ2 be a rotation through the angle θ. Then, with respect to the canonical basis of ℜ2, the matrix of f is
A = ( cos θ   −sin θ )
    ( sin θ    cos θ )

and the characteristic polynomial is det(A − xI2) = (cos θ − x)2 + sin2θ = x2 − 2x cos θ + 1.
That is to say, if λ ∈ ℜ is an eigenvalue of f, then λ must be a zero of the characteristic polynomial. That is,

 λ2 − 2λ cos θ + 1 = 0. 

But, looking at the well-known formula for the roots of quadratic polynomials, we see that such a λ can only exist if |cos θ| = 1. That is, θ = 0 or π. This reflects the obvious geometric fact that a rotation through any angle other than 0 or π rotates any vector away from its original axis. In any case, the two possible values of θ give the two possible eigenvalues for f, namely +1 and −1.
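The invariance statements of Theorems 41-43 are easy to test numerically. In the sketch below (random example matrices, my own illustration), numpy's poly function returns the coefficients of the characteristic polynomial (numpy uses the convention det(xIn − A), which differs from det(A − xIn) only by the sign (−1)n and so has the same zeros); the coefficients, and in particular the trace, are unchanged under a similarity transformation, and the eigenvalues are the zeros.

```python
import numpy as np

# Sketch: the characteristic polynomial and the trace are similarity invariants,
# and the eigenvalues are the zeros of the characteristic polynomial.

rng = np.random.default_rng(8)
A = rng.normal(size=(3, 3))
C = rng.normal(size=(3, 3))
A_prime = np.linalg.inv(C) @ A @ C                        # a similar matrix

print(np.allclose(np.poly(A), np.poly(A_prime)))          # same characteristic polynomial
print(np.isclose(np.trace(A), np.trace(A_prime)))         # same trace

eigvals = np.linalg.eigvals(A)
print(np.allclose(np.polyval(np.poly(A), eigvals), 0))    # eigenvalues are its zeros
```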

Sunday, June 15, 2014

Linear Algebra: #14 Leibniz Formula


Definition
A permutation of the numbers {1, . . . , n} is a bijection
σ : {1, . . . , n} → {1, . . . , n}. 

The set of all permutations of the numbers {1, . . . , n} is denoted Sn. In fact, Sn is a group: the symmetric group of degree n. Given a permutation σ ∈ Sn, we will say that a pair of numbers (i, j), with i, j ∈ {1, . . . , n}, is a “reversed pair” if i < j, yet σ(i) > σ(j). Let s(σ) be the total number of reversed pairs in σ. Then the sign of σ is defined to be the number
sign(σ) = (−1)s(σ)


Theorem 37 (Leibniz)
Let the elements in the matrix A be aij , for i, j between 1 and n. Then we have

det(A) = ∑σ∈Sn sign(σ) a1σ(1)a2σ(2) · · · anσ(n).

As a consequence of this formula, the following theorems can be proved:


Theorem 38
Let A be a diagonal matrix
(that is, aii = λi for each i, and aij = 0 whenever i ≠ j).

Then det(A) = λ1 λ2 · · · λn .


Theorem 39
Let A be a triangular matrix
(say upper triangular: aij = 0 whenever i > j).
Then det(A) = a11a22 · · · ann .

Leibniz formula also gives:

Definition
Let A ∈ M(n × n, F). The transpose At of A is the matrix consisting of elements aijt  such that for all i and j we have aijt = aji, where aji are the elements of the original matrix A.


Theorem 40
det(At) = det(A).


14.1 Special rules for 2 × 2 and 3 × 3 matrices
Let A = (aij) denote a 2 × 2, respectively 3 × 3, matrix.
For the 2 × 2 matrix, the Leibniz formula reduces to the simple formula

det(A) =  a11a22 - a12a21 

For the 3 × 3 matrix, the formula is a little more complicated.

det(A) = a11a22a33 + a12a23a31 + a13a21a32 − a13a22a31 − a11a23a32 − a12a21a33
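The Leibniz formula can also be translated directly into code. The sketch below (my own implementation; it is O(n!) and only sensible for small n) computes the sign of each permutation by counting reversed pairs, exactly as in the definition above, and agrees with numpy's determinant on a small example.

```python
import itertools
import numpy as np

# Sketch: the Leibniz formula det(A) = sum over permutations of
# sign(sigma) * a_{1 sigma(1)} * ... * a_{n sigma(n)}.

def sign(perm):
    """Sign of a permutation, computed by counting reversed pairs."""
    reversed_pairs = sum(1 for i in range(len(perm))
                           for j in range(i + 1, len(perm))
                           if perm[i] > perm[j])
    return (-1) ** reversed_pairs

def leibniz_det(A):
    n = len(A)
    return sum(sign(p) * np.prod([A[i][p[i]] for i in range(n)])
               for p in itertools.permutations(range(n)))

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 4.0],
              [0.0, 5.0, 6.0]])
print(leibniz_det(A), np.linalg.det(A))   # both give -10.0
```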


14.2 A proof of Leibniz Formula 
Let the rows of the n × n identity matrix be ε1 , . . . ,εn. Thus

ε1 = (1 0 0 · · · 0), ε2 = (0 1 0 · · · 0), . . . , εn = (0 0 0 · · · 1). 

Therefore, given that the i-th row in a matrix is
ξi = (ai1 ai2 · · · ain),

 then we have
ξi = ai1ε1 + ai2ε2 + · · · + ainεn.
So let the matrix A be represented by its rows,
A = (ξ1; ξ2; . . . ; ξn), the matrix whose rows, from top to bottom, are ξ1, . . . , ξn.

It was an exercise to show that the determinant function is additive in each row separately. That is, if a single row of an n × n matrix is written as a sum of two row vectors, then the determinant is the sum of the two corresponding determinants, the other rows being left unchanged. Therefore we can write

det(A) = det(ξ1; ξ2; . . . ; ξn) = ∑j1 a1j1 det(εj1; ξ2; . . . ; ξn) = · · · = ∑j1 ∑j2 · · · ∑jn a1j1a2j2 · · · anjn det(εj1; εj2; . . . ; εjn).
To begin with, observe that if εjk = εjl for some jk ≠ jl, then two rows are identical, and therefore the determinant is zero. Thus we need only the sum over all possible permutations (j1, j2, . . . , jn) of the numbers (1, 2, . . . , n). Then, given such a permutation, we have the matrix
(εj1; εj2; . . . ; εjn), whose rows are εj1, . . . , εjn.

This can be transformed back into the identity matrix
In (whose rows are ε1, ε2, . . . , εn, in their natural order)
by means of successively exchanging pairs of rows.

Each time this is done, the determinant changes sign (from +1 to −1, or from −1 to +1), and the number of exchanges needed has the same parity as the number s(σ) of reversed pairs of the permutation. Finally, of course, we know that the determinant of the identity matrix is 1, so that det(εj1; εj2; . . . ; εjn) = sign(σ).

Therefore we obtain the Leibniz formula
det(A) = ∑σ∈Sn sign(σ) a1σ(1)a2σ(2) · · · anσ(n).

Saturday, June 14, 2014

Linear Algebra: #13 Determinant


Let M(n × n, F) be the set of all n × n matrices of elements of the field F.

Definition
A mapping det : M(n × n, F) → F is called a determinant function if it satisfies the following three conditions.

  1. det(In) = 1, where In is the identity matrix. 

  2. If A ∈ M(n×n, F) is changed to the matrix A' by multiplying all the elements in a single row with the scalar a ∈ F, then det(A') = a · det(A). (This is our row operation Si(a).) 

  3. If A' is obtained from A by adding one row to a different row, then det(A') = det(A). (This is our row operation Sij(1).) 

Simple consequences of this definition
Let A ∈ M(n × n, F) be an arbitrary n × n matrix, and let us say that A is transformed into the new matrix A' by an elementary row operation. Then we have:
  • If A' is obtained by multiplying row i by the scalar a ∈ F, then det(A') = a · det(A). This is completely obvious! It is just part of the definition of “determinants”.

  • Therefore, if A' is obtained from A by multiplying a row with −1 then we have det(A' ) = −det(A).

  • Also, it follows that a matrix containing a row consisting of zeros must have zero as its determinant. 

  • If A has two identical rows, then its determinant must also be zero. For we can multiply one of these rows with −1, then add it to the other row, obtaining a matrix with a zero row. 

  • If A ′ is obtained by exchanging rows i and j, then det(A') = −det(A). This is a bit more difficult to see. Let us say that A = (u1, . . . , ui, . . . , uj, . . . , un), where uk is the k-th row of the matrix, for each k. Then we can write
    det(u1, . . . , ui, . . . , uj, . . . , un)
    = det(u1, . . . , ui + uj, . . . , uj, . . . , un)
    = −det(u1, . . . , ui + uj, . . . , −uj, . . . , un)
    = −det(u1, . . . , ui + uj, . . . , ui, . . . , un)
    = det(u1, . . . , ui + uj, . . . , −ui, . . . , un)
    = det(u1, . . . , uj, . . . , −ui, . . . , un)
    = −det(u1, . . . , uj, . . . , ui, . . . , un).
    (This is the elementary row operation Sij.)

  • If A' is obtained from A by an elementary row operation of the form Sij(c), then det(A') = det(A). For we have: 
    det(u1, . . . , ui + cuj, . . . , uj, . . . , un)
    = (1/c) det(u1, . . . , ui + cuj, . . . , cuj, . . . , un)
    = (1/c) det(u1, . . . , ui, . . . , cuj, . . . , un)
    = (1/c) · c · det(u1, . . . , ui, . . . , uj, . . . , un)
    = det(A)
    (for c ≠ 0; if c = 0 the operation changes nothing).

Therefore we see that each elementary row operation has a well-defined effect on the determinant of the matrix. This gives us the following algorithm for calculating the determinant of an arbitrary matrix in M(n × n, F).

How to find the determinant of a matrix
Given: An arbitrary matrix A ∈ M(n × n, F).
Find: det(A).

Method:
  1. Using elementary row operations, transform A into a matrix in step form, keeping track of the changes in the determinant at each stage. 

  2. If the bottom line of the matrix we obtain only consists of zeros, then the determinant is zero, and thus the determinant of the original matrix was zero. 

  3. Otherwise, the matrix has been transformed into an upper triangular matrix, all of whose diagonal elements are 1. But now we can transform this matrix into the identity matrix In by elementary row operations of the type Sij(c). Since we know that det(In) must be 1, we then find a unique value for the determinant of the original matrix A. In particular, in this case det(A) ≠ 0.
Note that in this algorithm, as well as in the algorithm for finding the inverse of a regular matrix, the method of Gaussian elimination was used. Thus we can combine both ideas into a single algorithm, suitable for practical calculations in a computer, which yields both the matrix inverse (if it exists), and the determinant. This algorithm also proves the following theorem.
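Here is a sketch of such a determinant algorithm in Python (my own illustration of the method described above, with partial pivoting added for numerical stability): row exchanges flip the sign, scaling a row to make its pivot 1 multiplies the determinant by the pivot, and the row operations Sij(c) leave it unchanged.

```python
import numpy as np

# Sketch: compute det(A) by Gaussian elimination, keeping track of how each
# elementary row operation changes the determinant.

def det_by_elimination(A):
    A = np.array(A, dtype=float)
    n = len(A)
    det = 1.0
    for k in range(n):
        pivot = np.argmax(np.abs(A[k:, k])) + k
        if np.isclose(A[pivot, k], 0.0):
            return 0.0                          # no pivot: the matrix is singular
        if pivot != k:
            A[[k, pivot]] = A[[pivot, k]]       # exchanging rows changes the sign
            det = -det
        det *= A[k, k]                          # scaling the row by 1/pivot
        A[k] = A[k] / A[k, k]
        for i in range(k + 1, n):
            A[i] -= A[i, k] * A[k]              # operations S_ij(c): determinant unchanged
    return det

A = [[2.0, 1.0, 0.0], [1.0, 3.0, 4.0], [0.0, 5.0, 6.0]]
print(det_by_elimination(A), np.linalg.det(A))   # both give -10.0
```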


Theorem 34
There is only one determinant function and it is uniquely given by our algorithm. Furthermore, a matrix A ∈ M(n × n, F) is regular if and only if det(A) ≠ 0.

In particular, using these methods it is easy to see that the following theorem is true.


Theorem 35
Let A, B ∈ M(n×n, F). Then we have det(A· B) = det(A) · det(B).

Proof
If either A or B is singular, then A · B is singular. This can be seen by thinking about the linear mappings V → V which A and B represent. At least one of these mappings is singular. Thus the dimension of the image is less than n, so the dimension of the image of the composition of the two mappings must also be less than n. Therefore A · B must be singular. That means, on the one hand, that det(A · B) = 0. And on the other hand, that either det(A) = 0 or else det(B) = 0. Either way, the theorem is true in this case.

If both A and B are regular, then they are both in GL(n, F). Therefore, as we have seen, they can be written as products of elementary matrices. It suffices then to prove that det(S1)det(S2) = det(S1S2), where S1 and S2 are elementary matrices. But our arguments above show that this is, indeed, true.

Remembering that A is regular if and only if A ∈ GL(n, F), we have:

Corollary
If A ∈ GL(n, F) then det(A−1) = (det(A))−1

In particular, if det(A) = 1 then we also have det(A−1) = 1. The set of all such matrices must then form a group.

Another simple corollary is the following.

Corollary
Assume that the matrix A is in block form, so that the linear mapping which it represents splits into a direct sum of invariant subspaces (see theorem 29). Then det(A) is the product of the determinants of the blocks.

Proof
If A has the blocks A1, . . . , Ap along the diagonal then, for each i, define the matrix Ai* as follows.

That is, for the matrix Ai*, all the blocks except the i-th block are replaced with identity-matrix blocks. Then A = A1* · · · Ap*, and it is easy to see that det(Ai*) = det(Ai) for each i.

Definition
The special linear group of order n is defined to be the set
 SL(n, F) = {A ∈ GL(n, F) : det(A) = 1}. 


Theorem 36
Let A' = C−1AC. Then det(A') = det(A).

Proof
This follows from Theorem 35, since det(A') = det(C−1) det(A) det(C) = (det(C))−1 det(A) det(C) = det(A).