Appendix B
Linear Algebra

Cross product

Consider $\mathbb{R}^n$ with the dot product. With it we can determine the lengths of vectors and the angle between them. In fact the relation $a \cdot b = \|a\|\,\|b\|\cos\theta$ says that the two carry equivalent information. The reason to use the dot product rather than the length and angle information directly is that it is very useful to encode this information in a bilinear operation. Likewise, in $\mathbb{R}^3$, given two non-parallel vectors in order, there is a third perpendicular direction that completes the oriented basis. It is useful to have an operation that takes a pair of vectors to a perpendicular vector, but even more useful if this operation has nice algebraic properties.

Here is a construction that leads to the standard cross product. It must be anticommutative to encode the orientation. We want it to be bilinear, so it is sufficient to construct it on unit vectors. Note that negating a vector reverses the orientation, so the output should also be negated, and this is compatible with bilinearity. Finally, it should be equivariant under rotations: rotating the inputs rotates the output. With these stipulations, we need only choose an odd function $f(\theta)$ that determines

\[
\mathbf{i} \times (\cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}) = f(\theta)\,\mathbf{k}.
\]

By anticommutativity, $\mathbf{i} \times \mathbf{i} = 0$, so the left hand side simplifies to $\mathbf{i} \times (\sin\theta\,\mathbf{j})$. If the cross product is to be bilinear, then

\[
f(\theta)\,\mathbf{k} = \sin\theta\,(\mathbf{i} \times \mathbf{j}) = \sin\theta\, f(\tfrac{\pi}{2})\,\mathbf{k}.
\]

The obvious choice is to set $f(\tfrac{\pi}{2}) = 1$ so that $f(\theta) = \sin\theta$.

We see that the cross product is determined essentially by the relation $\mathbf{i} \times \mathbf{j} = \mathbf{k}$ and bilinearity. This can be used to calculate any cross product and is called the ‘algebraic method’. Alternatively, we have $\|a \times b\| = \|a\|\,\|b\|\,|\sin\theta|$. We can interpret the right hand side as the area of the parallelogram spanned by $a$ and $b$. Together with the perpendicularity and the orientation, this information also determines the cross product. It is called the ‘geometric method’. This is the reason that cross products often appear in surface area formulas. For example, a parameterised surface $\Phi : U \subset \mathbb{R}^2 \to \mathbb{R}^3$ has a surface area of

\[
\int_U \left\| \partial_u \Phi \times \partial_v \Phi \right\| \, \mathrm{d}u \, \mathrm{d}v.
\]
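
If you would like to see this formula in action numerically, here is a minimal sketch (assuming Python with numpy, which the text does not use, and taking the upper unit hemisphere as a hypothetical example). It approximates the integral by a midpoint Riemann sum with finite-difference partial derivatives and compares against the known area $2\pi$.

    import numpy as np

    # Parametrise the upper unit hemisphere by spherical angles (u, v).
    def Phi(u, v):
        return np.array([np.sin(u) * np.cos(v),
                         np.sin(u) * np.sin(v),
                         np.cos(u)])

    # Midpoint Riemann sum of ||d_u Phi x d_v Phi|| over U = (0, pi/2) x (0, 2 pi),
    # with the partial derivatives taken by central finite differences.
    n = 200
    us = (np.arange(n) + 0.5) * (np.pi / 2) / n
    vs = (np.arange(n) + 0.5) * (2 * np.pi) / n
    du, dv = (np.pi / 2) / n, (2 * np.pi) / n
    h = 1e-6

    area = 0.0
    for u in us:
        for v in vs:
            Phi_u = (Phi(u + h, v) - Phi(u - h, v)) / (2 * h)
            Phi_v = (Phi(u, v + h) - Phi(u, v - h)) / (2 * h)
            area += np.linalg.norm(np.cross(Phi_u, Phi_v)) * du * dv

    print(area, 2 * np.pi)   # both approximately 6.2832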

Finally, the computation method I use the most is the ‘determinant method’. Given $a = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $b = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$, take the following determinant

\[
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}
= (a_2 b_3 - a_3 b_2)\,\mathbf{i} + (a_3 b_1 - a_1 b_3)\,\mathbf{j} + (a_1 b_2 - a_2 b_1)\,\mathbf{k}.
\]

Although it is an abomination that mixes vectors and scalars in one matrix, it gives the correct result.
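
For the reader who wants to compute, here is a minimal sketch (assuming Python with numpy, which is not part of the text; the vectors are hypothetical examples) comparing the component formula from the determinant method with numpy's built-in cross product and checking perpendicularity.

    import numpy as np

    def cross_det(a, b):
        # Cross product via the cofactor expansion of the symbolic determinant.
        a1, a2, a3 = a
        b1, b2, b3 = b
        return np.array([a2 * b3 - a3 * b2,
                         a3 * b1 - a1 * b3,
                         a1 * b2 - a2 * b1])

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([-1.0, 0.5, 4.0])
    print(cross_det(a, b))     # [ 6.5 -7.   2.5]
    print(np.cross(a, b))      # the same components
    print(np.dot(cross_det(a, b), a), np.dot(cross_det(a, b), b))  # 0.0 0.0: perpendicular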

Linear Atlases

In this section we offer a different perspective on vector space theory. It is well known that every finitely generated vector space over $\mathbb{K}$ is isomorphic to $\mathbb{K}^n$ for a unique integer $n \geq 0$ called the dimension. This is often summarised as ‘there is only one vector space per (finite) dimension’. One proves this by constructing an ordered basis, which is equivalent to a linear isomorphism to $\mathbb{K}^n$ via

\[
(v_1, \dots, v_n) \subset V \;\longmapsto\; \bigl(\phi : V \to \mathbb{K}^n,\ \phi(v_i) := e_i\bigr),
\qquad
\phi : V \to \mathbb{K}^n \;\longmapsto\; \bigl(\phi^{-1}(e_1), \dots, \phi^{-1}(e_n)\bigr),
\]

where $(e_1, \dots, e_n)$ is the standard basis of $\mathbb{K}^n$. After establishing this result, one rushes to bring all the tools of matrix theory into vector spaces.

This is of course all correct; we offer now only a different emphasis. If you have a vector space $V$ over $\mathbb{R}$ of dimension $n$ then, unlike $\mathbb{R}^n$, it does not have a distinguished ordered basis. Let us put our focus on the isomorphisms to $\mathbb{R}^n$ rather than the ordered bases. If you have two linear isomorphisms $\phi_1, \phi_2 : V \to \mathbb{R}^n$ then you can compose them to a linear isomorphism between Euclidean spaces $\phi_2 \circ \phi_1^{-1} : \mathbb{R}^n \to \mathbb{R}^n$. This is the change of basis matrix that you learn about in vector space theory. We, being differential geometers, recognise this situation as a manifold $V$ with coordinates $\phi_1, \phi_2$ and a transition function $\phi_2 \circ \phi_1^{-1}$. Just as for a $C^k$-manifold the transition maps are bijective $C^k$ functions with $C^k$ inverse, here the transition maps are bijective linear maps (which makes the inverse automatically linear). In this way, we could define a vector space as a set $V$ with a linear atlas, an atlas where all transition maps are linear isomorphisms, as opposed to defining a vector space through a list of axioms. Thus a vector space is a special type of manifold and the choice of a basis is the choice of a chart.
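
To make this concrete, here is a minimal sketch (assuming Python with numpy; the two bases are hypothetical examples, not from the text). An ordered basis, stored as the columns of an invertible matrix, gives the chart sending a vector to its coordinates, and the transition map between two such charts is the change of basis matrix.

    import numpy as np

    # Two ordered bases of the same 3-dimensional space, stored as columns.
    P1 = np.array([[1.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0],
                   [0.0, 0.0, 1.0]])
    P2 = np.array([[2.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0]])

    # The chart phi_i sends a vector to its coordinates in basis i.
    phi1 = lambda v: np.linalg.solve(P1, v)
    phi2 = lambda v: np.linalg.solve(P2, v)

    # The transition map phi2 o phi1^{-1} is linear; its matrix is P2^{-1} P1.
    T = np.linalg.solve(P2, P1)

    v = np.array([3.0, -1.0, 2.0])
    print(np.allclose(phi2(v), T @ phi1(v)))   # True: T is the change of basis matrix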

Let us say that matrices describe linear maps between $\mathbb{R}^n$ and $\mathbb{R}^m$, but not other vector spaces. Suppose you have two vector spaces $V, W$ of dimensions $n, m$ with linear isomorphisms $\phi_1, \phi_2 : V \to \mathbb{R}^n$ and $\psi_1, \psi_2 : W \to \mathbb{R}^m$, and a linear map $A : V \to W$. As for any map between manifolds, we can examine it in charts: $\psi_1 \circ A \circ \phi_1^{-1} : \mathbb{R}^n \to V \to W \to \mathbb{R}^m$. This function $\psi_1 \circ A \circ \phi_1^{-1}$ is a linear function between Euclidean spaces, unlike $A$, so it can be represented as a matrix. Thus writing a linear map as a matrix with respect to bases is nothing other than writing it in charts. Likewise $\psi_2 \circ A \circ \phi_2^{-1}$ can also be represented as a matrix, and the relationship between the two is exactly the change of coordinates formula from differential geometry:

\[
\psi_2 \circ A \circ \phi_2^{-1} = (\psi_2 \circ \psi_1^{-1}) \circ (\psi_1 \circ A \circ \phi_1^{-1}) \circ (\phi_2 \circ \phi_1^{-1})^{-1}.
\]

We can interpret this formula in terms of matrices:

\[
\text{(matrix of $A$ with respect to the new bases)} = \text{(change of basis matrix for $W$)} \times \text{(matrix of $A$ with respect to the old bases)} \times \text{(change of basis matrix for $V$)}^{-1}.
\]

Hopefully you agree that by using the differential geometry idea of clearly separating an object from its coordinates/charts, we clarify how to change coordinates.
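
Here is a minimal numerical sketch of this change of coordinates formula (assuming Python with numpy; all matrices are hypothetical examples). The charts are modelled by invertible matrices and both sides of the formula are compared.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 3, 2

    # Charts on V (phi1, phi2) and on W (psi1, psi2), modelled as invertible matrices,
    # and the matrix of A written in the first pair of charts.
    phi1, phi2 = rng.normal(size=(n, n)), rng.normal(size=(n, n))
    psi1, psi2 = rng.normal(size=(m, m)), rng.normal(size=(m, m))
    A_old = rng.normal(size=(m, n))           # psi1 o A o phi1^{-1}

    # Transition maps (change of basis matrices).
    C_W = psi2 @ np.linalg.inv(psi1)          # psi2 o psi1^{-1}
    C_V = phi2 @ np.linalg.inv(phi1)          # phi2 o phi1^{-1}

    # The change of coordinates formula for the matrix of A in the new charts.
    A_new = C_W @ A_old @ np.linalg.inv(C_V)

    # Direct computation of psi2 o A o phi2^{-1}, using A = psi1^{-1} o A_old o phi1.
    A_direct = psi2 @ np.linalg.inv(psi1) @ A_old @ phi1 @ np.linalg.inv(phi2)
    print(np.allclose(A_new, A_direct))       # True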

Let us change topics and consider Gaussian elimination, aka row and column operations. Perhaps you are aware that the three elementary row operations can be implemented as matrix multiplication. Here they are for a $2 \times n$ matrix with rows $v$ and $w$:

\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} w \\ v \end{pmatrix},
\qquad
\begin{pmatrix} 1 & 0 \\ 0 & \lambda \end{pmatrix}\begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} v \\ \lambda w \end{pmatrix},
\qquad
\begin{pmatrix} 1 & 0 \\ \lambda & 1 \end{pmatrix}\begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} v \\ w + \lambda v \end{pmatrix}.
\]

We see that all three $2 \times 2$ matrices are linear isomorphisms and therefore can be interpreted as a change of basis (or a change of chart). In a similar way, the elementary column operations are multiplication by an invertible matrix from the right. If we have a linear map $A : V \to W$, then starting with the matrix of $A$ in some bases and applying row and column operations gives $CAD^{-1}$, the matrix of $A$ with respect to some other bases. The point of Gaussian elimination is to bring a matrix to reduced row echelon form. But this is exactly finding special bases (or charts) for $V$ and $W$ in which the matrix of $A$ has a simple form, namely an identity block and the rest zeroes.
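
As an illustration of the elementary matrices above, here is a minimal sketch (assuming Python with numpy; the matrix and the scalar are hypothetical examples).

    import numpy as np

    M = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])    # a 2 x n matrix with rows v and w
    lam = 2.0

    swap  = np.array([[0.0, 1.0], [1.0, 0.0]])
    scale = np.array([[1.0, 0.0], [0.0, lam]])
    shear = np.array([[1.0, 0.0], [lam, 1.0]])

    print(swap @ M)    # rows exchanged
    print(scale @ M)   # second row multiplied by lambda
    print(shear @ M)   # lambda times the first row added to the second row
    # Column operations are the same idea, acting from the right by n x n elementary matrices.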

Now consider a linear map $A : V \to V$. We can treat the domain and codomain as separate and use Gaussian elimination to find two bases of $V$ that together give the matrix of $A$ a nice form. But we can also ask what can be done if we are forced to use the same basis of $V$ for both domain and codomain. If the change of basis matrix is $C$, we are asking what can be said about $CAC^{-1}$. This is matrix conjugation, also called similarity. One learns that the eigenvalues of a matrix are preserved by conjugation, and the complete answer is given by the Jordan normal form.
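
A minimal numerical sketch of this invariance (assuming Python with numpy; the matrices are random, hypothetical examples):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(4, 4))
    C = rng.normal(size=(4, 4))          # invertible with probability 1

    similar = C @ A @ np.linalg.inv(C)
    print(np.sort_complex(np.linalg.eigvals(A)))
    print(np.sort_complex(np.linalg.eigvals(similar)))   # same eigenvalues up to rounding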

Bilinear Forms

Introductory linear algebra concentrates on vector spaces and linear maps. But these are not the only topics of linear algebra. Historically bilinear forms were perhaps even more studied than linear maps. A bilinear form is a function $B : V \times V \to \mathbb{R}$ that is linear in both arguments. The term ‘form’ is an old term indicating a function to the scalars. Like linear maps, bilinear forms on $\mathbb{R}^n$ can be represented by a square matrix: $B(v,w) = v^T B w$. What is different from linear maps is that under a change of basis $C^{-1}$ the matrix of a bilinear form changes to $C^T B C$. This is called matrix congruence and leads to a different set of invariants. For example, the eigenvalues are invariants of a linear map because they are roots of $\det(\lambda I - A)$ and

\[
\det(\lambda I - CAC^{-1}) = \det\bigl(C(\lambda I - A)C^{-1}\bigr) = \det\bigl(C^{-1}C(\lambda I - A)\bigr) = \det(\lambda I - A).
\]

The same calculation does not work if the transformation is $C^T B C$, because $C^T$ is in general not the inverse of $C$. Therefore eigenvalues are not invariants of matrices of bilinear forms.
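
A minimal numerical sketch of the contrast (assuming Python with numpy; the matrices are hypothetical examples): congruence changes the eigenvalues, while similarity does not.

    import numpy as np

    rng = np.random.default_rng(2)
    B = rng.normal(size=(3, 3))
    B = B + B.T                       # a symmetric bilinear form
    C = rng.normal(size=(3, 3))       # invertible with probability 1

    print(np.linalg.eigvals(B))
    print(np.linalg.eigvals(C.T @ B @ C))                 # different eigenvalues
    print(np.linalg.eigvals(np.linalg.inv(C) @ B @ C))    # same eigenvalues as B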

Every bilinear form can be split into a symmetric and an antisymmetric part:

\[
B(v,w) = \tfrac{1}{2}\bigl(B(v,w) + B(w,v)\bigr) + \tfrac{1}{2}\bigl(B(v,w) - B(w,v)\bigr).
\]

A bilinear form is symmetric/antisymmetric if and only if its matrix is symmetric/antisymmetric. This is an invariant of the matrices of bilinear forms (congruence preserves symmetry); in contrast, a linear map might have a symmetric matrix in one basis but not another. Another invariant is the dimension of the left and right kernels of a bilinear form. The left kernel is the set of vectors $v$ such that $B(v,w) = 0$ for all $w \in V$, and ditto for the right kernel. These are vector subspaces and have the same dimension, called the nullity. For symmetric and antisymmetric bilinear forms, the left and right kernels are equal. A bilinear form is called non-degenerate if its kernels are trivial.
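
A minimal sketch of the split and the nullity (assuming Python with numpy; the matrix is a hypothetical example). In a basis, the left kernel of $B$ is the null space of $B^T$ and the right kernel is the null space of $B$, so the common dimension can be read off from the rank.

    import numpy as np

    B = np.array([[1.0, 2.0, 3.0],
                  [0.0, 1.0, 2.0],
                  [1.0, 3.0, 5.0]])    # third row = first + second, so B is degenerate

    B_sym  = (B + B.T) / 2             # symmetric part
    B_anti = (B - B.T) / 2             # antisymmetric part
    print(np.allclose(B, B_sym + B_anti))   # True

    rank = np.linalg.matrix_rank(B)
    print(B.shape[0] - rank)           # nullity = dim of left kernel = dim of right kernel = 1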

The theories of symmetric and antisymmetric forms are significantly different from each other. The invariants of a symmetric bilinear form are revealed by Sylvester’s law of inertia.

Theorem B.1 (Sylvester’s law of inertia). There is a basis that reduces the matrix of a symmetric bilinear form B to the following block-diagonal matrix

\[
\begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix}.
\]

The sizes (p,q) are called the signature and are invariants.

Proof. We first show, by induction on the dimension, that $B$ can be diagonalised with entries from $\{1, -1, 0\}$ on the diagonal. If all vectors have $B(v,v) = 0$, then

\[
0 = B(v+w, v+w) - B(v-w, v-w) = 4B(v,w)
\]

shows that $B$ is identically zero. Hence it is the zero matrix in any basis. Otherwise there is a vector $v$ with $B(v,v) \neq 0$. By rescaling,

\[
B\bigl(|B(v,v)|^{-1/2} v,\ |B(v,v)|^{-1/2} v\bigr) = \bigl(|B(v,v)|^{-1/2}\bigr)^{2} B(v,v) = |B(v,v)|^{-1} B(v,v) = \pm 1.
\]

So without loss of generality assume $B(v,v) = \pm 1$. Then we can consider its orthogonal complement $v^\perp := \{w \in V \mid B(v,w) = 0\}$. This is the kernel of $w \mapsto B(v,w)$. The image of this map is $1$-dimensional because $B(v,v) \neq 0$, so $v^\perp$ is $(n-1)$-dimensional. We can restrict $B$ to $v^\perp$ and apply the inductive hypothesis to get a basis $(v_2, \dots, v_n)$ of $v^\perp$ for which $B(v_i,v_j) = 0$ for $i \neq j$ and $B(v_i,v_i) \in \{1, -1, 0\}$. But additionally $B(v,v_i) = 0$ and $B(v,v) \in \{1, -1\}$. So $(v, v_2, \dots, v_n)$ is a basis that meets our requirements.

Clearly by reordering the basis we can ensure that all the $1$s come first, then the $-1$s, and lastly the $0$s. It remains to show that the signature is an invariant. The kernel of $B$ is independent of the basis, so we know that the sum $p + q$ is an invariant, say $r$. Take any two diagonalising bases, $v_i$ with signature $(p,q)$ and $w_i$ with $(p',q')$. Suppose for contradiction that $p < p'$. Construct a linear map

\[
L : V \to \mathbb{R}^{p+q'}, \qquad x \mapsto \bigl(B(v_1,x), \dots, B(v_p,x),\ B(w_{p'+1},x), \dots, B(w_{p'+q'},x)\bigr).
\]

Any vector $x \in \ker B$ is also in the kernel of $L$. Notice however that $p + q' < p' + q' = r$, so $\dim \ker L \geq n - (p + q') > n - r = \dim \ker B$. Hence there must be a vector $u$ in the kernel of $L$ that isn’t in the kernel of $B$.

We write $u = \sum_i u_i v_i$. The coefficients with $i \leq p$ must be zero, since for $j \leq p$

\[
0 = B(v_j, u) = \sum_i u_i B(v_j, v_i) = u_j.
\]

And because $u$ is not in the kernel of $B$, at least one of the coefficients with $p + 1 \leq i \leq p + q$ must be non-zero. Therefore

\[
B(u,u) = \sum_{i,j \geq p+1} u_i u_j B(v_i, v_j) = \sum_{i \geq p+1} u_i^2 B(v_i, v_i) = \sum_{p+1 \leq i \leq p+q} u_i^2 \,(-1) < 0.
\]

But now we can apply the same argument in the $w_i$ basis, this time with the conclusion that $B(u,u) > 0$. This is a contradiction, proving that $p = p'$ and hence $q = q'$. □

A vector space with a positive definite symmetric bilinear form is called an inner product space, which should already be familiar to you. Positive definiteness is equivalent to a signature of $(n,0)$, and in particular such a form is non-degenerate. Hence all inner product spaces of the same finite dimension are essentially the same. The above proof in this case yields an orthonormal basis. In fact, we can insert the Gram-Schmidt process to make the above proof fully constructive. After we have normalised $v$ to have $B(v,v) = \pm 1$, choose further vectors to obtain a basis $(v, v_2, \dots, v_n)$ of the whole vector space. Observe that

\[
B\Bigl(v,\ v_i - \frac{B(v,v_i)}{B(v,v)}\, v\Bigr) = B(v,v_i) - \frac{B(v,v_i)}{B(v,v)}\, B(v,v) = 0.
\]

Hence the vectors $v_i' := v_i - \frac{B(v,v_i)}{B(v,v)}\, v$, for $i = 2, \dots, n$, form a basis of $v^\perp$.
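
As a computational companion to the theorem, here is a minimal sketch (assuming Python with numpy; the matrices are hypothetical examples). Rather than the constructive procedure above, it uses the spectral theorem as a shortcut: for a real symmetric matrix the signature can be read off from the signs of the eigenvalues.

    import numpy as np

    def signature(B, tol=1e-10):
        # Return (p, q, nullity) for a real symmetric matrix B.
        eigs = np.linalg.eigvalsh(B)       # real eigenvalues of a symmetric matrix
        p = int(np.sum(eigs > tol))
        q = int(np.sum(eigs < -tol))
        return p, q, B.shape[0] - p - q

    B = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
    print(signature(B))                    # (1, 1, 1)

    # The signature is invariant under congruence C^T B C for invertible C.
    C = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, 3.0],
                  [1.0, 0.0, 1.0]])
    print(signature(C.T @ B @ C))          # (1, 1, 1) again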

Slightly more can be said about a symmetric bilinear form on an inner product space. This is a situation we often encounter in Riemannian geometry, where the metric $g$ is the inner product and we have another symmetric bilinear form such as the second fundamental form $h$. In particular, in this context we can identify a preferred class of bases, namely the orthonormal bases. The transformation from one orthonormal basis to another is an orthogonal matrix, so

\[
\det(\lambda I - O^T A O) = \det\bigl(O^T(\lambda I - A)O\bigr) = \det\bigl(O O^T (\lambda I - A)\bigr) = \det(\lambda I - A).
\]

The characteristic polynomial (and hence the eigenvalues, determinant, and trace) of the matrix of a symmetric bilinear form in an orthonormal basis is an invariant. This explains Lemma 1.38. We essentially defined the principal curvatures to be the eigenvalues of the second fundamental form (with respect to $g$-orthonormal bases). The Gauss curvature is their product, i.e. the determinant. The determinant can then be computed with respect to any orthonormal basis, with the same result.
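
A minimal numerical sketch of this fact (assuming Python with numpy; the matrices are hypothetical examples). A random orthogonal matrix is obtained from a QR decomposition, and the characteristic polynomial coefficients agree before and after the orthogonal change of basis.

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.normal(size=(3, 3))
    A = (A + A.T) / 2                       # a symmetric bilinear form in an orthonormal basis

    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # a random orthogonal matrix

    print(np.poly(A))                       # coefficients of the characteristic polynomial
    print(np.poly(Q.T @ A @ Q))             # the same coefficients, up to rounding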

Finally, a vector space with a non-degenerate antisymmetric bilinear form is called a symplectic vector space. Antisymmetric bilinear forms can only be non-degenerate if the dimension is even. For a symplectic space it is possible to find a basis (Darboux basis) such that the matrix of the bilinear form is

\[
\begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}.
\]

Just like for inner product spaces, there is essentially only one symplectic vector space in every even dimension. Symplectic vector spaces are the starting point for symplectic geometry in the same way that inner product spaces are the starting point for Riemannian geometry. Unlike in Riemannian geometry, which has curvature, symplectic geometry does not have local invariants.
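
A minimal check (assuming Python with numpy, and using the Darboux matrix in dimension $2n = 4$ as an example) that this matrix is antisymmetric and non-degenerate, in line with the parity observation above:

    import numpy as np

    n = 2
    J = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.eye(n), np.zeros((n, n))]])   # Darboux form in dimension 2n

    print(np.allclose(J.T, -J))    # True: antisymmetric
    print(np.linalg.det(J))        # 1.0: non-degenerate

    # In odd dimensions an antisymmetric matrix S is always degenerate:
    # det(S) = det(S^T) = det(-S) = (-1)^(2n+1) det(S) = -det(S), so det(S) = 0.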