Appendix B
Linear Algebra

Cross product

Consider $\mathbb{R}^n$ with the dot product. With it we can determine the lengths of vectors and the angle between them. In fact the relation $a \cdot b = \|a\|\,\|b\|\cos\theta$ says that the two carry equivalent information. The reason to use the dot product rather than the length and angle information directly is that it is very useful to encode this information in a bilinear operation. Likewise, in $\mathbb{R}^3$, given two non-parallel vectors in order, there is a third perpendicular direction that completes the oriented basis. It is useful to have an operation that takes a pair of vectors to a perpendicular vector, but even more useful if this operation has nice algebraic properties.

Here is a construction that leads to the standard cross product. It must be anticommutative to encode the orientation. We want it to be bilinear, so it is sufficient to construct it on unit vectors. Note that negating a vector reverses the orientation, so the output should also be negated, and this is compatible with bilinearity. Finally, it should be equivariant under rotations: rotating the inputs rotates the output. With these stipulations, we need only choose an odd function $f(\theta)$ that determines

\[
\mathbf{i} \times (\cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}) = f(\theta)\,\mathbf{k}.
\]

By anticommutativity, $\mathbf{i} \times \mathbf{i} = 0$, so the left hand side simplifies to $\mathbf{i} \times (\sin\theta\,\mathbf{j})$. If the cross product is to be bilinear, then

\[
f(\theta)\,\mathbf{k} = \sin\theta\,(\mathbf{i} \times \mathbf{j}) = \sin\theta\, f(\tfrac{\pi}{2})\,\mathbf{k}.
\]

The obvious choice is to set $f(\tfrac{\pi}{2}) = 1$ so that $f(\theta) = \sin\theta$.

We see that the cross product is determined essentially by the relation $\mathbf{i} \times \mathbf{j} = \mathbf{k}$ and bilinearity. This can be used to calculate any cross product and is called the ‘algebraic method’. Alternatively, we have $\|a \times b\| = \|a\|\,\|b\|\,|\sin\theta|$. We can interpret the right hand side as the area of the parallelogram spanned by $a$ and $b$. Together with the perpendicularity and the orientation, this information also determines the cross product. It is called the ‘geometric method’. This is the reason that cross products often appear in surface area formulas. For example, a parameterised surface $\Phi : U \subset \mathbb{R}^2 \to \mathbb{R}^3$ has a surface area of

\[
\int_U \left\| \partial_u \Phi \times \partial_v \Phi \right\| \, \mathrm{d}u \, \mathrm{d}v.
\]
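
If you would like to see this formula in action numerically, here is a minimal sketch (assuming Python with numpy, which the text does not use, and taking the upper unit hemisphere as a hypothetical example). It approximates the integral by a midpoint Riemann sum with finite-difference partial derivatives and compares against the known area $2\pi$.

    import numpy as np

    # Parametrise the upper unit hemisphere by spherical angles (u, v).
    def Phi(u, v):
        return np.array([np.sin(u) * np.cos(v),
                         np.sin(u) * np.sin(v),
                         np.cos(u)])

    # Midpoint Riemann sum of ||d_u Phi x d_v Phi|| over U = (0, pi/2) x (0, 2 pi),
    # with the partial derivatives taken by central finite differences.
    n = 200
    us = (np.arange(n) + 0.5) * (np.pi / 2) / n
    vs = (np.arange(n) + 0.5) * (2 * np.pi) / n
    du, dv = (np.pi / 2) / n, (2 * np.pi) / n
    h = 1e-6

    area = 0.0
    for u in us:
        for v in vs:
            Phi_u = (Phi(u + h, v) - Phi(u - h, v)) / (2 * h)
            Phi_v = (Phi(u, v + h) - Phi(u, v - h)) / (2 * h)
            area += np.linalg.norm(np.cross(Phi_u, Phi_v)) * du * dv

    print(area, 2 * np.pi)   # both approximately 6.2832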

Finally, the computation method I use the most is the ‘determinant method’. Given $a = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $b = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$, take the following determinant

\[
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}
= (a_2 b_3 - a_3 b_2)\,\mathbf{i} + (a_3 b_1 - a_1 b_3)\,\mathbf{j} + (a_1 b_2 - a_2 b_1)\,\mathbf{k}.
\]

Although it is an abomination that mixes vectors and scalars in one matrix, it gives the correct result.
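
For the reader who wants to compute, here is a minimal sketch (assuming Python with numpy, which is not part of the text; the vectors are hypothetical examples) comparing the component formula from the determinant method with numpy's built-in cross product and checking perpendicularity.

    import numpy as np

    def cross_det(a, b):
        # Cross product via the cofactor expansion of the symbolic determinant.
        a1, a2, a3 = a
        b1, b2, b3 = b
        return np.array([a2 * b3 - a3 * b2,
                         a3 * b1 - a1 * b3,
                         a1 * b2 - a2 * b1])

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([-1.0, 0.5, 4.0])
    print(cross_det(a, b))     # [ 6.5 -7.   2.5]
    print(np.cross(a, b))      # the same components
    print(np.dot(cross_det(a, b), a), np.dot(cross_det(a, b), b))  # 0.0 0.0: perpendicular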

Linear Atlases

In this section we offer a different perspective on vector space theory. It is well known that every finitely generated vector space over $\mathbb{K}$ is isomorphic to $\mathbb{K}^n$ for a unique integer $n \geq 0$ called the dimension. This is often summarised as ‘there is only one vector space per (finite) dimension’. One proves this by constructing an ordered basis, which is equivalent to a linear isomorphism to $\mathbb{K}^n$ via

\[
(v_1, \dots, v_n) \subset V \;\longmapsto\; \bigl(\phi : V \to \mathbb{K}^n,\ \phi(v_i) := e_i\bigr),
\qquad
\phi : V \to \mathbb{K}^n \;\longmapsto\; \bigl(\phi^{-1}(e_1), \dots, \phi^{-1}(e_n)\bigr),
\]

where $(e_1, \dots, e_n)$ is the standard basis of $\mathbb{K}^n$. After establishing this result, one rushes to bring all the tools of matrix theory into vector spaces.

This is of course all correct; we offer now only a different emphasis. If you have a vector space $V$ over $\mathbb{R}$ of dimension $n$ then, unlike $\mathbb{R}^n$, it does not have a distinguished ordered basis. Let us put our focus on the isomorphisms to $\mathbb{R}^n$ rather than the ordered bases. If you have two linear isomorphisms $\phi_1, \phi_2 : V \to \mathbb{R}^n$ then you can compose them to a linear isomorphism between Euclidean spaces $\phi_2 \circ \phi_1^{-1} : \mathbb{R}^n \to \mathbb{R}^n$. This is the change of basis matrix that you learn about in vector space theory. We, being differential geometers, recognise this situation as a manifold $V$ with coordinates $\phi_1, \phi_2$ and a transition function $\phi_2 \circ \phi_1^{-1}$. Just as for a $C^k$-manifold the transition maps are bijective $C^k$ functions with $C^k$ inverse, here the transition maps are bijective linear maps (which makes the inverse automatically linear). In this way, we could define a vector space as a set $V$ with a linear atlas, an atlas where all transition maps are linear isomorphisms, as opposed to defining a vector space through a list of axioms. Thus a vector space is a special type of manifold and the choice of a basis is the choice of a chart.
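
To make this concrete, here is a minimal sketch (assuming Python with numpy; the two bases are hypothetical examples, not from the text). An ordered basis, stored as the columns of an invertible matrix, gives the chart sending a vector to its coordinates, and the transition map between two such charts is the change of basis matrix.

    import numpy as np

    # Two ordered bases of the same 3-dimensional space, stored as columns.
    P1 = np.array([[1.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0],
                   [0.0, 0.0, 1.0]])
    P2 = np.array([[2.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0]])

    # The chart phi_i sends a vector to its coordinates in basis i.
    phi1 = lambda v: np.linalg.solve(P1, v)
    phi2 = lambda v: np.linalg.solve(P2, v)

    # The transition map phi2 o phi1^{-1} is linear; its matrix is P2^{-1} P1.
    T = np.linalg.solve(P2, P1)

    v = np.array([3.0, -1.0, 2.0])
    print(np.allclose(phi2(v), T @ phi1(v)))   # True: T is the change of basis matrix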

Let us say that matrices describe linear maps between $\mathbb{R}^n$ and $\mathbb{R}^m$, but not other vector spaces. Suppose you have two vector spaces $V, W$ of dimensions $n, m$ with linear isomorphisms $\phi_1, \phi_2 : V \to \mathbb{R}^n$ and $\psi_1, \psi_2 : W \to \mathbb{R}^m$, and a linear map $A : V \to W$. As for any map between manifolds, we can examine it in charts: $\psi_1 \circ A \circ \phi_1^{-1} : \mathbb{R}^n \to V \to W \to \mathbb{R}^m$. This function $\psi_1 \circ A \circ \phi_1^{-1}$ is a linear function between Euclidean spaces, unlike $A$, so it can be represented as a matrix. Thus writing a linear map as a matrix with respect to bases is nothing other than writing it in charts. Likewise $\psi_2 \circ A \circ \phi_2^{-1}$ can also be represented as a matrix, and the relationship between the two is exactly the change of coordinates formula from differential geometry:

\[
\psi_2 \circ A \circ \phi_2^{-1} = (\psi_2 \circ \psi_1^{-1}) \circ (\psi_1 \circ A \circ \phi_1^{-1}) \circ (\phi_2 \circ \phi_1^{-1})^{-1}.
\]

We can interpret this formula in terms of matrices:

\[
\text{(matrix of $A$ with respect to the new bases)} = \text{(change of basis matrix for $W$)} \times \text{(matrix of $A$ with respect to the old bases)} \times \text{(change of basis matrix for $V$)}^{-1}.
\]

Hopefully you agree that by using the differential geometry idea of clearly separating an object from its coordinates/charts, we clarify how to change coordinates.
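
Here is a minimal numerical sketch of this change of coordinates formula (assuming Python with numpy; all matrices are hypothetical examples). The charts are modelled by invertible matrices and both sides of the formula are compared.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 3, 2

    # Charts on V (phi1, phi2) and on W (psi1, psi2), modelled as invertible matrices,
    # and the matrix of A written in the first pair of charts.
    phi1, phi2 = rng.normal(size=(n, n)), rng.normal(size=(n, n))
    psi1, psi2 = rng.normal(size=(m, m)), rng.normal(size=(m, m))
    A_old = rng.normal(size=(m, n))           # psi1 o A o phi1^{-1}

    # Transition maps (change of basis matrices).
    C_W = psi2 @ np.linalg.inv(psi1)          # psi2 o psi1^{-1}
    C_V = phi2 @ np.linalg.inv(phi1)          # phi2 o phi1^{-1}

    # The change of coordinates formula for the matrix of A in the new charts.
    A_new = C_W @ A_old @ np.linalg.inv(C_V)

    # Direct computation of psi2 o A o phi2^{-1}, using A = psi1^{-1} o A_old o phi1.
    A_direct = psi2 @ np.linalg.inv(psi1) @ A_old @ phi1 @ np.linalg.inv(phi2)
    print(np.allclose(A_new, A_direct))       # True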

Let us change topics and consider Gaussian elimination, aka row and column operations. Perhaps you are aware that the three elementary row operations can be implemented as matrix multiplication. Here they are for a $2 \times n$ matrix with rows $v$ and $w$:

\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} w \\ v \end{pmatrix},
\qquad
\begin{pmatrix} 1 & 0 \\ 0 & \lambda \end{pmatrix}\begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} v \\ \lambda w \end{pmatrix},
\qquad
\begin{pmatrix} 1 & 0 \\ \lambda & 1 \end{pmatrix}\begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} v \\ w + \lambda v \end{pmatrix}.
\]

We see that all three $2 \times 2$ matrices are linear isomorphisms and therefore can be interpreted as a change of basis (or a change of chart). In a similar way, the elementary column operations are multiplication by an invertible matrix from the right. If we have a linear map $A : V \to W$, then starting with the matrix of $A$ in some bases and applying row and column operations gives $CAD^{-1}$, the matrix of $A$ with respect to some other bases. The point of Gaussian elimination is to bring a matrix to reduced row echelon form. But this is exactly finding special bases (or charts) for $V$ and $W$ in which the matrix of $A$ has a simple form, namely an identity block and the rest zeroes.
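
As an illustration of the elementary matrices above, here is a minimal sketch (assuming Python with numpy; the matrix and the scalar are hypothetical examples).

    import numpy as np

    M = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])    # a 2 x n matrix with rows v and w
    lam = 2.0

    swap  = np.array([[0.0, 1.0], [1.0, 0.0]])
    scale = np.array([[1.0, 0.0], [0.0, lam]])
    shear = np.array([[1.0, 0.0], [lam, 1.0]])

    print(swap @ M)    # rows exchanged
    print(scale @ M)   # second row multiplied by lambda
    print(shear @ M)   # lambda times the first row added to the second row
    # Column operations are the same idea, acting from the right by n x n elementary matrices.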

Now consider a linear map $A : V \to V$. We can treat the domain and codomain as separate and use Gaussian elimination to find two bases of $V$ that together give the matrix of $A$ a nice form. But we can also ask what can be done if we are forced to use the same basis of $V$ for both domain and codomain. If the change of basis matrix is $C$, we are asking what can be said about $CAC^{-1}$. This is matrix conjugation, also called similarity. One learns that the eigenvalues of a matrix are preserved by conjugation, and the complete answer is given by the Jordan normal form.
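
A minimal numerical sketch of this invariance (assuming Python with numpy; the matrices are random, hypothetical examples):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(4, 4))
    C = rng.normal(size=(4, 4))          # invertible with probability 1

    similar = C @ A @ np.linalg.inv(C)
    print(np.sort_complex(np.linalg.eigvals(A)))
    print(np.sort_complex(np.linalg.eigvals(similar)))   # same eigenvalues up to rounding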

Bilinear Forms

Introductory linear algebra concentrates on vector spaces and linear maps. But these are not the only topics of linear algebra. Historically bilinear forms were perhaps even more studied than linear maps. A bilinear form is a function $B : V \times V \to \mathbb{R}$ that is linear in both arguments. The term ‘form’ is an old term indicating a function to the scalars. Like linear maps, bilinear forms on $\mathbb{R}^n$ can be represented by a square matrix: $B(v,w) = v^T B w$. What is different from linear maps is that under a change of basis $C^{-1}$ the matrix of a bilinear form changes to $C^T B C$. This is called matrix congruence and leads to a different set of invariants. For example, the eigenvalues are invariants of a linear map because they are roots of $\det(\lambda I - A)$ and

\[
\det(\lambda I - CAC^{-1}) = \det\bigl(C(\lambda I - A)C^{-1}\bigr) = \det\bigl(C^{-1}C(\lambda I - A)\bigr) = \det(\lambda I - A).
\]

The same calculation does not work if the transformation is $C^T B C$, because $C^T$ is in general not the inverse of $C$. Therefore eigenvalues are not invariants of matrices of bilinear forms.
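
A minimal numerical sketch of the contrast (assuming Python with numpy; the matrices are hypothetical examples): congruence changes the eigenvalues, while similarity does not.

    import numpy as np

    rng = np.random.default_rng(2)
    B = rng.normal(size=(3, 3))
    B = B + B.T                       # a symmetric bilinear form
    C = rng.normal(size=(3, 3))       # invertible with probability 1

    print(np.linalg.eigvals(B))
    print(np.linalg.eigvals(C.T @ B @ C))                 # different eigenvalues
    print(np.linalg.eigvals(np.linalg.inv(C) @ B @ C))    # same eigenvalues as B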

Every bilinear form can be split into a symmetric and an antisymmetric part:

\[
B(v,w) = \tfrac{1}{2}\bigl(B(v,w) + B(w,v)\bigr) + \tfrac{1}{2}\bigl(B(v,w) - B(w,v)\bigr).
\]

A bilinear form is symmetric/antisymmetric if and only if its matrix is symmetric/antisymmetric. This is an invariant of the matrices of bilinear forms (congruence preserves symmetry); in contrast, a linear map might have a symmetric matrix in one basis but not another. Another invariant is the dimension of the left and right kernels of a bilinear form. The left kernel is the set of vectors $v$ such that $B(v,w) = 0$ for all $w \in V$, and ditto for the right kernel. These are vector subspaces and have the same dimension, called the nullity. For symmetric and antisymmetric bilinear forms, the left and right kernels are equal. A bilinear form is called non-degenerate if its kernels are trivial.
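
A minimal sketch of the split and the nullity (assuming Python with numpy; the matrix is a hypothetical example). In a basis, the left kernel of $B$ is the null space of $B^T$ and the right kernel is the null space of $B$, so the common dimension can be read off from the rank.

    import numpy as np

    B = np.array([[1.0, 2.0, 3.0],
                  [0.0, 1.0, 2.0],
                  [1.0, 3.0, 5.0]])    # third row = first + second, so B is degenerate

    B_sym  = (B + B.T) / 2             # symmetric part
    B_anti = (B - B.T) / 2             # antisymmetric part
    print(np.allclose(B, B_sym + B_anti))   # True

    rank = np.linalg.matrix_rank(B)
    print(B.shape[0] - rank)           # nullity = dim of left kernel = dim of right kernel = 1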

The theories of symmetric and antisymmetric forms are significantly different from each other. The invariants of a symmetric bilinear form are revealed by Sylvester’s law of inertia.

Theorem B.1 (Sylvester’s law of inertia). There is a basis that reduces the matrix of a symmetric bilinear form B to the following block-diagonal matrix

\[
\begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix}.
\]

The sizes (p,q) are called the signature and are invariants.

Proof. We first show, by induction on the dimension, that $B$ can be diagonalised with entries from $\{1, -1, 0\}$ on the diagonal. If all vectors have $B(v,v) = 0$, then

\[
0 = B(v+w, v+w) - B(v-w, v-w) = 4B(v,w)
\]

shows that $B$ is identically zero. Hence it is the zero matrix in any basis. Otherwise there is a vector $v$ with $B(v,v) \neq 0$. By rescaling,

\[
B\bigl(|B(v,v)|^{-1/2} v,\ |B(v,v)|^{-1/2} v\bigr) = \bigl(|B(v,v)|^{-1/2}\bigr)^{2} B(v,v) = |B(v,v)|^{-1} B(v,v) = \pm 1.
\]

So without loss of generality assume $B(v,v) = \pm 1$. Then we can consider its orthogonal complement $v^\perp := \{w \in V \mid B(v,w) = 0\}$. This is the kernel of $w \mapsto B(v,w)$. The image of this map is $1$-dimensional because $B(v,v) \neq 0$, so $v^\perp$ is $(n-1)$-dimensional. We can restrict $B$ to $v^\perp$ and apply the inductive hypothesis to get a basis $(v_2, \dots, v_n)$ of $v^\perp$ for which $B(v_i,v_j) = 0$ for $i \neq j$ and $B(v_i,v_i) \in \{1, -1, 0\}$. But additionally $B(v,v_i) = 0$ and $B(v,v) \in \{1, -1\}$. So $(v, v_2, \dots, v_n)$ is a basis that meets our requirements.

Clearly by reordering the basis we can ensure that all the $1$s come first, then the $-1$s, and lastly the $0$s. It remains to show that the signature is an invariant. The kernel of $B$ is independent of the basis, so we know that the sum $p + q$ is an invariant, say $r$. Take any two diagonalising bases, $v_i$ with signature $(p,q)$ and $w_i$ with $(p',q')$. Suppose for contradiction that $p < p'$. Construct a linear map

\[
L : V \to \mathbb{R}^{p+q'}, \qquad x \mapsto \bigl(B(v_1,x), \dots, B(v_p,x),\ B(w_{p'+1},x), \dots, B(w_{p'+q'},x)\bigr).
\]

Any vector $x \in \ker B$ is also in the kernel of $L$. Notice however that $p + q' < p' + q' = r$, so $\dim \ker L \geq n - (p + q') > n - r = \dim \ker B$. Hence there must be a vector $u$ in the kernel of $L$ that isn’t in the kernel of $B$.

We write $u = \sum_i u_i v_i$. The coefficients with $i \leq p$ must be zero, since for $j \leq p$

\[
0 = B(v_j, u) = \sum_i u_i B(v_j, v_i) = u_j.
\]

And because $u$ is not in the kernel of $B$, at least one of the coefficients with $p + 1 \leq i \leq p + q$ must be non-zero. Therefore

\[
B(u,u) = \sum_{i,j \geq p+1} u_i u_j B(v_i, v_j) = \sum_{i \geq p+1} u_i^2 B(v_i, v_i) = \sum_{p+1 \leq i \leq p+q} u_i^2 \,(-1) < 0.
\]

But now we can apply the same argument in the $w_i$ basis, this time with the conclusion that $B(u,u) > 0$. This is a contradiction, proving that $p = p'$ and hence $q = q'$. □

A vector space with a positive definite symmetric bilinear form is called an inner product space, which should already be familiar to you. Positive definiteness is equivalent to a signature of $(n,0)$, and in particular such a form is non-degenerate. Hence all inner product spaces of the same finite dimension are essentially the same. The above proof in this case yields an orthonormal basis. In fact, we can insert the Gram-Schmidt process to make the above proof fully constructive. After we have normalised $v$ to have $B(v,v) = \pm 1$, choose further vectors to obtain a basis $(v, v_2, \dots, v_n)$ of the whole vector space. Observe that

\[
B\Bigl(v,\ v_i - \frac{B(v,v_i)}{B(v,v)}\, v\Bigr) = B(v,v_i) - \frac{B(v,v_i)}{B(v,v)}\, B(v,v) = 0.
\]

Hence the vectors $v_i' := v_i - \frac{B(v,v_i)}{B(v,v)}\, v$, for $i = 2, \dots, n$, form a basis of $v^\perp$.
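
As a computational companion to the theorem, here is a minimal sketch (assuming Python with numpy; the matrices are hypothetical examples). Rather than the constructive procedure above, it uses the spectral theorem as a shortcut: for a real symmetric matrix the signature can be read off from the signs of the eigenvalues.

    import numpy as np

    def signature(B, tol=1e-10):
        # Return (p, q, nullity) for a real symmetric matrix B.
        eigs = np.linalg.eigvalsh(B)       # real eigenvalues of a symmetric matrix
        p = int(np.sum(eigs > tol))
        q = int(np.sum(eigs < -tol))
        return p, q, B.shape[0] - p - q

    B = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
    print(signature(B))                    # (1, 1, 1)

    # The signature is invariant under congruence C^T B C for invertible C.
    C = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, 3.0],
                  [1.0, 0.0, 1.0]])
    print(signature(C.T @ B @ C))          # (1, 1, 1) again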

Slightly more can be said about a symmetric bilinear form on an inner product space. This is a situation we often encounter in Riemannian geometry, where the metric $g$ is the inner product and we have another symmetric bilinear form such as the second fundamental form $h$. In particular, in this context we can identify a preferred class of bases, namely the orthonormal bases. The transformation from one orthonormal basis to another is an orthogonal matrix, so

\[
\det(\lambda I - O^T A O) = \det\bigl(O^T(\lambda I - A)O\bigr) = \det\bigl(O O^T (\lambda I - A)\bigr) = \det(\lambda I - A).
\]

The characteristic polynomial (and hence the eigenvalues, determinant, and trace) of the matrix of a symmetric bilinear form in an orthonormal basis is an invariant. This explains Lemma 1.38. We essentially defined the principal curvatures to be the eigenvalues of the second fundamental form (with respect to $g$-orthonormal bases). The Gauss curvature is their product, i.e. the determinant. The determinant can then be computed with respect to any orthonormal basis, with the same result.
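
A minimal numerical sketch of this fact (assuming Python with numpy; the matrices are hypothetical examples). A random orthogonal matrix is obtained from a QR decomposition, and the characteristic polynomial coefficients agree before and after the orthogonal change of basis.

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.normal(size=(3, 3))
    A = (A + A.T) / 2                       # a symmetric bilinear form in an orthonormal basis

    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # a random orthogonal matrix

    print(np.poly(A))                       # coefficients of the characteristic polynomial
    print(np.poly(Q.T @ A @ Q))             # the same coefficients, up to rounding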

Finally, a vector space with a non-degenerate antisymmetric bilinear form is called a symplectic vector space. Antisymmetric bilinear forms can only be non-degenerate if the dimension is even. For a symplectic space it is possible to find a basis (Darboux basis) such that the matrix of the bilinear form is

\[
\begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}.
\]

Just like for inner product spaces, there is essentially only one symplectic vector space in every even dimension. Symplectic vector spaces are the starting point for symplectic geometry in the same way that inner product spaces are the starting point for Riemannian geometry. Unlike in Riemannian geometry, which has curvature, symplectic geometry does not have local invariants.
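
A minimal check (assuming Python with numpy, and using the Darboux matrix in dimension $2n = 4$ as an example) that this matrix is antisymmetric and non-degenerate, in line with the parity observation above:

    import numpy as np

    n = 2
    J = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.eye(n), np.zeros((n, n))]])   # Darboux form in dimension 2n

    print(np.allclose(J.T, -J))    # True: antisymmetric
    print(np.linalg.det(J))        # 1.0: non-degenerate

    # In odd dimensions an antisymmetric matrix S is always degenerate:
    # det(S) = det(S^T) = det(-S) = (-1)^(2n+1) det(S) = -det(S), so det(S) = 0.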