Consider $\mathbb{R}^3$ with the dot product. With it we can determine the lengths of vectors and the angle between them. In fact the relation $u \cdot v = |u|\,|v|\cos\theta$, where $\theta$ is the angle between $u$ and $v$, says that the two are equivalent information. The reason to use the dot product rather than the lengths and angles directly is that it is very useful to encode this information in a bilinear operation. Likewise, in $\mathbb{R}^3$, given two non-parallel vectors in order, there is a third perpendicular vector that completes them to an oriented basis. It is useful to have an operation that goes from a pair of vectors to this perpendicular complement, but even more useful if this operation has nice algebraic properties.
Here is a construction that leads to the standard cross product. It must be anticommutative to encode the orientation. We want it to be bilinear, so it is sufficient to construct it on unit vectors. Note that negating a vector changes the orientation, so the output should also be negated to preserve the orientation, and this is compatible with bilinearity. Finally, it should be preserved by rotations. With these stipulations, we must choose an odd function $f$ that determines
\[ u \times v = f(\theta)\, n \]
for unit vectors $u$ and $v$, where $\theta$ is the angle from $u$ to $v$ in the oriented plane they span and $n$ is the unit normal that completes the oriented basis; oddness of $f$ encodes anticommutativity, since swapping $u$ and $v$ replaces $\theta$ by $-\theta$.
By anticommutativity, $u \times u = -\,u \times u$, so the left hand side simplifies to $u \times u = 0$. Now write a second unit vector as $v = \cos\theta\, u + \sin\theta\, w$, where $w$ is the unit vector perpendicular to $u$ at angle $+\pi/2$ in the oriented plane. If the cross product is to be bilinear, then
\[ u \times v = \cos\theta\,(u \times u) + \sin\theta\,(u \times w) = \sin\theta\,(u \times w). \]
The obvious choice is to set $u \times w = n$, so that $u \times v = \sin\theta\; n$, in other words $f(\theta) = \sin\theta$.
We see that the cross product is determined essentially by the relations
\[ e_1 \times e_2 = e_3, \qquad e_2 \times e_3 = e_1, \qquad e_3 \times e_1 = e_2 \]
together with anticommutativity and bilinearity. This can be used to calculate any cross product and is called the ‘algebraic method’. Alternatively, we have $|u \times v| = |u|\,|v|\sin\theta$. We can interpret the right hand side as the area of the parallelogram spanned by $u$ and $v$. Together with the perpendicularity and the orientation, this information also determines the cross product. It is called the ‘geometric method’. This is the reason that cross products often appear in surface area formulas. For example, a parameterised surface $\sigma : U \subset \mathbb{R}^2 \to \mathbb{R}^3$ has a surface area of
\[ \int_U |\partial_u \sigma \times \partial_v \sigma| \; \mathrm{d}u\, \mathrm{d}v. \]
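To see the formula in action, here is the unit sphere, with a parameterisation chosen just as an illustration: $\sigma(u,v) = (\cos u \cos v,\ \sin u \cos v,\ \sin v)$ for $u \in [0, 2\pi)$ and $v \in (-\pi/2, \pi/2)$. A short computation gives $|\partial_u \sigma \times \partial_v \sigma| = \cos v$, so the area is
\[ \int_0^{2\pi}\!\!\int_{-\pi/2}^{\pi/2} \cos v \; \mathrm{d}v\, \mathrm{d}u = 4\pi, \]
as expected.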
Finally, the computation method I use the most is the ‘determinant method’. Given $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3)$, take the following determinant
\[ u \times v = \det \begin{pmatrix} e_1 & e_2 & e_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{pmatrix}. \]
Although it is an abomination that mixes vectors and scalars in one matrix, it gives the correct result.
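As a quick check on a concrete pair of vectors (chosen arbitrarily), take $u = (1,2,3)$ and $v = (4,5,6)$. Expanding the determinant along the first row gives
\[ u \times v = (2\cdot 6 - 3\cdot 5)\, e_1 - (1\cdot 6 - 3\cdot 4)\, e_2 + (1\cdot 5 - 2\cdot 4)\, e_3 = (-3, 6, -3). \]
One can verify that this vector is perpendicular to both $u$ and $v$, and that its length $\sqrt{54}$ equals the area $|u|\,|v|\sin\theta$ of the parallelogram they span.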
In this section we offer a different perspective on vector space theory. It is well known that every finitely generated vector space over $\mathbb{R}$ is isomorphic to $\mathbb{R}^n$ for a single $n \in \mathbb{N}$ called the dimension. This is often summarised as ‘there is only one vector space per (finite) dimension’. One proves this by construction of an ordered basis $(b_1, \dots, b_n)$, which is equivalent to a linear isomorphism to $\mathbb{R}^n$ via
\[ \Phi : V \to \mathbb{R}^n, \qquad \Phi(b_i) = e_i, \]
where $(e_1, \dots, e_n)$ is the standard basis of $\mathbb{R}^n$. After establishing this result, one rushes to bring all the tools of matrix theory into vector spaces.
This is of course all correct; we offer now only a different emphasis. If you have a vector space $V$ over the field $\mathbb{R}$ of dimension $n$, then unlike $\mathbb{R}^n$ it does not have a distinguished ordered basis. Let us put our focus on the isomorphisms to $\mathbb{R}^n$ rather than on the ordered bases. If you have two linear isomorphisms $\Phi_1, \Phi_2 : V \to \mathbb{R}^n$ then you can compose them to a linear isomorphism between euclidean spaces $\Phi_2 \circ \Phi_1^{-1} : \mathbb{R}^n \to \mathbb{R}^n$. It is the change of basis matrix that you learn about in vector space theory. We, being differential geometers, recognise this situation as a manifold $V$ with coordinates $\Phi_1, \Phi_2$ and a transition function $\Phi_2 \circ \Phi_1^{-1}$. Just as for a $C^k$-manifold the transition maps are bijective $C^k$ functions with $C^k$ inverse, here the transition maps are bijective linear maps (which makes the inverse automatically linear). In this way, we could define a vector space as a set with a linear atlas, an atlas where all transition maps are linear isomorphisms, as opposed to defining a vector space through a list of axioms. Thus a vector space is a special type of manifold and the choice of a basis is the choice of a chart.
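To make this concrete, here is a small example (the space and the two charts are chosen purely for illustration). Let $V$ be the space of real polynomials of degree at most one. Two natural charts are
\[ \Phi_1(a + bx) = (a, b) \qquad \text{and} \qquad \Phi_2(p) = (p(0), p(1)), \]
the coefficients and the values at $0$ and $1$. Both are linear isomorphisms $V \to \mathbb{R}^2$, and neither is more canonical than the other. The transition map is
\[ \Phi_2 \circ \Phi_1^{-1}(a,b) = (a,\ a+b), \]
a linear isomorphism of $\mathbb{R}^2$ with matrix $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$.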
Let us say that matrices describe linear maps between $\mathbb{R}^n$ and $\mathbb{R}^m$, but not other vector spaces. Suppose you have two vector spaces $V$ and $W$ of dimensions $n$ and $m$, with linear isomorphisms $\Phi : V \to \mathbb{R}^n$ and $\Psi : W \to \mathbb{R}^m$, and a linear map $L : V \to W$. As for any manifold, we can examine a map between manifolds in charts: $\Psi \circ L \circ \Phi^{-1} : \mathbb{R}^n \to \mathbb{R}^m$. This function is a linear function between euclidean spaces, unlike $L$ itself, so it can be represented as a matrix. Thus writing a linear map as a matrix with respect to bases is nothing other than writing it in charts. Likewise, for a second pair of charts $\tilde\Phi : V \to \mathbb{R}^n$ and $\tilde\Psi : W \to \mathbb{R}^m$, the composition $\tilde\Psi \circ L \circ \tilde\Phi^{-1}$ can also be represented as a matrix, and the relationship between the two is exactly that change of coordinates formula from differential geometry:
\[ \tilde\Psi \circ L \circ \tilde\Phi^{-1} = (\tilde\Psi \circ \Psi^{-1}) \circ (\Psi \circ L \circ \Phi^{-1}) \circ (\Phi \circ \tilde\Phi^{-1}). \]
We can interpret this formula in terms of matrices: if $A$ is the matrix of $\Psi \circ L \circ \Phi^{-1}$, $\tilde A$ is the matrix of $\tilde\Psi \circ L \circ \tilde\Phi^{-1}$, and $P = \tilde\Phi \circ \Phi^{-1}$, $Q = \tilde\Psi \circ \Psi^{-1}$ are the transition maps (change of basis matrices), then
\[ \tilde A = Q\, A\, P^{-1}. \]
Hopefully you agree that by using the differential geometry idea of clearly separating an object from its coordinates/charts, we clarify how to change coordinates.
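Continuing the polynomial example above (the map is again chosen just for illustration), let $L : V \to V$ be the shift $L(p)(x) = p(x+1)$, so $L(a + bx) = (a+b) + bx$. In the coefficient chart $\Phi_1$ its matrix is $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, while in the evaluation chart $\Phi_2$ it sends $(p(0), p(1))$ to $(p(1), p(2)) = (p(1),\ 2p(1) - p(0))$ and so has matrix $\begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix}$. One checks directly that
\[ \begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix} = Q \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} Q^{-1}, \qquad Q = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \]
which is the change of coordinates formula with the same transition matrix used on both sides.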
Let us change topics and consider Gaussian elimination, aka row and column operations. Perhaps you are aware that the three elementary row operations can be implemented as matrix multiplication from the left. Here they are for a $2 \times 2$ matrix:
\[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} c & d \\ a & b \end{pmatrix}, \qquad \begin{pmatrix} \lambda & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \lambda a & \lambda b \\ c & d \end{pmatrix}, \qquad \begin{pmatrix} 1 & \mu \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a + \mu c & b + \mu d \\ c & d \end{pmatrix}, \]
namely swapping the two rows, scaling a row by $\lambda \neq 0$, and adding a multiple of one row to another.
We see that all three matrices are linear isomorphisms, therefore they can be interpreted as a change of basis (or a change of chart). In a similar way, the elementary column operations are the multiplication by an invertible matrix from the right. If we have a linear map $L : V \to W$ we see that, starting with the matrix $A$ of $L$ in some bases, applying row and column operations gives $EAF$ for invertible matrices $E$ and $F$, the matrix of $L$ with respect to some other bases. The point of Gaussian elimination is to bring a matrix to reduced row echelon form; allowing column operations as well, one can go further. But this is exactly finding special bases (or charts) for $V$ and $W$ in which the matrix of $L$ has a simple form, namely an identity block and the rest zeroes.
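Here is a small worked instance with an arbitrarily chosen rank-one matrix:
\[ \begin{pmatrix} 1 & 0 \\ -3 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \]
The row operation (subtract three times the first row from the second) is a change of basis of the codomain, the column operation (subtract twice the first column from the second) is a change of basis of the domain, and in the new bases the matrix has the promised form: an identity block and the rest zeroes.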
Consider now a linear map $L : V \to V$ from a vector space to itself. We can treat the domain and codomain as separate and use Gaussian elimination to find two bases of $V$ that together give the matrix of $L$ a nice form. But we can also ask what can be done if we are forced to use the same basis of $V$ for both domain and codomain. If the change of basis matrix is $P$, we are asking what can be said about $PAP^{-1}$. This is matrix conjugation/similarity. One learns that the eigenvalues of a matrix are preserved by conjugation, and the complete answer is given by the Jordan normal form.
Introductory linear algebra concentrates on vector spaces and linear maps. But these are not the only topics of linear algebra. Historically, bilinear forms were perhaps even more studied than linear maps. A bilinear form is a function $B : V \times V \to \mathbb{R}$ that is linear in both arguments. The term ‘form’ is an old term indicating a function to the scalars. Like linear maps, bilinear forms on an $n$-dimensional vector space can be represented as a square matrix $A$: in coordinates, $B(u,v) = x^T A y$ where $x$ and $y$ are the coordinate vectors of $u$ and $v$. What is different to linear maps is that under a change of basis $P$ the matrix of a bilinear form changes to $P^T A P$. This is called matrix congruence and leads to a different set of invariants. For example, the eigenvalues are invariants of a linear map because they are roots of the characteristic polynomial $\det(A - \lambda \operatorname{Id})$ and
\[ \det(P A P^{-1} - \lambda \operatorname{Id}) = \det\big(P (A - \lambda \operatorname{Id}) P^{-1}\big) = \det(A - \lambda \operatorname{Id}). \]
The same calculation does not work if the transformation is $A \mapsto P^T A P$. Therefore eigenvalues are not invariants of matrices of bilinear forms.
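For example, take the dot product on $\mathbb{R}^2$, whose matrix in the standard basis is the identity, and the (arbitrarily chosen) basis $2e_1, e_2$ with change of basis matrix $P = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$. In the new basis the matrix of the form is
\[ P^T \operatorname{Id} P = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \]
with eigenvalues $4$ and $1$ instead of $1$ and $1$. The bilinear form itself is unchanged, so these eigenvalues carry no information about it.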
Every bilinear form can be split into a symmetric and an antisymmetric part:
\[ B(u,v) = \underbrace{\tfrac{1}{2}\big( B(u,v) + B(v,u) \big)}_{\text{symmetric}} + \underbrace{\tfrac{1}{2}\big( B(u,v) - B(v,u) \big)}_{\text{antisymmetric}}. \]
A bilinear form is symmetric/antisymmetric if and only if its matrix is symmetric/antisymmetric. This is an invariant of the matrices of bilinear forms, since $(P^T A P)^T = P^T A^T P$; by contrast, a linear map might have a symmetric matrix in one basis but not in another. Another invariant is the dimension of the left and right kernels of a bilinear form. The left kernel is the set of vectors $u$ such that $B(u,v) = 0$ for all $v$, and ditto for the right kernel. These are vector subspaces and have the same dimension, called the nullity. For symmetric and antisymmetric bilinear forms, the left and right kernels are equal. A bilinear form is called non-degenerate if its kernels are trivial.
The theories of symmetric and of antisymmetric forms are significantly different from each other. The invariants of a symmetric bilinear form are revealed by Sylvester’s law of inertia.
Theorem B.1 (Sylvester’s law of inertia). There is a basis that reduces the matrix of a symmetric bilinear form to the following block-diagonal matrix
\[ \begin{pmatrix} \operatorname{Id}_p & 0 & 0 \\ 0 & -\operatorname{Id}_q & 0 \\ 0 & 0 & 0_r \end{pmatrix}. \]
The sizes $(p, q, r)$ of the blocks are called the signature and are invariants.
Proof. We first show that the matrix can be diagonalised with entries $1$, $-1$ and $0$ on the diagonal, by induction on the dimension $n$. If all vectors have $B(v,v) = 0$, then
\[ B(u,w) = \tfrac{1}{2}\big( B(u+w, u+w) - B(u,u) - B(w,w) \big) = 0 \]
shows that $B$ is identically zero. Hence it is the zero matrix in any basis. Otherwise there is a vector $v$ with $B(v,v) \neq 0$. By rescaling,
\[ B\!\left( \frac{v}{\sqrt{|B(v,v)|}},\ \frac{v}{\sqrt{|B(v,v)|}} \right) = \frac{B(v,v)}{|B(v,v)|} = \pm 1. \]
So without loss of generality assume $B(v,v) = \pm 1$. Then we can consider its orthogonal complement $v^\perp = \{ u \in V : B(v,u) = 0 \}$. This is the kernel of the linear map $u \mapsto B(v,u)$. Its image is $1$-dimensional by the definition of $v$, namely $B(v,v) \neq 0$, so $v^\perp$ is $(n-1)$-dimensional. We can restrict $B$ to $v^\perp$ and apply the inductive hypothesis to get a basis $b_1, \dots, b_{n-1}$ of $v^\perp$ for which $B(b_i, b_j) = 0$ for $i \neq j$ and $B(b_i, b_i) \in \{1, -1, 0\}$. But additionally $B(v, b_i) = 0$ and $B(v,v) = \pm 1$. So $(v, b_1, \dots, b_{n-1})$ is a basis that meets our requirements.
Clearly by reordering the basis we can ensure that all the $+1$s come first, then the $-1$s and lastly the $0$s. It remains to show that the signature is an invariant. The kernel of $B$ is independent of basis, so we know that the sum $p + q = n - r$ is an invariant. Take any two diagonalising bases, $(a_i)$ with signature $(p,q)$ and $(b_i)$ with $(p', q')$. Suppose that $p < p'$. Construct a linear map
\[ T : V \to \mathbb{R}^{p + q'}, \qquad T(v) = (\alpha_1, \dots, \alpha_p,\ \beta_{p'+1}, \dots, \beta_{p'+q'}), \quad \text{where } v = \sum_i \alpha_i a_i = \sum_i \beta_i b_i. \]
Any vector in the kernel of $B$ is also in the kernel of $T$, because for such a vector the coefficients $\alpha_i$ with $i \leq p+q$ and $\beta_i$ with $i \leq p'+q'$ all vanish. Notice however that $p + q' < p' + q' = \dim V - \dim \ker B$, so $\dim \ker T \geq \dim V - (p + q') > \dim \ker B$. Hence there must be a vector $v$ in the kernel of $T$ that isn’t in the kernel of $B$.
We write $v = \sum_i \beta_i b_i$. The coefficients $\beta_i$ with $p' < i \leq p' + q'$ must be zero, since for these indices $\beta_i$ is a component of $T(v) = 0$.
And because $v$ is not in the kernel of $B$, at least one of the coefficients $\beta_i$ with $i \leq p'$ must be non-zero. Therefore
\[ B(v,v) = \sum_{i \leq p'} \beta_i^2 - \sum_{p' < i \leq p'+q'} \beta_i^2 = \sum_{i \leq p'} \beta_i^2 > 0. \]
But now we can apply the same argument in the basis $(a_i)$, but this time with the conclusion that $B(v,v) = -\sum_{p < i \leq p+q} \alpha_i^2 \leq 0$, because the coefficients $\alpha_i$ with $i \leq p$ are components of $T(v) = 0$. This is a contradiction, proving that $p \geq p'$. Exchanging the roles of the two bases gives $p = p'$, and then $q = q'$ as well. □
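Here is the diagonalisation of the proof carried out on a small example (the form is chosen for simplicity). Let $B$ be the bilinear form on $\mathbb{R}^2$ with matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, that is $B(x,y) = x_1 y_2 + x_2 y_1$. Every standard basis vector satisfies $B(e_i, e_i) = 0$, but $v = \tfrac{1}{\sqrt{2}}(1,1)$ has $B(v,v) = 1$. Its orthogonal complement is spanned by $w = \tfrac{1}{\sqrt{2}}(1,-1)$, which has $B(w,w) = -1$. In the basis $(v, w)$ the matrix of $B$ is $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, so the signature is $(1,1,0)$.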
A vector space with a positive-definite symmetric bilinear form is called an inner product space, which should already be familiar to you. Positive-definiteness is equivalent to a signature of $(n, 0, 0)$. Hence all inner product spaces of the same finite dimension are essentially the same. The above proof in this case yields an orthonormal basis. In fact, we can insert the Gram-Schmidt process to make the above proof fully constructive. After we have normalised $v$ to have $\langle v, v \rangle = 1$, choose more vectors $w_2, \dots, w_n$ so that $v, w_2, \dots, w_n$ is a basis of the whole vector space. Observe that
\[ \big\langle v,\ w_i - \langle w_i, v \rangle v \big\rangle = \langle v, w_i \rangle - \langle w_i, v \rangle \langle v, v \rangle = 0. \]
Hence $(w_2 - \langle w_2, v \rangle v,\ \dots,\ w_n - \langle w_n, v \rangle v)$ is a basis of $v^\perp$.
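Concretely, in $\mathbb{R}^2$ with the dot product and arbitrarily chosen starting vectors: take $v = \tfrac{1}{\sqrt{2}}(1,1)$, which is already normalised, and complete it with $w_2 = (0,1)$ to a basis. Then
\[ w_2 - \langle w_2, v \rangle v = (0,1) - \tfrac{1}{2}(1,1) = \left( -\tfrac{1}{2}, \tfrac{1}{2} \right), \]
which spans $v^\perp$; normalising it gives the orthonormal basis $\tfrac{1}{\sqrt{2}}(1,1),\ \tfrac{1}{\sqrt{2}}(-1,1)$.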
Slightly more can be said about a symmetric bilinear form on an inner product space. This is a situation we often find ourselves in in Riemannian geometry, where the metric $g$ is the inner product and we have another symmetric bilinear form such as the second fundamental form. In particular, in this context we can identify a preferred class of bases, namely the orthonormal bases. The transformation from one orthonormal basis to another is an orthogonal matrix, $P^T = P^{-1}$, so
\[ P^T A P = P^{-1} A P. \]
The characteristic polynomial (and hence eigenvalues, determinant, trace) of the matrix of a symmetric bilinear form in an orthonormal basis is an invariant. This explains Lemma 1.38. We essentially defined the principal curvatures to be the eigenvalues of the second fundamental form (with respect to $g$-orthonormal bases). The Gauss curvature is their product, i.e. the determinant. The determinant can then be taken with respect to any orthonormal basis and be the same.
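For instance, with an arbitrarily chosen symmetric matrix: suppose a symmetric bilinear form has matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ in one orthonormal basis. Rotating the basis by $\pi/4$ is an orthogonal change of basis with $P = \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$, and
\[ P^T A P = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}. \]
The characteristic polynomial $(\lambda - 3)(\lambda - 1)$, the determinant $3$ and the trace $4$ are unchanged, whereas the non-orthogonal change of basis in the earlier example did alter them.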
Finally, a vector space with a non-degenerate antisymmetric bilinear form is called a symplectic vector space. Antisymmetric bilinear forms can only be non-degenerate if the dimension is even. For a symplectic space it is possible to find a basis (Darboux basis) such that the matrix of the bilinear form is
\[ \begin{pmatrix} 0 & \operatorname{Id} \\ -\operatorname{Id} & 0 \end{pmatrix}. \]
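In the simplest case $\mathbb{R}^2$, the form $\omega(x,y) = x_1 y_2 - x_2 y_1$ (the signed area of the parallelogram spanned by $x$ and $y$) has matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ in the standard basis, so the standard basis is already a Darboux basis.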
Just like for inner product spaces, there is essentially only one symplectic vector space in every even dimension. Symplectic vector spaces are the starting point for symplectic geometry in the same way that inner product spaces are the starting point for Riemannian geometry. Unlike in Riemannian geometry, which has curvature, symplectic geometry does not have local invariants.