Finally we come to the definition of a Riemannian metric, the object that gives this field its name. Let us dispel a common misunderstanding: despite what modern terminology (à la metric spaces) might suggest, a Riemannian metric is not a distance function. Instead it is a generalisation of an inner product. As we saw for surfaces, an inner product allows us to define a notion of length, so there is a close relation between distance functions and inner products on manifolds. But a student new to the field must get used to the change in terminology.
Definition 3.1. A Riemannian metric on a manifold is a choice of inner product for every tangent space . If is a chart of , then we can express in charts using the coordinate basis vectors:
A Riemannian metric should be smooth in the sense that the functions are smooth in any chart. A manifold with a Riemannian metric is called a Riemannian manifold. The length of and angle between vectors are defined in the usual way
The functions are sufficient to determine the inner product of any two vectors by bilinearity:
The symmetry and positive definiteness of imply that the matrix is symmetric and positive definite.
Example 3.2 (Euclidean Space). We have seen in Example 2.3 that any open subset of euclidean space is a manifold with one chart. It is also a Riemannian manifold with the usual dot product
Notice that the matrix of the metric in charts is symmetric and positive definite. This is also called the standard metric on .
Example 3.3 (Helicoid). In fact we have seen Riemannian metrics already, namely the first fundamental form of a surface. For the helicoid, in Example 1.22, in the chart we had coordinates and
For this example we see that are non-constant functions (at least, is non-constant). We understand that the length of the coordinate basis vector
is different at different points of the helicoid.
We can ask how the functions in a chart are related to those in an overlapping chart . We know that the inner product should be independent of basis, so we compute it in two ways:
Notice the subtle contrast to the equivalence relation for vectors:
The term for objects that transform with , like , is covariant, whereas those that transform with , like the coefficients of vectors, are called contravariant. The convention is to use lower indices for covariant things, and upper indices for contravariant things. Historically this convention came before the summation convention. Because by the chain rule, when covariant and contravariant objects are ‘multiplied’, as in the above formula for , the result is independent of charts. This explains why there are so many sums of an upper index with a lower index, and was the motivation for the summation convention.
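To see the transformation rule in action, here is a small computational check. This is only an illustrative sketch in Python (sympy); the choice of Cartesian versus polar coordinates on the plane is my own example, not one taken from the text. We transform the coefficients of the standard metric with the Jacobian of the chart transition and recover the familiar polar expression.

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
x = sp.Matrix([r*sp.cos(th), r*sp.sin(th)])   # chart transition (r, theta) -> (x, y)
J = x.jacobian([r, th])                       # matrix of partial derivatives

g_cart = sp.eye(2)                            # metric coefficients in the Cartesian chart
g_polar = sp.simplify(J.T * g_cart * J)       # covariant transformation of the coefficients
print(g_polar)                                # Matrix([[1, 0], [0, r**2]])
```

Both indices of the metric coefficients pick up a factor of the Jacobian, which is exactly the ‘covariant’ behaviour described above.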
Clearly one can endow a manifold with functions that satisfy the necessary properties and thereby make it a Riemannian manifold. But this is not usually how we construct Riemannian manifolds. It is far more common to ‘inherit’ a metric from a bigger Riemannian manifold. This is how we got a metric on the helicoid. In general, we use the tangent map to move vectors on one manifold into the tangent space of another.
Definition 3.4. Let be a manifold, a Riemannian manifold with metric . Let be an immersion. That means that is injective at every point. Then we define a metric on , called the pullback metric or the induced metric, by
for any .
Exercise 3.5. The formula for is well-defined for all smooth functions , so why is it necessary that is an immersion?
Let’s go through how the definitions of Section 1.4 fit with the definitions in this section. First we have the definition of a regular parameterised surface , Definition 1.20. is a function between euclidean spaces, so the tangent map is just the Jacobian . The condition that the Jacobian is rank two is equivalent to it being injective by the rank-nullity theorem of linear algebra. Therefore regular and immersed are equivalent.
The first fundamental form is exactly the standard metric on pulled back by . In the coordinate basis vectors, we have
which is the definition of the first fundamental form.
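As a concrete illustration of pulling back the standard metric, the following sympy sketch computes a first fundamental form from the Jacobian of a parameterisation. I use the common helicoid parameterisation (u cos v, u sin v, v); whether this matches the exact formula of Example 1.22 is an assumption on my part.

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)
sigma = sp.Matrix([u*sp.cos(v), u*sp.sin(v), v])  # an immersion of the plane into R^3
J = sigma.jacobian([u, v])                        # columns: pushforwards of the coordinate vectors
g = sp.simplify(J.T * J)                          # pairwise dot products = pullback metric
print(g)                                          # Matrix([[1, 0], [0, u**2 + 1]])
```

Notice that one diagonal coefficient is non-constant, in agreement with the observation made for the helicoid in Example 3.3.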
Example 3.6 (Stereographic Projection). What does the induced metric from look like in stereographic coordinates on ? Well, we need to compute the pushforward of the coordinate vector fields and take the dot product. The pushforward was already computed for the chart in Example 2.22:
Therefore
The matrix of the metric has only one entry because the dimension of the manifold is one.
Using this we can calculate the lengths of vectors. For example has length
This is because we saw in Example 2.22 that it pushes forward to .
On the other hand has length
So although the vector field appears to be constant in the chart, its length is in fact changing.
Exercise 3.7 (Stereographic Projection). Compute in the chart and verify the change of chart formula for the metric.
Exercise 3.8 (Stereographic Projection). For the chart of verify
Finally, consider the notion of isometry in Definition 1.39. It says that two parameterised surfaces are isometric if their parametrisations induce equal metrics. We give the following more general definition.
Definition 3.9. Let be Riemannian manifolds and let be an immersion. We call a Riemannian immersion if . In words, if the metric on induced by the immersion is equal to the existing metric on . If additionally is a diffeomorphism (bijective, smooth, smooth inverse) then we call an isometry. Two Riemannian manifolds are isometric if there is an isometry between them.
As above, if is just a manifold and we have an immersion to a Riemannian manifold, then we can endow with the pullback metric. Then becomes a Riemannian immersion by definition.
Example 3.10. Suppose that we have a Riemannian immersion and let be a rotation. Define ; this is also a Riemannian immersion, as we will now prove. The essential step of the calculation is to notice that because is a linear transformation, and that a rotation doesn’t change the inner product . Therefore
In the last line we used that is a Riemannian immersion.
Exercise 3.11. Generalise the above example to prove: the composition of two Riemannian immersions is a Riemannian immersion.
A condition weaker than isometry is that of a conformal map.
Definition 3.12. Let be Riemannian manifolds and let be an immersion. We say that is conformal if there exists a smooth function such that .
A conformal map does not preserve lengths or distances, but it does preserve angles since
implies
Example 3.13 (Stereographic Projection). Consider inverse stereographic projection as a function between with the standard metric and the sphere with the induced metric of .
For , and indeed on any one-dimensional manifold, all metrics are conformally equivalent because there is only one metric coefficient .
For , Exercise 3.8 shows us that is not a Riemannian immersion, because the pullback metric is not equal to the standard metric . However, is conformal because
Notice for example, that in stereographic coordinates the lines through the origin are lines of longitude and circles centered at the origin are lines of latitude, and these are always perpendicular to one another.
A calculation similar to the case shows that stereographic projection is conformal for all . Therefore stereographic charts have the advantage that the angle between vectors as naively calculated in the chart is the same as in .
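Here is a hedged computational check of the conformality claim in the two-dimensional case. I use the usual inverse stereographic projection from the north pole; the signs and the choice of pole may differ from the chart conventions used earlier in these notes.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
d = 1 + x**2 + y**2
F = sp.Matrix([2*x/d, 2*y/d, (x**2 + y**2 - 1)/d])  # inverse stereographic projection onto the sphere
J = F.jacobian([x, y])
g = sp.simplify(J.T * J)                            # pullback of the euclidean metric
print(g)   # (4/(1 + x**2 + y**2)**2) times the identity matrix
```

The pullback metric is a positive function times the identity, which is exactly the conformal condition.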
Example 3.14 (Helicoid). We have seen the pullback metric of the helicoid in Example 3.3. It is a metric on . On the other hand we could give the plane the standard metric . With these metrics, the immersion is not conformal.
We could use a different parameterisation of the helicoid
The pushforwards of the coordinate vectors are
The pullback of the standard metric on by this map is
That is to say
Therefore is a conformal map between and with the standard metrics.
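For comparison, here is the same kind of check for one standard conformal parameterisation of the helicoid, (sinh u cos v, sinh u sin v, v). The reparameterisation used above is not written out, so this particular formula is an assumption of mine.

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)
sigma = sp.Matrix([sp.sinh(u)*sp.cos(v), sp.sinh(u)*sp.sin(v), v])  # a reparameterised helicoid
J = sigma.jacobian([u, v])
g = sp.simplify(J.T * J)
print(g)                                     # off-diagonal entries are 0; both diagonal entries
print(sp.simplify(g[1, 1] - sp.cosh(u)**2))  # equal cosh(u)**2, so this second print gives 0
```

The pullback metric is cosh(u)^2 times the identity, so this parameterisation is conformal.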
In this section we introduce the quaternions as a means to understand the rotations of the 3-Sphere . The 3-sphere is a beautiful manifold because it is also a group. A manifold that is also a group is called a Lie group. We will not go into the general theory of Lie groups, but they come with a natural way to move vectors around, something we are trying to achieve in this chapter. The example of Lie groups is therefore very instructive for us.
The quaternions are a four dimensional real vector space . A quaternion has a real part and an imaginary part . Unlike for complex numbers, the imaginary part of a quaternion is not real. The quaternionic conjugate is . Clearly and . Elements of the subspace are called imaginary.
Famously the quaternions have an associative but non-commutative multiplication, defined by and is the identity. We also use the notation to aid clarity. For example because we multiply on the right by to get and use . On the other hand : from we get and now multiply on the left by . This doesn’t mean that every multiplication of quaternions is anti-commuting:
According to legend, on Monday 16 October 1843, as Hamilton was walking to the Royal Irish Academy, he had the idea that to define a multiplication on it must be non-commutative, whereupon he carved the above equations into the side of Brougham Bridge. I have been to the bridge but was unable to find the carving, so instead I offer the following simple trick to remember the multiplication rule. Draw on a directed circle. Multiplication of two elements gives the third, with a plus sign if they are in the correct direction and a minus sign if they are in the reverse direction. This is of course the same rule as for the cross product in .
A direct computation shows that is always real and non-negative. Thus we can define the norm . The norm shows that every non-zero quaternion has a two-sided inverse, namely . Therefore the quaternions are a non-commutative field.
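If you would like to experiment with these rules, sympy ships a Quaternion class. The following sketch is only an illustration of the identities above; it is not taken from the notes.

```python
from sympy.algebras.quaternion import Quaternion

i = Quaternion(0, 1, 0, 0)
j = Quaternion(0, 0, 1, 0)
k = Quaternion(0, 0, 0, 1)

print(i*j, j*i)              # k and -k: the multiplication is non-commutative
print(i*i, j*j, k*k, i*j*k)  # each equals -1, Hamilton's relations
q = Quaternion(1, 2, 3, 4)
print(q.norm()**2)           # 30 = 1 + 4 + 9 + 16, the usual squared norm on R^4
print(q * q.inverse())       # 1 + 0*i + 0*j + 0*k: every non-zero quaternion is invertible
```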
Exercise 3.15. Prove the following:
This norm is plainly the same as the usual norm on . The unit quaternions (those with norm ) are as a set . Therefore the 3-sphere is a Lie group, because we can multiply two elements of it together in a way that can be undone. This is rather special: the only spheres that are Lie groups are ( in ), (add the angles), and .
If we choose , we can look at the function defined by . This is a bijective function, because the inverse is . And . Therefore the tangent map of takes to . Moreover, the tangent map is also bijective: from the chain rule
Indeed, this inverse has the property that it takes to the identity . This gives us a way to move any tangent vector of to . Just as in Example 2.28, this shows us that is trivial. The function is called the left trivialisation. Likewise we can define and we have the right trivialisation
Example 3.16 (3-Sphere). Let us compute the trivialisations for the point in . The inverse of is , since . If we have any point then
This does indeed have the property that . Next we use some geometry to avoid using charts. We know that the tangent vectors in are perpendicular to , because this is a sphere. We write
Because is linear in , we know
For the right trivialisation
So these two trivialisations on are different from one another.
Example 3.17. We can generalise the previous example to work for any point . Just like is a right-angle rotation of the complex plane, are all right-angle rotations of the quaternions. Therefore , , is an orthonormal basis of . Alternatively, since
and is a basis for we know that
is all of . This shows us that identifying with is the same as writing it with respect to the pushforward of a basis. If then we get
We call the vector field on a left-invariant field when it has the form
for , because every vector corresponds to using the left trivialisation. Ditto we have the right-invariant vector fields
We have seen numerous examples thus far of how we cannot simply move vectors around in a chart like we can in euclidean space. If you take a tangent vector at one point of the sphere and translate it in to another point of the sphere, it may not be tangent anymore. As we observed below Example 2.19, a vector might have the same coordinates at different points in one chart, but not in another. And in Example 3.3 we saw that one coordinate basis vector changed its length as you moved around, while the other stayed the same length.
There is also a common thought experiment. Suppose that you are standing on the equator facing east. You walk forward without turning, until you have walked half way around the Earth. Then, still without turning, you begin to sidestep to the north. You sidestep all the way to the north pole, but keep going until you have returned to your original position. The remarkable fact is, even though at no stage did you turn, you are now facing west.
Exercise 3.19. Can you modify the journey so that you end up facing other directions? What is the connection between the area your journey encompasses and the final rotation angle?
However, the naive definition of the derivative of a vector field
asks us to subtract two vectors at different points. Indeed, any non-trivial definition of a derivative of a vector field is going to require us to compare vectors at different points. Geometrically, thinking about a surface, what we want to do is to ‘roll’ the tangent plane along the surface to another point. This idea is called development and the relation between two tangent planes was called an affine connection, because it was an affine transformation of one plane to another. In modern terminology it is more common to call this a parallel transport operator, for reasons that will be explained in Section 3.4. Already from the above thought experiment we see that a parallel transport operator will depend not just on the two start and end points, but on the path between those points.
The modern approach, which we will ultimately take, uses a different point of view. It asks: how much are vector fields changing? Once we have a basis of vector fields and we know their changes, then we can measure all other vector fields against them. This leads to the definition of a covariant derivative, a type of differential operator on vector fields. It is extremely common to call this a connection, but we will refrain from doing so, at least until we have made clear the relationship with the parallel transport operator. Though the two approaches are equivalent, the modern approach is the much easier place to begin. On the other hand, some of the definitions and motivations for the modern approach only really make sense from the point of view of the traditional approach.
Definition 3.20. A covariant derivative on a manifold is a function that acts on two vector fields to produce a third. We write it as , with being the ‘direction’. It has the following properties for all smooth functions and vector fields :
It is -linear in the direction:
It is additive in the derivative:
It obeys the product (Leibniz) rule:
Example 3.21 (Euclidean Space). Consider euclidean space and let be vector fields in the chart . Then
is a covariant derivative.
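In a single chart this covariant derivative is nothing more than differentiating the components of the second field in the direction of the first. A minimal sympy sketch, with vector fields that are my own illustrative choices:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
coords = [x, y]
X = sp.Matrix([1, 0])             # components of the direction field in this chart
Y = sp.Matrix([x*y, x**2 + y])    # components of the field to be differentiated

# i-th component of the derivative: sum over j of X^j * d(Y^i)/dx^j
nabla_XY = Y.jacobian(coords) * X
print(nabla_XY)                   # Matrix([[y], [2*x]])
```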
You might be confused, because in Example 2.33 we said this formula didn’t work. Indeed, this formula is not chart independent. This definition is saying explicitly “use these particular coordinates to do the derivative and not others”. If you write this covariant derivative in polar coordinates, its formula will look different. But this is why we say that it is a covariant derivative; we are not claiming uniqueness.
Exercise 3.22. Check the above example has the three properties that are required of a covariant derivative.
The above example suggests that there are many covariant derivatives on a manifold. At least for a manifold that can be covered by a single chart, every set of coordinates gives a covariant derivative. In the following theorem we characterise the set of covariant derivatives.
Theorem 3.23 (Tensorial). Let be two covariant derivatives. Define their difference . Then is -linear in both and .
Proof. -linear in is immediate from Property a of covariant derivatives. -linear in is not too much harder to show; we use Properties b and c:
Exercise 3.24. Prove the converse of Theorem 3.23: Let be a covariant derivative on . For all vector fields let be a smooth vector field. Suppose that this function is -linear in both . Then is also a covariant derivative.
Corollary 3.25 (Affineness). The space of covariant derivatives on is affine in the following sense: if is a constant and are two covariant derivatives, so is .
The above theorems give us a way to construct new covariant derivatives from existing ones (and in fact construct every covariant derivative). But we need one to start with. One can prove1 that every manifold has a covariant derivative, but the proof is technical and not practically useful. We have seen in Example 3.21 that if one chart covers the whole space, then we can declare it is special and use the directional derivative. For manifolds that are a submanifold of a bigger space, the following example is typical.
Example 3.26 (Stereographic Projection,Tangent Connection). Consider the sphere inside . We can understand any vector field on as a function using the pushforward. Therefore we can differentiate as an valued function in the usual way.
For the sake of a numerical example, let us take both and to be the vector field from Example 2.19. The pushforward of the vector field is
and interpreting this as a function to we have
If we differentiate along , then using the product rule to avoid some nasty but unimportant terms we get
The first term is tangent to the circle, but the second is not. So we see the trouble is that the directional derivative is no longer tangent to . Therefore this does not meet the definition of a covariant derivative on .
What we can do however is to project this directional derivative onto the tangent space. We define the tangent covariant derivative as
Let’s check the three required properties. The two linearity properties just follow from the linearity of the projection
For the third property, we need to recognise that is already tangent to , so the projection leaves it unaltered:
Nothing in the calculation depended on specifically, so this is a general construction for immersed submanifolds.
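The construction is also easy to prototype numerically. The sketch below (my own, for the unit sphere in R^3 rather than the circle of the example) differentiates a tangent vector field in the ambient space and then removes the component normal to the sphere.

```python
import numpy as np

def Y(p):                          # a field tangent to the unit sphere: rotation about the z-axis
    return np.array([-p[1], p[0], 0.0])

def ambient_derivative(Y, p, X, h=1e-6):
    return (Y(p + h*X) - Y(p - h*X)) / (2*h)   # directional derivative of the ambient-valued function

def tangent_nabla(Y, p, X):
    dY = ambient_derivative(Y, p, X)
    return dY - np.dot(dY, p) * p              # project away the radial (normal) component

p = np.array([1.0, 0.0, 0.0])                  # a point on the sphere
X = Y(p)                                       # differentiate Y along itself
print(ambient_derivative(Y, p, X))             # approx [-1, 0, 0]: radial, so not tangent
print(tangent_nabla(Y, p, X))                  # approx [0, 0, 0]: the tangential part vanishes here
```

Just as in the example, the ambient derivative points towards the centre of the sphere, and the projection discards exactly that part.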
Next we examine what type of derivative a covariant derivative is. We will show that it is a directional derivative, in a sense that will be developed. To this end, the first property to notice is that although the direction and the derived vector fields have dramatically different behaviour under scaling by a smooth function, they are both -linear. If is a constant then
Consequently, if either field is zero, then so is the covariant derivative. Moreover, using cutoff functions, the covariant derivative only depends on local information.2 In fact something stronger is true of :
Lemma 3.27. The value of the covariant derivative at a point depends only on the value of the direction vector field at that point.
Proof. By linearity, it suffices to prove that implies . Writing in a chart we have and for all the coefficients. Then
For this reason we sometimes speak of the covariant derivative in a direction . The same is not true for : the covariant derivative really is a derivative of and depends on its values in a neighbourhood of a point. However, to compute you don’t need to know completely on an open neighbourhood of ; it is enough to know on a curve whose tangent is .
Lemma 3.28 (Curve Derivative). Let be two vector fields and let be a smooth curve with and . Suppose that . Then .
Proof. Let us consider the situation in a chart, writing , and . Then by the properties of covariant derivatives,
and likewise for . Now, and agree on , so . Moreover, by the chain rule
Hence
This lemma tells us that we can really view the covariant derivative as a generalisation of a directional derivative. This is in contrast to other derivatives of vector fields. Recall Example 2.37. Now consider the vector fields from that example along the curve , the -axis. We have , , and . But while . This shows that the Lie bracket is not a covariant derivative.
To break up all this theory, let’s do another example.
Example 3.29 (3-Sphere). We define a covariant derivative on in the following way. Given any vector field on , use left trivialisation to write it as a function . From Example 3.17 we know this has the formula using quaternions. Now that we have a function to the same vector space, there is no problem differentiating. This gives us a function . Use the left trivialisation again to move the result back to .
Putting this all in one formula gives
This covariant derivative has the property that the derivative of a left-invariant vector field is always zero. This is because, by definition, after you bring its vectors to they are all the same. In other words is constant and thus has zero derivative.
So to see an interesting example, we need to use a non-left-invariant vector field. Consider . We know that . To proceed we need to choose a direction field . We know that the value of the covariant derivative at any point only depends on the value of at that point. So for simplicity let us calculate for the point in the direction :
Finally, we move this back to
In the same manner, we can define a covariant derivative using the right trivialisation.
In the examples above, to define a covariant derivative we really gave a directional derivative. But what is the minimal information required to specify a covariant derivative? Because covariant derivatives are local, we give the answer in a chart. Let be the coordinate vector fields. Then for each pair we have a vector field . This vector field must be able to be written
for some coefficients . These coefficients are called Christoffel coefficients, though be aware that some authors reserve this name for a special case. This is sufficient information to determine because
Example 3.30 (Polar Coordinates). Let us consider with . We see by comparison of its definition in Example 3.21 with the formula above that is zero for all points and all indices in the standard chart.
But let us compute it with respect to polar coordinates. By the definition of , we have to calculate in the coordinates. We have
Hence we can calculate
and hence in polar coordinates
The other six coefficients are calculated similarly.
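The same computation can be automated: differentiate the Cartesian components of the polar coordinate fields and re-express the result in the polar basis. A sympy sketch, assuming the standard polar chart:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
xy = sp.Matrix([r*sp.cos(th), r*sp.sin(th)])
J = xy.jacobian([r, th])          # columns: the polar coordinate fields as Cartesian vectors
coords = [r, th]
names = ['r', 'theta']

for i in range(2):
    for j in range(2):
        deriv = J[:, j].diff(coords[i])       # Cartesian components of the derivative of field j
        Gamma = sp.simplify(J.inv() * deriv)  # its coefficients in the basis (partial_r, partial_theta)
        print(f"coefficients of the derivative of {names[j]} along {names[i]}:", list(Gamma))
# Only Gamma^r_(theta theta) = -r and Gamma^theta_(r theta) = Gamma^theta_(theta r) = 1/r are non-zero.
```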
Example 3.31 (Tangent Connection). Let’s calculate the Christoffel coefficients for a submanifold with the connection from Example 3.26 in some chart . Let be a parameterisation, a map from a chart to . Because the definition of uses the geometry of we need the pushforwards of the coordinate basis vectors. We use the notation . From the directional derivative definition of the pushforward map
Therefore the covariant derivative is
Finally to give the Christoffel coefficients, we write this vector in the coordinate basis . This requires solving some linear algebra problem.
Example 3.32 (Stereographic Projection). Let’s calculate the Christoffel coefficients for with the connection from Example 3.26 in the chart . This is a special case of the previous example. The immersion is the identity map, so the parameterisation is . There is only one coordinate vector field
The composition with is simply saying that we should express the coefficients in the variables of the chart. We prepare some calculations
You can do the orthogonal projection in the standard linear algebra way, but because this is the plane it’s easy to write down a vector perpendicular to . This leads to
Hence
Exercise 3.33 (Lee Lemma 4.4). Suppose that is a manifold covered by a single chart . Show that the set of covariant derivatives on is in one-to-one correspondence with the set of Christoffel coefficients. That is, show that every choice of functions gives a covariant derivative.
Exercise 3.34. Derive the transformation formula for between two charts. Observe that it is neither covariant nor contravariant.
Exercise 3.35 (Stereographic Projection). Repeat the calculation of the Christoffel coefficients from Example 3.32 for in the chart . The following formulas may prove useful. Here we have the pushforwards of the coordinate vector fields and combinations that align with longitude and latitude:
The derivatives are
and
With the derivatives in this form, you should be able to calculate the Christoffel coefficients easily. For example, from
we read that
For the other derivative of , the projection is trivial, and
And from the derivatives of we obtain:
We began Section 3.3 with the motivation that we want to compare different tangent spaces to one another and a thought experiment about walking around the Earth. Then we went on to define covariant derivatives. Now it is time to connect the two (pardon the pun).
Definition 3.36. Let be a manifold with a covariant derivative , a smooth curve and a vector field. We say that is parallel along (with respect to ) if at all points on the curve.
The inspiration of the name parallel is that the vectors of the vector field at different points are meant to be (in some sense) parallel to one another. Phrased differently: we have a field of parallel vectors. Even though is not a vector field on , this is well-defined due to Lemma 3.27. Similarly, we really only need the values of along the curve to compute this condition, due to Lemma 3.28. Therefore many books build a theory of ‘vector fields on curves’. We will avoid this extra theory by assuming the main result: so long as the curve is injective and not pathological, every vector field on can be extended to a vector field on .
In a chart we have , so the condition becomes
where we treat the vector field as a function of , i.e. . Since and are specified, we treat this as a system of ODEs for the functions . By the uniqueness of solutions to ODEs, a parallel vector field is uniquely determined by its value at one point of the curve. On the other hand the existence of solutions to ODEs ensures that given a vector there exists a unique parallel field along with .
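Because parallel transport is a linear ODE, it is straightforward to integrate numerically. The sketch below (numpy/scipy; my own setup) transports a vector once around a circle of colatitude θ0 on the round 2-sphere, using the standard Christoffel coefficients of the (θ, φ) chart, which are assumed here rather than derived.

```python
import numpy as np
from scipy.integrate import solve_ivp

theta0 = np.arccos(0.3)        # colatitude of the circle we walk around (an illustrative choice)
cot0 = np.cos(theta0) / np.sin(theta0)

# Parallel transport ODE dv^k/dt = -Gamma^k_ij gamma'^i v^j along gamma(t) = (theta0, t), gamma' = (0, 1)
def rhs(t, v):
    v_th, v_ph = v
    dv_th = np.sin(theta0) * np.cos(theta0) * v_ph   # -Gamma^theta_(phi phi) v^phi, with Gamma = -sin cos
    dv_ph = -cot0 * v_th                             # -Gamma^phi_(phi theta) v^theta, with Gamma = cot
    return [dv_th, dv_ph]

sol = solve_ivp(rhs, [0, 2*np.pi], [1.0, 0.0], rtol=1e-10, atol=1e-12)
v_th, v_ph = sol.y[:, -1]

# In the orthonormal frame (partial_theta, sin(theta0) partial_phi) the vector has simply rotated:
angle = np.arctan2(np.sin(theta0) * v_ph, v_th)
print(np.degrees(angle), np.degrees(-2*np.pi*np.cos(theta0)))    # both approximately -108 degrees
```

The total rotation after one loop is -2π cos θ0, which differs from the area 2π(1 − cos θ0) of the enclosed spherical cap by one full turn; compare Exercise 3.19.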
Let us make our thought experiment rigorous by using the tangent covariant derivative. We can expand the thought experiment in the following way: while we are walking around the world without turning, we are holding a stick. The stick represents a vector field along the curve of our journey. Suppose at the start of our journey, the stick is pointing south (recall we are facing east). As we walk east around the world, our stick will continue to point south. Thus we ask whether the vector field is parallel with respect to along the equator . Indeed it is, since is constant with respect to ,
Now what about the original thought experiment? This time as we walk around the world, let the stick point forward. Clearly, if we don’t turn, it should continue to point forward. In other words
This is not constant as a function into . Now when we compute
We see from the calculation that the derivative of along the curve points towards the center of the sphere, so when projected to the tangent plane it becomes zero. In summary, parallel transport by on the sphere matches our intuition of ‘walking without turning’. Of course there are many other covariant derivatives on the sphere, and with respect to them perhaps these two vector fields are not parallel.
Example 3.38 (3-Sphere). Let us consider the covariant derivative on from Example 3.29. We noted there that left-invariant vector fields have -derivative zero at any point and in any direction. Hence left-invariant fields are parallel along every curve in .
Conversely, suppose is parallel along . It follows from the definition of that is constant. In words, if we consider as a function of , i.e. and move the vectors to using the tangent map of the left action, i.e. , then this function is constant. Though we don’t have a formal definition, it is fair to say that is left-invariant along the curve.
The final observation for this example is that given any vector there is a unique left-invariant vector field with . Let . Then is the field. Therefore there is a unique way to parallel transport any vector to any other point of . Manifolds with this property are called parallelisable. It is equivalent to having a trivial tangent bundle.
In the above example, we encountered the idea of taking a vector at one point , finding a vector field with that is parallel along , and in particular calculating the parallel vector at another point . We call the parallel transport of along . This is a function called the parallel transport operator. Because the ODE is linear in , the parallel transport operator is linear: If is the parallel vector field with and is the parallel vector field with , then is also parallel and . The same idea works with scaling .
Some other properties of follow easily from its definition as the solution of an ODE. We have semi-group properties and . By the uniqueness of the solutions to ODEs, we have that is injective, and therefore an isomorphism of vector spaces. And so on.
Conversely, if one has the parallel transport operator for a curve , then we can recover the covariant derivative in the direction through the formula
Exercise 3.39. Prove the above formula. Hint: Take a basis of and parallel transport it along . As a reward for solving this exercise, you may now use the word connection for a covariant derivative.
Exercise 3.40. Argue that parallel transport with respect to on is . If we insert this into the above equation we obtain
Explain why this is the same formula as Example 3.29.
So intuitively the two approaches, covariant derivatives and parallel transport operators, are equivalent. The reason that it is difficult to start with parallel transport operators is that it is tricky to characterise exactly when a set of linear functions between tangent spaces, one for every curve, corresponds to a covariant derivative. Note our logic above: if we begin with a covariant derivative, then we have a parallel transport operator, and taking a limit we can recover the covariant derivative. But if you begin with an arbitrary set of operators, there is no guarantee that the limit will exist. You need to have some type of smooth dependence of on and . Further, what conditions should you impose on the dependence of on such that if two curves are tangent at a point, the above limit produces the same result? Hopefully, these questions give you an appreciation of the difficulty involved.
Special mention should go to Appendix B in Sharpe, which does start with the classical idea of rolling a plane (or another space) around on a surface and shows how that gives various modern structures on the manifold.
In this section we discuss a quantity called torsion that is derived from a covariant derivative. There is a relation between the torsion of a connection and the torsion of a space curve, but we will not explore it in this course3. Ultimately we will only be interested in covariant derivatives with zero torsion, so in a sense we are introducing it only to rule it out. Which brings us to the point: how should we motivate the definitions in this section without going deep into theory we will not use? We ask some natural questions and give some reasonable answers.
In euclidean space we have Schwarz’ theorem, also known as Clairaut’s theorem, that the partial derivatives with respect to different variables commute (for smooth functions among others). This result is embedded in the definition of the Lie bracket, where it was necessary to have the second order terms cancel. In fact sometimes the theorem is expressed as . So naturally we ask this question of the covariant derivative, but the answer is negative in general:
This leads to the following definition.
Definition 3.41. We say that a covariant derivative is torsion-free (in some chart) if . Equivalently in terms of Christoffel coefficients, if at every point.
In this first definition, torsion of a covariant derivative is a measure of the non-commutativity of coordinate vector fields. It seems natural therefore that this should depend on the choice of chart as much as the covariant derivative. But if you have done Exercise 3.34, you may already know that if at a point in one chart then it also holds at that point in any overlapping chart. We will return to this idea shortly.
Example 3.42 (Euclidean Space). We have with one chart, and from Example 3.21. In Example 3.30 we computed that the Christoffel coefficients are all zero. Thus this covariant derivative is torsion-free in this chart.
Example 3.43 (Stereographic Projection). In Exercise 3.35 you found all the Christoffel coefficients. Observe that they are symmetric in the lower two indices
This shows that the covariant derivative of is torsion-free on . Since the torsion is a continuous function, it must also be zero at the north pole.
We have the expectation that the coordinate vector fields should commute, or that this is a desirable property, but we do not have that expectation for general vector fields . We find
The meaning of this equation is that the ‘covariant derivative commutator’ of two vector fields is their Lie bracket plus a factor coming from the fact that the coordinate vector fields do not ‘covariantly commute’.
Definition 3.44. Given a covariant derivative , we define the torsion of two vector fields to be a third vector field
Remarkably the value of at any point only depends on , with the formula
The definition of is in terms of three vector fields , , and , so clearly is independent of charts. A covariant derivative is torsion-free if , and so this too is independent of charts. The second formula is just a rearrangement of the calculation preceding the definition. We say that the second formula is remarkable because although is defined using derivatives both of which depend on the local behaviour of vector fields, the torsion only depends on the pointwise values of the vector fields. Because the Lie bracket is an antisymmetric function of , so too is the torsion .
Example 3.45 (3-Sphere). In this example we show that the torsion of the covariant derivative on from Example 3.29 is non-zero. The trick is to not work with coordinate vector fields, but rather work with left-invariant vector fields. Let and likewise denote the left-invariant vector fields that are obtained by pushing forward . We have already noted in Example 3.38 that for any vector .
Further at any point is a basis for . This means that every vector field on can be written as
Thus have similar properties to the coordinate basis vector fields, except that they do not come from coordinates. A set of vector fields with this basis property is called a frame, but we will not explore this concept in generality. In this frame, the covariant derivative can be reckoned with
Similarly the Lie bracket simplifies
Together this yields
Thus the torsion comes down to the Lie brackets of this frame.
For this example we will evaluate :
We can generalise this argument; set so that we can use index notation.
When , the quaternions commute and the bracket is zero (as expected). If they are not equal then the quaternions anti-commute. This gives and . (There is in fact a close relationship between the Lie bracket of and the cross product of ).
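The quaternion commutators behind these brackets can be checked mechanically. A sympy sketch (only a check of the commutators ab − ba of the imaginary units, with no claim about the sign conventions of the notes):

```python
from sympy.algebras.quaternion import Quaternion

units = {'i': Quaternion(0, 1, 0, 0),
         'j': Quaternion(0, 0, 1, 0),
         'k': Quaternion(0, 0, 0, 1)}

for name_a, a in units.items():
    for name_b, b in units.items():
        print(f"{name_a}*{name_b} - {name_b}*{name_a} =", a*b - b*a)
# Equal arguments commute; distinct ones give plus or minus twice the third unit,
# mirroring the cross product rule mentioned in the text.
```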
Example 3.46 (3-Sphere). We can also ask for the torsion of on . Of course we could do the same as the previous example, except using a right-invariant frame, and get a similar answer. But to make the two examples comparable, let us compute the torsion of using the left-invariant frame .
What changes about the calculation is that . Instead we must generalise the calculation from Example 3.29:
The covariant derivative of an arbitrary vector field is
Hence
Thus the torsion of is the negative of the torsion of .
Recall from Exercise 3.24 that given one connection we can create another by the addition of a vector-valued function . We can ask how the torsion of the new covariant derivative is related to the torsion of the original. This follows easily: for ,
Purely algebraically, for any function of two variables we can split it into a symmetric and an antisymmetric part
If is already symmetric or antisymmetric, then it is just equal to its symmetric or antisymmetric part respectively and the other part is zero. Thus we can express the relationship of the torsions by the dictum “adding to a covariant derivative adds twice the antisymmetric part of to its torsion”. In particular, for any covariant derivative, we can absorb the torsion. This means we construct a new torsion-free covariant derivative .
Example 3.47 (3-Sphere). We have just seen in Examples 3.45 and 3.46 that with respect to the left-invariant fields the covariant derivatives are
(Aside: the formula on the right makes it seem as if and are equal. They are not in general, only for left-invariant vector fields. Remember: a covariant derivative has the product rule in , whereas the torsion is -linear.)
If we absorb the torsion on these two connections we get the torsion-free connection
This fits nicely with Corollary 3.25, because can also be understood as the average of the left and right covariant derivatives: . I’ll give you one guess what the stands for!
We have now seen that for a torsion-free connection the coordinate vector fields will ‘covariantly commute’ but general vector fields will not.
Definition 3.48. A smooth family of curves is a function . By smooth family we mean that it is smooth in both variables and . We typically think of the main curves of the family for fixed . But we also have the transverse curves, where we fix and allow to vary. We can write to emphasise this duality.
Therefore we have two vector fields: the tangents in the main direction and the tangents in the transverse direction. Well, this is not completely true as we do not really have vector fields because the curves may cross each other, giving multiple vectors at the same point. (Technically what we have is the pushforwards of two vector fields.) Regardless, for each value of it makes sense to ask how the derivative is changing in comparison to .
Lemma 3.49 (Mixed Derivatives). Let be a torsion-free covariant derivative and a smooth family of curves. Then .
Proof. This is a purely computational proof. In a chart, the tangent vectors are
Then
By the symmetry of the Christoffel coefficients for torsion-free covariant derivatives, these are equal. □
We should comment about why the expression is well-defined even though the tangents do not necessarily form a vector field. We know that the direction of depends only on the pointwise value, so this is no issue. And for we need to know its values along a curve in the direction of , but this is exactly the meaning of partial derivative. So understood correctly, these expressions are valid. This is an instance where a fleshed out notion of ‘vector field on a curve’ would have been more precise, but hopefully you see that not much has been lost by skipping this concept.
Let us once more return to the thought experiment of walking along the equator with our stick. We now understand that we are parallel transporting our stick. But consider the vector field . To push the metaphor into silliness, it is a telescoping selfie stick that is lengthening and shortening. The vector field always points south, but it is not parallel according to the definition. If we write for , a known parallel vector field, then
This illustrates the point that parallel is about more than just direction; it also concerns length (which is unlike how we use the term in elementary geometry and linear algebra). Therefore, among the many covariant derivatives that exist on a Riemannian manifold, we are interested in those whose parallel transport preserves length and angle.
Let us now turn this intuition into a definition. Suppose is a Riemannian manifold with metric and that is a connection that preserves the lengths and angles of parallel transported vectors. For any curve , let be parallel fields along with respect to . This means that is a constant function along . For all smooth functions , we must have
On the other hand
Therefore we make the definition
Definition 3.50. A covariant derivative is called metric-compatible or a metric connection if for all vector fields
The choice to define this property using a third vector field instead of the tangent vector is purely a matter of style. The converse of the above argument is immediate: if are parallel along a curve then the right hand side is zero and thus is constant on the curve.
Example 3.51 (3-Sphere). We can show that the left and right covariant derivatives are compatible with the metric on coming from . Write vector fields and with respect to the left-invariant basis fields from Example 3.45. By the property of quaternions that we see that
since has unit length. In particular it is constant on all of . Additionally, the covariant derivatives of the are zero in every direction. Therefore, similar to the calculation before the definition, we have
This shows that is metric-compatible.
For we can reuse some of this calculation. What changes is that may not be zero. Instead . We need to prove a version of the cyclic property for the triple product (for vectors in we have ):
This allows us to write
This proves that is also metric-compatible.
It is useful to reduce the metric-compatibility condition to a condition on the Christoffel coefficients in some chart.
Lemma 3.52. Let a connection in some chart be described by the Christoffel coefficients . It is compatible with the metric if and only if
Proof. Notice that the formula for metric-compatibility is -linear in , so it is enough to show it holds for each coordinate basis vector. The following calculation is a set of equivalences:
In other words, a covariant derivative is metric-compatible if and only if its Christoffel coefficients satisfy (3.53). □
The above equation seems to say that metric-compatibility is a rather strong condition. We know that there are choices of smooth functions for the Christoffel coefficients, and counting the possible values for gives conditions. It is almost enough to guarantee uniqueness, but not quite, because we get the same condition if we swap and . However, metric-compatibility and torsion-freeness together are enough to ensure uniqueness. This result is given a rather impressive-sounding name, though sometimes it is called a theorem and other times a lemma. We have our cake and eat it too:
Theorem 3.54 (Fundamental Lemma of Riemannian Geometry). On every Riemannian manifold there exists a unique metric-compatible torsion-free covariant derivative.
Proof. Our strategy for the proof is as follows. First we will establish the so-called Koszul formula. Uniqueness is then a direct consequence. To prove existence we will show that the Koszul formula defines a torsion-free metric-compatible covariant derivative in every chart. Since we already have uniqueness, we can conclude that these give a well-defined covariant derivative on the whole manifold.
The idea of the Koszul formula is to use the symmetries of the metric and the Lie bracket to get an expression with exactly one covariant derivative. Begin with the metric-compatibility property and then use the fact that torsion is zero:
Now write this equation two more times with the vector fields permuted
Notice that of the six possible permutations, only , and occur. This is a result of using the torsion-free property. Each of the three covariant derivatives occurs twice. Now, add any two equations and subtract the other. We will add the second and third and subtract the first, but it’s not important which you choose.
If you like, you can clean this up a little, though the role each of the vector fields plays in is different, so there cannot be perfect symmetry in the formula. Here is a version I like:
This is the Koszul formula. Since the metric is non-degenerate, what it shows is that if there is a metric-compatible torsion-free covariant derivative then it can be calculated purely in terms of Lie brackets and inner products. Therefore we have established uniqueness.
For existence, it is possible to take the Koszul formula as the definition and directly check all the required properties. It is easier however to first reduce the Koszul formula to an expression in charts. Choose any chart and suppose that are coordinate vector fields. The Lie brackets are zero. We get
If we view the left hand side in matrix notation rather than index notation, we see that to solve for we need to invert the matrix . There is a sneaky convention that the components of the inverse matrix use upper indices . With this convention, the fact that these matrices are inverse can be written . In index notation, multiplying by the inverse matrix looks like . Thus we can write
So given a metric , define a covariant derivative on this chart using this formula for the Christoffel coefficients. Exercise 3.33 tells us that this does indeed define a covariant derivative on this chart, but it remains to show that it is metric-compatible and torsion-free. Torsion-freeness is easy because the above formula is symmetric in and . Using the Christoffel coefficients defined through the Koszul formula, we see that the condition of Lemma 3.52 is satisfied:
Therefore the covariant derivative that we have defined in each chart is metric-compatible and torsion-free. As mentioned at the outset of the proof, it only remains to show that this definition in each chart agrees, but this follows due to uniqueness. □
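The chart formula for the Christoffel coefficients that comes out of the Koszul formula is easy to implement. Here is a sympy sketch (my own) applying it to the polar-coordinate metric diag(1, r^2); it recovers the coefficients found in Example 3.30.

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]
g = sp.Matrix([[1, 0], [0, r**2]])
g_inv = g.inv()
n = len(coords)

# Gamma^k_ij = (1/2) g^{kl} (d_i g_{jl} + d_j g_{il} - d_l g_{ij}), summed over l
Gamma = [[[sp.simplify(sum(g_inv[k, l] * (g[j, l].diff(coords[i])
                                          + g[i, l].diff(coords[j])
                                          - g[i, j].diff(coords[l])) / 2
                           for l in range(n)))
           for j in range(n)] for i in range(n)] for k in range(n)]

for k in range(n):
    for i in range(n):
        for j in range(n):
            if Gamma[k][i][j] != 0:
                print(f"Gamma^{k}_({i}{j}) =", Gamma[k][i][j])
# With 0 = r and 1 = theta this prints Gamma^0_(11) = -r and Gamma^1_(01) = Gamma^1_(10) = 1/r.
```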
We celebrate this result with more terminology. It honours the Italian mathematician Tullio Levi-Civita, who developed much of the ‘tensor calculus’ (covariant, contravariant, indices, etc). His name tricks many students (myself included) into thinking there are two mathematicians Levi and Civita. In response to being asked what he liked best about Italy, Einstein once said “spaghetti and Levi-Civita”.
Definition 3.56. The unique metric-compatible torsion-free covariant derivative on a Riemannian manifold is called the Levi-Civita connection or the Riemannian connection.
Example 3.57 (Euclidean Space). On any open subset of with the dot product as metric, the Levi-Civita connection is from Example 3.21. Because its Christoffel coefficients are identically zero, it is obviously torsion-free and satisfies Equation (3.53), so it is metric-compatible.
Example 3.58 (3-Sphere). A corollary of Lemma 3.52 is that the set of metric-compatible covariant derivatives has an affine structure. Corollary 3.25 shows us that the affine combination of covariant derivatives is again a covariant derivative. The Christoffel coefficients of the new covariant derivative are the same affine combination of Christoffel coefficients . Inserting this into Equation (3.53) shows that such an affine combination is also metric-compatible.
We saw in Example 3.51 that both and were metric-compatible. Therefore is metric-compatible. Additionally, we proved in Example 3.47 that it is torsion-free. Therefore really is the Levi-Civita connection for .
Another obvious example of a Levi-Civita connection would be on . Instead of proving it for this specific case, we generalise the construction to any Riemannian immersed submanifold.
Definition 3.59 (Tangent Connection). Let be a Riemannian immersed submanifold. We identify the manifold with its image under the immersion to simplify the statement. Let be the Levi-Civita connection of . We define the tangent connection on to be the covariant derivative
This definition extends the previous definition from Example 3.26 because is the Levi-Civita connection of . We will now show that the tangent connection is in fact the Levi-Civita connection of the induced metric on the submanifold.
Proof. The proof that it is in fact a covariant derivative is entirely similar to the corresponding statement in Example 3.26. We check the three properties of a covariant derivative:
using that is already tangent to . You might observe that this part of the proof works for any covariant derivative and that we have not yet used the metric-compatibility or torsion-freeness of .
For torsion-freeness, we need to know that if are tangent to then is too, even when we consider them as vector fields on . To prove this fact requires a proper investigation of submanifolds, and the construction of a special chart on that aligns with a chart on . This is beyond the scope of this course, which has tried to avoid manifold theory as much as possible. We have seen an example of this phenomenon though: in Example 3.45 the Lie bracket of the fields was again an field. Assuming this result,
is zero. The generalised statement for arbitrary connections would be that the tangent connection is torsion-free iff the torsion of is perpendicular to at every point of . Though we haven’t defined it, one could also say iff lies in the normal bundle of .
Lastly, we need to show that the tangent connection is metric-compatible. This is where we need to use that is Riemannian immersed, so that the metric on and the metric on agree for tangent vectors to .
To explain the working here a little, for any vector in we can split it into a part in and a part perpendicular to . Because , the inner product of with a vector perpendicular to is zero. Thus we can go from the first to the second line.
Now we know that is a metric-compatible torsion-free covariant derivative on . By the uniqueness in Theorem 3.54, it is the Levi-Civita connection. □
To close the chapter, we revisit the question “why torsion-free”? Our first answer was that it is a natural expectation, based on the commutativity of partial derivatives in the euclidean setting. Our second answer is that torsion-free is a matter of convenience: