Chapter 2
Manifolds

In this chapter we give the ‘patchwork’ definition of manifolds. Manifolds are geometrically nice spaces and a natural generalisation of $\mathbb{R}^n$. The most common way to define a manifold is as a special type of topological space, namely a ‘second-countable Hausdorff locally-euclidean topological space with an atlas’¹. Because students usually encounter differential geometry before abstract topology, the lecturer then gives a speed run of all the definitions in topology. I think this approach is better suited to the second time you encounter manifolds. Then you already know a little bit about what makes manifolds nice, and you can appreciate the interesting but weird topological spaces that need to be excluded from the definition. The standard approach in effect defines a manifold by saying what a manifold isn’t. In the approach below we avoid defining general topological spaces and instead use a concrete gluing construction on open sets of $\mathbb{R}^n$. After this construction we will still need to impose certain conditions, so topology cannot be avoided completely, but hopefully they are suitable for a newcomer to manifolds.

Before we dive into theory, we define a concept that you probably know but have never had a word for. A partial function from $X$ to $Y$ is a function from a subset $S \subseteq X$ to $Y$. In the context of partial functions, a function with $S = X$ is called a total function. Many common functions are really partial functions: $\frac{1}{x}$ and $\sqrt{x}$ are partial functions from $\mathbb{R}$ to $\mathbb{R}$, with $S$ being respectively $\mathbb{R} \setminus \{0\}$ and $[0,\infty)$. There doesn’t seem to be good standard terminology to talk about $X$ and $S$, though $S$ is often called the natural domain. Let’s call $X$ the source of the function and $S$ the domain, with the symbols $\operatorname{src} f = X$ and $\operatorname{dom} f = S$. You are no doubt familiar with the difference between the codomain $\operatorname{codom} f = Y$ (also called the target) and the image $\operatorname{img} f = f[S]$ (also called the range). We will use $f : X \rightharpoonup Y$ for partial functions (harpoon arrow), in contrast to $f : X \to Y$ for total functions.

Many students in Analysis I are confused about the relationship between injective and surjective, and those students are correct to be confused. Just as surjective means that the image is equal to the codomain, total means that the domain is equal to the source; they are the true counterparts to one another. In fact a partial function that is injective has an inverse partial function. If $f : X \rightharpoonup Y$ is injective then

$$f^{-1}(y) = x \quad \text{if } y = f(x)$$

is a perfectly valid definition of a partial function $f^{-1} : Y \rightharpoonup X$ with $\operatorname{dom} f^{-1} = \operatorname{img} f$ and $\operatorname{img} f^{-1} = \operatorname{dom} f$.

Besides inverses, many of the usual definitions for functions carry over with sensible modifications. For example, partial functions from $X$ to $Y$ are equal if they have the same domains and are equal on all inputs. Likewise the composition of two partial functions $f : X \rightharpoonup Y$ and $g : Y \rightharpoonup Z$ is a partial function $g \circ f : X \rightharpoonup Z$, but the domain of $g \circ f$ will be smaller than the domain of $f$ if the image of $f$ lies partly outside the domain of $g$. We should also think about the ‘empty’ partial function. According to the set theory definition of a function, there is exactly one function from the empty set to any other set. This is usually considered a quirk in the definition. But the composition of partial functions where $\operatorname{img} f \cap \operatorname{dom} g = \emptyset$ results in the empty function, so we cannot ignore it.
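The bookkeeping of domains under composition can be made concrete with a small sketch. The class `Partial` and its method `then` are hypothetical names invented for this illustration, and `None` is used to stand for ‘undefined at this input’; this is one possible model, not a standard construction.

```python
from typing import Callable, Optional


class Partial:
    """Hypothetical model of a partial function on the reals:
    a rule together with a membership test for its domain."""

    def __init__(self, rule: Callable[[float], float],
                 in_domain: Callable[[float], bool]):
        self.rule = rule
        self.in_domain = in_domain

    def __call__(self, x: float) -> Optional[float]:
        # None models "x is in the source but not in the domain"
        return self.rule(x) if self.in_domain(x) else None

    def then(self, g: "Partial") -> "Partial":
        """Return g o self, with domain {x in dom self : self(x) in dom g}."""
        return Partial(
            lambda x: g.rule(self.rule(x)),
            lambda x: self.in_domain(x) and g.in_domain(self.rule(x)),
        )


reciprocal = Partial(lambda x: 1.0 / x, lambda x: x != 0)   # dom = R \ {0}
square_root = Partial(lambda x: x ** 0.5, lambda x: x >= 0)  # dom = [0, inf)

# The composition is only defined on (0, inf), which is smaller than dom reciprocal.
sqrt_of_reciprocal = reciprocal.then(square_root)
```

Evaluating `sqrt_of_reciprocal` at a negative number returns `None`, because the image of `reciprocal` at that point lies outside the domain of `square_root`.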

Remark 2.1. In this course, functions and partial functions will always have open sets as their sources and their domains will be open subsets, unless specifically noted otherwise. This is needed so that derivatives can be defined.

2.1 Manifolds

Fix a dimension $n$. To keep track of the pieces we will glue together, let us introduce an index set $I$. In our examples this will usually be a finite set, but we do not make this assumption generally. For every $i \in I$ let $U_i$ be an open subset of $\mathbb{R}^n$. $U_i$ is called a chart. These are the pieces we will glue together.

For two charts, we describe how to glue them together using a partial function. A gluing function $\varphi_{ij} : U_i \rightharpoonup U_j$ is a partial function that is a homeomorphism from its domain to its image. This means it is a homeomorphism (bijective, continuous, continuous inverse) between open subsets $V_i \subseteq U_i$ and $V_j \subseteq U_j$. The idea is that the point $x \in V_i$ is glued to $\varphi_{ij}(x) \in V_j$. Note the order of the subscripts: “from $i$ to $j$”. The points of $U_i \setminus V_i$ are not glued to any points of $U_j$. Other names for gluing functions include ‘transition functions’, ‘change of coordinate functions’, and ‘overlap functions’. We allow here the possibility that $V_i = V_j = \emptyset$ and $\varphi_{ij} : \emptyset \to \emptyset$ is the empty function; this represents the situation that $U_i$ and $U_j$ are not glued together at all. To avoid having too many named sets, we will mostly use $\operatorname{dom} \varphi_{ij}$ instead of $V_i$.

This information tells how to glue the pieces together, but how should we represent the completed glued object? First we define the disjoint union

$$\bigsqcup_{i \in I} U_i = \{(i,x) \mid i \in I,\ x \in U_i\},$$

which is a set of pairs. We think of this as saying that even if a point is common to both $U_i$ and $U_j$, in the disjoint union we consider it as two separate points. For example, if $U_1 = (-1,1)$ and $U_2 = (0,2)$ then the normal union is $U_1 \cup U_2 = (-1,2)$ but the disjoint union is two intervals. We often do not write the index $i$ if it is clear, and even when it is not clear we tend to write it as a subscript. Continuing the example, $U_1 \sqcup U_2$ has two points that might both be called $0.5$, namely $0.5_1 \in U_1$ and $0.5_2 \in U_2$. Formally these points should be written $(1,0.5)$ and $(2,0.5)$ respectively.

We want to create an equivalence relation $\sim$ on the disjoint union of all the charts such that $x \in U_i \sim y \in U_j$ iff $\varphi_{ij}(x) = y$. If this is to be an equivalence relation, the set of gluing functions is required to have certain properties. To get reflexivity of $\sim$, we need $\varphi_{ii} = \mathrm{id}_{U_i}$. Symmetry of the relation holds if and only if $\varphi_{ji} = \varphi_{ij}^{-1}$. These are simple enough, but expressing the condition for transitivity is more difficult. The usual way to express the transitivity condition is that $y \sim x,\ x \sim z \Rightarrow y \sim z$, but if we have symmetry then this is equivalent to $x \sim y,\ x \sim z \Rightarrow y \sim z$. That means for all $x \in \operatorname{dom} \varphi_{ij} \cap \operatorname{dom} \varphi_{ik}$ we need $y = \varphi_{ij}(x) \in \operatorname{dom} \varphi_{jk}$ and $z = \varphi_{ik}(x) = \varphi_{jk}(y)$. In the language of partial functions

$$\varphi_{ik} = \varphi_{jk} \circ \varphi_{ij}.$$

If a set of gluing functions has these three properties, and thus defines an equivalence relation, we say that the gluing functions fulfil the cocycle conditions.

Definition 2.2. An atlas is a tuple $A = (n, I, \{U_i\}_{i \in I}, \{\varphi_{ij}\}_{i,j \in I})$, where $\{\varphi_{ij}\}_{i,j \in I}$ fulfils the cocycle conditions. We have included $n$ in our definition of atlas to make all the charts have the same dimension (some authors allow spaces with a mix of dimensions). We call $M = \bigsqcup_{i \in I} U_i / \sim$ the glued space.² Each ‘point’ in $M$ is an equivalence class of points in different $U_i$.

Example 2.3 (Euclidean Space). Take any open subset $U \subseteq \mathbb{R}^n$. We can construct the trivial atlas for $U$ as follows. Let the index set be $I = \{0\}$ and $U_0 = U$ the only chart. Then $\varphi_{00} = \mathrm{id}_U$ is a gluing function. The cocycle condition is fulfilled, so $(n, I, \{U\}, \{\varphi_{00}\})$ is an atlas. The corresponding equivalence relation is the weakest equivalence relation on $U$, namely $x$ is equivalent to itself but no other points. Therefore we say $M = U$.

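Example 2.4 (Polar Coordinates). Next we consider an example with two charts: $I = \{1,2\}$. Consider the plane $U_1 = \mathbb{R}^2$ and the half-strip $U_2 = (0,\infty) \times (-\pi,\pi) \subset \mathbb{R}^2$. To give an atlas it is sufficient to describe $\varphi_{21}$, because the cocycle condition requires $\varphi_{11} = \mathrm{id}_{U_1}$ and $\varphi_{22} = \mathrm{id}_{U_2}$, as well as $\varphi_{12} = \varphi_{21}^{-1}$. Set

$$\varphi_{21} : U_2 \to U_1, \quad (r,\theta) \mapsto (r\cos\theta, r\sin\theta).$$

This is a homeomorphism from $U_2$ to $\operatorname{img} \varphi_{21} = \mathbb{R}^2 \setminus \{(x,0) \mid x \le 0\}$. Therefore $A = (2, I, \{U_1,U_2\}, \{\varphi_{11},\varphi_{12},\varphi_{21},\varphi_{22}\})$ is an atlas. This example shows why $\varphi_{ij}$ is sometimes called a change of coordinates function.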
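As a rough illustration, the gluing function of this example and its inverse can be written out numerically. The function names `phi21` and `phi12` mirror the notation of the example, and returning `None` for points outside a domain is an assumption of this sketch.

```python
import math


def phi21(r, theta):
    """Gluing function from U2 = (0, inf) x (-pi, pi) to U1 = R^2."""
    if not (r > 0 and -math.pi < theta < math.pi):
        return None  # (r, theta) is not a point of U2
    return (r * math.cos(theta), r * math.sin(theta))


def phi12(x, y):
    """Inverse gluing function from U1 to U2.

    Its domain is img(phi21), the plane with the non-positive
    x-axis removed; elsewhere we return None.
    """
    if y == 0 and x <= 0:
        return None  # this point of U1 is not glued to anything in U2
    return (math.hypot(x, y), math.atan2(y, x))


# Round trip: a point of U2 is glued to a point of U1 and back again.
polar_point = (2.0, 1.0)
cartesian_point = phi21(*polar_point)
recovered = phi12(*cartesian_point)
```

Points such as $(-1,0)$ lie in $U_1$ but outside $\operatorname{dom} \varphi_{12}$, so `phi12(-1.0, 0.0)` returns `None`.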

Every point of $U_2$ is glued to some point of $U_1$, but not every point of $U_1$ is glued to some point of $U_2$. As a result the points of the glued space $M$ are of two types: either they are an equivalence class with two points $\{(r\cos\theta, r\sin\theta), (r,\theta)\}$ for $(r\cos\theta, r\sin\theta) \in U_1$ and $(r,\theta) \in U_2$, or they are an equivalence class with a single point $\{(x,0)\}$ for $(x,0) \in U_1$ with $x \le 0$. Every equivalence class in $M$ contains an element of $U_1$, so we might say casually $M = U_1$.

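Example 2.5 (Glued Circle). Consider $n = 1$, $I = \{1,2\}$ and $U_1 = (-1,1)$, $U_2 = (-1,1)$. As in the previous example, it is sufficient to give $\varphi_{12}$. Set $V_1 = U_1 \setminus \{0\}$ and $V_2 = U_2 \setminus \{0\}$ and give the gluing function $\varphi_{12} : V_1 \to V_2$ by the formula

$$\varphi_{12}(x) = \begin{cases} x + 1 & \text{for } x \in (-1,0), \\ x - 1 & \text{for } x \in (0,1). \end{cases}$$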

For example, this tells us that we should glue $-0.5_1 \in U_1$ to $\varphi_{12}(-0.5) = 0.5_2 \in U_2$ and $0.3_1 \in U_1$ to $\varphi_{12}(0.3) = -0.7_2 \in U_2$. Here the glued space $M$ is a circle, which you can see by cutting two strips of paper of the same length, drawing a number line from $-1$ to $1$ on each of them, and then gluing as instructed. Every point of $M$ is equivalent to a point of $U_1$ or to $0_2$.
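The gluing instructions of Example 2.5 can be checked on a few sample points with a short sketch. The names `phi12` and `phi21` follow the notation of the text; using `None` for ‘not glued to anything’ is an assumption of this illustration. The symmetry condition $\varphi_{21} = \varphi_{12}^{-1}$ is tested only at sample points, which is a sanity check and not a proof.

```python
def phi12(x):
    """Gluing function of Example 2.5 from U1 = (-1, 1) to U2 = (-1, 1)."""
    if -1 < x < 0:
        return x + 1
    if 0 < x < 1:
        return x - 1
    return None  # x = 0 (or x outside U1) is not glued to any point of U2


def phi21(y):
    """Inverse gluing function: solve y = phi12(x) for x on each branch."""
    if 0 < y < 1:       # y = x + 1 with x in (-1, 0)
        return y - 1
    if -1 < y < 0:      # y = x - 1 with x in (0, 1)
        return y + 1
    return None


samples = [-0.9, -0.5, -0.1, 0.3, 0.7]
# Symmetry of the equivalence relation: phi21(phi12(x)) should return x.
round_trips = [phi21(phi12(x)) for x in samples]
```

For instance `phi12(-0.5)` gives `0.5`, while `phi12(0)` is `None`, reflecting that $0_1$ and $0_2$ remain separate points of $M$.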

Example 2.6. Consider everything the same as in Example 2.5, but this time give φ12 the formula

$$\varphi_{12}(x) = x, \quad x \in (-1,0) \cup (0,1).$$

The space $M$ is called an interval with two origins. This is because every point of $M$ is either $\{x_1, x_2\}$ for $x \neq 0$, or $0_1$, or $0_2$.

The interval with two origins might seem like a harmless curiosity, but in fact it is a weird topological space that we want to avoid. Let us say that a sequence $p_n$ in $M$ converges to a point $p$ if the sequence $\phi_i(p_n)$ converges to $\phi_i(p)$ in the chart $U_i$. In Example 2.6 consider the sequence $n^{-1}$ for $n \in \mathbb{N}^+$. We can view this sequence in $U_1$ or $U_2$. In $U_1$ it has the limit $0_1$, but in $U_2$ it has the limit $0_2$. Therefore in $M$ this sequence has two different limits!

There are other ways that this space is badly behaved. It is not a metric space, because the distance between the two origins is zero. So although every point has a neighbourhood that is homeomorphic to an open subset of euclidean space (locally euclidean), it is not really like euclidean space. We therefore want to exclude such pathological spaces. It turns out that the above example is the only way a gluing can cause non-unique limits.

Lemma 2.7 (Non-unique Limits). A glued space $M$ has non-unique limits if and only if there is a sequence $x_n \in \operatorname{dom} \varphi_{12} \subseteq U_1$ such that $x_n$ converges to a point $x \in U_1 \setminus \operatorname{dom} \varphi_{12}$ and $\varphi_{12}(x_n)$ converges to a point $y \in U_2$, for some charts $U_1, U_2$.

Proof. Suppose first we have a sequence $x_n \in \operatorname{dom} \varphi_{12} \subseteq U_1$ such that $x_n$ converges to a point $x \in U_1 \setminus \operatorname{dom} \varphi_{12}$ and $\varphi_{12}(x_n)$ converges to a point $y \in U_2$. By definition, $x$ and $y$ are both limits of the sequence $x_n$. But $x$ is outside the domain of the gluing function, so by definition it is not glued to any point of $U_2$. Therefore $x$ and $y$ are distinct points in $M$.

Conversely suppose a glued space $M$ has a sequence with two distinct limits. In euclidean space limits are unique, so if there are two limits then they must come from two charts $U_1$ and $U_2$. Let’s use the notation $x_n, x \in U_1$ and $y_n, y \in U_2$ with $y_n = \varphi_{12}(x_n)$. Suppose $x$ were in the domain of the gluing function. The gluing function by definition is a continuous function, so we would have $\varphi_{12}(x) = \lim \varphi_{12}(x_n) = \lim y_n = y$. But $\varphi_{12}(x) = y$ means exactly that $x \sim y$ and this contradicts our assumption that the limits are distinct in $M$. Therefore $x$ is not in the domain of $\varphi_{12}$, but it is the limit of a sequence in the domain. We have shown that $x \in U_1 \setminus \operatorname{dom} \varphi_{12}$. □

Definition 2.8. We say that an atlas has the unique limit property if it does not satisfy the condition of Lemma 2.7. That is, if for every $i,j \in I$ and every sequence $x_n \in \operatorname{dom} \varphi_{ij}$ that converges to some $x \in U_i \setminus \operatorname{dom} \varphi_{ij}$, the sequence $\varphi_{ij}(x_n)$ does not converge in $U_j$.

Observe why this is not an issue with Example 2.5. Consider the sequence $x_n = n^{-1}$ in $U_1$. It has the limit $0_1$. On the other hand, the sequence $y_n = \varphi_{12}(x_n) = n^{-1} - 1$ converges to $-1$ in $\mathbb{R}$, but $-1$ is not in $U_2$. Therefore this ‘other’ limit point is not in the space $M$.
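The contrast between Example 2.5 and Example 2.6 can be seen numerically. In this sketch the function names `glue_circle`, `glue_two_origins` and `in_U2` are made up for the illustration, and the candidate limits $-1$ and $0$ are read off from the formulas; a finite list of terms can only suggest a limit, not establish it.

```python
def glue_circle(x):
    """phi12 of Example 2.5."""
    if -1 < x < 0:
        return x + 1
    if 0 < x < 1:
        return x - 1
    return None


def glue_two_origins(x):
    """phi12 of Example 2.6: the identity away from 0."""
    return x if (-1 < x < 1 and x != 0) else None


def in_U2(y):
    """Membership test for the chart U2 = (-1, 1)."""
    return -1 < y < 1


# The sequence x_n = 1/n in U1, sampled at n = 10, 100, ..., 10^6.
ns = [10 ** k for k in range(1, 7)]
circle_images = [glue_circle(1 / n) for n in ns]        # approaches -1
origin_images = [glue_two_origins(1 / n) for n in ns]   # approaches 0

# -1 is not a point of U2, so the glued circle gains no second limit;
# 0 is a point of U2, so the interval with two origins does.
circle_has_second_limit = in_U2(-1.0)
origins_has_second_limit = in_U2(0.0)
```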

The other way that gluing can produce a topologically bad space is if we glue too many charts together. We will not provide an example of this; interested students may search for the ‘long line’ or the ‘long ray’, which are standard examples of this phenomenon.

Definition 2.9. A manifold is an atlas with the unique limit property and such that the index set is countable.

Exercise 2.10. For students who know topology: show that this is equivalent to the standard definition of a manifold.

It is very common to talk about the glued space M as the manifold without explicitly stating the atlas. This is similar to talking about a vector space as the set of vectors, when in fact it is the operations of addition and scalar multiplication that make a vector space interesting.

There are different sorts of manifolds, based on additional conditions on the gluing functions. We will use the notation that a function is $C^\ell$ when it is $\ell$-times continuously differentiable. By convention, $C^0$ means that the function is continuous, and $C^\infty$ means that the function is smooth. An atlas (or a manifold) is called $C^\ell$ when all of the gluing functions are $C^\ell$. Probably the most common type of manifold that is studied, and the one we will study in this course, is the smooth manifold. Henceforth, when we say manifold we mean smooth manifold.

Because we have given a non-standard definition of manifolds, we should explain how this compares to the standard definition. We do this using the example of a sphere and stereographic projection, which seems to be the first non-trivial example in every book on manifolds.

Example 2.11 (Stereographic Projection). The circle is commonly defined as the set $\mathbb{S}^1 = \{p \in \mathbb{R}^2 \mid \|p\| = 1\}$, and we name the north pole $N = (0,1)$ and the south pole $S = (0,-1)$. If one chooses a point $p$ other than the north pole, and considers the line through $p$ and $N$, there is a unique intersection point $\phi_N(p)$ on the $x$-axis. This is a bijective function $\phi_N : \mathbb{S}^1 \setminus \{N\} \to \mathbb{R}$ called stereographic projection. A nice geometry exercise using similar triangles gives the formula

$$\phi_N : (p_1,p_2) \mapsto \frac{p_1}{1 - p_2}.$$

Likewise stereographic projection from the south pole is

$$\phi_S : \mathbb{S}^1 \setminus \{S\} \to \mathbb{R}, \quad (p_1,p_2) \mapsto \frac{p_1}{1 + p_2}.$$

Notice that these constructions are ill-defined when applied to the pole itself, because a single point does not determine a line, so it is naturally a partial function on the circle.

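The inverses of these functions,

$$\phi_N^{-1} : \mathbb{R} \to \mathbb{S}^1 \setminus \{N\}, \quad x \mapsto \frac{1}{x^2 + 1}\,(2x,\ x^2 - 1),$$

and

$$\phi_S^{-1} : \mathbb{R} \to \mathbb{S}^1 \setminus \{S\}, \quad y \mapsto \frac{1}{y^2 + 1}\,(2y,\ 1 - y^2),$$

are regular parametrisations of (parts of) the circle in the sense of Section 1.1. So given a point $p$ on the circle, we can apply $\phi_N$ to get a point $x$ in $\mathbb{R}$, called its coordinate with respect to $\phi_N$, and putting the coordinate into the parameterisation $\phi_N^{-1}$ gives back the point $p$. If we know the coordinate $x$ with respect to $\phi_N$ then $y = \phi_S \circ \phi_N^{-1}(x)$ is the coordinate with respect to $\phi_S$. For this reason $\phi_S \circ \phi_N^{-1} : \mathbb{R} \setminus \{0\} \to \mathbb{R} \setminus \{0\}$ is the change of coordinates function. In this example, one can calculate

$$\phi_S \circ \phi_N^{-1}(x) = \phi_S\!\left( \frac{2x}{x^2 + 1},\ \frac{x^2 - 1}{x^2 + 1} \right) = \frac{\frac{2x}{x^2+1}}{1 + \frac{x^2-1}{x^2+1}} = \frac{2x}{x^2 + 1 + x^2 - 1} = \frac{1}{x}.$$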
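The change of coordinates can also be checked numerically from the formulas for stereographic projection from the north and south poles. The function names below are chosen to mirror the notation of the example; the check at a handful of sample values is a sanity test under that assumption, not a replacement for the calculation.

```python
def phi_N(p):
    """Stereographic projection from the north pole N = (0, 1)."""
    p1, p2 = p
    return p1 / (1 - p2)


def phi_S(p):
    """Stereographic projection from the south pole S = (0, -1)."""
    p1, p2 = p
    return p1 / (1 + p2)


def phi_N_inv(x):
    """Parametrisation of the circle minus the north pole."""
    return (2 * x / (x * x + 1), (x * x - 1) / (x * x + 1))


def transition_NS(x):
    """Change of coordinates phi_S o phi_N^{-1}, defined for x != 0."""
    return phi_S(phi_N_inv(x))


samples = [-3.0, -0.5, 0.25, 2.0]
# Each entry should be (numerically) zero if the transition is 1/x.
errors = [abs(transition_NS(x) - 1 / x) for x in samples]
```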

Notice that it is not possible to cover the circle by a single regular parameterisation (parameterisations must be injective). Thus to deal with spaces generally we must consider multiple overlapping parameterisations and change of coordinate functions.

In this example we have presented the standard approach to manifolds through coordinate functions and parameterisations. Let us connect it now with our definition. The index set is $I = \{N,S\}$. The images of the coordinate functions are the charts, $U_N = \mathbb{R}$ and $U_S = \mathbb{R}$. And the gluing function from $U_N \setminus \{0\}$ to $U_S \setminus \{0\}$ is $\varphi_{NS}(x) = x^{-1}$. This data forms an atlas.

But what does this atlas have to do with the circle? We indeed see that the equivalence classes of points are $S = \{0_N\}$, $N = \{0_S\}$, or $\{x,y\}$ for $x \in U_N \setminus \{0\}$ and $y = x^{-1} \in U_S \setminus \{0\}$. That is, they are in one-to-one correspondence with the points of $\mathbb{S}^1$. The glued space $M$ is a space that has all the same points as $\mathbb{S}^1$ and the same topology, but it is not a subset of a euclidean space. Whatever we can describe on $M$ must be intrinsic to the circle, independent of its relationship to $\mathbb{R}^2$.

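Exercise 2.12 (Stereographic Projection). Repeat the above construction for the $n$-sphere $\mathbb{S}^n = \{p \in \mathbb{R}^{n+1} \mid \|p\| = 1\}$. We choose³ our coordinates on $\mathbb{R}^{n+1}$ such that the north pole is $N = (0,\dots,0,1)$ and the south pole is $S = (0,\dots,0,-1)$. Show that the stereographic projections have the formulas

$$\phi_N : \mathbb{S}^n \setminus \{N\} \to \mathbb{R}^n, \quad p \mapsto \frac{1}{1 - p_{n+1}}\,(p_1,\dots,p_n), \qquad \phi_S : \mathbb{S}^n \setminus \{S\} \to \mathbb{R}^n, \quad p \mapsto \frac{1}{1 + p_{n+1}}\,(p_1,\dots,p_n),$$

and inverse stereographic projection is

$$\phi_N^{-1} : \mathbb{R}^n \to \mathbb{S}^n \setminus \{N\}, \quad x \mapsto \frac{1}{\|x\|^2 + 1}\,(2x_1,\dots,2x_n,\ \|x\|^2 - 1), \qquad \phi_S^{-1} : \mathbb{R}^n \to \mathbb{S}^n \setminus \{S\}, \quad y \mapsto \frac{1}{\|y\|^2 + 1}\,(2y_1,\dots,2y_n,\ 1 - \|y\|^2).$$

Finally, show that the transition function is

$$\varphi_{NS}(x) = \phi_S \circ \phi_N^{-1}(x) = \|x\|^{-2}\,x.$$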

In the standard definition of a manifold you begin with $M$ and coordinate functions $\phi_i : M \rightharpoonup \mathbb{R}^n$. The procedure in Example 2.11 to go from this data to our definition of a manifold is fully general: define the charts to be $U_i = \operatorname{img} \phi_i$ and the gluing functions as $\varphi_{ij} = \phi_j \circ \phi_i^{-1}$.

Conversely, if you begin with a manifold in our sense then $M$ is the glued space $\bigsqcup U_i / \sim$. There is of course the canonical projection $\pi_M : \bigsqcup U_i \to M$ that sends every element to its equivalence class, but this is too coarse. We call the restriction $\Phi_i := \pi_M|_{U_i} : U_i \to M$ a parameterisation of $\pi_M[U_i] \subseteq M$. It sends a point of $U_i$ to its equivalence class in $M$. We really should think of this as a parameterisation because an ordinary set $U_i$ in euclidean space is describing part of a complicated object $M$. In the other direction $\phi_i := (\Phi_i)^{-1} : \pi_M[U_i] \to U_i \subseteq \mathbb{R}^n$ is the coordinate function. It sends an equivalence class to its representative that lies in $U_i$. We usually write $\phi_i^{-1}$ for the parameterisation rather than $\Phi_i$, as it is unnecessary to have two symbols. Thus the two approaches to manifolds give equivalent information.

Let us summarise our terminology and the relations between the objects. Charts are open sets of $\mathbb{R}^n$. An atlas is a set of charts and gluing functions with the cocycle property. A gluing function can also be called a change of coordinate function or a transition function. The information of an atlas allows us to glue the charts together to get a manifold $M$. Functions from the charts to the manifold are called parametrisations and functions from the manifold to the charts are coordinates. The composition of a parameterisation and a coordinate is a transition function. One of the drawbacks of differential geometry being an old and widely practised field is that notation and terminology have been around for a long time and are not completely standardised. Different authors use the word chart to describe $\phi_i$, $\phi_i^{-1}$, $U_i$, or $\phi_i^{-1}[U_i]$.

2.2 Functions

Next we want to define functions between manifolds. On one hand, there is nothing to do. If we have manifolds $M$ and $\tilde M$, they are sets, and a function $f : M \to \tilde M$ is defined in the normal way. But manifolds are more than sets of points; they have atlases. Let

$$A = (n, I, \{U_i\}_{i \in I}, \{\varphi_{i,j}\}_{i,j \in I}), \qquad \tilde A = (\tilde n, \tilde I, \{\tilde U_i\}_{i \in \tilde I}, \{\tilde\varphi_{i,j}\}_{i,j \in \tilde I})$$

be atlases for M and M~ respectively. Then we can look at f ‘in charts’. This means we look at the partial functions

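$$f_{ik} := \tilde\phi_k \circ f \circ \phi_i^{-1} : U_i \rightharpoonup \tilde U_k \quad \text{for } i \in I \text{ and } k \in \tilde I.$$

These are functions between euclidean spaces, so we can ask whether they are $C^\ell$.

Definition 2.13. A function $f : M \to \tilde M$ is $C^\ell$ at a point $p \in M$ if there is a chart $U_i \ni \phi_i(p)$ and a chart $\tilde U_k \ni \tilde\phi_k(f(p))$ such that $f_{ik}$ is $C^\ell$, and it obeys a further technical condition.⁴ We say that $f$ is $C^\ell$ on $M$ if it is $C^\ell$ at every point of $M$.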

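The definition of $C^\ell$ uses a chart. But we know that a point may belong to more than one chart. This opens the possibility that $f$ is $C^\ell$ at $p$ according to one chart, but not $C^\ell$ according to another chart. However, because of the relation

$$f_{jl} = \tilde\phi_l \circ f \circ \phi_j^{-1} = (\tilde\phi_l \circ \tilde\phi_k^{-1} \circ \tilde\phi_k) \circ f \circ (\phi_i^{-1} \circ \phi_i \circ \phi_j^{-1}) = \tilde\varphi_{k,l} \circ (\tilde\phi_k \circ f \circ \phi_i^{-1}) \circ \varphi_{j,i} = \tilde\varphi_{k,l} \circ f_{ik} \circ \varphi_{j,i},$$

and because the gluing functions are themselves $C^\ell$, the definition does not depend on which charts are used.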

Example 2.14 (Euclidean Space). We have already seen that open subsets of euclidean space are manifolds in a particularly simple way, namely the coordinates and parametrisations are the identity function. Therefore a function in charts is the same thing as a function. This shows that a function between euclidean spaces is $C^\ell$ according to the manifold definition if and only if it is $C^\ell$ according to the ordinary definition.

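Example 2.15 (Stereographic Projection). Consider the circle $\mathbb{S}^1$ and the function $f : \mathbb{S}^1 \to \mathbb{R}$ given by $f(p_1,p_2) = p_2$. This is the height function. We can look at this function in charts

$$f_{N0} : U_N \to \mathbb{R}, \quad f_{N0}(x) = \tilde\phi_0 \circ f \circ \phi_N^{-1}(x) = \mathrm{id} \circ f\!\left( \frac{2x}{x^2 + 1},\ \frac{x^2 - 1}{x^2 + 1} \right) = \frac{x^2 - 1}{x^2 + 1},$$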

and likewise

$$f_{S0} : U_S \to \mathbb{R}, \quad f_{S0}(y) = \frac{1 - y^2}{y^2 + 1}.$$

It may help to understand this if we calculate it for a few points. Consider the south pole $S = (0,-1)$, which has a height $f(0,-1) = -1$. In $\phi_N$-coordinates the south pole is $0 \in U_N$ and $f_{N0}(0) = -1$ as expected. Now consider the point $p = (0.65, 0.76) \in \mathbb{S}^1$. Clearly it has $f(p) = 0.76$. In $\phi_N$-coordinates it is $2.7 \in U_N$ and in $\phi_S$-coordinates it is $0.37 \in U_S$. So then we compute

$$f_{N0}(2.7) = \frac{7.3 - 1}{7.3 + 1} = 0.76, \qquad f_{S0}(0.37) = \frac{1 - 0.137}{0.137 + 1} = 0.76.$$
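The same comparison can be run without rounding. The sketch below uses the point $(0.6, 0.8)$, which lies exactly on the circle; this point and the function names are choices made for the illustration and do not appear in the example itself.

```python
def phi_N(p):
    """Coordinate of p with respect to projection from the north pole."""
    return p[0] / (1 - p[1])


def phi_S(p):
    """Coordinate of p with respect to projection from the south pole."""
    return p[0] / (1 + p[1])


def f(p):
    """Height function on the circle."""
    return p[1]


def f_N0(x):
    """Height function written in the U_N chart."""
    return (x * x - 1) / (x * x + 1)


def f_S0(y):
    """Height function written in the U_S chart."""
    return (1 - y * y) / (y * y + 1)


p = (0.6, 0.8)                  # illustrative point with 0.36 + 0.64 = 1
height_direct = f(p)
height_via_N = f_N0(phi_N(p))   # coordinate 3 in U_N
height_via_S = f_S0(phi_S(p))   # coordinate 1/3 in U_S
```

All three values agree up to floating-point error, which is the statement that $f$ in charts is just the formula for $f$ rewritten in coordinates.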

In conclusion, f in charts is nothing other than a manipulation of the formula for f to use coordinates; it gives the same result.

Perhaps not unexpectedly, this is a smooth function in the sense of manifolds because both fN0 and fS0 are smooth functions in the usual sense.

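Example 2.16 (Glued Circle). We can give a function $f$ that embeds $M$ into euclidean space $\mathbb{R}^2$. It is easier to write the formulas in charts. We define

$$f_{11}(x) = (\cos \pi x,\ \sin \pi x), \qquad f_{21}(y) = (\cos \pi(y + 1),\ \sin \pi(y + 1)).$$

We can see that this is well-defined in the following way. Take a point $x \in \operatorname{dom} \varphi_{12}$. Then

$$f_{21}(\varphi_{12}(x)) = (\cos \pi(\varphi_{12}(x) + 1),\ \sin \pi(\varphi_{12}(x) + 1)) = \begin{cases} (\cos \pi(x + 1 + 1),\ \sin \pi(x + 1 + 1)) & \text{for } x \in (-1,0) \\ (\cos \pi(x - 1 + 1),\ \sin \pi(x - 1 + 1)) & \text{for } x \in (0,1) \end{cases} = \begin{cases} (\cos(\pi x + 2\pi),\ \sin(\pi x + 2\pi)) & \text{for } x \in (-1,0) \\ (\cos \pi x,\ \sin \pi x) & \text{for } x \in (0,1) \end{cases} = (\cos \pi x,\ \sin \pi x) = f_{11}(x).$$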

What we have shown is that if $x \in U_1 \sim y \in U_2$ then $f_{11}(x) = f_{21}(y)$. This means that it doesn’t matter which chart you use; the result is the same. In other words we have defined a function $f$ that doesn’t depend on charts: $f$ is defined on $M$.
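The compatibility of the two chart formulas with the gluing can be spot-checked numerically. The names `phi12`, `f11` and `f21` mirror the notation of Examples 2.5 and 2.16; testing a few sample points is only a sanity check of the calculation above.

```python
import math


def phi12(x):
    """Gluing function of Example 2.5; None where nothing is glued."""
    if -1 < x < 0:
        return x + 1
    if 0 < x < 1:
        return x - 1
    return None


def f11(x):
    """The embedding written in the chart U1."""
    return (math.cos(math.pi * x), math.sin(math.pi * x))


def f21(y):
    """The embedding written in the chart U2."""
    return (math.cos(math.pi * (y + 1)), math.sin(math.pi * (y + 1)))


samples = [-0.9, -0.5, -0.1, 0.2, 0.7]
# For glued points x ~ y = phi12(x) the two formulas should agree.
mismatches = [
    max(abs(a - b) for a, b in zip(f11(x), f21(phi12(x))))
    for x in samples
]
```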

This example shows us how to embed Example 2.5 into $\mathbb{R}^2$ to get what we would normally think of as the circle. But please keep in mind that manifolds are defined as the gluing of charts; they are defined as an abstract space that does not need to live in a bigger space. There are many ways to embed the circle into euclidean space. Even when we define a manifold starting with a subset of euclidean space, we leave the embedding behind.

2.3 Vectors

Our definition of manifolds makes it easy to define (tangent) vectors and vector fields. At any point $x \in U_1 \subseteq \mathbb{R}^n$ we have the tangent vectors $\{(v_1,\dots,v_n) \in \mathbb{R}^n\}$ and a vector field on $U_1$ is a function $X : U_1 \to \mathbb{R}^n$. To make this into a definition on a manifold, however, we need a way to make this independent of the chart. Alternatively, we need a way to compare vectors that are defined using different charts. There are essentially two equivalent ways to do this: using curves and using directional derivatives. We first give an example using curves as motivation.

Example 2.17 (Polar Coordinates). Consider the plane $\mathbb{R}^2$ with cartesian coordinates $(x_1,x_2)$, but also polar coordinates $(y_1,y_2) = (r,\theta)$, as in Example 2.4. The transition function from polar to cartesian is $\varphi(r,\theta) = (r\cos\theta, r\sin\theta)$. We consider vectors at the point $p$ with $(r,\theta) = (\tfrac{1}{\sqrt{2}}, \tfrac{\pi}{4})$, which is equivalent to $(x_1,x_2) = (0.5,0.5)$.

Let $\alpha(t) = (t + \tfrac{1}{\sqrt{2}}, \tfrac{\pi}{4})$, a curve in $U_2$ with $\alpha(0) = p$. In the $U_2$ chart it is a horizontal line with tangent vector $\alpha'(0) = (1,0)$. But we can also view this curve in the $U_1$ chart. It has the formula

$$\beta(t) = \varphi_{21}(\alpha(t)) = \left( \left(t + \tfrac{1}{\sqrt{2}}\right)\cos\tfrac{\pi}{4},\ \left(t + \tfrac{1}{\sqrt{2}}\right)\sin\tfrac{\pi}{4} \right) = \left( \tfrac{1}{\sqrt{2}}t + \tfrac{1}{2},\ \tfrac{1}{\sqrt{2}}t + \tfrac{1}{2} \right).$$

At $p$ it has the tangent vector $\beta'(0) = (\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}})$.

But these two charts represent the same points of the manifold; the curves $\alpha$ and $\beta$ have the same points under the gluing equivalence relation. Likewise we should think of the two tangent vectors, $v = \alpha'(0) = (1,0)$ at $(\tfrac{1}{\sqrt{2}}, \tfrac{\pi}{4})$ in the $U_2$ chart and $w = \beta'(0) = (\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}})$ at $(0.5,0.5)$ in the $U_1$ chart, as equivalent.

Both methods have the same setup. Let $U_1, U_2 \subseteq \mathbb{R}^n$ be two charts and $\varphi : U_1 \rightharpoonup U_2$ the transition (we leave off the subscripts for this explanation to simplify notation). Let $x$ be a point in $\operatorname{dom} \varphi$, $y = \varphi(x)$, and $v = (v_1,\dots,v_n)$ be a vector on $U_1$ and $w = (w_1,\dots,w_n)$ be a vector on $U_2$.

Curve method: Consider the curve $\alpha(t) = x + vt \in U_1$. This curve has $\alpha(0) = x$ and $\alpha'(0) = v$. Using the transition, we also have a curve $\beta = \varphi \circ \alpha$ in $U_2$ with $\beta(0) = \varphi(x) = y$. The idea is that $v$ in the first chart is transformed to $w = \beta'(0)$ in the second chart. Using the chain rule

$$w = \beta'(0) = (J_x\varphi)\,\alpha'(0) = (J_x\varphi)\,v,$$

where $J_x\varphi$ is the matrix of partial derivatives of $\varphi$, also called the Jacobian matrix, evaluated at the point $x$.
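The curve method can be tried out on the polar coordinate transition of Example 2.4. In the sketch below the transition `phi` is the polar-to-cartesian map, the Jacobian is written out by hand, and the velocity of the transported curve is approximated by a central finite difference; the helper names and the step size `h` are assumptions of this illustration.

```python
import math


def phi(point):
    """Polar-to-cartesian transition (r, theta) -> (r cos theta, r sin theta)."""
    r, theta = point
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian(point):
    """Matrix of partial derivatives of phi at the given point."""
    r, theta = point
    return ((math.cos(theta), -r * math.sin(theta)),
            (math.sin(theta), r * math.cos(theta)))


def mat_vec(J, v):
    """Product of a 2x2 matrix with a vector."""
    return tuple(sum(J[i][j] * v[j] for j in range(2)) for i in range(2))


def curve_velocity(point, v, h=1e-6):
    """Approximate beta'(0) for beta(t) = phi(point + t v) by a central difference."""
    plus = phi(tuple(p + h * c for p, c in zip(point, v)))
    minus = phi(tuple(p - h * c for p, c in zip(point, v)))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))


x = (1 / math.sqrt(2), math.pi / 4)   # the point of Example 2.17 in polar coordinates
v = (1.0, 0.0)                        # tangent vector in the polar chart

w_chain_rule = mat_vec(jacobian(x), v)   # (J_x phi) v
w_from_curve = curve_velocity(x, v)      # numerical beta'(0)
```

Both results should be close to $(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}})$, the vector found by hand for this point.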

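Directional derivative method: Take any function $h : U_2 \to \mathbb{R}$. We can use the transition function to write $h$ as a function on $U_1$, at least near $x$, namely $h \circ \varphi$. We compute the derivative in the direction of $v$ at the point $x$ and apply the chain rule

$$\sum_{j=1}^n v_j \frac{\partial (h \circ \varphi)}{\partial x_j}(x) = \sum_{i,j=1}^n v_j \frac{\partial h}{\partial y_i}(\varphi(x))\, \frac{\partial y_i}{\partial x_j}(x) = \sum_{i,j=1}^n v_j \frac{\partial h}{\partial y_i}(y)\, (J_x\varphi)_{ij} = \sum_{i=1}^n (J_x\varphi\, v)_i\, \frac{\partial h}{\partial y_i}(y).$$

We see that the derivative of $h \circ \varphi$ at $x$ in the direction of $v$ is equal to the derivative of $h$ at $y$ in the direction $w = J_x\varphi\, v$. From both of these methods we get the same answer: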

Definition 2.18. The vector $v$ at $x \in U_1$ is equivalent to the vector $w$ at $y \in U_2$ when $w = J_x\varphi\, v$. This formula is called the change of coordinates for vectors. We denote the set of equivalence classes of vectors at a point $p \in M$ by $T_pM$, called the tangent space to $M$ at $p$.

Because of the second method, it is common to write a tangent vector as $\sum_{i=1}^n v_i \frac{\partial}{\partial x_i}\big|_p$. This notation has two advantages. First, the change of coordinates is built into the notation via the chain rule, as above. And second, if we leave out the point at which we should evaluate the derivative then $\frac{\partial}{\partial x_i}$ is a vector field. In fact it gives a basis of the vector fields. Every vector field on $U_1$ can be written as a function

$$X : x \mapsto \sum_{i=1}^n v_i(x) \frac{\partial}{\partial x_i}$$

for functions $v_i : U_1 \to \mathbb{R}$. Because vector fields thought of this way can be evaluated at a point to give a tangent vector as well as act on a function, the notation $X(x)$ is potentially ambiguous. We will use $X|_p$ for evaluation and $X(f)$ for action on a function.

Example 2.19 (Stereographic Projection). Let’s see how the coordinate vector fields transform between the two charts of stereographic projection on $\mathbb{S}^1$. In particular $y = \varphi_{NS}(x) = x^{-1}$ has the Jacobian matrix

$$J_x\varphi_{NS} = \left( \frac{\partial y}{\partial x} \right) = \left( -x^{-2} \right).$$

Therefore

$$\frac{\partial}{\partial x} = \frac{\partial y}{\partial x} \frac{\partial}{\partial y} = -x^{-2} \frac{\partial}{\partial y} = -y^2 \frac{\partial}{\partial y}.$$

We should note that this equality, which is really equivalence of vectors as per Definition 2.18, holds on the overlap between the two charts, namely away from the north and south poles. In particular, the vector $\frac{\partial}{\partial x}$ is not defined at the north pole; this expression only has meaning on the chart $U_N$. If we were to define a vector field by the formula

$$X = \begin{cases} \frac{\partial}{\partial x} & \text{for } x \in U_N, \\ -y^2 \frac{\partial}{\partial y} & \text{for } y \in U_S \end{cases}$$

then this is a well-defined vector field on all of $\mathbb{S}^1$, because it gives a vector at every point and on overlaps the two cases give equivalent vectors. Observe that $X|_{y=0}$ is the zero vector.

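On the other hand, if we consider the vector field $x^3 \frac{\partial}{\partial x}$ on $U_N$ then it corresponds to the vector field

$$x^3 \frac{\partial}{\partial x} = (y^{-3}) \left( -y^2 \frac{\partial}{\partial y} \right) = -y^{-1} \frac{\partial}{\partial y}$$

on $U_S \setminus \{y = 0\}$. This has no continuous extension to all of $\mathbb{S}^1$.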

The main lesson of this example is that what looks like the ‘same vector’ at two different points in one chart can look like completely different vectors in another chart. From the above example, $\frac{\partial}{\partial x}\big|_{x=1}$ and $\frac{\partial}{\partial x}\big|_{x=2}$ look the same in the $U_N$ chart, but in the $U_S$ chart these two vectors are

$$-\frac{\partial}{\partial y}\Big|_{y=1} \quad \text{and} \quad -\frac{1}{4} \frac{\partial}{\partial y}\Big|_{y=0.5}$$

respectively. This is because the Jacobian $J_x\varphi$ has a dependence on $x$, on the point in the manifold. Hence the equivalence relation is different at different points. The consequence is that there is no easy way to identify tangent vectors at different points of a manifold. We will examine how this problem can be overcome in the next chapter, which concerns ‘connections’ between different points of a manifold.

Example 2.20 (Stereographic Projection). Let us continue Example 2.19 to see that equivalent vectors give the same directional derivatives. Consider how the vector field

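$$X = \begin{cases} \frac{\partial}{\partial x} & \text{for } x \in U_N, \\ -y^2 \frac{\partial}{\partial y} & \text{for } y \in U_S \end{cases}$$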

acts on the height function $f$ from Example 2.15. In the chart $U_N$ we have

$$(Xf)_N = \frac{\partial f_N}{\partial x} = \frac{\partial}{\partial x} \left( \frac{x^2 - 1}{x^2 + 1} \right) = \frac{2x(x^2 + 1) - 2x(x^2 - 1)}{(x^2 + 1)^2} = \frac{4x}{(x^2 + 1)^2}.$$

Likewise in the chart $U_S$ we have

$$(Xf)_S = -y^2 \frac{\partial f_S}{\partial y} = -y^2 \frac{\partial}{\partial y} \left( \frac{1 - y^2}{y^2 + 1} \right) = -y^2 \left( \frac{-4y}{(y^2 + 1)^2} \right) = \frac{4y^3}{(y^2 + 1)^2}.$$

So in the $U_N$ chart for the point $x = 0.5$ we have that the function $f$ is changing in the direction $X$ by $(Xf)_N(0.5) = 1.28$. And in the $U_S$ chart for the corresponding point $y = 2$ we have that the function $f$ is changing in the direction $X$ by $(Xf)_S(2) = 1.28$. The directional derivatives in the two charts agree. This is a demonstration that everything is well-defined.
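The agreement of the directional derivatives can be checked numerically for the height function in the two stereographic charts, with the derivatives approximated by central finite differences. The helper `derivative` and its step size are assumptions of this sketch; `Xf_S` includes the factor $-y^2$ that comes from the change of coordinates $y = x^{-1}$.

```python
def f_N(x):
    """Height function in the U_N chart."""
    return (x * x - 1) / (x * x + 1)


def f_S(y):
    """Height function in the U_S chart."""
    return (1 - y * y) / (y * y + 1)


def derivative(g, t, h=1e-6):
    """Central finite-difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)


def Xf_N(x):
    """X acts as d/dx in the U_N chart."""
    return derivative(f_N, x)


def Xf_S(y):
    """X acts as -y^2 d/dy in the U_S chart."""
    return -y * y * derivative(f_S, y)


value_in_N = Xf_N(0.5)   # x = 0.5
value_in_S = Xf_S(2.0)   # the corresponding point y = 1/x = 2
```

Both values come out close to $1.28$, matching the hand calculation.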

Example 2.21 (Glued Circle). Because the Jacobian of the transition function is just the constant 1, the following is a well-defined vector field:

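$$X = \begin{cases} \frac{\partial}{\partial x} & \text{for } x \in U_1, \\ \frac{\partial}{\partial y} & \text{for } y \in U_2. \end{cases}$$

We can act this vector field on the function $f : \mathbb{S}^1 \to \mathbb{R}$ defined in charts as

$$f_{11}(x) = \cos \pi x, \qquad f_{21}(y) = \cos \pi(y + 1),$$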

which is just the first component of the function from Example 2.16. We get

$$(Xf)_{11}(x) = \frac{\partial}{\partial x}\cos \pi x = -\pi \sin \pi x, \qquad (Xf)_{21}(y) = \frac{\partial}{\partial y}\cos \pi(y + 1) = -\pi \sin \pi(y + 1).$$

This is a well-defined function on $\mathbb{S}^1$, which we know because it is $-\pi$ multiplied by the second component from Example 2.16.

These two examples illustrate that a vector field $X$ applied in a chart to a function $f_i$ is another function $(Xf)_i$. The fact that we used the chain rule to define equivalence of vectors is precisely the condition to ensure that the resulting functions in charts piece together to give a well-defined function $Xf$ on the whole manifold.

If we have a function between two manifolds $f : M \to \tilde M$, the tangent map or pushforward of $f$ at $p$ is a function between the tangent spaces $T_pf : T_pM \to T_{f(p)}\tilde M$. We will define it in two ways. The easiest definition is using the curve method of vectors. If we have a vector $v \in T_pM$ then in a chart $U_i$, where $p$ has the coordinate $x$, the curve $\alpha(t) = x + vt$ is a representative of this vector. Observe that $f \circ \alpha : (a,b) \to \tilde M$ is a curve in $\tilde M$. We define $T_pf(v)$ to be the tangent vector of the curve $f \circ \alpha$ at $t = 0$.

We also give a practical formula for calculating the tangent map. The curve $f \circ \alpha$ must lie in some chart $\tilde U_j$ of $\tilde M$. In charts we have $\alpha_i : (a,b) \to U_i$ and $f_{ij} : U_i \rightharpoonup \tilde U_j$. So the tangent vector of the curve is

$$\frac{d}{dt} f_{ij}(\alpha(t))\Big|_{t=0} = (J_{\alpha(t)}f_{ij})\,\alpha'(t)\Big|_{t=0} = (J_{\alpha(0)}f_{ij})\,\alpha'(0) = (J_xf_{ij})\,v,$$

using the chain rule and the fact that $\alpha(0) = x$, $\alpha'(0) = v$. Thus we see in charts that the tangent map is the Jacobian of $f_{ij}$.

We could have also given a definition of the pushforward based on the directional derivative idea. If $w = T_pf(v) \in T_{f(p)}\tilde M$ is the pushforward of a vector $v \in T_pM$, then we can ask how $w$ acts on a function $g : \tilde M \to \mathbb{R}$. The observation is that $g \circ f : M \to \mathbb{R}$ and

$$w(g) = T_pf(v)(g) = v(g \circ f).$$

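Example 2.22 (Stereographic Projection). Consider the inclusion map $\iota : \mathbb{S}^1 \to \mathbb{R}^2$. As a formula it is just the identity, but we consider it as a function between manifolds. We can ask how the vector $\frac{\partial}{\partial x}\big|_0 \in T_S\mathbb{S}^1$, a coordinate vector of the $U_N$ chart, is pushed forward by this map to a vector in $T_{(0,-1)}\mathbb{R}^2$, since $0 \in U_N$ represents $S \in \mathbb{S}^1$, which is mapped to $\iota(S) = (0,-1)$. Let us calculate the tangent map in the chart $U_N$. As per above, this is the Jacobian of $\iota_{N0} = \mathrm{id}_{\mathbb{R}^2} \circ \iota \circ \phi_N^{-1} = \phi_N^{-1}$:

$$J_x\phi_N^{-1} = \begin{pmatrix} \frac{\partial (\phi_N^{-1})_1}{\partial x} \\ \frac{\partial (\phi_N^{-1})_2}{\partial x} \end{pmatrix} = \begin{pmatrix} \frac{\partial}{\partial x}\left( \frac{2x}{x^2+1} \right) \\ \frac{\partial}{\partial x}\left( \frac{x^2-1}{x^2+1} \right) \end{pmatrix} = \frac{2}{(x^2 + 1)^2} \begin{pmatrix} -x^2 + 1 \\ 2x \end{pmatrix}$$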

and at x = 0 we have

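$$J_0\phi_N^{-1} = \begin{pmatrix} 2 \\ 0 \end{pmatrix}.$$

In coordinates, the vector $\frac{\partial}{\partial x}\big|_0$ is just $1$, so its pushforward in coordinates is

$$(T_S\iota)\left(\frac{\partial}{\partial x}\Big|_0\right) = (J_0\phi_N^{-1})(1) = \begin{pmatrix} 2 \\ 0 \end{pmatrix} \begin{pmatrix} 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix} = 2\,\frac{\partial}{\partial p_1}\Big|_{(0,-1)} + 0\,\frac{\partial}{\partial p_2}\Big|_{(0,-1)}.$$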

Note that this vector is tangent to $\mathbb{S}^1$ in the usual sense.
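The tangency claim can be tested at other points of the chart as well. The sketch below evaluates the Jacobian of the inverse stereographic projection from the north pole and checks that the pushed-forward vector is perpendicular to the position vector on the unit circle, which is what ‘tangent in the usual sense’ means there; the function names are chosen for this illustration.

```python
def phi_N_inv(x):
    """Parametrisation of the circle minus the north pole."""
    return (2 * x / (x * x + 1), (x * x - 1) / (x * x + 1))


def jacobian_phi_N_inv(x):
    """Jacobian of phi_N_inv at x, a 2x1 matrix stored as a pair."""
    c = 2 / (x * x + 1) ** 2
    return (c * (1 - x * x), c * 2 * x)


def pushforward(x, v):
    """Push the vector v d/dx at x forward along the inclusion into the plane."""
    J = jacobian_phi_N_inv(x)
    return (J[0] * v, J[1] * v)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


w_at_south_pole = pushforward(0.0, 1.0)   # the vector computed in the example

samples = [-2.0, -0.5, 0.0, 1.0, 3.0]
# On the unit circle a vector is tangent exactly when it is
# perpendicular to the position vector of the base point.
tangency_defects = [dot(pushforward(x, 1.0), phi_N_inv(x)) for x in samples]
```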

An important observation should be made about the previous example. We were careful to stress that the tangent map was applied to a vector at a particular point and not a vector field. That is because the tangent map does not, as students sometimes assume, take vector fields to vector fields. There are two ways it fails to do this. We could easily generalise this example to $\frac{\partial}{\partial x}$ at any point of $U_N$, which would then give a vector on $\mathbb{R}^2$. But not every point of $\mathbb{R}^2$ gets a vector. Only the points in the image of $\phi_N^{-1}$ get a vector. Therefore the result is not a vector field on $\mathbb{R}^2$.

The other way that the tangent map can fail to transform a vector field into another vector field, which does not apply to the above example, is if the function is not injective. In that case some points in the image get multiple vectors from the tangent map. Since a vector field is a certain type of function, and functions must have exactly one output, the pushforward of a vector field may not be a function.

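Exercise 2.23 (Stereographic Projection). Verify the following formula for the tangent map of $\iota : \mathbb{S}^2 \to \mathbb{R}^3$ in the $U_N$ chart.

\[ J_x(\phi_N^{-1}) = \frac{2}{(|x|^2+1)^2}\begin{pmatrix} -(x^1)^2+(x^2)^2+1 & -2x^1x^2 \\ -2x^1x^2 & (x^1)^2-(x^2)^2+1 \\ 2x^1 & 2x^2 \end{pmatrix}. \]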

We end this section by highlighting a distinction that we have elided until now. A tangent vector is an intrinsic concept on a manifold; it does not require the manifold to be immersed in euclidean space. And yet in the previous chapter we repeatedly talked about tangent vectors to surfaces as vectors in $\mathbb{R}^3$ that were tangent in the geometrical sense. The connection between these two ideas is exactly the idea of immersion and pushforward.

Definition 2.24. A function $f : M \to \tilde{M}$ is an immersion at $p \in M$ if $T_pf$ is injective.

We commented on this at the end of Example 2.22. Another possible example would be Example 2.16. We saw that a manifold can be mapped into a euclidean space by a function. The pushforwards of its tangent vectors by this function are exactly the tangent vectors in the intuitive picture. Thus for an immersion, it is possible to consider the elements of $T_pM$ as certain elements of $T_{f(p)}\tilde{M}$, and this is unambiguous because by definition $T_pf$ is injective.

2.4 Vector Bundles

This section will not be used later in the script. It exists just to flex how natural the construction of the tangent bundle is in our approach to manifolds.

Above we have defined the tangent space $T_pM$ to a manifold $M$ at a point $p \in M$ and discussed how the vectors of $T_pM$ and $T_qM$ should be thought of as distinct, even if $p$ and $q$ are in the same chart. But we also worked with examples of vector fields, whose values are vectors at different points. We can reconcile this tension by putting all the vectors of a manifold together into a new manifold:

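Definition 2.25. Let $M$ be a manifold with atlas $\mathcal{A} = (n, I, \{U_i\}_{i\in I}, \{\varphi_{ij}\}_{i,j\in I})$. We construct a new atlas $T\mathcal{A} = (2n, I, \{TU_i\}_{i\in I}, \{T\varphi_{ij}\}_{i,j\in I})$, where $TU_i = U_i \times \mathbb{R}^n$ and

\[ T\varphi_{ij} : \operatorname{dom}\varphi_{ij} \times \mathbb{R}^n \to \operatorname{img}\varphi_{ij} \times \mathbb{R}^n, \quad (x,v) \mapsto (\varphi_{ij}(x), (J_x\varphi_{ij})v). \]

The corresponding manifold is called the tangent bundle $TM$. There is a function $\pi : TM \to M$, defined in charts by

\[ \pi_{ii}(x,v) = x. \]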

We tend to think of the tangent bundle of $M$ as all the vectors of $M$, with the understanding that vectors at different points are distinct from one another. The function $\pi$ is called the canonical projection of the bundle. Intuitively it takes a tangent vector to its base point. Hence $\pi^{-1}[\{p\}]$ are all the vectors which live at the same point, in other words $T_pM$.

The tangent bundle allows us to speak and reason formally about tangent vectors as a whole. We can define a vector field on $M$ as a function $X$ from $M$ to $TM$ with the property that $\pi \circ X = \mathrm{id}_M$. This property says that for every point $p \in M$ the vector $X|_p$ must have the base point $\pi(X|_p) = p$, which is exactly what a vector field is.
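The section property can be made concrete in a single chart, where an element of the tangent bundle is just a pair $(x, v)$. The sketch below is only an illustration under that assumption: the field `X` and the sample points are hypothetical choices made here, and a manifold with several charts would additionally need the transition maps $T\varphi_{ij}$.

```python
def pi(point):
    """Canonical projection in a chart: (x, v) -> x."""
    x, v = point
    return x

def X(x):
    """A hypothetical vector field on R^2 in one chart, returned as (x, v).

    Here X|_x = x^1 d/dx^1 + x^2 d/dx^2, so the vector part equals x.
    """
    return (x, (x[0], x[1]))

def not_a_field(x):
    """Not a section: the vector is attached to the wrong base point."""
    return ((x[0] + 1.0, x[1]), (1.0, 0.0))

samples = [(0.0, 0.0), (1.0, -2.0), (0.5, 3.0)]
is_section = all(pi(X(p)) == p for p in samples)
is_bad_section = all(pi(not_a_field(p)) == p for p in samples)
print(is_section, is_bad_section)
```

The second function also maps into the tangent bundle, but it violates $\pi \circ X = \mathrm{id}$ and is therefore not a vector field.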

Exercise 2.26. Check that the tangent bundle is a manifold. In particular you must check the unique limit condition.

Example 2.27 (Euclidean Space). Since the trivial atlas for an open subset $U \subseteq \mathbb{R}^n$ has only one chart $U_0 = U$, so too does its tangent bundle: $TU_0 = U \times \mathbb{R}^n$. Therefore the tangent bundle is just the cartesian product. The only difference between $TU$ and the usual way we think about vectors in euclidean space is that the base point of the vector is important for the tangent bundle.

Example 2.28 (Glued Circle). We saw in Example 2.21 that there was a vector field $X$ on $\mathbb{S}^1$ that was never zero. Because $T_p\mathbb{S}^1$ is one dimensional, every vector in it must be a scalar multiple of $X|_p$. This means we can define a function from $T_p\mathbb{S}^1$ to $\mathbb{R}$

\[ v \mapsto a, \quad \text{where } v = aX|_p. \]

Allowing $p$ to change gives us a function from $T\mathbb{S}^1$ to $\mathbb{S}^1 \times \mathbb{R}$. Thus the tangent bundle of the circle is also a product. Tangent bundles that are products are called trivial.

Example 2.29. In this example we want to show a non-trivial tangent bundle, but it’s actually somewhat difficult to prove non-triviality. Instead we will try to convey the idea. The tangent bundle of $\mathbb{S}^2$ is non-trivial. If it were trivial, that would mean there would exist a smooth bijective correspondence between $T\mathbb{S}^2$ and $\mathbb{S}^2 \times \mathbb{R}^2$. We could use this correspondence to write any vector field $X$ on $\mathbb{S}^2$ as $(p, \tilde{X}(p))$ with $\tilde{X}(p) \in \mathbb{R}^2$. Conversely there would exist a vector field with the form $(p, (1,0))$. This vector field is never zero.

But it is a theorem that every vector field on the sphere has at least one zero. This is called the hairy ball theorem or the hedgehog theorem. There are many intuitive interpretations of this theorem; one is that there must be somewhere on earth where the wind is not blowing. Try imagining different wind maps, and perhaps you will convince yourself of the truth of this theorem. This is one reason that the tangent bundle of $\mathbb{S}^2$ cannot be trivial.

The tangent map takes its nicest form when expressed with the tangent bundle. We can collect together all the tangent maps $T_pf : T_pM \to T_{f(p)}N$ into a single map $Tf : TM \to TN$. Using the formula in charts in terms of the Jacobian we get

\[ (Tf)_{ij}(x,v) = \left(f_{ij}(x), (J_xf_{ij})v\right) \in V_j \times \mathbb{R}^n = TV_j. \]

This formula shows that the tangent map is the generalisation of the Jacobian, and all local properties of the Jacobian carry over to the tangent map.

Exercise 2.30. Prove that the above formula for the tangent map is well-defined (in different charts) by using the chain rule for Jacobians: for $y = \varphi_{M,i,j}(x)$ and $z = f_{jk}(y)$

\[ J_x(\varphi_{N,k,l} \circ f_{jk} \circ \varphi_{M,i,j}) = (J_z\varphi_{N,k,l})(J_yf_{jk})(J_x\varphi_{M,i,j}). \]

Exercise 2.31 (Glued Circle). Calculate the tangent map for $f$ from Example 2.16. Use this to give an alternative description of the tangent bundle $T\mathbb{S}^1$ by locating it as a set inside $T\mathbb{R}^2 = \mathbb{R}^2 \times \mathbb{R}^2$.

2.5 Summation Convention

As you have seen, when working with vectors in charts, there are many Σ summations, but all of them are just from i = 1 to n. There is a convention, called the Einstein summation convention or the Einstein rule, that allows us to omit all these sigmas.

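Definition 2.32 (Einstein Summation). We apply the following notational rule: when an index occurs twice in a term, once as an upper index and once as a lower index, we sum that index over its range. For the purposes of this rule the index of $\frac{\partial}{\partial x^i} = \partial_i$ counts as a lower index.

In this notation, the chain rule reads

\[ \frac{\partial}{\partial x^1} = \frac{\partial y^1}{\partial x^1}\frac{\partial}{\partial y^1} + \frac{\partial y^2}{\partial x^1}\frac{\partial}{\partial y^2} = \sum_{i=1}^{2}\frac{\partial y^i}{\partial x^1}\frac{\partial}{\partial y^i} = \frac{\partial y^i}{\partial x^1}\frac{\partial}{\partial y^i}. \]

A vector field $X$ that has coefficients $X^i$ with respect to a chart can neatly be written as $X = X^i\partial_i$, and its expression with respect to a second chart $y$ might be written using the equivalences of vectors as

\[ X = X^i\frac{\partial}{\partial x^i} = X^i\frac{\partial y^j}{\partial x^i}\frac{\partial}{\partial y^j}. \]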

This convention is useful even for ordinary linear algebra. If you have a matrix $A$ we would normally write the entries $A_{ij}$. However if we instead write it $A^i_j$ with a mix of upper and lower indices then we can write matrix multiplication in the following way

\[ Av = \left(A^i_j\right)\left(v^j\right) = \sum_{j=1}^{n}A^i_jv^j = A^i_jv^j. \]

Likewise matrix multiplication $AB$ would be $(AB)^i_k = A^i_jB^j_k$. A mnemonic for turning matrix algebra into index algebra with the summation convention is “Upper indices go up to down; lower indices go left to right”.
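To see that the index formulas really are ordinary matrix algebra, here is a short sketch in plain Python. It assumes the storage convention `A[i][j]` for the entry with upper (row) index `i` and lower (column) index `j`; the function names and the sample matrices are choices made here for illustration.

```python
def matvec(A, v):
    """(Av)^i = A^i_j v^j, summing over the repeated index j."""
    return [sum(A[i][j] * v[j] for j in range(len(v)))
            for i in range(len(A))]

def matmul(A, B):
    """(AB)^i_k = A^i_j B^j_k, summing over the repeated index j."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
v = [5, 6]

Av = matvec(A, v)
AB = matmul(A, B)
# Associativity (AB)v = A(Bv) is a relabelling of dummy indices:
# A^i_j B^j_k v^k can be summed over j or over k first.
left = matvec(AB, v)
right = matvec(A, matvec(B, v))
print(Av, AB, left, right)
```

Each explicit `sum` corresponds to one repeated index that the summation convention leaves implicit.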

We know that bilinear functions can also be represented as a matrix; in usual matrix notation:

\[ g(v,w) = v^T\left(g_{ij}\right)w = \sum_{i,j=1}^{n}v^ig_{ij}w^j. \]

In summation convention we write this as $g_{ij}v^iw^j$, with the vector components having upper indices, which forces $g$ to have two lower indices. This is actually an advantage of index notation over matrix notation. Though linear transformations and bilinear forms can both be represented as a matrix, those matrix representations behave differently under change of basis. We can see that they are different in index notation, but not in matrix notation.

Some authors do not have the condition that you need one upper and one lower index, and allow summation over any repeated index. I do not like this, because it makes it impossible to talk about the diagonal elements $g_{ii}$. If you need to sum over two indices on the same level, you can of course use a summation sign. Alternatively you can use an identity matrix, whose entries are usually called $\delta^{ij}$ (with the indices placed as the expression requires). For example

\[ \operatorname{tr}\left(g_{ij}\right) = \sum_{i=1}^{n}g_{ii} = \delta^{ij}g_{ij}. \]

Finally, some useful rules of calculation are

\[ \frac{\partial x^i}{\partial y^j}\frac{\partial y^j}{\partial x^k} = \delta^i_k \quad\text{and}\quad v^i\delta^j_i = v^j. \]
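Both rules are short consequences of facts already used above; the following spells them out, with the second written for $n = 2$ as an illustration. The first is the chain rule applied to the coordinate function $x^i$, and the second says that multiplying by the identity matrix changes nothing.

```latex
\[
  \frac{\partial x^i}{\partial y^j}\frac{\partial y^j}{\partial x^k}
    = \frac{\partial x^i}{\partial x^k}
    = \delta^i_k,
  \qquad
  v^i\delta^j_i = v^1\delta^j_1 + v^2\delta^j_2 =
  \begin{cases} v^1 & \text{if } j = 1, \\ v^2 & \text{if } j = 2. \end{cases}
\]
```

In matrix language the first rule states that the Jacobians of a chart transition and of its inverse are inverse matrices.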

2.6 The Lie Bracket

We have seen examples of how vectors are a type of directional derivative of functions and that a vector field applied to a function gives another function. A natural question is to ask whether we can also differentiate a vector field using a vector field. The answer is yes: in fact vector fields can differentiate many objects on manifolds using a process called the Lie derivative $\mathcal{L}_X$, but the process is complicated and we won’t explain it here. We show with the following example why the ‘obvious’ way to differentiate vector fields doesn’t work.

Example 2.33 (Polar Coordinates). You might guess that you can differentiate a vector field just by differentiating its coefficient functions:

\[ Y(X) \overset{?}{:=} \sum_{i=1}^{n}Y(X^i)\frac{\partial}{\partial x^i}, \]

but this doesn’t produce a well-defined vector field, because in different charts it produces inequivalent vectors.

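Let us give an example. Consider the vector fields

\[ X = x^1\frac{\partial}{\partial x^1} + x^2\frac{\partial}{\partial x^2}, \qquad Y = x^2\frac{\partial}{\partial x^1} - x^1\frac{\partial}{\partial x^2}, \]

on $U_1 = \mathbb{R}^2$. For example, if we act $Y$ on $X$ with this definition, in $U_1$ we have

\[ \left[x^2\frac{\partial x^1}{\partial x^1} - x^1\frac{\partial x^1}{\partial x^2}\right]\frac{\partial}{\partial x^1} + \left[x^2\frac{\partial x^2}{\partial x^1} - x^1\frac{\partial x^2}{\partial x^2}\right]\frac{\partial}{\partial x^2} = x^2\frac{\partial}{\partial x^1} - x^1\frac{\partial}{\partial x^2} = Y. \]

But in $U_2$, polar coordinates, we have

\[ r\frac{\partial}{\partial r} = r\frac{\partial x^i}{\partial r}\frac{\partial}{\partial x^i} = r\cos\theta\,\frac{\partial}{\partial x^1} + r\sin\theta\,\frac{\partial}{\partial x^2} = X, \qquad -\frac{\partial}{\partial\theta} = -\frac{\partial x^i}{\partial\theta}\frac{\partial}{\partial x^i} = r\sin\theta\,\frac{\partial}{\partial x^1} - r\cos\theta\,\frac{\partial}{\partial x^2} = Y. \]

So the same calculation in polar coordinates gives

\[ \left[-\frac{\partial r}{\partial\theta}\right]\frac{\partial}{\partial r} = 0. \]

Hence this operation depends on which coordinates one uses. It is not well-defined on the manifold.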

It turns out the correct way to differentiate vector fields is to take the commutator of the above guess.

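Definition 2.34. The Lie derivative of a vector field $Y$ with respect to a vector field $X$, also called the Lie bracket, acts on a function $f$ by

\[ (\mathcal{L}_XY)(f) = [X,Y](f) := X(Y(f)) - Y(X(f)). \]

From this definition we can ask how to write it in charts:

\begin{align*}
[X,Y](f) &= X(Y(f)) - Y(X(f)) = X\left(Y^j\frac{\partial f}{\partial x^j}\right) - Y\left(X^i\frac{\partial f}{\partial x^i}\right) \\
&= X^i\frac{\partial}{\partial x^i}\left(Y^j\frac{\partial f}{\partial x^j}\right) - Y^j\frac{\partial}{\partial x^j}\left(X^i\frac{\partial f}{\partial x^i}\right) \\
&= X^i\frac{\partial Y^j}{\partial x^i}\frac{\partial f}{\partial x^j} + X^iY^j\frac{\partial^2 f}{\partial x^i\partial x^j} - Y^j\frac{\partial X^i}{\partial x^j}\frac{\partial f}{\partial x^i} - Y^jX^i\frac{\partial^2 f}{\partial x^j\partial x^i} \\
&= \left[X^j\frac{\partial Y^i}{\partial x^j} - Y^j\frac{\partial X^i}{\partial x^j}\right]\frac{\partial f}{\partial x^i} = \left[X(Y^i) - Y(X^i)\right]\frac{\partial f}{\partial x^i}.
\end{align*}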

So even though the definition of [X,Y ] appears to have second derivatives of f, these cancel out and what remains is indeed a (first order) vector field acting on f.

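Example 2.35 (Polar Coordinates). Let us take the vector fields from the previous example and compute their Lie bracket. In the $U_1$ chart

\begin{align*}
[X,Y] &= \left[\left(x^1\frac{\partial}{\partial x^1} + x^2\frac{\partial}{\partial x^2}\right)(x^2) - \left(x^2\frac{\partial}{\partial x^1} - x^1\frac{\partial}{\partial x^2}\right)(x^1)\right]\frac{\partial}{\partial x^1} \\
&\quad + \left[\left(x^1\frac{\partial}{\partial x^1} + x^2\frac{\partial}{\partial x^2}\right)(-x^1) - \left(x^2\frac{\partial}{\partial x^1} - x^1\frac{\partial}{\partial x^2}\right)(x^2)\right]\frac{\partial}{\partial x^2} \\
&= \left[x^2 - x^2\right]\frac{\partial}{\partial x^1} + \left[-x^1 + x^1\right]\frac{\partial}{\partial x^2} = 0,
\end{align*}

and in the $U_2$ chart

\[ [X,Y] = \left[\left(r\frac{\partial}{\partial r} + 0\frac{\partial}{\partial\theta}\right)(0) - \left(0\frac{\partial}{\partial r} - \frac{\partial}{\partial\theta}\right)(r)\right]\frac{\partial}{\partial r} + \left[\left(r\frac{\partial}{\partial r} + 0\frac{\partial}{\partial\theta}\right)(-1) - \left(0\frac{\partial}{\partial r} - \frac{\partial}{\partial\theta}\right)(0)\right]\frac{\partial}{\partial\theta} = 0. \]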
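The contrast between the naive guess and the bracket can also be checked numerically. The sketch below is an illustration only: it replaces exact derivatives by central differences, represents a vector field as a list of component functions in a chart, and evaluates everything at one sample point chosen here. The helper names (`partial`, `apply_field`, `naive`, `bracket`) are not from the script.

```python
import math

def partial(f, x, k, h=1e-5):
    """Central-difference approximation of the k-th partial derivative of f at x."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

def apply_field(V, f, x):
    """V(f) at x, where V is a list of component functions V^i."""
    return sum(V[i](x) * partial(f, x, i) for i in range(len(V)))

def naive(Y, X, x):
    """Components Y(X^i) of the naive guess for differentiating X by Y."""
    return [apply_field(Y, Xi, x) for Xi in X]

def bracket(X, Y, x):
    """Components X(Y^i) - Y(X^i) of the Lie bracket [X, Y]."""
    return [apply_field(X, Y[i], x) - apply_field(Y, X[i], x)
            for i in range(len(X))]

# Cartesian chart U1: X = x^1 d_1 + x^2 d_2, Y = x^2 d_1 - x^1 d_2.
X1 = [lambda x: x[0], lambda x: x[1]]
Y1 = [lambda x: x[1], lambda x: -x[0]]
# Polar chart U2 with coordinates (r, theta): X = r d_r, Y = -d_theta.
X2 = [lambda q: q[0], lambda q: 0.0]
Y2 = [lambda q: 0.0, lambda q: -1.0]

p = [0.6, 0.8]                                # sample point in U1
q = [math.hypot(*p), math.atan2(p[1], p[0])]  # the same point in U2

naive_cart = naive(Y1, X1, p)       # approximately Y at p, which is nonzero
naive_polar = naive(Y2, X2, q)      # the zero vector
bracket_cart = bracket(X1, Y1, p)   # approximately zero
bracket_polar = bracket(X2, Y2, q)  # zero
print(naive_cart, naive_polar, bracket_cart, bracket_polar)
```

The naive guess yields a nonzero vector in one chart and the zero vector in the other, which cannot be equivalent, whereas the bracket components vanish in both charts.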

Whenever we have an expression in charts, it is good to check that in different charts it produces equivalent vectors. We do so in the following lemma. This is not strictly necessary, because the definition $[X,Y](f) = X(Y(f)) - Y(X(f))$ doesn’t use charts, but it is good practice nonetheless.

Lemma 2.36. The above chart expression gives a vector field that is independent of the choice of chart.

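Proof. If $X = X^i(x)\frac{\partial}{\partial x^i}$ and $Y = Y^i(x)\frac{\partial}{\partial x^i}$ are the expressions of two vector fields in the $x$-chart, then the expression of these fields in the $y$-chart is

\[ X = X^j\frac{\partial y^i}{\partial x^j}\frac{\partial}{\partial y^i}, \qquad Y = Y^j\frac{\partial y^i}{\partial x^j}\frac{\partial}{\partial y^i}. \]

In this chart, the $i$th component of $[X,Y]$ is

\begin{align*}
X(\tilde{Y}^i) - Y(\tilde{X}^i) &= X^l\frac{\partial y^k}{\partial x^l}\frac{\partial}{\partial y^k}\left(Y^j\frac{\partial y^i}{\partial x^j}\right) - Y^j\frac{\partial y^k}{\partial x^j}\frac{\partial}{\partial y^k}\left(X^l\frac{\partial y^i}{\partial x^l}\right) \\
&= X^l\frac{\partial}{\partial x^l}\left(Y^j\frac{\partial y^i}{\partial x^j}\right) - Y^j\frac{\partial}{\partial x^j}\left(X^l\frac{\partial y^i}{\partial x^l}\right) \\
&= X^l\frac{\partial Y^j}{\partial x^l}\frac{\partial y^i}{\partial x^j} + X^lY^j\frac{\partial^2 y^i}{\partial x^l\partial x^j} - Y^j\frac{\partial X^l}{\partial x^j}\frac{\partial y^i}{\partial x^l} - Y^jX^l\frac{\partial^2 y^i}{\partial x^j\partial x^l} \\
&= X^l\frac{\partial Y^j}{\partial x^l}\frac{\partial y^i}{\partial x^j} - Y^l\frac{\partial X^j}{\partial x^l}\frac{\partial y^i}{\partial x^j} = \left[X^l\frac{\partial Y^j}{\partial x^l} - Y^l\frac{\partial X^j}{\partial x^l}\right]\frac{\partial y^i}{\partial x^j}.
\end{align*}

In total then, the vector field $[X,Y]$ in the $y$-chart is

\[ [X,Y] = \left[X^l\frac{\partial Y^j}{\partial x^l} - Y^l\frac{\partial X^j}{\partial x^l}\right]\frac{\partial y^i}{\partial x^j}\frac{\partial}{\partial y^i}. \]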

But we see that this is equivalent to the expression in the x-chart. Therefore the definition produces a well-defined vector field. □

There are of course many things that can be said about the Lie bracket. The first observation is that it is $\mathbb{R}$-bilinear and antisymmetric: $[aX + b\tilde{X}, Y] = a[X,Y] + b[\tilde{X},Y]$ and $[X,Y] = -[Y,X]$. If you have two coordinate vector fields, then their Lie bracket is zero. Thus one interpretation of the Lie bracket is that it is a measurement of how far two vector fields are from being coordinate vector fields. The final property that we will give is a product rule: if $f$ is a function, then

\[ [X,fY] = X(f)Y + f[X,Y], \qquad [fX,Y] = -[Y,fX] = -Y(f)X - f[Y,X] = f[X,Y] - Y(f)X. \]
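The first product rule follows directly from the definition of the bracket. As a short derivation, apply both sides to an arbitrary test function $g$ (a symbol introduced here) and use the fact that $X$ obeys the Leibniz rule on a product of functions:

```latex
\begin{align*}
[X,fY](g) &= X\bigl(f\,Y(g)\bigr) - f\,Y\bigl(X(g)\bigr) \\
          &= X(f)\,Y(g) + f\,X\bigl(Y(g)\bigr) - f\,Y\bigl(X(g)\bigr) \\
          &= X(f)\,Y(g) + f\,[X,Y](g).
\end{align*}
```

Since this holds for every $g$, the two vector fields agree. The second rule then follows from antisymmetry.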

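Example 2.37 (Euclidean Space). We give some euclidean examples. Consider the plane $\mathbb{R}^2$. Let $X = \partial_1$ and $Y = \partial_2$. Then plugging in the definitions

\[ [X,Y] = X(Y^i)\partial_i - Y(X^i)\partial_i = X(1)\partial_2 - Y(1)\partial_1 = 0. \]

Next consider $V = (1 + x^2)\partial_1$, where $x^2$ denotes the second coordinate. Then

\[ [V,Y] = V(1)\partial_2 - Y(1 + x^2)\partial_1 = -\partial_1. \]

Finally, set $W = x^2\partial_1 - x^1\partial_2$. Then

\[ [V,W] = V(x^2)\partial_1 - V(x^1)\partial_2 - W(1 + x^2)\partial_1 = 0 - (1 + x^2)\partial_2 - (0 - x^1)\partial_1 = -(1 + x^2)\partial_2 + x^1\partial_1. \]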

1Sources differ: some use second-countable, others use the ‘Lindelöf’ property, which is equivalent in this context, and others use the more general ‘paracompact’ property.

2If you are familiar with topology, the quotient topology makes M a locally euclidean topological space.

3Another common convention is to zero-index the components and have the ±1 in the zeroth component.

4It is necessary to require that $\phi_i(p)$ has an open neighbourhood $U \subseteq U_i$ such that $f[U] \subseteq \tilde{U}_j$. If you omit this condition, it is possible to make an example where $f$ is continuous in every chart but is not continuous on $M$ as a whole. See Lee “Introduction to Smooth Manifolds” Problem 2-1 for an example.