1. Fenchel-Nielsen Coordinates for Teichmuller Space

Here we discuss a very nice set of coordinates for Teichmuller space. The basic idea is that we cut the surface up into small pieces (pairs of pants); hyperbolic structures on these pieces are easy to parameterize, and we also understand the ways we can put these pieces together.

In order to define these coordinates, we first cut the surface up. A pair of pants is a thrice-punctured sphere.

Another way to specify it is that it is a genus {0} surface with euler characteristic {-1} and three boundary components. We can cut any surface up into pairs of pants with simple closed curves. To see this, we can just exhibit a general cutting: slice with {3g-3} “vertical” simple closed curves.
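
The count {3g-3} can be double-checked with Euler characteristic arithmetic: cutting along circles does not change Euler characteristic, a pair of pants has {\chi = -1}, and every cutting curve becomes two cuffs. Here is a minimal Python sketch of that bookkeeping (the function name is ours, purely for illustration), assuming a closed surface of genus {g \ge 2}:

```python
def pants_decomposition_counts(g):
    """Return (number of pairs of pants, number of cutting curves)
    for a closed orientable surface of genus g >= 2."""
    if g < 2:
        raise ValueError("a closed surface needs genus g >= 2 to be cut into pants")
    euler_char = 2 - 2 * g
    num_pants = -euler_char       # each pair of pants contributes chi = -1
    num_cuffs = 3 * num_pants     # three boundary circles per pair of pants
    num_curves = num_cuffs // 2   # every cutting curve is seen as two cuffs
    return num_pants, num_curves


for g in range(2, 6):
    print(g, pants_decomposition_counts(g))
```

So a genus {g} surface is cut into {2g-2} pairs of pants by {3g-3} curves.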

This is not the only way to cut a surface into pairs of pants. For example, with the once-punctured torus any pair of coprime integers gives us a curve which cuts the surface into a pair of pants. We are going to show that a point in Teichmuller space is determined by the lengths of the {3g-3} curves, plus {3g-3} other coordinates, which record the “twisting” of each gluing curve.

Now, given a choice of {3g-3} disjoint simple closed curves {\{\alpha_i\}}, we associate to {(f, \Sigma) \in \mathrm{Teich}(S)} the family of geodesics in {\Sigma} in the homotopy classes of the {f(\alpha_i)}. In each class, there is a unique geodesic, but how do we know the geodesics in {\{f(\alpha_i)\}} are pairwise disjoint?

Lemma 1 Suppose {\{\alpha_i\}} is a family of pairwise disjoint simple closed curves in a hyperbolic surface {\Sigma}, and {\{\gamma_i\}} are the (unique) geodesic representatives in the homotopy classes of the {\alpha_i}. Then:

 

  • The geodesics in {\{\gamma_i\}} are pairwise disjoint simple closed curves.
  • As a family, the {\{\gamma_i\}} are ambient isotopic to {\{\alpha_i\}}.

 

Proof: Consider a loop {\alpha} and its geodesic representative {\gamma}. Suppose that {\gamma} intersects itself. Now {\alpha} and {\gamma} cobound an annulus, which lifts to the universal cover: in the universal cover we must find the lift of the intersection as an intersection between two lifts {\tilde{\gamma}} and {\tilde{\gamma}'}. Because the annulus bounding {\alpha} and {\gamma} lifts to the universal cover, there are two lifts {\tilde{\alpha}} and {\tilde{\alpha}'} of {\alpha} which are uniformly close to {\tilde{\gamma}} and {\tilde{\gamma}'}. We therefore find that {\tilde{\alpha}} and {\tilde{\alpha}'} intersect, which means that {\alpha} intersects itself, which is a contradiction. The same idea shows that the geodesic representatives {\gamma_i} are pairwise disjoint.

To see that they are ambient isotopic as a family, it is easiest to lift the picture to the universal cover. At that point, we just need to “wiggle” everything a little to match up the lifts of the {\alpha_i} and {\gamma_i}. \Box

With the lemma, we see that to a point in Teichmuller space we get {3g-3} pairwise disjoint simple closed geodesics, which gives us {3g-3} positive coordinates, namely, the lengths of these curves. We might wonder: what triples of positive numbers can arise as the lengths of the boundary curves in hyperbolic pairs of pants? It turns out that:

Lemma 2 There exists a unique hyperbolic pair of pants with cuff lengths {(l_1, l_2, l_3)}, for any {l_1, l_2, l_3 > 0}. Cuff lengths here refers to the lengths of the three boundary components.

Proof: We will now prove the lemma, which involves a little discussion. Suppose we are given a hyperbolic pair of pants. We can double it to obtain a genus two surface:

The {\alpha} curves are shown in red, and representatives of the other isotopy class fixed by the involution are in blue.

There is an involution (rotation around a skewer stuck through the surface horizontally) which fixes the (glued up) boundaries of the pairs of pants. This involution also fixes the isotopy classes of three other disjoint simple closed curves, and there is a unique geodesic {\beta_i} in these isotopy classes. Since the {\beta_i} are fixed by the involution, they must intersect the {\alpha_i} at right angles. If we cut along the {\alpha_i} to get (two copies of) our original pair of pants, we have found that there is a unique triple of geodesics {\beta_i} which meet the boundaries at right angles:

Cutting along the {\beta_i}, we get two hyperbolic hexagons:

We will prove in a moment that there is a unique hyperbolic right-angled hexagon with three alternating edge lengths specified. In particular, there is a unique hyperbolic right-angled hexagon with alternating edge lengths {(l_1/2, l_2/2, l_3/2)}. Since there is a unique way to glue up the hexagons to obtain our original {(l_1, l_2, l_3)} pair of pants, there is a unique hyperbolic pair of pants with the specified cuff lengths. \Box

Lemma 3 There is a unique hyperbolic right-angled hexagon with alternating edge lengths {(l_1, l_2, l_3)}.

Proof: Pick some geodesic {g_1} and some point {x_1} on it. We will show the hexagon is now determined, and since we can map a point on a geodesic to any other point on a geodesic, the hexagon will be unique up to isometry. Draw a geodesic segment of length {l_1} at right angles from {x_1}. Call the other end of this segment {x_2}. There is a unique geodesic {g_2} passing through {x_2} at right angles to the segment. Pick some point {x_3} on {g_2} at length {y} from {x_2} (we will be varying {y}). From {x_3} there is a unique geodesic segment of length {l_2} at right angles to {g_2}; call its endpoint {x_4}. There is a unique geodesic {g_3} through {x_4} at right angles to this segment. Now, there is a unique geodesic segment at right angles to {g_1} and {g_3}. Of course, the length {z} of this segment depends on {y}.

The length {z} varies continuously and monotonically with {y}: if we make {y} large, then {z} becomes large, and there is some positive {y} at which {z} goes to {0}. Therefore, for larger {y}, the length {z} takes every positive value exactly once, and there is a unique length {y} making {z = l_3}. We have now determined the hexagon, and, up to isometry, all of our choices were forced, so there is only one. \Box
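
To make the dependence of {z} on {y} concrete, here is a small Python sketch. It assumes the standard cosine law for right-angled hyperbolic hexagons, {\cosh z = \sinh l_1 \sinh l_2 \cosh y - \cosh l_1 \cosh l_2}, which is not derived in this post; the function names are ours. With that formula, {z} is visibly increasing in {y}, and we can solve for the {y} which produces a given {l_3}.

```python
import math


def opposite_side(l1, l2, y):
    """Length z of the common perpendicular to g1 and g3, assuming the
    right-angled hexagon cosine law (an assumption, not proved in the post)."""
    c = math.sinh(l1) * math.sinh(l2) * math.cosh(y) - math.cosh(l1) * math.cosh(l2)
    if c < 1.0:
        # y is too small: g1 and g3 cross or are asymptotic, so no hexagon exists
        raise ValueError("no common perpendicular for this value of y")
    return math.acosh(c)


def solve_for_y(l1, l2, l3):
    """The unique y > 0 for which the hexagon has alternating sides l1, l2, l3."""
    c = (math.cosh(l3) + math.cosh(l1) * math.cosh(l2)) / (math.sinh(l1) * math.sinh(l2))
    return math.acosh(c)


y = solve_for_y(1.0, 2.0, 3.0)
print(y, opposite_side(1.0, 2.0, y))  # the second number should be close to 3.0
```

Under this assumption the threshold where {z} reaches {0} is the value of {y} with {\cosh y = (1 + \cosh l_1 \cosh l_2)/(\sinh l_1 \sinh l_2)}.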

Since there is a unique hyperbolic pair of pants with specified cuff lengths, when we cut our surface of interest {S} up into pairs of pants, we get a map {\mathrm{Teich}(S) \rightarrow (\mathbb{R}^+)^{3g-3}} which takes a point {(f, \Sigma)} to the {3g-3} lengths of the curves cutting {S} into pairs of pants. This map is not injective: the fiber over a point is all the ways to glue together the pairs of pants.

The issue is that when we want to glue two {\alpha} curves together, we have to decide whether to twist them at all before gluing. Up to isometry, there are {\mathbb{R}/\mathbb{Z}} ways to glue these curves together (all the angles). However, in (marked) Teichmuller space, there are {\mathbb{R}} ways to glue it up. Draw another curve {\beta} (this {\beta} is not the same as the {\beta_i} before). The marking on {S} lets us observe what happens to {\beta} under {f}, and we can see that twisting the pairs of pants around {\alpha} results in nontrivial movement in Teichmuller space.

The twist above results in the following new {\beta} curve:

The length of {\beta} determines how twisted the gluing is, since twisting requires increasing its length. That is, given the image of {\beta}, there is a unique way to untwist it to get a minimum length. This tells us how twisted the original gluing was.

To understand the twisting around all the {3g-3} curves in {S}, we must pick another {3g-3} curves; one simple way is to declare that {\beta} looks like the above pictures if we are gluing two distinct pairs of pants, and like this:

if we are gluing a pair of pants to itself. This construction gives us a global homeomorphism

\displaystyle \mathrm{Teich}(S) \rightarrow (\mathbb{R}^+)^{3g-3} \times \mathbb{R}^{3g-3} \cong \mathbb{R}^{6g-6}

Here is an example of a choice of {\alpha} and {\beta} curves. The {\beta} curves get a little messy in the middle: try to fit the pictures above into the context of the one below to see that they are correct.

1.1. A Symplectic Form on Moduli Space

The length and twist coordinates {l_i} and {t_i} are not well-defined on Moduli space, but their derivatives are: define the 2-form on Teichmuller space

\displaystyle \omega = \sum_i dl_i \wedge dt_i

It is a theorem of Wolpert that this 2-form is independent of the choice of coordinates, so it descends to a 2-form on Moduli space. It is very useful that Moduli space is symplectic.

This post introduces Teichmuller and Moduli space. The upcoming posts will talk about Fenchel-Nielsen coordinates for Teichmuller space; it’s split up because I figured this was a relatively nice break point. Hopefully, I will later add some pictures to this post.

1. Uniformization

This section starts to talk about Teichmuller space and related stuff. First, we recall the uniformization theorem:

If {S} is a closed surface (Riemannian manifold), then there is a unique* metric of constant curvature in its conformal class. The asterisk * refers to the fact that the metric is unique if we require that it has curvature {\pm 1}. If {\chi(S)=0}, then the metric has curvature zero and it is unique up to euclidean similarities.

2. Teichmuller and Moduli Space of the Torus

Let us see what we can conclude about flat metrics on the torus. We would like to classify them in some way. Choose two straight curves {\alpha} and {\beta} on the torus intersecting once (a longitude and a meridian) and cut along these curves. We obtain a parallelogram which can be glued up along its edges to retrieve the original torus. This parallelogram lives/embeds in {\mathbb{C}}, and, by composing the embedding with euclidean similarities, we may assume that the bottom left corner is at {0} and the bottom right is {1}. The parallelogram is therefore determined by where the upper left hand corner is: some complex number {z} with {\mathrm{Im}(z) > 0}. Notice that this is the upper half-plane, which we can think of as hyperbolic space. Therefore, there is a bijection:

{ Tori with two chosen loops up to euclidean similarity } {\leftrightarrow} { {z \in \mathbb{C} \, | \, \mathrm{Im}(z) > 0} }

This set is called the Teichmuller space of the torus. We don’t really care about the loops {\alpha} and {\beta}, so we’d like to find a group which takes one choice of loops to another and acts transitively. The quotient of this will be the set of flat metrics on the torus up to isometry, which is known as Moduli space.

We are interested in the mapping class group of the torus, which is defined to be

\displaystyle \mathrm{MCG}(T^2) = \mathrm{Homeo}^+(T^2) / \mathrm{Homeo}_\circ(T^2)

where {\mathrm{Homeo}_\circ(T^2)} denotes the connected component of the identity. That is, the mapping class group is the group of homeomorphisms (homotopy equivalences), up to isotopy (homotopy). The reason for the parentheses is that for surfaces, we may replace homeomorphism and isotopy by homotopy equivalence and homotopy, and we will get the same group (these categories are equivalent for surfaces).

To find {\mathrm{MCG}(T^2)}, think of the torus as the unit square in {\mathbb{R}^2} spanned by the standard unit basis vectors. Then a homeomorphism of {T^2} must send the integer lattice to itself, so the standard basis must go to a basis for this lattice, and the transformation must preserve the area of the torus. Up to isotopy, these are just the linear maps of determinant {1} (not {-1} because we want orientation-preserving) preserving the integer lattice, that is, {\mathrm{SL}(2,\mathbb{Z})}. Since we care about the lattice only up to scale, {-I} acts trivially, so the group that acts effectively is {\mathrm{PSL}(2,\mathbb{Z})}.

Using the bijection above, the mapping class group of the torus acts on {\{ z \in \mathbb{C} \, | \, \mathrm{Im}(z) > 0 \}}, and this action is

\displaystyle \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] z = \frac{az + b}{cz+d}

This action is probably familiar to you from complex analysis.

In summary, the Teichmuller space of the torus is (can be represented as) {\{ z \in \mathbb{C} \, | \, \mathrm{Im}(z) > 0 \}}, and the mapping class group acts on this space through {\mathrm{PSL}(2,\mathbb{Z})}, and the quotient of this action is the set of flat metrics up to isometry, which is Moduli space. What is the quotient? A fundamental region for the action is the set

\displaystyle \{ z\in\mathbb{C} \,\, |\,\, \mathrm{Im}(z) > 0, \, |\mathrm{Re}(z)| \le \frac{1}{2}, \, |z| \ge 1\}

Which is glued to itself by a flip in the {y} axis. The resulting Moduli space is an orbifold: one point is ideal and goes off to infinity, one point looks locally like {\mathbb{R}^2} quotiented by a rotation of {\frac{2\pi}{3}}, and the other point looks like {\mathbb{R}^2} quotiented by a rotation of {\pi}.
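
To see the fundamental region in action, here is a minimal Python sketch which moves any point of the upper half-plane into it. It assumes the standard fact (not proved in this post) that the action is generated by the translation {z \mapsto z+1} and the inversion {z \mapsto -1/z}; the function names are ours.

```python
def mobius(a, b, c, d, z):
    """Action of the matrix [[a, b], [c, d]] on z by (az + b) / (cz + d)."""
    return (a * z + b) / (c * z + d)


def reduce_to_fundamental_region(z, max_steps=10000):
    """Move z (with Im z > 0) into the region |Re z| <= 1/2, |z| >= 1,
    using only integer translations and the inversion z -> -1/z."""
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half-plane")
    for _ in range(max_steps):
        # translate by an integer so that |Re z| <= 1/2
        z = complex(z.real - round(z.real), z.imag)
        if abs(z) >= 1:
            return z
        # the inversion increases Im z whenever |z| < 1
        z = mobius(0, -1, 1, 0, z)
    raise RuntimeError("did not converge; z is probably too close to the real axis")


print(reduce_to_fundamental_region(complex(0.3, 0.01)))
```

Two points of the upper half-plane which reduce to the same point (up to the boundary gluing) represent the same flat torus with different choices of loops.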

3. Teichmuller Space and Moduli Space for Negatively Curved Surfaces

Now we will go through a similar process for closed, boundaryless, oriented surfaces of negative Euler characteristic. It is possible to do this for surfaces with boundary, etc, but for simplicity, we will stick to multi-holed tori (this is what closed, boundaryless, oriented surfaces of negative Euler characteristic are) for now.

We start with a topological surface {S}. Topological meaning we do not associate with it a metric. We want to classify the hyperbolic metrics we could give to {S}. Define Teichmuller space {\mathrm{Teich}(S)} to be the set of equivalence classes of pairs {(f, \Sigma)} where {\Sigma} is a hyperbolic surface and {f: S \rightarrow \Sigma} is a homotopy equivalence. As mentioned earlier, anywhere “homotopy equivalence” appears here, you may replace it with “homeomorphism” as long as you replace “homotopy” with “isotopy.” The equivalence relation on pairs is the following: {(f, \Sigma_1) \sim (g, \Sigma_2)} iff there exists an isometry {i: \Sigma_1 \rightarrow \Sigma_2} such that {i \circ f} is homotopic to {g}.

Define the Moduli space {\mathcal{M}(S)} of {S} to be isometry classes of surfaces {\Sigma} which are homotopy equivalent to {S}. There is an obvious map {\mathrm{Teich}(S) \rightarrow \mathcal{M}(S)} defined by mapping {(f, \Sigma) \mapsto \Sigma}, and this map respects the equivalence relations, because if {(f, \Sigma_1) \sim (g, \Sigma_2)}, then {\Sigma_1} is isometric to {\Sigma_2} (since it is isometric by an isometry commuting with {f} and {g}).

As with the torus, define the mapping class group {\mathrm{MCG}(S)} to be the group of homotopy equivalences of {S} with itself, up to homotopy. Then {\mathrm{MCG}(S)} acts on {\mathrm{Teich}(S)} by {\varphi \cdot (f,\Sigma) = (f \circ \varphi, \Sigma)}. The quotient of {\mathrm{Teich}(S)} by this action is {\mathcal{M}(S)}: clearly we never identify surfaces which are not isometric, and if {i : \Sigma_1 \rightarrow \Sigma_2} is an isometry, and {(f,\Sigma_1)}, {(g,\Sigma_2)} are points in Teichmuller space with any {f,g}, then notice {f} has an inverse (up to homotopy), so if we act on {(f,\Sigma_1)} by {f^{-1}\circ g}, we get {(f\circ f^{-1}\circ g, \Sigma_1)}, which is the same point in {\mathrm{Teich}(S)} as {(g,\Sigma_2)}. We are abusing notation here, because we are thinking of {\Sigma_1}, {\Sigma_2} and {S} as the same surface (which they are, topologically). The point is that by acting by {\mathrm{MCG}(S)} we can rearrange {S} so that after mapping by {i \circ f} we are homotopic to {g}. The result of this is that

\displaystyle \mathrm{Teich}(S) / \mathrm{MCG}(S) \cong \mathcal{M}(S)

A priori, we are interested in hyperbolic metrics on {S} up to isometry — Moduli space. The reason for defining Teichmuller space is that Moduli space is rather complicated. Teichmuller space, on the other hand, will turn out to be as nice as you could want ({\mathbb{R}^{6g-6}} for a genus {g} surface). By studying the very nice Teichmuller space plus the less-nice-but-still-understandable mapping class group, we can approach Moduli space.

4. Coordinates for Teichmuller Space

Now we will take a closer look at Teichmuller space and give it coordinates.

4.1. Very Overdetermined (But Easy) Coordinates

One way to give this space coordinates is the following. Let us choose a homotopy class of loop in {S} (this is a conjugacy class in {\pi_1(S)}), and we’ll represent this class by the loop {\gamma : S^1 \rightarrow S}. Given a point {(f,\Sigma) \in \mathrm{Teich}(S)}, there is a unique geodesic representative in the free homotopy class of the loop {f\circ \gamma}. Define {l_\gamma(f,\Sigma) = \mathrm{length}(f\circ \gamma)} to be the length of this representative. Let {C} be the set of conjugacy classes in {\pi_1(S)}. Then we have defined a map

\displaystyle l : \mathrm{Teich}(S) \rightarrow \mathbb{R}^C

by

\displaystyle (f,\Sigma) \mapsto (l_\gamma(f,\Sigma))_\gamma

This is nice in the sense that it’s a real vector space, but not nice in that it’s infinite dimensional. We will see that we only need a finite number of dimensions.

4.2. Dimension Counting

Method 1

Let’s try to count the dimension of {\mathrm{Teich}(S)}. Suppose that {S} has genus {g}. We can obtain {S} by gluing the edges of a {4g}-gon in pairs (going counterclockwise, the labels read {a_1}, {b_1}, {a_1^{-1}}, {b_1^{-1}}, {a_2} …, {a_g}, {b_g}, {a_g^{-1}}, {b_g^{-1}}). Since we will be giving {S} a hyperbolic metric, let us look at what this tells us about this polygon. We have a hyperbolic polygon; in order to glue it up, we must have

  1. The paired sides must have equal length.
  2. The corner angles must add to {2\pi}.

For a triangle in hyperbolic space, the edge lengths are enough to specify the triangle up to isometry. Similarly, for a hyperbolic 4-gon (quadrilateral), we need all the exterior edge lengths, plus 1 angle (the angle gives the length of a diagonal). By induction, an {n}-gon needs {n} side lengths and {n-3} angles. For our {4g}-gon, then, we need to specify {4g} side lengths and {4g-3} angles. This is {8g-3} dimensions. However, we have {2g} pairs, each of which gives a constraint, plus our single constraint about the angle sum. This reduces our dimension to {6g-4}. Finally, we made an arbitrary choice about where the vertex of this polygon was in our surface. This is an extra two dimensions that we don’t care about (we disregard those coordinates), so we have {6g-6} dimensions.

Method 2

A marked hyperbolic structure on {S} gives a {\pi_1(S)}-equivariant isometry {\widetilde{\Sigma} \rightarrow \mathbb{H}^2}. That is, an element of {\mathrm{Teich}(S)} is {(f,\Sigma)}, which tells us how to map {\pi_1(S)} isomorphically onto {\pi_1(\Sigma)}, which is the same as the deck group of the universal cover {\widetilde{\Sigma}}, which is {\mathbb{H}^2}. Therefore, to an element of {\mathrm{Teich}(S)} is associated a discrete faithful representation of {\pi_1(S)} to {\mathrm{PSL}(2,\mathbb{R})}, the group of isometries of {\mathbb{H}^2}, and this representation is unique up to conjugacy (if we conjugate the image of the representation, then the quotient manifold is the same). The dimension of {\mathrm{Teich}(S)} is therefore the dimension of the space of representations of {\pi_1(S)} in {\mathrm{PSL}(2,\mathbb{R})} up to conjugacy.

The fundamental group of {S} has a nice presentation in terms of the polygon we can glue up to make it; the interior of the polygon gives us a single relation:

\displaystyle \pi_1(S) = \langle a_1, b_1, \cdots, a_g, b_g \,| \, \prod_i [a_i,b_i]\rangle

So {\mathrm{Hom}(\pi_1(S), \mathrm{PSL}(2,\mathbb{R}))} is the subset of {\mathrm{Hom}(F_{2g}, \mathrm{PSL}(2,\mathbb{R}))} such that {\prod_i [a_i,b_i] = 1} (here {F_{2g}} is the free group on {2g} generators, which is what we get if we forget the single relation). Now a representation in {\mathrm{Hom}(F_{2g}, \mathrm{PSL}(2,\mathbb{R}))} is completely free: we can send the generators anywhere we want, so

\displaystyle \mathrm{Hom}(F_{2g}, \mathrm{PSL}(2,\mathbb{R})) \cong \left( \mathrm{PSL}(2,\mathbb{R}) \right)^{2g}

Since {\mathrm{PSL}(2,\mathbb{R})} is 3-dimensional, the right hand side is a real manifold of dimension {6g}. Insisting that {\prod_i [a_i,b_i]} map to {1} is a 3-dimensional constraint (it gives 4 equations, when you think of it as a matrix equation, but there is an implied equation already taken into account). Therefore we expect that {\mathrm{Hom}(\pi_1(S), \mathrm{PSL}(2,\mathbb{R}))} will be {6g-3} dimensional. However, we are interested in representations up to conjugacy, so this removes another 3 dimensions, giving us the same dimension estimate for {\mathrm{Teich}(S)} as {6g-6} dimensional.
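
The two counts can be laid side by side in a few lines of Python. This is only bookkeeping for the heuristics above, not a proof, and all the function names are ours. The last part illustrates the “implied equation”: a commutator of determinant {1} matrices automatically has determinant {1}, so the matrix equation {\prod_i [a_i,b_i] = 1} only imposes 3 independent conditions.

```python
import random


def teich_dim_by_polygon(g):
    """Method 1: parameters of a hyperbolic 4g-gon minus the gluing constraints."""
    n = 4 * g
    parameters = n + (n - 3)   # n side lengths and n - 3 angles, i.e. 8g - 3
    constraints = 2 * g + 1    # 2g side pairings plus the angle sum condition
    vertex_position = 2        # where the polygon's vertex sits on the surface
    return parameters - constraints - vertex_position


def teich_dim_by_representations(g):
    """Method 2: (PSL(2,R))^{2g}, minus the relation, minus conjugation."""
    return 3 * (2 * g) - 3 - 3


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def inverse_det_one(A):
    """Inverse of a 2x2 matrix which is assumed to have determinant 1."""
    (a, b), (c, d) = A
    return [[d, -b], [-c, a]]


def random_det_one(rng):
    """A random 2x2 real matrix with determinant 1 (illustrative sampling only)."""
    a = rng.uniform(0.5, 2.0)
    b = rng.uniform(-1.0, 1.0)
    c = rng.uniform(-1.0, 1.0)
    return [[a, b], [c, (1.0 + b * c) / a]]


def commutator(A, B):
    return mat_mul(mat_mul(A, B), mat_mul(inverse_det_one(A), inverse_det_one(B)))


rng = random.Random(0)
A, B = random_det_one(rng), random_det_one(rng)
print(det(commutator(A, B)))  # close to 1, whatever A and B are
print([(g, teich_dim_by_polygon(g), teich_dim_by_representations(g)) for g in (2, 3, 4)])
```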

In this post, I will cover triangles and area in spaces of constant (nonzero) curvature. We are focused on hyperbolic space, but we will talk about spheres and the Gauss-Bonnet theorem.

1. Triangles in Hyperbolic Space

Suppose we are given 3 points in hyperbolic space {\mathbb{H}^n}. A triangle with these points as vertices is a set of three geodesic segments with these three points as endpoints. The fact that there is a unique triangle requires a (brief) proof. Consider the hyperboloid model: three points on the hyperboloid determine a unique 3-dimensional real subspace of {\mathbb{R}^{n+1}} which contains these three points plus the origin. Intersecting this subspace with the hyperboloid gives a copy of {\mathbb{H}^2}, so we only have to check there is a unique triangle in {\mathbb{H}^2}. For this, consider the Klein model: triangles are euclidean triangles, so there is only one with a given three vertices.

In hyperbolic space, it is still true that knowing enough side lengths and/or angles of a triangle determines it. For example, knowing two side lengths and the angle between them determines the triangle. Similarly, knowing all the angles determines it. However, not every set of angles can be realized (in euclidean space, for example, the angles must add to {\pi}), and the inequalities which must be satisfied are more complicated for hyperbolic space.

2. Ideal Triangles and Area Theorems

We can think about moving one (or more) of the points of a hyperbolic triangle off to infinity (the boundary of the disk). An ideal triangle is one with all three “vertices” (the vertices do not exist in hyperbolic space) on the boundary. Using a conformal map of the disk (which is an isometry of hyperbolic space), we can move any three points on the boundary to any other three points, so up to isometry, there is only one ideal triangle. We have fixed our metric, so we can find the area of this triangle. The logically consistent way to find this is with an integral since we will use this fact in our proof sketch of Gauss-Bonnet, but as a remark, suppose we know Gauss-Bonnet. Imagine a triangle very close to ideal. The curvature is {-1}, and the euler characteristic is {1}. The sum of the exterior angles is just slightly under {3\pi}, so using Gauss-Bonnet, the area is very close to {\pi}, and goes to {\pi} as we push the vertices off to infinity.

One note is that suppose we know what the geodesics are, and we know what the area of an ideal triangle is (suppose we just defined it to be {\pi} without knowing the curvature). Then by pasting together ideal triangles, as we will see, we could find the area of any triangle. That is, really the key to understanding area is knowing the area of an ideal triangle.

As mentioned above, there is a single triangle, up to isometry, with given angles, so denote the triangle with angles {\alpha, \beta, \gamma} by {\Delta(\alpha, \beta, \gamma)}.

2.1. Area

Knowing the area of an ideal triangle allows us to calculate the area of any triangle. In fact:

Theorem 1 (Gauss) {\mathrm{area}(\Delta(\alpha, \beta, \gamma)) = \pi - (\alpha + \beta + \gamma)}

This geometric proof relies on the fact that the angles in the Poincare model are the euclidean angles in the model. Consider the generic picture:

We have extended the sides of {\Delta(\alpha, \beta, \gamma)} and drawn the ideal triangle containing these geodesics. Since the angles are what they look like, we know that the area of {\Delta(\alpha,\beta,\gamma)} is the area of the ideal triangle ({\pi}), minus the sum of the areas of the smaller triangles with two points at infinity:

\displaystyle \mathrm{area}(\Delta(\alpha, \beta, \gamma)) = \pi - \mathrm{area}(\Delta(\pi-\alpha, 0,0)) - \mathrm{area}(\Delta(\pi-\beta, 0, 0)) - \mathrm{area}(\Delta(\pi-\gamma, 0, 0))

Thus it suffices to show that {\mathrm{area}(\Delta(\pi - \alpha, 0, 0)) = \alpha}.

For this fact, we need another picture:

Define {f(\alpha) = \mathrm{area}(\Delta(\pi-\alpha, 0, 0))}. The picture shows that the area of the left triangle (with two vertices at infinity and one near the origin) plus the area of the right triangle is the area of the top triangle plus the area of the (ideal) bottom triangle:

\displaystyle f(\alpha) + f(\beta) = f(\alpha+\beta-\pi) + \pi

We also know some boundary conditions on {f}: we know {f(0) = 0} (this is a degenerate triangle) and {f(\pi) = \pi} (this is an ideal triangle). We therefore conclude that

\displaystyle f(\frac{\pi}{2}) + f(\frac{\pi}{2}) = f(0) + \pi \qquad \Rightarrow \qquad f(\frac{\pi}{2}) = \frac{\pi}{2}

Similarly,

\displaystyle 2f(\frac{3\pi}{4}) = f(\frac{\pi}{2}) + \pi \qquad \Rightarrow \qquad f(\frac{3\pi}{4}) = \frac{3\pi}{4}

And we can find {f(\pi/4) = \pi/4} by observing that

\displaystyle f(\frac{3\pi}{4}) + f(\frac{\pi}{2}) = f(\frac{\pi}{4}) + \pi

Similarly, if we know {f(\frac{k\pi}{2^n}) = \frac{k\pi}{2^n}}, then

\displaystyle f(\frac{(2^{n+1}-1)\pi}{2^{n+1}}) = \frac{(2^{n+1}-1)\pi}{2^{n+1}}

And by subtracting {\pi/2^n}, we find that {f(\frac{k\pi}{2^{n+1}}) = \frac{k\pi}{2^{n+1}}}. By induction, then, {f(\alpha) =\alpha} if {\alpha} is a dyadic rational times {\pi}. This is a dense set, so we know {f(\alpha) = \alpha} for all {\alpha \in [0,\pi]} by continuity. This proves the theorem.

3. Triangles On Spheres

We can find a similar formula for triangles on spheres. A lune is a wedge of a sphere:

A lune.

Since the area of a lune is proportional to the angle at the peak, and the lune with angle {2\pi} has area {4\pi}, the lune {L(\alpha)} with angle {\alpha} has area {2\alpha}. Now consider the following picture:

Notice that each corner of the triangle gives us two lunes (the lunes for {\alpha} are shown) and that there is an identical triangle on the rear of the sphere. If we add up the area of all 6 lunes associated with the corners, we get the total area of the sphere, plus twice the area of both triangles since we have triple-counted them. In other words:

\displaystyle 4\pi + 4\mathrm{area}(\Delta(\alpha, \beta,\gamma)) = 2L(\alpha) + 2L(\beta) + 2L(\gamma) = 4(\alpha + \beta + \gamma)

Solving,

\displaystyle \mathrm{area}(\Delta(\alpha, \beta,\gamma)) = \alpha + \beta + \gamma - \pi

4. Gauss-Bonnet

If we encounter a triangle {\Delta} of constant curvature {K(\Delta)}, then we can scale the problem to one of the two formulas we just computed, so

\displaystyle \mathrm{area}(\Delta) = \frac{\sum \mathrm{angles} - \pi}{K(\Delta)}
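
As a quick sanity check on the scaled formula, here is a small Python sketch (the function name is ours) which evaluates it and rejects angle triples that cannot occur for the given sign of curvature:

```python
import math


def triangle_area(alpha, beta, gamma, curvature):
    """Area of a geodesic triangle with the given angles in constant curvature."""
    if curvature == 0:
        raise ValueError("in the flat case the angles do not determine the area")
    area = (alpha + beta + gamma - math.pi) / curvature
    if area < 0:
        raise ValueError("these angles cannot occur with this sign of curvature")
    return area


# ideal hyperbolic triangle: all angles zero, curvature -1
print(triangle_area(0.0, 0.0, 0.0, -1.0))
# one octant of the unit sphere: three right angles, curvature +1
print(triangle_area(math.pi / 2, math.pi / 2, math.pi / 2, 1.0))
```

The first value is {\pi}, matching the ideal triangle, and the second is {\pi/2}, which is one eighth of the area {4\pi} of the unit sphere.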

This formula allows us to give a slightly handwavy, but accurate, proof of the Gauss-Bonnet theorem, which relates topological information (Euler characteristic) to geometric information (area and curvature). The proof will precede the statement, since this is really a discussion.

Suppose we have any closed Riemannian manifold (surface) {S}. The surface need not have constant curvature. Suppose for the time being it has no boundary. Triangulate it with very small triangles {\Delta_i} such that {\mathrm{area}(\Delta_i) \sim \epsilon^2} and {\mathrm{diameter}(\Delta_i) \sim \epsilon}. Then since the deviation between the curvature and the curvature at the midpoint {K_\mathrm{midpoint}} is {o(\epsilon^2)} times the distance from the midpoint,

\displaystyle \int_{\Delta_i} K d\mathrm{area} = K_\mathrm{midpoint}\cdot \mathrm{area}(\Delta_i) + o(\epsilon^3)

For each triangle {\Delta_i}, we can form a comparison triangle {\Delta^c_i} with the same edge lengths and constant curvature {K_\mathrm{midpoint}}. Using the formula from the beginning of this section, we can rewrite the right hand side of the formula above, so

\displaystyle \int_{\Delta_i} K d\mathrm{area} = \sum_{\Delta_i^c} \mathrm{angles} - \pi + o(\epsilon^3)

Now since the curvature deviates by {o(\epsilon^2)} times the distance from the midpoint, the angles in {\Delta_i} deviate from those in {\Delta_i^c} just slightly:

\displaystyle \sum_{\Delta_i} \mathrm{angles} = \sum_{\Delta_i^c} \mathrm{angles} + o(\epsilon^3)

So we have

\displaystyle \int_{\Delta_i} K d\mathrm{area} = \sum_{\Delta_i} \mathrm{angles} - \pi + o(\epsilon^3)

Therefore, summing over all triangles,

\displaystyle \int_{S} K d\mathrm{area} = \sum_i \left[ \sum_{\Delta_i} \mathrm{angles} - \pi \right] + o(\epsilon)

The right hand side is just the total angle sum, minus {\pi} for each triangle. Since the angle sum around each vertex in the triangulation is {2\pi},

\displaystyle \sum_i \left[ \sum_{\Delta_i} \mathrm{angles} - \pi \right] = 2\pi V - \pi T

Where {V} is the number of vertices, and {T} is the number of triangles. The number of edges, {E}, can be calculated from the number of triangles, since there are {3} edges for each triangle, and they are each double counted, so {E = \frac{3}{2} T}. Rewriting the equation,

\displaystyle \int_{S} K d\mathrm{area} = 2\pi (V - \frac{1}{2}T) + o(\epsilon) = 2\pi (V - E + T) + o(\epsilon) = 2\pi\chi(S) + o(\epsilon)

Taking the mesh size {\epsilon} to zero, we get the Gauss-Bonnet theorem {\int_S K d\mathrm{area} = 2\pi\chi(S)}.
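
The combinatorial heart of the argument is the identity {2\pi V - \pi T = 2\pi(V - E + T)} when {E = \frac{3}{2}T}. Here is a short Python sketch checking it on a few familiar triangulations; the vertex and triangle counts for the octahedron, the icosahedron and the 7-vertex torus are standard facts we are assuming, and the function names are ours.

```python
import math


def total_angle_excess(V, T):
    """Sum over all triangles of (angle sum - pi), using only the fact
    that the angles around each vertex add up to 2*pi."""
    return 2 * math.pi * V - math.pi * T


def euler_characteristic(V, T):
    """V - E + T for a closed triangulated surface, where E = 3T/2."""
    if (3 * T) % 2 != 0:
        raise ValueError("a closed triangulated surface has an even number of triangles")
    E = 3 * T // 2
    return V - E + T


# (name, vertices, triangles): assumed standard triangulations
examples = [("octahedron", 6, 8), ("icosahedron", 12, 20), ("7-vertex torus", 7, 14)]
for name, V, T in examples:
    print(name, euler_characteristic(V, T), total_angle_excess(V, T) / (2 * math.pi))
```

For the octahedral triangulation of the round unit sphere, each of the 8 triangles has three right angles, so each contributes {\pi/2} and the total is {4\pi = 2\pi \cdot 2}, as it should be.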

4.1. Variants of Gauss-Bonnet

  • If {S} is compact with totally geodesic boundary, then the formula still holds, which can be shown by doubling the surface, applying the theorem to the doubled surface, and finding that euler characteristic also doubles.
  • If {S} has geodesic boundary with corners, then

    \displaystyle \int_S K d\mathrm{area} + \sum_\mathrm{corners} \mathrm{turning angle} = 2\pi\chi(S)

    where the turning angle is the angle you would turn tracing the shape from the outside. That is, it is {\pi - \alpha}, where {\alpha} is the interior angle.

     

  • Most generally, if {S} has smooth boundary with corners, then we can approximate the boundary with totally geodesic segments; taking the length of these segments to zero gives us geodesic curvature ({k_g}):

    \displaystyle \int_S K d\mathrm{area} + \sum_\mathrm{corners} \mathrm{turning angle} + \int_{\partial S} k_g d\mathrm{length} = 2\pi\chi(S)

4.2. Examples

  • The Euler characteristic of the round disk in the plane is {1}, and the disk has zero curvature, so {\int_{\partial S} k_g d\mathrm{length} = 2\pi}. The geodesic curvature is constant, and the circumference is {2\pi r}, so {2\pi r k_g = 2\pi}, so {k_g = 1/r}.
  • A polygon in the plane has neither curvature nor geodesic curvature (its edges are straight), so {\sum_\mathrm{corners} (\pi - \mathrm{angle}) = 2\pi}.
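
The polygon example is easy to test numerically. Here is a minimal Python sketch (the function name is ours) which computes the signed turning angle at each corner of a simple polygon whose vertices are listed counterclockwise; the turning angles should add up to {2\pi} even when the polygon is not convex.

```python
import math


def turning_angles(vertices):
    """Signed turning angle (pi minus interior angle) at each corner of a
    polygon, given its vertices in counterclockwise order."""
    n = len(vertices)
    angles = []
    for i in range(n):
        x0, y0 = vertices[i - 1]
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        ux, uy = x1 - x0, y1 - y0   # incoming edge direction
        vx, vy = x2 - x1, y2 - y1   # outgoing edge direction
        # signed angle from the incoming edge to the outgoing edge
        angles.append(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy))
    return angles


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(sum(turning_angles(square)) / math.pi)
print(sum(turning_angles(l_shape)) / math.pi)
```

At the reflex corner of the L-shape the turning angle is negative, which is exactly what is needed for the sum to come out to {2\pi}.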

The Gauss-Bonnet theorem constrains the geometry in any space with nonzero curvature. This is the “reason” similarities which don’t preserve length and/or area exist in euclidean space; it has curvature zero.

I am Alden, one of Danny’s students. Error/naivete that may (will) be found here is mine. In these posts, I will attempt to give notes from Danny’s class on hyperbolic geometry (157b). This first post covers some models for hyperbolic space.

1. Models

We have a very good natural geometric understanding of {\mathbb{E}^3}, i.e. 3-space with the euclidean metric. Pretty much all of our geometric and topological intuition about manifolds (Riemannian or not) comes from finding some reasonable way to embed or immerse them (perhaps locally) in {\mathbb{E}^3}. Let us look at some examples of 2-manifolds.

  • Example (curvature = 1) {S^2} with its standard metric embeds in {\mathbb{E}^3}; moreover, any isometry of {S^2} is the restriction of (exactly one) isometry of the ambient space (this group of isometries being {O(3)}). We could not ask for anything more from an embedding.
  • Example (curvature = 0) Planes embed similarly.
  • Example (curvature = -1) The pseudosphere gives an example of an isometric embedding of a manifold with constant curvature -1. Consider a person standing in the plane at the origin. The person holds a string attached to a rock at {(0,1)}, and they proceed to walk due east dragging the rock behind them. The movement of the rock is always straight towards the person, and its distance is always 1 (the string does not stretch). The curve traced out by the rock is a tractrix. Draw a right triangle with hypotenuse the tangent line to the curve and vertical side a vertical line to the {x}-axis. The hypotenuse has length {1} (it is the string) and the vertical side has length {y}, so the bottom has length {\sqrt{1-y^2}}, which shows that the tractrix is the solution to the differential equation

    \displaystyle \frac{-y}{\sqrt{1-y^2}} = \frac{dy}{dx}

    The Tractrix

    The surface of revolution about the {x}-axis is the pseudosphere, an isometric embedding of a surface of constant curvature -1. Like the sphere, there are some isometries of the pseudosphere that we can understand as isometries of {\mathbb{E}^3}, namely rotations about the {x}-axis. However, there are lots of isometries which do not extend, so this embedding does not serve us all that well.

     

  • Example (hyperbolic space) By the Nash embedding theorem, there is a {\mathcal{C}^1} immersion of {\mathbb{H}^2} in {\mathbb{E}^3}, but by Hilbert, there is no {\mathcal{C}^2} immersion of any complete hyperbolic surface.

    That last example is the important one to consider when thinking about hyperbolic spaces. Intuitively, manifolds with negative curvature have a hard time fitting in euclidean space because volume grows too fast: there is not enough room for them. The solution is to find (local, or global in the case of {\mathbb{H}^2}) models for hyperbolic manifolds such that the geometry is distorted from the usual euclidean geometry, but the isometries of the space are clear.
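Going back to the pseudosphere example above, the tractrix equation can be checked numerically. The sketch below assumes the standard parameterization {x = t - \tanh(t)}, {y = \mathrm{sech}(t)} (this parameterization is not from the lecture; here {t} is the position of the person on the {x}-axis), and checks that the slope satisfies the differential equation and that the string keeps length 1.

```python
import numpy as np

# Assumed parameterization of the tractrix: the person is at (t, 0),
# the rock is at (x(t), y(t)) = (t - tanh t, sech t), starting from (0, 1).
t = np.linspace(0.1, 4.0, 400)
x = t - np.tanh(t)
y = 1.0 / np.cosh(t)

# Exact derivatives of the parameterization with respect to t.
dx_dt = np.tanh(t) ** 2
dy_dt = -np.tanh(t) / np.cosh(t)

# Slope of the curve versus the right-hand side of dy/dx = -y / sqrt(1 - y^2).
slope = dy_dt / dx_dt
ode_rhs = -y / np.sqrt(1.0 - y ** 2)

# Distance from the rock to the person: should be the string length 1.
string_length = np.hypot(t - x, y)

print(np.allclose(slope, ode_rhs), np.allclose(string_length, 1.0))
```

The sample points start at {t = 0.1} only to avoid dividing by zero at the cusp {(0,1)}, where the tangent is vertical.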

    2. 1-Dimensional Models for Hyperbolic Space

    While studying 1-dimensional hyperbolic space might seem simplistic, there are nice models such that higher dimensions are simple generalizations of the 1-dimensional case, and we have such a dimensional advantage that our understanding is relatively easy.

    2.1. Hyperboloid Model

    Parameterizing {H}

    Consider the symmetric bilinear form {\langle \cdot, \cdot \rangle_H} on {\mathbb{R}^2} defined by {\langle v, w \rangle_A = \langle v, w \rangle_H = v^TAw}, where {A = \left[ \begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array} \right]}. This doesn’t give a norm, since {A} is not positive definite, but we can still ask for the set of points {v} with {\langle v, v \rangle_H = -1}. This is (both sheets of) the hyperbola {x^2-y^2 = -1}. Let {H} be the upper sheet of the hyperbola. This will be 1-dimensional hyperbolic space.

    For any {n\times n} matrix {B}, let {O(B) = \{ M \in \mathrm{Mat}(n,\mathbb{R}) \, | \, \langle v, w \rangle_B = \langle Mv, Mw \rangle_B \textrm{ for all } v, w \}}. That is, {O(B)} consists of the matrices which preserve the form given by {B}. The condition is equivalent to requiring that {M^TBM = B}. Notice that if we let {B} be the identity matrix, we would get the regular orthogonal group. We define {O(p,q) = O(B)}, where {B} has {p} positive eigenvalues and {q} negative eigenvalues. Thus {O(1,1) = O(A)}. We similarly define {SO(1,1)} to be the matrices of determinant 1 preserving {A}, and {SO_0(1,1)} to be the connected component of the identity. {SO_0(1,1)} is then the group of matrices preserving both orientation and the sheets of the hyperbolas.

    We can find an explicit form for the elements of {SO_0(1,1)}. Consider the matrix {M = \left[ \begin{array}{cc} a & b \\ c& d \end{array} \right]}. Writing down the equations {M^TAM = A} and {\det(M) = 1} gives us four equations, which we can solve to get the solutions

    \displaystyle \left[ \begin{array}{cc} \sqrt{b^2+1} & b \\ b & \sqrt{b^2+1} \end{array} \right] \textrm{ and } \left[ \begin{array}{cc} -\sqrt{b^2+1} & b \\ b & -\sqrt{b^2+1} \end{array} \right].

    Since we are interested in the connected component of the identity, we discard the solution on the right. It is useful to do a change of variables {b = \sinh(t)}; recalling that {\cosh^2(t) - \sinh^2(t) = 1}, we have

    \displaystyle SO_0(1,1) = \left\{ \left[ \begin{array}{cc} \cosh(t) & \sinh(t) \\ \sinh(t) & \cosh(t) \end{array} \right] \, | \, t \in \mathbb{R} \right\}

    These matrices take {\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]} to {\left[ \begin{array}{c} \sinh(t) \\ \cosh(t) \end{array} \right]}. In other words, {SO_0(1,1)} acts transitively on {H} with trivial stabilizers, and in particular we have parameterizing maps

    \displaystyle \mathbb{R} \rightarrow SO_0(1,1) \rightarrow H \textrm{ defined by } t \mapsto \left[ \begin{array}{cc} \cosh(t) & \sinh(t) \\ \sinh(t) & \cosh(t) \end{array} \right] \mapsto \left[ \begin{array}{c} \sinh(t) \\ \cosh(t) \end{array} \right]

    The first map is actually a Lie group isomorphism (with the group action on {\mathbb{R}} being {+}) in addition to a diffeomorphism, since

    \displaystyle \left[ \begin{array}{cc} \cosh(t) & \sinh(t) \\ \sinh(t) & \cosh(t) \end{array} \right] \left[ \begin{array}{cc} \cosh(s) & \sinh(s) \\ \sinh(s) & \cosh(s) \end{array} \right] = \left[ \begin{array}{cc} \cosh(t+s) & \sinh(t+s) \\ \sinh(t+s) & \cosh(t+s) \end{array} \right]
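The facts above are easy to sanity-check numerically. The following is a minimal sketch (the helper names `boost` and `form` are made up for this illustration) which checks that the matrices preserve the form, have determinant 1, multiply by adding parameters, and carry the base point of {H} to another point of {H}.

```python
import numpy as np

# Matrix of the form <v, w>_H = v^T A w on R^2.
A = np.array([[1.0, 0.0], [0.0, -1.0]])

def boost(t):
    """The element of SO_0(1,1) with parameter t (hypothetical helper name)."""
    return np.array([[np.cosh(t), np.sinh(t)],
                     [np.sinh(t), np.cosh(t)]])

def form(v, w):
    """The indefinite form <v, w>_H."""
    return v @ A @ w

t, s = 0.7, -1.3
M = boost(t)

preserves_form = np.allclose(M.T @ A @ M, A)            # M^T A M = A
has_det_one = np.isclose(np.linalg.det(M), 1.0)         # det M = 1
is_homomorphism = np.allclose(boost(t) @ boost(s), boost(t + s))

image = M @ np.array([0.0, 1.0])                        # image of the base point
lands_on_H = np.isclose(form(image, image), -1.0) and image[1] > 0

print(preserves_form, has_det_one, is_homomorphism, lands_on_H)
```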

    Metric

    As mentioned above, {\langle \cdot, \cdot \rangle_H} is not positive definite, but its restriction to the tangent space of {H} is. We can see this in the following way: tangent vectors at a point {p \in H} are characterized by the form {\langle \cdot, \cdot \rangle_H}. Specifically, {v\in T_pH \Leftrightarrow \langle v, p \rangle_H = 0}, since (by a calculation) {\left. \frac{d}{dt} \right|_{t=0} \langle p+tv, p+tv \rangle_H = 0 \Leftrightarrow \langle v, p \rangle_H = 0}. Therefore, {SO_0(1,1)} takes tangent vectors to tangent vectors and preserves the form (and is transitive), so we only need to check that the form is positive definite on one tangent space. This is obvious on the tangent space to the point {\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]}, which is spanned by {\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]}. Thus, {H} is a Riemannian manifold, and {SO_0(1,1)} acts by isometries.

    Let’s use the parameterization {\phi: t \mapsto \left[ \begin{array}{c} \sinh(t) \\ \cosh(t) \end{array} \right]}. The unit (in the {H} metric) tangent at {\phi(t) = \left[ \begin{array}{c} \sinh(t) \\ \cosh(t) \end{array} \right]} is {\left[ \begin{array}{c} \cosh(t) \\ \sinh(t) \end{array} \right]}. The distance between the points {\phi(s)} and {\phi(t)} is

    \displaystyle d_H(\phi(s), \phi(t)) = \left| \int_s^t\sqrt{\left\langle \left[ \begin{array}{c} \cosh(v) \\ \sinh(v) \end{array} \right], \left[ \begin{array}{c} \cosh(v) \\ \sinh(v) \end{array} \right] \right\rangle_H} \, dv \right| = \left|\int_s^tdv \right| = |t-s|

    In other words, {\phi} is an isometry from {\mathbb{E}^1} to {H}.
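As a rough numerical illustration of this, one can approximate the length of the arc of {H} between {\phi(s)} and {\phi(t)} by a fine polygonal path, measuring each small chord with the form {\langle \cdot, \cdot \rangle_H}. The function name `arc_length_on_H` and the step count are arbitrary choices for this sketch.

```python
import numpy as np

def arc_length_on_H(s, t, steps=20000):
    """Polygonal approximation of the H-length of phi([s, t])."""
    v = np.linspace(s, t, steps + 1)
    points = np.stack([np.sinh(v), np.cosh(v)], axis=1)   # phi(v)
    chords = np.diff(points, axis=0)
    # <d, d>_H = dx^2 - dy^2; positive on chords of H up to rounding.
    squared = chords[:, 0] ** 2 - chords[:, 1] ** 2
    return np.sqrt(np.maximum(squared, 0.0)).sum()

s, t = -1.0, 2.0
length = arc_length_on_H(s, t)
print(length, abs(t - s))   # the two numbers should agree closely
```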

    1-dimensional hyperbolic space. The hyperboloid model is shown in blue, and the projective model is shown in red. An example of the projection map identifying {H} with {(-1,1) \subseteq \mathbb{R}\mathrm{P}^1} is shown.

    2.2. Projective Model

    Parameterizing

    Real projective space {\mathbb{R}\mathrm{P}^1} is the set of lines through the origin in {\mathbb{R}^2}. We can think about {\mathbb{R}\mathrm{P}^1} as {\mathbb{R} \cup \{\infty\}}, where {x\in \mathbb{R}} is associated with the line (point in {\mathbb{R}\mathrm{P}^1}) intersecting {\{y=1\}} in {x}, and {\infty} is the horizontal line. There is a natural projection {\mathbb{R}^2 \setminus \{0\} \rightarrow \mathbb{R}\mathrm{P}^1} by projecting a point to the line it is on. Under this projection, {H} maps to {(-1,1)\subseteq \mathbb{R} \subseteq \mathbb{R}\mathrm{P}^1}.

    Since {SO_0(1,1)} acts on {\mathbb{R}^2} preserving the lines {y = \pm x}, it gives a projective action on {\mathbb{R}\mathrm{P}^1} fixing the points {\pm 1}. Now suppose we have any projective linear isomorphism of {\mathbb{R}\mathrm{P}^1} fixing {\pm 1}. The isomorphism is represented by a matrix {A \in \mathrm{PGL}(2,\mathbb{R})} with eigenvectors {\left[ \begin{array}{c} 1 \\ \pm 1 \end{array} \right]}. Since scaling {A} preserves its projective class, we may assume it has determinant 1. Its eigenvalues are thus {\lambda} and {\lambda^{-1}}. The determinant equation, plus the fact that

    \displaystyle A \left[ \begin{array}{c} 1 \\ \pm 1 \end{array} \right] = \left[ \begin{array}{c} \lambda^{\pm 1} \\ \pm \lambda^{\pm 1} \end{array} \right]

    implies that {A} is of the form of a matrix in {SO_0(1,1)}. Therefore, the projective linear structure on {(-1,1) \subseteq \mathbb{R}\mathrm{P}^1} is the “same” (has the same isometry (isomorphism) group) as the hyperbolic (Riemannian) structure on {H}.

    Metric

    Clearly, we’re going to use the pushforward metric under the projection of {H} to {(-1,1)}, but it turns out that this metric is a natural choice for other reasons, and it has a nice expression.

    The map taking {H} to {(-1,1) \subseteq \mathbb{R}\mathrm{P}^1} is {\psi: \left[ \begin{array}{c} \sinh(t) \\ \cosh(t) \end{array} \right] \mapsto \frac{\sinh(t)}{\cosh(t)} = \tanh(t)}. The hyperbolic distance between {x} and {y} in {(-1,1)} is then {d_H(x,y) = |\tanh^{-1}(x) - \tanh^{-1}(y)|} (by the fact from the previous section that {\phi} is an isometry).

    Recall the fact that {\tanh(a\pm b) = \frac{\tanh(a) \pm \tanh(b)}{1 \pm \tanh(a)\tanh(b)}}. Applying this, we get the nice form

    \displaystyle \tanh(d_H(x,y)) = \frac{y-x}{1 - xy} \qquad (x \le y)

    We also recall the cross ratio, for which we fix notation as { (z_1, z_2; z_3, z_4) := \frac{(z_3 -z_1)(z_4-z_2)}{(z_2-z_1)(z_4-z_3)}}. Then

    \displaystyle (-1, x;y,1 ) = \frac{(y+1)(1-x)}{(x+1)(1-y)} = \frac{1-xy + (y-x)}{1-xy + (x-y)}

    Call the numerator of that fraction by {N} and the denominator by {D}. Then, recalling that {\tanh(u) = \frac{e^{2u}-1}{e^{2u}+1}}, we have

    \displaystyle \tanh(\frac{1}{2} \log(-1,x;y,1)) = \frac{\frac{N}{D} -1}{\frac{N}{D} +1} = \frac{N-D}{N+D} = \frac{2(y-x)}{2(1-xy)} = \tanh(d_H(x,y))

    Therefore, {d_H(x,y) = \frac{1}{2}\log(-1,x;y,1)}.
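Here is a small numerical check of this identity, using the cross ratio convention fixed above. It is only a sketch: the function names are invented for the illustration, and the two points are ordered so that the smaller one plays the role of {x}.

```python
import math

def cross_ratio(z1, z2, z3, z4):
    """(z1, z2; z3, z4) in the convention used in the text."""
    return ((z3 - z1) * (z4 - z2)) / ((z2 - z1) * (z4 - z3))

def dist_via_atanh(x, y):
    """Distance pushed forward from H: |atanh(x) - atanh(y)|."""
    return abs(math.atanh(x) - math.atanh(y))

def dist_via_cross_ratio(x, y):
    """Distance as half the log of the cross ratio (-1, x; y, 1), with x <= y."""
    lo, hi = min(x, y), max(x, y)
    return 0.5 * math.log(cross_ratio(-1.0, lo, hi, 1.0))

x, y = -0.3, 0.8
print(dist_via_atanh(x, y), dist_via_cross_ratio(x, y))
```

Swapping {x} and {y} in the cross ratio inverts it, which is why the sketch sorts the two points first.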

    3. Hilbert Metric

    Notice that the expression on the right above has nothing, a priori, to do with the hyperbolic projection. In fact, for any open convex body {C} in {\mathbb{R}\mathrm{P}^n}, we can define the Hilbert metric on {C} by setting {d_H(p,q) = \frac{1}{2}\log(a,p;q,b)}, where {a} and {b} are the intersections of the line through {p} and {q} with the boundary of {C} (labeled so that {a,p,q,b} appear in that order along the line). How is it possible to take the cross ratio, since {a,p,q,b} are not numbers? The line containing all of them is projectively isomorphic to {\mathbb{R}\mathrm{P}^1}, which we can parameterize as {\mathbb{R} \cup \{\infty\}}. The cross ratio does not depend on the choice of parameterization, so it is well defined. Note that the Hilbert metric is not necessarily a Riemannian metric, but it does make any open convex set into a metric space.

    Therefore, we see that any open convex body in {\mathbb{R}\mathrm{P}^n} has a natural metric, and the hyperbolic metric in {H = (-1,1)} agrees with this metric when {(-1,1)} is thought of as a open convex set in {\mathbb{R}\mathrm{P}^1}.

    4. Higher-Dimensional Hyperbolic Space

    4.1. Hyperboloid

    The higher dimensional hyperbolic spaces are completely analogous to the 1-dimensional case. Consider {\mathbb{R}^{n+1}} with the basis {\{e_i\}_{i=1}^n \cup \{e\}} and the symmetric bilinear form {\langle v, w \rangle_H = \sum_{i=1}^n v_iw_i - v_{n+1}w_{n+1}}. This is the form defined by the matrix {J = I \oplus (-1)}. Define {\mathbb{H}^n} to be the positive (positive in the {e} direction) sheet of the hyperboloid {\langle v,v\rangle_H = -1}.

    Let {O(n,1)} be the linear transformations preserving the form, so {O(n,1) = \{ A \, | \, A^TJA = J\}}. This group is generated by {O(1,1) \subseteq O(n,1)} as symmetries of the {e_1, e} plane, together with {O(n) \subseteq O(n,1)} as symmetries of the span of the {e_i} (this subspace is euclidean). The group {SO_0(n,1)} is the set of orientation preserving elements of {O(n,1)} which preserve the positive sheet of the hyperboloid ({\mathbb{H}^n}). This group acts transitively on {\mathbb{H}^n} with point stabilizers {SO(n)}: this is easiest to see by considering the point {(0,\cdots, 0, 1) \in \mathbb{H}^n}. Here the stabilizer is clearly {SO(n)}, and because {SO_0(n,1)} acts transitively, any stabilizer is a conjugate of this.

    As in the 1-dimensional case, the metric on {\mathbb{H}^n} is {\langle \cdot , \cdot \rangle_H|_{T_p\mathbb{H}^n}}, which is invariant under {SO_0(n,1)}.

    Geodesics in {\mathbb{H}^n} can be understood by considering the fixed point sets of isometries, which are always totally geodesic. Here, reflection in a vertical (containing {e}) plane restricts to an (orientation-reversing, but that’s ok) isometry of {\mathbb{H}^n}, and the fixed point set is obviously the intersection of this plane with {\mathbb{H}^n}. Now {SO_0(n,1)} is transitive on {\mathbb{H}^n}, and it sends planes to planes in {\mathbb{R}^{n+1}}, so we have a bijection

    \displaystyle \{ \textrm{totally geodesic subspaces through } p \} \leftrightarrow \{ \mathbb{H}^n \cap \textrm{linear subspaces of } \mathbb{R}^{n+1} \textrm{ through } p \}

    By considering planes through {e}, we can see that these totally geodesic subspaces are isometric to lower dimensional hyperbolic spaces.

    4.2. Projective

    Analogously, we define the projective model as follows: consider the disk {\{v \,|\, v_{n+1} = 1, \langle v,v \rangle_H < 0\}}, i.e. the points in the plane {v_{n+1} = 1} inside the cone {\langle v,v \rangle_H = 0}. We can think of {\mathbb{R}\mathrm{P}^n} as {\mathbb{R}^n \cup \mathbb{R}\mathrm{P}^{n-1}}, so this disk is {D^\circ \subseteq \mathbb{R}^n \subseteq \mathbb{R}\mathrm{P}^n}. There is, as before, the natural projection of {\mathbb{H}^n} to {D^\circ}, and the pushforward of the hyperbolic metric agrees with the Hilbert metric on {D^\circ} as an open convex body in {\mathbb{R}\mathrm{P}^n}.
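The agreement between the two metrics can be tested numerically in dimension 2. The sketch below makes one assumption that is not stated in these notes: the distance on the hyperboloid between {P} and {Q} is taken to be {\cosh^{-1}(-\langle P, Q \rangle_H)}, the usual closed formula. The function names are made up for the illustration.

```python
import numpy as np

def lift_to_hyperboloid(p):
    """Lift a point p of the open unit disk (at height 1) to the hyperboloid."""
    p = np.asarray(p, dtype=float)
    scale = 1.0 / np.sqrt(1.0 - p @ p)
    return scale * np.append(p, 1.0)

def form_H(v, w):
    """<v, w>_H = sum_i v_i w_i - v_{n+1} w_{n+1}."""
    return v[:-1] @ w[:-1] - v[-1] * w[-1]

def hyperboloid_distance(P, Q):
    # Assumed standard formula: cosh(d) = -<P, Q>_H.
    return np.arccosh(max(-form_H(P, Q), 1.0))

def hilbert_distance(p, q):
    """Hilbert metric on the open unit disk: (1/2) log (a, p; q, b)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    u = q - p
    # Parameterize the line as p + s*u and solve |p + s*u|^2 = 1 for s.
    qa, qb, qc = u @ u, 2.0 * (p @ u), p @ p - 1.0
    root = np.sqrt(qb * qb - 4.0 * qa * qc)
    s_a = (-qb - root) / (2.0 * qa)   # boundary point a, with s_a < 0
    s_b = (-qb + root) / (2.0 * qa)   # boundary point b, with s_b > 1
    # In the parameter s, the four points a, p, q, b are s_a, 0, 1, s_b.
    cross = ((1.0 - s_a) * (s_b - 0.0)) / ((0.0 - s_a) * (s_b - 1.0))
    return 0.5 * np.log(cross)

p, q = [0.2, -0.5], [-0.6, 0.3]
d_hilbert = hilbert_distance(p, q)
d_hyperboloid = hyperboloid_distance(lift_to_hyperboloid(p), lift_to_hyperboloid(q))
print(d_hilbert, d_hyperboloid)
```

Because the cross ratio is invariant under change of parameterization of the line, it is enough to compute it in the affine parameter {s}.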

    Geodesics in the projective model are the intersections of planes in {\mathbb{R}^{n+1}} with {D^\circ}; that is, they are geodesics in the euclidean space spanned by the {e_i}. One interesting consequence of this is that any theorem which is true in euclidean geometry which does not rely on facts about angles is still true for hyperbolic space. For example, Pappus’ hexagon theorem, the proof of which does not use angles, is true.

    4.3. Projective Model in Dimension 2

    In the case that {n=2}, we can understand the projective isomorphisms of {\mathbb{H}^2 = D \subseteq \mathbb{R}\mathrm{P}^2} by looking at their actions on the boundary {\partial D}. The set {\partial D} is projectively isomorphic to {\mathbb{R}\mathrm{P}^1} as an abstract manifold, but it should be noted that {\partial D} is not a straight line in {\mathbb{R}\mathrm{P}^2}, which would be the most natural way to find copies of {\mathbb{R}\mathrm{P}^1} embedded in {\mathbb{R}\mathrm{P}^2}.

    In addition, any projective isomorphism of {\mathbb{R}\mathrm{P}^1 \cong \partial D} can be extended to a real projective isomorphism of {\mathbb{R}\mathrm{P}^2}. In other words, we can understand isometries of 2-dimensional hyperbolic space by looking at the action on the boundary. Since {\partial D} is not a straight line, the extension is not trivial. We now show how to do this.

    The automorphisms of {\partial D \cong \mathbb{R}\mathrm{P}^1} are {\mathrm{PSL}(2,\mathbb{R})}. We will consider {\mathrm{SL}(2,\mathbb{R})}. For any Lie group {G}, there is an Adjoint action {G \rightarrow \mathrm{Aut}(T_eG)} defined by (the derivative of) conjugation. We can similarly define an adjoint action {\mathrm{ad}} by the Lie algebra on itself, as {\mathrm{ad}(\gamma '(0)) := \left. \frac{d}{dt} \right|_{t=0} \mathrm{Ad}(\gamma(t))} for any path {\gamma} with {\gamma(0) = e}. If the tangent vectors {v} and {w} are matrices, then {\mathrm{ad}(v)(w) = [v,w] = vw-wv}.

    We can define the Killing form {B} on the Lie algebra by {B(v,w) = \mathrm{Tr}(\mathrm{ad}(v)\mathrm{ad}(w))}. Note that {\mathrm{ad}(v)} is a matrix, so this makes sense, and the Lie group acts on the tangent space (Lie algebra) preserving this form.

    Now let’s look at {\mathrm{SL}(2,\mathbb{R})} specifically. A basis for the tangent space (Lie algebra) is {e_1 = \left[ \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right]}, {e_2 = \left[ \begin{array}{cc} 0 & 0 \\ 1 & 0 \end{array} \right]}, and {e_3 = \left[ \begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array} \right]}. We can check that {[e_1,e_2] = e_3}, {[e_1,e_3] = -2e_1}, and {[e_2, e_3]=2e_2}. Using these relations plus the antisymmetry of the Lie bracket, we know

    \displaystyle \mathrm{ad}(e_1) = \left[ \begin{array}{ccc} 0 & 0 & -2 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{array}\right] \qquad \mathrm{ad}(e_2) = \left[ \begin{array}{ccc} 0 & 0 & 0 \\ 0 & 0 & 2 \\ -1 & 0 & 0 \end{array}\right] \qquad \mathrm{ad}(e_3) = \left[ \begin{array}{ccc} 2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 0 \end{array}\right]

    Therefore, the matrix for the Killing form in this basis is

    \displaystyle B_{ij} = B(e_i,e_j) = \mathrm{Tr}(\mathrm{ad}(e_i)\mathrm{ad}(e_j)) = \left[ \begin{array}{ccc} 0 & 4 & 0 \\ 4 & 0 & 0 \\ 0 & 0 & 8 \end{array}\right]

    This matrix has 2 positive eigenvalues and one negative eigenvalue, so its signature is {(2,1)}. Since {\mathrm{SL}(2,\mathbb{R})} acts on {T_e(\mathrm{SL}(2,\mathbb{R}))} preserving this form, we get a homomorphism {\mathrm{SL}(2,\mathbb{R}) \rightarrow O(2,1)}. Its kernel is {\pm I} (which acts trivially by conjugation), so {\mathrm{PSL}(2,\mathbb{R})} sits inside {O(2,1)}, otherwise known as the group of isometries of the disk in projective space {\mathbb{R}\mathrm{P}^2}, otherwise known as {\mathbb{H}^2}.
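The Killing form computation can be reproduced mechanically. The following sketch builds each {\mathrm{ad}(e_i)} from the commutator, with columns given by the coordinates of {[e_i, e_j]} in the basis {e_1, e_2, e_3}; the helper names `bracket`, `coords` and `ad` are choices made for this illustration.

```python
import numpy as np

e1 = np.array([[0.0, 1.0], [0.0, 0.0]])
e2 = np.array([[0.0, 0.0], [1.0, 0.0]])
e3 = np.array([[1.0, 0.0], [0.0, -1.0]])
basis = [e1, e2, e3]

def bracket(v, w):
    """Lie bracket [v, w] = vw - wv."""
    return v @ w - w @ v

def coords(m):
    """Coordinates of a traceless 2x2 matrix a*e1 + b*e2 + c*e3 as (a, b, c)."""
    return np.array([m[0, 1], m[1, 0], m[0, 0]])

def ad(v):
    """Matrix of ad(v) in the basis e1, e2, e3 (column j is [v, e_j])."""
    return np.column_stack([coords(bracket(v, e)) for e in basis])

# Killing form B(e_i, e_j) = Tr(ad(e_i) ad(e_j)).
killing = np.array([[np.trace(ad(a) @ ad(b)) for b in basis] for a in basis])
eigenvalues = np.linalg.eigvalsh(killing)

print(killing)
print(eigenvalues)   # two positive, one negative: signature (2,1)
```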

    Any element of {\mathrm{PSL}(2,\mathbb{R})} (which, recall, was acting on the boundary of projective hyperbolic space {\partial D}) therefore extends to an element of {O(2,1)}, the isometries of hyperbolic space, i.e. we can extend the action over the disk.

    This means that we can classify isometries of 2-dimensional hyperbolic space by what they do to the boundary, which is determined generally by their eigenvectors: {\mathrm{PSL}(2,\mathbb{R})} acts on {\mathbb{R}\mathrm{P}^1} by projecting the action on {\mathbb{R}^2}, so an eigenvector of a matrix corresponds to a fixed line in {\mathbb{R}^2}, and hence to a fixed point in {\mathbb{R}\mathrm{P}^1 \cong \partial D}. For a matrix {A}, we have the following:

     

  • {|\mathrm{Tr}(A)| < 2} (elliptic) In this case, there are no real eigenvalues, so no real eigenvectors. The action here is rotation, which extends to a rotation of the entire disk.
  • {|\mathrm{Tr}(A)| = 2} (parabolic) There is a single real eigenvector, and so a single fixed point, to which all other points are attracted (in one direction) and from which they are repelled (in the other). For example, the action in projective coordinates sending {[x:y]} to {[x+y:y]} (that is, {x \mapsto x+1} in the affine coordinate): infinity is such a fixed point.
  • {|\mathrm{Tr}(A)| > 2} (hyperbolic) There are two fixed points, one attracting and one repelling.
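The trace classification above is simple to put into code. This is a minimal sketch: the function name `classify` and the tolerance are arbitrary choices, and the matrix is assumed to be a representative in {\mathrm{SL}(2,\mathbb{R})} of the isometry.

```python
import numpy as np

def classify(A, tol=1e-12):
    """Classify a matrix in SL(2,R) as elliptic, parabolic or hyperbolic by trace."""
    A = np.asarray(A, dtype=float)
    if not np.isclose(np.linalg.det(A), 1.0):
        raise ValueError("expected a matrix of determinant 1")
    t = abs(np.trace(A))
    if t < 2.0 - tol:
        return "elliptic"
    if t > 2.0 + tol:
        return "hyperbolic"
    # Note: +I and -I also have |trace| = 2 and land here,
    # although they act as the identity.
    return "parabolic"

theta = 0.5
rotation = [[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]]
translation = [[1.0, 1.0], [0.0, 1.0]]   # [x:y] -> [x+y:y]
dilation = [[2.0, 0.0], [0.0, 0.5]]

print(classify(rotation), classify(translation), classify(dilation))
```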

    5. Complex Hyperbolic Space

    We can do a construction analogous to real hyperbolic space over the complexes. Define a Hermitian form {q} on {\mathbb{C}^{n+1}} with coordinates {\{z_1,\cdots, z_n\} \cup \{w\}} by {q(z_1,\cdots, z_n, w) = |z_1|^2 + \cdots + |z_n|^2 - |w|^2}. We will also refer to {q} as {\langle \cdot, \cdot \rangle_q}. The (complex) matrix for this form is {J = I \oplus (-1)}, where {q(v,w) = v^*Jw}. Complex linear isomorphisms preserving this form are matrices {A} such that {A^*JA = J}. This is our definition for {\mathrm{U}(q) := \mathrm{U}(n,1)}, and we define {\mathrm{SU}(n,1)} to be those elements of {\mathrm{U}(n,1)} with determinant 1 (every element of {\mathrm{U}(n,1)} has determinant of norm 1).

    The set of points {z} such that {q(z) = -1} is not quite what we are looking for: first it is a {2n+1} real dimensional manifold (not {2n} as we would like for whatever our definition of “complex hyperbolic {n} space” is), but more importantly, {q} does not restrict to a positive definite form on the tangent spaces. Call the set of points {z} where {q(z) = -1} by {\bar{H}}. Consider a point {p} in {\bar{H}} and {v} in {T_p\bar{H}}. As with the real case, by the fact that {v} is in the tangent space,

    \displaystyle \left. \frac{d}{dt} \right|_{t=0} \langle p + tv, p+tv\rangle_q = 0 \quad \Rightarrow \quad \langle v, p \rangle_q + \langle p,v \rangle_q = 0

    Because {q} is hermitian, the expression on the right does not mean that {\langle v,p\rangle_q = 0}, but it does mean that {\langle v,p \rangle_q} is purely imaginary. In particular, {v = ip} is tangent to {\bar{H}} (for this {v}, {\langle v,p \rangle_q} is purely imaginary and nonzero), and {\langle ip,ip\rangle_q = \langle p,p \rangle_q = -1 < 0}, i.e. {q} is not positive definite on the tangent spaces.

    However, we can get rid of this negative definite subspace. {S^1} as the complex numbers of unit length (or {\mathrm{U}(1)}, say) acts on {\mathbb{C}^{n+1}} by multiplying coordinates, and this action preserves {q}: any phase goes away when we apply the absolute value. The quotient of {\bar{H}} by this action is {\mathbb{C}\mathbb{H}^n}. The isometry group of this space is still {\mathrm{U}(n,1)}, but now there are point stabilizers because of the action of {\mathrm{U}(1)}. We can think of {\mathrm{U}(1)} inside {\mathrm{U}(n,1)} as the diagonal matrices, so we can write

    \displaystyle \mathrm{SU}(n,1) \times \mathrm{U}(1) \cong U(n,1)

    The group {\mathrm{PSU}(n,1)} of projectivized matrices is the group of isometries of {\mathbb{C}\mathbb{H}^n \subseteq \mathbb{C}^n \subseteq \mathbb{C}\mathrm{P}^n}, where the middle {\mathbb{C}^n} is all vectors in {\mathbb{C}^{n+1}} with {w=1} (which we think of as part of complex projective space). We can also approach this group by projectivizing, since that will get rid of the unwanted point stabilizers too: we have {\mathrm{PU}(n,1) \cong \mathrm{PSU}(n,1)}.

    5.1. Case {n=1}

    In the case {n=1}, we can actually picture {\mathbb{C}\mathrm{P}^1}. We can’t picture the original {\mathbb{C}^2} (which has real dimension 4), but we are looking at the set of {(z,w)} such that {|z|^2 - |w|^2 = -1}. Notice that {|w| \ge 1}. After projectivizing, we may divide by {w}, so {|z/w|^2 - 1 = -1/|w|^2}. The set of points {z/w} which satisfy this is the interior of the unit circle, so this is what we think of for {\mathbb{C}\mathbb{H}^1}. The group of complex projective isometries of the disk is {\mathrm{PU}(1,1)}. The straight horizontal line is a geodesic, and the complex isometries send circles to circles, so the geodesics in {\mathbb{C}\mathbb{H}^1} are circles perpendicular to the boundary {S^1} in {\mathbb{C}}.

    Imagine the real projective model as a disk sitting at height one, and the geodesics are the intersections of planes with the disk. Complex hyperbolic space is the upper hemisphere of a sphere of radius one with equator the boundary of real hyperbolic space. To get the geodesics in complex hyperbolic space, intersect a plane with this upper hemisphere and stereographically project it flat. This gives the familiar Poincare disk model.

    5.2. Real {\mathbb{H}^2}’s contained in {\mathbb{C}\mathbb{H}^n}

    {\mathbb{C}\mathbb{H}^n} contains 2 kinds of real hyperbolic planes. The subset of real points in {\mathbb{C}\mathbb{H}^n} is (real) {\mathbb{H}^n}, so we have many {\mathbb{H}^2 \subseteq \mathbb{H}^n \subseteq \mathbb{C}\mathbb{H}^n}. In addition, we have copies of {\mathbb{C}\mathbb{H}^1}, which, as discussed above, has the same geometry (i.e. has the same isometry group) as real {\mathbb{H}^2}. However, these two kinds of real hyperbolic planes are not isometric: the complex hyperbolic space {\mathbb{C}\mathbb{H}^1} has a more negative curvature than the real hyperbolic spaces. If we scale the metric on {\mathbb{C}\mathbb{H}^n} so that the real hyperbolic spaces have curvature {-1}, then the copies of {\mathbb{C}\mathbb{H}^1} will have curvature {-4}.

    In a similar vein, there is a symplectic structure on {\mathbb{C}\mathbb{H}^n} such that the real {\mathbb{H}^2} are lagrangian subspaces (the flattest), and the {\mathbb{C}\mathbb{H}^1} are symplectic, the most negatively curved.

    An important thing to mention is that complex hyperbolic space does not have constant curvature(!).

    6. Poincare Disk Model and Upper Half Space Model

    The projective models that we have been dealing with have many nice properties, especially the fact that geodesics in hyperbolic space are straight lines in projective space. However, the angles are wrong. There are models in which the straight lines are “curved” i.e. curved in the euclidean metric, but the angles between them are accurate. Here we are interested in a group of isometries which preserves angles, so we are looking at a conformal model. Dimension 2 is special, because complex geometry is real conformal geometry, but nevertheless, there is a model of {\mathbb{R}\mathbb{H}^n} in which the isometries of the space are conformal.

    Consider the unit disk {D^n} in {n} dimensions. The conformal automorphisms are the maps taking (straight) diameters and arcs of circles perpendicular to the boundary to this same set. This model is abstractly isomorphic to the Klein model in projective space. Imagine the unit disk in a flat plane of height one with an upper hemisphere over it. The geodesics in the Klein model are the intersections of this flat plane with subspaces (so they are straight lines, for example, in dimension 2). Intersecting vertical planes with the upper hemisphere and stereographically projecting the result flat gives geodesics in the Poincare disk model. The fact that this model is the “same” (up to scaling the metric) as the example above of {\mathbb{C}\mathbb{H}^1} is a (nice) coincidence.

    The Klein model is the flat disk inside the sphere, and the Poincare disk model is the sphere. Geodesics in the Klein model are intersections of subspaces (the angled plane) with the flat plane at height 1. Geodesics in the Poincare model are intersections of vertical planes with the upper hemisphere. The two darkened geodesics, one in the Klein model and one in the Poincare, correspond under orthogonal projection. We get the usual Poincare disk model by stereographically projecting the upper hemisphere to the disk. The projection of the geodesic is shown as the curved line inside the disk

    The Poincare disk model. A few geodesics are shown.

    Now we have the Poincare disk model, where the geodesics are straight diameters and arcs of circles perpendicular to the boundary and the isometries are the conformal automorphisms of the unit disk. There is a conformal map from the disk to an open half space (we typically choose to conformally identify it with the upper half space). Conveniently, the hyperbolic metric on the upper half space {d_H} can be expressed at a point {(x,t)} (euclidean coordinates) as {d_H = d_E/t}. I.e. the hyperbolic metric is just a rescaling (at each point) of the euclidean metric.
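To get a feel for the rescaled metric {d_E/t}, one can approximate hyperbolic lengths of paths in the upper half plane numerically. The sketch below assumes the standard fact (not derived in these notes) that the vertical segment from height {a} to height {b} is a geodesic of length {\log(b/a)}; the function names are made up for the illustration.

```python
import math

def hyperbolic_length(path, steps=100000):
    """Approximate the length of path(u), 0 <= u <= 1, in the metric d_E / t.

    path(u) returns a point (x, t) of the upper half plane, with t > 0.
    """
    total = 0.0
    x0, t0 = path(0.0)
    for k in range(1, steps + 1):
        x1, t1 = path(k / steps)
        euclidean = math.hypot(x1 - x0, t1 - t0)
        total += euclidean / (0.5 * (t0 + t1))   # rescale by the mean height
        x0, t0 = x1, t1
    return total

def vertical(u):
    """Vertical segment from (0, 1) to (0, 5)."""
    return (0.0, 1.0 + 4.0 * u)

def horizontal_high(u):
    """Horizontal segment of euclidean length 4 at height 2."""
    return (4.0 * u, 2.0)

print(hyperbolic_length(vertical), math.log(5.0))
print(hyperbolic_length(horizontal_high))   # euclidean length 4, divided by height 2
```

The second path illustrates the rescaling: the same euclidean segment gets shorter in the hyperbolic metric as it is moved higher up.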

    One of the important things that we wanted in our models was the ability to realize isometries of the model with isometries of the ambient space. In the case of a one-parameter family of isometries of hyperbolic space, this is possible. Suppose that we have a family of elliptic isometries fixing a common point. Then in the disk model, we can move that point to the origin and realize the isometries by rotations. Similarly, for a family of parabolic isometries fixing a common point on the boundary, in the upper half space model we can move that point to infinity and realize the isometries by translations.

 

I recently uploaded a paper to the arXiv entitled Knots with small rational genus, joint with Cameron Gordon. The genesis of this paper was a couple of nice (and related) talks at Caltech by Matthew Hedden and Jake Rasmussen in 2007. They both talked about potential applications of the theory of knot Floer homology to the Berge conjecture. A Berge knot is a (tame) knot K in the 3-sphere which lies on a genus two Heegaard surface, and with the property that on each side of the Heegaard surface there is a meridian disk that the knot intersects exactly once. Equivalently, the inclusion of the knot into each (closed) handlebody sends the generator of \pi_1(K) to a generator of \pi_1(\text{handlebody}). Note that since the 3-sphere admits a unique (up to isotopy) Heegaard splitting of any genus, one may think of such a knot as lying on a specific genus 2 surface in S^3. Such knots were classified by Berge; they admit (Dehn) surgeries which result in (nontrivial) Lens spaces. The Berge conjecture is the converse; i.e.:

Berge Conjecture: Let K be a knot in S^3 which admits a nontrivial Lens space surgery; i.e. there is a Lens space L and a knot K' in L for which S^3 - K is homeomorphic to L - K'. Then K is a Berge knot.

An equivalent formulation (of course) is to try to classify knots in Lens spaces which admit an S^3 surgery, i.e. to identify the knots K' as in the formulation of the conjecture above. The equivalent formulation says that these knots should be 1-bridge. The strategy of Hedden-Rasmussen (building on work of Ken Baker and Eli Grigsby) to approach the Berge conjecture depends on characterizing such knots by properties which can be detected by topological invariants that behave well under surgery. An example of such a topological invariant is the Casson invariant \lambda(\cdot), a \mathbb{Z}-valued invariant of integer homology spheres which satisfies the surgery formula \lambda(M_{n+1}) - \lambda(M_n) = \text{Arf}(K) where M_i denotes the result of 1/i surgery on some integral homology sphere M along a fixed knot K, and \text{Arf}(K) is the Arf invariant. For more sophisticated invariants like knot Floer homology, the surgery formula is replaced by an exact triangle. One important piece of topological information that is detected by knot Floer homology is the genus of a knot. The approach to the Berge conjecture thus rests on Ken Baker’s impressive paper showing that small genus knots (in a sense to be made precise) in Lens spaces have small bridge number.

Hedden remarked in his talk that his work, and that of his collaborators “gave the first examples of an infinite family of knots that were characterized by their knot Floer homology”. Though technically true, I think this overstates the role of knot Floer homology in this case, since the knots (1-bridge knots in Lens spaces) are entirely characterized (up to isotopy) by their genus (and therefore by any topological invariant which detects genus). My immediate instinct was to think that knots with small genus in any 3-manifold should always be quite special, and that a complete classification might even be feasible. My paper with Cameron confirms this suspicion, and gives such a classification. Let me admit at this point that I am not especially interested in the Berge conjecture per se, although I find it interesting that new ideas in 3-manifold topology are starting to have something meaningful to say about it. In any case, I shall not have anything else to say about it (meaningful or otherwise) in this post.

First I should say that I have been using the word “genus” in a somewhat sloppy manner. For an oriented knot K in S^3, a Seifert surface is a compact oriented embedded surface \Sigma \subset S^3 whose boundary is K. The genus of such a surface is a non-negative integer, and the least such genus over all Seifert surfaces is (said to be) the genus of K, denoted g(K). Such a surface represents the generator in the relative homology group H_2(S^3, K) which equals H_1(K) = \mathbb{Z} since S^3 has vanishing homology in dimensions 1 and 2. This relative homology group is dual to H^1(S^3 - K), which is parameterized by homotopy classes of maps from S^3 - K to a circle (which is a K(\mathbb{Z},1)). The preimage of a regular value under a smooth map dual to the homology class is a smooth proper surface in S^3 - K whose closure is a Seifert surface. It is immediate that g(K)=0 if and only if K is an unknot; in other words, the unknot is “characterized” by its genus. There are infinitely many knots of any positive genus in S^3; on the other hand, there are only two fibered genus 1 knots — the trefoil and the figure 8 knot (three if you distinguish the left-handed from the right-handed trefoil), and it is worth remarking (from the point of view of the motivation of characterizing knots by topological invariants) that a theorem of Yi Ni says that fiberedness of knots can be detected by knot Floer homology.

For knots in integral homology 3-spheres, the situation is very similar: every knot admits a Seifert surface, and the least genus of such a surface is the genus of a knot. The unknot is (always) characterized by the fact that it has genus 0, but there are infinitely many knots of every positive genus. For a knot K in a general 3-manifold M it is not so easy to define genus. A necessary and sufficient condition for K to bound an embedded surface in its complement is that [K]=0 in H_1(M). However, if [K] has finite order, one can find an open properly embedded surface \Sigma in the complement of K whose “boundary” wraps some number of times around K. Technically, let \Sigma be a compact oriented surface, and f:\Sigma \to M a map which restricts to an embedding from the interior of \Sigma into M-K, and which restricts to an oriented covering map from \partial \Sigma to K (note that we allow \Sigma to have multiple boundary components). If p is the degree of the covering map \partial \Sigma \to K, we call \Sigma a p-Seifert surface, and define the rational genus of \Sigma to be -\chi^-(\Sigma)/2p, where \chi denotes Euler characteristic, and \chi^-(\Sigma) = \min(0,\chi(\Sigma)) (for a connected surface \Sigma). The reason to use Euler characteristic instead of genus is that Euler characteristic is multiplicative under coverings (unlike genus), and behaves well with respect to “local” operations on surfaces like cut-and-paste. Moreover, (negative) Euler characteristic, unlike genus, is a good measure of complexity for surfaces with possibly many boundary components. The factor of 2 in the denominator reflects the fact that Euler characteristic is “almost” -2 times genus. With this definition, we say that the rational genus of K, for any knot K \subset M with [K] of finite order in H_1(M), is the infimum of -\chi^-(\Sigma)/2p over all p-Seifert surfaces for K and all p.
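To make the definition concrete, here is a small sketch that evaluates -\chi^-(\Sigma)/2p for a connected surface \Sigma given by its genus and number of boundary components. The function names are invented for the illustration, and the surfaces in the examples are hypothetical; nothing here checks that such a p-Seifert surface actually exists for a given knot.

```python
from fractions import Fraction

def euler_characteristic(genus, boundary_components):
    """Euler characteristic of a compact connected oriented surface."""
    return 2 - 2 * genus - boundary_components

def chi_minus(chi):
    """chi^- = min(0, chi), for a connected surface."""
    return min(0, chi)

def rational_genus_of_surface(chi, p):
    """-chi^-(Sigma) / 2p for a p-Seifert surface of Euler characteristic chi."""
    if p <= 0:
        raise ValueError("the covering degree p must be a positive integer")
    return Fraction(-chi_minus(chi), 2 * p)

# A disk (genus 0, one boundary component) with p = 1 contributes 0.
print(rational_genus_of_surface(euler_characteristic(0, 1), 1))
# A genus 1 surface with one boundary component and p = 1 contributes 1/2.
print(rational_genus_of_surface(euler_characteristic(1, 1), 1))
# A genus 2 surface with one boundary component wrapping p = 5 times: 3/10.
print(rational_genus_of_surface(euler_characteristic(2, 1), 5))
```

Exact fractions are used so that the values can be compared without rounding.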
The purpose of our paper is to give a complete classification of knots with sufficiently small rational genus, and to show that such knots are always “geometric” — i.e. they can be isotoped into a normal form which is sensitive to the geometric decomposition of the ambient 3-manifold M. Thus the concept of rational genus makes contact between the homological world of the Thurston norm, knot Floer homology and such invariants, and the geometric world of hyperbolic structures, JSJ decompositions and so on.

It is worth pointing out at this point that knots with small rational genus are not special by virtue of being rare: if K is any knot in S^3 (for instance) of genus g = g(K), and K' in M is the core of the filling solid torus after p/q Dehn surgery on K, then the knot K' has order p in H_1(M). A minimal genus Seifert surface for K has Euler characteristic 1-2g and its boundary wraps p times around K', so it is a p-Seifert surface for K' and \|K'\| \le (2g-1)/2p. Since the integer p is large for “most” slopes p/q, it follows that “most” knots obtained in this way have arbitrarily small rational genus.
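To make the definition concrete, here is a minimal sketch that evaluates -\chi^-(\Sigma)/2p for a connected p-Seifert surface given its genus and number of boundary components. The function names are mine (hypothetical, not from the paper), and the loop just illustrates how a fixed once-punctured torus gives smaller and smaller rational genus as p grows.

```python
from fractions import Fraction


def euler_characteristic(genus, boundary_components):
    """Euler characteristic of a compact connected oriented surface."""
    return 2 - 2 * genus - boundary_components


def rational_genus_of_surface(genus, boundary_components, p):
    """Rational genus -chi^-(Sigma)/2p of a connected p-Seifert surface.

    Here chi^- = min(0, chi), as in the definition in the text.
    """
    chi = euler_characteristic(genus, boundary_components)
    chi_minus = min(0, chi)
    return Fraction(-chi_minus, 2 * p)


if __name__ == "__main__":
    # A genus 1 surface with one boundary component, wrapping p times.
    for p in (1, 5, 50, 500):
        print(p, rational_genus_of_surface(1, 1, p))
```

This only computes the rational genus of one given surface; the rational genus of the knot is the infimum over all such surfaces and all p, which the sketch does not attempt.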

There is a precise connection between rational genus and the Thurston norm. There is an exact sequence in homology, which contains the fragment H_2(M,K) \to H_1(K) \to H_1(M). Since H_1(K) = \mathbb{Z}, the kernel of H_1(K) \to H_1(M) is generated by some class n[K], and one can define the affine subspace \partial^{-1}(n[K]) \subset H_2(M,K). By excision, we identify H_2(M,K) with H_2(M-\text{int}(N(K)), \partial N(K)) where N(K) is a tubular neighborhood of K. Under this identification, the rational genus of K is equal to \inf \|[\Sigma]\|_T/2n where \|\cdot\|_T denotes the (relative) Thurston norm, and the infimum is taken over (real) classes in H_2(M-\text{int}(N(K)), \partial N(K)) in the affine subspace corresponding to \partial^{-1}(n[K]). Since the Thurston norm is a convex piecewise linear function with rational coefficients, this infimum is realized at some rational point. In other words, the rational genus of any knot is rational, and is realized by some p-Seifert surface, where n as above divides p (note: if M is a rational homology sphere, then necessarily p=n, but if the rank of H_1(M) is positive, this is not necessarily true, and p/n might be arbitrarily large). This relationship to the Thurston norm also gives a straightforward algorithm to compute rational genus, since one can compute Thurston norm e.g. by linear programming in normal surface space relative to any triangulation.

The precise statement of results depends on the geometric decomposition of the ambient manifold M. By the geometrization theorem (of Perelman), a closed, orientable 3-manifold is either reducible (i.e. contains an embedded sphere that does not bound a ball), or is a Lens space, or is hyperbolic, or is a small Seifert fiber space, or is toroidal (i.e. contains an essential (\pi_1-injective) embedded torus). For the record, the complete “classification” is as follows:

Reducible Theorem: Let {K} be a knot in a reducible manifold {M}. Then either

  1. {\|K\| \ge 1/12}; or
  2. there is a decomposition {M = M' \# M''}, {K \subset M'} and either
    1. {M'} is irreducible, or
    2. {(M',K) = (\mathbb{RP}^3,\mathbb{RP}^1)\#(\mathbb{RP}^3,\mathbb{RP}^1)}

Lens Theorem: Let {K} be a knot in a lens space {M}. Then either

  1. {\|K\| \ge 1/24}; or
  2. {K} lies on a Heegaard torus in {M}; or
  3. {M} is of the form {L(4k,2k-1)} and {K} lies on a Klein bottle in {M} as a non-separating orientation-preserving curve.

Hyperbolic Theorem: Let {K} be a knot in a closed hyperbolic {3}-manifold {M}. Then either

  1. {\|K\| \ge 1/402}; or
  2. {K} is trivial; or
  3. {K} is isotopic to a cable of the core of a Margulis tube.

Small SFS Theorem: Let {M} be an atoroidal Seifert fiber space over {S^2} with three exceptional fibers and let {K} be a knot in {M}. Then either

  1. {\|K\| \ge 1/402}; or
  2. {K} is trivial; or
  3. {K} is a cable of an exceptional Seifert fiber of {M}; or
  4. {M} is a prism manifold and {K} is a fiber in the Seifert fiber structure of {M} over {\mathbb{RP}^2} with at most one exceptional fiber.

Toroidal Theorem: Let {M} be a closed, irreducible, toroidal 3-manifold, and let {K} be a knot in {M}. Then either

  1. {\|K\| \ge 1/402}; or
  2. {K} is trivial; or
  3. {K} is contained in a hyperbolic piece {N} of the JSJ decomposition of {M} and is isotopic either to a cable of a core of a Margulis tube or into a component of {\partial N}; or
  4. {K} is contained in a Seifert fiber piece {N} of the JSJ decomposition of {M} and either
    1. {K} is isotopic to an ordinary fiber or a cable of an exceptional fiber or into {\partial N}, or
    2. {N} contains a copy {Q} of the twisted {S^1} bundle over the Möbius band and {K} is contained in {Q} as a fiber in this bundle structure; or
  5. {M} is a {T^2}-bundle over {S^1} with Anosov monodromy and {K} is contained in a fiber.

The constant 1/402 is presumably not optimal, but reflects the coarseness of certain geometric estimates at a particular step in the argument. Broadly speaking, there are two cases to consider: when the knot complement M-K is hyperbolic, and when it is not. The complement M-K is hyperbolic unless it contains an essential surface of non-negative Euler characteristic.

The case that M-K is hyperbolic is conceptually easiest to analyze. Let \Sigma be a surface, embedded in M and with boundary wrapping some number of times around K, realizing the rational genus of K. The complete hyperbolic structure on M-K may be deformed, adding back K as a cone geodesic. Just as a cone can be obtained from a wedge of paper by gluing the two edges together, the geometry of a cone geodesic is locally modeled on the quotient space obtained from a (3-dimensional hyperbolic) wedge by gluing the two flat faces together. The thinner the wedge, the smaller the cone angle along the geodesic. For all sufficiently small angles \theta > 0, Thurston proved that there exists a unique hyperbolic metric on M which is singular along a cone geodesic, isotopic to K, with cone angle \theta. Call this metric space M_\theta. The cone angle can be increased, deforming the geometry in a family of spaces, until one of the following three things happens:

  1. The cone angle is increased all the way to 2\pi, resulting in the complete hyperbolic structure on M, in which K is isotopic to an embedded geodesic; or
  2. The volume of the family of manifolds M_\theta goes to zero (and either converges after rescaling to a Euclidean cone manifold, or converges after rescaling to have fixed diameter and injectivity radius going to zero everywhere); or
  3. The cone locus bumps into itself (this can only happen for \theta > \pi).

As the cone angle along K increases, so does the length of the cone geodesic. Simultaneously, the diameter of an embedded tube about this geodesic decreases. While the diameter of the tube is big, the deformation can continue. Hodgson-Kerckhoff analyzed the kinds of degenerations that can occur, and obtained universal geometric control on how fast the tube diameter can shrink, or the length of the cone geodesic grow. They showed that the cone angle can be increased (giving rise to a family of singular hyperbolic structures M_\theta) either until \theta = 2\pi, or until the product \theta \cdot \ell, where \ell is the length of the cone geodesic, is at least 1.019675, at which point the diameter of an embedded tube about this cone geodesic is at least 0.531. Since \theta < 2\pi in the latter case, one obtains a lower bound on both the length of the cone geodesic and the diameter of an embedded tube, independent of K or M.

Now, one would like to use this big tube to conclude that \|K\| is large. This is accomplished as follows. Geometrically, one constructs a 1-form \alpha which agrees with the length form on the cone geodesic, which is supported in the tube, and which satisfies \|d\alpha\|\le C pointwise for some (universal) constant C. Then one uses this 1-form to control the topology of \Sigma. By Stokes' theorem, for any surface S homotopic to \Sigma in M-K one has an estimate

1.019675/2\pi \le \ell = \int_K \alpha = \frac {1}{p} \int_S d\alpha \le \frac {C}{p} \text{area}(S)

In particular, the area of S divided by p can’t be too small. However, it turns out that one can find a surface S as above with \text{area}(S) \le -2\pi\chi(S); such an estimate is enough to obtain a universal lower bound on \|K\|. Such a surface S can be constructed either by the shrinkwrapping method of Calegari-Gabai, or the (related) PL-wrapping method of Soma. Roughly speaking, one uses the cone geodesic as an “obstacle”, and finds a surface S of least area homotopic to \Sigma (rel. boundary) subject to the constraint that it cannot cross the geodesic. Away from the cone geodesic, S looks like an ordinary minimal surface. In particular, its intrinsic curvature is no more than the sectional curvature of the ambient hyperbolic space, which is -1 everywhere. Along the geodesic, S looks like a bedsheet hanging on a clothesline; in particular, it does not accumulate any corners or atoms of positive curvature along this singularity, so the Gauss-Bonnet theorem gives the desired bound on \text{area}(S).
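For the record, here is how the two estimates combine, written out as a short worked inequality. This is only a sketch under the assumptions of this case (the deformation stops with \theta \cdot \ell \ge 1.019675 and \theta < 2\pi, and S is homotopic to \Sigma so that \chi(S) = \chi(\Sigma)); the constant C is left unspecified, exactly as above.

```latex
\begin{align*}
\frac{1.019675}{2\pi} \;\le\; \ell
  \;&\le\; \frac{C}{p}\,\mathrm{area}(S)
  \;\le\; \frac{C}{p}\,\bigl(-2\pi\chi(S)\bigr), \\
\text{so}\qquad
\|K\| \;=\; \frac{-\chi^-(\Sigma)}{2p}
  \;&\ge\; \frac{-\chi(S)}{2p}
  \;\ge\; \frac{1.019675}{8\pi^2 C}.
\end{align*}
```

The right hand side depends only on C, which is what “universal lower bound” means here.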

This leaves the case that M-K is not hyperbolic to analyze. As remarked above, this only occurs when M-K contains an essential surface (which might be closed or proper) of non-negative Euler characteristic, i.e. a sphere, a disk, an annulus or a torus. In this case, one tries to make the intersection of \Sigma with this essential surface as simple as possible; if one arranges this just right, every intersection contributes a definite amount to the topology of \Sigma, and one can conclude either that \Sigma is complicated (in which case \|K\| is large), or that the intersection is simple, and therefore draw some topological conclusion.

To actually do this in practice is quite complicated, but fortunately it relies on (largely combinatorial) methods developed at length by Gabai, Scharlemann, Gordon and others over the last 30 years to analyze (so-called) “exceptional surgeries”. Of course, the argument is still complicated, and this analysis takes up most of the length of the paper. It is also worth pointing out that every case provided for by the classification above actually occurs, with examples of arbitrarily small rational genus.

This paper raises several natural questions, the most obvious of which is whether the explicit (but quite small) constants can be improved in any way. The constant 1/402 in the statement of the Toroidal Theorem is really only there to take care of a knot sitting inside a hyperbolic piece in the decomposition; a knot that interacts in a meaningful way with an essential torus necessarily has rational genus at least 1/24 (for a precise statement, see the paper). As remarked above, knots of (ordinary) genus 1 are very plentiful, even in S^3, and do not “see” any of the ambient geometry, so the wildest and most optimistic guess might be that there is a chance of classifying knots of rational genus at most 1/4. There are some (very weak) reasons to think that this fraction is critical, at least in some cases, not least of which are the papers of Hedden and Ni mentioned above. But in the hyperbolic case, it is probably not easy to get a better estimate using purely geometric arguments.

Another approach might be to try to substitute another conclusion (again in the hyperbolic case) than that K be isotopic to the cable of a core of a Margulis tube. For instance, one might ask for K to admit an insulator family (of the kind Gabai used here), or one might merely ask that K be unknotted in the universal cover, or satisfy some other condition. This goes to the heart of a very, very difficult and important question, namely how to identify geometric features of codimension 2 objects in (especially hyperbolic) geometric 3-manifolds from purely topological properties. If I am optimistic, then I can imagine that this paper makes a contribution, however small, to this ongoing project.

I am (update: was) currently (update: but am no longer) in Brisbane for the “New directions in geometric group theory” conference, which has been an entirely enjoyable and educational experience. I got to eat fish and chips, to watch Australia make 520 for 7 (declared) against the West Indies at the WACA, and to hear Masato Mimura give a very nice talk about his recent results on rigidity of the “universal lattice”.

His talk included a quick and beautiful survey of some geometric aspects of the theory of rigidity for infinite groups, which I will attempt to partially reproduce (despite the limitations of the wordpress format). In this context, rigidity is expressed in terms of isometric affine actions of groups on Banach spaces. This means the following. Suppose B is a Banach space (i.e. a complete, normed vector space) and G is a group. A linear isometric action is a representation \rho from G to the group of linear isometries of B — i.e. linear norm-preserving automorphisms. An affine action is a representation from G to the group of affine isometries of B — i.e. isometries as a metric space that do not necessarily fix the zero element. The group of isometries of a Banach space B is a semi-direct product B \rtimes U(B) where U(B) is the group of linear isometries, and B is the Banach space, thought of as an Abelian group, acting on itself by (isometric) translations. Such an action is usually encoded by a pair \rho:G \to U(B) which records the “linear” part of the action, and a 1-cocycle with coefficients in \rho, i.e. a function c:G \to B satisfying c(gh) = c(g) + \rho(g)c(h) for every g,h \in G. This formula might look strange if you don’t know where it comes from: it is just the way that factors transform in semi-direct products. The affine action is given by sending g \in G to the transformation that sends each b \in B to \rho(g)b + c(g). Consequently, gh is sent to the transformation that sends b to \rho(gh)b + c(gh) and the fact that this is a group action becomes the formula

\rho(gh)b + c(gh) = \rho(g)(\rho(h)b + c(h)) + c(g) = \rho(gh)b + \rho(g)c(h) + c(g)

Equating the left and right hand sides gives the cocycle condition. Given one affine isometric action, one can obtain another in a silly way by conjugating by an isometry b \to b + b' for some b' \in B. Under conjugation by such an isometry, a cocycle c transforms by c(g) \to c(g) + \rho(g)b' - b'. A function of the form c(g) = \rho(g)b' - b' is called a 1-coboundary, and the quotient of the space of 1-cocycles by the space of 1-coboundaries is the 1-dimensional cohomology of G with coefficients in \rho:G \to U(B). This is usually denoted H^1(G,\rho), where B is suppressed in the notation. In particular, an affine isometric action of G on B with linear part \rho has a global fixed point iff its cocycle represents 0 in H^1(G,\rho). Contrapositively, G admits an affine isometric action on B without a global fixed point iff H^1(G,\rho) \ne 0 for some \rho.
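Here is a toy illustration of the cocycle and coboundary formulas, in the simplest case I could think of: G = \mathbb{Z} acting on B = \mathbb{C} (a 2-dimensional real Hilbert space) with linear part a rotation. The angle THETA and the vector V = c(1) are arbitrary choices of mine, and all function names are hypothetical. Because the rotation is nontrivial the affine action has a fixed point, so the cocycle is a coboundary, and the sketch checks this numerically.

```python
import cmath

THETA = 1.0      # rotation angle of the linear part (arbitrary choice)
V = 2 + 1j       # value c(1) of the cocycle on the generator (arbitrary choice)


def rho(n):
    """Linear part: n acts on C by rotation through n * THETA."""
    return cmath.exp(1j * THETA * n)


def c(n):
    """The 1-cocycle on Z determined by c(1) = V."""
    if n >= 0:
        return sum(rho(k) * V for k in range(n))
    # c(0) = c(n) + rho(n) c(-n) forces this formula for negative n.
    return -rho(n) * c(-n)


def act(n, b):
    """Affine isometric action: b -> rho(n) b + c(n)."""
    return rho(n) * b + c(n)


# The generator fixes b0, hence so does every element of Z.
FIXED_POINT = V / (1 - rho(1))
# The cocycle is the coboundary c(n) = rho(n) b' - b' for b' = -b0.
B_PRIME = -FIXED_POINT

if __name__ == "__main__":
    for g in range(-2, 3):
        for h in range(-2, 3):
            gap = abs(c(g + h) - (c(g) + rho(g) * c(h)))
            print(g, h, "cocycle condition holds:", gap < 1e-9)
```

If the rotation angle were 0 (trivial linear part) the same cocycle would describe a translation action with no fixed point, and FIXED_POINT would not be defined.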

A group G is said to satisfy Serre’s Property (FH) if every affine isometric action of G on a Hilbert space has a global fixed point. In 2007, Bader-Furman-Gelander-Monod introduced a property (FB) for a group G to mean that every affine isometric action of G on some (out of a class of) Banach space(s) B has a global fixed point. Mimura used the notation property (FL_p) for the case that B is allowed to range over the class of L_p spaces (for some fixed 1 < p < \infty).

Intimately related is Kazhdan’s Property (T), introduced by Kazhdan in this paper. Let G be a locally compact topological group (for example, a discrete group). The set of irreducible unitary representations of G is called its dual, and denoted \hat{G}. This dual is topologized in the following way. Associated to a representation \rho:G \to U(L), a unit vector X \in L, a positive number \epsilon > 0 and a compact subset K \subset G there is an open neighborhood of \rho consisting of representations \rho':G \to U(L') for which there is a unit vector Y \in L' such that |\langle \rho(g)X,X\rangle - \langle \rho'(g)Y, Y\rangle| < \epsilon whenever g \in K. With this topology (called the Fell topology), one says that a group G has property (T) if the trivial representation is isolated in \hat{G}. Note that this topology is very far from being Hausdorff: the trivial representation fails to be isolated exactly when there is a sequence of (nontrivial, irreducible) representations \rho_i:G \to U(L_i), unit vectors X_i \in L_i, numbers \epsilon_i \to 0 and compact sets K_i exhausting G so that |\langle\rho_i(g)X_i,X_i\rangle - 1| < \epsilon_i for any g \in K_i. The vectors X_i are said to be (a sequence of) almost invariant vectors. Hence (informally) a group has property (T) if some compact subset must move every unit vector a definite amount in every irreducible nontrivial unitary representation. If a group fails to have property (T), one can rescale a sequence of irreducible actions near a sequence of almost invariant vectors in such a way that one obtains in the geometric limit a nontrivial isometric action on L^2 without a global fixed point. A famous theorem of Delorme-Guichardet says that property (T) and property (FH) are equivalent for (locally compact second countable) groups. Property (T) passes to quotients, and to lattices (i.e. finite covolume discrete subgroups of a topological group).

Kazhdan already showed in his paper that \text{SL}(n,\mathbb{R}) has property (T) for n at least 3, and therefore the same is true for lattices in these groups, such as \text{SL}(n,\mathbb{Z}), a fact which is not easy to see directly from the definition. One beautiful application, already pointed out by Kazhdan, is that this means that all lattices in \text{SL}(n,\mathbb{R}), for instance the groups \text{SL}(n,\mathbb{Z}) (and in fact, all discrete groups with property (T)) are finitely generated. Kazhdan’s proof of this is incredibly short: let G be a (countable) discrete group and g_i an enumeration of its elements. For each i, let G_i be the subgroup of G generated by \lbrace g_1,g_2,\cdots,g_i\rbrace. Notice that G is finitely generated iff G_i=G for all sufficiently large i. On the other hand, consider the unitary representations of G induced by the trivial representations on the G_i. Every compact subset of G is finite, and is therefore contained in G_i for all sufficiently large i, so it fixes a vector in all but finitely many of these representations; thus there is a sequence of almost invariant vectors. If G has property (T), this sequence eventually contains a fixed vector, which can only happen if G/G_i is finite, in which case G is finitely generated, as claimed.

Property (FL_p) generalizes (FH) (equivalently (T)) in many significant ways, with interesting applications to dynamics. For example, Navas showed that if G is a group with property (T) then every action of G on a circle which is at least C^{1+1/2 + \epsilon} factors through a finite group. Navas’s argument can be generalized straightforwardly to show that if G has (FL_p) for some p>2 then every action of G on a circle which is at least C^{1+1/p+\epsilon} factors through a finite group. The proof rests on a beautiful construction due to Reznikov (although a similar construction can be found in Pressley-Segal) of certain functions on a configuration space of the circle which are not in L^p but have coboundaries which are; this gives rise to nontrivial cohomology with L^p coefficients for groups acting on the circle in a sufficiently interesting way.

(Update: Nicolas Monod points out in an email that the “function on a configuration space” is morally just the derivative. In fact, he made the nice remark that if D is any elliptic operator on an n-manifold, then the commutator [D,g] is of Schatten class (n+1) whenever g is a sufficiently smooth function; morally this should give rise to nontrivial cohomology with suitable coefficients for groups acting with enough regularity on any given n-manifold, and one would like to use this e.g. to approach Zimmer’s conjecture, but nobody seems to know how to make this work as yet; in fact the work of Monod et al. on (FL_p) is at least partly motivated by this general picture.)

Mimura discussed a spectrum of rigid behaviour for infinite groups, ranging from most rigid (property (FL_p) for every p) to least rigid (amenable) (note: every finite group is both amenable and has property (T), so this only really makes sense for infinite groups; moreover, every reasonable measure of rigidity for infinite groups is usually invariant under passing to subgroups of finite index). Free groups, \text{SL}(2,\mathbb{Z}) and so on are very non-rigid. However, it is well-known that certain infinite families of (word) hyperbolic groups, including lattices in groups of isometries of quaternion-hyperbolic symmetric spaces, and “random” groups with relations having density parameter 1/3 < d < 1/2 (see Zuk or Ollivier) are both hyperbolic and have property (T). Nevertheless, these groups are not as rigid as higher rank lattices like \text{SL}(n,\mathbb{Z}) for n>2. The latter have property (FL_p) for every 1< p < \infty, whereas Yu showed that every hyperbolic group admits a proper affine isometric action on \ell^p for some p (the existence of a proper affine isometric action on a Hilbert space is called “a-T-menability” by Gromov, and the “Haagerup property” by some. Groups satisfying this property, or even Yu’s weaker property, are known to satisfy some version of the Baum-Connes conjecture, the subject of a very nice minicourse by Graham Niblo at the same conference).

It is in this context that one can appreciate Mimura’s results. His first main result is that the group \text{SL}_n(\mathbb{Z}[x_1,x_2,\cdots,x_k]) (i.e. the “universal lattice”) has property (FL_p) for every 1<p<\infty and every k, provided n is at least 4. Since property (FL_p) (like (T)) passes to quotients, this implies that \text{SL}_n(R) has (FL_p) for every unital, commutative, finitely generated ring R.

His second main result concerns a “quasification” of (FL_p), to a property called (FFL_p). Without getting too technical, this property concerns “quasi-actions” of a group on a Banach space by affine isometries; algebraically these are encoded by 1-cochains c:G \to B for which there is a universal constant D so that |c(gh) - c(g) -\rho(g)c(h)| < D as measured in the Banach norm on B. Any bounded map c:G \to B defines such a 1-cochain; such (bounded) 1-cochains correspond to quasi-actions with bounded orbits. Associated to \rho: G \to U(B) one defines in a similar way a complex of bounded cochains; quasi-actions modulo bounded quasi-actions are parameterized by the kernel of the comparison map H^2_b(G,\rho) \to H^2(G,\rho) from bounded to ordinary cohomology. Mimura’s second main result is that when G is the universal lattice as above, and \rho has no invariant vectors, the comparison map from bounded to ordinary cohomology in dimension 2 is injective.

The fact that \rho as above is required to have no invariant vectors is a technical necessity of Mimura’s proof. When \rho is trivial, one is studying “ordinary” bounded cohomology, and there is an exact sequence

0 \to H^1(G) \to Q(G) \to H^2_b(G) \to H^2(G)

with real coefficients for any G (here Q(G) denotes the vector space of homogeneous quasimorphisms on G). In this context, one knows by Bavard duality that H^2_b \to H^2 is injective if and only if the stable commutator length is identically zero on [G,G]. By quite a different method, Mimura shows that for n at least 6, and for any Euclidean ring R (i.e. a ring for which one has a Euclidean algorithm; for example, R = \mathbb{C}[x]) the group SL_n(R) has vanishing stable commutator length, and therefore one has injectivity of bounded to ordinary cohomology in dimension 2.
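For readers who have never met a quasimorphism, here is a small sketch of the standard example at the opposite (very non-rigid) end of the spectrum: Brooks’s counting function on the free group on a and b. This example is my addition and is not part of Mimura’s talk. Words are strings in a, b, A, B with capitals denoting inverses, and all function names are hypothetical. The counting function is not homogeneous; its homogenization, the limit of \phi(g^n)/n, is what lives in Q(G).

```python
import itertools


def inverse(word):
    """Inverse of a word: reverse it and invert each letter."""
    return word[::-1].swapcase()


def reduce_word(word):
    """Freely reduce a word by cancelling adjacent inverse pairs."""
    out = []
    for letter in word:
        if out and out[-1] == letter.swapcase():
            out.pop()
        else:
            out.append(letter)
    return "".join(out)


def count(sub, word):
    """Number of (possibly overlapping) occurrences of sub in word."""
    k = len(sub)
    return sum(1 for i in range(len(word) - k + 1) if word[i:i + k] == sub)


def brooks(w, g):
    """Copies of w minus copies of w^{-1} in the reduced form of g."""
    g = reduce_word(g)
    return count(w, g) - count(inverse(w), g)


def reduced_words(max_len):
    """All reduced words of length at most max_len."""
    for n in range(max_len + 1):
        for letters in itertools.product("aAbB", repeat=n):
            word = "".join(letters)
            if reduce_word(word) == word:
                yield word


def defect_on_sample(w, max_len):
    """Largest |phi(gh) - phi(g) - phi(h)| over short reduced words g, h."""
    words = list(reduced_words(max_len))
    return max(abs(brooks(w, g + h) - brooks(w, g) - brooks(w, h))
               for g in words for h in words)


if __name__ == "__main__":
    print(defect_on_sample("ab", 3))
```

The sampled defect is positive (so the function is not a homomorphism) but stays bounded; counting the copies of w that straddle the three junctions in a product gives the bound 3(|w|-1) in general.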

(Update 1/9/2010): Nicolas Monod sent me a nice email commenting on a couple of points in this blog entry, and I have consequently modified the language a bit in a few places. Ta much!

On page 10 of Besse’s famous book on Einstein manifolds one finds the following quote:

It would seem that Riemannian and Lorentzian geometry have much in common: canonical connections, geodesics, curvature tensor, etc. . . . But in fact this common part is only a common disposition at the onset: one soon enters different realms.

I will not dispute this. But it is not clear to me whether this divergence is a necessary consequence of the nature of the objects of study (in either case), or an artefact of the schism between mathematics and physics during much of the 20th century. In any case, in this blog post I have the narrow aim of describing some points of contact between Lorentzian (and more generally, causal) geometry and other geometries (hyperbolic, symplectic), which play a significant role in some of my research.

The first point of contact is the well-known duality between geodesics in the hyperbolic plane and points in the (projectivized) “anti de-Sitter plane”. Let \mathbb{R}^{2,1} denote a 3-dimensional vector space equipped with a quadratic form

q(x,y,z) = x^2 + y^2 - z^2

If we think of the set of rays through the origin as a copy of the real projective plane \mathbb{RP}^2, the hyperbolic plane is the set of projective classes of vectors v with q(v)<0, the (projectivized) anti de-Sitter plane is the set of projective classes of vectors v with q(v)>0, and their common boundary is the set of projective classes of (nonzero) vectors v with q(v)=0. Topologically, the hyperbolic plane is an open disk, the anti de-Sitter plane is an open Möbius band, and their boundary is the “ideal circle” (note: what people usually call the anti de-Sitter plane is actually the annulus double-covering this Möbius band; this is like the distinction between spherical geometry and elliptic geometry). Geometrically, the hyperbolic plane is a complete Riemannian surface of constant curvature -1, whereas the anti de-Sitter plane is a complete Lorentzian surface of constant curvature -1.

In this projective model, a hyperbolic geodesic \gamma is an open straight line segment which is compactified by adding an unordered pair of points in the ideal circle. The straight lines in the anti de-Sitter plane tangent to the ideal circle at these two points intersect at a point p_\gamma. Moreover, the set of geodesics \gamma in the hyperbolic plane passing through a point q are dual to the set of points p_\gamma in the anti de-Sitter plane that lie on a line which does not intersect the ideal circle. In the figure, three concurrent hyperbolic geodesics are dual to three colinear anti de-Sitter points.
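The duality is easy to experiment with numerically. In the sketch below (my own illustration, with hypothetical function names) the point dual to the geodesic with ideal endpoints u and v is computed as the vector q-orthogonal to both, which spans the intersection of the two tangent lines. I use the formula w = J(u \times v) with J = \text{diag}(1,1,-1); that formula is my choice of implementation. The check at the end is that geodesics through a common hyperbolic point x have dual points that are spacelike, q-orthogonal to x, and collinear.

```python
import math


def bilinear(u, v):
    """Bilinear form of q(x, y, z) = x^2 + y^2 - z^2."""
    return u[0] * v[0] + u[1] * v[1] - u[2] * v[2]


def q(v):
    return bilinear(v, v)


def region(v, tol=1e-12):
    """Which part of RP^2 the projective class of v lies in."""
    value = q(v)
    if value < -tol:
        return "hyperbolic plane"
    if value > tol:
        return "anti de-Sitter plane"
    return "ideal circle"


def dual_point(u, v):
    """Vector q-orthogonal to both u and v: J applied to the cross product."""
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return (cx, cy, -cz)


def ideal_endpoints(point, angle):
    """Ideal endpoints of the geodesic through point = (a, b) in the unit
    disk (the chart z = 1) in the direction given by angle."""
    a, b = point
    dx, dy = math.cos(angle), math.sin(angle)
    h = a * dx + b * dy
    disc = math.sqrt(h * h - (a * a + b * b - 1))
    return [(a + s * dx, b + s * dy, 1.0) for s in (-h - disc, -h + disc)]


def det3(r0, r1, r2):
    """Determinant of the 3x3 matrix with rows r0, r1, r2."""
    return (r0[0] * (r1[1] * r2[2] - r1[2] * r2[1])
            - r0[1] * (r1[0] * r2[2] - r1[2] * r2[0])
            + r0[2] * (r1[0] * r2[1] - r1[1] * r2[0]))


if __name__ == "__main__":
    x = (0.3, 0.2)
    duals = [dual_point(*ideal_endpoints(x, t)) for t in (0.1, 1.0, 2.5)]
    for w in duals:
        print(region(w), bilinear(w, (x[0], x[1], 1.0)))
    print("determinant of the three dual points:", det3(*duals))
```

The three dual points all lie on the line polar to x, which misses the ideal circle because the q-orthogonal complement of a timelike vector is positive definite.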

The anti de-Sitter geometry has a natural causal structure. There is a cone field whose extremal vectors at every point p are tangent to the straight lines through p that are also tangent to the ideal circle. A smooth curve is timelike if its tangent at every point is supported by this cone field, and spacelike if its tangent is everywhere not supported by the cone field. A timelike curve corresponds to a family of hyperbolic geodesics which locally intersect each other; a spacelike curve corresponds to a family of disjoint hyperbolic geodesics that foliate some region.

One can distinguish (locally) between future and past along a timelike trajectory, by (arbitrarily) identifying the “future” direction with a curve which winds positively around the ideal circle. The fact that one can distinguish in a consistent way between the positive and negative direction is equivalent to the existence of a nonzero section of timelike vectors. On the other hand, there does not exist a nonzero section of spacelike vectors, so one cannot distinguish in a consistent way between left and right (this is a manifestation of the non-orientability of the Möbius band).

The duality between the hyperbolic plane and the anti de-Sitter plane is a manifestation of the fact that (at least at the level of Lie algebras) they have the same (infinitesimal) symmetries. Let O(2,1) denote the group of real 3\times 3 matrices which preserve q; i.e. matrices A for which q(A(v)) = q(v) for all vectors v. This contains a subgroup SO^+(2,1) of index 4 which preserves the “positive sheet” of the hyperboloid q=-1, and acts on it in an orientation-preserving way. The hyperbolic plane is the homogeneous space for this group whose point stabilizers are a copy of SO(2) (which acts as an elliptic “rotation” of the tangent space to their common fixed point). The anti de-Sitter plane is the homogeneous space for this group whose point stabilizers are a copy of SO^+(1,1) (which acts as a hyperbolic “translation” of the geodesic in hyperbolic space dual to the given point in anti de-Sitter space). The ideal circle is the homogeneous space whose point stabilizers are a copy of the affine group of the line. The hyperbolic plane admits a natural Riemannian metric, and the anti de-Sitter plane a Lorentz metric, which are invariant under these group actions. The causal structure on the anti de-Sitter plane limits to a causal structure on the ideal circle.

Now consider the 4-dimensional vector space \mathbb{R}^{2,2} and the quadratic form q(v) = x^2 + y^2 - z^2 - w^2. The (3-dimensional) sheets q=1 and q=-1 both admit homogeneous Lorentz metrics whose point stabilizers are copies of SO^+(1,2) and SO^+(2,1) (which are isomorphic but sit in SO(2,2) in different ways). These 3-manifolds are compactified by adding the projectivization of the cone q=0. Topologically, this is a Clifford torus in \mathbb{RP}^3 dividing this space into two open solid tori which can be thought of as two Lorentz 3-manifolds. The causal structure on the pair of Lorentz manifolds limits to a pair of complementary causal structures on the Clifford torus. (edited 12/10)

Let’s go one dimension higher, to the 5-dimensional vector space \mathbb{R}^{2,3} and the quadratic form q(v) = x^2 + y^2 - u^2 - z^2 - w^2. Now only the sheet q=1 is a Lorentz manifold, whose point stabilizers are copies of SO^+(1,3), with an associated causal structure. The projectivized cone q=0 is a non-orientable twisted S^2 bundle over the circle, and it inherits a causal structure in which the sphere factors are spacelike, and the circle direction is timelike. This ideal boundary can be thought of in quite a different way, because of the exceptional isomorphism at the level of (real) Lie algebras so(2,3)= sp(4), where sp(4) denotes the Lie algebra of the symplectic group in dimension 4. In this manifestation, the ideal boundary is usually denoted \mathcal{L}_2, and can be thought of as the space of Lagrangian planes in \mathbb{R}^4 with its usual symplectic form. One way to see this is as follows. The wedge product is a symmetric bilinear form on \Lambda^2 \mathbb{R}^4 with values in \Lambda^4 \mathbb{R}^4 = \mathbb{R}. The associated quadratic form vanishes precisely on the “pure” 2-forms — i.e. those associated to planes. The condition that the wedge of a given 2-form with the symplectic form vanishes imposes a further linear condition. So the space of Lagrangian 2-planes is a quadric in \mathbb{RP}^4, and one may verify that the signature of the underlying quadratic form is (2,3). The causal structure manifests in symplectic geometry in the following way. A choice of a Lagrangian plane \pi lets us identify symplectic \mathbb{R}^4 with the cotangent bundle T^*\pi. To each symmetric homogeneous quadratic form q on \pi (thought of as a smooth function) is associated a linear Lagrangian subspace of T^*\pi, namely the (linear) section dq. Every Lagrangian subspace transverse to the fiber over 0 is of this form, so this gives a parameterization of an open, dense subset of \mathcal{L}_2 containing the point \pi. 
The set of positive definite quadratic forms is tangent to an open cone in T_\pi \mathcal{L}_2; the field of such cones as \pi varies defines a causal structure on \mathcal{L}_2 which agrees with the causal structure defined above.
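For those who want to “verify that the signature is (2,3)” by hand, here is one way to do it, as a sketch. The basis e_1,\dots,e_4, the normal form \omega = e_1\wedge e_2 + e_3 \wedge e_4 for the symplectic form, and the coordinates a_{ij}, s, t, u, v are my choices for the computation.

```latex
\begin{align*}
\alpha &= \sum_{i<j} a_{ij}\, e_i \wedge e_j, \qquad
\omega = e_1 \wedge e_2 + e_3 \wedge e_4, \\
\alpha \wedge \alpha &= 2\,(a_{12}a_{34} - a_{13}a_{24} + a_{14}a_{23})\;
  e_1 \wedge e_2 \wedge e_3 \wedge e_4, \\
\alpha \wedge \omega &= (a_{12} + a_{34})\;
  e_1 \wedge e_2 \wedge e_3 \wedge e_4 .
\end{align*}
% On the hyperplane a_{34} = -a_{12}, substitute
% a_{13} = s + t, a_{24} = s - t, a_{14} = u + v, a_{23} = u - v:
\begin{equation*}
\tfrac{1}{2}\,\alpha \wedge \alpha
  \;=\; -a_{12}^2 - s^2 + t^2 + u^2 - v^2 .
\end{equation*}
```

So on the 5-dimensional space cut out by \alpha \wedge \omega = 0 the form has two positive and three negative squares, i.e. signature (2,3); choosing the opposite orientation of \Lambda^4 \mathbb{R}^4 exchanges the two numbers.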

These examples can be generalized to higher dimension, via the orthogonal groups SO(n,2) or the symplectic groups Sp(2n,\mathbb{R}). As well as two other infinite families (which I will not discuss) there is a beautiful “sporadic” example, connected to what Freudenthal called octonion symplectic geometry associated to the noncompact real form E_7(-25) of the exceptional Lie group, where the ideal boundary S^1\times E_6/F_4 has an invariant causal structure whose timelike curves wind around the S^1 factor; see e.g. Clerc-Neeb for a more thorough discussion of the theory of Shilov boundaries from the causal geometry point of view, or see here or here for a discussion of the relationship between the octonions and the exceptional Lie groups.

The causal structure on these ideal boundaries gives rise to certain natural 2-cocycles on their groups of automorphisms. Note in each case that the ideal boundary has the topological structure of a bundle over S^1 with spacelike fibers. Thus each closed timelike curve has a well-defined winding number, which is just the number of times it intersects any one of these spacelike slices. Let C be an ideal boundary as above, and let \tilde{C} denote the cyclic cover dual to a spacelike slice. If p is a point in \tilde{C}, we let p+n denote the image of p under the nth power of the generator of the deck group of the covering. If g is a homeomorphism of C preserving the causal structure, we can lift g to a homeomorphism \tilde{g} of \tilde{C}. For any such lift, define the rotation number of \tilde{g} as follows: for any point p \in \tilde{C} and any integer n, let r_n be the smallest integer for which there is a causal curve from p to \tilde{g}^n(p) to p+r_n, and then define rot(\tilde{g}) = \lim_{n \to \infty} r_n/n. This function is a quasimorphism on the group of causal automorphisms of \tilde{C}, with defect equal to the least integer n such that any two points p,q in C are contained in a closed causal loop with winding number n. In the case of the symplectic group Sp(2n,\mathbb{R}) with causal boundary \mathcal{L}_n, the defect is n, and the rotation number is (sometimes) called the symplectic rotation number; it is a quasimorphism on the universal central extension of Sp(2n,\mathbb{R}), whose coboundary descends to the Maslov class (an element of 2-dimensional bounded cohomology) on the symplectic group.
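
In the simplest case the definition of the rotation number can be made completely concrete. The sketch below assumes that C is a circle, so that \tilde{C} is the real line, the deck group acts by integer translations, and a causal curve is just a nondecreasing path. Under that assumption, and reading the definition with the nth iterate of \tilde{g}, r_n is the least integer with p + r_n \ge \tilde{g}^n(p), and rot is the classical Poincaré rotation number. The family of lifts x \mapsto x + a + (b/2\pi)\sin(2\pi x) and all function names are chosen purely for illustration.

```python
import math


def make_lift(a, b):
    """Lift to R of a circle homeomorphism: x -> x + a + (b / 2 pi) sin(2 pi x).

    The condition |b| < 1 makes the map strictly increasing, and it commutes
    with x -> x + 1, so it is a lift of an orientation-preserving homeomorphism.
    """
    if abs(b) >= 1:
        raise ValueError("need |b| < 1 for the map to be a homeomorphism")

    def g(x):
        return x + a + b * math.sin(2 * math.pi * x) / (2 * math.pi)

    return g


def r_n(g, p, n):
    """Least integer r such that p + r >= g^n(p), where g^n is the nth iterate."""
    x = p
    for _ in range(n):
        x = g(x)
    return math.ceil(x - p)


def rotation_number(g, p=0.0, n=10000):
    """Finite-n approximation r_n / n to the rotation number of the lift g."""
    return r_n(g, p, n) / n


if __name__ == "__main__":
    g = make_lift(0.3, 0.5)
    for p in (0.0, 0.25, 0.7):
        print(p, rotation_number(g, p))
```

Since the lift is monotone and commutes with integer translations, changing the base point p changes r_n by at most 1, so the finite-n approximations for different p agree up to an error of order 1/n. This is the one-dimensional shadow of the statement that the limit does not depend on p.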

Causal structures in groups of symplectomorphisms or contactomorphisms are intensely studied; see for instance this paper by Eliashberg-Polterovich.

The other day at lunch, one of my colleagues — let’s call her “Wendy Hilton” to preserve her anonymity (OK, this is pretty bad, but perhaps not quite as bad as Clive James’s use of “Romaine Rand” as a pseudonym for “Germaine Greer” in Unreliable Memoirs . . .) — expressed some skepticism about a somewhat unusual assertion that I make at the start of my scl monograph. Since it is my monograph, I feel free to quote the offending paragraphs:

It is unfortunate in some ways that the standard way to refer to the plane emphasizes its product structure. This product structure is topologically unnatural, since it is defined in a way which breaks the natural topological symmetries of the object in question. This fact is thrown more sharply into focus when one discusses more rigid topologies.

At this point I give an example, namely that of the Zariski topology, pointing out that the product topology of two copies of the affine line with the Zariski topology is not the same as the Zariski topology on the affine plane. All well and good. I then go on to claim that part of the bias is biological in origin, citing the following example as evidence:

Example 1.2 (Primary visual cortex). The primary visual cortex of mammals (including humans), located at the posterior pole of the occipital cortex, contains neurons hardwired to fire when exposed to certain spatial and temporal patterns. Certain specific neurons are sensitive to stimulus along specific orientations, but in primates, more cortical machinery is devoted to representing vertical and horizontal than oblique orientations (see for example [58] for a discussion of this effect).

(Note: [58] is a reference to the paper “The distribution of oriented contours in the real world” by David Coppola, Harriett Purves, Allison McCoy, and Dale Purves, Proc. Natl. Acad. Sci. USA 95 (1998), no. 7, 4002–4006)

I think Wendy took this to be some kind of poetic license or conceit, and perhaps even felt that it was a bit out of place in a serious research monograph. On balance, I think I agree that it comes across as somewhat jarring and unexpected to the reader, and the tone and focus is somewhat inconsistent with that of the rest of the book. But I also think that in certain subjects in mathematics — and I would put low-dimensional geometry/topology in this category — we are often not aware of the extent to which our patterns of reasoning and imagination are shaped, limited, or (mis)directed by our psychological — and especially psychophysical — natures.

The particular question of how the mind conceives of, imagines, or perceives any mathematical object is complicated and multidimensional, and colored by historical, social, and psychological (not to mention mathematical) forces. It is generally a vain endeavor to find precise physical correlates of complicated mental objects, but in the case of the plane (or at least one cognitive surrogate, the subjective visual field) there is a natural candidate for such a correlate. Cells on the rear of the occipital lobe are arranged in a “map” in the region of the occipital lobe known as the “primary visual cortex”, or V1. There is a precise geometric relationship between the location of neurons in V1 and the points in the subjective visual field they correspond to. Further visual processing is done by other areas V2, V3, V4, V5 of the visual cortex. Information is fed forward from Vi to Vj with j>i, but also backward from Vj to Vi regions, so that visual information is processed at several levels of abstraction simultaneously, and the results of this processing compared and refined in a complicated synthesis (this tends to make me think of the parallel terraced scan model of analogical reasoning put forward by Douglas Hofstadter and Melanie Mitchell; see Fluid concepts and creative analogies, Chapter 5).

The initial processing done by the V1 area is quite low-level; individual neurons are sensitive to certain kinds of stimuli, e.g. color, spatial periodicity (on various scales), motion, orientation, etc. As remarked earlier, more neurons are devoted to detecting horizontally or vertically aligned stimuli; in other words, our brains literally devote more hardware to perceiving or imagining vertical and horizontal lines than to lines with an oblique orientation. This is not to say that at some higher, more integrated level, our perception is not sensitive to other symmetries that our hardware does not respect, just as a random walk on a square lattice in the plane converges (after a parabolic rescaling) to Brownian motion (which is not just rotationally but conformally invariant). However the fact is that humans perform statistically better on cognitive tasks that involve the perception of figures that are aligned along the horizontal and vertical axes, than on similar tasks that differ only by a rotation of the figures.

It is perhaps interesting therefore that the earliest (?) mathematical conception of the plane, due to the Greeks, did not give a privileged place to the horizontal or vertical directions, but treated all orientations on an equal footing. In other words, in Greek (Euclidean) geometry, the definitions respect the underlying symmetries of the objects. Of course, from our modern perspective we would not say that the Greeks gave a definition of the plane at all, or at best, that the definition is woefully inadequate. According to one well-known translation, the plane is introduced as a special kind of surface as follows:

A surface is that which has length and breadth.

When a surface is such that the right line joining any two arbitrary points in it lies wholly in the surface, it is called a plane.

This definition of a surface looks as though it is introducing coordinates, but in fact one might just as well interpret it as defining a surface in terms of its dimension; having defined a surface (presumably thought of as being contained in some ambient undefined three-dimensional space) one defines a plane to be a certain kind of surface, namely one that is convex. Horizontal and vertical axes are never introduced. Perpendicularity is singled out as important, but the perpendicularity of two lines is a relative notion, whereas horizontality and verticality are absolute. In the end, Euclidean geometry is defined implicitly by its properties, most importantly isotropy (i.e. all right angles are equal to one another) and the parallel postulate, which singles it out from among several alternatives (elliptic geometry, hyperbolic geometry). In my opinion, Euclidean geometry is imprecise but natural (in the sense of category theory), because objects are defined in terms of the natural transformations they admit, and in a way that respects their underlying symmetries.

In the 15th century, the Italian artists of the Renaissance developed the precise geometric method of perspective painting (although the technique of representing more distant objects by smaller figures is extremely ancient). Its invention is typically credited to the architect and engineer Filippo Brunelleschi; one may speculate that the demands of architecture (i.e. the representation of precise 3-dimensional geometric objects in 2-dimensional diagrams) were one of the stimuli that led to this invention (perhaps this suggestion is anachronistic?). Mathematically, this gives rise to the geometry of the projective plane, i.e. the space of lines through the origin (the “eye” of the viewer of a scene). In principle, one could develop projective geometry without introducing “special” directions or families of lines. However, in one, two, or three point perspective, families of lines parallel to one or several “special” coordinate axes (along which significant objects in the painting are aligned) appear to converge to one of the vanishing points of the painting. In his treatise “De pictura” (on painting), Leon Battista Alberti (a friend of Brunelleschi) explicitly described the geometry of vision in terms of projections on to a (visual) plane. Amusingly (in the context of this blog post), he explicitly distinguishes between the mathematical and the visual plane:

In all this discussion, I beg you to consider me not as a mathematician but as a painter writing of these things.

Mathematicians measure with their minds alone the forms of things separated from all matter. Since we wish the object to be seen, we will use a more sensate wisdom.

I beg to differ: similar parts of the brain are used for imagining a triangle and for looking at a painting. Alberti’s claim sounds a bit too much like Gould’s “non-overlapping magisteria”, and in a way it is disheartening that it was made at a place and point in history at which mathematics and the visual arts were perhaps at their closest.

In the 17th century René Descartes introduced his coordinate system and thereby invented “analytic geometry”. To us it might not seem like such a big leap to go from a checkerboard floor in a perspective painting (or a grid of squares to break up the visual field) to the introduction of numerical coordinates to specify a geometrical figure, but Descartes’s ideas for the first time allowed mathematicians to prove theorems in geometry by algebraic methods. Analytic geometry is contrasted with “synthetic geometry”, in which theorems are deduced logically from primitive axioms and rules of inference. In some abstract sense, this is not a clear distinction, since algebra and analysis also rest on primitive axioms and rules of deduction. In my opinion, this terminology reflects a psychological distinction between “analytic methods” in which one computes blindly and then thinks about what the results of the calculation mean afterwards, and “synthetic methods” in which one has a mental model of the objects one is manipulating, and directly intuits the “meaning” of the operations one performs. Philosophically speaking, the first is formal, the second is platonic. Biologically speaking, the first does not make use of the primary visual cortex, the second does.

As significant as Descartes’s ideas were, mathematicians were slow to take real advantage of them. Complex numbers were invented by Cardano in the mid-16th century, but the idea of representing complex numbers geometrically, by taking the real and imaginary parts as Cartesian coordinates, had to wait until Argand in the early 19th.

Incidentally, I have heard it said that the Greeks did not introduce coordinates because they drew their figures on the ground and looked at them from all sides, whereas Descartes and his contemporaries drew figures in books. Whether this has any truth to it or not, I do sometimes find it useful to rotate a mathematical figure I am looking at, in order to stimulate my imagination.

After Poincaré’s invention of topology in the late 19th century, there was a new kind of model of the plane to be (re)imagined, namely the plane as a topological space. One of the most interesting characterizations was obtained by the brilliantly original and idiosyncratic R. L. Moore in his paper, “On the foundations of plane analysis situs”. Let me first remark that the line can be characterized topologically in terms of its natural order structure; one might argue that this characterization more properly determines the oriented line, and this is a fair comment, but at least the object has been determined up to a finite ambiguity. Let me second of all remark that the characterization of the line in terms of order structures is useful; a (countable) group G is abstractly isomorphic to a group of (orientation-preserving) homeomorphisms of the line if and only if G admits an (abstract) left-invariant order.

Given points and the line, Moore proceeds to list a collection of axioms which serve to characterize the plane amongst topological spaces. The axioms are expressed in terms of separation properties of primitive undefined terms called points and regions (which correspond more or less to ordinary points and open sets homeomorphic to the interiors of closed disks respectively) and non-primitive objects called “simple closed curves” which are (eventually) defined in terms of simpler objects. Moore’s axioms are “natural” in the sense that they do not introduce new, unnecessary, unnatural structure (such as coordinates, a metric, special families of “straight” lines, etc.). The basic principle on which Moore’s axioms rest is that of separation — which continua separate which points from which others? If there is a psychophysical correlate of this mathematical intuition, perhaps it might be the proliferation of certain neurons in the primary visual cortex which are edge detectors — they are sensitive, not to absolute intensity, but to a spatial discontinuity in the intensity (associated with the “edge” of an object). The visual world is full of objects, and our eyes evolved to detect them, and to distinguish them from their surroundings (to distinguish figure from ground as it were). If I have an objection to Cartesian coordinates on biological grounds (I don’t, but for the sake of argument let’s suppose I do) then perhaps Moore should also be disqualified for similar reasons. Or rather, perhaps it is worth being explicitly aware, when we make use of a particular mathematical model or intellectual apparatus, of which aspects of it are necessary or useful because of their (abstract) applications to mathematics, and which are necessary or useful because we are built in such a way as to need or to be able to use them.
