Last week I was at Oberwolfach for a meeting on geometric group theory. My friend and collaborator Koji Fujiwara gave a very nice talk about constructing actions of groups on quasi-trees (i.e. spaces quasi-isometric to trees). The construction is inspired by the famous subsurface projection construction, due to Masur-Minsky, which was a key step in their proof that the complex of curves (a natural simplicial complex on which the mapping class group acts cocompactly) is hyperbolic. Koji’s talk was very stimulating, and shook up my thinking about a few related matters; the purpose of this blog post is therefore for me to put some of my thoughts in order: to describe the Masur-Minsky construction, to point out a connection to certain geometric phenomena like winding numbers of curves on surfaces, and to note that a variation on their construction gives rise directly to certain natural chiral invariants of surface automorphisms (and their generalizations) which should be relevant to 4-manifold topologists.

The basic object of study in Masur-Minsky is the complex of curves. This object is a kind of nonlinear analogue (in the context of surface topology) of a Bruhat-Tits building, and was introduced by Harvey in the late 70s; but it was not until the mid-90s that the first interesting theorems about it began to be proved. We fix a surface S and define a simplicial complex C(S), called the complex of curves, as follows: an n-simplex in C(S) is a collection of (n+1) isotopy classes of essential embedded curves in S which are pairwise disjoint (this means that any two classes of curves in the collection admit disjoint representatives; it is an elementary but important fact about 2-dimensional topology that this implies that there is a collection of representatives of all the classes which are simultaneously pairwise disjoint). The mapping class group of S, denoted Mod(S), is the group of isotopy classes of (orientation-preserving) self-homeomorphisms of S. Evidently Mod(S) acts on C(S) by automorphisms. Furthermore, the action is cocompact (i.e. the quotient of C(S) by the action of Mod(S) is compact); however, the action is not proper. One way to see this is to think about the stabilizer of a simplex in C(S). A simplex denotes a collection of disjoint curves (up to isotopy); at the very least, the Dehn twists in these curves form a commuting family of mapping classes, each of which preserves the collection elementwise. So the stabilizer of an n-simplex contains at least a free abelian group of rank (n+1).

Actually, this lack of properness turns out to be a virtue. If G is a (finitely generated) group, and X is a path metric space on which it acts properly and cocompactly by isometries, then G (in its word metric with respect to any finite generating set) is quasi-isometric to X. The mapping class group is not (for most surfaces S) a hyperbolic group, because it contains lots of free abelian subgroups of high rank (depending on the topology of S), so it can’t be quasi-isometric to a hyperbolic metric space. But because the action of Mod(S) on C(S) is not proper, it leaves open the possibility that C(S) is hyperbolic. This is exactly what Masur-Minsky prove.

The key tool is an operation called subsurface projection. Let X be an essential subsurface of S. Define the subsurface projection \pi_X: C(S) \to C(X) as follows. For each vertex \alpha \in C(S) (i.e. for each isotopy class of essential simple closed curve \alpha \subset S) first isotope \alpha so that it intersects X in as few components as possible. The definition of the projection falls into several cases:

  • if \alpha \subset X then set \pi_X(\alpha) = \alpha;
  • if \alpha intersects \partial X, then take the boundary of a tubular neighborhood of (\alpha \cap X) \cup \partial X and throw away the inessential or peripheral components, and set \pi_X(\alpha) equal to the rest; and
  • if \alpha \cap X = \emptyset then define \pi_X(\alpha) = \emptyset.

In order for this definition to not be vacuous, we should insist that X has some curves which are essential and non boundary parallel (so that X should be neither an annulus nor a pair of pants). But actually, it is very interesting and important to extend the definition to the case that X is an annulus. If X is an annulus, Mod(X) is just the integers, generated by a Dehn twist in the core circle. One can give a modified definition of C(X) by picking “base points” on the two boundary components, letting the vertices be isotopy classes of proper arcs from one base point to the other, and joining two vertices by an edge if the corresponding isotopy classes are disjoint (except at their endpoints, of course). With this definition, C(X) is just a copy of the real line with vertices at the integers, and Mod(X) acts by (integer) translation. If X is a pair of pants, any mapping class in Mod(X) is a product of Dehn twists in the boundary curves; so these are superfluous from the point of view of understanding subsurface projection. Now, if X is an annulus in S, and \alpha is an essential simple closed curve in S, we can choose a hyperbolic structure on S and let S_X be the cover corresponding to the annulus X, and let \tilde{\alpha} be the collection of preimages of \alpha in S_X. These preimages are disjoint, so if one of them crosses the lift of X to S_X it determines (up to bounded ambiguity) an essential arc in X and therefore an element of C(X).

Now we can describe how subsurface projection is used to control the geometry of C(S). Let's suppose we choose a family of subsurfaces X_i which are permuted by the action of Mod(S). For each index i there is a subsurface projection \pi_i:C(S) \to C(X_i); putting these together we get a map from C(S) to the product of the C(X_i) (actually, this is not quite true, since for any curve \alpha on S, the image of the projection of \alpha is empty in C(X) whenever \alpha is disjoint from X; what is better behaved is the diameter of the projection of a subset of C(S) to each C(X_i)). Now, for any two simplices \Delta,\Delta' in C(S) we can take their projection to each C(X_i). Suppose (e.g. by induction) that we understand distance in each C(X_i) better than we do in C(S). Then we can define a sort of “distance” in C(S) between \Delta and \Delta' by adding up the distances between the projections to each C(X_i). The problem is that each projection involves some choice, and each choice involves a bounded but nonzero amount of ambiguity; a priori we might imagine that these infinitely many bounded choices might add up to a globally uncontrollable ambiguity. Masur-Minsky’s ingenious solution is simply to impose a cutoff: they pick a sufficiently big number K, ignore all subsurfaces in which the projections have distance less than K, and add up the distances in the remaining projections. If the family of surfaces X_i is rich enough, the resulting metric is quasi-isometric to the path metric in C(S). If the C(X_i) are hyperbolic, then so is C(S). qed.
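To make the cutoff concrete, here is a minimal sketch in Python. The subsurface labels and the projection distances are made up for illustration (they are not computed from any actual surface), and the function names are my own; the only thing taken from the discussion above is the rule "ignore projections of distance less than K, add up the rest".

```python
def truncate(x, K):
    """Cutoff: keep the distance x only if it is at least K, else count 0."""
    return x if x >= K else 0


def cutoff_distance(proj_distances, K):
    """Sum of the projection distances that survive the cutoff K.

    proj_distances maps a (hypothetical) subsurface label to the distance
    between the two projections in that subsurface's curve complex.
    """
    return sum(truncate(d, K) for d in proj_distances.values())


# Toy data, invented for illustration: distances between the projections
# of two simplices to four subsurfaces X1, ..., X4.
toy = {"X1": 2, "X2": 11, "X3": 0, "X4": 7}

print(cutoff_distance(toy, K=5))   # X2 and X4 survive the cutoff
print(cutoff_distance(toy, K=10))  # only X2 survives
```

The point of the cutoff is visible even in the toy data: small projections (which are where the bounded ambiguities live) are discarded wholesale, so only finitely many terms can contribute.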

Even if the family X_i is not “rich enough”, it is still the case that this construction defines a sort of (approximate) quotient metric on C(S), implicitly defining a hyperbolic space on which Mod(S) acts in an interesting way. If we tailor our choice of surfaces X_i suitably, we can make this action sensitive to certain kinds of information in Mod(S) and not others. What is maybe not obvious is that there are nontrivial variations on this construction that can be obtained by choosing certain kinds of metrics on the C(X_i) — for instance, by choosing asymmetric metrics.

Such asymmetric metrics in turn can be chosen to be sensitive to chiral information about mapping class groups. The simplest kind of chiral distinction one can make is the distinction between a left-handed and a right-handed Dehn twist. The handedness of a Dehn twist is defined by standing on one boundary component of the annulus supporting the twist, looking inwards, and seeing whether the image of the co-cores (under the twist) twist to the left, or to the right before getting to the other component. Importantly, this does not depend on choosing a side of the annulus to stand on (or, equivalently, an orientation of the curve); although it does depend on a choice of orientation on the surface. More generally, if X is a surface with boundary, an element of Mod(X) is left-veering (resp. right-veering) if, for any point x on the boundary, and any proper arc \alpha emanating from x, the image of \alpha under the mapping class twists to the left at x relative to \alpha (since we are only concerned with isotopy classes of arcs, we mean “the representative which intersects \alpha minimally”). The left-veering elements of Mod(X) form a cone (i.e. the product of two left-veering elements is left-veering) and similarly for the right-veering elements; moreover, every conjugate of a left-veering element is left-veering. But it is not the case that every nontrivial element is veering one way or the other. The existence of these invariant cones allows one to construct interesting asymmetric metrics on Mod(X) as follows: first choose some (symmetric) generating set for Mod(X). Next, pick some big integer K, and add to the generating set all right-veering elements of word length at most K. The resulting (asymmetric!) generating set defines a new notion of word length and asymmetric “distance” on Mod(X) in which it is harder in some sense to turn left than right. 
In the case that X is an annulus, so that Mod(X) is just the integers, this is the asymmetric metric in which the distance from N to M is equal to M-N if M is bigger than N, and (N-M)/K if N is bigger than M. Since C(X) is also just a copy of the integers when X is an annulus, we can similarly define canonical asymmetric metrics on such C(X). Now define an (asymmetric) metric on C(S) by choosing some Mod(S)-invariant collection of annuli X_i, and for each one define (signed) distance to be the sum of the signed distance of the projections, truncated so that we ignore all projections whose signed distance is sufficiently small (in absolute value).
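In the annulus case this can be checked by brute force. The following is a minimal sketch, assuming (as a sign convention of my own) that the right-veering elements of the integers are the negative ones, so that the asymmetric generating set is {+1} together with {-1, ..., -K}. The function names are hypothetical. Note that the honest word length in the cheap direction is the ceiling of (N-M)/K, which agrees with the formula above up to rounding.

```python
from collections import deque
from math import ceil


def asym_dist(n, m, K):
    """Formula from the text: cost m-n to go up, (n-m)/K to go down."""
    return (m - n) if m >= n else (n - m) / K


def word_length(t, K, bound=200):
    """Word length of t in Z for the generating set {+1} U {-1, ..., -K}.

    Assumption: "right-veering" means negative. Computed by breadth-first
    search inside the window [-bound, bound].
    """
    gens = [1] + [-j for j in range(1, K + 1)]
    dist = {0: 0}
    queue = deque([0])
    while queue:
        x = queue.popleft()
        if x == t:
            return dist[x]
        for g in gens:
            y = x + g
            if abs(y) <= bound and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    raise ValueError("target lies outside the search window")


K = 3
for t in (7, -7):
    # Compare the BFS word length with the closed formula (rounded up).
    print(t, word_length(t, K), ceil(asym_dist(0, t, K)))
```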

What is the point of building an asymmetric metric? The key point is that asymmetric metrics on hyperbolic groups (or groups acting on hyperbolic spaces) give rise to quasimorphisms, by antisymmetrizing. If G is a group, a function \phi:G \to \mathbb{R} is said to be a quasimorphism if there is some least non-negative real number D(\phi) (called the defect) so that for all g,h \in G there is an inequality |\phi(gh) - \phi(g) - \phi(h)| \le D(\phi). A quasimorphism is further said to be homogeneous if \phi(g^n) = n\phi(g) for all g \in G and all integers n. If \phi is any quasimorphism, it may be homogenized by defining \overline{\phi}(g): = \lim_{n \to \infty} \phi(g^n)/n. Homogenization might increase the defect by (at most) a factor of 2, but it takes quasimorphisms to homogeneous quasimorphisms and furthermore has the property that \phi - \overline{\phi} is bounded. The (real vector) space of homogeneous quasimorphisms on a group G is denoted Q(G); it contains as a subspace the homomorphisms to \mathbb{R} (i.e. H^1(G,\mathbb{R})), which is precisely the subspace on which the defect vanishes. Thus, the defect becomes a norm on Q(G)/H^1(G), and in fact this quotient space is a Banach space. Now, if G is a hyperbolic group, and d is an asymmetric (path) metric, then we can define a quasimorphism \phi_d(g) = d(1,g) - d(1,g^{-1}). It turns out that many classical constructions of quasimorphisms on hyperbolic groups, and groups acting on hyperbolic spaces, are exactly of this kind; for instance the so-called counting quasimorphisms invented by Rhemtulla and rediscovered by Brooks, and the quasimorphisms of Epstein-Fujiwara. Antisymmetrizing asymmetric subsurface projection metrics on mapping class groups gives rise to chiral quasimorphisms which are sensitive to the difference between left and right twisting.
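For readers who like to experiment, here is a minimal sketch of a counting quasimorphism on the free group on two generators, with a sampled estimate of the defect and a finite approximation to the homogenization. The encoding is my own assumption: words are strings in which a, b are the generators and A, B their inverses. The count used is the "big" one (overlapping occurrences allowed), for which the defect is at most 3(|w|-1); the sampled defect below is only a lower bound for the true defect, and `homogenize` only approximates the limit by a fixed power n.

```python
import random


def inverse(word):
    """Inverse of a word: reverse it and swap the case of each letter."""
    return word[::-1].swapcase()


def reduce_word(word):
    """Freely reduce by cancelling adjacent pairs such as 'aA' or 'Bb'."""
    out = []
    for ch in word:
        if out and out[-1] == ch.swapcase():
            out.pop()
        else:
            out.append(ch)
    return "".join(out)


def count(sub, word):
    """Number of (possibly overlapping) occurrences of sub in word."""
    return sum(word.startswith(sub, i) for i in range(len(word) - len(sub) + 1))


def counting_qm(w):
    """phi_w(g) = #(copies of w in g) - #(copies of w^{-1} in g)."""
    w_inv = inverse(w)

    def phi(g):
        g = reduce_word(g)
        return count(w, g) - count(w_inv, g)

    return phi


def sampled_defect(phi, samples):
    """Largest |phi(gh) - phi(g) - phi(h)| over the given pairs (g, h)."""
    return max(abs(phi(g + h) - phi(g) - phi(h)) for g, h in samples)


def homogenize(phi, g, n=50):
    """Finite approximation phi(g^n)/n to the homogenization of phi at g."""
    return phi(g * n) / n


random.seed(0)
phi = counting_qm("ab")
samples = [
    ("".join(random.choice("abAB") for _ in range(12)),
     "".join(random.choice("abAB") for _ in range(12)))
    for _ in range(500)
]
print(sampled_defect(phi, samples))
print(phi("abA"), homogenize(phi, "abA"))
```

The last line illustrates why one homogenizes: the word abA is conjugate to b, so a homogeneous quasimorphism (being a class function) must give it the same value as b, whereas the raw count does not.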

(As an aside, it is interesting to remark that in another talk at the same conference by Zlil Sela, he (Zlil) talked about the problem of solving systems of equations in free semigroups, which he proposed to solve by analogy with his solution for systems of equations in free groups, using the fact that a free semigroup embeds into a free group on the same generators. Many steps of the argument are very similar to the group case. One interesting intermediate step involves analyzing the JSJ decompositions of certain intermediate objects; the subgroups in this decomposition that cause the most trouble are the surface subgroups. In the free group case, one needs to factor out by the action of the mapping class group to get suitable finiteness. In the semigroup case, the surfaces are “decorated” by certain directed structures, coming from the irreversibility of multiplications in semigroups (versus groups); I wondered to Zlil whether the mapping classes preserving such structure would (at least in certain cases?) have something in common with the cones of right-veering (or left-veering) mapping classes described above).

Quasimorphisms on mapping class groups sensitive to chirality should be important for understanding the symplectic geometry of 4-manifolds, especially of surface bundles over surfaces and their cousins. One such chiral invariant (probably the most important) is the signature (for a closed, oriented 4-manifold, the intersection form on middle-dimensional homology is symmetric and nondegenerate, and the signature of the 4-manifold is the signature of this form). Actually, certain connections between signatures and quasimorphisms are reasonably well-known; Wall non-additivity for signatures (i.e. the phenomenon that if 4-manifolds Y^+ and Y^- are glued along subsets of their boundaries to produce Y also with boundary, then sign(Y) = sign(Y^+) + sign(Y^-) - correction term) can be understood as measuring the failure of certain quasimorphisms to be honest homomorphisms; here the correction term is a Maslov triple index, associated to the symplectic rotation quasimorphism on the universal central extension of the symplectic group. Some of this story is well summarized by Gambaudo-Ghys. But as far as I know, the direct construction of quasimorphisms from chirally asymmetric metrics is new, and unexplored; it would be very interesting to see how much 4-manifold topology it can see.

Actually, if I can speculate, it seems to me that the machinery of subsurface projections is well suited to analyzing general symplectic 4-manifolds. Donaldson famously proved that every symplectic manifold admits the structure of a Lefschetz pencil: roughly, a surface bundle outside some (controlled) singular set of codimension 4. In 4 dimensions this gives the symplectic manifold the structure of a surface bundle over a surface outside finitely many “singular fibers” which look like surfaces pinched to a point along some embedded cycle, and the monodromy around each such singular fiber is a positive Dehn twist in the pinched cycle. The fibers are Poincaré dual to multiples of the cohomology class of the symplectic form (let’s assume we have perturbed and scaled it to be integral), and such fibrations exist for all sufficiently big multiples. Moreover, these fibrations are unique (up to isotopy) when the multiple is big enough. So symplectic 4-manifolds are not quite surface bundles. But away from the singular fibers they are, and the monodromy, or at least its projections to subsurfaces avoiding the vanishing cycles, is coarsely well-defined.

I wanted to end this post by pointing out that the “subsurface projection plus cutoff” trick can actually already be seen in a few other geometric contexts in which the connection to quasimorphisms is already explicitly known. One example concerns the random turtles in the hyperbolic plane, discussed in a previous post. One fixes a distance D and an angle A and considers a “turtle” in the hyperbolic plane, who proceeds by repeatedly taking steps of length D and then turning either left or right through angle A (randomly, independently, and with equal probability). When A is sufficiently small (for fixed D), the trajectory of the turtle is a quasigeodesic. However, when A passes some threshold A_0 (again, keeping D fixed), the trajectories stop being quasigeodesic, and in fact the winding number of the turtle (as a function of time) is distributed like a Gaussian variable. The variance of the winding number is very sensitive to the difference A-A_0; when this difference is small, the variance is very small, and may be estimated as follows. Think of the turtle’s choices of left or right turns as an infinite sequence of Rs and Ls. Suppose that each step-plus-turn induces an elliptic isometry of the hyperbolic plane through some angle 2\pi/N (with N big when A-A_0 is small), so that any L^N string will produce a full left turn, and any R^N string will produce a full right turn. However, it is a fundamental feature of hyperbolic geometry that “correlations decay exponentially”; that is, if we have any sequence of Rs and Ls in which there is no substring of at least N consecutive Ls or Rs, the resulting curve is quasigeodesic, and contributes nothing to the winding number at all. The different R^N or L^N substrings are projections to elliptic subgroups centered at different points in the plane; if we sum up all the contributions that exceed the threshold, the result gives the winding number of the turtle.
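Here is a crude combinatorial caricature of this count, as a minimal Python sketch. It does no hyperbolic geometry at all: it simply assumes (my simplification, not a theorem) that a maximal run of r consecutive identical turns contributes floor(r/N) full turns with the appropriate sign, and that runs shorter than N contribute nothing. The function name and the sign convention (left positive) are likewise my own choices.

```python
from itertools import groupby


def coarse_winding(turns, N):
    """Signed count of full turns in a string of 'L' and 'R' characters.

    Each maximal run of r identical letters contributes r // N full turns
    (positive for 'L', negative for 'R'); runs shorter than N are below
    the threshold and are ignored.
    """
    total = 0
    for letter, run in groupby(turns):
        r = len(list(run))
        if r >= N:
            total += (r // N) if letter == "L" else -(r // N)
    return total


# An alternating path has no long runs, hence no winding in this model.
print(coarse_winding("LRLRLRLR", 3))
# Runs of lengths 3, 2, 6, 1: only the two runs of Ls pass the threshold.
print(coarse_winding("LLLRRLLLLLLR", 3))
```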

Here’s another example (very closely related to the turtle, and to the subject of quasimorphisms). Let S be a compact hyperbolic surface with a nonempty geodesic boundary. Let \gamma be a “random” bi-infinite geodesic on S (one must be a bit careful how one defines this — for example, one can use Patterson-Sullivan measure on the limit set of the fundamental group). If p is a point in S, we can ask how many times \gamma “winds around” p. This means: join p to the boundary of S by a shortest arc, and count the algebraic intersection number of this arc with the geodesic \gamma. For any p the winding number (as a function of time) is a Gaussian; but once again, the variance is very sensitive to how close p is to the boundary. If p is very close to the boundary, the only way a geodesic \gamma can go “around” it is to have a very long subsegment which is very parallel to the boundary; i.e. it must wind around the boundary many times. If we lift to the universal cover, and consider the projections of the random geodesic to the many different lifts of the boundary components, only those projections whose length is very long will contribute to winding number around p, and all those below a threshold will contribute 0.

One final example is more suggestive than explicit, and I wonder if one can do more with it. Let G denote the group of area preserving diffeomorphisms of the unit disk, supported strictly in the interior (so that each element is fixed on some neighborhood of the boundary). There is a beautiful homomorphism from G to \mathbb{R}, discovered by Calabi. One definition is as follows. Let \omega be the area form in the disk. This is exact, so we can write \omega = d\alpha. If f is a diffeomorphism with f^*\omega = \omega then f^*\alpha -\alpha is closed, hence exact, hence f^*\alpha - \alpha = d\beta for some function \beta. Then the integral of \beta over the disk (with respect to the area form) is a number, and this is Cal(f) (actually, it is possible that there should be a factor of -1/2 somewhere, depending on the normalization one uses). Another beautiful definition of this homomorphism, discovered by Fathi, is as follows. The group of diffeomorphisms of the disk is contractible, so any f can be joined to the identity by a path, unique up to homotopy. Under the track of this path, points in the disk move about; for any pair of points, we can compute the number of times one winds around the other (this is not quite well-defined, so apply the track n times, compute the winding number divided by n, and then take the limit as n goes to infinity). This gives a number which depends on which two points one starts with. So compute the average of this number over all pairs of points, where we use the invariant measure on the disk to define a measure on the space of pairs of points. This example was vastly generalized by Gambaudo-Ghys: choose any positive integer M and any quasimorphism on the braid group on M points, and compute the expected value of this quasimorphism on a random M-tuple of points. One can already think of this as a kind of subsurface projection: looking at the diffeomorphism as a homotopy class rel. any finite (approximate) orbit, and recovering a dynamical invariant as a kind of average of projections which are sufficiently “long” to persist under taking powers (actually, I wonder what the result is if one simply truncates the result on pairs of points whose relative winding is less than some fixed cutoff). But one can go further. Entov-Polterovich show the existence of a kind of Calabi quasimorphism on the group of area-preserving diffeomorphisms of the sphere, with the property that for every disk D of area at most 1/2, the restriction to the subgroup supported in D agrees with the (usual) Calabi homomorphism defined above. I wonder if there is a direct way to build such invariants by “projecting” the dynamics of some arbitrary diffeomorphism f of the sphere to some subdisk D (e.g. by looking at how the tracks of the points cross through D and surgering them with segments of the boundary of D), computing a Calabi homomorphism for each (or “enough”) such subdisk(s) (e.g. enough to cover the sphere), then counting contributions from (maximal) subdisks where this contribution is big enough.