
I am in Melbourne at the moment, in the middle of giving a lecture series, as part of the 2009 Clay-Mahler lectures (also see here). Yesterday I gave a lecture with the title “faces of the scl norm ball”, and I thought I would try to give a sense of what it was all about. This also gives me an excuse to fiddle around with images in wordpress.

One starts with a basic question: given an immersion of a circle in the plane, when is there an immersion of the disk in the plane that bounds the given immersion of a circle? I.e., given an immersion $\gamma:S^1 \to \bf{R}^2$, when is there an immersion $f:D^2 \to \bf{R}^2$ for which $\partial f$ factors through $\gamma$? Obviously this depends on $\gamma$. Consider the following examples:

The first immersed circle obviously bounds an immersed disk; in fact, an embedded disk.

The second circle does not bound such a disk. One way to see this is to use the Gauss map, i.e. the map $\gamma'/|\gamma'|:S^1 \to S^1$ that takes each point on the circle to the unit tangent to its image under the immersion. The degree of the Gauss map for an embedded circle is $\pm 1$ (depending on a choice of orientation). If an immersed circle bounds an immersed disk, one can use this immersed disk to define a 1-parameter family of immersions, connecting the initial immersed circle to an embedded circle; hence the degree of the Gauss map is also $\pm 1$ for an immersed circle bounding an immersed disk; this rules out the second example.
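For a closed polygonal curve, the degree of the Gauss map can be computed by adding up the signed turning angles at the vertices. Here is a minimal sketch, assuming the curve is given as a list of vertices with no 180-degree reversals; the function name `turning_number` and the sample polygons are my own illustrations, and the bowtie is a hypothetical stand-in rather than one of the curves pictured.

```python
import math

def turning_number(points):
    """Degree of the Gauss map of a closed polygon: total signed turning / (2*pi).

    Assumes consecutive edges never reverse direction exactly (no turn of
    exactly pi), since the sign of such a turn would be ambiguous.
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        x2, y2 = points[(i + 2) % n]
        ux, uy = x1 - x0, y1 - y0          # incoming edge at vertex i+1
        vx, vy = x2 - x1, y2 - y1          # outgoing edge at vertex i+1
        # signed angle from u to v, in (-pi, pi]
        total += math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
    return round(total / (2 * math.pi))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # embedded, counterclockwise
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2)]   # figure-eight shape, one crossing

print(turning_number(square))  # 1
print(turning_number(bowtie))  # 0
```

A curve with turning number other than $\pm 1$, like the bowtie here, fails the Gauss map test and so cannot bound an immersed disk.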

The third example maps under the Gauss map with degree 1, and yet it does not bound an immersed disk. One must use a slightly more sophisticated invariant to see this. The immersed circle divides the plane up into regions. For each bounded region $R$, let $\alpha:[0,1] \to \bf{R}^2$ be an embedded arc, transverse to $\gamma$, that starts in the region $R$ and ends up “far away” (ideally “at infinity”). The arc $\alpha$ determines a homological intersection number that we denote $\alpha \cap \gamma$, where each point of intersection contributes $\pm 1$ depending on orientations. In this example, there are three bounded regions, which get the numbers $1$, $-1$, $1$ respectively:

If $f:S \to \bf{R}^2$ is any map of any oriented surface with one boundary component whose boundary factors through $\gamma$, then the (homological) degree with which $S$ maps over each region complementary to the image of $\gamma$ is the number we have just defined. Hence if $\gamma$ bounds an immersed disk, these numbers must all be positive (or all negative, if we reverse orientation). This rules out the third example.
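The number attached to a region is, up to an overall sign convention, the winding number of $\gamma$ around any point of that region, so the sign test is easy to try on polygonal examples. Here is a rough sketch under that assumption; `winding_number` and the bowtie polygon are hypothetical illustrations (the bowtie is not the third curve pictured, but it shows regions with opposite signs).

```python
import math

def winding_number(polygon, p):
    """Winding number of a closed polygon around a point p not on the polygon."""
    px, py = p
    n = len(polygon)
    total = 0.0
    for i in range(n):
        ax, ay = polygon[i][0] - px, polygon[i][1] - py
        bx, by = polygon[(i + 1) % n][0] - px, polygon[(i + 1) % n][1] - py
        # signed angle subtended at p by the edge from a to b
        total += math.atan2(ax * by - ay * bx, ax * bx + ay * by)
    return round(total / (2 * math.pi))

bowtie = [(0, 0), (2, 2), (2, 0), (0, 2)]   # two bounded regions, one crossing

print(winding_number(bowtie, (0.3, 1.0)))   # 1   (left lobe)
print(winding_number(bowtie, (1.7, 1.0)))   # -1  (right lobe)
```

Since the two lobes get opposite signs, no surface bounding this curve can map over both regions with positive degree, which is exactly the obstruction described above.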

The complete answer to the question of which immersed circles in the plane bound immersed disks was given by S. Blank, in his Ph.D. thesis at Brandeis in 1967 (unfortunately, this does not appear to be available online). The answer is in the form of an algorithm to decide the question. One such algorithm (not Blank’s, but related to it) is as follows. The image of $\gamma$ cuts up the plane into regions $R_i$, and each region $R_i$ gets an integer $n_i$. Take $n_i$ “copies” of each region $R_i$, and think of these as pieces of a jigsaw puzzle. Try to glue them together along their edges so that they fit together nicely along $\gamma$ and make a disk with smooth boundary. If you are successful, you have constructed an immersion. If you are not successful (after trying all possible ways of gluing the puzzle pieces together), no such immersion exists. This answer is a bit unsatisfying, since in the first place it does not give any insight into which loops bound and which don’t, and in the second place the algorithm is quite slow and impractical.

As usual, more insight can be gained by generalizing the question. Fix a compact oriented surface $\Sigma$ and consider an immersed $1$-manifold $\Gamma: \coprod_i S^1 \to \Sigma$. One would like to know which such $1$-manifolds bound an immersion of a surface. One piece of subtlety is the fact that there are examples where $\Gamma$ itself does not bound, but a finite cover of $\Gamma$ (e.g. two copies of $\Gamma$) does bound. It is also useful to restrict the class of $1$-manifolds that one considers. For the sake of concreteness then, let $\Sigma$ be a hyperbolic surface with geodesic boundary, and let $\Gamma$ be an oriented immersed geodesic $1$-manifold in $\Sigma$. An immersion $f:S \to \Sigma$ is said to virtually bound $\Gamma$ if the map $\partial f$ factors as a composition $\partial S \to \coprod_i S^1 \to \Sigma$ where the second map is $\Gamma$, and where the first map is a covering map with some degree $n(S)$. The fundamental question, then, is:

Question: Which immersed geodesic $1$-manifolds $\Gamma$ in $\Sigma$ are virtually bounded by an immersed surface?

It turns out that this question is unexpectedly connected to stable commutator length, symplectic rigidity, and several other geometric issues; I hope to explain how in the remainder of this post.

First, recall that if $G$ is any group and $g \in [G,G]$, the commutator length of $g$, denoted $\text{cl}(g)$, is the smallest number of commutators in $G$ whose product is equal to $g$, and the stable commutator length $\text{scl}(g)$ is the limit $\text{scl}(g) = \lim_{n \to \infty} \text{cl}(g^n)/n$. One can geometrize this definition as follows. Let $X$ be a space with $\pi_1(X) = G$, and let $\gamma:S^1 \to X$ be a homotopy class of loop representing the conjugacy class of $g$. Then $\text{scl}(g) = \inf_S -\chi^-(S)/2n(S)$ over all surfaces $S$ (possibly with multiple boundary components) mapping to $X$ whose boundary wraps a total of $n(S)$ times around $\gamma$. One can extend this definition to $1$-manifolds $\Gamma:\coprod_i S^1 \to X$ in the obvious way, and one gets a definition of stable commutator length for formal sums of elements in $G$ which represent $0$ in homology. Let $B_1(G)$ denote the vector space of real finite linear combinations of elements in $G$ whose sum represents zero in (real group) homology (i.e. in the abelianization of $G$, tensored with $\bf{R}$). Let $H$ be the subspace spanned by chains of the form $g^n - ng$ and $g - hgh^{-1}$. Then $\text{scl}$ descends to a (pseudo)-norm on the quotient $B_1(G)/H$ which we denote hereafter by $B_1^H(G)$ ($H$ for homogeneous).

There is a dual definition of this norm, in terms of quasimorphisms.

Definition: Let $G$ be a group. A function $\phi:G \to \bf{R}$ is a homogeneous quasimorphism if there is a least non-negative real number $D(\phi)$ (called the defect) so that for all $g,h \in G$ and $n \in \bf{Z}$ one has

1. $\phi(g^n) = n\phi(g)$ (homogeneity)
2. $|\phi(gh) - \phi(g) - \phi(h)| \le D(\phi)$ (quasimorphism)

A function satisfying the second condition but not the first is an (ordinary) quasimorphism. The vector space of quasimorphisms on $G$ is denoted $\widehat{Q}(G)$, and the vector subspace of homogeneous quasimorphisms is denoted $Q(G)$. Given $\phi \in \widehat{Q}(G)$, one can homogenize it, by defining $\overline{\phi}(g) = \lim_{n \to \infty} \phi(g^n)/n$. Then $\overline{\phi} \in Q(G)$ and $D(\overline{\phi}) \le 2D(\phi)$. A quasimorphism has defect zero if and only if it is a homomorphism (i.e. an element of $H^1(G)$) and $D(\cdot)$ makes the quotient $Q/H^1$ into a Banach space.

Examples of quasimorphisms include the following:

1. Let $F$ be a free group on a generating set $S$. Let $\sigma$ be a reduced word in $S^*$ and for each reduced word $w \in S^*$, define $C_\sigma(w)$ to be the number of copies of $\sigma$ in $w$. If $\overline{w}$ denotes the corresponding element of $F$, define $C_\sigma(\overline{w}) = C_\sigma(w)$ (note this is well-defined, since each element of a free group has a unique reduced representative). Then define $H_\sigma = C_\sigma - C_{\sigma^{-1}}$. This quasimorphism is not yet homogeneous, but can be homogenized as above (this example is due to Brooks).
2. Let $M$ be a closed hyperbolic manifold, and let $\alpha$ be a $1$-form. For each $g \in \pi_1(M)$ let $\gamma_g$ be the geodesic representative in the free homotopy class of $g$. Then define $\phi_\alpha(g) = \int_{\gamma_g} \alpha$. By Stokes’ theorem, and some basic hyperbolic geometry, $\phi_\alpha$ is a homogeneous quasimorphism with defect at most $2\pi \|d\alpha\|$.
3. Let $\rho: G \to \text{Homeo}^+(S^1)$ be an orientation-preserving action of $G$ on a circle. The group of homeomorphisms of the circle has a natural central extension $\text{Homeo}^+(\bf{R})^{\bf{Z}}$, the group of homeomorphisms of $\bf{R}$ that commute with integer translation. The preimage of $G$ in this extension is an extension $\widehat{G}$. Given $g \in \text{Homeo}^+(\bf{R})^{\bf{Z}}$, define $\text{rot}(g) = \lim_{n \to \infty} (g^n(0) - 0)/n$; this descends to a $\bf{R}/\bf{Z}$-valued function on $\text{Homeo}^+(S^1)$, Poincare’s so-called rotation number. But on $\widehat{G}$, this function is a homogeneous quasimorphism, typically with defect $1$.
4. Similarly, the group $\text{Sp}(2n,\bf{R})$ has a universal cover $\widetilde{\text{Sp}}(2n,\bf{R})$ with deck group $\bf{Z}$. The symplectic group acts on the space $\Lambda_n$ of Lagrangian subspaces in $\bf{R}^{2n}$. This is equal to the coset space $\Lambda_n = U(n)/O(n)$, and we can therefore define a function $\text{det}^2:\Lambda_n \to S^1$. After picking a basepoint, one obtains an $S^1$-valued function on the symplectic group, which lifts to a real-valued function on its universal cover. This function is a quasimorphism on the covering group, whose homogenization is sometimes called the symplectic rotation number; see e.g. Barge-Ghys.
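The first example (Brooks) is easy to experiment with. Here is a minimal sketch, assuming elements of $F_2$ are written as strings in `a`, `b`, `A`, `B` with capitals denoting inverses, and assuming that “copies” means possibly overlapping occurrences; the function names are my own, and the homogenization is only estimated at a finite power $n$ rather than computed as a limit.

```python
def reduce_word(w):
    """Freely reduce a word: cancel adjacent pairs like 'aA' or 'Bb'."""
    out = []
    for ch in w:
        if out and out[-1] == ch.swapcase():
            out.pop()
        else:
            out.append(ch)
    return "".join(out)

def inverse(w):
    """Inverse word: reverse the letters and invert each one."""
    return w[::-1].swapcase()

def count_copies(sigma, w):
    """Number of (possibly overlapping) occurrences of sigma in w."""
    k = len(sigma)
    return sum(1 for i in range(len(w) - k + 1) if w[i:i + k] == sigma)

def brooks_H(sigma, w):
    """H_sigma = C_sigma - C_{sigma^{-1}}, evaluated on the reduced form of w."""
    w = reduce_word(w)
    return count_copies(sigma, w) - count_copies(inverse(sigma), w)

def homogenized_estimate(sigma, w, n=50):
    """Finite-n approximation H_sigma(w^n)/n to the homogenization."""
    return brooks_H(sigma, w * n) / n

print(brooks_H("ab", "abab"))              # 2
print(brooks_H("ab", "BA"))                # -1
print(homogenized_estimate("ab", "abAB"))  # 1.0
```

The last line evaluates the estimate on the commutator $[a,b]$; in general the finite-$n$ value differs from the true limit by an error that shrinks as $n$ grows.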

Quasimorphisms and stable commutator length are related by Bavard Duality:

Theorem (Bavard duality): Let $G$ be a group, and let $\sum t_i g_i \in B_1^H(G)$. Then there is an equality $\text{scl}(\sum t_i g_i) = \sup_\phi \sum t_i \phi(g_i)/2D(\phi)$ where the supremum is taken over all homogeneous quasimorphisms.

This duality theorem shows that $Q/H^1$ with the defect norm is the dual of $B_1^H$ with the $\text{scl}$ norm. (This theorem is proved for elements $g \in [G,G]$ by Bavard, and in full generality in my monograph, which is a reference for the content of this post.)

What does this have to do with rigidity (or, for that matter, immersions)? Well, one sees from the examples (and many others) that homogeneous quasimorphisms arise from geometry; specifically, from hyperbolic geometry (negative curvature) and symplectic geometry (causal structures). One expects to find rigidity in extremal circumstances, and therefore one wants to understand, for a given chain $C \in B_1^H(G)$, the set of extremal quasimorphisms for $C$, i.e. those homogeneous quasimorphisms $\phi$ satisfying $\text{scl}(C) = \phi(C)/2D(\phi)$. By the duality theorem, the space of such extremal quasimorphisms is a nonempty closed convex cone, dual to the set of hyperplanes in $B_1^H$ that contain $C/|C|$ and support the unit ball of the $\text{scl}$ norm. The fewer supporting hyperplanes, the smaller the set of extremal quasimorphisms for $C$, and the more rigid such extremal quasimorphisms will be.

When $F$ is a free group, the unit ball in the $\text{scl}$ norm in $B_1^H(F)$ is a rational polyhedron. Every nonzero chain $C \in B_1^H(F)$ has a nonzero multiple $C/|C|$ contained in the boundary of this polyhedron; let $\pi_C$ denote the face of the polyhedron containing this multiple in its interior. The smaller the codimension of $\pi_C$, the smaller the dimension of the cone of extremal quasimorphisms for $C$, and the more rigidity we will see. The best circumstance is when $\pi_C$ has codimension one, and an extremal quasimorphism for $C$ is unique, up to scale, and elements of $H^1$.

An infinite dimensional polyhedron need not necessarily have any top dimensional faces; thus it is natural to ask: does the unit ball in $B_1^H(F)$ have any top dimensional faces? and can one say anything about their geometric meaning? We have now done enough to motivate the following, which is the main theorem from my paper “Faces of the scl norm ball”:

Theorem: Let $F$ be a free group. For every isomorphism $F \to \pi_1(\Sigma)$ (up to conjugacy) where $\Sigma$ is a compact oriented surface, there is a well-defined chain $\partial \Sigma \in B_1^H(F)$. This satisfies the following properties:

1. The projective class of $\partial \Sigma$ intersects the interior of a codimension one face $\pi_\Sigma$ of the $\text{scl}$ norm ball
2. The unique extremal quasimorphism dual to $\pi_\Sigma$ (up to scale and elements of $H^1$) is the rotation quasimorphism $\text{rot}_\Sigma$ (to be defined below) associated to any complete hyperbolic structure on $\Sigma$
3. A homologically trivial geodesic $1$-manifold $\Gamma$ in $\Sigma$ is virtually bounded by an immersed surface $S$ in $\Sigma$ if and only if the projective class of $\Gamma$ (thought of as an element of $B_1^H(F)$) intersects $\pi_\Sigma$. Equivalently, if and only if $\text{rot}_\Sigma$ is extremal for $\Gamma$. Equivalently, if and only if $\text{scl}(\Gamma) = \text{rot}_\Sigma(\Gamma)/2$.

It remains to give a definition of $\text{rot}_\Sigma$. In fact, we give two definitions.

First, a hyperbolic structure on $\Sigma$ and the isomorphism $F\to \pi_1(\Sigma)$ determine a representation $F \to \text{PSL}(2,\bf{R})$. This lifts to $\widetilde{\text{SL}}(2,\bf{R})$, since $F$ is free. The composition with rotation number is a homogeneous quasimorphism on $F$, well-defined up to $H^1$. Note that because the image in $\text{PSL}(2,\bf{R})$ is discrete and torsion-free, this quasimorphism is integer valued (and has defect $1$). This quasimorphism is $\text{rot}_\Sigma$.

Second, a geodesic $1$-manifold $\Gamma$ in $\Sigma$ cuts the surface up into regions $R_i$. For each such region, let $\alpha_i$ be an arc transverse to $\Gamma$, joining $R_i$ to $\partial \Sigma$. Let $(\alpha_i \cap \Gamma)$ denote the homological (signed) intersection number. Then define $\text{rot}_\Sigma(\Gamma) = 1/2\pi \sum_i (\alpha_i \cap \Gamma) \text{area}(R_i)$.

We now show how 3 follows. Given $\Gamma$, we compute $\text{scl}(\Gamma) = \inf_S -\chi^-(S)/2n(S)$ as above. Let $S$ be such a surface, mapping to $\Sigma$. We adjust the map by a homotopy so that it is pleated; i.e. so that $S$ is itself a hyperbolic surface, decomposed into ideal triangles, in such a way that the map is a (possibly orientation-reversing) isometry on each ideal triangle. By Gauss-Bonnet, we can calculate $\text{area}(S) = -2\pi \chi^-(S) = \pi \sum_\Delta 1$. On the other hand, $\partial S$ wraps $n(S)$ times around $\Gamma$ (homologically) so $\text{rot}_\Sigma(\Gamma) = \pi/2\pi n(S) \sum_\Delta \pm 1$ where the sign in each case depends on whether the ideal triangle $\Delta$ is mapped in with positive or negative orientation. Consequently $\text{rot}_\Sigma(\Gamma)/2 \le -\chi^-(S)/2n(S)$ with equality if and only if the sign of every triangle is $1$. This holds if and only if the map $S \to \Sigma$ is an immersion; on the other hand, equality holds if and only if $\text{rot}_\Sigma$ is extremal for $\Gamma$. This proves part 3 of the theorem above.

Incidentally, this fact gives a fast algorithm to determine whether $\Gamma$ is the virtual boundary of an immersed surface. Stable commutator length in free groups can be computed in polynomial time in word length; likewise, the value of $\text{rot}_\Sigma$ can be computed in polynomial time (see section 4.2 of my monograph for details). So one can determine whether $\Gamma$ projectively intersects $\pi_\Sigma$, and therefore whether it is the virtual boundary of an immersed surface. In fact, these algorithms are quite practical, and run quickly (in a matter of seconds) on words of length 60 and longer in $F_2$.

One application to rigidity is a new proof of the following theorem:

Corollary (Goldman, Burger-Iozzi-Wienhard): Let $\Sigma$ be a closed oriented surface of positive genus, and $\rho:\pi_1(\Sigma) \to \text{Sp}(2n,\bf{R})$ a Zariski dense representation. Let $e_\rho \in H^2(\Sigma;\mathbb{Z})$ be the Euler class associated to the action. Suppose that $|e_\rho([\Sigma])| = -n\chi(\Sigma)$ (note: by a theorem of Domic and Toledo, one always has $|e_\rho([\Sigma])| \le -n\chi(\Sigma)$). Then $\rho$ is discrete.

Here $e_\rho$ is the first Chern class of the bundle associated to $\rho$. The proof is as follows: cut $\Sigma$ along an essential loop $\gamma$ into two subsurfaces $\Sigma_i$. One obtains homogeneous quasimorphisms on each group $\pi_1(\Sigma_i)$ (i.e. the symplectic rotation number associated to $\rho$), and the hypothesis of the theorem easily implies that they are extremal for $\partial \Sigma_i$. Consequently the symplectic rotation number is equal to $\text{rot}_{\Sigma_i}$, at least on the commutator subgroup. But this latter quasimorphism takes only integral values; it follows that each element in $\pi_1(\Sigma_i)$ fixes a Lagrangian subspace under $\rho$. But this implies that $\rho$ is not dense, and since it is Zariski dense, it is discrete. (Notes: there are a couple of details under the rug here, but not many; furthermore, the hypothesis that $\rho$ is Zariski dense is not necessary (but can be derived as a conclusion with more work), and one can just as easily treat representations of compact surface groups as closed ones; finally, Burger-Iozzi-Wienhard prove more than just this statement; for instance, they show that the space of maximal representations is always real semialgebraic, and describe it in some detail).

More abstractly, we have shown that extremal quasimorphisms on $\partial \Sigma$ are unique. In other words, by prescribing the value of a quasimorphism on a single group element, one determines its values on the entire commutator subgroup. If such a quasimorphism arises from some geometric or dynamical context, this can be interpreted as a kind of rigidity theorem, of which the Corollary above is an example.

I have just uploaded a paper to the arXiv, entitled “Scl, sails and surgery”. The paper discusses a connection between stable commutator length in free groups and the geometry of sails. This is an interesting example of what sometimes happens in geometry, where a complicated topological problem in low dimensions can be translated into a “simple” geometric problem in high dimensions. Other examples include the Veronese embedding in Algebraic geometry (i.e. the embedding of one projective space into another taking a point with homogeneous co-ordinates $x_i$ to the point whose homogeneous co-ordinates are the monomials of some fixed degree in the $x_i$), which lets one exhibit any projective variety as an intersection of a Veronese variety (whose geometry is understood very well) with a linear subspace.

In my paper, the fundamental problem is to compute stable commutator length in free groups, and more generally in free products of Abelian groups. Let’s focus on the case of a group $G = A*B$ where $A,B$ are free abelian of finite rank. A $K(G,1)$ is just a wedge $X:=K_A \vee K_B$ of tori of dimension equal to the ranks of $A,B$. Let $\Gamma: \coprod_i S^1 \to X$ be a free homotopy class of $1$-manifold in $X$, which is homologically trivial. Formally, we can think of $\Gamma$ as a chain $\sum_i g_i$ in $B_1^H(G)$, the vector space of group $1$-boundaries, modulo homogenization; i.e. quotiented by the subspace spanned by chains of the form $g^n - ng$ and $g-hgh^{-1}$. One wants to find the simplest surface $S$ mapping to $X$ that rationally bounds $\Gamma$. I.e. we want to find a map $f:S \to X$ such that $\partial f:\partial S \to X$ factors through $\Gamma$, and so that the boundary $\partial S$ wraps homologically $n(S)$ times around each loop of $\Gamma$, in such a way as to infimize $-\chi(S)/2n(S)$. This infimum, over all maps of all surfaces $S$ of all possible genus, is the stable commutator length of the chain $\sum_i g_i$. Computing this quantity for all such finite chains is tantamount to understanding the bounded cohomology of a free group in dimension $2$.

Given such a surface $S$, one can cut it up into simpler pieces, along the preimage of the basepoint $K_A \cap K_B$. Since $S$ is a surface with boundary, these simpler pieces are surfaces with corners. In general, understanding how a surface can be assembled from an abstract collection of surfaces with corners is a hopeless task. When one tries to glue the pieces back together, one runs into trouble at the corners: how does one decide when a collection of surfaces “closes up” around a corner? The wrong decision leads to branch points; moreover, a decision made at one corner will propagate along an edge and lead to constraints on the choices one can make at other corners. This problem arises again and again in low-dimensional topology, and has several different (and not always equivalent) formulations and guises, including:

• Given an abstract branched surface and a weight on that surface, when is there an unbranched surface carried by the abstract branched surface and realizing the weight?
• Given a triangulation of a $3$-manifold and a collection of normal surface types in each simplex satisfying the gluing constraints but *not* necessarily satisfying the quadrilateral condition (i.e. there might be more than one quadrilateral type per simplex), when is there an immersed unbranched normal surface in the manifold realizing the weight?
• Given an immersed curve in the plane, when is there an immersion from the disk to the plane whose boundary is the given curve?
• Given a polyhedral surface (arising e.g. in computer graphics), how can one choose smooth approximations of the polygonal faces that mesh smoothly at the vertices?

I think of all these problems as examples of what I like to call the holonomy problem, since all of them can be reduced, in one way or another, to studying representations of fundamental groups of punctured surfaces into finite groups. The fortunate “accident” in this case is that every corner arises by intersecting a cut with a boundary edge of $S$. Consequently, one never wants to glue more than two pieces up at any corner, and the holonomy problem does not arise. Hence in principle, to understand the surface $S$ one just needs to understand the pieces of $S$ that can arise by cutting, and the ways in which they can be reassembled.

This is still not a complete solution of the problem, since infinitely many kinds of pieces can arise by cutting complicated surfaces $S$. The $1$-manifold $\Gamma$ decomposes into a collection of arcs in the tori $K_A$ and $K_B$ which we denote $\tau_A,\tau_B$ respectively, and the surface $S \cap K_A$ (hereafter abbreviated to $S_A$) has edges that alternate between elements of $\tau_A$, and edges mapping to $K_A \cap K_B$. Since $K_A$ is a torus, handles of $S_A$ mapping to $K_A$ can be compressed, reducing the complexity of $S_A$, and thereby $S$, so one need only consider planar surfaces $S_A$.

Let $C_2(A)$ denote the real vector space with basis the set of ordered pairs $(t,t')$ of elements of $\tau_A$ (not necessarily distinct), and $C_1(A)$ the real vector space with basis the elements of $\tau_A$. A surface $S_A$ determines a non-negative integral vector $v(S_A) \in C_2(A)$, by counting the number of times a given pair of edges $(t,t')$ appear in succession on one of the (oriented) boundary components of $S_A$. The vector $v(S_A)$ satisfies two linear constraints. First, there is a map $\partial: C_2(A) \to C_1(A)$ defined on a basis vector by $\partial(t,t') = t - t'$. The vector $v(S_A)$ satisfies $\partial v(S_A) = 0$. Second, each element $t \in \tau_A$ is a based loop in $K_A$, and therefore corresponds to an element in the free abelian group $A$. Define $h:C_2(A) \to A \otimes \mathbb{R}$ on a basis vector by $h(t,t') = t+t'$ (warning: the notation obscures the fact that $\partial$ and $h$ map to quite different vector spaces). Then $h v(S_A)=0$; moreover, a non-negative rational vector $v \in C_2(A)$ satisfying $\partial v = h v = 0$ has a multiple of the form $v(S_A)$ for some $S_A$ as above. Denote the subspace of $C_2(A)$ consisting of non-negative vectors in the kernel of $\partial$ and $h$ by $V_A$. This is a rational polyhedral cone — i.e. a cone with finitely many extremal rays, each spanned by a rational vector.
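To make the two linear constraints concrete, here is a small sketch that builds $v(S_A)$ from the boundary cycles of a piece and evaluates $\partial$ and $h$ on it. Everything named here is an assumption for illustration: arcs are hypothetical labels `t1`, `t2`, `t3`, each boundary component is a cyclic list of arc labels, and the element of $A$ carried by each arc is an integer tuple (here $A = \mathbb{Z}$, with made-up values $2$, $-3$, $1$).

```python
from collections import Counter

def pair_vector(boundary_cycles):
    """v(S_A): count how often the ordered pair (t, t') occurs in succession
    on the (cyclically ordered) boundary components."""
    v = Counter()
    for cycle in boundary_cycles:
        for i, t in enumerate(cycle):
            v[(t, cycle[(i + 1) % len(cycle)])] += 1
    return v

def boundary(v):
    """The map C_2(A) -> C_1(A) with (t, t') |-> t - t'; zero entries dropped."""
    out = Counter()
    for (t, t2), c in v.items():
        out[t] += c
        out[t2] -= c
    return {t: c for t, c in out.items() if c != 0}

def h_map(v, value):
    """The map C_2(A) -> A tensor R with (t, t') |-> t + t', where value[t]
    is the integer tuple in A represented by the arc t."""
    rank = len(next(iter(value.values())))
    total = [0] * rank
    for (t, t2), c in v.items():
        for k in range(rank):
            total[k] += c * (value[t][k] + value[t2][k])
    return tuple(total)

value = {"t1": (2,), "t2": (-3,), "t3": (1,)}   # hypothetical arcs in A = Z

good = pair_vector([["t1", "t2", "t3"]])        # one boundary through all arcs
print(boundary(good), h_map(good, value))       # {} (0,)

bad = pair_vector([["t1", "t3"]])               # omits t2
print(boundary(bad), h_map(bad, value))         # {} (6,)
```

The second vector lies in the kernel of $\partial$ but not of $h$, so it is not in $V_A$; this is only a check of the two constraints, not a construction of the surface $S_A$.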

Although every integral $v \in V_A$ is equal to $v(S_A)$ for some $S_A$, many different $S_A$ correspond to a given $v$. Moreover, if we are allowed to consider formal weighted sums of surfaces, then there are even more possibilities. In order to compute stable commutator length, we must determine, for a given vector $v \in V_A$, an expression $v = \sum t_i v(S_i)$ where the $t_i$ are positive real numbers, which minimizes $\sum -t_i \chi_o(S_i)$. Here $\chi_o(\cdot)$ denotes orbifold Euler characteristic of a surface with corners; each corner contributes $-1/4$ to $\chi_o$. The reason one counts complexity using this modified definition is that the result is additive: $\chi(S) = \chi_o(S_A) + \chi_o(S_B)$. The contribution to $\chi_o$ from corners is a linear function on $V_A$. Moreover, a component $S_i$ with $\chi(S_i) \le 0$ can be covered by a surface of high genus and compressed (increasing $\chi$); so such a term can always be replaced by a formal sum $1/n S_i'$ for which $\chi(S_i') = \chi(S_i)$. Thus the only nonlinear contribution to $\chi_o$ comes from the surfaces $S_i$ whose underlying topological surface is a disk.

Call a vector $v \in V_A$ a disk vector if $v = v(S_A)$ where $S_A$ is topologically a disk (with corners). It turns out that the set of disk vectors $\mathcal{D}_A$ has the following simple form: it is equal to the union of the integer lattice points contained in certain of the open faces of $V_A$ (those satisfying a combinatorial criterion). Define the sail of $V_A$ to be equal to the boundary of the convex hull of the polyhedron $\mathcal{D}_A + V_A$ (where $+$ here denotes Minkowski sum). The Klein function $\kappa$ is the unique continuous function on $V_A$, linear on rays, that is equal to $1$ exactly on the sail. Then $\chi_o(v):= \max \sum t_i\chi_o(S_i)$ over expressions $v = \sum t_i v(S_i)$ satisfies $\chi_o(v) = \kappa(v) - |v|/2$ where $|\cdot|$ denotes $L^1$ norm. To calculate stable commutator length, one minimizes $-\chi_o(v) - \chi_o(v')$ over $(v,v')$ contained in a certain rational polyhedron in $V_A \times V_B$.

Sails are considered elsewhere by several authors; usually, people take $\mathcal{D}_A$ to be the set of all integer vectors except the vertex of the cone, and the sail is therefore the boundary of the convex hull of this (simpler) set. Klein introduced sails as a higher-dimensional generalization of continued fractions: if $V$ is a polyhedral cone in two dimensions (i.e. a sector in the plane, normalized so that one edge is the horizontal axis, say), the vertices of the sail are the continued fraction approximations of the boundary slope. Arnold has revived the study of such objects in recent years. They arise in many different interesting contexts, such as numerical analysis (especially diophantine approximation) and algebraic number theory. For example, let $A \in \text{SL}(n,\mathbb{Z})$ be a matrix with irreducible characteristic equation, and all eigenvalues real and positive. There is a basis for $\mathbb{R}^n$ consisting of eigenvectors, spanning a convex cone $V$. The cone (and therefore its sail) is invariant under $A$; moreover, there is a $\mathbb{Z}^{n-1}$ subgroup of $\text{SL}(n,\mathbb{Z})$ consisting of matrices with the same set of eigenvectors; this observation follows from Dirichlet’s theorem on the units in a number field, and is due to Tsuchihashi. This abelian group acts freely on the sail with quotient a (topological) torus of dimension $n-1$, together with a “canonical” cell decomposition. This connection between number theory and combinatorics is quite mysterious; for example, Arnold asks: which cell decompositions can arise? This is unknown even in the case $n=3$.
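In the two-dimensional case the continued fraction approximations are easy to compute. The following sketch, assuming a rational boundary slope given exactly, lists the convergents $p_k/q_k$ using the standard recurrence; it only produces the convergents (giving candidate lattice points $(q_k, p_k)$) and does not itself construct the convex hull or the sail. The name `convergents` and the sample slope are my own choices.

```python
from fractions import Fraction

def convergents(x, max_terms=20):
    """Continued fraction convergents of a non-negative rational x,
    via p_k = a_k p_{k-1} + p_{k-2} and q_k = a_k q_{k-1} + q_{k-2}."""
    x = Fraction(x)
    p_prev, q_prev = 1, 0      # p_{-1}, q_{-1}
    p_prev2, q_prev2 = 0, 1    # p_{-2}, q_{-2}
    result = []
    for _ in range(max_terms):
        a = x.numerator // x.denominator          # next partial quotient
        p, q = a * p_prev + p_prev2, a * q_prev + q_prev2
        result.append(Fraction(p, q))
        p_prev2, q_prev2, p_prev, q_prev = p_prev, q_prev, p, q
        frac = x - a
        if frac == 0:                             # expansion terminates
            break
        x = 1 / frac
    return result

print(convergents(Fraction(355, 113)))
# [Fraction(3, 1), Fraction(22, 7), Fraction(355, 113)]
```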

The most interesting aspect of this correspondence, between stable commutator length and sails, is that it allows one to introduce parameters. An element in a free group $F_2$ can be expressed as a word in letters $a,b,a^{-1},b^{-1}$, e.g. $aab^{-1}b^{-1}a^{-1}a^{-1}a^{-1}bbbbab^{-1}b^{-1}$, which is usually abbreviated with exponential notation, e.g. $a^2b^{-2}a^{-3}b^4ab^{-2}$. Having introduced this notation, one can think of the exponents as parameters, and study stable commutator length in families of words, e.g. $a^{2+p}b^{-2+q}a^{-3-p}b^{4-q}ab^{-2}$. Under the correspondence above, the parameters only affect the coefficients of the linear map $h$, and therefore one obtains families of polyhedral cones $V_A(p,q,\cdots)$ whose extremal rays depend linearly on the exponent parameters. This lets one prove many facts about the stable commutator length spectrum in a free group, including:

Theorem: The image of a nonabelian free group of rank at least $4$ under scl in $\mathbb{R}/\mathbb{Z}$ is precisely $\mathbb{Q}/\mathbb{Z}$.

and

Theorem: For each $n$, the image of the free group $F_n$ under scl contains a well-ordered sequence of values with ordinal type $\omega^{\lfloor n/4 \rfloor}$. The image of $F_\infty$ contains a well-ordered sequence of values with ordinal type $\omega^\omega$.

One can also say things about the precise dependence of scl on parameters in particular families. More conjecturally, one would like to use this correspondence to say something about the statistical distribution of scl in free groups. Experimentally, this distribution appears to obey power laws, in the sense that a given (reduced) fraction $p/q$ appears in certain infinite families of elements with frequency proportional to $q^{-\delta}$ for some power $\delta$ (which unfortunately depends in a rather opaque way on the family). Such power laws are reminiscent of Arnold tongues in dynamics, one of the best-known examples of phase locking of coupled nonlinear oscillators. Heuristically one expects such power laws to appear in the geometry of “random” sails — this is explained by the fact that the (affine) geometry of a sail depends only on its $\text{SL}(n,\mathbb{Z})$ orbit, and the existence of invariant measures on a natural moduli space; see e.g. Kontsevich and Suhov. The simplest example concerns the ($1$-dimensional) cone spanned by a random integral vector in $\mathbb{Z}^2$. The $\text{SL}(2,\mathbb{Z})$ orbit of such a vector depends only on the gcd of the two co-ordinates. As is easy to see, the probability distribution of the gcd of a random pair of integers $p,q$ obeys a power law: $\text{gcd}(p,q) = n$ with probability $\zeta(2)^{-1}/n^2$. The rigorous justification of the power laws observed in the scl spectrum of free groups remains the focus of current research by myself and my students.
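The gcd power law is simple to check numerically. Here is a brute-force sketch, assuming “random pair” means a pair drawn uniformly from $\{1,\dots,N\}^2$ for a large cutoff $N$ (the cutoff, and the function names, are my own choices); the empirical frequencies only approach $\zeta(2)^{-1}/n^2 = 6/(\pi^2 n^2)$ as $N$ grows.

```python
import math

def gcd_frequencies(N, max_n):
    """Fraction of pairs (p, q) in {1..N}^2 with gcd(p, q) = n, for n = 0..max_n
    (the entry for n = 0 is always 0 and is kept only so that indices match)."""
    counts = [0] * (max_n + 1)
    for p in range(1, N + 1):
        for q in range(1, N + 1):
            g = math.gcd(p, q)
            if g <= max_n:
                counts[g] += 1
    return [c / (N * N) for c in counts]

def predicted(n):
    """The limiting probability zeta(2)^{-1} / n^2 = 6 / (pi^2 n^2)."""
    return 6 / (math.pi ** 2 * n ** 2)

freqs = gcd_frequencies(200, 3)
for n in (1, 2, 3):
    print(n, round(freqs[n], 4), round(predicted(n), 4))
```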

The development and scope of modern biology is often held out as a fantastic opportunity for mathematicians. The accumulation of vast amounts of biological data, and the development of new tools for the manipulation of biological organisms at microscopic levels and with unprecedented accuracy, invites the development of new mathematical tools for their analysis and exploitation. I know of several examples of mathematicians who have dipped a toe, or sometimes some more substantial organ, into the water. But it has struck me that I know (personally) few mathematicians who believe they have something substantial to learn from the biologists, despite the existence of several famous historical examples.  This strikes me as odd; my instinctive feeling has always been that intellectual ruts develop so easily, so deeply, and so invisibly, that continual cross-fertilization of ideas is essential to escape ossification (if I may mix biological metaphors . . .)

It is not necessarily easy to come up with profound examples of biological ideas or principles that can be easily translated into mathematical ones, but it is sometimes possible to come up with suggestive ones. Let me try to give a tentative example.

Deoxyribonucleic acid (DNA) is a nucleic acid that contains the genetic blueprint for all known living things. This blueprint takes the form of a code: a molecule of DNA is a long polymer strand composed of simple units called nucleotides; such a molecule is typically imagined as a string in a four character alphabet $\lbrace A,T,G,C \rbrace$, whose letters stand for the nucleotides Adenine, Thymine, Guanine, and Cytosine. These molecular strands like to arrange themselves in tightly bound oppositely aligned pairs, matching up nucleotides in one string with complementary nucleotides in the other, so that $A$ matches with $T$, and $C$ with $G$.

The geometry of a strand of DNA is very complicated: strands can be tangled, knotted, linked in complicated ways, and the fundamental interactions between strands (e.g. transcription, recombination) are facilitated or obstructed by mechanical processes depending on this geometry. Topology, especially knot theory, has been used in the study of some of these processes; the virtues of topological methods in this context include their robustness (fault-tolerance) and the discreteness of their invariants (similar virtues motivate some efforts to build topological quantum computers). A complete mathematical description of the salient biochemistry, mechanics, and semantic content of a configuration of DNA in a single cell is an unrealistic goal for the foreseeable future, and therefore attempts to model such systems depend on ignoring, or treating statistically, certain features of the system. One such framework ignores the ambient geometry entirely, and treats the system using symbolic, or combinatorial methods which have some of the flavor of geometric group theory.

One interesting approach is to consider a mapping from the alphabet of nucleotides to a standard generating set for $F_2$, the free group on two generators; for example, one can take the mapping $T \to a, A \to A, C \to b, G \to B$ where $a,b$ are free generators for $F_2$, and $A,B$ denote their inverses (so the nucleotide $A$ is sent to the group element $A = a^{-1}$). Then a pair of oppositely aligned strands of DNA translates into an edge of a van Kampen diagram: the “words” obtained by reading the letters along an edge on either side are inverse in $F_2$.
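To make the dictionary concrete, here is a small sketch of the translation. The helper names (`to_word`, `partner_strand`, `inverse`, `free_reduce`) are mine, and a strand's partner is modelled simply as its reverse complement, which ignores all real biochemistry.

```python
# Nucleotide -> generator of F_2; upper case letters denote inverses (A = a^-1, B = b^-1).
TO_F2 = {"T": "a", "A": "A", "C": "b", "G": "B"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def to_word(strand):
    """Translate a string of nucleotides into a (not necessarily reduced) word in a, b, A, B."""
    return "".join(TO_F2[n] for n in strand)


def partner_strand(strand):
    """The oppositely aligned partner: complement each nucleotide and read backwards."""
    return "".join(COMPLEMENT[n] for n in reversed(strand))


def inverse(word):
    """Inverse of a word: reverse the letters and swap each letter with its inverse."""
    return word[::-1].swapcase()


def free_reduce(word):
    """Cancel adjacent inverse pairs (aA, Aa, bB, Bb) until none remain."""
    out = []
    for x in word:
        if out and out[-1] == x.swapcase():
            out.pop()
        else:
            out.append(x)
    return "".join(out)


if __name__ == "__main__":
    s = "TTCGAC"
    print(to_word(s), to_word(partner_strand(s)))
    # The two sides of a paired edge multiply to the identity in F_2.
    print(free_reduce(to_word(s) + to_word(partner_strand(s))) == "")
```

Note that a strand of DNA need not translate to a reduced word; for instance the string $TA$ becomes $aA$.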

Strands of DNA in a configuration are not always paired along their lengths; sometimes junctions of three or more strands can form; certain mobile four-strand junctions, so-called “Holliday junctions”, perform important functions in the process of genetic recombination, and are found in a wide variety of organisms. A configuration of several strands with junctions of varying valences corresponds in the language of van Kampen diagrams to a fatgraph — i.e. a graph together with a choice of cyclic ordering of edges at each vertex — with edges labeled by inverse pairs of words in $F_2$ (note that this is quite different from the fatgraph model of proteins developed by Penner-Knudsen-Wiuf-Andersen). The energy landscape for branch migration (i.e. the process by which DNA strands separate or join along some segment) is very complicated, and it is challenging to model it thermodynamically. It is therefore not easy to predict in advance what kinds of fatgraphs are more or less likely to arise spontaneously in a prepared “soup” of free DNA strands.

As a thought experiment, consider the following “toy” model, which I do not suggest is physically realistic. We make the assumption that the energy cost of forming a junction of valence $v$ is $c(v-2)$ for some fixed constant $c$. Consequently, the energy of a configuration is proportional to $-\chi$, i.e. the negative of Euler characteristic of the underlying graph. Let $w$ be a reduced word, representing an element of $F_2$, and imagine a soup containing some large number of copies of the strand of DNA corresponding to the string $\dot{w}:=\cdots www \cdots$. In thermodynamic equilibrium, the partition function has the form $Z = \sum_i e^{-E_i/k_BT}$ where $k_B$ is Boltzmann’s constant, $T$ is temperature, and $E_i$ is the energy of a configuration (which by hypothesis is proportional to $-\chi$). At low temperature, minimal energy configurations tend to dominate; these are those that minimize $-\chi$ per unit “volume”. Topologically, a fatgraph corresponding to such a configuration can be thickened to a surface with boundary. The words along the edges determine a homotopy class of map from such a surface to a $K(F_2,1)$ (e.g. a once-punctured torus) whose boundary components wrap multiply around the free homotopy class corresponding to the conjugacy class of $w$. The infimum of $-\chi/2d$ where $d$ is the winding degree on the boundary, taken over all configurations, is precisely the stable commutator length of $w$; see e.g. here for a definition.
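The claim that the energy is proportional to $-\chi$ is just the handshake identity: if a graph has $V$ vertices and $E$ edges then the valences sum to $2E$, so $\sum_v c(v-2) = c(2E - 2V) = 2c(-\chi)$. A minimal sketch checking this on a list of valences follows; the function names are hypothetical, and the constant $c$ is the one from the toy model.

```python
def neg_euler_characteristic(valences):
    """-chi = E - V for a graph with the given vertex valences (2E = sum of valences)."""
    total = sum(valences)
    if total % 2:
        raise ValueError("the valences of a graph must sum to an even number")
    return total // 2 - len(valences)


def junction_energy(valences, c=1.0):
    """Toy-model energy: a junction of valence v costs c * (v - 2)."""
    return sum(c * (v - 2) for v in valences)


if __name__ == "__main__":
    valences = [4, 3, 3, 2, 2]  # one Holliday-type junction, two 3-way junctions
    c = 1.5
    print(junction_energy(valences, c), 2 * c * neg_euler_characteristic(valences))
```

In particular vertices of valence $2$ (points in the interior of a paired segment) cost nothing, as they should.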

Anyway, this example is perhaps a bit strained (and maybe it owes more to thermodynamics than to biology), but already it suggests a new mathematical object of study, namely the partition function $Z$ as above, and one is already inclined to look for examples for which the partition function obeys a symmetry like that enjoyed by the Riemann zeta function, or to specialize temperature to other values, as in random matrix theory. The introduction of new methods into the study of a classical object — for example, the decision to use thermodynamic methods to organize the study of van Kampen diagrams — bends the focus of the investigation towards those examples and contexts where the methods and tools are most informative. Phenomena familiar in one context (power laws, frequency locking, phase transitions etc.) suggest new questions and modes of enquiry in another. Uninspired or predictable research programs can benefit tremendously from such infusions, whether the new methods are borrowed from other intellectual disciplines (biology, physics), or depend on new technology (computers), or new methods of indexing (google) or collaboration (polymath).

One of my intellectual heroes — Wolfgang Haken — worked for eight years in R+D for Siemens in Munich after completing his PhD. I have a conceit (unsubstantiated as far as I know by biographical facts) that his experience working for a big engineering firm colored his approach to mathematics, and made it possible for him to imagine using industrial-scale “engineering” tools (e.g. integer programming, exhaustive computer search of combinatorial possibilities) to solve two of the most significant “pure” mathematical open problems in topology at the time — the knot recognition problem, and the four-color theorem. It is an interesting exercise to try to imagine (fantastic) variations. If I sit down and decide to try to prove (for example) Cannon’s conjecture, I am liable to try minor variations on things I have tried before, appeal for my intuition to examples that I understand well, read papers by others working in similar ways on the problem, etc. If I imagine that I have been given a billion dollars to prove the conjecture, I am almost certain to prioritize the task in different ways, and to entertain (and perhaps create) much more ambitious or innovative research programs to tackle the task. This is the way in which I understand the following quote by John Dewey, which I used as the colophon of my first book:

Every great advance in science has issued from a new audacity of the imagination.

A basic reference for the background to this post is my monograph.

Let $G$ be a group, and let $[G,G]$ denote the commutator subgroup. Every element of $[G,G]$ can be expressed as a product of commutators; the commutator length of an element $g$ is the minimum number of commutators necessary, and is denoted $\text{cl}(g)$. The stable commutator length is the growth rate of the commutator lengths of powers of an element; i.e. $\text{scl}(g) = \lim_{n \to \infty} \text{cl}(g^n)/n$. Recall that a group $G$ is said to satisfy a law if there is a nontrivial word $w$ in a free group $F$ for which every homomorphism from $F$ to $G$ sends $w$ to $\text{id}$.

The purpose of this post is to give a very short proof of the following proposition (modulo some background that I wanted to talk about anyway):

Proposition: Suppose $G$ obeys a law. Then the stable commutator length vanishes identically on $[G,G]$.

The proof depends on a duality between stable commutator length and a certain class of functions, called homogeneous quasimorphisms.

Definition: A function $\phi:G \to \mathbb{R}$ is a quasimorphism if there is some least number $D(\phi)\ge 0$ (called the defect) so that for any pair of elements $g,h \in G$ there is an inequality $|\phi(g) + \phi(h) - \phi(gh)| \le D(\phi)$. A quasimorphism is homogeneous if it satisfies $\phi(g^n) = n\phi(g)$ for all integers $n$.

Note that a homogeneous quasimorphism with defect zero is a homomorphism (to $\mathbb{R}$). The defect satisfies the following formula:

Lemma: Let $\phi$ be a homogeneous quasimorphism. Then $D(\phi) = \sup_{g,h} \phi([g,h])$.

A fundamental theorem, due to Bavard, is the following:

Theorem: (Bavard duality) There is an equality $\text{scl}(g) = \sup_\phi \frac {\phi(g)} {2D(\phi)}$ where the supremum is taken over all homogeneous quasimorphisms with nonzero defect.

In particular, $\text{scl}$ vanishes identically on $[G,G]$ if and only if every homogeneous quasimorphism on $G$ is a homomorphism.

One final ingredient is another geometric definition of $\text{scl}$ in terms of Euler characteristic. Let $X$ be a space with $\pi_1(X) = G$, and let $\gamma:S^1 \to X$ be a loop whose free homotopy class represents a given conjugacy class $g$. If $S$ is a compact, oriented surface without sphere or disk components, a map $f:S \to X$ is admissible if the restriction of $f$ to $\partial S$ factors as $\partial f:\partial S \to S^1 \to X$, where the second map is $\gamma$. For an admissible map, define $n(S)$ by the equality $\partial f_*[\partial S] = n(S) [S^1]$ in $H_1(S^1;\mathbb{Z})$ (i.e. $n(S)$ is the degree with which $\partial S$ wraps around $\gamma$). With this notation, one has the following:

Lemma: There is an equality $\text{scl}(g) = \inf_S \frac {-\chi^-(S)} {2n(S)}$.

Note: the function $-\chi^-$ is the sum of $-\chi$ over non-disk and non-sphere components of $S$. By hypothesis, there are none, so we could just write $-\chi$. However, it is worth writing $-\chi^-$ and observing that for more general (orientable) surfaces, this function is equal to the function $\rho$ defined in a previous post.

We now give the proof of the Proposition.

Proof. Suppose to the contrary that stable commutator length does not vanish on $[G,G]$. By Bavard duality, there is a homogeneous quasimorphism $\phi$ with nonzero defect. Rescale $\phi$ to have defect $1$. Then for any $\epsilon > 0$ there are elements $g,h$ with $\phi([g,h]) \ge 1-\epsilon$, and consequently $\text{scl}([g,h]) \ge 1/2 - \epsilon/2$ by Bavard duality. On the other hand, if $X$ is a space with $\pi_1(X)=G$, and $\gamma:S^1 \to X$ is a loop representing the conjugacy class of $[g,h]$, there is a map $f:S \to X$ from a once-punctured torus $S$ to $X$ whose boundary represents $\gamma$. The fundamental group of $S$ is free on two generators $x,y$ which map to $g,h$ respectively. If $w$ is a nontrivial word in $x,y$ mapping to the identity in $G$, there is an essential loop $\alpha$ in $S$ that maps inessentially to $X$. There is a finite cover $\widetilde{S}$ of $S$, of degree $d$ depending on the word length of $w$, for which $\alpha$ lifts to an embedded loop. This can be compressed to give a surface $S'$ with $-\chi^-(S') \le -\chi^-(\widetilde{S})-2$. However, Euler characteristic is multiplicative under coverings, so $-\chi^-(\widetilde{S}) = -\chi^-(S)\cdot d = d$. On the other hand, $n(S') = n(\widetilde{S})=d$ so $\text{scl}([g,h]) \le (d-2)/2d = 1/2 - 1/d$. If $G$ obeys a law, then $w$ (and hence $d$) may be chosen independently of $g,h$, whereas $\epsilon$ can be made arbitrarily small; taking $\epsilon < 2/d$ makes the two bounds incompatible. So $G$ does not obey a law. qed.
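The arithmetic at the end of the proof can be spelled out as follows. This is only a bookkeeping sketch with hypothetical function names: it takes the degree $d$ of the cover and the number $\epsilon$ as given, and checks when the lower bound from Bavard duality exceeds the upper bound from the compressed surface.

```python
from fractions import Fraction


def scl_lower_bound(eps):
    """Bavard duality with defect 1 and phi([g,h]) >= 1 - eps gives scl >= 1/2 - eps/2."""
    return Fraction(1, 2) - Fraction(eps) / 2


def scl_upper_bound(d):
    """The degree d cover of a once-punctured torus has -chi = d; compressing once
    lowers -chi by at least 2, while n = d, so scl <= (d - 2) / (2d) = 1/2 - 1/d."""
    return Fraction(d - 2, 2 * d)


def bounds_contradict(eps, d):
    """True when the two bounds are incompatible, i.e. exactly when eps < 2/d."""
    return scl_lower_bound(eps) > scl_upper_bound(d)


if __name__ == "__main__":
    d = 6
    for eps in (Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)):
        print(eps, scl_lower_bound(eps), scl_upper_bound(d), bounds_contradict(eps, d))
```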

In a previous post, I discussed some methods for showing that a given group contains a (nonabelian) free subgroup. The methods were analytic and/or dynamical, and phrased in terms of the existence (or nonexistence) of certain functions on $G$ or on spaces derived from $G$, or in terms of actions of $G$ on certain spaces. Dually, one can try to find a free group in $G$ by finding a homomorphism $\rho: F \to G$ and looking for circumstances under which $\rho$ is injective.

For concreteness, let $G = \pi_1(X)$ for some (given) space $X$. If $F$ is a free group, a representation $\rho:F \to G$ up to conjugation determines a homotopy class of map $f: S \to X$ where $S$ is a $K(F,1)$. The most natural $K(F,1)$’s to consider are graphs and surfaces (with boundary). It is generally not easy to tell whether a map of a graph or a surface to a topological space is $\pi_1$-injective at the topological level, but it might be easier if one can use some geometry.

Example: Let $X$ be a complete Riemannian manifold with sectional curvature bounded above by some negative constant $K < 0$. Convexity of the distance function in a negatively curved space means that given any map of a graph $f:\Gamma \to X$ one can flow $f$ by the negative gradient of total length until it undergoes some topology change (e.g. some edge shrinks to zero length) or it (asymptotically) achieves a local minimum (the adjective “asymptotically” here just means that the flow takes infinite time to reach the minimum, because the size of the gradient is small when the map is almost minimum; there are no analytic difficulties to overcome when taking the limit). A typical topological change might be some loop shrinking to a point, thereby certifying that a free summand of $\pi_1(\Gamma)$ mapped trivially to $G$ and should have been discarded. Technically, one probably wants to choose $\Gamma$ to be a trivalent graph, and when some interior edge collapses (so that four points come together) to let the $4$-valent vertex resolve itself into a pair of $3$-valent vertices in whichever of the three combinatorial possibilities is locally most efficient. The limiting graph, if nonempty, will be trivalent, with geodesic edges, and vertices at which the three edges are all (tangentially) coplanar and meet at angles of $2\pi/3$. Such a graph can be certified as $\pi_1$-injective provided the edges are sufficiently long (depending on the curvature $K$). After rescaling the metric on $X$ so that the supremum of the curvatures is $-1$, a trivalent geodesic graph with angles $2\pi/3$ at the vertices and edges of length at least $2\tanh^{-1}(1/2) = 1.0986\cdots$ is $\pi_1$-injective. To see this, lift to maps between universal covers, i.e. consider an equivariant map from a tree $\widetilde{\Gamma}$ to $\widetilde{X}$. Let $\ell$ be an embedded arc in $\widetilde{\Gamma}$, and consider the image in $\widetilde{X}$. Using Toponogov’s theorem, one can compare with a piecewise isometric map from $\ell$ to $\mathbb{H}^n$. The worst case is when all the edges are contained in a single $\mathbb{H}^2$, and all corners “bend” the same way. Providing the image does not bend as much as a horocycle, the endpoints of the image of $\ell$ stay far away in $\mathbb{H}^2$. An infinite sided convex polygon in $\mathbb{H}^2$ with all edges of length $2\tanh^{-1}(1/2)$ and all angles $2\pi/3$ osculates a horocycle, so we are done.
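The constant can be sanity-checked by a direct computation in the upper half-plane model. The sketch below is my own verification, not part of the argument above. It places consecutive vertices of the polygon at $(0,1)$ and $(t,1)$ on the horocycle $y=1$, uses the fact that the geodesic between them is a Euclidean semicircle centred at $(t/2,0)$, and solves for the edge length that produces a prescribed interior angle.

```python
from math import acosh, atanh, isclose, log, pi, tan


def critical_edge_length(interior_angle):
    """Edge length of an infinite-sided equiangular polygon with vertices on a horocycle.

    Upper half-plane model, horocycle y = 1, consecutive vertices (0, 1) and (t, 1).
    The geodesic edge is a semicircle centred at (t/2, 0); at (0, 1) its tangent
    makes an angle theta with the horocycle, where tan(theta) = t / 2. The interior
    angle of the polygon is pi - 2 * theta, and the hyperbolic distance between the
    two vertices satisfies cosh(dist) = 1 + t^2 / 2.
    """
    theta = (pi - interior_angle) / 2
    t = 2 * tan(theta)
    return acosh(1 + t * t / 2)


if __name__ == "__main__":
    # For interior angle 2*pi/3 all three printed numbers agree (and equal log 3).
    print(critical_edge_length(2 * pi / 3), 2 * atanh(0.5), log(3))
```

For interior angle $2\pi/3$ one gets $\cosh(\text{length}) = 5/3$, i.e. length $\log 3 = 2\tanh^{-1}(1/2)$, in agreement with the constant quoted in the example.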

Remark: The fundamental group of a negatively curved manifold is word-hyperbolic, and therefore contains many nonabelian free groups, which may be certified by pingpong applied to the action of the group on its Gromov boundary. The point of the previous example is therefore to certify that a certain subgroup is free in terms of local geometric data, rather than global dynamical data (so to speak). Incidentally, I would not swear to the correctness of the constants above.

Example: A given free group is the fundamental group of a surface with boundary in many different ways (this difference is one of the reasons that a group like $\text{Out}(F_n)$ is so much more complicated than the mapping class group of a surface). Pick a realization $F = \pi_1(S)$. Then a homomorphism $\rho:F \to G$ up to conjugacy determines a homotopy class of map from $S$ to $X$ as above. If $X$ is negatively curved as before, each boundary loop is homotopic to a unique geodesic, and we may try to find a “good” map $f:S \to X$ with boundary on these geodesics. There are many possible classes of good maps to consider:

1. Fix a conformal structure on $S$ and pick a harmonic map in the homotopy class of $f$. Such a map exists since the target is nonpositively curved, by the famous theorem of Eells-Sampson. The image is real analytic if $X$ is, and is at least as negatively curved as the target, and therefore there is an a priori upper bound on the intrinsic curvature of the image; if the supremum of the curvature on $X$ is normalized to be $-1$, then the image surface is $\text{CAT}(-1)$, which just means that pointwise it is at least as negatively curved as hyperbolic space. By Gauss-Bonnet, one obtains an a priori bound on the area of the image of $S$ in terms of the Euler characteristic (which just depends on the rank of $F$). On the other hand, this map depends on a choice of marked conformal structure on $S$, and the space of such structures is noncompact.
2. Vary over all conformal structures on $S$ and choose a harmonic map of least energy (if one exists) or find a sequence of maps that undergo a “neck pinch” as a sequence of conformal structures on $S$ degenerates. Such a neck pinch exhibits a simple curve in $S$ that is essential in $S$ but whose image is inessential in $X$; such a curve can be compressed, and the topology of $S$ simplified. Since each compression increases $\chi$, after finitely many steps the process terminates, and one obtains the desired map. This is Schoen-Yau’s method to construct a stable minimal surface representative of $S$. When the target is $3$-dimensional, the surface may be assumed to be unbranched, by a trick due to Osserman.
3. Following Thurston, pick an ideal triangulation of $S$ (i.e. a geodesic lamination of $S$ whose complementary regions are all ideal triangles); since $S$ has boundary, we may choose such a lamination by first picking a triangulation (in the ordinary sense) with all vertices on $\partial S$ and then “spinning” the vertices to infinity. Unless $\rho$ factors through a cyclic group, there is some choice of lamination so that the image of $f$ can be straightened along the lamination, and then the image spanned with $\text{CAT}(-1)$ ideal triangles to produce a pleated surface in $X$ representing $f$ (note: if $X$ has constant negative curvature, these ideal triangles can be taken to be totally geodesic). The space of pleated surfaces in fixed (closed) $X$ of given genus is compact, so this is a reasonable class of maps to work with.
4. If $G$ is merely a hyperbolic group, one can still construct pleated surfaces, not quite in $X$, but equivariantly in Mineyev’s flow space associated to $\widetilde{X}$. Here we are not really thinking of the triangles themselves, but the geodesic laminations they bound (which carry the same information).
5. If $X$ is complete and $3$-dimensional but noncompact, the space of pleated surfaces of given genus is generally not compact, and it is not always easy to find a pleated surface where you want it. This can sometimes be remedied by shrinkwrapping; one looks for a minimal/pleated/harmonic surface subject to the constraint that it cannot pass through some prescribed set of geodesics in $X$ (which act as “barriers” or “obstacles”, and force the resulting surface to end up roughly where one wants it to).

Anyway, one way or another, one can usually find a map of a surface, or a space of maps of surfaces, representing a given homomorphism, with some kind of a priori control of the geometry. Usually, this control is not enough to certify that a given map is $\pi_1$-injective, but sometimes it might be. For instance, a totally geodesic (immersed) surface in a complete manifold of constant negative curvature is always $\pi_1$-injective, and any surface whose extrinsic curvature is small enough will also be $\pi_1$-injective.

Geometric methods to certify injectivity of free or surface groups are very useful and flexible, as far as they go. Unfortunately, I know of very few topological methods to certify injectivity. By far the most important exception is the following:

Example: In $3$-dimensions, one should look for properly embedded surfaces. If $M$ is a $3$-manifold (possibly with boundary), and $S$ is a two-sided properly embedded surface, the famous Dehn’s Lemma (proved by Papakyriakopoulos) implies that either $S$ is $\pi_1$-injective, or there is an embedded essential loop in $S$ that bounds an embedded disk in $M$ on one side of $S$. Such a loop may be compressed (i.e. $S$ may be cut open along the loop, and two copies of the compressing disk sewn in) preserving the property of embeddedness, but increasing $\chi$. After finitely many steps, either $S$ compresses away entirely, or one obtains a $\pi_1$-injective surface. One way to ensure that $S$ does not compress away entirely is to start with a surface that is essential in (relative) homology; another way is to look for a surface dual to an action (of $\pi_1(M)$) on a tree. In the latter case, one can often construct quite different free subgroups in $\pi_1(M)$ by pingpong on the ends of the tree. Note by the way that this method produces closed surface subgroups as well as free subgroups. Note too that two-sidedness is essential to apply Dehn’s Lemma.

Remark: Modern $3$-manifold topologists are sometimes unreasonably indifferent to the power of Dehn’s Lemma (probably because this tool has been incorporated so fully into their subconscious?); it is worth reading Ralph Fox’s review of Papakyriakopoulos’s paper (linked above). Of this paper, he writes:

. . . it has already led to renewed attack on the problem of classifying the 3-dimensional manifolds; significant results have been and are being obtained. A complete solution has suddenly become a definite possibility.

Remember this was written more than 50 years ago — before the geometrization conjecture, before the JSJ decomposition, before the Scott core theorem, before Haken manifolds. The only reasonable reaction to this is: !!!

Example: The construction of injective surfaces by Dehn’s Lemma may be abstracted in the following way. Given a target space $X$, and a class of maps $\mathcal{F}$ of surfaces into $X$ (in some category; e.g. homotopy classes of maps, pleated surfaces, $\text{CAT}(-1)$ surfaces, etc.) suppose one can find a complexity $c:\mathcal{F} \to \mathcal{O}$ with values in some ordered set, such that if $f \in \mathcal{F}$ is not injective, one can find $f' \in \mathcal{F}$ of smaller complexity. Then if $\mathcal{O}$ is well-ordered, an injective surface may be found. If $\mathcal{O}$ is not well-ordered, one may ask at least that $c$ is upper semi-continuous on $\mathcal{F}$, and hope to extend it upper semi-continuously to some suitable compactification of $\mathcal{F}$. Even if $\mathcal{O}$ is not well-ordered, one can at least certify that a map is injective, by showing that it minimizes $c$. Here are some potential examples (none of them entirely satisfactory).

1. Given a (homologically trivial) homotopy class of loop $\gamma$ in $X$, one can look at all maps of orientable surfaces $S$ to $X$ with boundary factoring through $\gamma$. For such a surface, let $n(S)$ denote the degree with which the (possibly multiple) boundary (components) of $S$ wrap homologically around $\gamma$, and let $-\chi^-(S)$ denote the sum of $-\chi$ over the non-disk and non-sphere components of $S$. For each surface $S$, one considers the quantity $-\chi^-(S)/2n(S)$ (the factor of $2$ can be ignored if desired). The important feature of this quantity is that it does not change if $S$ is replaced by a finite cover. If $S$ is not $\pi_1$-injective, let $\alpha$ be an essential loop on $S$ whose image in $X$ is inessential. Peter Scott showed that any essential loop on a surface lifts to an embedded loop in some finite cover. Hence, after passing to such a cover, $\alpha$ may be compressed, and the resulting surface $S'$ satisfies $-\chi^-(S')/2n(S') < -\chi^-(S)/2n(S)$. In other words, a global minimizer of this quantity is injective. Such a surface is called extremal. The problem is that extremal surfaces do not always exist; but this construction motivates one to look for them.
2. Given a $\text{CAT}(-1)$ surface $S$ with geodesic boundary in $X$, one can retract $S$ to a geodesic spine, and encode the surface by the resulting fatgraph, with edges labelled by homotopy classes in $X$. Since Euler characteristic is local, one does not really care precisely how the pieces of the fatgraph are assembled, but only how many pieces of what kinds are needed for a given boundary. So if only finitely many such pieces appear in some infinite family of surfaces, one can in fact construct an extremal surface as above, which is necessarily injective (more technically, one reduces the computation of Euler characteristic to a linear programming problem, finds a rational extremal solution (which corresponds to a weighted sum of pieces of fatgraph), and glues together the pieces to construct the extremal surface; one situation in which this scheme can be made to work is explained in this paper of mine). Edges can be subdivided into a finite number of possibilities, so one just needs to ensure finiteness of the number of vertex types. One condition that ensures finiteness of vertex types is the existence of a uniform constant $C>0$ so that for each surface $S$ in the given family, and for each point $p \in S$, there is an estimate $\text{dist}(p,\partial S) \le C$. If this condition is violated, one finds pairs $p_i,S_i$ which converge in the geometric topology to a point in a complete (i.e. without boundary, but probably noncompact) surface.
3. Given $S \to X$, either compress an embedded essential loop, or realize $S$ by a least area surface. If $S$ is not injective, pass to a cover, compress a loop, and realize the result by a least area surface. Repeat this process. One obtains in this way a sequence of least area surfaces in $X$ (typically of bigger and bigger genus) and there is no reason to expect the process to terminate. If $X$ is a $3$-manifold, the curvature of a least area surface admits two-sided curvature bounds away from the boundary, by a theorem of Schoen (near the boundary, the negative curvature might blow up, but only in controlled ways — e.g. after rescaling about a sequence of points with the most negative curvature, one may obtain in the limit a helicoid). Away from the boundary, the family of surfaces one obtains vary precompactly in the $C^\infty$ topology, and one may obtain a complete locally least area lamination $\Lambda$ in the limit. If $\pi_1(\Lambda)$ is not injective, one can continue to pass to covers (applying a version of Scott’s theorem for infinite surfaces) and compress, and by transfinite induction, eventually arrive at a locally least area lamination with injective $\pi_1$. Of course, such a limit might well be a lamination by planes. However, the lamination one obtains is not completely arbitrary: since it is a limit of limits of . . . compact surfaces, one can choose a limit that admits a nontrivial invariant transverse measure (one must be careful here, since the lamination will typically have boundary). Or, as in bullet 2. above, one may insist that this limit lamination is complete (i.e. without boundary).

It is more tricky to find a limit lamination as in 3. without boundary and admitting an invariant transverse measure; in any case, this motivates the following:

Question: Is there a closed hyperbolic $3$-manifold $M$ which admits a locally least area transversely measured complete immersed lamination $\Lambda$, all of whose leaves are disks? (note that the answer is negative if one asks for the lamination to be embedded (there are several easy proofs of this fact)).

Secretly, the function that assigns $\inf_S -\chi^-(S)/2n(S)$ to a homologically trivial loop $\gamma$ is the stable commutator length of the conjugacy class in $\pi_1(X)$ represented by $\gamma$. Extremal surfaces can sometimes be certified by constructing certain functions on $\pi_1(X)$ called homogeneous quasimorphisms, but a discussion of such functions will have to wait for another post.

A few days ago, Joel Friedman posted a paper on the arXiv purporting to give a proof of the (strengthened) Hanna Neumann conjecture, a well-known problem in geometric group theory.

Simply stated, the problem is as follows.

Conjecture (Hanna Neumann): Let $F$ be a free group, and let $G$ and $H$ be finitely generated subgroups. For a subgroup $E$ of $F$, let $\rho(E) = \max(\text{rank}(E)-1,0)$. Then there is an inequality $\rho(G \cap H) \le \rho(G)\rho(H)$.

This conjecture was further strengthened by Walter Neumann (her son):

Conjecture (strengthened Hanna Neumann): With notation above, there is an inequality $\sum_x \rho(G \cap xHx^{-1}) \le \rho(G)\rho(H)$ where the sum is taken over $x \in G\backslash F / H$, i.e. over a set of double coset representatives (the conjugacy class of $G \cap xHx^{-1}$ depends only on the double coset $GxH$).

Notice by the way that since any free group embeds into $F_2$, the free group of rank $2$, one can assume that $F$ has rank $2$ above. This fact is implicit in the discussion below.

Friedman’s paper seems to be very carefully written, and contains some new ideas (which I do not yet really understand), namely an approach using sheaf theory. But in this post I want to restrict myself to some simple (and probably well-known) geometric observations.

The first step is to reduce the problem to a completely graph-theoretic one, following Stallings; in fact, Benson Farb tells me that he thinks this reduction was known to Stallings, or at least to Dicks/Formanek (and in any case is very close to some ideas Stallings and Gersten introduced to study the problem; more on that in a later post). Friedman makes the following definition:

Definition: Let $\mathcal{G}$ be a finite group and $g_1,g_2 \in \mathcal{G}$ be two elements (that do not necessarily generate $\mathcal{G}$). The directed Cayley graph $C$ is the graph with vertex set $\mathcal{G}$ and with a directed edge from $v$ to $vg_i$ labeled $i$ for each $v \in \mathcal{G}$ and $i=1,2$.

In other words, $C$ is a graph whose edges are oriented and labeled with either $1$ or $2$ in such a way that each vertex has at most one outgoing and one incoming edge with each label, and such that there is a transitive (on the vertices) free action of a group $\mathcal{G}$ on $C$. (Note: for some reason, Friedman wants his group to act on the right, and therefore has directed edges from $v$ to $g_iv$, but this is just a matter of convention).

For any finite graph $K$, not necessarily connected, let $\rho(K) = \sum_j \max(0,-\chi(K_j))$; i.e. $\rho(K) = \sum_j \rho(\pi_1(K_j))$ where the sum is taken over the connected components $K_j$ of $K$. Friedman shows (but this reduction is well-known) that the SHNC is equivalent to the following graph-theoretic inequality:
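For experimentation it is convenient to be able to compute $\rho$ directly. Here is a minimal sketch using a union-find structure to separate components; the function name `rho` and the input format (a list of vertex labels and a list of pairs) are my own choices.

```python
def rho(vertices, edges):
    """rho(K) = sum over components K_j of max(0, -chi(K_j)).

    `vertices` is an iterable of hashable labels; `edges` is a list of (u, v)
    pairs, where loops and multiple edges are allowed.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the root, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)

    # Accumulate -chi = (#edges) - (#vertices) separately on each component.
    neg_chi = {}
    for v in list(parent):
        root = find(v)
        neg_chi[root] = neg_chi.get(root, 0) - 1
    for u, _ in edges:
        neg_chi[find(u)] += 1
    return sum(max(0, x) for x in neg_chi.values())


if __name__ == "__main__":
    # A figure eight together with a disjoint edge: -chi = 0 overall, but rho = 1.
    print(rho([0, 1, 2], [(0, 0), (0, 0), (1, 2)]))
```

The example in the last lines shows how a tree component makes $\rho$ and $-\chi$ differ.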

Theorem: The SHNC is equivalent to the following statement. For any graph $C$ as above, and any two subgraphs $K,K'$ we have $\sum_{g \in \mathcal{G}} \rho(K \cap gK') \le \rho(K)\rho(K')$.

The purpose of this blog entry is to show that there is a very simple proof of this inequality when $\rho$ is replaced with $-\chi$. This is not such a strange thing to do, since $\rho$ and $-\chi$ are equal for graphs without acyclic components (i.e. without components that are trees), and for “random” graphs $K,K'$ one does not expect the difference between $\rho$ and $-\chi$ to be very big. The argument proceeds as follows. Suppose $K$ has $v$ vertices and $e_1,e_2$ edges of kind $1,2$ respectively, and define $v',e_1',e_2'$ similarly for $K'$. Then

• $(-\chi(K))(-\chi(K')) = (v-e_1-e_2)(v'-e_1'-e_2')$

On the other hand, since Euler characteristic is local, we just need to count how many vertices and edges of each kind turn up in each $K \cap gK'$. But this is easy: every vertex of $K$ is equal to exactly one translate of every vertex of $K'$, and similarly for edges of each kind. Hence

• $\sum_g -\chi(K \cap gK') = e_1e_1' + e_2e_2' - vv'$
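The counting identity can be checked by brute force on a small example. The sketch below is an illustration under assumptions of my own choosing: the group is $S_3$, the generators $g_1,g_2$ are a transposition and a $3$-cycle, and the subgraphs $K,K'$ are induced on arbitrary vertex subsets. An edge from $v$ to $vg_i$ is stored as the pair `(v, i)`.

```python
from itertools import permutations


def mul(p, q):
    """Product of permutations: (p*q)(x) = p[q[x]]."""
    return tuple(p[q[x]] for x in range(len(q)))


G = list(permutations(range(3)))      # the symmetric group S_3
GENS = {1: (1, 0, 2), 2: (1, 2, 0)}   # g_1, g_2 (an arbitrary choice)


def induced_edges(V):
    """Edges (v, i), meaning v -> v*g_i, with both endpoints in V."""
    return {(v, i) for v in V for i in GENS if mul(v, GENS[i]) in V}


def translate(g, V, E):
    """Left-translate a subgraph: g.(v -> v*g_i) = (g*v -> g*v*g_i)."""
    return {mul(g, v) for v in V}, {(mul(g, v), i) for (v, i) in E}


def neg_chi(V, E):
    return len(E) - len(V)


def count(E, i):
    return sum(1 for (_, j) in E if j == i)


def both_sides(V, E, V2, E2):
    """Return (sum over g of -chi(K ∩ gK'), e1*e1' + e2*e2' - v*v')."""
    lhs = 0
    for g in G:
        gV, gE = translate(g, V2, E2)
        lhs += neg_chi(V & gV, E & gE)
    rhs = (count(E, 1) * count(E2, 1) + count(E, 2) * count(E2, 2)
           - len(V) * len(V2))
    return lhs, rhs


V, V2 = set(G[:4]), set(G[1:])
E, E2 = induced_edges(V), induced_edges(V2)
lhs, rhs = both_sides(V, E, V2, E2)
```

The two returned numbers agree because each vertex (or edge of a given label) of $K$ coincides with a given vertex (or edge of the same label) of $K'$ under exactly one translate.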

So the inequality one wants to show is $e_1e_1' + e_2e_2' - vv' \le (v-e_1-e_2)(v'-e_1'-e_2')$ which simplifies to

• $v(e_1' + e_2') + v'(e_1 + e_2) \le 2vv' + e_1e_2' + e_2 e_1'$

On the other hand, each graph $K,K'$ has at most two edges at any vertex with each label (one incoming and one outgoing), and therefore we have inequalities $0 \le e_1,e_2 \le v, 0 \le e_1',e_2' \le v'$. Subject to these constraints, the inequality above is straightforward to prove. To see this, first fix some non-negative values of $v,v'$ and let $X$ be the four-dimensional cube of possible values of $e_1,e_2,e_1',e_2'$. Since both sides of the inequality are linear as a function of each $e_i$ or $e_i'$, if the inequality is violated at any point in $X$ one may draw a straight line in $X$ corresponding to varying one of the co-ordinates (e.g. $e_1$) while keeping the others fixed, and deduce that the inequality must be violated on one of the faces of $X$. Inductively, if the inequality is violated at all, it is violated at a vertex of $X$, which may be ruled out by inspection; qed.

This argument shows that the whole game is to understand the acyclic components of $K \cap gK'$; i.e. those which are topologically trees, and therefore contribute $0$ to $\rho$, but $-1$ to $-\chi$.

Incidentally, for all I know, this simple argument is explicitly contained in either Stallings’ or Gersten’s paper (it is surely not original in any case). If a reader can verify this, please let me know!

Update: Walter Neumann informs me that this observation (that the inequality is true with $-\chi$ in place of $\rho$) is in his paper in which he introduces the SHNC! He further shows in that paper that for “most” $G$, the SHNC is true for all $H$.

Update (6/29): Warren Dicks informs me that he was not aware of the reduction of SHNC to the graph-theoretic formulation described above. Friedman’s webpage acknowledges the existence of an error in the paper, and says that he is working to correct it. One problem that I know of (discovered mostly by my student Steven Frankel) concerns the commutativity of the diagram on page 10.

Update (10/22): It has been a few months since I last edited this page, and Joel Friedman has not updated either the arXiv paper, or the statement on his webpage that he is “trying to fix the error”. Since wikipedia mentions Friedman’s announcement, I thought it would be worth going on record at this point to say that Friedman’s arXiv paper (version 1 — the only version at the point I write this) is definitely in error, and that I believe the error is fundamental, and cannot be repaired (this is not to say that the paper does not contain some things of interest (it does), or that Friedman does not acknowledge the error (he does), just that it is worth clearing up any possible ambiguity about the situation for readers who are wondering about the status of the SHNC). The problem is the “not entirely standard” (quote from Friedman’s paper) diagrams, like the one on page 10. In particular, the claimed proof of Theorem 5.6, that the projections constructed in Lemma 5.5 (by a very general dimension counting argument) fit into a diagram with the desired properties is false. Any construction of projections satisfying the desired properties must be quite special. Nevertheless, one can certainly still define Friedman’s sheaf $\mathcal{K}$, and ask whether it has $\rho(\mathcal{K})=0$ (in Friedman’s sense); this would, as far as I can tell, prove SHNC; however, I do not know of any reason why it should hold (or whether there are any counterexamples, which might exist even if SHNC is true).

More ambitious than simply showing that a group is infinite is to show that it contains an infinite subgroup of a certain kind. One of the most important kinds of subgroup to study is the free group. Hence, one is interested in the question:

Question: When does a group contain a (nonabelian) free subgroup?

Again, one can (and does) ask this question both about a specific group, and about certain classes of groups, or for a typical (in some sense) group from some given family.

Example: If $\mathcal{P}$ is a property of groups that is inherited by subgroups, then if no free group satisfies $\mathcal{P}$, no group that satisfies $\mathcal{P}$ can contain a free subgroup. An important property of this kind is amenability. A (discrete) group $G$ is amenable if it admits an invariant mean; that is, if there is a linear map $m: L^\infty(G) \to \mathbb{R}$ (i.e. a way to define the average of a bounded function over $G$) satisfying three basic properties:

1. $m(f) \ge 0$ if $f\ge 0$ (i.e. the average of a non-negative function is non-negative)
2. $m(\chi_G)=1$ where $\chi_G$ is the constant function taking the value $1$ everywhere on $G$ (i.e. the average of the constant function $1$ is normalized to be $1$)
3. $m(g\cdot f) = m(f)$ for every $g \in G$ and $f \in L^\infty(G)$, where $(g\cdot f)(x) = f(g^{-1}x)$ (i.e. the mean is invariant under the obvious action of $G$ on $L^\infty(G)$)
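For a finite group the three properties are satisfied by the ordinary average. Here is a minimal sketch for the cyclic group $\mathbb{Z}/6$, written additively (so $g^{-1}x$ becomes $x-g$); the choice of group and of the test function `f` are assumptions made purely for illustration.

```python
from fractions import Fraction

N = 6  # the cyclic group Z/6, written additively


def mean(f):
    """The uniform average of a function on Z/N."""
    return Fraction(sum(f(x) for x in range(N)), N)


def act(g, f):
    """(g.f)(x) = f(g^{-1} x); in additive notation this is f(x - g)."""
    return lambda x: f((x - g) % N)


def f(x):
    # An arbitrary bounded function on Z/6.
    return x * x - 3


invariant = all(mean(act(g, f)) == mean(f) for g in range(N))
normalized = mean(lambda x: 1) == 1
```

Exact rational arithmetic is used so that the invariance check is an equality rather than a floating-point comparison.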

If $H$ is a subgroup of $G$, there are (many) $H$-invariant homomorphisms $j: L^\infty(H) \to L^\infty(G)$ taking non-negative functions to non-negative functions, and $\chi_H$ to $\chi_G$; for example, the (left) action of $H$ on $G$ breaks up into a collection of copies of $H$ acting on itself, right-multiplied by a collection of right coset representatives. After making a choice of representatives $\lbrace g_\alpha \rbrace$, one for each coset $Hg_\alpha$, we can define $j(f)(hg_\alpha) = f(h)$. Composing with $m$ shows that every subgroup of an amenable group is amenable (this is harder to see in the “geometric” definition of amenable groups in terms of Følner sets). On the other hand, as is well-known, a nonabelian free group is not amenable. Hence, amenable groups do not contain nonabelian free subgroups.

The usual way to see that a nonabelian free group is not amenable is to observe that it contains enough disjoint “copies” of big subsets. For concreteness, let $F$ denote the free group on two generators $a,b$, and write their inverses as $A,B$. Let $W_a, W_A$ denote the sets of reduced words that start with $a$ and with $A$ respectively, and let $\chi_a,\chi_A$ denote the indicator functions of $W_a,W_A$ respectively. We suppose that $F$ is amenable, and derive a contradiction. Note that $F = W_a \cup aW_A$, so (using the invariance of $m$ under translation by $a$) $m(\chi_a) + m(\chi_A) \ge 1$. Let $V$ denote the set of reduced words that start with one of the strings $a,A,ba,bA$, and let $\chi_V$ denote the indicator function of $V$. Notice that $V$ is made of two disjoint copies of each of $W_a,W_A$. So on the one hand, $m(\chi_V) \le m(\chi_F) = 1$, but on the other hand, $m(\chi_V) = 2 (m(\chi_a)+m(\chi_A)) \ge 2$, a contradiction.
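The two set-theoretic facts used in this argument can be tested on finite balls of reduced words. The sketch below represents elements of $F$ as strings in the letters `a, A, b, B`; the helper names `times`, `ball` and `check` are hypothetical, and a check on a finite ball is of course only a sanity test, not a proof.

```python
INV = {"a": "A", "A": "a", "b": "B", "B": "b"}


def times(x, w):
    """Left-multiply the reduced word w by the letter x, with free reduction."""
    return w[1:] if w and w[0] == INV[x] else x + w


def ball(n):
    """All reduced words of length at most n (the empty string is the identity)."""
    words, level = {""}, {""}
    for _ in range(n):
        level = {w + c for w in level for c in "aAbB"
                 if not (w and w[-1] == INV[c])}
        words |= level
    return words


def check(n):
    small, big = ball(n), ball(n + 1)
    W_a = {w for w in big if w.startswith("a")}
    W_A = {w for w in big if w.startswith("A")}
    # F = W_a union a.W_A, checked on the ball of radius n.
    aW_A = {times("a", w) for w in W_A}
    covers = small <= (W_a | aW_A)
    # The four pieces of V (W_a, W_A, b.W_a, b.W_A) are pairwise disjoint.
    pieces = [W_a, W_A,
              {times("b", w) for w in W_a},
              {times("b", w) for w in W_A}]
    disjoint = all(p.isdisjoint(q)
                   for i, p in enumerate(pieces) for q in pieces[i + 1:])
    return covers, disjoint
```

The set `W_A` is taken from the slightly larger ball because a word of length $n$ not starting with $a$ is the translate by $a$ of a word of length $n+1$.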

Conversely, the usual way to show that a group $G$ is amenable is to use the Følner condition. Suppose that $G$ is finitely generated by some subset $S$, and let $C$ denote the Cayley graph of $G$ (so that $C$ is a homogeneous locally finite graph). Suppose one can find finite subsets $U_i$ of vertices so that $|\partial U_i|/|U_i| \to 0$ (here $|U_i|$ means the number of vertices in $U_i$, and $|\partial U_i|$ means the number of vertices in $U_i$ that share an edge with $C - U_i$). Since the “boundary” of $U_i$ is small compared to $U_i$, averaging a bounded function over $U_i$ is an “almost invariant” mean; a weak-* limit (in the dual space to $L^\infty(G)$) is an invariant mean. Examples of amenable groups include
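To see the Følner ratio in action, the following sketch compares balls in $\mathbb{Z}^2$ (with the standard generators) against balls in the free group on $a,b$. The choice of these two groups, of balls as the candidate sets $U_i$, and all function names are assumptions made for illustration. For balls in $\mathbb{Z}^2$ the ratio tends to $0$, whereas for balls in the free group it stays above $2/3$.

```python
from fractions import Fraction

INV = {"a": "A", "A": "a", "b": "B", "B": "b"}


def z2_ball(n):
    """Ball of radius n in Z^2 for the word metric of the standard generators."""
    return {(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)
            if abs(x) + abs(y) <= n}


def z2_neighbors(p):
    x, y = p
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def free_ball(n):
    """Reduced words of length at most n in the letters a, A, b, B."""
    words, level = {""}, {""}
    for _ in range(n):
        level = {w + c for w in level for c in "aAbB"
                 if not (w and w[-1] == INV[c])}
        words |= level
    return words


def free_neighbors(w):
    """Neighbors in the Cayley graph: right-multiply by each generator."""
    return [w[:-1] if w and w[-1] == INV[c] else w + c for c in "aAbB"]


def boundary_ratio(U, neighbors):
    """|boundary of U| / |U|, where the boundary is the set of vertices of U
    sharing an edge with the complement of U."""
    boundary = [u for u in U if any(v not in U for v in neighbors(u))]
    return Fraction(len(boundary), len(U))
```

For example, `boundary_ratio(z2_ball(5), z2_neighbors)` is $20/61$, while `boundary_ratio(free_ball(4), free_neighbors)` is $108/161$.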

1. Finite groups
2. Abelian groups
3. Unions and extensions of amenable groups
4. Groups of subexponential growth

and many others. For instance, virtually solvable groups (i.e. groups containing a solvable subgroup with finite index) are amenable.

Example: No amenable group can contain a nonabelian free subgroup. The von Neumann conjecture asked whether the converse was true. This conjecture was disproved by Olshanskii. Subsequently, Adyan showed that the infinite free Burnside groups are not amenable. These are groups $B(m,n)$ with $m\ge 2$ generators, and subject only to the relations that the $n$th power of every element is trivial. When $n$ is odd and at least $665$, these groups are infinite and nonamenable. Since they are torsion groups, they do not even contain a copy of $\mathbb{Z}$, let alone a nonabelian free group!

Example: The Burnside groups are examples of groups that obey a law; i.e. there is a word $w(x_1,x_2,\cdots,x_n)$ in finitely many free variables, such that $w(g_1,g_2,\cdots,g_n)=\text{id}$ for every choice of $g_1,\cdots,g_n \in G$. For example, an abelian group satisfies the law $x_1x_2x_1^{-1}x_2^{-1}=\text{id}$. Evidently, a group that obeys a law does not contain a nonabelian free subgroup. However, there are examples of groups which do not obey a law, but which also do not contain any nonabelian free subgroup. An example is the classical Thompson’s group $F$, which is the group of orientation-preserving piecewise-linear homeomorphisms of $[0,1]$ with finitely many breakpoints at dyadic rationals (i.e. points of the form $p/2^q$ for integers $p,q$) and with slopes integral powers of $2$. To see that this group does not obey a law, one can show (quite easily) that in fact $F$ is dense (in the $C^0$ topology) in the group $\text{Homeo}^+([0,1])$ of all orientation-preserving homeomorphisms of the interval. This latter group contains nonabelian free groups; by approximating the generators of such a group arbitrarily closely, one obtains pairs of elements in $F$ that do not satisfy any identity of length shorter than any given constant. On the other hand, a famous theorem of Brin-Squier says that $F$ does not contain any nonabelian free subgroup. In fact, the entire group $\text{PL}^+([0,1])$ does not contain any nonabelian free subgroup. A short proof of this fact can be found in my paper as a corollary of the fact that every subgroup $G$ of $\text{PL}^+([0,1])$ has vanishing stable commutator length; since stable commutator length is nonvanishing in nonabelian free groups, this shows that there are no such subgroups of $\text{PL}^+([0,1])$. (Incidentally, and complementarily, there is a very short proof that stable commutator length vanishes on any group that obeys a law; we will give this proof in a subsequent post).

Example: If $G$ surjects onto $H$, and $H$ contains a free subgroup $F$, then there is a section from $F$ to $G$ (by freeness), and therefore $G$ contains a free subgroup.

Example: The most useful way to show that $G$ contains a nonabelian free subgroup is to find a suitable action of $G$ on some space $X$. The following is known as Klein’s ping-pong lemma. Suppose one can find disjoint subsets $U^\pm$ and $V^\pm$ of $X$, and elements $g,h \in G$ so that $g(U^+ \cup V^\pm) \subset U^+$, $g^{-1}(U^- \cup V^\pm) \subset U^-$, and similarly interchanging the roles of $U^\pm, V^\pm$ and $g,h$. If $w$ is a reduced word in $g^{\pm 1},h^{\pm 1}$, one can follow the trajectory of a point under the orbit of subwords of $w$ to verify that $w$ is nontrivial. The most common way to apply this in practice is when $g,h$ act on $X$ with source-sink dynamics; i.e. the element $g$ has two fixed points $u^\pm$ so that every other point converges to $u^+$ under positive powers of $g$, and to $u^-$ under negative powers of $g$. Similarly, $h$ has two fixed points $v^\pm$ with similar dynamics. If the points $u^\pm,v^\pm$ are distinct, and $X$ is compact, one can take any small open neighborhoods $U^\pm,V^\pm$ of $u^\pm,v^\pm$, and then sufficiently large powers of $g$ and $h$ will satisfy the hypotheses of ping-pong.
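A classical instance (not taken from the discussion above, and using a simpler two-set variant of ping-pong rather than the four sets $U^\pm,V^\pm$) is the pair of integer matrices with rows $(1,2),(0,1)$ and $(1,0),(2,1)$ acting on $\mathbb{R}^2$: nonzero powers of the first map $\lbrace |x|<|y| \rbrace$ into $\lbrace |x|>|y| \rbrace$, and nonzero powers of the second do the reverse. The sketch below does not run the ping-pong argument itself; it just confirms by brute force that no short reduced word in these matrices is the identity. The letters `a, A, b, B` and all function names are my own labels.

```python
INV = {"a": "A", "A": "a", "b": "B", "B": "b"}
IDENTITY = ((1, 0), (0, 1))
MATRIX = {
    "a": ((1, 2), (0, 1)), "A": ((1, -2), (0, 1)),
    "b": ((1, 0), (2, 1)), "B": ((1, 0), (-2, 1)),
}


def matmul(p, q):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def reduced_words(n):
    """All reduced words of length exactly n."""
    level = {""}
    for _ in range(n):
        level = {w + c for w in level for c in "aAbB"
                 if not (w and w[-1] == INV[c])}
    return level


def evaluate(word):
    m = IDENTITY
    for c in word:
        m = matmul(m, MATRIX[c])
    return m


def all_nontrivial(max_len):
    """True if every nonempty reduced word of length <= max_len is not the identity."""
    return all(evaluate(w) != IDENTITY
               for k in range(1, max_len + 1) for w in reduced_words(k))
```

A finite search like this can only rule out short relations; it is the ping-pong argument that rules out all of them.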

Example: Every hyperbolic group $G$ acts on its Gromov boundary $\partial_\infty G$. This boundary is the set of equivalence classes of quasigeodesic rays in (the Cayley graph of) $G$, where two rays are equivalent if they are a finite Hausdorff distance apart. Non-torsion elements act on the boundary with source-sink dynamics. Consequently, every pair of non-torsion elements in a hyperbolic group either generate a virtually cyclic group, or have powers that generate a nonabelian free group.

It is striking to see how easy it is to construct nonabelian free subgroups of a hyperbolic group, and how difficult to construct closed surface subgroups. We will return to the example of hyperbolic groups in a future post.

Example: The Tits alternative says that any linear group $G$ (i.e. any subgroup of $\text{GL}(n,\mathbb{R})$ for some $n$) either contains a nonabelian free subgroup, or is virtually solvable (and therefore amenable). This can be derived from ping-pong, where $G$ is made to act on certain spaces derived from the linear action (e.g. locally symmetric spaces compactified in certain ways, and buildings associated to discrete valuations on the ring of entries of matrix elements of $G$).

Example: There is a Tits alternative for subgroups of other kinds of groups, for example mapping class groups, as shown by Ivanov and McCarthy. The mapping class group (of a surface) acts on the Thurston boundary of Teichmüller space. Every subgroup of the mapping class group either contains a nonabelian free subgroup, or is virtually abelian. Roughly speaking, either elements move points in the boundary with enough dynamics to be able to do ping-pong, or else the action is “localized” in a train-track chart, and one obtains a linear representation of the group (enough to apply the ordinary Tits alternative). Virtually solvable subgroups of mapping class groups are virtually abelian.

Example: A similar Tits alternative holds for $\text{Out}(F_n)$. This was shown by Bestvina-Feighn-Handel in these three papers (the third paper shows that solvable subgroups are virtually abelian, thus emphasizing the parallels with mapping class groups).

Example: If $G$ is a finitely generated group of homeomorphisms of $S^1$, then there is a kind of Tits alternative, first proposed by Ghys, and proved by Margulis: either $G$ preserves a probability measure on $S^1$ (which might be singular), or it contains a nonabelian free subgroup. To see this, first note that either $G$ has a finite orbit (which supports an invariant probability measure) or the action is semi-conjugate to a minimal action (one with all orbits dense). In the second case, the proof depends on understanding the centralizer of the group action: either the centralizer is infinite, in which case the group is conjugate to a group of rotations, or it is finite cyclic, and one obtains an action of $G$ on a “smaller” circle, by quotienting out by the centralizer. So one may assume the action is minimal with trivial centralizer. In this case, one shows that the action has the property that for any nonempty intervals $I,J$ in $S^1$, there is some $g \in G$ with $g(I) \subset J$; i.e. any interval may be put inside any other interval by some element of the group. For such an action, it is very easy to do ping-pong. Incidentally, a minor variation on this result, and with essentially this argument, was established by Thurston in the context of uniform foliations of $3$-manifolds before Ghys proposed his question.

Example: If $\rho_t$ is an (algebraic) family of representations of a (countable) free group $F$ into an algebraic group, then either some element $g \in F$ is in the kernel of every $\rho_t$, or the set of faithful representations is “generic”, i.e. the intersection of countably many open dense sets. This is because the set of representations for which a given element is in the kernel is Zariski closed, and therefore its complement is open and either empty or dense (one must add suitable hypotheses or conditions to the above to make it rigorous).