Ellipsoids and KAK

As many readers are no doubt aware, the title of this blog comes from the famous book Geometry and the Imagination by Hilbert and Cohn-Vossen (based on lectures given by Hilbert). One of the first things discussed in that book is the geometry of conics, especially in two and three dimensions. An ellipsoid is a certain kind of (real) quadric surface, i.e. a surface in \mathbb{R}^n defined by a single quadratic equation of the co-ordinates. It may also be defined as the image of the unit (n-1)-dimensional sphere under an affine self-map of \mathbb{R}^n. After composing with a translation, one may imagine an ellipsoid centered at the origin, and think of it as the image of the unit sphere under a linear automorphism of \mathbb{R}^n — i.e. transformation by a nonsingular matrix M.

A (generic) ellipsoid has n axes; in dimension three, these are the “major axis”, the “minor axis” and the “mean axis”. Distance to the origin is a Morse function on a generic ellipsoid; the symmetry of an ellipsoid under the antipodal map means that critical points occur in antipodal pairs. There are a pair of critical points of each index between 0 and n. There is a gradient flow line of this Morse function between each pair of critical points whose index differs by 1, and the union of these flowlines are the (2-dimensional) ellipse obtained by intersecting the ellipsoid with the plane spanned by the pair of axes in question. This shows that these axes are mutually perpendicular.

One may use this geometric picture to “see” the KAK decomposition of \text{GL}(n,\mathbb{R}) as follows, where K denotes the orthogonal subgroup \text{O}(n,\mathbb{R}), and A denotes the subgroup of diagonal matrices with positive entries. Let M be a linear map of \mathbb{R}^n, and let E be the ellipsoid which is the image of the unit sphere under M. Let \xi_i be the axes of E of index i. There is a unique orthogonal matrix O taking the \xi_i to the co-ordinate axes. There is a unique diagonal matrix D taking O(E) to the round sphere. Hence the composition ODM is orthogonal, and we can express M as a product of an orthogonal matrix, a diagonal matrix, and another orthogonal matrix.

One can use ellipsoids to visualize another less standard matrix decomposition as follows. For simplicity we concentrate on the case of dimension 3. The minor and mean axis span a plane \pi which intersects the ellipsoid in the “smallest” possible ellipse. Rotate this plane by keeping the mean axis fixed, and tilting the minor axis towards the major axis. At some unique point one obtains a plane \pi' that intersects the ellipsoid in a round circle. One may shear the ellipsoid, keeping this plane fixed, into an ellipsoid of rotation. This describes a way to factorize M as a product of a shear, a diagonal matrix with two equal eigenvalues, and a rotation.

Question: What is the generalization of the “shear, dilate, rotate” factorization in higher dimensions?

Question: Is there a way to see the Iwasawa (KAN) decomposition geometrically, by using ellipsoids?

Posted in Visualization | Tagged , , , | Leave a comment

Combable functions

The purpose of this post is to discuss my recent paper with Koji Fujiwara, which will shortly appear in Ergodic Theory and Dynamical Systems, both for its own sake, and in order to motivate some open questions that I find very intriguing. The content of the paper is a mixture of ergodic theory, geometric group theory, and computer science, and was partly inspired by a paper of Jean-Claude Picaud. To state the results of the paper, I must first introduce a few definitions and some background.

Let \Gamma be a finite directed graph (hereafter a digraph) with an initial vertex, and edges labeled by elements of a finite set S in such a way that each vertex has at most one outgoing edge with any given label. A finite directed path in \Gamma starting at the initial vertex determines a word in the alphabet S, by reading the labels on the edges traversed (in order). The set L \subset S^* of words obtained in this way is an example of what is called a regular language, and is said to be parameterized by \Gamma. Note that this is not the most general kind of regular language; in particular, any language L of this kind will necessarily be prefix-closed (i.e. if w \in L then every prefix of w is also in L). Note also that different digraphs might parameterize the same (prefix-closed) regular language L.

If S is a set of generators for a group G, there is an obvious map L \to G called the evaluation map that takes a word w to the element of G represented by that word.

Definition: Let G be a group, and S a finite generating set. A combing of G is a (prefix-closed) regular language L for which the evaluation map L \to G is a bijection, and such that every w \in L represents a geodesic in G.

The intuition behind this definition is that the set of words in L determines a directed spanning tree in the Cayley graph C_S(G) starting at \text{id}, and such that every directed path in the tree is a geodesic in C_S(G). Note that there are other definitions of combing in the literature; for example, some authors do not require the evaluation map to be a bijection, but only a coarse bijection.

Fundamental to the theory of combings is the following Theorem, which paraphrases one of the main results of this paper:

Theorem: (Cannon) Let G be a hyperbolic group, and let S be a finite generating set. Choose a total order on the elements of S. Then the language L of lexicographically first geodesics in G is a combing.

The language L described in this theorem is obviously geodesic and prefix-closed, and the evaluation map is bijective; the content of the theorem is that L is regular, and parameterized by some finite digraph \Gamma. In the sequel, we restrict attention exclusively to hyperbolic groups G.

Given a (hyperbolic) group G, a generating set S, a combing L, one makes the following definition:

Definition: A function \phi:G \to \mathbb{Z} is weakly combable (with respect to S,L) if there is a digraph \Gamma parameterizing L and a function d\phi from the vertices of \Gamma to \mathbb{Z} so that for any w \in L, corresponding to a path \gamma in \Gamma, there is an equality \phi(w) = \sum_i d\phi(\gamma(i)).

In other words, a function \phi is weakly combable if it can be obtained by “integrating” a function d\phi along the paths of a combing. One furthermore says that a function is combable if it changes by a bounded amount under right-multiplication by an element of S, and bicombable if it changes by a bounded amount under either left or right multiplication by an element of S. The property of being (bi-)combable does not depend on the choice of a generating set S or a combing L.

Example: Word length (with respect to a given generating set S) is bicombable.

Example: Let \phi:G \to \mathbb{Z} be a homomorphism. Then \phi is bicombable.

Example: The Brooks counting quasimorphisms (on a free group) and the Epstein-Fujiwara counting quasimorphisms are bicombable.

Example: The sum or difference of two (bi-)combable functions is (bi-)combable.

A particularly interesting example is the following:

Example: Let S be a finite set which generates G as a semigroup. Let \phi_S denote word length with respect to S, and \phi_{S^{-1}} denote word length with respect to S^{-1} (which also generates G as a semigroup). Then the difference \psi_S:= \phi_S - \phi_{S^{-1}} is a bicombable quasimorphism.

The main theorem proved in the paper concerns the statistical distribution of values of a bicombable function.

Theorem: Let G be a hyperbolic group, and let \phi be a bicombable function on G. Let \overline{\phi}_n be the value of \phi on a random word in G of length n (with respect to a certain measure \widehat{\nu} depending on a choice of generating set). Then there are algebraic numbers E and \sigma so that as distributions, n^{-1/2}(\overline{\phi}_n - nE) converges to a normal distribution with standard deviation \sigma.

One interesting corollary concerns the length of typical words in one generating set versus another. The first thing that every geometric group theorist learns is that if S_1, S_2 are two finite generating sets for a group G, then there is a constant K so that every word of length n in one generating set has length at most nK and at least n/K in the other generating set. If one considers an example like \mathbb{Z}^2, one sees that this is the best possible estimate, even statistically. However, if one restricts attention to a hyperbolic group G, then one can do much better for typical words:

Corollary: Let G be hyperbolic, and let S_1,S_2 be two finite generating sets. There is an algebraic number \lambda_{1,2} so that almost all words of length n with respect to the S_1 generating set have length almost equal to n\lambda_{1,2} with respect to the S_2 generating set, with error of size O(\sqrt{n}).

Let me indicate very briefly how the proof of the theorem goes.

Sketch of Proof: Let \phi be bicombable, and let d\phi be a function from the vertices of \Gamma to \mathbb{Z}, where \Gamma is a digraph parameterizing L. There is a bijection between the set of elements in G of word length n and the set of directed paths in \Gamma of length n that start at the initial vertex. So to understand the distribution of \phi, we need to understand the behaviour of a typical long path in \Gamma.

Define a component of \Gamma to be a maximal subgraph with the property that there is a directed path (in the component) from any vertex to any other vertex. One can define a new digraph C(\Gamma) without loops, with one vertex for each component of \Gamma, in an obvious way. Each component C determines an adjacency matrix M_C, with ij-entry equal to 1 if there is a directed edge from vertex i to vertex j, and equal to 0 otherwise. A component C is big if the biggest real eigenvalue \lambda of M_C is at least as big as the biggest real eigenvalue of the matrices associated to every other component. A random long walk in \Gamma will spend most of its time entirely in big components, so these are the only components we need to consider to understand the statistical distribution of \phi.

A theorem of Coornaert implies that there are no big components of C(\Gamma) in series; i.e. there are no directed paths in C(\Gamma) from one big component to another (one also says that the big components do not communicate). This means that a typical long walk in \Gamma is entirely contained in a single big component, except for a (relatively short) path at the start and the end of the walk. So the distribution of \phi gets independent contributions, one from each big component.

The contribution from an individual big component is not hard to understand: the central limit theorem for stationary Markov chains says that for elements of G corresponding to paths that spend almost all their time in a given big component C there is a central limit theorem  n^{-1/2}(\overline{\phi}_n - nE_C) \to N(0,\sigma_C) where the mean E_C and standard deviation \sigma_C depend only on C. The problem is to show that the means and standard deviations associated to different big components are the same. Everything up to this point only depends on weak combability of \phi; to finish the proof one must use bicombability.

It is not hard to show that if \gamma is a typical infinite walk in a component C, then the subpaths of \gamma of length n are distributed like random walks of length n in C. What this means is that the mean and standard deviation E_C,\sigma_C associated to a big component C can be recovered from the distribution of \phi on a single infinite “typical” path in C. Such an infinite path corresponds to an infinite geodesic in G, converging to a definite point in the Gromov boundary \partial G. Another theorem of Coornaert (from the same paper) says that the action of G on its boundary \partial G is ergodic with respect to a certain natural measure called a Patterson-Sullivan measure (see Coornaert’s paper for details). This means that there are typical infinite geodesics \gamma,\gamma' associated to components C and C' for which some g \in G takes \gamma to a geodesic g\gamma ending at the same point in \partial G as \gamma'. Bicombability implies that the values of \phi on \gamma and g\gamma differ by a bounded amount. Moreover, since g\gamma and \gamma' are asymptotic to the same point at infinity, combability implies that the values of \phi on g\gamma and \gamma' also differ by a bounded amount. This is enough to deduce that E_C = E_{C'} and \sigma_C = \sigma_{C'}, and one obtains a (global) central limit theorem for \phi on G. qed.

This obviously raises several questions, some of which seem very hard, including:

Question 1: Let \phi be an arbitrary quasimorphism on a hyperbolic group G (even the case G is free is interesting). Does \phi satisfy a central limit theorem?

Question 2: Let \phi be an arbitrary quasimorphism on a hyperbolic group G. Does \phi satisfy a central limit theorem with respect to a random walk on G? (i.e. one considers the distribution of values of \phi not on the set of elements of G of word length n, but on the set of elements obtained by a random walk on G of length n, and lets n go to infinity)

All bicombable quasimorphisms satisfy an important property which is essential to our proof of the central limit theorem: they are local, which is to say, they are defined as a sum of local contributions. In the continuous world, they are the analogue of the so-called de Rham quasimorphisms on \pi_1(M) where M is a closed negatively curved Riemannian manifold; such quasimorphisms are defined by choosing a 1-form \alpha, and defining \phi_\alpha(g) to be equal to the integral \int_{\gamma_g} \alpha, where \gamma_g is the closed oriented based geodesic in M in the homotopy class of g. De Rham quasimorphisms, being local, also satisfy a central limit theorem.

This locality manifests itself in another way, in terms of defects. Let \phi be a quasimorphism on a hyperbolic group G. Recall that the defect D(\phi) is the supremum of |\phi(gh) - \phi(g) -\phi(h)| over all pairs of elements g,h \in G. A quasimorphism is further said to be homogeneous if \phi(g^n) = n\phi(g) for all integers n. If \phi is an arbitrary quasimorphism, one may homogenize it by taking a limit \psi(g) = \lim_{n \to \infty} \phi(g^n)/n; one says that \psi is the homogenization of \phi in this case. Homogenization typically does not preserve defects; however, there is an inequality D(\psi) \le 2D(\phi). If \phi is local, one expects this inequality to be an equality. For, in a hyperbolic group, the contribution to the defect of a local quasimorphism all arises from the interaction of the suffix of (a geodesic word representing the element) g with the prefix of h (with notation as above). When one homogenizes, one picks up another contribution to the defect from the interaction of the prefix of g with the suffix of h; since these two contributions are essentially independent, one expects that homogenizing a local quasimorphism should exactly double the defect. This is the case for bicombable and de Rham quasimorphisms, and can perhaps be used to define locality for a quasimorphism on an arbitrary group.

This discussion provokes the following key question:

Question 3: Let G be a group, and let \psi be a homogeneous quasimorphism. Is there a quasimorphism \phi with homogenization \psi, satisfying D(\psi) = 2D(\phi)?

Example: The answer to question 3 is “yes” if \psi is the rotation quasimorphism associated to an action of G on S^1 by orientation-preserving homeomorphisms (this is nontrivial; see Proposition 4.70 from my monograph).

Example: Let C be any homologically trivial group 1-boundary. Then there is some extremal homogeneous quasimorphism \psi for C (i.e. a quasimorphism achieving equality \text{scl}(C) = \psi(C)/2D(\psi) under generalized Bavard duality; see this post) for which there is \phi with homogenization \psi satisfying D(\psi) = 2D(\phi). Consequently, if every point in the boundary of the unit ball in the \text{scl} norm is contained in a unique supporting hyperplane, the answer to question 3 is “yes” for any quasimorphism on G.

Any quasimorphism on G can be pulled back to a quasimorphism on a free group, but this does not seem to make anything easier. In particular, question 3 is completely open (as far as I know) when G is a free group. An interesting test case might be the homogenization of an infinite sum of Brooks functions \sum_w h_w for some infinite non-nested family of words \lbrace w \rbrace.  

If the answer to this question is false, and one can find a homogeneous quasimorphism \psi which is not the homogenization of any “local” quasimorphism, then perhaps \psi does not satisfy a central limit theorem. One can try to approach this problem from the other direction:

Question 4: Given a function f defined on the ball of radius n in a free group F, one defines the defect D(f) in the usual way, restricted to pairs of elements g,h for which g,h,gh are all of length at most n. Under what conditions can f be extended to a function on the ball of radius n+1 without increasing the defect?

If one had a good procedure for building a quasimorphism “by hand” (so to speak), one could try to build a quasimorphism that failed to satisfy a central limit theorem, or perhaps find reasons why this was impossible.

Posted in Ergodic Theory | Tagged , , , , , , | 2 Comments

Quasimorphisms and laws

A basic reference for the background to this post is my monograph.

Let G be a group, and let [G,G] denote the commutator subgroup. Every element of [G,G] can be expressed as a product of commutators; the commutator length of an element g is the minimum number of commutators necessary, and is denoted \text{cl}(g). The stable commutator length is the growth rate of the commutator lengths of powers of an element; i.e. \text{scl}(g) = \lim_{n \to \infty} \text{cl}(g^n)/n. Recall that a group G is said to satisfy a law if there is a nontrivial word w in a free group F for which every homomorphism from F to G sends w to \text{id}.

The purpose of this post is to give a very short proof of the following proposition (modulo some background that I wanted to talk about anyway):

Proposition: Suppose G obeys a law. Then the stable commutator length vanishes identically on [G,G].

The proof depends on a duality between stable commutator length and a certain class of functions, called homogeneous quasimorphisms

Definition: A function \phi:G \to \mathbb{R} is a quasimorphism if there is some least number D(\phi)\ge 0 (called the defect) so that for any pair of elements g,h \in G there is an inequality |\phi(x) + \phi(y) - \phi(xy)| \le D(\phi). A quasimorphism is homogeneous if it satisfies \phi(g^n) = n\phi(g) for all integers n.

Note that a homogeneous quasimorphism with defect zero is a homomorphism (to \mathbb{R}). The defect satisfies the following formula:

Lemma: Let f be a homogeneous quasimorphism. Then D(\phi) = \sup_{g,h} \phi([g,h]).

A fundamental theorem, due to Bavard, is the following:

Theorem: (Bavard duality) There is an equality \text{scl}(g) = \sup_\phi \frac {\phi(g)} {2D(\phi)} where the supremum is taken over all homogeneous quasimorphisms with nonzero defect.

In particular, \text{scl} vanishes identically on [G,G] if and only if every homogeneous quasimorphism on G is a homomorphism.

One final ingredient is another geometric definition of \text{scl} in terms of Euler characteristic. Let X be a space with \pi_1(X) = G, and let \gamma:S^1 \to X be a free homotopy class representing a given conjugacy class g. If S is a compact, oriented surface without sphere or disk components, a map f:S \to X is admissible if the map on \partial S factors through \partial f:\partial S \to S^1 \to X, where the second map is \gamma. For an admissible map, define n(S) by the equality [\partial S] \to n(S) [S^1] in H_1(S^1;\mathbb{Z}) (i.e. n(S) is the degree with which \partial S wraps around \gamma). With this notation, one has the following:

Lemma: There is an equality \text{scl}(g) = \inf_S \frac {-\chi^-(S)} {2n(S)}.

Note: the function -\chi^- is the sum of -\chi over non-disk and non-sphere components of S. By hypothesis, there are none, so we could just write -\chi. However, it is worth writing -\chi^- and observing that for more general (orientable) surfaces, this function is equal to the function \rho defined in a previous post.

We now give the proof of the Proposition.

Proof. Suppose to the contrary that stable commutator length does not vanish on [G,G]. By Bavard duality, there is a homogeneous quasimorphism \phi with nonzero defect. Rescale \phi to have defect 1. Then for any \epsilon there are elements g,h with \phi([g,h]) \ge 1-\epsilon, and consequently \text{scl}([g,h]) \ge 1/2 - \epsilon/2 by Bavard duality. On the other hand, if X is a space with \pi_1(X)=G, and \gamma:S^1 \to X is a loop representing the conjugacy class of [g,h], there is a map f:S \to X from a once-punctured torus S to X whose boundary represents \gamma. The fundamental group of S is free on two generators x,y which map to the class of g,h respectively. If w is a word in x,y mapping to the identity in G, there is an essential loop \alpha in S that maps inessentially to X. There is a finite cover \widetilde{S} of S, of degree d depending on the word length of w, for which \alpha lifts to an embedded loop. This can be compressed to give a surface S' with -\chi^-(S') \le -\chi^-(\widetilde{S})-2. However, Euler characteristic is multiplicative under coverings, so -\chi^-(\widetilde{S}) = -\chi^-(S)\cdot d. On the other hand, n(S') = n(\widetilde{S})=d so \text{scl}([g,h]) \le 1/2 - 1/d. If G obeys a law, then d is fixed, but \epsilon can be made arbitrarily small. So G does not obey a law. qed.

Posted in Groups | Tagged , , , , | 3 Comments

Groups with free subgroups (part 2)

In a previous post, I discussed some methods for showing that a given group contains a (nonabelian) free subgroup. The methods were analytic and/or dynamical, and phrased in terms of the existence (or nonexistence) of certain functions on G or on spaces derived from G, or in terms of actions of G on certain spaces. Dually, one can try to find a free group in G by finding a homomorphism \rho: F \to G and looking for circumstances under which \rho is injective.

For concreteness, let G = \pi_1(X) for some (given) space X. If F is a free group, a representation \rho:F \to G up to conjugation determines a homotopy class of map f: S \to X where S is a K(F,1). The most natural K(F,1)‘s to consider are graphs and surfaces (with boundary). It is generally not easy to tell whether a map of a graph or a surface to a topological space is \pi_1-injective at the topological level, but might be easier if one can use some geometry.

Example: Let X be a complete Riemannian manifold with sectional curvature bounded above by some negative constant K < 0. Convexity of the distance function in a negatively curved space means that given any map of a graph f:\Gamma \to X one can flow f by the negative gradient of total length until it undergoes some topology change (e.g. some edge shrinks to zero length) or it (asymptotically) achieves a local minimum (the adjective “asymptotically” here just means that the flow takes infinite time to reach the minimum, because the size of the gradient is small when the map is almost minimum; there are no analytic difficulties to overcome when taking the limit). A typical topological change might be some loop shrinking to a point, thereby certifying that a free summand of \pi_1(\Gamma) mapped trivially to G and should have been discarded. Technically, one probably wants to choose \Gamma to be a trivalent graph, and when some interior edge collapses (so that four points come together) to let the 4-valent vertex resolve itself into a pair of 3-valent vertices in whichever of the three combinatorial possibilities is locally most efficient. The limiting graph, if nonempty, will be trivalent, with geodesic edges, and vertices at which the three edges are all (tangentially) coplanar and meet at angles of 2\pi/3. Such a graph can be certified as \pi_1-injective provided the edges are sufficiently long (depending on the curvature K). After rescaling the metric on X so that the supremum of the curvatures is -1, a trivalent geodesic graph with angles 2\pi/3 at the vertices and edges at least 2\tanh^{-1}(1/2) = 1.0986\cdots is \pi_1-injective. To see this, lift to maps between universal covers, i.e. consider an equivariant map from a tree \widetilde{\Gamma} to \widetilde{X}. Let \ell be an embedded arc in \widetilde{\Gamma}, and consider the image in \widetilde{X}. Using Toponogov’s theorem, one can compare with a piecewise isometric map from \ell to \mathbb{H}^n. The worst case is when all the edges are contained in a single \mathbb{H}^2, and all corners “bend” the same way. Providing the image does not bend as much as a horocircle, the endpoints of the image of \ell stay far away in \mathbb{H}^2. An infinite sided convex polygon in \mathbb{H}^2 with all edges of length 2\tanh^{-1}(1/2) and all angles 2\pi/3 osculates a horocycle, so we are done.

Remark: The fundamental group of a negatively curved manifold is word-hyperbolic, and therefore contains many nonabelian free groups, which may be certified by pingpong applied to the action of the group on its Gromov boundary. The point of the previous example is therefore to certify that a certain subgroup is free in terms of local geometric data, rather than global dynamical data (so to speak). Incidentally, I would not swear to the correctness of the constants above.

Example: A given free group is the fundamental group of a surface with boundary in many different ways (this difference is one of the reasons that a group like \text{Out}(F_n) is so much more complicated than the mapping class group of a surface). Pick a realization F = \pi_1(S). Then a homomorphism \rho:F \to G up to conjugacy determines a homotopy class of map from S to X as above. If X is negatively curved as before, each boundary loop is homotopic to a unique geodesic, and we may try to find a “good” map f:S \to X with boundary on these geodesics. There are many possible classes of good maps to consider:

  1. Fix a conformal structure on S and pick a harmonic map in the homotopy class of f. Such a map exists since the target is nonpositively curved, by the famous theorem of Eells-Sampson. The image is real analytic if X is, and is at least as negatively curved as the target, and therefore there is an a priori upper bound on the intrinsic curvature of the image; if the supremum of the curvature on X is normalized to be -1, then the image surface is \text{CAT}(-1), which just means that pointwise it is at least as negatively curved as hyperbolic space. By Gauss-Bonnet, one obtains an a priori bound on the area of the image of S in terms of the Euler characteristic (which just depends on the rank of F). On the other hand, this map depends on a choice of marked conformal structure on S, and the space of such structures is noncompact.
  2. Vary over all conformal structures on S and choose a harmonic map of least energy (if one exists) or find a sequence of maps that undergo a “neck pinch” as a sequence of conformal structures on S degenerates. Such a neck pinch exhibits a simple curve in S that is essential in S but whose image is inessential in X; such a curve can be compressed, and the topology of S simplified. Since each compression increases \chi, after finitely many steps the process terminates, and one obtains the desired map. This is Schoen-Yau‘s method to construct a stable minimal surface representative of S. When the target is 3-dimensional, the surface may be assumed to be unbranched, by a trick due to Osserman. 
  3. Following Thurston, pick an ideal triangulation of S (i.e. a geodesic lamination of S whose complementary regions are all ideal triangles); since S has boundary, we may choose such a lamination by first picking a triangulation (in the ordinary sense) with all vertices on \partial S and then “spinning” the vertices to infinity. Unless \rho factors through a cyclic group, there is some choice of lamination so that the image of f can be straightened along the lamination, and then the image spanned with CAT(-1) ideal triangles to produce a pleated surface in X representing f (note: if X has constant negative curvature, these ideal triangles can be taken to be totally geodesic). The space of pleated surfaces in fixed (closed) X of given genus is compact, so this is a reasonable class of maps to work with.
  4. If G is merely a hyperbolic group, one can still construct pleated surfaces, not quite in X, but equivariantly in Mineyev’s flow space associated to \widetilde{X}. Here we are not really thinking of the triangles themselves, but the geodesic laminations they bound (which carry the same information). 
  5. If X is complete and 3-dimensional but noncompact, the space of pleated surfaces of given genus is generally not compact, and it is not always easy to find a pleated surface where you want it. This can sometimes be remedied by shrinkwrapping; one looks for a minimal/pleated/harmonic surface subject to the constraint that it cannot pass through some prescribed set of geodesics in X (which act as “barriers” or “obstacles”, and force the resulting surface to end up roughly where one wants it to).

Anyway, one way or another, one can usually find a map of a surface, or a space of maps of surfaces, representing a given homomorphism, with some kind of a priori control of the geometry. Usually, this control is not enough to certify that a given map is \pi_1-injective, but sometimes it might be. For instance, a totally geodesic (immersed) surface in a complete manifold of constant negative curvature is always \pi_1-injective, and any surface whose extrinsic curvature is small enough will also be \pi_1-injective.

Geometric methods to certify injectivity of free or surface groups are very useful and flexible, as far as they go. Unfortunately, I know of very few topological methods to certify injectivity. By far the most important exception is the following:

Example: In 3-dimensions, one should look for properly embedded surfaces. If M is a 3-manifold (possibly with boundary), and S is a two-sided properly embedded surface, the famous Dehn’s Lemma (proved by Papakyriakopoulos) implies that either S is \pi_1-injective, or there is an embedded essential loop in S that bounds an embedded disk in M on one side of S. Such a loop may be compressed (i.e. S may be cut open along the loop, and two copies of the compressing disk sewn in) preserving the property of embeddedness, but increasing \chi. After finitely many steps, either S compresses away entirely, or one obtains a \pi_1-injective surface. One way to ensure that S does not compress away entirely is to start with a surface that is essential in (relative) homology; another way is to look for a surface dual to an action (of \pi_1(M)) on a tree. In the latter case, one can often construct quite different free subgroups in \pi_1(M) by pingpong on the ends of the tree. Note by the way that this method produces closed surface subgroups as well as free subgroups. Note too that two-sidedness is essential to apply Dehn’s Lemma.

Remark: Modern 3-manifold topologists are sometimes unreasonably indifferent to the power of Dehn’s Lemma (probably because this tool has been incorporated so fully into their subconscious?); it is worth reading Ralph Fox’s review of Papakyriakopoulos’s paper (linked above). Of this paper, he writes:

. . . it has already led to renewed attack on the problem of classifying the 3-dimensional manifolds; significant results have been and are being obtained. A complete solution has suddenly become a definite possibility. 

Remember this was written more than 50 years ago — before the geometrization conjecture, before the JSJ decomposition, before the Scott core theorem, before Haken manifolds. The only reasonable reaction to this is: !!!

Example: The construction of injective surfaces by Dehn’s Lemma may be abstracted in the following way. Given a target space X, and a class of maps \mathcal{F} of surfaces into X (in some category; e.g. homotopy classes of maps, pleated surfaces, \text{CAT}(-1) surfaces, etc.) suppose one can find a complexity c:\mathcal{F} \to \mathcal{O} with values in some ordered set, such that if f \in \mathcal{F} is not injective, one can find f' \in \mathcal{F} of smaller complexity. Then if \mathcal{O} is well-ordered, an injective surface may be found. If \mathcal{O} is not well-ordered, one may ask at least that c is upper semi-continuous on \mathcal{F}, and hope to extend it upper semi-continuously to some suitable compactification of \mathcal{F}. Even if \mathcal{O} is not well-ordered, one can at least certify that a map is injective, by showing that it minimizes c. Here are some potential examples (none of them entirely satisfactory).

  1. Given a (homologically trivial) homotopy class of loop \gamma in X, one can look at all maps of orientable surfaces S to X with boundary factoring through \gamma. For such a surface, let n(S) denote the degree with which the (possibly multiple) boundary (components) of S wrap homologically around \gamma, and let -\chi^-(S) denote the sum of Euler characteristics of non-disk and non-sphere components of S. For each surface S, one considers the quantity -\chi^-(S)/2n(S) (the factor of 2 can be ignored if desired). The important feature of this quantity is that it does not change if S is replaced by a finite cover. If \pi_1(S) is not injective, let \alpha be an essential loop on S whose image in X is inessential. Peter Scott showed that any essential loop on a surface lifts to an embedded loop in some finite cover. Hence, after passing to such a cover, \alpha may be compressed, and the resulting surface S' satisfies -\chi^-(S')/2n(S') < -\chi^-(S)/2n(S). In other words, a global minimizer of this quantity is injective. Such a surface is called extremal. The problem is that extremal surfaces do not always exist; but this construction motivates one to look for them. 
  2. Given a \text{CAT}(-1) surface S with geodesic boundary in X, one can retract S to a geodesic spine, and encode the surface by the resulting fatgraph, with edges labelled by homotopy classes in X. Since Euler characteristic is local, one does not really care precisely how the pieces of the fatgraph are assembled, but only how many pieces of what kinds are needed for a given boundary. So if only finitely many such pieces appear in some infinite family of surfaces, one can in fact construct an extremal surface as above, which is necessarily injective (more technically, one reduces the computation of Euler characteristic to a linear programming problem, finds a rational extremal solution (which corresponds to a weighted sum of pieces of fatgraph), and glues together the pieces to construct the extremal surface; one situation in which this scheme can be made to work is explained in this paper of mine). Edges can be subdivided into a finite number of possibilities, so one just needs to ensure finiteness of the number of vertex types. One condition that ensures finiteness of vertex types is the existence of a uniform constant C>0 so that for each surface S in the given family, and for each point p \in S, there is an estimate \text{dist}(p,\partial S) \le C. If this condition is violated, one finds pairs p_i,S_i which converge in the geometric topology to a point in a complete (i.e. without boundary, but probably noncompact) surface.
  3. Given S \to X, either compress an embedded essential loop, or realize S by a least area surface. If S is not injective, pass to a cover, compress a loop, and realize the result by a least area surface. Repeat this process. One obtains in this way a sequence of least area surfaces in X (typically of bigger and bigger genus) and there is no reason to expect the process to terminate. If X is a 3-manifold, the curvature of a least area surface admits two-sided curvature bounds away from the boundary, by a theorem of Schoen (near the boundary, the negative curvature might blow up, but only in controlled ways — e.g. after rescaling about a sequence of points with the most negative curvature, one may obtain in the limit a helicoid). Away from the boundary, the family of surfaces one obtains vary precompactly in the C^\infty topology, and one may obtain a complete locally least area lamination \Lambda in the limit. If \pi_1(\Lambda) is not injective, one can continue to pass to covers (applying a version of Scott’s theorem for infinite surfaces) and compress, and by transfinite induction, eventually arrive at a locally least area lamination with injective \pi_1. Of course, such a limit might well be a lamination by planes. However, the lamination one obtains is not completely arbitrary: since it is a limit of limits of . . . compact surfaces, one can choose a limit that admits a nontrivial invariant transverse measure (one must be careful here, since the lamination will typically have boundary). Or, as in bullet 2. above, one may insist that this limit lamination is complete (i.e. without boundary). 

It is more tricky to find a limit lamination as in 3. without boundary and admitting an invariant transverse measure; in any case, this motivates the following:  

Question: Is there a closed hyperbolic 3-manifold M which admits a locally least area transversely measured complete immersed lamination \Lambda, all of whose leaves are disks? (note that the answer is negative if one asks for the lamination to be embedded (there are several easy proofs of this fact)).

Secretly, the function that assigns \inf_S -\chi^-(S)/2n(S) to a homologically trivial loop \gamma is the stable commutator length of the conjugacy class in \pi_1(X) represented by \gamma. Extremal surfaces can sometimes be certified by constructing certain functions on \pi_1(X) called homogeneous quasimorphisms, but a discussion of such functions will have to wait for another post.

Posted in Groups | Tagged , , , , , , , , | Leave a comment

The (strengthened) Hanna Neumann Conjecture

A few days ago, Joel Friedman posted a paper on the arXiv purporting to give a proof of the (strengthened) Hanna Neumann conjecture, a well-known problem in geometric group theory.

Simply stated, the problem is as follows.

Conjecture (Hanna Neumann): Let F be a free group, and let G and H be finitely generated subgroups. For a subgroup E of F, let \rho(E) = \max(\text{rank}(E)-1,0). Then there is an inequality \rho(G \cap H) \le \rho(G)\rho(H).

This conjecture was further strengthened by Walter Neumann (her son):

Conjecture (strengthened Hanna Neumann): With notation above, there is an inequality \sum_x \rho(G \cap xHx^{-1}) \le \rho(G)\rho(H) where the sum is taken over x \in H\backslash F / G, i.e. the double coset representatives.

Notice by the way that since any free group embeds into F_2, the free group of rank 2, one can assume that F has rank 2 above. This fact is implicit in the discussion below.

Friedman’s paper seems to be very carefully written, and contains some new ideas (which I do not yet really understand), namely an approach using sheaf theory. But in this post I want to restrict myself to some simple (and probably well-known) geometric observations.

The first step is to reduce the problem to a completely graph-theoretic one, following Stallings; in fact, Benson Farb tells me that he thinks this reduction was known to Stallings, or at least to Dicks/Formanek (and in any case is very close to some ideas Stallings and Gersten introduced to study the problem; more on that in a later post). Friedman makes the following definition:

Definition: Let \mathcal{G} be a finite group and g_1,g_2 \in \mathcal{G} be two elements (that do not necessarily generate \mathcal{G}). The directed Cayley graph C is the graph with vertex set \mathcal{G} and with a directed edge from v to vg_i labeled i for each v \in \mathcal{G} and i=1,2.

In other words, C is a graph whose edges are oriented and labeled with either 1 or 2 in such a way that each vertex has at most one outgoing and one incoming edge with each label, and such that there is a transitive (on the vertices) free action of a group \mathcal{G} on C. (Note: for some reason, Friedman wants his group to act on the right, and therefore has directed edges from v to g_iv, but this is just a matter of convention).

For any finite graph K, not necessarily connected, let \rho(K) = \sum_j \max(0,-\chi(K_j)); i.e. \rho(K) = \sum_j \rho(\pi_1(K_j)) where the sum is taken over the connected components K_j of K. Friedman shows (but this reduction is well-known) that the SHNC is equivalent to the following graph-theoretic inequality:

Theorem: The SHNC is equivalent to the following statement. For any graph C as above, and any two subgraphs K,K' we have \sum_{g \in \mathcal{G}} \rho(K \cap gK') \le \rho(K)\rho(K').

The purpose of this blog entry is to show that there is a very simple proof of this inequality when \rho is replaced with -\chi. This is not such a strange thing to do, since \rho and -\chi are equal for graphs without acyclic components (i.e. without components that are trees), and for “random” graphs K,K' one does not expect the difference between \rho and -\chi to be very big. The argument proceeds as follows. Suppose K has v vertices and e_1,e_2 edges of kind 1,2 respectively, and define v',e_1',e_2' similarly for K'. Then

  • (-\chi(K))(-\chi(K')) = (v-e_1-e_2)(v'-e_1'-e_2')

On the other hand, since Euler characteristic is local, we just need to count how many vertices and edges of each kind turn up in each K \cap gK'. But this is easy: every vertex of K is equal to exactly one translate of every vertex of K', and similarly for edges of each kind. Hence

  • \sum_g -\chi(K \cap gK') = e_1e_1' + e_2e_2' - vv'

So the inequality one wants to show is e_1e_1' + e_2e_2' - vv' \le (v-e_1-e_2)(v'-e_1'-e_2') which simplifies to

  • v(e_1' + e_2') + v'(e_1 + e_2) \le 2vv' + e_1e_2' + e_2 e_1'

On the other hand, each graph K,K' has at most two edges at any vertex with either label, and therefore we have inequalities 0 \le e_1,e_2 \le v, 0 \le e_1',e_2' \le v'. Subject to these constraints, the inequality above is straightforward to prove. To see this, first fix some non-negative values of v,v' and let X be the four-dimensional cube of possible values of e_1,e_2,e_1',e_2'. Since both sides of the inequality are linear as a function of each e_i or e_i', if the inequality is violated at any point in X one may draw a straight line in X corresponding to varying one of the co-ordinates (e.g. e_1) while keeping the others fixed, and deduce that the inequality must be violated on one of the faces of X. Inductively, if the inequality is violated at all, it is violated at a vertex of X, which may be ruled out by inspection; qed.

This argument shows that the whole game is to understand the acyclic components of K \cap gK'; i.e. those which are topologically trees, and therefore contribute 0 to \rho, but -1 to -\chi.

Incidentally, for all I know, this simple argument is explicitly contained in either Stallings’ or Gersten’s paper (it is surely not original in any case). If a reader can verify this, please let me know!

Update: Walter Neumann informs me that this observation (that the inequality is true with -\chi in place of \rho) is in his paper in which he introduces the SHNC! He further shows in that paper that for “most” G, the SHNC is true for all H.

Update (6/29): Warren Dicks informs me that he was not aware of the reduction of SHNC to the graph-theoretic formulation described above. Friedman’s webpage acknowledges the existence of an error in the paper, and says that he is working to correct it. One problem that I know of (discovered mostly by my student Steven Frankel) concerns the commutativity of the diagram on page 10.

Update (10/22): It has been a few months since I last edited this page, and Joel Friedman has not updated either the arXiv paper, or the statement on his webpage that he is “trying to fix the error”. Since wikipedia mentions Friedman’s announcement, I thought it would be worth going on record at this point to say that Friedman’s arXiv paper (version 1 — the only version at the point I write this) is definitely in error, and that I believe the error is fundamental, and cannot be repaired (this is not to say that the paper does not contain some things of interest (it does), or that Friedman does not acknowledge the error (he does), just that it is worth clearing up any possible ambiguity about the situation for readers who are wondering about the status of the SHNC). The problem is the “not entirely standard” (quote from Friedman’s paper) diagrams, like the one on page 10. In particular, the claimed proof of Theorem 5.6, that the projections constructed in Lemma 5.5 (by a very general dimension counting argument) fit into a diagram with the desired properties is false. Any construction of projections satisfying the desired properties must be quite special. Nevertheless, one can certainly still define Friedman’s sheaf \mathcal{K}, and ask whether it has \rho(\mathcal{K})=0 (in Friedman’s sense); this would, as far as I can tell, prove SHNC; however, I do not know of any reason why it should hold (or whether there are any counterexamples, which might exist even if SHNC is true).

Posted in Commentary, Groups | Tagged , , | 1 Comment

Groups with free subgroups

More ambitious than simply showing that a group is infinite is to show that it contains an infinite subgroup of a certain kind. One of the most important kinds of subgroup to study are free groups. Hence, one is interested in the question:

Question: When does a group contain a (nonabelian) free subgroup?

Again, one can (and does) ask this question both about a specific group, and about certain classes of groups, or for a typical (in some sense) group from some given family.

Example: If \mathcal{P} is a property of groups that is inherited by subgroups, then if no free group satisfies \mathcal{P}, no group that satisfies \mathcal{P} can contain a free subgroup. An important property of this kind is amenability. A (discrete) group G is amenable if it admits an invariant mean; that is, if there is a linear map m: L^\infty(G) \to \mathbb{R} (i.e. a way to define the average of a bounded function over G) satisfying three basic properties:

  1. m(f) \ge 0 if f\ge 0 (i.e. the average of a non-negative function is non-negative)
  2. m(\chi_G)=1 where \chi_G is the constant function taking the value 1 everywhere on G (i.e. the average of the constant function 1 is normalized to be 1)
  3. m(g\cdot f) = m(f) for every {}g \in G and f \in L^\infty(G), where (g\cdot f)(x) = f(g^{-1}x) (i.e. the mean is invariant under the obvious action of G on L^\infty(G))

If H is a subgroup of G, there are (many) H-invariant homomorphisms j: L^\infty(H) \to L^\infty(G) taking non-negative functions to non-negative functions, and \chi_H to \chi_G; for example, the (left) action of H on G breaks up into a collection of copies of H acting on itself, right-multiplied by a collection of right coset representatives. After choosing such a choice of representatives \lbrace g_\alpha \rbrace, one for each coset Hg_\alpha, we can define j(f)(hg_\alpha) = f(h). Composing with m shows that every subgroup of an amenable group is amenable (this is harder to see in the “geometric” definition of amenable groups in terms of Folner sets). On the other hand, as is well-known, a nonabelian free group is not amenable. Hence, amenable groups do not contain nonabelian free subgroups.

The usual way to see that a nonabelian free group is not amenable is to observe that it contains enough disjoint “copies” of big subsets. For concreteness, let F denote the free group on two generators a,b, and write their inverses as A,B. Let W_a, W_A denote the set of reduced words that start with either a or A, and let \chi_a,\chi_A denote the indicator functions of W_a,W_A respectively. We suppose that F is amenable, and derive a contradiction. Note that F = W_a \cup aW_A, so m(\chi_a) + m(\chi_A) \ge 1. Let V denote the set of reduced words that start with one of the strings a,A,ba,bA, and let \chi_V denote the indicator function of V. Notice that V is made of two disjoint copies of each of W_a,W_A. So on the one hand, m(\chi_V) \le m(\chi_F) = 1, but on the other hand, m(\chi_V) = 2 (m(\chi_a)+m(\chi_A)) \ge 2.

Conversely, the usual way to show that a group G is amenable is to use the Folner condition. Suppose that G is finitely generated by some subset S, and let C denote the Cayley graph of G (so that C is a homogeneous locally finite graph). Suppose one can find finite subsets U_i of vertices so that |\partial U_i|/|U_i| \to 0 (here |U_i| means the number of vertices in U_i, and  |\partial U_i| means the number of vertices in U_i that share an edge with C - U_i). Since the “boundary” of U_i is small compared to U_i, averaging a bounded function over U_i is an “almost invariant” mean; a weak limit (in the dual space to L^\infty(G)) is an invariant mean. Examples of amenable groups include

  1. Finite groups
  2. Abelian groups
  3. Unions and extensions of amenable groups
  4. Groups of subexponential growth

and many others. For instance, virtually solvable groups (i.e. groups containing a solvable subgroup with finite index) are amenable.

Example: No amenable group can contain a nonabelian free subgroup. The von Neumann conjecture asked whether the converse was true. This conjecture was disproved by Olshanskii. Subsequently, Adyan showed that the infinite free Burnside groups are not amenable. These are groups B(m,n) with m\ge 2 generators, and subject only to the relations that the nth power of every element is trivial. When n is odd and at least 665, these groups are infinite and nonamenable. Since they are torsion groups, they do not even contain a copy of \mathbb{Z}, let alone a nonabelian free group!

Example: The Burnside groups are examples of groups that obey a law; i.e. there is a word w(x_1,x_2,\cdots,x_n) in finitely many free variables, such that w(g_1,g_2,\cdots,g_n)=\text{id} for every choice of g_1,\cdots,g_n \in G. For example, an abelian group satisfies the law x_1x_2x_1^{-1}x_2^{-1}=\text{id}. Evidently, a group that obeys a law does not contain a nonabelian free subgroup. However, there are examples of groups which do not obey a law, but which also do not contain any nonabelian free subgroup. An example is the classical Thompson’s group F, which is the group of orientation-preserving piecewise-linear homeomorphisms of [0,1] with finitely many breakpoints at dyadic rationals (i.e. points of the form p/2^q for integers p,q) and with slopes integral powers of 2. To see that this group does not obey a law, one can show (quite easily) that in fact F is dense (in the C^0 topology) in the group \text{Homeo}^+([0,1]) of all orientation-preserving homeomorphisms of the interval. This latter group contains nonabelian free groups; by approximating the generators of such a group arbitrarily closely, one obtains pairs of elements in F that do not satisfy any identity of length shorter than any given constant. On the other hand, a famous theorem of Brin-Squier says that F does not contain any nonabelian free subgroup. In fact, the entire group \text{PL}^+([0,1]) does not contain any nonabelian free subgroup. A short proof of this fact can be found in my paper as a corollary of the fact that every subgroup G of \text{PL}^+([0,1]) has vanishing stable commutator length; since stable commutator length is nonvanishing in nonabelian free groups, this shows that there are no such subgroups of \text{PL}^+([0,1]). (Incidentally, and complementarily, there is a very short proof that stable commutator length vanishes on any group that obeys a law; we will give this proof in a subsequent post).

Example: If G surjects onto H, and H contains a free subgroup F, then there is a section from F to G (by freeness), and therefore G contains a free subgroup.

Example: The most useful way to show that G contains a nonabelian free subgroup is to find a suitable action of G on some space X. The following is known as Klein’s ping-pong lemma. Suppose one can find disjoint subsets U^\pm and V^\pm of X, and elements g,h \in G so that g(U^+ \cup V^\pm) \subset U^+g^{-1}(U^- \cup V^\pm) \subset U^-, and similarly interchanging the roles of U^\pm, V^\pm and g,h. If w is a reduced word in g^{\pm 1},h^{\pm 1}, one can follow the trajectory of a point under the orbit of subwords of w to verify that w is nontrivial. The most common way to apply this in practice is when g,h act on X with source-sink dynamics; i.e. the element g has two fixed points u^\pm so that every other point converges to u^+ under positive powers of g, and to u^- under negative powers of g. Similarly, h has two fixed points v^\pm with similar dynamics. If the points u^\pm,v^\pm are disjoint, and X is compact, one can take any small open neighborhoods U^\pm,V^\pm of u^\pm,v^\pm, and then sufficiently large powers of g and h will satisfy the hypotheses of ping-pong.

Example: Every hyperbolic group G acts on its Gromov boundary \partial_\infty G. This boundary is the set of equivalence classes of quasigeodesic rays in (the Cayley graph of) G, where two rays are equivalent if they are a finite Hausdorff distance apart. Non-torsion elements act on the boundary with source-sink dynamics. Consequently, every pair of non-torsion elements in a hyperbolic group either generate a virtually cyclic group, or have powers that generate a nonabelian free group.

It is striking to see how easy it is to construct nonabelian free subgroups of a hyperbolic group, and how difficult to construct closed surface subgroups. We will return to the example of hyperbolic groups in a future post.

Example: The Tits alternative says that any linear group G (i.e. any subgroup of \text{GL}(n,\mathbb{R}) for some n) either contains a nonabelian free subgroup, or is virtually solvable (and therefore amenable). This can be derived from ping-pong, where G is made to act on certain spaces derived from the linear action (e.g. locally symmetric spaces compactified in certain ways, and buildings associated to discrete valuations on the ring of entries of matrix elements of G). 

Example: There is a Tits alternative for subgroups of other kinds of groups, for example mapping class groups, as shown by Ivanov and McCarthy. The mapping class group (of a surface) acts on the Thurston boundary of Teichmuller space. Every subgroup of the mapping class group either contains a nonabelian free subgroup, or is virtually abelian. Roughly speaking, either elements move points in the boundary with enough dynamics to be able to do ping-pong, or else the action is “localized” in a train-track chart, and one obtains a linear representation of the group (enough to apply the ordinary Tits alternative). Virtually solvable subgroups of mapping class groups are virtually abelian.

Example: A similar Tits alternative holds for \text{Out}(F_n). This was shown by Bestvina-Feighn-Handel in these three papers (the third paper shows that solvable subgroups are virtually abelian, thus emphasizing the parallels with mapping class groups).

Example: If G is a finitely generated group of homeomorphisms of S^1, then there is a kind of Tits alternative, first proposed by Ghys, and proved by Margulis: either G preserves a probability measure on S^1 (which might be singular), or it contains a nonabelian free subgroup. To see this, first note that either G has a finite orbit (which supports an invariant probability measure) or the action is semi-conjugate to a minimal action (one with all orbits dense). In the second case, the proof depends on understanding the centralizer of the group action: either the centralizer is infinite, in which case the group is conjugate to a group of rotations, or it is finite cyclic, and one obtains an action of G on a “smaller” circle, by quotienting out by the centralizer. So one may assume the action is minimal with trivial centralizer. In this case, one shows that the action has the property that for any nonempty intervals I,J in S^1, there is some {}g \in G with g(I) \subset J; i.e. any interval may be put inside any other interval by some element of the group. For such an action, it is very easy to do ping-pong. Incidentally, a minor variation on this result, and with essentially this argument, was established by Thurston in the context of uniform foliations of 3-manifolds before Ghys proposed his question.

Example: If \rho_t is an (algebraic) family of representations of a (countable) free group F into an algebraic group, then either some element g \in F is in the kernel of every \rho_t, or the set of faithful representations is “generic”, i.e. the intersection of countably many open dense sets. This is because the set of representations for which a given element is in the kernel is Zariski closed, and therefore its complement is open and either empty or dense (one must add suitable hypotheses or conditions to the above to make it rigorous).

Posted in Groups | Tagged , , , , , , , | 3 Comments

Infinite groups

Before looking for surface subgroups, it is worth thinking about how to find (or rule out the existence of) simpler classes of subgroups. This is a very general question, and I do not intend to give a complete survey; however, it is instructive to build up to the question of surface subgroups incrementally and to catalog some of the interesting examples and counterexamples along the way.

Question: When is a group infinite?

Already this question is more than hard enough. But first we must examine some unstated assumptions behind the question. We have some group G in mind, and want to know whether it is infinite or not. But in what sense do we “have” the group G? There are several things we might mean by this, including:

  1. An explicit group G given by generators and relations; i.e. G = \langle S \; | \; R \rangle.
  2. A group G given together with an action on a set \Sigma.
  3. A group G not uniquely defined, but described implicitly in terms of its properties (e.g. G is amenable, or left-orderable, or has property (T), or is linear, or is residually p, or is a 3-manifold group, or is finitely presented, or satisfies a law, etc.).

In general, it is hard to learn much about a group from a presentation. However, sometimes one can have some success:

Example: If G is given by a finite presentation G = \langle S \; | \; R \rangle, the deficiency of the presentation is the difference between the number of generators and the number of relators; i.e. |S| - |R|. The deficiency of G is the maximum of the deficiency of all finite presentations. In practice, it is very difficult to determine the deficiency of a group, but trivial to determine the deficiency of a given presentation. The rank of the abelianization of G (i.e. the rank of H_1(G;\mathbb{Z})) is at least as big as the deficiency; hence if the deficiency is positive, G is infinite, and in fact contains a copy of \mathbb{Z}.

Example: Daniel Allcock observed that one can do better when some of the relators R as above are proper powers. Geometrically, a relator of order p counts as only “1/p of a relator” for the purposes of computing the rank of H_1. Explicitly, Allcock shows that if G is a group with a presentation of the form G = \langle a_1, \cdots, a_n \; | \; w_1^{r_1} = \cdots = w_m^{r_m} = 1 \rangle then if H is a normal subgroup of G of index N < \infty and for each index j, one has w_j^k \notin H for 1 \le k \le r_j-1 then the rank of the abelianization of H is at least 1+ N(n-1 - \sum_i \frac {1} {r_i}). If this rank is positive, then H is infinite, and therefore so is G.

Example: A much more subtle example is the famous Golod-Shafarevich inequality. Let G be a finite p-group (i.e. a group in which every element is torsion, with order a power of p). Let n(G) be the minimum number of generators of G, and r(G) the number of relations between these generators in the corresponding free pro-p-group (if R(G) denotes the minimum number of relations defining G as a discrete group then R(G) \ge r(G)). The G-S inequality is the inequality r(G) > n(G)^2/4. In particular, if G is a nontrivial pro-p-group for which n(G)^2/4 \ge r(G) (or n(G)^2/4 \ge R(G) which implies it) then G is infinite. This inequality enabled Golod to give a negative answer to the generalized Burnside’s problem, by showing that for each prime p there is an infinite group G generated by three elements, in which every element has order a power of p.

Example: Marc Lackenby has made very nice use of the Golod-Shafarevich inequality in his work on Kleinian groups with finite non-cyclic subgroups. A Kleinian group is a finitely generated discrete subgroup of the group of isometries of hyperbolic 3-space; such a group is the fundamental group of a hyperbolic 3-orbifold. Marc shows that if a Kleinian group G contains a finite non-cyclic subgroup, then G is finite, or virtually free, or contains a closed surface subgroup. The argument is very interesting and delicate, and I hope to return to it in a later post. But for the moment I just want to remark that the form of the G-S inequality Marc uses is as follows. Let G be a group with a finite presentation G = \langle S \; | \; R \rangle. Let d_p denote the dimension of H_1(G;\mathbb{F}_p) where p is a prime. If d_p^2/4 > d_p - |S| + |R| then G is infinite.

Example: Another way to show a group G is infinite is if the relators are very long. This is the method of small cancellation theory, and can be implemented in many different ways. From the modern perspective, a group presentation satisfies a small cancellation condition if one can build a 2-complex from the presentation which is manifestly non-positively curved in some explicit sense. For example, if G = \langle S \; | \; R \rangle is a symmetrized presentation (i.e. one in which elements of R are cyclically reduced, and R is closed under taking cyclic permutations and inverses), a piece is a word b in the generators if there are distinct relations ba_1, ba_2 in R. If no relation is a product of fewer than p pieces, one says that G satisfies the small cancellation condition C(p). So, for example, if G is C(6), one can build a 2-complex presenting G built from polygons, each of which has at least 6 sides, and is non-positively curved (and therefore G is infinite). 

Example: Instead of showing that a particular group is infinite, one can show that certain groups whose presentations are obtained by a statistical process, are infinite with overwhelming probability. Yann Ollivier wrote an introduction to Gromov’s theory of Random Groups, in which it is made precise what one means by a “random group”, and many important properties of such groups are delineated. There is a parameter in the theory which governs the density of relations added to a generating set to determine the random group. The most striking aspect of the theory (in my opinion) is the existence of a phase transition. Gromov showed that if G is a random group at density d then if d < 1/2, with overwhelming probability, G is infinite, hyperbolic, torsion-free and of geometric dimension 2 (i.e. it is not free, but admits a 2-dimensional K(G,1)). However, if d \ge 1/2, with overwhelming probability, G is either trivial or \mathbb{Z}/2\mathbb{Z}.

Example: A group G which admits a finite dimensional K(G,1) is torsion-free, and therefore either trivial or infinite. This follows from the fact that the K(\mathbb{Z}/p\mathbb{Z},1)‘s are the infinite dimensional Lens spaces, which have nontrivial homology in infinitely many dimensions, together with elementary covering space theory. This example begs the question: how do you tell if a group has a finite dimensional K(G,1)? Well, one way is to exhibit a free, properly discontinuous action of G on a finite dimensional contractible space; of course, given such an action, it is probably easier to directly find elements in G of infinite order. 

Example: A function f:G \to \mathbb{R}^+ is said to be a length function if it satisfies f(\text{id})=0, if it is symmetric (i.e. f(g) = f(g^{-1}) for all g) and if it is subadditive: f(gh) \le f(g) + f(h). A group G is said to be strongly bounded if every length function on G is bounded. The strongly bounded property was introduced by George Bergman in this paper. A countable group is strongly bounded if and only if it is finite (the fact that finite groups are strongly bounded is obvious). Moreover, a group which admits an unbounded length function is evidently infinite. However, it turns out that there are many interesting uncountable but strongly bounded groups! Bergman showed that the group of permutations of any set is strongly bounded. Yves de Cornulier, in an appendix to a paper of mine with Mike Freedman, showed that the same is true for Homeo(S^n), the group of homeomorphisms of an n-sphere.

Example: One of the most spectacular proofs of the finiteness of a (certain class of) group(s) is Margulis’ proof of the normal subgroup theorem, which says that if G is a lattice in a higher rank Lie group, then every normal subgroup H in G is either finite, or of finite index. The proof has three steps: first, one shows that if H is infinite, then G/H is amenable. Second, since G has property (T), the same is true for G/H. Third, an amenable group with property (T) is finite. The second and third steps are not very complicated: a group has property (T) if the trivial representation is isolated in the space of all irreducible unitary representations, in a certain topology. A quotient of a group by a closed normal subgroup certainly has no more unitary representations than the original group itself, so the second step is not hard to show. An amenable group G/H has almost invariant vectors in L^2(G/H); since it has property (T), it has an invariant vector in L^2(G/H); but this implies that G/H is finite. So the hard part is to show that G/H is amenable. This is done using what is now known as boundary theory, and is described in Chapter VI of Margulis’ book.

I would be curious to hear other people’s favorite tricks/techniques to show that a group is or is not infinite.

Posted in Groups | Tagged , , , , | Leave a comment