Schwarz-Christoffel transformations, Schwarzian derivatives, and Schwarz’s minimal surface

Hermann Amandus Schwarz (1843-1921) was a student of Kummer and Weierstrass, and made many significant contributions to geometry, especially to the fields of minimal surfaces and complex analysis. His mathematical creations are both highly abstract and flexible, and at the same time intimately tied to explicit and practical calculation.

I learned about Schwarz-Christoffel transformations, Schwarzian derivatives, and Schwarz’s minimal surface as three quite separate mathematical objects, and I was very surprised to discover firstly that they had all been discovered by the same person, and secondly that they form parts of a consistent mathematical narrative, which I will try to explain in this post to the best of my ability. There is an instructive lesson in this example (for me), that we tend to mine the past for nuggets, examples, tricks, formulae etc. while forgetting the points of view and organizing principles that made their discovery possible. Another teachable example is that of Dehn’s “invention” of combinatorial (infinite) group theory, as a natural branch of geometry; several generations of followers went about the task of reformulating Dehn’s insights and ideas in the language of algebra, “generalizing” them and stripping them of their context, before geometric and topological methods were reintroduced by Milnor, Schwarz (a different one this time), Stallings, Thurston, Gromov and others to spectacular effect (note: I have the second-hand impression that the geometric point of view in group theory (and every other subject) was never abandoned in the Soviet Union).

Schwarz’s minimal surface (also called “Schwarz’s D surface”, and sometimes “Schwarz’s H surface”) is an extraordinarily beautiful triply-periodic minimal surface of infinite genus that is properly embedded in \mathbb{R}^3. According to Nitsche’s excellent book (p.240), this minimal surface closely resembles the separating wall between inorganic and organic materials in the skeleton of a starfish. The basic building block of the surface can be described as follows. If the vertices of a cube are 2-colored, the black vertices are the vertices of a regular tetrahedron. Let Q denote the quadrilateral formed by four edges of this tetrahedron; then a fundamental piece S of Schwarz’s surface is a minimal disk spanning Q:

[Figure: a fundamental piece S of Schwarz’s surface, a minimal disk spanning the quadrilateral Q]

The surface may be “analytically continued” by rotating Q through an angle \pi around each boundary edge. Six copies of Q fit smoothly around each vertex, and the resulting surface extends (triply) periodically throughout space.

The symmetries of Q enable us to give it several descriptions as a Riemann surface. Firstly, we could think of Q as a polygon in the hyperbolic plane with four edges of equal length, and angles \pi/3. Twelve copies of Q can be assembled to make a hyperbolic surface \Sigma of genus 3. Thinking of a surface of genus 3 as the boundary of a genus 3 handlebody defines a homomorphism from \pi_1(\Sigma) to \mathbb{Z}^3, thought of as H_1(\text{handlebody}); the cover \widetilde{\Sigma} associated to the kernel is (conformally) the triply periodic Schwarz surface, and the deck group acts on \mathbb{R}^3 as a lattice (of index 2 in the face-centered cubic lattice).

Another description is as follows. Since the deck group acts by translation, the Gauss map from \widetilde{\Sigma} to S^2 factors through a map \Sigma \to S^2. The map is injective at each point in the interior or on an edge of a copy of Q, but has an order 2 branch point at each vertex. Thus, the map \Sigma \to S^2 is a double-branched cover, with one branch point of order 2 at each vertex of a regular inscribed cube. This leads one to think (like a late 19th century mathematician) of \Sigma as the Riemann surface on which a certain multi-valued function on S^2 = \mathbb{C} \cup \infty is single-valued. Under stereographic projection, the vertices of the cube map to the eight points \lbrace \alpha,i\alpha,-\alpha,-i\alpha,1/\alpha,i/\alpha,-1/\alpha,-i/\alpha \rbrace where \alpha = (\sqrt{3}-1)/\sqrt{2}. These eight points are the roots of the polynomial w^8 - 14w^4 + 1, so we may think of \Sigma as the hyperelliptic Riemann surface defined by the equation v^2 = w^8 - 14w^4 + 1; equivalently, as the surface on which the multi-valued (on \mathbb{C} \cup \infty) function R(w):= 1/v=1/\sqrt{w^8 - 14w^4 + 1} is single-valued.
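As a quick numerical sanity check (a sketch only, not part of the original argument), one can confirm that the eight points listed above are roots of w^8 - 14w^4 + 1. The names `p` and `vertices` are illustrative choices, not notation from the text.

```python
import math

# alpha = (sqrt(3) - 1)/sqrt(2), the stereographic image of one cube vertex
alpha = (math.sqrt(3) - 1) / math.sqrt(2)

def p(w):
    """The polynomial w^8 - 14 w^4 + 1 from the text."""
    return w**8 - 14 * w**4 + 1

# The eight points alpha * i^k and (1/alpha) * i^k for k = 0, 1, 2, 3.
vertices = [r * 1j**k for r in (alpha, 1 / alpha) for k in range(4)]

# Each residual should vanish up to floating point rounding.
residuals = [abs(p(w)) for w in vertices]
print(max(residuals))
```

The check works because w^4 = 7 \pm 4\sqrt{3} = (2 \pm \sqrt{3})^2 at a root, and \alpha^2 = 2 - \sqrt{3}.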

The function R(w) is known as the Weierstrass function associated to \Sigma, and explicit formulae for the coordinates of the embedding \widetilde{\Sigma} \to \mathbb{R}^3 were found by Enneper and Weierstrass. After picking a basepoint (say 0) on the sphere, the coordinates are given by integration:

x = \text{Re} \int_0^{w_0} \frac{1}{2}(1-w^2)R(w)dw

y = \text{Re} \int_0^{w_0} \frac{i}{2}(1+w^2)R(w)dw

z = \text{Re} \int_0^{w_0} wR(w)dw

The integral in each case depends on the path, and lifts to a single-valued function precisely on \widetilde{\Sigma}.
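The integrals can be evaluated numerically, at least near the basepoint. The following is a minimal sketch, assuming the path is the straight segment from 0 to w_0 and that |w_0| \le 0.5, so that w^8 - 14w^4 + 1 stays in the right half-plane and the principal branch of the square root is safe. The function names and the use of Simpson's rule are my own choices.

```python
import cmath

def R(w):
    """Weierstrass function 1/sqrt(w^8 - 14 w^4 + 1), principal branch."""
    return 1 / cmath.sqrt(w**8 - 14 * w**4 + 1)

def integrate(F, w0, steps=2000):
    """Composite Simpson's rule for the integral of F along the segment 0 -> w0.

    `steps` must be even. The segment is parametrized as w = t * w0, 0 <= t <= 1.
    """
    h = 1.0 / steps
    total = F(0) + F(w0)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * F(k * h * w0)
    return w0 * total * h / 3

def point(w0):
    """Coordinates (x, y, z) of the surface point over w0, assuming |w0| <= 0.5."""
    x = integrate(lambda w: 0.5 * (1 - w * w) * R(w), w0).real
    y = integrate(lambda w: 0.5j * (1 + w * w) * R(w), w0).real
    z = integrate(lambda w: w * R(w), w0).real
    return (x, y, z)

print(point(0.3 + 0.2j))
```

Replacing w_0 by i w_0 sends (x,y,z) to (y,-x,-z), as a change of variables in the integrals shows; this reflects one of the symmetries of the fundamental piece and makes a convenient check on the code.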

Geometrically, the three coordinate functions x,y,z are harmonic functions on \widetilde{\Sigma}. This corresponds to the fact that minimal surfaces are precisely those with vanishing mean curvature, and the fact that the Laplacian of the coordinate functions (in terms of isothermal parameters on the underlying Riemann surface) can be expressed as a nonzero multiple of the mean curvature vector. A harmonic function on a Riemann surface is locally the real part of a holomorphic function, unique up to a constant; the holomorphic derivatives of the (complexified) coordinate functions are therefore well-defined, and give holomorphic 1-forms \phi_1,\phi_2,\phi_3 which descend to \Sigma (since the deck group acts by translations). These 1-forms satisfy the identity \sum_i \phi_i^2 = 0 (this identity expresses the fact that the embedding of \widetilde{\Sigma} into \mathbb{R}^3 via these functions is conformal). The (composition of the) Gauss map (with stereographic projection) can be read off from the \phi_i, and as a meromorphic function on \Sigma, it is given by the formula w = \phi_3/(\phi_1 - i\phi_2). Define a function f on \Sigma by the formula fdw = \phi_1 - i\phi_2. Then 1/f,w are the coordinates of a rational map from \Sigma into \mathbb{C}^2 which extends to a map into \mathbb{CP}^2, by sending each zero of f to wf = \phi_3/dw in the \mathbb{CP}^1 at infinity. Symmetry allows us to identify the image with the hyperelliptic embedding from before, and we deduce that f=R(w). Solving for \phi_1,\phi_2 we obtain the integrands in the formulae above.
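Both identities in this paragraph can be checked directly against the integrands in the coordinate formulae. The following is a routine verification, writing \phi_1,\phi_2,\phi_3 for the three integrands (each including the factor dw):

```latex
\begin{align*}
\phi_1 &= \tfrac{1}{2}(1-w^2)R\,dw, \qquad
\phi_2 = \tfrac{i}{2}(1+w^2)R\,dw, \qquad
\phi_3 = wR\,dw, \\
\phi_1^2+\phi_2^2+\phi_3^2
  &= \left[\tfrac{1}{4}(1-w^2)^2 - \tfrac{1}{4}(1+w^2)^2 + w^2\right] R^2\,dw^2
   = \left[-w^2 + w^2\right] R^2\,dw^2 = 0, \\
\phi_1 - i\phi_2
  &= \left[\tfrac{1}{2}(1-w^2) + \tfrac{1}{2}(1+w^2)\right] R\,dw = R\,dw, \\
\frac{\phi_3}{\phi_1 - i\phi_2} &= \frac{wR\,dw}{R\,dw} = w.
\end{align*}
```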

In fact, any holomorphic function R(w) on a domain in \mathbb{C} defines a (typically immersed with branch points) minimal surface, by the integral formulae of Enneper-Weierstrass above. Suppose we want to use this fact to produce an explicit description of a minimal surface bounded by some explicit polygonal loop in \mathbb{R}^3. Any minimal surface so obtained can be continued across the boundary edges by rotation; if the angles at the vertices are all of the form \pi/n the resulting surface closes up smoothly around the vertices, and one obtains a compact abstract Riemann surface \Sigma tiled by copies of the fundamental region, together with a holonomy representation of \pi_1(\Sigma) into \text{Isom}^+(\mathbb{R}^3). Sometimes the image of this representation in the rotational part of \text{Isom}^+(\mathbb{R}^3) is finite, and one obtains an infinitely periodic minimal surface as in the case of Schwarz’s surface. A fundamental tile in \Sigma can be uniformized as a hyperbolic polygon; equivalently, as a region in the upper half-plane bounded by arcs of semicircles perpendicular to the real axis. Since the edges of the loop are straight lines, the image of this hyperbolic polygon under the Gauss map is a region in S^2 also bounded by arcs of round circles; thus Schwarz’s study of minimal surfaces naturally led him to the problem of how to explicitly describe conformal maps between regions in the plane bounded by circular arcs. This problem is solved by the Schwarz-Christoffel transformation, and its generalizations, with help from the Schwarzian derivative.

Note that if P and Q are two such regions, then a conformal map from P to Q can be factored as the product of a map uniformizing P as the upper half-plane, followed by the inverse of a map uniformizing Q as the upper half-plane. So it suffices to find a conformal map when the domain is the upper half plane, decomposed into intervals and rays that are mapped to the edges of a circular polygon Q. Near each vertex, Q can be moved by a fractional linear transformation z \to (az+b)/(cz+d) to (part of) a wedge, consisting of complex numbers with argument between 0 and \alpha, where \alpha is the angle of Q at that vertex. The function f(z) = z^{\alpha/\pi} uniformizes the upper half-plane as such a wedge; however it is not clear how to combine the contributions from each vertex, because of the complicated interaction with the fractional linear transformation. The fundamental observation is that there are certain natural holomorphic differential operators which are insensitive to the composition of a holomorphic function with groups of fractional linear transformations, and the uniformizing map can be expressed much more simply in terms of such operators.

For example, two functions that differ by addition of a constant have the same derivative: f' = (f+c)'. Functions that differ by multiplication by a constant have the same logarithmic derivative: (\log(f))' = (\log(cf))'. Putting these two observations together suggests defining the nonlinearity of a function as the composition N(f):= (\log(f'))' = f''/f'. This has the property that N(af+b) = N(f) for any constants a,b. Under inversion z \to 1/z the nonlinearity transforms by N(1/f) = N(f) - 2f'/f. From this, and a simple calculation, one deduces that the operator N' - N^2/2 is invariant under inversion, and since it is also invariant under addition and multiplication by constants, it is invariant under the full group of fractional linear transformations. This combination is called the Schwarzian derivative; explicitly, it is given by the formula S(f) = f'''/f' - (3/2)(f''/f')^2. Given the Schwarzian derivative S(f), one may recover the nonlinearity N(f) by solving the Riccati equation N' - N^2/2 - S = 0. As explained in this post, solutions of the Riccati equation preserve the projective structure on the line; in this case, it is a complex projective structure on the complex line. Equivalently, different solutions differ by an element of \text{PSL}(2,\mathbb{C}), acting by fractional linear transformations, as we have just deduced. Once we know the nonlinearity, we can solve for f by f = \int e^{\int N}, the usual solution to a first order linear inhomogeneous ODE. The Schwarzian of the function z^{\alpha/\pi} is (1-\alpha^2/\pi^2)/(2z^2). The advantage of expressing things in these terms is that the Schwarzian of a uniformizing map for a circular polygon Q with angles \alpha_i at the vertices has the form of a rational function, with principal parts a_i/(z-z_i)^2 + b_i/(z-z_i), where the a_i = (1-\alpha_i^2/\pi^2)/2 and the b_i and z_i depend (unfortunately in a very complicated way) on the edges of Q (for the ugly truth, see Nehari, chapter 5).
To see this, observe that the Schwarzian of the map has a pole of order two at finitely many points z_i (the preimages of the vertices of Q under the uniformizing map) but is otherwise holomorphic. Moreover, it can be analytically continued into the lower half plane across the interval between successive z_i, by reflecting the image across each circular edge. After reflecting twice, the image of Q is transformed by a fractional linear transformation, so S(f) has an analytic continuation which is single valued on the entire Riemann sphere, with finitely many isolated poles, and is therefore a rational function! When the edges of the polygon are straight, a simpler formula involving the nonlinearity specializes to the “familiar” Schwarz-Christoffel formula.
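Two of the claims above are easy to test numerically: the formula for the Schwarzian of z^{\alpha/\pi}, and invariance of the Schwarzian under post-composition with a fractional linear transformation. The sketch below propagates exact first, second and third derivatives through the chain rule rather than using finite differences. The sample point, the exponent k = 1/3 (i.e. \alpha = \pi/3) and the coefficients a,b,c,d are arbitrary illustrative choices.

```python
def schwarzian(d1, d2, d3):
    """S(f) = f'''/f' - (3/2)(f''/f')^2, from the first three derivatives of f."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def power_jet(k, z):
    """Value and first three derivatives of f(z) = z^k (principal branch)."""
    return (z ** k,
            k * z ** (k - 1),
            k * (k - 1) * z ** (k - 2),
            k * (k - 1) * (k - 2) * z ** (k - 3))

def mobius_after(jet, a, b, c, d):
    """Derivatives of g = (a f + b)/(c f + d), given those of f, by the chain rule."""
    f, f1, f2, f3 = jet
    det = a * d - b * c
    u = c * f + d
    m0 = (a * f + b) / u            # M(f)
    m1 = det / u ** 2               # M'(f)
    m2 = -2 * c * det / u ** 3      # M''(f)
    m3 = 6 * c * c * det / u ** 4   # M'''(f)
    return (m0,
            m1 * f1,
            m2 * f1 ** 2 + m1 * f2,
            m3 * f1 ** 3 + 3 * m2 * f1 * f2 + m1 * f3)

z = 0.7 + 0.4j   # sample point in the upper half-plane
k = 1.0 / 3.0    # k = alpha/pi with alpha = pi/3

f_jet = power_jet(k, z)
g_jet = mobius_after(f_jet, 2 + 1j, -1, 0.5j, 3)

S_f = schwarzian(*f_jet[1:])
S_g = schwarzian(*g_jet[1:])
expected = (1 - k * k) / (2 * z * z)
print(S_f, S_g, expected)
```

All three printed values should agree up to rounding error.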

(Update 10/22): In fact, I went to the library to refresh myself on the contents of Nehari, chapter 5. The first thing I noticed — which I had forgotten — was that if f is the uniformizing map from the upper half-plane to a polygon Q with spherical arcs, then S(f) is real-valued on the real axis. Since it is a rational function, this implies that its nonsingular part is actually a constant; i.e.

S(f) = \sum_i \left( \frac{a_i}{(z-z_i)^2} + \frac{b_i}{z-z_i} \right) + c

where a_i is as above, and z_i,b_i,c are real constants (which satisfy some further conditions — really see Nehari this time for more details).

The other thing that struck me was the first paragraph of the preface, which touches on some of the issues I alluded to above:

In the preface to the first edition of Courant-Hilbert’s “Methoden der mathematischen Physik”, R. Courant warned against a trend discernible in modern mathematics in which he saw a menace to the future development of mathematical analysis. He was referring to the tendency of many workers in this field to lose sight of the roots of mathematical analysis in physical and geometric intuition and to concentrate their efforts on the refinement and the extreme generalization of existing concepts.

Instead of using a word like “menace”, I would rather take this as a lesson about the value of returning to the points of view that led to the creation of the mathematical objects we study every day; which was (to some approximation) the point I was trying to illustrate in this post.


Quasimorphisms from knot invariants

Last week, Michael Brandenbursky from the Technion gave a talk at Caltech on an interesting connection between knot theory and quasimorphisms. Michael’s paper on this subject may be obtained from the arXiv. Recall that given a group G, a quasimorphism is a function \phi:G \to \mathbb{R} for which there is some least real number D(\phi) \ge 0 (called the defect) such that for all pairs of elements g,h \in G there is an inequality |\phi(gh) - \phi(g) - \phi(h)| \le D(\phi). Bounded functions are quasimorphisms, although in an uninteresting way, so one is usually only interested in quasimorphisms up to the equivalence relation that \phi \sim \psi if the difference |\phi - \psi| is bounded. It turns out that each equivalence class of quasimorphism contains a unique representative which has the extra property that \phi(g^n) = n\phi(g) for all g\in G and n \in \mathbb{Z}. Such quasimorphisms are said to be homogeneous. Any quasimorphism may be homogenized by defining \overline{\phi}(g) = \lim_{n \to \infty} \phi(g^n)/n (see e.g. this post for more about quasimorphisms, and their relation to stable commutator length).
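To make the definitions concrete, here is a toy sketch on the group \mathbb{Z} (written additively, so g^n becomes n \cdot g). The function n \mapsto \lfloor n\theta \rfloor is a quasimorphism with defect 1, and its homogenization is the homomorphism n \mapsto n\theta. This example is uninteresting in the sense described above, since it is a bounded distance from a homomorphism, but it shows the defect and the homogenization limit at work. The slope 3/7 and the function names are arbitrary choices.

```python
from fractions import Fraction
import math

THETA = Fraction(3, 7)  # arbitrary slope, chosen only for illustration

def phi(n):
    """A quasimorphism on the group Z: n -> floor(n * THETA)."""
    return math.floor(n * THETA)

def defect_on(window):
    """Largest value of |phi(m + n) - phi(m) - phi(n)| over a finite window of Z."""
    return max(abs(phi(m + n) - phi(m) - phi(n)) for m in window for n in window)

def homogenize(n, k=10**6):
    """Approximate the homogenization lim phi(k * n)/k using one large finite k."""
    return Fraction(phi(k * n), k)

print(defect_on(range(-20, 21)))       # the defect observed on a finite window
print(float(homogenize(5)), float(5 * THETA))
```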

Many groups that do not admit many homomorphisms to \mathbb{R} nevertheless admit rich families of homogeneous quasimorphisms. For example, groups that act weakly properly discontinuously on word-hyperbolic spaces admit infinite dimensional families of homogeneous quasimorphisms; see e.g. Bestvina-Fujiwara. This includes hyperbolic groups, but also mapping class groups and braid groups, which act on the complex of curves.

Michael discussed another source of quasimorphisms on braid groups, those coming from knot theory. Let I be a knot invariant. Then one can extend I to an invariant of pure braids on n strands by I(\alpha) = I(\widehat{\alpha \Delta}) where \Delta = \sigma_1 \cdots \sigma_{n-1}, and the “hat” denotes plat closure. It is an interesting question to ask: under what conditions on I is the resulting function on braid groups a quasimorphism?

In the abstract, such a question is probably very hard to answer, so one should narrow the question by concentrating on knot invariants of a certain kind. Since one wants the resulting invariants to have some relation to the algebraic structure of braid groups, it is natural to look for functions which factor through certain algebraic structures on knots; Michael was interested in certain homomorphisms from the knot concordance group to \mathbb{R}. We briefly describe this group, and a natural class of homomorphisms.

Two oriented knots K_1,K_2 in the 3-sphere are said to be concordant if there is a (locally flat) properly embedded annulus A in S^3 \times [0,1] with A \cap S^3 \times 0 = K_1 and A \cap S^3 \times 1 = K_2. Concordance is an equivalence relation, and the equivalence classes form a group, with connect sum as the group operation, and orientation-reversed mirror image as inverse. The only subtle aspect of this is the existence of inverses, which we briefly explain. Let K be an arbitrary knot, and let K^! denote the mirror image of K with the opposite orientation. Arrange K \cup K^! in space so that they are symmetric with respect to reflection in a dividing plane. There is an immersed annulus A in S^3 which connects each point on K to its mirror image on K^!, and the self-intersections of this annulus are all disjoint embedded arcs, corresponding to the crossings of K in the projection to the mirror. This annulus is an example of what is called a ribbon surface. Connect summing K to K^! by pushing out a finger of each into an arc in the mirror converts the ribbon annulus into a ribbon disk spanning K \# K^!. A ribbon surface (in particular, a ribbon disk) can be pushed into a (smoothly) embedded surface in the 4-ball bounded by S^3. Puncturing the 4-ball at some point on this smooth surface, one obtains a concordance from K\#K^! to the unknot, as claimed.

The resulting group is known as the concordance group \mathcal{C} of knots. Since connect sum is commutative, this group is abelian. Notice as above that a slice knot — i.e. a knot bounding a locally flat disk in the 4-ball — is concordant to the unknot. Ribbon knots (those bounding ribbon disks) are smoothly slice, and therefore slice, and therefore concordant to the trivial knot. Concordance makes sense for codimension two knots in any dimension. In higher even dimensions, knots are always slice, and in higher odd dimensions, Levine found an algebraic description of the concordance groups in terms of (Witt) equivalence classes of linking pairings on a Seifert surface; (some of) this information is contained in the signature of a knot.

Let K be a knot (in S^3 for simplicity) with Seifert surface \Sigma of genus g. If \alpha,\beta are loops in \Sigma, define f(\alpha,\beta) to be the linking number of \alpha with \beta^+, which is obtained from \beta by pushing it to the positive side of \Sigma. The function f is a bilinear form on H_1(\Sigma), and after choosing generators, it can be expressed in terms of a matrix V (called the Seifert matrix of K). The signature of K, denoted \sigma(K), is the signature (in the usual sense) of the symmetric matrix V + V^T. Changing the orientation of a knot does not affect the signature, whereas taking mirror image multiplies it by -1. Moreover, if \Sigma_1,\Sigma_2 are Seifert surfaces for K_1,K_2, one can form a Seifert surface \Sigma for K_1 \# K_2 for which there is some sphere S^2 \subset S^3 that intersects \Sigma in a separating arc, so that the pieces on either side of the sphere are isotopic to the \Sigma_i, and therefore the Seifert matrix of K_1 \# K_2 can be chosen to be block diagonal, with one block for each of the Seifert matrices of the K_i; it follows that \sigma(K_1 \# K_2) = \sigma(K_1) + \sigma(K_2). In fact it turns out that \sigma is a homomorphism from \mathcal{C} to \mathbb{Z}; equivalently (by the arguments above), it is zero on knots which are topologically slice. To see this, suppose K bounds a locally flat disk \Delta in the 4-ball. The union \Sigma':=\Sigma \cup \Delta is an embedded bicollared surface in the 4-ball, which bounds a 3-dimensional Seifert “surface” W whose interior may be taken to be disjoint from S^3. Now, it is a well-known fact that for any oriented 3-manifold W, the inclusion \partial W \to W induces a map H_1(\partial W) \to H_1(W) whose kernel is Lagrangian (with respect to the usual symplectic pairing on H_1 of an oriented surface). Geometrically, this means we can find a basis for the homology of \Sigma' (which is equal to the homology of \Sigma) for which half of the basis elements bound 2-chains in W.
Let W^+ be obtained by pushing off W in the positive direction. Then chains in W and chains in W^+ are disjoint (since W and W^+ are disjoint) and therefore the Seifert matrix V of K has a block form for which the lower right g \times g block is identically zero. It follows that V+V^T also has a zero g\times g lower right block, and therefore its signature is zero.
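The signature is easy to compute from a Seifert matrix. The sketch below diagonalizes V + V^T by simultaneous row and column operations over the rationals (Sylvester's law of inertia) and counts the signs of the pivots. As sample input I assume the matrix [[-1, 1], [0, -1]], a commonly used Seifert matrix for one of the two trefoils; which handedness it represents, and hence the sign of the answer, depends on conventions. The function names are illustrative.

```python
from fractions import Fraction

def signature(sym):
    """Signature (#positive minus #negative eigenvalues) of a symmetric rational
    matrix, computed by congruence diagonalization (Sylvester's law of inertia)."""
    n = len(sym)
    a = [[Fraction(x) for x in row] for row in sym]
    sig = 0
    for k in range(n):
        # Look for a nonzero diagonal entry in the trailing block.
        p = next((i for i in range(k, n) if a[i][i] != 0), None)
        if p is None:
            pair = next(((i, j) for i in range(k, n) for j in range(i + 1, n)
                         if a[i][j] != 0), None)
            if pair is None:
                break  # the trailing block is zero: only zero eigenvalues remain
            i, j = pair
            # Add row j to row i and column j to column i; then a[i][i] = 2*a[i][j].
            for c in range(n):
                a[i][c] += a[j][c]
            for r in range(n):
                a[r][i] += a[r][j]
            p = i
        # Move the pivot to position k by swapping rows and columns p and k.
        a[k], a[p] = a[p], a[k]
        for row in a:
            row[k], row[p] = row[p], row[k]
        pivot = a[k][k]
        sig += 1 if pivot > 0 else -1
        # Clear the rest of row k and column k.
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            if factor != 0:
                for c in range(n):
                    a[i][c] -= factor * a[k][c]
                for r in range(n):
                    a[r][i] -= factor * a[r][k]
    return sig

def knot_signature(V):
    """Signature of V + V^T for a Seifert matrix V."""
    n = len(V)
    return signature([[V[i][j] + V[j][i] for j in range(n)] for i in range(n)])

def block_sum(V1, V2):
    """Block diagonal Seifert matrix for a connect sum."""
    n1, n2 = len(V1), len(V2)
    return [row + [0] * n2 for row in V1] + [[0] * n1 + row for row in V2]

trefoil = [[-1, 1], [0, -1]]                       # assumed sample Seifert matrix
mirror = [[1, 0], [-1, 1]]                         # -V^T, a Seifert matrix for the mirror
print(knot_signature(trefoil))                      # signature of the trefoil
print(knot_signature(block_sum(trefoil, trefoil)))  # additivity under connect sum
print(knot_signature(block_sum(trefoil, mirror)))   # K # mirror(K)
```

The last line illustrates the vanishing of the signature on a slice knot, up to the orientation reversal in K^!, which does not affect the signature.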

The Seifert matrix (and therefore the signature), like the Alexander polynomial, is sensitive to the structure of the first homology of the universal abelian cover of S^3 - K; equivalently, to the structure of the maximal metabelian quotient of \pi_1(S^3 - K). More sophisticated “twisted” and L^2 signatures can be obtained by studying further derived subgroups of \pi_1(S^3 - K) as modules over group rings of certain solvable groups with torsion-free abelian factors (the so-called poly-torsion-free-abelian groups). This was accomplished by Cochran-Orr-Teichner, who used these methods to construct infinitely many new concordance invariants.

The end result of this discussion is the existence of many, many interesting homomorphisms from the knot concordance group to the reals, and by plat closure, many interesting invariants of braids. The connection with quasimorphisms is the following:

Theorem(Brandenbursky): A homomorphism I:\mathcal{C} \to \mathbb{R} gives rise to a quasimorphism on braid groups if there is a constant C so that |I([K])| \le C\cdot\|K\|_g, where \|\cdot\|_g denotes 4-ball genus.

The proof is roughly the following: given pure braids \alpha,\beta one forms the knots \widehat{\alpha\Delta}, \widehat{\beta\Delta} and \widehat{\alpha\beta\Delta}. It is shown that the connect sum L:= \widehat{\alpha \Delta} \# \widehat{\beta\Delta} \# \widehat{\alpha\beta\Delta}^! bounds a Seifert surface whose genus may be universally bounded in terms of the number of strands in the braid group. Pushing this Seifert surface into the 4-ball, the hypothesis of the theorem says that I is uniformly bounded on L. Properties of I then give an estimate for the defect; qed.

It would be interesting to connect these observations up to other “natural” chiral, homogeneous invariants on mapping class groups. For example, associated to a braid or mapping class \phi \in \text{MCG}(S) one can (usually) form a hyperbolic 3-manifold M_\phi which fibers over the circle, with fiber S and monodromy \phi. The \eta-invariant of M_\phi is the signature defect \eta(M_\phi) = \int_Y p_1/3 - \text{sign}(Y) where Y is a 4-manifold with \partial Y = M_\phi with a product metric near the boundary, and p_1 is the first Pontryagin form on Y (expressed in terms of the curvature of the metric). Is \eta a quasimorphism on some subgroup of \text{MCG}(S) (e.g. on a subgroup consisting entirely of pseudo-Anosov elements)?


Harmonic measure

An amenable group G acting by homeomorphisms on a compact topological space X preserves a probability measure on X; in fact, one can give a definition of amenability in such terms. For example, if G is finite, it preserves an atomic measure supported on any orbit. If G = \mathbb{Z}, one can take a sequence of almost invariant probability measures, supported on the subset [-n,n] \cdot p (where p \in X is arbitrary), and any weak limit will be invariant. For a general amenable group, in place of the subsets [-n,n] \subset \mathbb{Z}, one works with a sequence of Følner sets; i.e. subsets with the property that the ratio of the size of their boundary to their size goes to zero (so to speak).

But if G is not amenable, it is generally not true that there is any probability measure on X invariant under the action of G. The best one can expect is a probability measure which is invariant on average. Such a measure is called a harmonic measure (or a stationary measure) for the G-action on X. To be concrete, suppose G is finitely generated by a symmetric generating set S (symmetric here means that if s \in S, then s^{-1} \in S). Let M(X) denote the space of probability measures on X. One can form an operator \Delta:M(X) \to M(X) defined by the formula

\Delta(\mu) = \frac {1} {|S|} \sum_{s \in S} s_*\mu

and then look for a probability measure \nu stationary under \Delta, which exists for quite general reasons. This measure \nu is the harmonic measure: the expectation of the \nu-measure of s(A) under a randomly chosen s \in S is equal to the \nu-measure of A. Note for any probability measure \mu that s_*\mu is absolutely continuous with respect to \Delta(\mu); in fact, the Radon-Nikodym derivative satisfies ds_*\mu/d\Delta(\mu) \le |S|. Substituting \nu for \mu in this formula, one sees that the measure class of \nu is preserved by G, and that for every g \in G, we have dg_*\nu/d\nu \le |S|^{|g|}, where |g| denotes word length with respect to the given generating set.
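The operator \Delta is easy to experiment with on a finite set. The sketch below is only a toy: on a finite set an honestly invariant measure always exists, so nothing here is special to the non-amenable case. It does show the averaging operator, a stationary measure obtained as a Cesàro average of iterates, and the Radon-Nikodym bound ds_*\mu/d\Delta(\mu) \le |S|. The two 3-cycles and all function names are arbitrary illustrative choices.

```python
def pushforward(perm, mu):
    """(s_* mu)(s(x)) = mu(x), for a permutation s given as a list of images."""
    out = [0.0] * len(mu)
    for x, mass in enumerate(mu):
        out[perm[x]] += mass
    return out

def inverse(perm):
    """Inverse permutation."""
    inv = [0] * len(perm)
    for x, y in enumerate(perm):
        inv[y] = x
    return inv

def delta(gens, mu):
    """Averaging operator: Delta(mu) = (1/|S|) * sum over s in S of s_* mu."""
    out = [0.0] * len(mu)
    for s in gens:
        for x, mass in enumerate(pushforward(s, mu)):
            out[x] += mass / len(gens)
    return out

def cesaro_stationary(gens, mu, rounds=2000):
    """Average of mu, Delta(mu), ..., Delta^(rounds-1)(mu); nearly Delta-stationary."""
    avg = [0.0] * len(mu)
    for _ in range(rounds):
        avg = [a + m / rounds for a, m in zip(avg, mu)]
        mu = delta(gens, mu)
    return avg

s = [1, 2, 0, 3, 4]          # the 3-cycle (0 1 2)
t = [0, 1, 3, 4, 2]          # the 3-cycle (2 3 4)
S = [s, inverse(s), t, inverse(t)]   # symmetric generating set

nu = cesaro_stationary(S, [1.0, 0.0, 0.0, 0.0, 0.0])
print(nu)
print(delta(S, nu))
```

The Cesàro average \nu_N satisfies \Delta(\nu_N) - \nu_N = (\Delta^N\mu - \mu)/N, which is why the two printed vectors nearly agree.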

The existence of harmonic measure is especially useful when X is one-dimensional, e.g. in the case that X=S^1. In one dimension, a measure (at least one of full support without atoms) can be “integrated” to a path metric. Consequently, any finitely generated group of homeomorphisms of the circle is conjugate to a group of bilipschitz homeomorphisms (if the harmonic measure associated to the original action does not have full support, or has atoms, one can “throw in” another random generator to the group; the resulting action can be assumed to have a harmonic measure of full support without atoms, which can be integrated to give a structure with respect to which the group action is bilipschitz). In fact, Deroin-Kleptsyn-Navas showed that any countable group of homeomorphisms of the circle (or interval) is conjugate to a group of bilipschitz homeomorphisms (the hypothesis that G be countable is essential; for example, the group \mathbb{Z}^{\mathbb{Z}} acts in a non-bilipschitz way on the interval — see here).

Suppose now that G = \pi_1(M) for some manifold M. The action of G on S^1 determines a foliated circle bundle S^1 \to E \to M; i.e. a circle bundle, together with a codimension one foliation transverse to the circle fibers. To see this, first form the product \widetilde{M} \times S^1 with its product foliation by leaves \widetilde{M} \times \text{point}, where \widetilde{M} denotes the universal cover of M. The group G = \pi_1(M) acts on \widetilde{M} as the deck group of the covering, and on S^1 by the given action; the quotient of this diagonal action on the product is the desired circle bundle E. The foliation makes E into a “flat” circle bundle with structure group \text{Homeo}^+(S^1). The foliation allows us to associate to each path \gamma in M a homeomorphism from the fiber over \gamma(0) to the fiber over \gamma(1); integrability (or flatness) implies that this homeomorphism only depends on the relative homotopy class of \gamma in M. This identification of fibers is called the holonomy of the foliation along the path \gamma. If M is a Riemannian manifold, there is another kind of harmonic measure on the circle bundle; in other words, a probability measure on each circle with the property that the holonomy associated to an infinitesimal random walk on M preserves the expected value of the measure. This is (very closely related to) a special case of a construction due to Lucy Garnett which associates a harmonic transverse measure to any foliation \mathcal{F} of a manifold N, by finding a fixed point of the leafwise heat flow on the space of probability measures on N, and disintegrating this measure into the product of the leafwise area measure, and a “harmonic” transverse measure.

In any case, we normalize our foliated circle bundle so that each circle has length 2\pi in its harmonic measure. Let X be the vector field on the circle bundle that rotates each circle at unit speed, and let \alpha be the 1-form on E whose kernel is tangent to the leaves of the foliation. We scale \alpha so that \alpha(X)=1 everywhere. The integrability condition for a foliation is expressed in terms of the 1-form as the identity \alpha \wedge d\alpha = 0, and we can write d\alpha = -\beta \wedge \alpha where \beta(X)=0. More intrinsically, \beta descends to a 1-form on the leaves of the foliation which measures the logarithm of the rate at which the transverse measure expands under holonomy in a given direction (the leafwise form \beta is sometimes called the Godbillon class, since it is “half” of the Godbillon-Vey class associated to a codimension one foliation; see e.g. Candel-Conlon volume 2, Chapter 7). Identifying the universal cover of each leaf with \widetilde{M} by projection, the fact that our measure is harmonic means that \beta “is” the gradient of the logarithm of a positive harmonic function on \widetilde{M}. As observed by Thurston, the geometry of M then puts constraints on the size of \beta. The following discussion is taken largely from Thurston’s paper “Three-manifolds, foliations and circles II” (unfortunately this mostly unwritten paper is not publicly available; some details can be found in my foliations book, example 4.6).

An orthogonal connection on E can be obtained by averaging \alpha under the flow of X; i.e. if \phi_t is the diffeomorphism of E which rotates each circle through angle t, then

\omega = \frac {1} {2\pi} \int_{0}^{2\pi} \phi_t^* \alpha

is an X-invariant 1-form on E, which therefore descends to a 1-form on M, which can be thought of as a connection form for an \text{SO}(2)-structure on the bundle E. The curvature of the connection (in the usual sense) is the 2-form d\omega, and we have a formula

d\omega = \frac {1} {2\pi} \int_{0}^{2\pi} \phi_t^*(d\alpha) = \frac {1} {2\pi} \int_{0}^{2\pi} \phi_t^*(-\beta \wedge \alpha)

The action of the 1-parameter group \phi_t trivializes the cotangent bundle to E over each fiber. After choosing such a trivialization, we can think of the values of \alpha at each point on a fiber as sweeping out a circle \gamma in a fixed vector space V. The tangent to this circle is found by taking the Lie derivative

\mathcal{L}_X(\alpha) = \iota_X d\alpha + d\iota_X \alpha = \alpha(X)\beta = \beta

In other words, \beta is identified with d\gamma under the identification of \alpha with \gamma, and \int \phi_t^*(-\beta \wedge \alpha) = \int \gamma \wedge d\gamma; i.e. the absolute value of the curvature of the connection is equal to 1/\pi times the area enclosed by \gamma.

Now suppose M is a hyperbolic n-manifold, i.e. a manifold of dimension n with constant curvature -1 everywhere. Equivalently, think of M as a quotient of hyperbolic space \mathbb{H}^n by a discrete group of isometries. A positive harmonic function on \mathbb{H}^n has a logarithmic derivative which is bounded pointwise by (n-1); identifying positive harmonic functions on hyperbolic space with distributions on the sphere at infinity, one sees that the  “worst case” is the harmonic extension of an atomic measure concentrated at a single point at infinity, since every other positive harmonic function is the weighted average of such examples. As one moves towards or away from a blob at infinity concentrated near this point, the radius of the blob expands like e^t; since the sphere at infinity has dimension n-1, the conclusion follows. But this means that the speed of \gamma (i.e. the size of d\gamma) is pointwise bounded by (n-1), and the length of the \gamma circle is at most 2\pi(n-1). A circle of length 2\pi(n-1) can enclose a disk of area at most \pi (n-1)^2, so the curvature of the connection has absolute value pointwise bounded by (n-1)^2.

One corollary is a new proof of the Milnor-Wood inequality, which says that a foliated circle bundle E over a closed oriented surface S of genus at least 2 satisfies |e(E)| \le -\chi(S), where e(E) is the Euler number of the bundle (a topological invariant). For, the surface S can be given a hyperbolic metric, and the bundle a harmonic connection whose average is an orthogonal connection with pointwise curvature of absolute value at most 1. The Euler class of the bundle evaluated on the fundamental class of S is the Euler number e(E); we have

|e(E)| = \frac {1} {2\pi} |\int_S \omega| \le \text{area}(S)/2\pi = -\chi(S)

where the first equality is the Chern-Weil formula for the Euler class of a bundle in terms of the curvature of a connection, and the last equality is the Gauss-Bonnet theorem for a hyperbolic surface. Another corollary gives lower bounds on the area of an incompressible surface in a hyperbolic manifold. Suppose S \to M is an immersion which is injective on \pi_1. There is a cover \widehat{M} of M for which the immersion lifts to a homotopy equivalence, and we get an action of \pi_1(\widehat{M}) on the circle at infinity of S, and hence a foliated circle bundle as above with e(E) = -\chi(S). Integrating as above over the image of S in \widehat{M}, and using the fact that the curvature of \omega is pointwise bounded by (n-1)^2, we deduce that the area of S is at least -2\pi\chi(S)/(n-1)^2. If M is a 3-manifold, we obtain \text{area}(S) \ge -2\pi\chi(S)/4.

(A somewhat more subtle argument allows one to get better bounds, e.g. replacing 4 by (\pi/2)^2 for n=3, and better estimates for higher n.)


How to see the genus

Let R be a polynomial in two variables; i.e. R(\lambda,\mu) = \sum_{i,j} a_{ij} \lambda^i\mu^j where each i,j is non-negative, and the coefficients a_{ij} are complex numbers which are nonzero for only finitely many pairs i,j. For a generic choice of coefficients, the equation R=0 determines a smooth complex curve \Sigma in \mathbb{C}^2 (i.e. a Riemann surface). How can one see the geometry of the curve directly in the expression for R? It turns out that there are several ways to do it, some very old, and some more recent.

The most important geometric invariant of the curve is the genus. To a topologist, this is the number of “handles”; to an algebraic geometer, this is the dimension of the space of holomorphic 1-forms. One well-known way to calculate the genus is by means of the Newton polygon. In the (real) plane \mathbb{R}^2, consider the finite set consisting of the points with integer coordinates (i,j) for which the coefficient a_{ij} of R is nonzero. The convex hull of this finite set is a convex integral polygon, called the Newton polygon of R. It turns out that the genus of \Sigma is the number of integer lattice points in the interior of the Newton polygon. In fact, one can find a basis for the space of holomorphic 1-forms directly from this formulation. Let R_\mu denote the partial derivative of R with respect to \mu. Then for each lattice point (i,j) in the interior of the Newton polygon, the 1-form (\lambda^{i-1}\mu^{j-1}/R_\mu) d\lambda is a holomorphic 1-form on \Sigma, and the set of all such forms is a basis for the space of all holomorphic 1-forms (for instance, when R = \lambda^2 - p(\mu) this recovers, up to a constant, the familiar forms \mu^{j-1}d\mu/\lambda).
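For readers who like to experiment, here is a minimal Python sketch of the lattice point count. It takes the support of R (the exponent pairs (i,j) with a_{ij} nonzero), builds the Newton polygon as a convex hull, and counts interior lattice points with Pick's theorem. The function names are mine, and the sketch assumes the polygon is genuinely two-dimensional.

```python
from math import gcd

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def interior_lattice_points(support):
    """Interior lattice points of the Newton polygon of `support` (Pick's theorem)."""
    hull = convex_hull(support)
    n = len(hull)
    if n < 3:
        return 0  # degenerate polygon: no interior
    edges = [(hull[k], hull[(k + 1) % n]) for k in range(n)]
    twice_area = abs(sum(a[0] * b[1] - b[0] * a[1] for a, b in edges))
    boundary = sum(gcd(abs(b[0] - a[0]), abs(b[1] - a[1])) for a, b in edges)
    # Pick: area = interior + boundary/2 - 1
    return (twice_area - boundary + 2) // 2

# Support of a generic cubic: all monomials of total degree at most 3.
cubic = [(i, j) for i in range(4) for j in range(4) if i + j <= 3]
print(interior_lattice_points(cubic))  # genus of a smooth plane cubic
```

Only the vertices of the polygon matter, so it makes no difference whether one passes the full support or just its extreme points.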

This is direct but a bit unsatisfying to a topologist, since the connection between the dimension of the space of 1-forms and the topological idea of handles is somewhat indirect. In some special cases, it is a bit easier to see things. Two important examples are:

  1. Hyperelliptic surfaces, i.e. equations of the form \lambda^2 = p(\mu) for some polynomial p(\cdot) of degree n. The Newton polygon in this case is the triangle with vertices (0,0), (2,0), (0,n) and it has \lfloor (n-1)/2 \rfloor interior lattice points. Geometrically one can “see” the surface by projecting to the \mu plane. For each generic value of \mu, the complex number p(\mu) has two distinct square roots, so the map is 2 to 1. However, at the n roots of p(\cdot), there is only 1 preimage. So the map is a double cover, branched over n points, and one can “see” the topology of the surface by cutting open two copies of the complex line along slits joining pairs of points, and gluing.
  2. A generic surface of degree d. The Newton polygon in this case is the triangle with vertices (0,0), (d,0), (0,d) and it has (d-1)(d-2)/2 interior lattice points. One way to “see” the surface in this case is to first imagine d lines in general position (a quite special degree d curve). Each pair of lines intersect in a point, so there are d(d-1)/2 points of intersection. After deforming the curve, these points of intersection are resolved into tubes, so one obtains d complex lines joined by d(d-1)/2 tubes. The first d-1 tubes are needed to tube the lines together into a (multiply)-punctured plane, and the remaining (d-1)(d-2)/2 tubes each add one to the genus.
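The two closed formulas above are easy to check by brute force. The following sketch (function name mine) counts lattice points strictly inside the triangle with vertices (0,0), (a,0), (0,b) directly from the defining inequalities, and also checks the tube count from the second example.

```python
def interior_points_of_triangle(a, b):
    """Integer points strictly inside the triangle (0,0), (a,0), (0,b)."""
    # (i, j) is interior iff i > 0, j > 0 and i/a + j/b < 1, i.e. b*i + a*j < a*b.
    return sum(1 for i in range(1, a) for j in range(1, b) if b * i + a * j < a * b)

for n in range(1, 12):
    # hyperelliptic case: triangle (0,0), (2,0), (0,n)
    assert interior_points_of_triangle(2, n) == (n - 1) // 2

for d in range(1, 12):
    # generic degree d case: triangle (0,0), (d,0), (0,d)
    genus = interior_points_of_triangle(d, d)
    assert genus == (d - 1) * (d - 2) // 2
    # d lines meet in d(d-1)/2 points; d-1 tubes connect them, the rest add genus
    assert genus == d * (d - 1) // 2 - (d - 1)
```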

It turns out that there is a nice way to directly see the topology of \Sigma in the Newton polygon, via tropical geometry. I recently learned about this idea from Mohammed Abouzaid in one of his Clay lectures; this point of view was pioneered by Grisha Mikhalkin. The idea is as follows. First consider the restriction of \Sigma to the product \mathbb{C}^* \times \mathbb{C}^*; i.e. remove the intersection with the coordinate axes. For generic R, this amounts to removing a finite number of points from \Sigma, which will not change the genus. Then on this punctured curve \Sigma, consider the real valued function (\lambda,\mu) \to (\log(|\lambda|),\log(|\mu|)). The image is a subset of \mathbb{R}^2, called an amoeba. If one varies the (nonzero) coefficients of R generically, the complex geometry of the curve \Sigma will change, but its topology will not. Hence to see the topology of \Sigma one should deform the coefficients in such a way that the topology of the amoeba can be read off from combinatorial information, encoded in the Newton polygon. The terms in R corresponding to lattice points in a boundary edge of the Newton polygon sum to a polynomial which is homogeneous after a suitable change of coordinates. In the region in which these terms dominate, \Sigma looks more and more like a collection of cylinders, each asymptotic to a cone on some points at infinity. The image in the amoeba is a collection of asymptotically straight rays. If the polynomial were genuinely homogeneous, the preimage of each point in the amoeba would be a circle, parameterized by a choice of argument of (a certain root of) either \lambda or \mu. So the amoeba looks like a compact blob with a collection of spikes coming off. 
As one deforms the coefficients in a suitable way, the compact blob degenerates into a piecewise linear graph which can be read off from purely combinatorial data, and the topology of \Sigma can be recovered by taking the boundary of a thickened tubular neighborhood of this graph.

More explicitly, one chooses a certain triangulation of the Newton polygon into triangles of area 1/2 and with vertices at integer lattice points (by Pick’s theorem this is equivalent to the condition that each triangle and each edge has no lattice points in the interior). This triangulation must satisfy an additional combinatorial condition, namely that there must exist a convex piecewise linear function on the Newton polygon whose domains of linearity are precisely the triangles. This convex function is used to deform the coefficients of R; roughly, if f is the function, choose the coefficient a_{ij} \sim e^{f(i,j)t} and take the limit as t gets very big. The convexity of f guarantees that in the preimage of each triangle of the Newton polygon, the terms of R that contribute the most are those corresponding to the vertices of the triangle. In particular, as t goes to infinity, the amoeba degenerates to the dual spine of the triangle (i.e. a tripod). The preimage of this tripod is a pair of pants; after a change of coordinates, any given triangle can be taken to have vertices (0,0), (1,0), (0,1) corresponding to a linear equation a\lambda + b\mu = c whose solution set in \mathbb{C}^* \times \mathbb{C}^* (for generic a,b,c) is a line minus two points — i.e. a pair of pants.
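The amoeba of a line can be sampled numerically. The sketch below takes the specific line \lambda + \mu = 1 (my choice of a = b = c = 1, purely for illustration) and records the image of random points under (\lambda,\mu) \to (\log|\lambda|, \log|\mu|). Since \lambda + \mu = 1, the three numbers |\lambda|, |\mu|, 1 must satisfy the triangle inequalities, and these cut out a region in the plane with three tentacles: a fattened tripod.

```python
import cmath
import math
import random

def amoeba_samples_of_line(n_samples=2000, seed=0):
    """Sample (log|lambda|, log|mu|) on the line lambda + mu = 1 in C^* x C^*."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n_samples:
        # log-uniform modulus and uniform argument, so the tentacles get sampled too
        lam = cmath.rect(math.exp(rng.uniform(-6, 6)), rng.uniform(0, 2 * math.pi))
        mu = 1 - lam
        if abs(mu) < 1e-12:  # stay away from the puncture mu = 0
            continue
        samples.append((math.log(abs(lam)), math.log(abs(mu))))
    return samples

def in_line_amoeba(x, y, tol=1e-9):
    """Triangle inequalities for side lengths e^x, e^y and 1."""
    ex, ey = math.exp(x), math.exp(y)
    return abs(ex - ey) <= 1 + tol and ex + ey >= 1 - tol

points = amoeba_samples_of_line()
print(all(in_line_amoeba(x, y) for x, y in points))
```

The three tentacles go off in the directions x \to -\infty (where y \to 0), y \to -\infty (where x \to 0), and x = y \to +\infty; these are the three legs of the tripod.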

One therefore has a concrete combinatorial description of the degenerate amoeba: pick a triangulation of the Newton polygon satisfying the combinatorial conditions above. Let \Gamma be the graph dual to the triangulation, with edges dual to boundary edges of the triangulation extended indefinitely. The surface \Sigma is obtained by taking the boundary of a thickened neighborhood of \Gamma. The genus of \Sigma is equal to the rank of the first homology of the graph \Gamma; this is evidently equal to the number of lattice points in the interior of the polygon.

As a really concrete example, consider a polynomial like

R = 1 + 7z^3 - 23.6w^2 + e^\pi z^3w^2

(the exact coefficients are irrelevant; the only issue is to choose them generically enough that the resulting curve is smooth (actually I did not check in this case – please pretend that I did!)). The Newton polygon is a rectangle with vertices (0,0), (3,0), (0,2), (3,2). This can be subdivided into twelve triangles of area 1/2 as in the following figure:

[Figure: Newton_polygon_1]

The dual spine is then the following:

[Figure: Newton_polygon_2]

which evidently has rank of H_1 equal to 2, equal on the one hand to the number of interior points in the Newton polygon, and on the other hand to the genus of \Sigma.
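This count can be reproduced mechanically. The sketch below triangulates the 3 by 2 rectangle by cutting every unit square along a diagonal (an assumption on my part: this is one convenient triangulation into twelve triangles of area 1/2, not necessarily the one in the figure), builds the bounded part of the dual graph, and computes the rank of H_1 as edges minus vertices plus components. The unbounded rays dual to boundary edges are omitted, since they do not create cycles.

```python
from itertools import combinations

def unit_triangulation(width, height):
    """Cut each unit square of [0,width] x [0,height] along a diagonal."""
    triangles = []
    for x in range(width):
        for y in range(height):
            a, b, c, d = (x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1)
            triangles.append(frozenset((a, b, c)))
            triangles.append(frozenset((a, c, d)))
    return triangles

def rank_h1_of_dual_graph(triangles):
    """One vertex per triangle, one edge per pair of triangles sharing a side."""
    n = len(triangles)
    edges = [(s, t) for s, t in combinations(range(n), 2)
             if len(triangles[s] & triangles[t]) == 2]
    parent = list(range(n))  # union-find, to count connected components

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for s, t in edges:
        parent[find(s)] = find(t)
    components = len({find(v) for v in range(n)})
    return len(edges) - n + components

triangles = unit_triangulation(3, 2)
print(len(triangles), rank_h1_of_dual_graph(triangles))
```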


Geometric structures on 1-manifolds

A geometric structure on a manifold is an atlas of charts with values in some kind of “model space”, and transformation functions taken from some pseudogroup of transformations on the model space. If X is the model space, and G is the pseudogroup, one talks about a (G,X)-structure on a manifold M. One usually (but not always) wants X to be homogeneous with respect to G. So, for instance, one talks about smooth structures, conformal structures, projective structures, bilipschitz structures, piecewise linear structures, symplectic structures, and so on, and so on. Riemannian geometry does not easily fit into this picture, because there are so few (germs of) isometries of a typical Riemannian metric, and so many local invariants; but Riemannian metrics modeled on a locally symmetric space, with G a Lie group of symmetries of X, are a very significant example.

Sometimes the abstract details of a theory are hard to grasp before looking at some fundamental examples. The case of geometric structures on 1-manifolds is a nice example, which is surprisingly rich in some ways.


One of the most important ways in which geometric structures arise is in the theory of ODEs. Consider a first order ODE in one variable, e.g. an equation like y' = f(y,t). If we fix an “initial” value y(t_0)=y_0, then we are guaranteed short time existence and uniqueness of a solution (providing the function f is nice enough). But if we do not fix an initial value, we can instead think of an ODE as a 1-parameter family of (perhaps partially defined) maps from \mathbb{R} to itself. For each fixed t, the function f(y,t) defines a vector field on \mathbb{R}. We can think of the ODE as specifying a path in the Lie algebra of vector fields on \mathbb{R}; solving the ODE amounts to finding a path in the Lie group of diffeomorphisms of \mathbb{R} (or some partially defined Lie pseudogroup of diffeomorphisms on some restricted subdomain) which is tangent to the given family of vector fields. It makes sense therefore to study special classes of equations, and ask when this family of maps is conjugate into an interesting pseudogroup; equivalently, when the evolution of the solutions preserves an interesting geometric structure on \mathbb{R}. We consider some examples in turn.

  1. Indefinite integral y' = a(t). The group in this case is \mathbb{R}, acting on \mathbb{R} by translation. The equation is solved by integrating: y=\int a(t)dt + C.
  2. Linear homogeneous ODE y' = a(t)y. The group in this case is \mathbb{R}^+, acting on \mathbb{R} by multiplication (notice that this group action is not transitive; the point 0 \in \mathbb{R} is preserved; this corresponds to the fact that y = 0 is always a solution of a homogeneous linear ODE). The Lie algebra is \mathbb{R}, and the ODE is “solved” by exponentiating the vector field, and integrating. Hence y = C e^{\int a(t)dt} is the general solution. In fact, in the previous example, the Lie algebra of the group of translations is also identified with \mathbb{R}, and “exponentiating” is the identity map.
  3. Linear inhomogeneous ODE y' = a(t)y + b(t). The group in this case is the affine group \mathbb{R}^+ \ltimes \mathbb{R} where the first factor acts by dilations and the second by translation. The affine group is not abelian, so one cannot “integrate” a vector field directly, but it is solvable: there is a short exact sequence \mathbb{R} \to \mathbb{R}^+ \ltimes \mathbb{R} \to \mathbb{R}^+. The image in the Lie algebra of the group of dilations is the term a(t)y, which can be integrated as before to give an integrating factor e^{\int a(t)dt}. Setting z = ye^{-\int a(t)dt} gives z' = y'e^{-\int a(t)dt} - a(t)ye^{-\int a(t)dt} = b(t)e^{-\int a(t)dt} which is an indefinite integral, and can be solved by a further integration. In other words, we do one integration to change the structure group from \mathbb{R}^+ \ltimes \mathbb{R} to \mathbb{R} (“integrating out” the group of dilations) and then what is left is an abelian structure group, in which we can do “ordinary” integration. This procedure works whenever the structure group is solvable; i.e. whenever there is a finite sequence G=G_0,\cdots,G_n=0 where each G_i surjects onto an abelian group, with kernel G_{i+1}, and after finitely many steps, the last kernel is trivial.
  4. Riccati equation y' = a(t)y^2 + b(t)y + c(t). In this case, it is well-known that the equation can blow up in finite time, and one does not obtain a group of transformations of \mathbb{R}, but rather a group of transformations of the projective line \mathbb{RP}^1 = \mathbb{R} \cup \infty; another point of view says that one obtains a pseudogroup of transformations of subsets of \mathbb{R}. The group in this case is the projective group \text{PSL}(2,\mathbb{R}), acting by projective linear transformations. Let A(t) be a 1-parameter family of matrices in \text{PSL}(2,\mathbb{R}), say A(t)=\left( \begin{smallmatrix} u(t) & v(t) \\ w(t) & x(t) \end{smallmatrix} \right), with A(0)=\text{id}. Matrices act on \mathbb{R} by fractional linear maps; that is, Az = (uz + v)/(wz+x) for z \in \mathbb{R}. Differentiating A(t)z at t=0 one obtains (Az)'(0) = (u'z+v')-z(w'z+x') = -w'z^2 + (u'-x')z + v' which is the general form of the Riccati equation. Since the group \text{PSL}(2,\mathbb{R}) is not solvable, the Riccati equation cannot in general be solved in terms of elementary functions and integrals. However, if one knows one solution y=z(t), one can find all other solutions as follows. Do a change of co-ordinates, by sending the solution z(t) “to infinity”; i.e. define x = 1/(y-z). Then as a function of x, the Riccati equation reduces to a linear inhomogeneous ODE. In other words, the structure group reduces to the subgroup of \text{PSL}(2,\mathbb{R}) fixing the point at infinity (i.e. the solution z(t)), which is the affine group \mathbb{R}^+ \ltimes \mathbb{R}. One can therefore solve for x, and by substituting back, for y.
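The reduction in the last example can be tested numerically. If z is a known solution of y' = ay^2 + by + c and x = 1/(y-z), then a short computation gives x' = -(2az+b)x - a, which is linear inhomogeneous. The sketch below uses the toy equation y' = y^2 - 1 with the known solution z = 1 (my choice, for illustration only), so that x' = -2x - 1, and compares the closed form obtained this way against a generic Runge-Kutta integration of the original equation.

```python
import math

def rk4(f, y0, t0, t1, steps=1000):
    """Classical fourth-order Runge-Kutta integration of y' = f(t, y)."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def riccati_rhs(t, y):
    # a = 1, b = 0, c = -1; the constant function z(t) = 1 is a solution
    return y * y - 1.0

def solve_via_reduction(y0, t):
    """Solve y' = y^2 - 1, y(0) = y0 (y0 != 1), via x = 1/(y - 1), x' = -2x - 1."""
    x0 = 1.0 / (y0 - 1.0)
    x = (x0 + 0.5) * math.exp(-2.0 * t) - 0.5  # integrating factor e^{2t}
    return 1.0 + 1.0 / x

numeric = rk4(riccati_rhs, 0.0, 0.0, 1.0)
closed_form = solve_via_reduction(0.0, 1.0)
print(numeric, closed_form)
```

With y(0) = 0 the closed form simplifies to y = -\tanh(t), which gives an independent check.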

The Riccati equation is important for the solution of second order linear equations, since any second order linear equation y'' = a(t)y' + b(t)y + c(t) can be transformed into a system of two first order linear equations in the variables y and y'. A system of first order ODEs in n variables can be described in terms of pseudogroups of transformations of (subsets of) \mathbb{R}^n. A system of linear equations corresponds to the structure group \text{GL}(n,\mathbb{R}), hence in the case of a 2\times 2 system, to \text{GL}(2,\mathbb{R}). The determinant map is a homomorphism from \text{GL}(2,\mathbb{R}) to \mathbb{R}^* with kernel \text{SL}(2,\mathbb{R}); hence, after multiplication by a suitable integrating factor, one can reduce to a system which is (equivalent to) the Riccati equation.

Having seen these examples, one naturally wonders whether there are any other interesting families of equations and corresponding Lie groups acting on 1-manifolds. In fact, there are (essentially) no other examples: if one insists on (finite dimensional) simple Lie groups, then \text{SL}(2,\mathbb{R}) is more or less the only example. Perhaps this is one of the reasons why the theory of ODEs tends to appear to undergraduates (and others) as an unstructured collection of rules and tricks. Nevertheless, recasting the theory in terms of geometric structures has the effect of clearing the air to some extent.


Geometric structures on 1-manifolds arise also in the theory of foliations, which may be seen as a geometric abstraction of certain kinds of PDE. Suppose M is a manifold, and \mathcal{F} is a codimension one foliation. The foliation determines local charts on the manifold in which the leaves of the foliation intersect the chart in the level sets of a co-ordinate function. In the overlap of two such local charts, the transitions between the local co-ordinate functions take values in some pseudogroup. For certain kinds of foliations, this pseudogroup might be analytically quite rigid. For example, if \mathcal{F} is tangent to the kernel of a nonsingular closed 1-form \alpha on M, then integrating \alpha determines a metric on the leaf space which is preserved by the co-ordinate transformations, and the pseudogroup is conjugate into the group of translations. There are also some interesting examples where the pseudogroup has no interesting local structure, but where structure emerges on a macroscopic scale, because of some special features of the topology of M and \mathcal{F}. For example, suppose M is a 3-manifold, and \mathcal{F} is a foliation in which every leaf is dense. One knows for topological reasons (i.e. theorems of Novikov and Palmeira) that the universal cover \tilde{M} is homeomorphic to \mathbb{R}^3 in such a way that the pulled-back foliation \tilde{\mathcal{F}} is topologically a foliation by planes. One important special case is when any two leaves of \tilde{\mathcal{F}} are a finite Hausdorff distance apart in \tilde{M}. In this case, the foliation \tilde{\mathcal{F}} is topologically conjugate to a product foliation, and \pi_1(M) acts on the leaf space (which is \mathbb{R}) by a group of homeomorphisms. The condition that pairs of leaves are a finite Hausdorff distance apart implies that there are intervals I in the leaf space whose translates do not nest; i.e. with the property that there is no g \in \pi_1(M) for which g(I) is properly contained in I.
Let I^\pm denote the two endpoints of the interval I. One defines a function Z:\mathbb{R} \to \mathbb{R} by defining Z(p) to be the supremum of the set of values g(I^+) over all g \in \pi_1(M) for which g(I^-) \le p. The non-nesting property, and the fact that every leaf of \mathcal{F} is dense, together imply that Z is a strictly increasing (i.e. fixed-point free) homeomorphism of \mathbb{R} which commutes with the action of \pi_1(M). In particular, the action of \pi_1(M) is conjugate into the subgroup \text{Homeo}^+(\mathbb{R})^{\mathbb{Z}} of homeomorphisms that commute with integer translation. One says in this case that the manifold M slithers over a circle; it is possible to deduce a lot about the geometry and topology of M and \mathcal{F} from this structure. See for example Thurston’s paper, or my book.


A third significant way in which geometric structures arise on circles is in the theory of conformal welding. Let \gamma:S^1 \to \mathbb{CP}^1 be a Jordan curve in the Riemann sphere. The image of the curve decomposes the sphere into two regions homeomorphic to disks. Each open disk region can be uniformized by a holomorphic map from the open unit disk, which extends continuously to the boundary circle. These uniformizing maps are well-defined up to composition with an element of the Möbius group \text{PSL}(2,\mathbb{R}), and their difference is therefore a coset in \text{Homeo}^+(S^1)/\text{PSL}(2,\mathbb{R}) called the welding homeomorphism. Conversely, given a homeomorphism of the circle, one can ask when it arises from a Jordan curve in the Riemann sphere as above, and if it does, whether the curve is unique (up to conformal self-maps of the Riemann sphere). Neither existence nor uniqueness holds in great generality. For example, if the image \gamma(S^1) has positive (Hausdorff) measure, any quasiconformal deformation of the complex structure on the Riemann sphere supported on the image of the curve will deform the curve but not the welding homeomorphism. One significant special case in which existence and uniqueness are assured is the case that \gamma(S^1) is a quasicircle. This means that there is a constant K with the property that if two points p,q are contained in the quasicircle, and the spherical distance between the two points is d(p,q), then at least one arc of the quasicircle joining p to q has spherical diameter at most Kd(p,q). In other words, there are no bottlenecks where two points on the quasicircle come very close in the sphere without being close in the curve. Welding maps corresponding to quasicircles are precisely the quasisymmetric homeomorphisms.
A homeomorphism is quasisymmetric if for every sufficiently small interval in the circle, the image of the midpoint of the interval under the homeomorphism is not too far from being the midpoint of the image of the interval; i.e. it divides the image of the interval into two pieces whose lengths have a ratio which is bounded below and above by some fixed constant. Other classes of geometric structures can be detected by welding: smooth Jordan circles correspond to smooth welding maps, real analytic circles correspond to real analytic welding maps, round circles correspond to welding maps in \text{PSL}(2,\mathbb{R}), and so on. Recent work of  Eero Saksman and his collaborators has sought to find the correct idea of a “random” welding, which corresponds to the kinds of Jordan curves generated by stochastic processes such as SLE. In general, the precise correspondence between the analytic quality of \gamma and of the welding map is given by the Hilbert transform.


This list of examples of geometric structures on 1-manifolds is by no means exhaustive. There are many very special features of 1-dimensional geometry: oriented 1-manifolds have a natural causal structure, which may be seen as a special case of contact/symplectic geometry; (nonatomic) measures on 1-manifolds can be integrated to metrics; connections on 1-manifolds are automatically flat, and correspond to representations. It would be interesting to hear other examples, and how they arise in various mathematical fields.


Roth’s theorem

I am in Kyoto right now, attending the twenty-first Nevanlinna colloquium (update: took a while to write this post – now I’m in Sydney for the Clay lectures). Yesterday, Junjiro Noguchi gave a plenary talk on Nevanlinna theory in higher dimensions and related Diophantine problems. The talk was quite technical, and I did not understand it very well; however, he said a few suggestive things early on which struck a chord.

The talk started quite accessibly, being concerned with the fundamental equation

a +b = c

where a,b,c are coprime positive integers. The abc conjecture, formulated by Oesterlé and Masser, says that for any positive real number \epsilon, there is a constant C_\epsilon so that

\max(a,b,c) \le C_\epsilon\text{rad}(abc)^{1+\epsilon}

where \text{rad}(abc) is the product of the distinct primes appearing in the product abc. Informally, this conjecture says that for triples a,b,c satisfying the fundamental equation, the numbers a,b,c are not divisible by “too high” powers of a prime. The abc conjecture is known to imply many interesting number theoretic statements, including (famously) Fermat’s Last Theorem (for sufficiently large exponents), and Roth’s theorem on diophantine approximation (as observed by Bombieri).
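To get a feeling for the inequality, one can compute the radical and the ratio \log(c)/\log(\text{rad}(abc)) for a few triples. This ratio is often called the quality of the triple, and the conjecture is equivalent to saying that for each \epsilon > 0 only finitely many coprime triples have quality bigger than 1+\epsilon. A minimal sketch, using plain trial division (function names mine):

```python
import math

def rad(n):
    """Product of the distinct primes dividing n, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    # whatever is left over is either 1 or a single prime
    return result * n if n > 1 else result

def quality(a, b):
    """log(c) / log(rad(abc)) for the coprime triple a + b = c."""
    if math.gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    c = a + b
    return math.log(c) / math.log(rad(a * b * c))

print(rad(72))                     # 72 = 2^3 * 3^2, so the radical is 6
print(quality(1, 8))               # 1 + 8 = 9
print(quality(2, 3**10 * 109))     # 2 + 3^10 * 109 = 23^5
```

The last triple is a well-known example with unusually high quality, roughly 1.63.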

Roth’s theorem is the following statement:

Theorem (Roth, 1955): Let \alpha be a real algebraic number. Then for any \epsilon>0, the inequality |\alpha - p/q| < q^{-(2+\epsilon)} has only finitely many solutions in coprime integers p,q.

This inequality is best possible, in the sense that every irrational number can be approximated by infinitely many rationals p/q to within 1/2q^2. In fact, among any two consecutive rationals appearing in the continued fraction approximation to \alpha, at least one has this property. There is a very short and illuminating geometric proof that infinitely many such approximations exist.

In the plane, construct a circle packing with a circle of radius 1/2q^2 with center (p/q, 1/2q^2) for each coprime pair p,q of integers.

[Figure: circles_1]

This circle packing nests down on the x-axis, and any vertical line (with irrational x-co-ordinate) intersects infinitely many circles. If the x co-ordinate of a vertical line is \alpha, every circle the line intersects gives a rational p/q which approximates \alpha to within 1/2q^2. qed.
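One can list the circles that a given vertical line meets by brute force. The sketch below (function name mine) runs over denominators q, takes the nearest numerator p, and keeps the pair if it is coprime and p/q is within 1/2q^2 of \alpha. For \alpha = \sqrt{2} this should recover the continued fraction convergents 1/1, 3/2, 7/5, and so on, up to the usual caveats about floating point arithmetic for large q.

```python
import math

def ford_approximations(alpha, max_q):
    """Coprime (p, q) with q <= max_q and |alpha - p/q| < 1/(2 q^2)."""
    hits = []
    for q in range(1, max_q + 1):
        # if p/q is within 1/(2 q^2) of alpha, then p is the integer nearest alpha*q
        p = round(alpha * q)
        if math.gcd(p, q) == 1 and abs(alpha - p / q) < 1 / (2 * q * q):
            hits.append((p, q))
    return hits

print(ford_approximations(math.sqrt(2), 100))
```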

On the other hand, consider the corresponding collection of circles with radius 1/2q^{2+\epsilon}. Some “space” appears between neighboring circles, and they no longer pack tightly (the following picture shows \epsilon = 0.2).

[Figure: circles_2]

The total cross-sectional width of these circles, restricted to pairs p/q in the interval [0,1), can be estimated as follows. Each p/q contributes a width of q^{-(2+\epsilon)}, the diameter of its circle. Ignoring the coprime condition, there are q fractions of the form p/q in the interval [0,1), so the total width is at most \sum_q q^{-1-\epsilon} which converges for positive \epsilon. In other words, the total cross-sectional width of all circles is finite. It follows that almost every vertical line intersects only finitely many circles.

Some vertical lines do, in fact, intersect infinitely many circles; i.e. some real numbers are approximated by infinitely many rationals to better than quadratic accuracy; for example, a Liouville number like \sum_{n=1}^\infty 10^{-n!}.

Some special cases of Roth’s theorem are much easier than others. For instance, it is very easy to give a proof when \alpha is a quadratic irrational; i.e. an element of \mathbb{Q}(\sqrt{d}) for some integer d. Quadratic irrationals are characterized by the fact that their continued fraction expansions are eventually periodic. One can think of this geometrically as follows. The group \text{PSL}(2,\mathbb{Z}) acts on the upper half-plane, which we think of now as the complex numbers with non-negative imaginary part, by fractional linear transformations z \to (az+b)/(cz+d). The quotient is a hyperbolic triangle orbifold, with a cusp. A vertical line in the plane ending at a point \alpha on the x-axis projects to a geodesic ray in the triangle orbifold. A rational number p/q approximating \alpha to within 1/2q^2 is detected by the geodesic entering a horoball centered at the cusp. If \alpha is a quadratic irrational, the corresponding geodesic ray eventually winds around a periodic geodesic (this is the periodicity of the continued fraction expansion), so it never gets too deep into the cusp, and the rational approximations to \alpha never get better than C/2q^2 for some constant C depending on \alpha, as required. A different vertical line intersecting the x-axis at some \beta corresponds to a different geodesic ray; the existence of good rational approximations to \beta corresponds to the condition that the corresponding geodesic goes deeper and deeper into the cusp infinitely often at a definite rate (i.e. at a distance which is at least some fixed (fractional) power of time). A “random” geodesic on a cusped hyperbolic surface takes time n to go distance \log{n} out the cusp (this is a kind of equidistribution fact – the thickness of the cusp goes to zero like e^{-t}, so if one chooses a sequence of points in a hyperbolic surface at random with respect to the uniform (area) measure, it takes about n points to find one that is distance \log{n} out the cusp). 
If one expects that every geodesic ray corresponding to an algebraic number looks like a “typical” random geodesic, one would conjecture (and in fact, Lang did conjecture) that there are only finitely many p/q for which |p/q - \alpha| < q^{-2}(\log{q})^{-1-\epsilon} for any \epsilon > 0.

A slightly different (though related) geometric way to see the periodicity of the continued fraction expansion of a quadratic irrational is to use diophantine geometry. This is best illustrated with an example. Consider the golden number \alpha = (1+\sqrt{5})/2. The matrix A=\left( \begin{smallmatrix} 2 & 1 \\ 1 & 1 \end{smallmatrix} \right) has \left( \begin{smallmatrix} \alpha \\ 1 \end{smallmatrix} \right) and \left( \begin{smallmatrix} \bar{\alpha} \\ 1 \end{smallmatrix} \right) as eigenvectors (here \bar{\alpha} denotes the “conjugate” 1-\alpha), and thus preserves a “wedge” in \mathbb{R}^2 bounded by lines with slopes \alpha and \bar{\alpha}. The set of integer lattice points in this wedge is permuted by A, and therefore so is the boundary of the convex hull of this set (the sail of the cone). Lattice points on the sail correspond to rational approximations to the boundary slopes; the fact that A permutes this set corresponds to the periodicity of the continued fraction expansion of \alpha (and certifies the fact that \alpha cannot be approximated better than quadratically by rational numbers).
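Here is a small numerical illustration of the golden number example. The quadratic form N(p,q) = p^2 - pq - q^2 = (p - \alpha q)(p - \bar{\alpha} q) is preserved by A, so along the orbit of the lattice point (1,1) it stays equal to -1, and this pins q^2|\alpha - p/q| away from zero. The function names are mine, and the orbit of (1,1) is just one convenient orbit to look at (it consists of every other ratio of consecutive Fibonacci numbers).

```python
import math

def apply_A(v):
    """Apply A = [[2, 1], [1, 1]] to the integer vector v = (p, q)."""
    p, q = v
    return (2 * p + q, p + q)

def norm_form(v):
    """N(p, q) = (p - alpha q)(p - conj(alpha) q) = p^2 - p q - q^2, an exact integer."""
    p, q = v
    return p * p - p * q - q * q

def scaled_error(v):
    """q^2 |alpha - p/q|, rewritten as |N(p,q)| / |p/q - conj(alpha)| to avoid cancellation."""
    p, q = v
    conj_alpha = (1 - math.sqrt(5)) / 2
    return abs(norm_form(v)) / abs(p / q - conj_alpha)

v = (1, 1)
for _ in range(8):
    print(v, norm_form(v), round(scaled_error(v), 6))
    v = apply_A(v)
```

The scaled errors settle down to 1/\sqrt{5}, so no point of this orbit approximates \alpha better than quadratically.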

There is an analogue of this construction in higher dimensions: let A be an n\times n integer matrix whose eigenvalues are all real, positive, irrational and distinct. A collection of n suitable eigenvectors spans a polyhedral cone which is invariant under A. The  convex hull of the set of integer lattice points in this cone is a polyhedron, and the vertices of this polyhedron (the vertices on the sail) are  the “best” integral approximations to the eigenvectors. In fact, there is a \mathbb{Z}^{n-1} subgroup of \text{SL}(n,\mathbb{Z}) consisting of matrices with the same set of eigenvectors (this is a consequence of Dirichlet’s theorem on the structure of the group of units in the integers in a number field). Hence there is a group that acts discretely and co-compactly on the vertices of the sail, and one gets a priori estimates on how well the eigenvectors can be approximated by integral vectors. It is interesting to ask whether one can give a proof of Roth’s theorem along these lines, at least for algebraic numbers in totally real fields, but I don’t know the answer.


The Goldman bracket

I was in Stony Brook last week, visiting Moira Chas and Dennis Sullivan, and have been away from blogging for a while; this week I plan to write a few posts about some of the things I discussed with Moira and Dennis. This is an introductory post about the Goldman bracket, an extraordinary mathematical object made out of the combinatorics of immersed curves on surfaces. I don’t have anything original to say about this object, but for my own benefit I thought I would try to explain what it is, and why Goldman was interested in it.

In his study of symplectic structures on character varieties \text{Hom}(\pi,G)/G, where \pi is the fundamental group of a closed oriented surface and G is a Lie group satisfying certain (quite general) conditions, Bill Goldman discovered a remarkable Lie algebra structure on the free abelian group generated by conjugacy classes in \pi. Let \hat{\pi} denote the set of homotopy classes of closed oriented curves on S, where S is itself a compact oriented surface, and let \mathbb{Z}\hat{\pi} denote the free abelian group with generating set \hat{\pi}. If \alpha,\beta are immersed oriented closed curves which intersect transversely (i.e. in double points), define the formal sum

[\alpha,\beta] = \sum_{p \in \alpha \cap \beta} \epsilon(p; \alpha,\beta) |\alpha_p\beta_p| \in \mathbb{Z}\hat{\pi}

In this formula, \alpha_p,\beta_p are \alpha,\beta thought of as based loops at the point p, \alpha_p\beta_p represents their product in \pi_1(S,p), and |\alpha_p\beta_p| represents the resulting conjugacy class in \pi. Moreover, \epsilon(p;\alpha,\beta) = \pm 1 is the oriented intersection number of \alpha and \beta at p.

This operation turns out to depend only on the free homotopy classes of \alpha and \beta, and extends by linearity to a bilinear map [\cdot,\cdot]:\mathbb{Z}\hat{\pi} \times \mathbb{Z}\hat{\pi} \to \mathbb{Z}\hat{\pi}. Goldman shows that this bracket makes \mathbb{Z}\hat{\pi} into a Lie algebra over \mathbb{Z}, and that there are natural Lie algebra homomorphisms from \mathbb{Z}\hat{\pi} to the Lie algebra of functions on \text{Hom}(\pi,G)/G with its Poisson bracket.
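The torus gives a case where the bracket can be written down completely. There \pi = \mathbb{Z}^2 is abelian, so free homotopy classes are just elements of \mathbb{Z}^2, every term |\alpha_p\beta_p| in the sum is the class a + b, and the signed count of intersection points is the algebraic intersection number. The bracket therefore reduces to [a,b] = (a\cdot b)(a+b). Here is a minimal sketch of this special case, assuming the orientation convention a\cdot b = a_1b_2 - a_2b_1 (the opposite convention changes the overall sign); the function names are chosen for illustration only.

```python
from collections import defaultdict

def intersection(a, b):
    """Algebraic intersection number of classes a, b in H_1(T^2) = Z^2.
    The sign is an assumed orientation convention."""
    return a[0] * b[1] - a[1] * b[0]

def add(*terms):
    """Sum of elements of Z[pi-hat], stored as {class: coefficient} dicts."""
    out = defaultdict(int)
    for term in terms:
        for cls, coeff in term.items():
            out[cls] += coeff
    return {cls: coeff for cls, coeff in out.items() if coeff}

def bracket(x, y):
    """Bilinear extension of [a, b] = (a . b) (a + b) on the torus."""
    out = defaultdict(int)
    for a, ca in x.items():
        for b, cb in y.items():
            out[(a[0] + b[0], a[1] + b[1])] += ca * cb * intersection(a, b)
    return {cls: coeff for cls, coeff in out.items() if coeff}

def jacobi(x, y, z):
    """The Jacobi sum [x,[y,z]] + [y,[z,x]] + [z,[x,y]]; {} means zero."""
    return add(bracket(x, bracket(y, z)),
               bracket(y, bracket(z, x)),
               bracket(z, bracket(x, y)))

a = {(1, 0): 1}
b = {(0, 1): 1}
c = {(2, 3): 1, (-1, 1): 2}

print(bracket(a, b))    # meridian and longitude meet once
print(bracket(b, a))    # antisymmetry
print(jacobi(a, b, c))  # empty dict: the Jacobi identity holds
```

On a higher genus surface the classes |\alpha_p\beta_p| at different intersection points are typically different, so no such closed formula is available and one needs the actual combinatorics of the curves.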

The connection with character varieties can be summarized as follows. Let f:G \to \mathbb{R} be a (smooth) class function (i.e. a function which is constant on conjugacy classes) on a Lie group G. Define the variation function F:G \to \mathfrak{g} by the formula

\langle F(A),X\rangle = \left.\frac{d}{dt}\right|_{t=0} f(A\exp(tX))

where \langle \cdot,\cdot\rangle is some (fixed) \text{Ad}-invariant orthogonal structure (i.e. a nondegenerate symmetric bilinear form) on the Lie algebra \mathfrak{g}; for example, if G is reductive (e.g. if G is semisimple), one can take \langle X,Y\rangle = \text{tr}(XY). The tangent space to the character variety \text{Hom}(\pi,G)/G at \phi is the first cohomology group of \pi with coefficients in \mathfrak{g}, thought of as a G-module with the \text{Ad} action, and then as a \pi-module by the representation \phi. Cup product and the pairing \langle\cdot,\cdot\rangle determine a pairing

H^1(\pi,\mathfrak{g})\times H^1(\pi,\mathfrak{g}) \to H^2(\pi,\mathbb{R}) = \mathbb{R}

where the last equality uses the fact that \pi is a closed surface group; this pairing defines the symplectic structure on \text{Hom}(\pi,G)/G.
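To make the variation function concrete, take G = \text{SL}(2,\mathbb{R}), f = \text{tr} and \langle X,Y\rangle = \text{tr}(XY). Differentiating gives \frac{d}{dt}|_{t=0}\text{tr}(A\exp(tX)) = \text{tr}(AX), and since X is traceless this equals \text{tr}((A - \tfrac{1}{2}\text{tr}(A)I)X), so F(A) is the traceless part of A. The following is a small numerical check of that computation, assuming these particular choices of G, f and pairing; the helper names and the sample matrices are made up for the example.

```python
def mul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(P):
    return P[0][0] + P[1][1]

def scale(t, X):
    return [[t * entry for entry in row] for row in X]

def expm(X, terms=30):
    """Truncated power series for exp(X); adequate for small 2x2 matrices."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = scale(1.0 / k, mul(term, X))
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def variation(A):
    """Candidate F(A) = A - (tr A / 2) I for f = trace on SL(2, R)."""
    h = trace(A) / 2
    return [[A[0][0] - h, A[0][1]], [A[1][0], A[1][1] - h]]

def derivative_of_f(A, X, eps=1e-6):
    """Central difference for d/dt f(A exp(tX)) at t = 0, with f = trace."""
    plus = trace(mul(A, expm(scale(eps, X))))
    minus = trace(mul(A, expm(scale(-eps, X))))
    return (plus - minus) / (2 * eps)

A = [[2.0, 1.0], [1.0, 1.0]]     # determinant 1
X = [[0.3, -1.2], [0.7, -0.3]]   # trace 0, i.e. an element of sl(2, R)

pairing = trace(mul(variation(A), X))   # <F(A), X>
numeric = derivative_of_f(A, X)         # d/dt f(A exp(tX)) at t = 0
print(pairing, numeric)
```

The two printed numbers should agree up to the error of the finite difference.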

Every element \alpha \in \pi determines a function f_\alpha:\text{Hom}(\pi,G)/G \to \mathbb{R} by sending a (conjugacy class of) representation [\phi] to f(\phi(\alpha)). Note that f_\alpha only depends on the conjugacy class of \alpha in \pi. It is natural to ask: what is the Hamiltonian flow on \text{Hom}(\pi,G)/G generated by the function f_\alpha? It turns out that when \alpha is a simple closed curve, this Hamiltonian flow is very easy to describe. Write F_\alpha(\phi) for F(\phi(\alpha)). If \alpha is nonseparating, then define a flow \psi_t by \psi_t\phi(\gamma)=\phi(\gamma) when \gamma is represented by a curve disjoint from \alpha, and \psi_t\phi(\gamma)= \exp(tF_\alpha(\phi))\phi(\gamma) if \gamma intersects \alpha exactly once with a positive orientation (there is a similar formula when \alpha is separating). In other words, the representation is constant on the fundamental group of the surface “cut open” along the curve \alpha, and only deforms in the way the two conjugacy classes of \alpha in the cut open surface are identified in \pi.

In the important motivating case that G = \text{PSL}(2,\mathbb{R}), so that one component of \text{Hom}(\pi,G)/G is the Teichmüller space of hyperbolic structures on the surface S, one can take f = 2\cosh^{-1}(|\text{tr}|/2), and then f_\alpha is just the length of the geodesic in the free homotopy class of \alpha, in the hyperbolic structure on S associated to a representation. In this case, the symplectic structure on the character variety restricts to the Weil-Petersson symplectic structure on Teichmüller space, and the Hamiltonian flow associated to the length function f_\alpha is a family of Fenchel-Nielsen twists, i.e. the deformations of the hyperbolic structure obtained by cutting along the geodesic \alpha, rotating through some angle, and regluing. This latter observation recovers a famous theorem of Wolpert, connected in an obvious way to his formula for the symplectic form \omega = \sum dl_\alpha \wedge d\theta_\alpha where \theta is angle and l is length, and the sum is taken over a maximal system of disjoint essential simple curves \alpha for the surface S.
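The length function is easy to evaluate in practice. The sketch below assumes that an element of \text{PSL}(2,\mathbb{R}) is given by a 2\times 2 real matrix of determinant 1, well defined only up to sign, which is why the absolute value of the trace is used; the function name and the sample matrices are illustrative.

```python
import math

def translation_length(M):
    """Length 2*acosh(|tr M| / 2) of the closed geodesic associated to a
    hyperbolic element of PSL(2, R), represented by a matrix of det 1."""
    tr = abs(M[0][0] + M[1][1])
    if tr <= 2:
        raise ValueError("element is not hyperbolic (|trace| <= 2)")
    return 2 * math.acosh(tr / 2)

def mul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

length = 3.0
# Diagonal matrix translating along its axis by `length`.
D = [[math.exp(length / 2), 0.0], [0.0, math.exp(-length / 2)]]

# Conjugating by a parabolic element does not change the trace,
# so f is a class function, as required.
P = [[1.0, 1.0], [0.0, 1.0]]
P_inv = [[1.0, -1.0], [0.0, 1.0]]
conjugate = mul(mul(P, D), P_inv)

# The other lift of the same element of PSL(2, R).
minus_D = [[-entry for entry in row] for row in D]

print(translation_length(D))
print(translation_length(conjugate))
print(translation_length(minus_D))
```

All three values should agree with `length`, up to floating-point error.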

The combinatorial nature of the Goldman bracket suggests that it might have applications in combinatorial group theory. Turaev discovered a Lie cobracket on \mathbb{Z}\hat{\pi}, and showed that together with the Goldman bracket, one obtains a Lie bialgebra. Motivated by Stallings’ reformulation of the Poincaré conjecture in terms of group theory, Turaev asked whether a free homotopy class contains a power of a simple curve if and only if the cobracket of the class is zero. The answer to this question is negative, as shown by Chas; on the other hand, Chas and Krongold showed that a class \alpha is simple if and only if [\alpha,\alpha^3] is zero. Nevertheless, the full geometric meaning of the Goldman bracket remains mysterious, and a topic worthy of investigation.

Posted in Lie groups, Surfaces