Hermann Amandus Schwarz (1843-1921) was a student of Kummer and Weierstrass, and made many significant contributions to geometry, especially to the fields of minimal surfaces and complex analysis. His mathematical creations are both highly abstract and flexible, and at the same time intimately tied to explicit and practical calculation.

I learned about Schwarz-Christoffel transformations, Schwarzian derivatives, and Schwarz’s minimal surface as three quite separate mathematical objects, and I was very surprised to discover firstly that they had all been discovered by the same person, and secondly that they form parts of a consistent mathematical narrative, which I will try to explain in this post to the best of my ability. There is an instructive lesson in this example (for me), that we tend to mine the past for nuggets, examples, tricks, formulae etc. while forgetting the points of view and organizing principles that made their discovery possible. Another teachable example is that of Dehn’s “invention” of combinatorial (infinite) group theory, as a natural branch of geometry; several generations of followers went about the task of reformulating Dehn’s insights and ideas in the language of algebra, “generalizing” them and stripping them of their context, before geometric and topological methods were reintroduced by Milnor, Schwarz (a different one this time), Stallings, Thurston, Gromov and others to spectacular effect (note: I have the second-hand impression that the geometric point of view in group theory (and every other subject) was never abandoned in the Soviet Union).

Schwarz’s minimal surface (also called “Schwarz’s D surface”, and sometimes “Schwarz’s H surface”) is an extraordinarily beautiful triply-periodic minimal surface of infinite genus that is properly embedded in $\mathbb{R}^3$. According to Nitsche’s excellent book (p.240), this minimal surface closely resembles the separating wall between inorganic and organic materials in the skeleton of a starfish. The basic building block of the surface can be described as follows. If the vertices of a cube are $2$-colored, the black vertices are the vertices of a regular tetrahedron. Let $Q$ denote the quadrilateral formed by four edges of this tetrahedron; then a fundamental piece $S$ of Schwarz’s surface is a minimal disk spanning $Q$.

The surface may be “analytically continued” by rotating $Q$ through an angle $\pi$ around each boundary edge. Six copies of $Q$ fit smoothly around each vertex, and the resulting surface extends (triply) periodically throughout space.

The symmetries of $Q$ enable us to give it several descriptions as a Riemann surface. Firstly, we could think of $Q$ as a polygon in the hyperbolic plane with four edges of equal length, and angles $\pi/3$. Twelve copies of $Q$ can be assembled to make a hyperbolic surface $\Sigma$ of genus $3$. Thinking of a surface of genus $3$ as the boundary of a genus $3$ handlebody defines a homomorphism from $\pi_1(\Sigma)$ to $\mathbb{Z}^3$, thought of as $H_1(\text{handlebody})$; the cover $\widetilde{\Sigma}$ associated to the kernel is (conformally) the triply periodic Schwarz surface, and the deck group acts on $\mathbb{R}^3$ as a lattice (of index $2$ in the face-centered cubic lattice).
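
The genus count above is easy to sanity-check by hand; the following is a minimal sketch of that bookkeeping in Python. The only inputs are the numbers already quoted (twelve quadrilaterals, six around each vertex, angles $\pi/3$); the variable names are mine, and the area is measured in units of $\pi$ so that the arithmetic stays exact.

```python
from fractions import Fraction

# Tiling of Sigma by 12 quadrilaterals, 6 of them meeting at each vertex.
faces = 12
edges = faces * 4 // 2        # every edge is shared by 2 quadrilaterals
vertices = faces * 4 // 6     # every vertex is shared by 6 quadrilaterals
euler_char = vertices - edges + faces
genus = (2 - euler_char) // 2

# Gauss-Bonnet, in units of pi: a hyperbolic quadrilateral with all four
# angles pi/3 has area (4 - 2)*pi - 4*(pi/3) = 2*pi/3.
area_Q_over_pi = Fraction(2) - 4 * Fraction(1, 3)
total_area_over_pi = faces * area_Q_over_pi   # should equal -2 * euler_char

print(vertices, edges, faces, euler_char, genus, total_area_over_pi)
```

Both counts agree: the combinatorial Euler characteristic and the total hyperbolic area each give genus $3$.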

Another description is as follows. Since the deck group acts by translation, the Gauss map from $\widetilde{\Sigma}$ to $S^2$ factors through a map $\Sigma \to S^2$. The map is injective at each point in the interior or on an edge of a copy of $Q$, but has an order $2$ branch point at each vertex. Thus, the map $\Sigma \to S^2$ is a double-branched cover, with one branch point of order $2$ at each vertex of a regular inscribed cube. This leads one to think (like a late 19th century mathematician) of $\Sigma$ as the Riemann surface on which a certain multi-valued function on $S^2 = \mathbb{C} \cup \infty$ is single-valued. Under stereographic projection, the vertices of the cube map to the eight points $\lbrace \alpha,i\alpha,-\alpha,-i\alpha,1/\alpha,i/\alpha,-1/\alpha,-i/\alpha \rbrace$ where $\alpha = (\sqrt{3}-1)/\sqrt{2}$. These eight points are the roots of the polynomial $w^8 - 14w^4 + 1$, so we may think of $\Sigma$ as the hyperelliptic Riemann surface defined by the equation $v^2 = w^8 - 14w^4 + 1$; equivalently, as the surface on which the multi-valued (on $\mathbb{C} \cup \infty$) function $R(w):= 1/v=1/\sqrt{w^8 - 14w^4 + 1}$ is single-valued.
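
As a quick numerical check of the claim about the branch points, here is a small sketch that evaluates $w^8 - 14w^4 + 1$ at the eight listed points and applies the Riemann-Hurwitz count for a double cover of the sphere branched over them. The helper name `p` and the list `points` are just illustrative.

```python
import math

alpha = (math.sqrt(3) - 1) / math.sqrt(2)

# The eight images of the cube vertices listed in the text.
units = (1, 1j, -1, -1j)
points = [u * alpha for u in units] + [u / alpha for u in units]

def p(w):
    return w**8 - 14 * w**4 + 1

# Each residual should vanish up to floating point error.
residuals = [abs(p(w)) for w in points]

# Riemann-Hurwitz for a double cover of S^2 branched over 8 points:
# chi = 2 * chi(S^2) - (number of branch points).
euler_char = 2 * 2 - len(points)
genus = (2 - euler_char) // 2

print(max(residuals), euler_char, genus)
```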

The function $R(w)$ is known as the Weierstrass function associated to $\Sigma$, and explicit formulae for the co-ordinates of the embedding $\widetilde{\Sigma} \to \mathbb{R}^3$ were found by Enneper and Weierstrass. After picking a basepoint (say $0$) on the sphere, the coordinates are given by integration:

$x = \text{Re} \int_0^{w_0} \frac{1}{2}(1-w^2)R(w)dw$

$y = \text{Re} \int_0^{w_0} \frac{i}{2}(1+w^2)R(w)dw$

$z = \text{Re} \int_0^{w_0} wR(w)dw$

The integral in each case depends on the path, and lifts to a single-valued function precisely on $\widetilde{\Sigma}$.
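
For readers who want to see points on the surface, here is a rough numerical sketch of the three integrals. It makes two simplifying assumptions that are mine, not part of the classical formulae: the path is the straight segment from $0$ to $w_0$, and $w_0$ is restricted to the disk $|w_0| < \alpha$, on which the principal branch of the square root is continuous and no branch point is encountered. The names `R`, `integrate` and `point` are illustrative.

```python
import cmath
import math

ALPHA = (math.sqrt(3) - 1) / math.sqrt(2)   # smallest modulus of a branch point

def R(w):
    # Principal branch of 1/sqrt(w^8 - 14 w^4 + 1); continuous for |w| < ALPHA.
    return 1 / cmath.sqrt(w**8 - 14 * w**4 + 1)

def integrate(g, w0, n=2000):
    """Composite Simpson rule for g along the segment from 0 to w0 (n even)."""
    h = w0 / n
    total = g(0) + g(w0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def point(w0):
    """Approximate (x, y, z) on the fundamental piece for |w0| < ALPHA."""
    if abs(w0) >= ALPHA:
        raise ValueError("this sketch only handles |w0| < alpha")
    x = integrate(lambda w: 0.5 * (1 - w * w) * R(w), w0).real
    y = integrate(lambda w: 0.5j * (1 + w * w) * R(w), w0).real
    z = integrate(lambda w: w * R(w), w0).real
    return x, y, z

print(point(0.1 + 0.05j))
```

Near the basepoint the output should be close to $(\text{Re}(w_0)/2, -\text{Im}(w_0)/2, \text{Re}(w_0^2)/2)$, since $R(w) \approx 1$ for small $w$. Reaching the rest of the surface would need paths that go around the branch points, which this sketch does not attempt.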

Geometrically, the three coordinate functions $x,y,z$ are harmonic functions on $\widetilde{\Sigma}$. This corresponds to the fact that minimal surfaces are precisely those with vanishing mean curvature, and the fact that the Laplacian of the coordinate functions (in terms of isothermal parameters on the underlying Riemann surface) can be expressed as a nonzero multiple of the mean curvature vector. A harmonic function on a Riemann surface is the real part of a holomorphic function, unique up to a constant; the holomorphic derivatives of the (complexified) coordinate functions are therefore well-defined, and give holomorphic $1$-forms $\phi_1,\phi_2,\phi_3$ which descend to $\Sigma$ (since the deck group acts by translations). These $1$-forms satisfy the identity $\sum_i \phi_i^2 = 0$ (this identity expresses the fact that the embedding of $\widetilde{\Sigma}$ into $\mathbb{R}^3$ via these functions is conformal). The (composition of the) Gauss map (with stereographic projection) can be read off from the $\phi_i$, and as a meromorphic function on $\Sigma$, it is given by the formula $w = \phi_3/(\phi_1 - i\phi_2)$. Define a function $f$ on $\Sigma$ by the formula $fdw = \phi_1 - i\phi_2$. Then $1/f,w$ are the coordinates of a rational map from $\Sigma$ into $\mathbb{C}^2$ which extends to a map into $\mathbb{CP}^2$, by sending each zero of $f$ to $wf = \phi_3/dw$ in the $\mathbb{CP}^1$ at infinity. Symmetry allows us to identify the image with the hyperelliptic embedding from before, and we deduce that $f=R(w)$. Solving for $\phi_1,\phi_2$ we obtain the integrands in the formulae above.
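
The two identities in this paragraph can be checked directly on the integrands. The sketch below evaluates the coefficients of $dw$ in $\phi_1,\phi_2,\phi_3$ at a few arbitrarily chosen sample points (the sample points and function names are my own) and confirms numerically that $\sum_i \phi_i^2 = 0$ and that $\phi_3/(\phi_1 - i\phi_2)$ returns $w$.

```python
import cmath

def R(w):
    return 1 / cmath.sqrt(w**8 - 14 * w**4 + 1)

def phis(w):
    """Coefficients of dw in phi_1, phi_2, phi_3 (the integrands of x, y, z)."""
    r = R(w)
    return 0.5 * (1 - w * w) * r, 0.5j * (1 + w * w) * r, w * r

samples = [0.2 + 0.1j, -0.3j, 0.25 - 0.2j]

# Conformality: phi_1^2 + phi_2^2 + phi_3^2 should vanish identically.
conformality = [abs(sum(c * c for c in phis(w))) for w in samples]

# Gauss map: phi_3 / (phi_1 - i phi_2) should give back w.
gauss = [phis(w)[2] / (phis(w)[0] - 1j * phis(w)[1]) for w in samples]

print(conformality, gauss)
```

Both identities hold for any choice of $R$, which is the numerical shadow of the fact that every $R$ gives a conformal harmonic (hence minimal) parametrization.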

In fact, any holomorphic function $R(w)$ on a domain in $\mathbb{C}$ defines a (typically immersed with branch points) minimal surface, by the integral formulae of Enneper-Weierstrass above. Suppose we want to use this fact to produce an explicit description of a minimal surface bounded by some explicit polygonal loop in $\mathbb{R}^3$. Any minimal surface so obtained can be continued across the boundary edges by rotation; if the angles at the vertices are all of the form $\pi/n$ the resulting surface closes up smoothly around the vertices, and one obtains a compact abstract Riemann surface $\Sigma$ tiled by copies of the fundamental region, together with a holonomy representation of $\pi_1(\Sigma)$ into $\text{Isom}^+(\mathbb{R}^3)$. Sometimes the image of this representation in the rotational part of $\text{Isom}^+(\mathbb{R}^3)$ is finite, and one obtains an infinitely periodic minimal surface as in the case of Schwarz’s surface. A fundamental tile in $\Sigma$ can be uniformized as a hyperbolic polygon; equivalently, as a region in the upper half-plane bounded by arcs of semicircles perpendicular to the real axis. Since the edges of the loop are straight lines, the image of this hyperbolic polygon under the Gauss map is a region in $S^2$ also bounded by arcs of round circles; thus Schwarz’s study of minimal surfaces naturally led him to the problem of how to explicitly describe conformal maps between regions in the plane bounded by circular arcs. This problem is solved by the Schwarz-Christoffel transformation, and its generalizations, with help from the Schwarzian derivative.

Note that if $P$ and $Q$ are two such regions, then a conformal map from $P$ to $Q$ can be factored as the product of a map uniformizing $P$ as the upper half-plane, followed by the inverse of a map uniformizing $Q$ as the upper half-plane. So it suffices to find a conformal map when the domain is the upper half plane, decomposed into intervals and rays that are mapped to the edges of a circular polygon $Q$. Near each vertex, $Q$ can be moved by a fractional linear transformation $z \to (az+b)/(cz+d)$ to (part of) a wedge, consisting of complex numbers with argument between $0$ and $\alpha$, where $\alpha$ is the angle of $Q$ at that vertex. The function $f(z) = z^{\alpha/\pi}$ uniformizes the upper half-plane as such a wedge; however it is not clear how to combine the contributions from each vertex, because of the complicated interaction with the fractional linear transformation. The fundamental observation is that there are certain natural holomorphic differential operators which are insensitive to the composition of a holomorphic function with groups of fractional linear transformations, and the uniformizing map can be expressed much more simply in terms of such operators.

For example, two functions that differ by addition of a constant have the same derivative: $f' = (f+c)'$. Functions that differ by multiplication by a constant have the same logarithmic derivative: $(\log(f))' = (\log(cf))'$. Putting these two observations together suggests defining the nonlinearity of a function as the composition $N(f):= (\log(f'))' = f''/f'$. This has the property that $N(af+b) = N(f)$ for any constants $a,b$. Under inversion $z \to 1/z$ the nonlinearity transforms by $N(1/f) = N(f) - 2f'/f$. From this, and a simple calculation, one deduces that the operator $N' - N^2/2$ is invariant under inversion, and since it is also invariant under addition and multiplication by constants, it is invariant under the full group of fractional linear transformations. This combination is called the Schwarzian derivative; explicitly, it is given by the formula $S(f) = f'''/f' - \frac{3}{2}(f''/f')^2$. Given the Schwarzian derivative $S(f)$, one may recover the nonlinearity $N(f)$ by solving the Riccati equation $N' - N^2/2 - S = 0$. As explained in this post, solutions of the Riccati equation preserve the projective structure on the line; in this case, it is a complex projective structure on the complex line. Equivalently, different solutions differ by an element of $\text{PSL}(2,\mathbb{C})$, acting by fractional linear transformations, as we have just deduced. Once we know the nonlinearity, we can solve for $f$ by $f = \int e^{\int N}$, the usual solution to a first order linear inhomogeneous ODE. The Schwarzian of the function $z^{\alpha/\pi}$ is $(1-\alpha^2/\pi^2)/(2z^2)$.

The advantage of expressing things in these terms is that the Schwarzian of a uniformizing map for a circular polygon $Q$ with angles $\alpha_i$ at the vertices has the form of a rational function, with principal parts $a_i/(z-z_i)^2 + b_i/(z-z_i)$, where the $a_i = (1-\alpha_i^2/\pi^2)/2$ and the $b_i$ and $z_i$ depend (unfortunately in a very complicated way) on the edges of $Q$ (for the ugly truth, see Nehari, chapter 5). To see this, observe that the Schwarzian $S(f)$ of the uniformizing map $f$ has a pole of order two at each of finitely many points $z_i$ (the preimages of the vertices of $Q$ under the uniformizing map) but is otherwise holomorphic. Moreover, $f$ can be analytically continued into the lower half plane across the interval between successive $z_i$, by reflecting the image across each circular edge. After reflecting twice, the image of $Q$ is transformed by a fractional linear transformation, so $S(f)$ has an analytic continuation which is single valued on the entire Riemann sphere, with finitely many isolated poles, and is therefore a rational function! When the edges of the polygon are straight, a simpler formula involving the nonlinearity specializes to the “familiar” Schwarz-Christoffel formula.
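
The two basic facts about the Schwarzian used above (its value on the wedge map $z^{\alpha/\pi}$, and its invariance under post-composition with a fractional linear transformation) can be tested numerically. The sketch below is only an illustration: it approximates derivatives with Cauchy's integral formula discretized on a small circle, and the angle $\alpha = \pi/3$, the basepoint `z0` and the particular transformation `mobius` are arbitrary choices of mine.

```python
import cmath
import math

def derivatives(f, z, r=0.1, n=64):
    """First three derivatives of f at z via Cauchy's integral formula,
    using the trapezoid rule on the circle |zeta - z| = r.
    Assumes f is holomorphic on a neighbourhood of the closed disk."""
    units = [cmath.exp(2j * math.pi * k / n) for k in range(n)]
    vals = [f(z + r * u) for u in units]
    out = []
    for m in (1, 2, 3):
        s = sum(v * u ** (-m) for v, u in zip(vals, units))
        out.append(math.factorial(m) * s / (n * r**m))
    return out

def schwarzian(f, z):
    d1, d2, d3 = derivatives(f, z)
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

k = 1 / 3                                    # alpha / pi with alpha = pi/3
wedge = lambda z: z ** k                     # principal branch of z^(alpha/pi)
mobius = lambda w: (2 * w + 1j) / (w - 3)    # arbitrary, with ad - bc != 0

z0 = 1 + 0.5j
s_wedge = schwarzian(wedge, z0)
s_composed = schwarzian(lambda z: mobius(wedge(z)), z0)
expected = (1 - k * k) / (2 * z0 * z0)

print(s_wedge, s_composed, expected)
```

All three printed values should agree to many digits, which is the invariance that makes the contributions of the different vertices easy to combine.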

(Update 10/22): In fact, I went to the library to refresh myself on the contents of Nehari, chapter 5. The first thing I noticed — which I had forgotten — was that if $f$ is the uniformizing map from the upper half-plane to a polygon $Q$ with spherical arcs, then $S(f)$ is real-valued on the real axis. Since it is a rational function, this implies that its nonsingular part is actually a constant; i.e.

$S(f) = \sum_i \left( \frac{a_i}{(z-z_i)^2} + \frac{b_i}{z-z_i} \right) + c$

where $a_i$ is as above, and $z_i,b_i,c$ are real constants (which satisfy some further conditions — really see Nehari this time for more details).

The other thing that struck me was the first paragraph of the preface, which touches on some of the issues I alluded to above:

In the preface to the first edition of Courant-Hilbert’s “Methoden der mathematischen Physik”, R. Courant warned against a trend discernible in modern mathematics in which he saw a menace to the future development of mathematical analysis. He was referring to the tendency of many workers in this field to lose sight of the roots of mathematical analysis in physical and geometric intuition and to concentrate their efforts on the refinement and the extreme generalization of existing concepts.

Instead of using a word like “menace”, I would rather take this as a lesson about the value of returning to the points of view that led to the creation of the mathematical objects we study every day; which was (to some approximation) the point I was trying to illustrate in this post.

If $f$ is a smooth function on a manifold $M$, and $p$ is a critical point of $f$, recall that the Hessian $H_pf$ is the quadratic form $\nabla df$ on $T_pM$ (in local co-ordinates, the coefficients of the Hessian are the second partial derivatives of $f$ at $p$). Since $H_pf$ is symmetric, it has a well-defined index, which is the dimension of the subspace of maximal dimension on which $H_pf$ is negative definite. The Hessian does not depend on a choice of metric. One way to see this is to give an alternate definition $H_pf(X(p),Y(p)) = X(Yf)(p)$ where $X$ and $Y$ are any two vector fields with given values $X(p)$ and $Y(p)$ in $T_pM$. To see that this does not depend on the choice of $X,Y$, observe

$X(Yf)(p) - Y(Xf)(p) = [X,Y]f(p) = df([X,Y])_p = 0$

because of the hypothesis that $df$ vanishes at $p$. This calculation shows that the formula is symmetric in $X$ and $Y$. Furthermore, since $X(Yf)(p)$ only depends on the value of $X$ at $p$, the symmetry shows that the result only depends on $X(p)$ and $Y(p)$ as claimed. A critical point is nondegenerate if $H_pf$ is nondegenerate as a quadratic form.
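
As a concrete instance of the index, consider the height function on a torus of revolution, written in angular co-ordinates as $f(\theta,\phi) = (R + r\cos\theta)\cos\phi$. The sketch below computes the index at its four critical points from the second partial derivatives. The radii are hypothetical values chosen for illustration, and the helper names are mine.

```python
import math

R_MAJOR, R_MINOR = 2.0, 1.0   # hypothetical radii, with R_MAJOR > R_MINOR > 0

def hessian(theta, phi):
    """Second partials of f(theta, phi) = (R + r cos(theta)) cos(phi)."""
    f_tt = -R_MINOR * math.cos(theta) * math.cos(phi)
    f_tp = R_MINOR * math.sin(theta) * math.sin(phi)
    f_pp = -(R_MAJOR + R_MINOR * math.cos(theta)) * math.cos(phi)
    return f_tt, f_tp, f_pp

def index(f_tt, f_tp, f_pp):
    """Number of negative eigenvalues of the symmetric 2x2 matrix."""
    tr = f_tt + f_pp
    det = f_tt * f_pp - f_tp**2
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    eigenvalues = ((tr - disc) / 2, (tr + disc) / 2)
    return sum(1 for ev in eigenvalues if ev < 0)

# df = 0 exactly when sin(theta) = sin(phi) = 0.
critical_points = [(t, p) for t in (0.0, math.pi) for p in (0.0, math.pi)]
indices = [index(*hessian(t, p)) for t, p in critical_points]
euler_char = sum((-1) ** i for i in indices)

print(indices, euler_char)
```

There is one maximum, one minimum and two saddles, and the alternating count of critical points recovers the Euler characteristic of the torus.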

In Morse theory, one uses a nondegenerate smooth function $f$ (i.e. one with isolated nondegenerate critical points), also called a Morse function, to understand the topology of $M$: the manifold $M$ has a (smooth) handle decomposition with one $i$-handle for each critical point of $f$ of index $i$. In particular, nontrivial homology of $M$ forces any such function $f$ to have critical points (and one can estimate their number of each index from the homology of $M$). Morse in fact applied his construction not to finite dimensional manifolds, but to the infinite dimensional manifold of smooth loops in some finite dimensional manifold, with arc length as a “Morse” function. Critical “points” of this function are closed geodesics. Any closed manifold has a nontrivial homotopy group in some dimension; this gives rise to nontrivial homology in the loop space. Consequently one obtains the theorem of Lyusternik and Fet:

Theorem: Let $M$ be a closed Riemannian manifold. Then $M$ admits at least one closed geodesic.

In higher dimensions, one can study the space of smooth maps from a fixed manifold $S$ to a Riemannian manifold $M$ equipped with various functionals (which might depend on extra data, such as a metric or conformal structure on $S$). One context with many known applications is when $M$ is a Riemannian $3$-manifold, $S$ is a surface, and one studies the area function on the space of smooth maps from $S$ to $M$ (usually in a fixed homotopy class). Critical points of the area function are called minimal surfaces; the name is in some ways misleading: they are not necessarily even local minima of the area function. That depends on the index of the Hessian of the area function at such a point.

Let $\rho(t)$ be a (compactly supported) $1$-parameter family of surfaces in a Riemannian $3$-manifold $M$, for which $\rho(0)$ is smoothly immersed. For small $t$ the surfaces $\rho(t)$ are transverse to the exponentiated normal bundle of $\rho(0)$; hence locally we can assume that $\rho$ takes the form $\rho(t,u,v)$ where $u,v$ are local co-ordinates on $\rho(0)$, and $\rho(\cdot,u,v)$ is contained in the normal geodesic to $\rho(0)$ through the point $\rho(0,u,v)$; we call such a family of surfaces a normal variation of surfaces. For such a variation, one has the following:

Theorem (first variation formula): Let $\rho(t)$ be a normal variation of surfaces, so that $\rho'(0) = f\nu$ where $\nu$ is the unit normal vector field to $\rho(0)$. Then there is a formula:

$\frac d {dt} \text{area}(\rho(t))|_{t=0} = \int_{\rho(0)} -\langle f\nu,\mu\rangle d\text{area}$

where $\mu$ is the mean curvature vector field along $\rho(0)$.

Proof: let $T,U,V$ denote the image under $d\rho$ of the vector fields $\partial_t,\partial_u,\partial_v$. Choose co-ordinates so that $u,v$ are conformal parameters on $\rho(0)$; this means that $\langle U,V\rangle = 0$ and $\|U\|=\|V\|$ at $t=0$.

The infinitesimal area form on $\rho(t)$ is $\sqrt{\|U\|^2\|V\|^2 - \langle U,V \rangle^2} dUdV$ which we abbreviate by $E^{1/2}$, and write

$\frac d {dt} \text{area}(\rho(t)) = \int_{\rho(t)} \frac {dUdV} {2E^{1/2}} (\|U\|^2\langle V,V\rangle' + \|V\|^2\langle U,U\rangle' - 2\langle U,V\rangle\langle U,V\rangle')$

Since $V,T$ are the pushforward of coordinate vector fields, they commute; hence $[V,T]=0$, so $\nabla_T V = \nabla_V T$ and therefore

$\langle V,V\rangle' = 2\langle \nabla_T V,V\rangle = 2\langle \nabla_V T,V\rangle = 2(V\langle T,V\rangle - \langle T,\nabla_V V\rangle)$

and similarly for $\langle U,U\rangle'$. At $t = 0$ we have $\langle T,V\rangle = 0$, $\langle U,V\rangle = 0$ and $\|U\|^2 = \|V\|^2 = E^{1/2}$ so the calculation reduces to

$\frac d {dt} \text{area}(\rho(t))|_{t=0} = \int_{\rho(0)} -\langle T,\nabla_U U + \nabla_V V\rangle dUdV$

Now, $T|_{t=0} = f\nu$, and $\nabla_U U + \nabla_V V = \mu E^{1/2}$ so the conclusion follows. qed.

As a corollary, one deduces that a surface is a critical point for area under all smooth compactly supported variations if and only if the mean curvature $\mu$ vanishes identically; such a surface is called minimal.
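
A simple non-minimal example helps to pin down the sign conventions in the first variation formula. For a round sphere of radius $r$ in $\mathbb{R}^3$ with outward normal $\nu$ and constant normal speed $f$, the mean curvature vector (the trace of the second fundamental form, as in the proof above) is $\mu = -(2/r)\nu$. The sketch below compares a finite-difference derivative of the area with the right hand side of the formula; the values of `r`, `f` and `h` are arbitrary.

```python
import math

def area(radius):
    return 4 * math.pi * radius**2

r, f = 1.5, 1.0    # hypothetical radius and constant normal speed
h = 1e-5

# Left hand side: d/dt area(rho(t)) at t = 0, where rho(t) has radius r + f*t.
numeric = (area(r + f * h) - area(r - f * h)) / (2 * h)

# Right hand side: integral of -<f nu, mu> with mu = -(2/r) nu, a constant
# integrand multiplied by the area of the sphere.
formula = -(f * (-2.0 / r)) * area(r)

print(numeric, formula, 8 * math.pi * r * f)
```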

The second variation formula follows by a similar (though more involved) calculation. The statement is:

Theorem (second variation formula): Let $\rho(t)$ be a normal variation of surfaces, so that $\rho'(0)=f\nu$. Suppose $\rho(0)$ is minimal. Then there is a formula:

$\frac {d^2} {dt^2} \text{area}(\rho(t))|_{t=0} = \int_{\rho(0)} -\langle f\nu,L(f)\nu\rangle d\text{area}$

where $L$ is the Jacobi operator (also called the stability operator), given by the formula

$L = \text{Ric}(\nu) + |A|^2 + \Delta_\rho$

where $A$ is the second fundamental form, and $\Delta_\rho = -\nabla^*\nabla$ is the metric Laplacian on $\rho(0)$.

This formula is frankly a bit fiddly to derive (one derivation, with only a few typos, can be found in my Foliations book; a better derivation can be found in the book of Colding-Minicozzi) but it is easy to deduce some significant consequences directly from this formula. The metric Laplacian on a compact surface is negative self-adjoint (being of the form $-X^*X$ for some operator $X$), and $L$ is obtained from it by adding a $0$th order perturbation, the scalar field $|A|^2 + \text{Ric}(\nu)$. Consequently the biggest eigenspace for $L$ is $1$-dimensional, and the eigenvector of largest eigenvalue cannot change sign. Moreover, the spectrum of $L$ is discrete (counted with multiplicity), and therefore the index of $-L$ (thought of as the “Hessian” of the area functional at the critical point $\rho(0)$) is finite.
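
One case where the second variation can be checked by hand is the totally geodesic equatorial $2$-sphere in the round unit $3$-sphere. There $|A|^2 = 0$ and $\text{Ric}(\nu) = 2$, and the normal variation with $f \equiv 1$ sweeps out the parallel spheres at distance $t$, which have area $4\pi\cos^2 t$. The following sketch compares a finite-difference second derivative with the formula; the step size `h` is an arbitrary choice.

```python
import math

def area(t):
    # Parallel surface at distance t from the equator of the unit 3-sphere:
    # a round 2-sphere of radius cos(t).
    return 4 * math.pi * math.cos(t) ** 2

h = 1e-3
numeric = (area(h) - 2 * area(0.0) + area(-h)) / h**2

# Second variation formula with f = 1: L(f) = Ric(nu) + |A|^2 (Laplacian of 1 is 0).
ric_nu, A_sq, f = 2.0, 0.0, 1.0
formula = -f * (ric_nu + A_sq) * f * area(0.0)

print(numeric, formula)
```

Both values are close to $-8\pi$, so this variation decreases area to second order and the index of the equatorial sphere is at least $1$.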

A surface is said to be stable if the index vanishes. Integrating by parts, one obtains the so-called stability inequality for a stable minimal surface $S$:

$\int_S (\text{Ric}(\nu) + |A|^2)f^2d\text{area} \le \int_S |\nabla f|^2 d\text{area}$

for any reasonable compactly supported function $f$. If $S$ is closed, we can take $f=1$. Consequently if the Ricci curvature of $M$ is positive, $M$ admits no stable minimal surfaces at all. In fact, in the case of a surface in a $3$-manifold, the expression $\text{Ric}(\nu) + |A|^2$ is equal to $R - K + |A|^2/2$ where $K$ is the intrinsic curvature of $S$, and $R$ is the scalar curvature on $M$. If $S$ has positive genus, the integral of $-K$ is non-negative, by Gauss-Bonnet. Consequently, one obtains the following theorem of Schoen-Yau:

Corollary (Schoen-Yau): Let $M$ be a Riemannian $3$-manifold with positive scalar curvature. Then $M$ admits no immersed stable minimal surfaces of positive genus.

On the other hand, one knows that every $\pi_1$-injective map $S \to M$ to a $3$-manifold is homotopic to a stable minimal surface. Consequently one deduces that when $M$ is a $3$-manifold with positive scalar curvature, then $\pi_1(M)$ does not contain a surface subgroup. In fact, the hypothesis that $S \to M$ be $\pi_1$-injective is excessive: if $S \to M$ is merely incompressible, meaning that no essential simple loop in $S$ has a null-homotopic image in $M$, then the map is homotopic to a stable minimal surface. The simple loop conjecture says that a map $S \to M$ from a $2$-sided surface to a $3$-manifold is incompressible in this sense if and only if it is $\pi_1$-injective; but this conjecture has not yet been resolved.

Update 8/26: It is probably worth making a few more remarks about the stability operator.

The first remark is that the three terms $\text{Ric}(\nu)$, $|A|^2$ and $\Delta$ in $L$ have natural geometric interpretations, which give a “heuristic” justification for the second variation formula, which if nothing else, gives a handy way to remember the terms. We describe the meaning of these terms, one by one.

1. Suppose $f \equiv 1$, i.e. consider a variation by flowing points at unit speed in the direction of the normals. In directions in which the surface curves “up”, the normal flow is focussing; in directions in which it curves “down”, the normal flow is expanding. The net first order effect is given by $\langle \nu,\mu\rangle$, the mean curvature in the direction of the flow. For a minimal surface, $\mu = 0$, and only the second order effect remains, which is $|A|^2$ (remember that $A$ is the second fundamental form, which measures the infinitesimal deviation of $S$ from flatness in $M$; the mean curvature is the trace of $A$, which is first order. The norm $|A|^2$ is second order).
2. There is also an effect coming from the ambient geometry of $M$. The second order rate at which a parallel family of normals $\nu$ along a geodesic $\gamma$ diverge is $\langle R(\gamma',\nu)\gamma',\nu\rangle$ where $R$ is the curvature operator. Taking the average over all geodesics $\gamma$ tangent to $S$ at a point gives the Ricci curvature in the direction of $\nu$, i.e. $\text{Ric}(\nu)$. This is the infinitesimal expansion of area of a geodesic plane under the normal flow, and has second order. The interactions between these terms have higher order, so the net contribution when $f \equiv 1$ is $\text{Ric}(\nu) + |A|^2$.
3. Finally, there is the contribution coming from $f$ itself. Imagine that $S$ is a flat plane in Euclidean space, and let $S_\epsilon$ be the graph of $\epsilon f$. The infinitesimal area element on $S_\epsilon$ is $\sqrt{1+\epsilon^2 |\nabla f|^2} \sim 1+\epsilon^2/2 |\nabla f|^2$. If $f$ has compact support, then differentiating twice by $\epsilon$, and integrating by parts, one sees that the (leading) second order term is $\Delta f$. When $S$ is not totally geodesic, and the ambient manifold is not Euclidean space, there is an interaction which has higher order; the leading terms add, and one is left with $L = \text{Ric}(\nu) + |A|^2 + \Delta$.
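
The flat computation in the third item can be tested numerically. The sketch below takes $f(x,y) = \sin x \sin y$ on the square $[0,\pi]^2$ (my choice; it vanishes on the boundary, standing in for compact support), computes the area of the graph of $\epsilon f$ with a midpoint rule, and compares the second derivative in $\epsilon$ with $\int |\nabla f|^2 = -\int f\Delta f = \pi^2/2$.

```python
import math

def graph_area(eps, n=200):
    """Midpoint-rule area of the graph of eps*f over [0, pi]^2,
    where f(x, y) = sin(x) sin(y)."""
    step = math.pi / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        for j in range(n):
            y = (j + 0.5) * step
            grad_sq = (math.cos(x) * math.sin(y)) ** 2 + (math.sin(x) * math.cos(y)) ** 2
            total += math.sqrt(1 + eps * eps * grad_sq)
    return total * step * step

h = 1e-2
second_derivative = (graph_area(h) - 2 * graph_area(0.0) + graph_area(-h)) / h**2

# Integral of |grad f|^2, which equals minus the integral of f * Laplacian(f).
dirichlet_energy = math.pi**2 / 2

print(second_derivative, dirichlet_energy)
```

The second derivative of area is positive here, matching the sign in the second variation formula: the flat plane has $\text{Ric}(\nu) = |A|^2 = 0$, so $-\int f L(f) = -\int f \Delta f = \int |\nabla f|^2$.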

The second remark to make is that if the support of a variation $f$ is sufficiently small, then necessarily $|\nabla f|$ will be large compared to $f$, and therefore $-L$ will be positive definite. In other words all variations of a (fixed) minimal surface with sufficiently small support are area increasing — i.e. a minimal surface is locally area minimizing (this is local in the surface itself, not in the “space of all surfaces”). This is a generalization of the important fact that a geodesic in a Riemannian manifold is locally length minimizing (though typically not globally length minimizing).

One final remark is that when $|A|^2$ is big enough at some point $p \in S$, and when the injectivity radius of $S$ at $p$ is big enough (depending on bounds on $\text{Ric}(\nu)$ in some neighborhood of $p$), one can find a variation with support concentrated near $p$ that violates the stability inequality. Contrapositively, as observed by Schoen, knowing that a minimal surface in a $3$-manifold $M$ is stable gives one a priori control on the size of $|A|^2$, depending only on the Ricci curvature of $M$, and the injectivity radius of the surface at the point. Since stability is preserved under passing to covers (for $2$-sided surfaces, by the fact that the eigenfunction of largest eigenvalue of $L$ can’t change sign!) one only needs a lower bound on the distance from $p$ to $\partial S$. In particular, if $S$ is a closed stable minimal surface, there is an a priori pointwise bound on $|A|^2$. This fact has many important topological applications in $3$-manifold topology. On the other hand, when $S$ has boundary, the curvature can be arbitrarily large. The following example is due to Thurston (also see here for a discussion):

Example (Thurston): Let $\Delta$ be an ideal simplex in $\mathbb{H}^3$ with ideal simplex parameter imaginary and very large. The four vertices of $\Delta$ come in two pairs which are very close together (as seen from the center of gravity of the simplex); let $P$ be an ideal quadrilateral whose edges join a point in one pair to a point in the other. The simplex $\Delta$ is bisected by a “square” of arbitrarily small area; together with four “cusps” (again, of arbitrarily small area) one makes a (topological) disk spanning $P$ with area as small as desired. Isotoping this disk rel. boundary to a least area (and therefore stable) representative can only decrease the area further. By the Gauss-Bonnet formula, the curvature of such a disk must get arbitrarily large (and negative) at some point in the interior.

I recently made the final edits to my paper “Positivity of the universal pairing in 3 dimensions”, written jointly with Mike Freedman and Kevin Walker, to appear in Jour. AMS. This paper is inspired by questions that arise in the theory of unitary TQFT’s. An $n+1$-dimensional TQFT (“topological quantum field theory”) is a functor $Z$ from the category of smooth oriented $n$-manifolds and smooth cobordisms between them, to the category of (usually complex) vector spaces and linear maps, that obeys the (so-called) monoidal axiom $Z(A \coprod B) = Z(A) \otimes Z(B)$. The monoidal axiom implies that $Z(\emptyset)=\mathbb{C}$. Roughly speaking, the functor associates to a “spacelike slice” — i.e. to each $n$-manifold $A$ — the vector space of “quantum states” on $A$ (whatever they are), denoted $Z(A)$. A cobordism stands in for the physical idea of the universe and its quantum state evolving in time. An $n+1$-manifold $W$ bounding $A$ can be thought of as a cobordism from the empty manifold to $A$, so $Z(W)$ is a linear map from $\mathbb{C}$ to $Z(A)$, or equivalently, a vector in $Z(A)$ (the image of $1 \in \mathbb{C}$).

Note that as defined above, a TQFT is sensitive not just to the underlying topology of a manifold, but to its smooth structure. One can define variants of TQFTs by requiring more or less structure on the underlying manifolds and cobordisms. One can also consider “decorated” cobordism categories, such as those whose objects are pairs $(A,K)$ where $A$ is a manifold and $K$ is a submanifold of some fixed codimension (usually $2$) and whose morphisms are pairs of cobordisms $(W,S)$ (e.g. Wilson loops in a $2+1$-dimensional TQFT).

In realistic physical theories, the space of quantum states is a Hilbert space — i.e. it is equipped with a nondegenerate inner product. In particular, the result of pairing a vector with itself should be positive. One says that a TQFT with this property is unitary. In the TQFT, reversing the orientation of a manifold interchanges a vector space with its dual, and pairing is accomplished by gluing diffeomorphic manifolds with opposite orientations. It is interesting to note that many $3+1$-dimensional TQFTs of interest to mathematicians are not unitary; e.g. Donaldson theory, Heegaard Floer homology, etc. These theories depend on a grading, which prevents attempts to unitarize them. It turns out that there is a good reason why this is true, discussed below.

Definition: For any $n$-manifold $S$, let $\mathcal{M}(S)$ denote the complex vector space spanned by the set of $n+1$-manifolds bounding $S$, up to a diffeomorphism fixed on $S$. There is a pairing on this vector space (the universal pairing) taking values in the complex vector space $\mathcal{M}$ spanned by the set of closed $n+1$-manifolds up to diffeomorphism. If $\sum_i a_iA_i$ and $\sum_j b_jB_j$ are two vectors in $\mathcal{M}(S)$, the pairing of these two vectors is equal to the formal sum $\sum_{ij} a_i\overline{b}_j A_i\overline{B}_j$ where overline is complex conjugation on numbers, and orientation-reversal on manifolds, and $A_i\overline{B}_j$ denotes the closed manifold obtained by gluing $A_i$ to $\overline{B}_j$ along $S$.

The point of making this definition is the following. If $v \in \mathcal{M}(S)$ is a vector with the property that $\langle v,v\rangle_S = 0$ (i.e. the result of pairing $v$ with itself is zero), then $Z(v)=0$ for any unitary TQFT $Z$. One says that the universal pairing is positive in $n+1$ dimensions if every nonzero vector $v$ pairs nontrivially with itself.

Example: The Mazur manifold $M$ is a smooth $4$-manifold with boundary $S$. There is an involution $\theta$ of $S$ that does not extend over $M$, so $M,\theta(M)$ denote distinct elements of $\mathcal{M}(S)$. Let $v = M - \theta(M)$, their formal difference. Then the result of pairing $v$ with itself has four terms: $\langle v,v\rangle_S = M\overline{M} - \theta(M)\overline{M} - M\overline{\theta(M)} + \theta(M)\overline{\theta(M)}$. It turns out that all four terms are diffeomorphic to $S^4$, and therefore this formal sum is zero even though $v$ is not zero, and the universal pairing is not positive in dimension $4$.

More generally, it turns out that unitary TQFTs cannot distinguish $s$-cobordant $4$-manifolds, and therefore they are insensitive to essentially all “interesting” smooth $4$-manifold topology! This “explains” why interesting $3+1$-dimensional TQFTs, such as Donaldson theory and Heegaard Floer homology (mentioned above) are necessarily not unitary.

One sees that cancellation arises, and a pairing may fail to be positive, if there are some unusual “coincidences” in the set of terms $A_i\overline{B}_j$ arising in the pairing. One way to ensure that cancellation does not occur is to control the coefficients for the terms appearing in some fixed diffeomorphism type. Observe that the “diagonal” coefficients $a_i\overline{a}_i$ are all positive real numbers, and therefore cancellation can only occur if every manifold appearing as a diagonal term is diffeomorphic to some manifold appearing as an off-diagonal term. The way to ensure that this does not occur is to define some sort of ordering or complexity on terms in such a way that the term of greatest complexity can occur only on the diagonal. This property — diagonal dominance — can be expressed in the following way:

Definition: A pairing $\langle \cdot,\cdot \rangle_S$ as above satisfies the topological Cauchy-Schwarz inequality if there is a complexity function $\mathcal{C}$ defined on all closed $(n+1)$-manifolds, so that if $A,B$ are any two $(n+1)$-manifolds with boundary $S$, there is an inequality $\mathcal{C}(A\overline{B}) \le \max(\mathcal{C}(A\overline{A}),\mathcal{C}(B\overline{B}))$ with equality if and only if $A=B$.

The existence of such a complexity function ensures diagonal dominance, and therefore the positivity of the pairing $\langle\cdot,\cdot\rangle_S$.

Example: Define a complexity function $\mathcal{C}$ on closed $1$-manifolds, by defining $\mathcal{C}(M)$ to be equal to the number of components of $M$. This complexity function satisfies the topological Cauchy-Schwarz inequality, and proves positivity for the universal pairing in $1$ dimension.
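
To make the $1$-dimensional example concrete, here is a minimal sketch in Python, under the simplifying assumption that $S$ consists of $2n$ points and that $A$ and $B$ have no closed components, so that each is just a pairing of the points by arcs. The names `all_matchings`, `partner_map` and `glued_components` are made up for this illustration. The loop at the end checks, for $n=3$, that $A\overline{B}$ never has more than $n$ circles, and has exactly $n$ only when $A=B$.

```python
def all_matchings(points):
    """Yield every way of pairing up `points` by arcs, as a frozenset of pairs."""
    points = sorted(points)
    if not points:
        yield frozenset()
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for m in all_matchings(remaining):
            yield m | {(first, partner)}

def partner_map(matching):
    """Send each boundary point to the other end of its arc."""
    partner = {}
    for p, q in matching:
        partner[p] = q
        partner[q] = p
    return partner

def glued_components(a, b):
    """Number of circles obtained by gluing A to B-bar along the boundary points."""
    pa, pb = partner_map(a), partner_map(b)
    seen, circles = set(), 0
    for start in pa:
        if start in seen:
            continue
        circles += 1
        p = start
        while p not in seen:
            seen.add(p)
            q = pa[p]   # travel along an arc of A
            seen.add(q)
            p = pb[q]   # come back along an arc of B
    return circles

n = 3
matchings = list(all_matchings(range(2 * n)))
for a in matchings:
    for b in matchings:
        c = glued_components(a, b)
        # Equality C(A B-bar) = n holds exactly on the diagonal A = B.
        assert (c == n) == (a == b) and c <= n
```

Every off-diagonal gluing merges at least two arcs of $A$ into a single circle, which is why the count drops strictly below $n$.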

Example: A suitable complexity function can also be found in $2$ dimensions. The first term in the complexity is the number of components. The second is a lexicographic list of the Euler characteristics of the resulting pieces (i.e. the complexity favors more components of bigger Euler characteristic). The first term is maximized if and only if the pieces of $A$ and $B$ are all glued up in pairs with the same number of boundary components in $S$; the second term is then maximized if and only if each piece of $A$ is glued to a piece of $B$ with the same Euler characteristic and number of boundary components, i.e. if and only if $A=B$.

Positivity holds in dimensions below $3$, and fails in dimensions above $3$. The main theorem we prove in our paper is that positivity holds in dimension $3$, and we do this by constructing an explicit complexity function which satisfies the topological Cauchy-Schwarz inequality.

Unfortunately, the function itself is extremely complicated. At a first pass, it is a tuple $c=(c_0,c_1,c_2,c_3)$ where $c_0$ treats the number of components, $c_1$ treats the kernel of $\pi_1(S) \to \pi_1(A)$ under inclusion, $c_2$ treats the essential $2$-spheres, and $c_3$ treats prime factors arising in the decomposition.

The term $c_1$ is itself very interesting: for each finite group $G$ Witten and Dijkgraaf constructed a real unitary TQFT $Z_G$ (i.e. one for which the resulting vector spaces are real), so that roughly speaking $Z_G(S)$ is the vector space spanned by representations of $\pi_1(S)$ into $G$ up to conjugacy, and $Z_G(A)$ is the vector that counts (in a suitable sense) the number of ways each such representation extends over $\pi_1(A)$. The value of $Z_G$ on a closed manifold is roughly just the number of representations of the fundamental group in $G$, up to conjugacy. The complexity $c_1$ is obtained by first enumerating all isomorphism classes of finite groups $G_1,G_2,G_3,\cdots$ and then listing the values of $Z_{G_i}$ in order. If the kernel of $\pi_1(S) \to \pi_1(A)$ is different from the kernel of $\pi_1(S) \to \pi_1(B)$, this difference can be detected by some finite group (this depends on the fact that $3$-manifold groups are residually finite, proved in this context by Hempel); so $c_1$ is diagonally dominant unless these two kernels are equal, or equivalently, unless the maximal compression bodies of $S$ in $A$ and $B$ are diffeomorphic rel. $S$. It is essential to control these compression bodies before counting essential $2$-spheres, so this term must come before $c_2$ in the complexity.
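
As a rough illustration of the kind of count that enters $c_1$, the following Python sketch enumerates by brute force the homomorphisms from $\mathbb{Z}^3$ (the fundamental group of the $3$-torus) to the symmetric group on $n$ letters, and then counts them up to conjugacy. The choice of manifold and of group, and the function names, are assumptions made purely for illustration; the sketch ignores whatever normalization the actual TQFT uses.

```python
from itertools import permutations, product

def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[j] for j in q)

def inverse(p):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def count_representations(n):
    """Return (#homomorphisms Z^3 -> S_n, #homomorphisms up to conjugacy)."""
    group = list(permutations(range(n)))
    commutes = {(a, b) for a, b in product(group, repeat=2)
                if compose(a, b) == compose(b, a)}
    # A homomorphism from Z^3 is a triple of pairwise commuting elements.
    homs = [t for t in product(group, repeat=3)
            if (t[0], t[1]) in commutes
            and (t[0], t[2]) in commutes
            and (t[1], t[2]) in commutes]
    # Use the smallest simultaneous conjugate of a triple as its class label.
    classes = {min(tuple(compose(compose(g, x), inverse(g)) for x in t)
                   for g in group)
               for t in homs}
    return len(homs), len(classes)

print(count_representations(3))
```

For the symmetric group on three letters this finds $48$ homomorphisms falling into $21$ conjugacy classes. Listing such numbers for one finite group after another is the flavor of data that $c_1$ records.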

The term $c_3$ has a contribution $c_p$ from each prime summand. The complexity $c_p$ itself is a tuple $c_p = (c_S,c_h,c_a)$ where $c_S$ treats Seifert-fibered pieces, $c_h$ treats hyperbolic pieces, and $c_a$ treats the way in which these are assembled in the JSJ decomposition. The term $c_h$ is quite interesting; evaluated on a finite volume hyperbolic $3$-manifold $M$ it gives as output the tuple $c_h(M) = (-\text{vol}(M),\sigma(M))$ where $\text{vol}(M)$ denotes hyperbolic volume, and $\sigma(M)$ is the geodesic length spectrum, or at least those terms in the spectrum with zero imaginary part. The choice of the first term depends on the following theorem:

Theorem: Let $S$ be an orientable surface of finite type so that each component has negative Euler characteristic, and let $A,B$ be irreducible, atoroidal and acylindrical $3$-manifolds with boundary $S$. Then $A\overline{A},A\overline{B},B\overline{B}$ admit unique complete hyperbolic structures, and either $2\text{vol}(A\overline{B}) > \text{vol}(A\overline{A})+\text{vol}(B\overline{B})$ or else $2\text{vol}(A\overline{B}) = \text{vol}(A\overline{A}) + \text{vol}(B\overline{B})$ and $S$ is totally geodesic in $A\overline{B}$.

This theorem is probably the most technically difficult part of the paper. Notice that even though in the end we are only interested in closed manifolds, we must prove this theorem for hyperbolic manifolds with cusps, since these are the pieces that arise in the JSJ decomposition. This theorem was proved for closed manifolds by Agol-Storm-Thurston, and our proof follows their argument in general terms, although there are more technical difficulties in the cusped case. One starts with the hyperbolic manifold $A\overline{B}$, and finds a least area representative of the surface $S$. Cut along this surface, and double (metrically) to get two singular metrics on the topological manifolds $A\overline{A}$ and $B\overline{B}$. The theorem will be proved if we can show the volume of this singular metric is bigger than the volume of the hyperbolic metric. Such comparison theorems for volume are widely studied in geometry; in many circumstances one defines a geometric invariant of a Riemannian metric, and then shows that it is minimized/maximized on a locally symmetric metric (which is usually unique in dimensions $>2$). For example, Besson-Courtois-Gallot famously proved that a negatively curved locally symmetric metric on a manifold uniquely minimizes the volume entropy over all metrics with fixed volume (roughly, the entropy of the geodesic flow, at least when the curvature is negative).

Hamilton proved that if one rescales Ricci flow to have constant volume, then scalar curvature $R$ satisfies $R' = \Delta R + 2|\text{Ric}_0|^2 + \frac 2 3 R(R-r)$ where $\text{Ric}_0$ denotes the traceless Ricci tensor, and $r$ denotes the spatial average of the scalar curvature $R$. If the spatial minimum of $R$ is negative, then at a point achieving the minimum, $\Delta R$ is non-negative, as are the other two terms; in other words, if one does Ricci flow rescaled to have constant volume, the minimum of scalar curvature increases (this fact remains true for noncompact manifolds, if one substitutes infimum for minimum). Conversely, if one rescales to keep the infimum of scalar curvature constant, volume decreases under flow. In $3$ dimensions, Perelman shows that Ricci flow with surgery converges to the hyperbolic metric. Surgery at finite times occurs when scalar curvature blows up to positive infinity, so surgery does not affect the infimum of scalar curvature, and only makes volume smaller (since things are being cut out). Consequently, Perelman’s work implies that of all metrics on a hyperbolic $3$-manifold with the infimum of scalar curvature equal to $-6$, the constant curvature metric is the unique metric minimizing volume.

Now, the metric on $A\overline{A}$ obtained by doubling along a minimal surface is not smooth, so one cannot even define the curvature tensor. However, if one interprets scalar curvature as an “average” of Ricci curvature, and observes that a minimal surface is flat “on average”, then one should expect that the distributional scalar curvature of the metric is equal to what it would be if one doubled along a totally geodesic surface, i.e. identically equal to $-6$. So Perelman’s inequality should apply, and prove the desired volume estimate.

To make this argument rigorous, one must show that the singular metric evolves under Ricci flow, and instantaneously becomes smooth, with $R \ge -6$. A theorem of Miles Simon says that this follows if one can find a smooth background metric with uniform bounds on the curvature and its first derivatives, and which is $(1+\epsilon)$-bilipschitz to the singular metric. The existence of such a background metric is essentially trivial in the closed case, but becomes much more delicate in the cusped case. Basically, one needs to establish the following comparison lemma, stated somewhat informally:

Lemma: Least area surfaces in cusps of hyperbolic $3$-manifolds become asymptotically flat faster than the thickness of the cusp goes to zero.

In other words, if one lifts a least area surface $S$ to a surface $\tilde{S}$ in the universal cover, there is a (unique) totally geodesic surface $\pi$ (the “osculating plane”) asymptotic to $\tilde{S}$ at the fixed point of the parabolic element corresponding to the cusp, and satisfying the following geometric estimate. If $B_t$ is the horoball centered at the parabolic fixed point at height $t$ (for some horofunction), then the Hausdorff distance between $\tilde{S} \cap B_t$ and $\pi \cap B_t$ is $o(e^{-t})$. One must further prove that if a surface $S$ has multiple ends in a single cusp, these ends osculate distinct geodesic planes. Given this, it is not too hard to construct a suitable background metric. Between ends of $S$, the geometry looks more and more like a slab wedged between two totally geodesic planes. The double of this is a nonsingular hyperbolic manifold, so it certainly enjoys uniform control on the curvature and its first derivatives; this gives the background metric in the thin part. In the thick part, one can convolve the singular metric with a bump function to find a bilipschitz background metric; compactness of the thick part implies trivially that any smooth metric enjoys uniform bounds on the curvature and its first derivatives. Hence one may apply Simon, and then Perelman, and the volume estimate is proved.

The Seifert fibered case is very fiddly, but ultimately does not require many new ideas. The assembly complexity turns out to be surprisingly involved. Essentially, one thinks of the JSJ decomposition as defining a decorated graph, whose vertices correspond to the pieces in the decomposition, and whose edges control the gluing along tori. One must prove an analogue of the topological Cauchy-Schwarz inequality in the context of (decorated) graphs. This ends up looking much more like the familiar TQFT picture of tensor networks, but a more detailed discussion will have to wait for another post.

In a previous post, I discussed some methods for showing that a given group contains a (nonabelian) free subgroup. The methods were analytic and/or dynamical, and phrased in terms of the existence (or nonexistence) of certain functions on $G$ or on spaces derived from $G$, or in terms of actions of $G$ on certain spaces. Dually, one can try to find a free group in $G$ by finding a homomorphism $\rho: F \to G$ and looking for circumstances under which $\rho$ is injective.

For concreteness, let $G = \pi_1(X)$ for some (given) space $X$. If $F$ is a free group, a representation $\rho:F \to G$ up to conjugation determines a homotopy class of map $f: S \to X$ where $S$ is a $K(F,1)$. The most natural $K(F,1)$’s to consider are graphs and surfaces (with boundary). It is generally not easy to tell whether a map of a graph or a surface to a topological space is $\pi_1$-injective at the topological level, but it might be easier if one can use some geometry.

Example: Let $X$ be a complete Riemannian manifold with sectional curvature bounded above by some negative constant $K < 0$. Convexity of the distance function in a negatively curved space means that given any map of a graph $f:\Gamma \to X$ one can flow $f$ by the negative gradient of total length until it undergoes some topology change (e.g. some edge shrinks to zero length) or it (asymptotically) achieves a local minimum (the adjective “asymptotically” here just means that the flow takes infinite time to reach the minimum, because the size of the gradient is small when the map is almost minimal; there are no analytic difficulties to overcome when taking the limit). A typical topological change might be some loop shrinking to a point, thereby certifying that a free summand of $\pi_1(\Gamma)$ mapped trivially to $G$ and should have been discarded. Technically, one probably wants to choose $\Gamma$ to be a trivalent graph, and when some interior edge collapses (so that four points come together) to let the $4$-valent vertex resolve itself into a pair of $3$-valent vertices in whichever of the three combinatorial possibilities is locally most efficient. The limiting graph, if nonempty, will be trivalent, with geodesic edges, and vertices at which the three edges are all (tangentially) coplanar and meet at angles of $2\pi/3$. Such a graph can be certified as $\pi_1$-injective provided the edges are sufficiently long (depending on the curvature $K$). After rescaling the metric on $X$ so that the supremum of the curvatures is $-1$, a trivalent geodesic graph with angles $2\pi/3$ at the vertices and edges of length at least $2\tanh^{-1}(1/2) = 1.0986\cdots$ is $\pi_1$-injective. To see this, lift to maps between universal covers, i.e. consider an equivariant map from a tree $\widetilde{\Gamma}$ to $\widetilde{X}$. Let $\ell$ be an embedded arc in $\widetilde{\Gamma}$, and consider the image in $\widetilde{X}$. Using Toponogov’s theorem, one can compare with a piecewise isometric map from $\ell$ to $\mathbb{H}^n$. The worst case is when all the edges are contained in a single $\mathbb{H}^2$, and all corners “bend” the same way. Provided the image does not bend as much as a horocycle, the endpoints of the image of $\ell$ stay far away in $\mathbb{H}^2$. An infinite sided convex polygon in $\mathbb{H}^2$ with all edges of length $2\tanh^{-1}(1/2)$ and all angles $2\pi/3$ osculates a horocycle, so we are done.
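
The constant can at least be sanity-checked numerically. The following Python sketch works in the upper half-plane model, taking the horocycle $y=1$ and placing vertices at $(kd,1)$ for integers $k$; this normalization and the function names are chosen here for illustration only. It picks the spacing $d$ that gives interior angles $2\pi/3$ and compares the resulting edge length with $2\tanh^{-1}(1/2)$.

```python
import math

def edge_length(d):
    """Hyperbolic distance between (0, 1) and (d, 1) in the upper half-plane."""
    return math.acosh(1 + d * d / 2)

def interior_angle(d):
    """Interior angle of the polygon with vertices (k*d, 1), k an integer."""
    # The geodesic from (0, 1) to (d, 1) is a Euclidean semicircle centred at
    # (d/2, 0); its tangent at (0, 1) has slope d/2, so it leaves the horocycle
    # at angle atan(d/2). The two edges at a vertex are mirror images.
    return math.pi - 2 * math.atan(d / 2)

# Spacing for which the interior angle is exactly 2*pi/3.
d = 2 * math.tan(math.pi / 6)
critical = edge_length(d)
print(critical, 2 * math.atanh(0.5), math.log(3))
```

The three printed values agree to floating-point precision (the common value is $\log 3$), which is consistent with the constant $1.0986\cdots$ quoted above, at least for the polygon with vertices on a horocycle.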

Remark: The fundamental group of a negatively curved manifold is word-hyperbolic, and therefore contains many nonabelian free groups, which may be certified by pingpong applied to the action of the group on its Gromov boundary. The point of the previous example is therefore to certify that a certain subgroup is free in terms of local geometric data, rather than global dynamical data (so to speak). Incidentally, I would not swear to the correctness of the constants above.

Example: A given free group is the fundamental group of a surface with boundary in many different ways (this difference is one of the reasons that a group like $\text{Out}(F_n)$ is so much more complicated than the mapping class group of a surface). Pick a realization $F = \pi_1(S)$. Then a homomorphism $\rho:F \to G$ up to conjugacy determines a homotopy class of map from $S$ to $X$ as above. If $X$ is negatively curved as before, each boundary loop is homotopic to a unique geodesic, and we may try to find a “good” map $f:S \to X$ with boundary on these geodesics. There are many possible classes of good maps to consider:

1. Fix a conformal structure on $S$ and pick a harmonic map in the homotopy class of $f$. Such a map exists since the target is nonpositively curved, by the famous theorem of Eells-Sampson. The image is real analytic if $X$ is, and is at least as negatively curved as the target, and therefore there is an a priori upper bound on the intrinsic curvature of the image; if the supremum of the curvature on $X$ is normalized to be $-1$, then the image surface is $\text{CAT}(-1)$, which just means that pointwise it is at least as negatively curved as hyperbolic space. By Gauss-Bonnet, one obtains an a priori bound on the area of the image of $S$ in terms of the Euler characteristic (which just depends on the rank of $F$). On the other hand, this map depends on a choice of marked conformal structure on $S$, and the space of such structures is noncompact.
2. Vary over all conformal structures on $S$ and choose a harmonic map of least energy (if one exists) or find a sequence of maps that undergo a “neck pinch” as a sequence of conformal structures on $S$ degenerates. Such a neck pinch exhibits a simple curve in $S$ that is essential in $S$ but whose image is inessential in $X$; such a curve can be compressed, and the topology of $S$ simplified. Since each compression increases $\chi$, after finitely many steps the process terminates, and one obtains the desired map. This is Schoen-Yau’s method to construct a stable minimal surface representative of $S$. When the target is $3$-dimensional, the surface may be assumed to be unbranched, by a trick due to Osserman.
3. Following Thurston, pick an ideal triangulation of $S$ (i.e. a geodesic lamination of $S$ whose complementary regions are all ideal triangles); since $S$ has boundary, we may choose such a lamination by first picking a triangulation (in the ordinary sense) with all vertices on $\partial S$ and then “spinning” the vertices to infinity. Unless $\rho$ factors through a cyclic group, there is some choice of lamination so that the image of $f$ can be straightened along the lamination, and then the image spanned with $\text{CAT}(-1)$ ideal triangles to produce a pleated surface in $X$ representing $f$ (note: if $X$ has constant negative curvature, these ideal triangles can be taken to be totally geodesic). The space of pleated surfaces in fixed (closed) $X$ of given genus is compact, so this is a reasonable class of maps to work with.
4. If $G$ is merely a hyperbolic group, one can still construct pleated surfaces, not quite in $X$, but equivariantly in Mineyev’s flow space associated to $\widetilde{X}$. Here we are not really thinking of the triangles themselves, but the geodesic laminations they bound (which carry the same information).
5. If $X$ is complete and $3$-dimensional but noncompact, the space of pleated surfaces of given genus is generally not compact, and it is not always easy to find a pleated surface where you want it. This can sometimes be remedied by shrinkwrapping; one looks for a minimal/pleated/harmonic surface subject to the constraint that it cannot pass through some prescribed set of geodesics in $X$ (which act as “barriers” or “obstacles”, and force the resulting surface to end up roughly where one wants it to).

Anyway, one way or another, one can usually find a map of a surface, or a space of maps of surfaces, representing a given homomorphism, with some kind of a priori control of the geometry. Usually, this control is not enough to certify that a given map is $\pi_1$-injective, but sometimes it might be. For instance, a totally geodesic (immersed) surface in a complete manifold of constant negative curvature is always $\pi_1$-injective, and any surface whose extrinsic curvature is small enough will also be $\pi_1$-injective.

Geometric methods to certify injectivity of free or surface groups are very useful and flexible, as far as they go. Unfortunately, I know of very few topological methods to certify injectivity. By far the most important exception is the following:

Example: In $3$ dimensions, one should look for properly embedded surfaces. If $M$ is a $3$-manifold (possibly with boundary), and $S$ is a two-sided properly embedded surface, the famous Dehn’s Lemma (proved by Papakyriakopoulos) implies that either $S$ is $\pi_1$-injective, or there is an embedded essential loop in $S$ that bounds an embedded disk in $M$ on one side of $S$. Such a loop may be compressed (i.e. $S$ may be cut open along the loop, and two copies of the compressing disk sewn in) preserving the property of embeddedness, but increasing $\chi$. After finitely many steps, either $S$ compresses away entirely, or one obtains a $\pi_1$-injective surface. One way to ensure that $S$ does not compress away entirely is to start with a surface that is essential in (relative) homology; another way is to look for a surface dual to an action (of $\pi_1(M)$) on a tree. In the latter case, one can often construct quite different free subgroups in $\pi_1(M)$ by pingpong on the ends of the tree. Note by the way that this method produces closed surface subgroups as well as free subgroups. Note too that two-sidedness is essential to apply Dehn’s Lemma.

Remark: Modern $3$-manifold topologists are sometimes unreasonably indifferent to the power of Dehn’s Lemma (probably because this tool has been incorporated so fully into their subconscious?); it is worth reading Ralph Fox’s review of Papakyriakopoulos’s paper (linked above). Of this paper, he writes:

. . . it has already led to renewed attack on the problem of classifying the 3-dimensional manifolds; significant results have been and are being obtained. A complete solution has suddenly become a definite possibility.

Remember this was written more than 50 years ago — before the geometrization conjecture, before the JSJ decomposition, before the Scott core theorem, before Haken manifolds. The only reasonable reaction to this is: !!!

Example: The construction of injective surfaces by Dehn’s Lemma may be abstracted in the following way. Given a target space $X$, and a class of maps $\mathcal{F}$ of surfaces into $X$ (in some category; e.g. homotopy classes of maps, pleated surfaces, $\text{CAT}(-1)$ surfaces, etc.) suppose one can find a complexity $c:\mathcal{F} \to \mathcal{O}$ with values in some ordered set, such that if $f \in \mathcal{F}$ is not injective, one can find $f' \in \mathcal{F}$ of smaller complexity. Then if $\mathcal{O}$ is well-ordered, an injective surface may be found. If $\mathcal{O}$ is not well-ordered, one may ask at least that $c$ is upper semi-continuous on $\mathcal{F}$, and hope to extend it upper semi-continuously to some suitable compactification of $\mathcal{F}$. Even if $\mathcal{O}$ is not well-ordered, one can at least certify that a map is injective, by showing that it minimizes $c$. Here are some potential examples (none of them entirely satisfactory).

1. Given a (homologically trivial) homotopy class of loop $\gamma$ in $X$, one can look at all maps of orientable surfaces $S$ to $X$ with boundary factoring through $\gamma$. For such a surface, let $n(S)$ denote the degree with which the (possibly multiple) boundary (components) of $S$ wrap homologically around $\gamma$, and let $\chi^-(S)$ denote the sum of Euler characteristics of non-disk and non-sphere components of $S$. For each surface $S$, one considers the quantity $-\chi^-(S)/2n(S)$ (the factor of $2$ can be ignored if desired). The important feature of this quantity is that it does not change if $S$ is replaced by a finite cover. If the map $S \to X$ is not $\pi_1$-injective, let $\alpha$ be an essential loop on $S$ whose image in $X$ is inessential. Peter Scott showed that any essential loop on a surface lifts to an embedded loop in some finite cover. Hence, after passing to such a cover, $\alpha$ may be compressed, and the resulting surface $S'$ satisfies $-\chi^-(S')/2n(S') < -\chi^-(S)/2n(S)$. In other words, a global minimizer of this quantity is injective. Such a surface is called extremal. The problem is that extremal surfaces do not always exist; but this construction motivates one to look for them.
2. Given a $\text{CAT}(-1)$ surface $S$ with geodesic boundary in $X$, one can retract $S$ to a geodesic spine, and encode the surface by the resulting fatgraph, with edges labelled by homotopy classes in $X$. Since Euler characteristic is local, one does not really care precisely how the pieces of the fatgraph are assembled, but only how many pieces of what kinds are needed for a given boundary. So if only finitely many such pieces appear in some infinite family of surfaces, one can in fact construct an extremal surface as above, which is necessarily injective (more technically, one reduces the computation of Euler characteristic to a linear programming problem, finds a rational extremal solution (which corresponds to a weighted sum of pieces of fatgraph), and glues together the pieces to construct the extremal surface; one situation in which this scheme can be made to work is explained in this paper of mine). Edges can be subdivided into a finite number of possibilities, so one just needs to ensure finiteness of the number of vertex types. One condition that ensures finiteness of vertex types is the existence of a uniform constant $C>0$ so that for each surface $S$ in the given family, and for each point $p \in S$, there is an estimate $\text{dist}(p,\partial S) \le C$. If this condition is violated, one finds pairs $p_i,S_i$ which converge in the geometric topology to a point in a complete (i.e. without boundary, but probably noncompact) surface.
3. Given $S \to X$, either compress an embedded essential loop, or realize $S$ by a least area surface. If $S$ is not injective, pass to a cover, compress a loop, and realize the result by a least area surface. Repeat this process. One obtains in this way a sequence of least area surfaces in $X$ (typically of bigger and bigger genus) and there is no reason to expect the process to terminate. If $X$ is a $3$-manifold, a least area surface admits two-sided curvature bounds away from the boundary, by a theorem of Schoen (near the boundary, the negative curvature might blow up, but only in controlled ways; e.g. after rescaling about a sequence of points with the most negative curvature, one may obtain in the limit a helicoid). Away from the boundary, the family of surfaces one obtains varies precompactly in the $C^\infty$ topology, and one may obtain a complete locally least area lamination $\Lambda$ in the limit. If $\pi_1(\Lambda)$ is not injective, one can continue to pass to covers (applying a version of Scott’s theorem for infinite surfaces) and compress, and by transfinite induction, eventually arrive at a locally least area lamination with injective $\pi_1$. Of course, such a limit might well be a lamination by planes. However, the lamination one obtains is not completely arbitrary: since it is a limit of limits of . . . compact surfaces, one can choose a limit that admits a nontrivial invariant transverse measure (one must be careful here, since the lamination will typically have boundary). Or, as in item 2 above, one may insist that this limit lamination is complete (i.e. without boundary).

It is more tricky to find a limit lamination as in 3. without boundary and admitting an invariant transverse measure; in any case, this motivates the following:

Question: Is there a closed hyperbolic $3$-manifold $M$ which admits a locally least area transversely measured complete immersed lamination $\Lambda$, all of whose leaves are disks? (note that the answer is negative if one asks for the lamination to be embedded (there are several easy proofs of this fact)).

Secretly, the function that assigns $\inf_S -\chi^-(S)/2n(S)$ to a homologically trivial loop $\gamma$ is the stable commutator length of the conjugacy class in $\pi_1(X)$ represented by $\gamma$. Extremal surfaces can sometimes be certified by constructing certain functions on $\pi_1(X)$ called homogeneous quasimorphisms, but a discussion of such functions will have to wait for another post.
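
The bookkeeping behind the quantity $-\chi^-(S)/2n(S)$ is simple enough to sketch in Python. The helper names below are hypothetical, and `compress` makes the simplifying assumption that the compression creates no new disk or sphere components, so that $\chi^-$ goes up by exactly $2$. The example starts from a once-punctured torus wrapping once around $\gamma$, which is also an assumption chosen for illustration.

```python
from fractions import Fraction

def ratio(chi_minus, n):
    """The quantity -chi^-(S) / 2 n(S), as an exact rational number."""
    return Fraction(-chi_minus, 2 * n)

def finite_cover(chi_minus, n, degree):
    """Pass to a cover of the given degree: chi^- and n both scale by the degree."""
    return chi_minus * degree, n * degree

def compress(chi_minus, n):
    """Compress one embedded essential loop (assumed to create no disks or spheres)."""
    return chi_minus + 2, n

surface = (-1, 1)                       # once-punctured torus, n(S) = 1
cover = finite_cover(*surface, degree=3)
compressed = compress(*cover)

print(ratio(*surface), ratio(*cover), ratio(*compressed))
```

The first two values coincide, reflecting invariance under finite covers, while the third is strictly smaller; this is the mechanism by which a non-injective surface fails to be extremal.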