
Hermann Amandus Schwarz (1843-1921) was a student of Kummer and Weierstrass, and made many significant contributions to geometry, especially to the fields of minimal surfaces and complex analysis. His mathematical creations are both highly abstract and flexible, and at the same time intimately tied to explicit and practical calculation.

I learned about Schwarz-Christoffel transformations, Schwarzian derivatives, and Schwarz’s minimal surface as three quite separate mathematical objects, and I was very surprised to discover firstly that they had all been discovered by the same person, and secondly that they form parts of a consistent mathematical narrative, which I will try to explain in this post to the best of my ability. There is an instructive lesson in this example (for me), that we tend to mine the past for nuggets, examples, tricks, formulae etc. while forgetting the points of view and organizing principles that made their discovery possible. Another teachable example is that of Dehn’s “invention” of combinatorial (infinite) group theory, as a natural branch of geometry; several generations of followers went about the task of reformulating Dehn’s insights and ideas in the language of algebra, “generalizing” them and stripping them of their context, before geometric and topological methods were reintroduced by Milnor, Schwarz (a different one this time), Stallings, Thurston, Gromov and others to spectacular effect (note: I have the second-hand impression that the geometric point of view in group theory (and every other subject) was never abandoned in the Soviet Union).

Schwarz’s minimal surface (also called “Schwarz’s D surface”, and sometimes “Schwarz’s H surface”) is an extraordinarily beautiful triply-periodic minimal surface of infinite genus that is properly embedded in \mathbb{R}^3. According to Nitsche’s excellent book (p.240), this minimal surface closely resembles the separating wall between inorganic and organic materials in the skeleton of a starfish. The basic building block of the surface can be described as follows. If the vertices of a cube are 2-colored, the black vertices are the vertices of a regular tetrahedron. Let Q denote the quadrilateral formed by four edges of this tetrahedron; then a fundamental piece S of Schwarz’s surface is a minimal disk spanning Q:


The surface may be “analytically continued” by rotating Q through an angle \pi around each boundary edge. Six copies of Q fit smoothly around each vertex, and the resulting surface extends (triply) periodically throughout space.

The symmetries of Q enable us to give it several descriptions as a Riemann surface. Firstly, we could think of Q as a polygon in the hyperbolic plane with four edges of equal length, and angles \pi/3. Twelve copies of Q can be assembled to make a hyperbolic surface \Sigma of genus 3. Thinking of a surface of genus 3 as the boundary of a genus 3 handlebody defines a homomorphism from \pi_1(\Sigma) to \mathbb{Z}^3, thought of as H_1(\text{handlebody}); the cover \widetilde{\Sigma} associated to the kernel is (conformally) the triply periodic Schwarz surface, and the deck group acts on \mathbb{R}^3 as a lattice (of index 2 in the face-centered cubic lattice).

Another description is as follows. Since the deck group acts by translation, the Gauss map from \widetilde{\Sigma} to S^2 factors through a map \Sigma \to S^2. The map is injective at each point in the interior or on an edge of a copy of Q, but has an order 2 branch point at each vertex. Thus, the map \Sigma \to S^2 is a double-branched cover, with one branch point of order 2 at each vertex of a regular inscribed cube. This leads one to think (like a late 19th century mathematician) of \Sigma as the Riemann surface on which a certain multi-valued function on S^2 = \mathbb{C} \cup \infty is single-valued. Under stereographic projection, the vertices of the cube map to the eight points \lbrace \alpha,i\alpha,-\alpha,-i\alpha,1/\alpha,i/\alpha,-1/\alpha,-i/\alpha \rbrace where \alpha = (\sqrt{3}-1)/\sqrt{2}. These eight points are the roots of the polynomial w^8 - 14w^4 + 1, so we may think of \Sigma as the hyperelliptic Riemann surface defined by the equation v^2 = w^8 - 14w^4 + 1; equivalently, as the surface on which the multi-valued (on \mathbb{C} \cup \infty) function R(w):= 1/v=1/\sqrt{w^8 - 14w^4 + 1} is single-valued.
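As a quick numerical sanity check on the branch points (a sketch in plain Python, not part of the original argument; the names `alpha`, `p` and `branch_points` are just illustrative), one can confirm that the eight listed points are roots of w^8 - 14w^4 + 1, using the fact that \alpha^4 = 7 - 4\sqrt{3} is a root of t^2 - 14t + 1:

```python
import math

# alpha as given in the text; the eight branch points are i^k * alpha and i^k / alpha
alpha = (math.sqrt(3) - 1) / math.sqrt(2)

def p(w):
    """The polynomial w^8 - 14 w^4 + 1 from the text."""
    return w**8 - 14 * w**4 + 1

branch_points = [1j**k * alpha for k in range(4)] + [1j**k / alpha for k in range(4)]

# |p(w)| at each candidate root; these should all vanish up to rounding error
residuals = [abs(p(w)) for w in branch_points]

# alpha^2 = 2 - sqrt(3), hence alpha^4 = 7 - 4 sqrt(3)
alpha_fourth_gap = abs(alpha**4 - (7 - 4 * math.sqrt(3)))

print(max(residuals), alpha_fourth_gap)
```

The residuals at the four points of modulus 1/\alpha are slightly larger than the others, since the two leading terms there are of size about 194 and nearly cancel.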

The function R(w) is known as the Weierstrass function associated to \Sigma, and explicit formulae for the co-ordinates of the embedding \widetilde{\Sigma} \to \mathbb{R}^3 were found by Enneper and Weierstrass. After picking a basepoint (say 0) on the sphere, the coordinates are given by integration:

x = \text{Re} \int_0^{w_0} \frac{1}{2}(1-w^2)R(w)dw

y = \text{Re} \int_0^{w_0} \frac{i}{2}(1+w^2)R(w)dw

z = \text{Re} \int_0^{w_0} wR(w)dw

The integral in each case depends on the path, and lifts to a single-valued function precisely on \widetilde{\Sigma}.
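To make the integral formulae concrete, here is a minimal sketch that evaluates them numerically along a straight segment from 0 to w_0. It makes several assumptions that are mine rather than the text's: the principal branch of the square root, a composite Simpson rule, and |w_0| kept well inside the disk of radius \alpha \approx 0.518 so that no branch point is approached. The function names are hypothetical.

```python
import cmath

def R(w):
    """Weierstrass function 1/sqrt(w^8 - 14 w^4 + 1), principal branch.
    Only trusted for |w| well below alpha ~ 0.518, where the radicand stays near 1."""
    return 1 / cmath.sqrt(w**8 - 14 * w**4 + 1)

def integrands(w):
    """The three integrands of the Enneper-Weierstrass formulae at w."""
    r = R(w)
    return (0.5 * (1 - w**2) * r, 0.5j * (1 + w**2) * r, w * r)

def surface_point(w0, n=200):
    """Approximate (x, y, z) by composite Simpson's rule along the straight
    segment from 0 to w0 (n must be even); h is a complex step."""
    h = w0 / n
    totals = [0j, 0j, 0j]
    for k in range(n + 1):
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        for j, val in enumerate(integrands(k * h)):
            totals[j] += weight * val
    return tuple((t * h / 3).real for t in totals)

a = surface_point(0.3)    # basepoint-to-0.3 along the real axis
b = surface_point(0.3j)   # the same distance along the imaginary axis
print(a, b)
```

Along the real axis the second integrand is purely imaginary, so y vanishes there; and since R(iw) = R(w), the point over 0.3i comes out as (0, -x, -z) in terms of the point (x, 0, z) over 0.3, a small instance of the symmetry of the surface.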

Geometrically, the three coordinate functions x,y,z are harmonic functions on \widetilde{\Sigma}. This corresponds to the fact that minimal surfaces are precisely those with vanishing mean curvature, and the fact that the Laplacian of the coordinate functions (in terms of isothermal parameters on the underlying Riemann surface) can be expressed as a nonzero multiple of the mean curvature vector. A harmonic function on a Riemann surface is the real part of a holomorphic function, unique up to a constant; the holomorphic derivatives of the (complexified) coordinate functions are therefore well-defined, and give holomorphic 1-forms \phi_1,\phi_2,\phi_3 which descend to \Sigma (since the deck group acts by translations). These 1-forms satisfy the identity \sum_i \phi_i^2 = 0 (this identity expresses the fact that the embedding of \widetilde{\Sigma} into \mathbb{R}^3 via these functions is conformal). The (composition of the) Gauss map (with stereographic projection) can be read off from the \phi_i, and as a meromorphic function on \Sigma, it is given by the formula w = \phi_3/(\phi_1 - i\phi_2). Define a function f on \Sigma by the formula fdw = \phi_1 - i\phi_2. Then 1/f,w are the coordinates of a rational map from \Sigma into \mathbb{C}^2 which extends to a map into \mathbb{CP}^2, by sending each zero of f to wf = \phi_3/dw in the \mathbb{CP}^1 at infinity. Symmetry allows us to identify the image with the hyperelliptic embedding from before, and we deduce that f=R(w). Solving for \phi_1,\phi_2 we obtain the integrands in the formulae above.
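The algebra relating the \phi_i, the Gauss map w and the function f is pointwise, so it can be checked at a sample point with plain complex arithmetic. The sketch below assumes the integrands from the formulae above as the coefficients of dw in \phi_1,\phi_2,\phi_3; the helper `weierstrass_forms` and the sample values are hypothetical, and r stands for an arbitrary nonzero value of R(w).

```python
def weierstrass_forms(w, r):
    """Coefficients of dw in phi_1, phi_2, phi_3, given w and the value r = R(w)."""
    return (0.5 * (1 - w**2) * r, 0.5j * (1 + w**2) * r, w * r)

# Arbitrary sample data: the identities hold for any w and any nonzero r
samples = [(0.2 + 0.1j, 1.3 - 0.4j), (-0.7 + 2.0j, 0.5j), (3.0 + 0.0j, -2.0 + 1.0j)]

conformality_defects = []  # |phi_1^2 + phi_2^2 + phi_3^2|
gauss_map_defects = []     # |phi_3 / (phi_1 - i phi_2) - w|
f_defects = []             # |(phi_1 - i phi_2) - r|

for w, r in samples:
    p1, p2, p3 = weierstrass_forms(w, r)
    conformality_defects.append(abs(p1**2 + p2**2 + p3**2))
    gauss_map_defects.append(abs(p3 / (p1 - 1j * p2) - w))
    f_defects.append(abs((p1 - 1j * p2) - r))

print(max(conformality_defects), max(gauss_map_defects), max(f_defects))
```

All three defects vanish up to rounding, which is just the statement that \phi_1 - i\phi_2 = R(w)dw and \phi_3 = wR(w)dw.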

In fact, any holomorphic function R(w) on a domain in \mathbb{C} defines a (typically immersed with branch points) minimal surface, by the integral formulae of Enneper-Weierstrass above. Suppose we want to use this fact to produce an explicit description of a minimal surface bounded by some explicit polygonal loop in \mathbb{R}^3. Any minimal surface so obtained can be continued across the boundary edges by rotation; if the angles at the vertices are all of the form \pi/n the resulting surface closes up smoothly around the vertices, and one obtains a compact abstract Riemann surface \Sigma tiled by copies of the fundamental region, together with a holonomy representation of \pi_1(\Sigma) into \text{Isom}^+(\mathbb{R}^3). Sometimes the image of this representation in the rotational part of \text{Isom}^+(\mathbb{R}^3) is finite, and one obtains an infinitely periodic minimal surface as in the case of Schwarz’s surface. A fundamental tile in \Sigma can be uniformized as a hyperbolic polygon; equivalently, as a region in the upper half-plane bounded by arcs of semicircles perpendicular to the real axis. Since the edges of the loop are straight lines, the image of this hyperbolic polygon under the Gauss map is a region in S^2 also bounded by arcs of round circles; thus Schwarz’s study of minimal surfaces naturally led him to the problem of how to explicitly describe conformal maps between regions in the plane bounded by circular arcs. This problem is solved by the Schwarz-Christoffel transformation, and its generalizations, with help from the Schwarzian derivative.

Note that if P and Q are two such regions, then a conformal map from P to Q can be factored as the product of a map uniformizing P as the upper half-plane, followed by the inverse of a map uniformizing Q as the upper half-plane. So it suffices to find a conformal map when the domain is the upper half plane, decomposed into intervals and rays that are mapped to the edges of a circular polygon Q. Near each vertex, Q can be moved by a fractional linear transformation z \to (az+b)/(cz+d) to (part of) a wedge, consisting of complex numbers with argument between 0 and \alpha, where \alpha is the interior angle of Q at that vertex. The function f(z) = z^{\alpha/\pi} uniformizes the upper half-plane as such a wedge; however it is not clear how to combine the contributions from each vertex, because of the complicated interaction with the fractional linear transformation. The fundamental observation is that there are certain natural holomorphic differential operators which are insensitive to the composition of a holomorphic function with groups of fractional linear transformations, and the uniformizing map can be expressed much more simply in terms of such operators.

For example, two functions that differ by addition of a constant have the same derivative: f' = (f+c)'. Functions that differ by multiplication by a constant have the same logarithmic derivative: (\log(f))' = (\log(cf))'. Putting these two observations together suggests defining the nonlinearity of a function as the composition N(f):= (\log(f'))' = f''/f'. This has the property that N(af+b) = N(f) for any constants a,b with a \ne 0. Under inversion z \to 1/z the nonlinearity transforms by N(1/f) = N(f) - 2f'/f. From this, and a simple calculation, one deduces that the operator N' - N^2/2 is invariant under inversion, and since it is also invariant under addition and multiplication by constants, it is invariant under the full group of fractional linear transformations. This combination is called the Schwarzian derivative; explicitly, it is given by the formula S(f) = f'''/f' - \frac{3}{2}(f''/f')^2. Given the Schwarzian derivative S(f), one may recover the nonlinearity N(f) by solving the Riccati equation N' - N^2/2 - S = 0. As explained in this post, solutions of the Riccati equation preserve the projective structure on the line; in this case, it is a complex projective structure on the complex line. Equivalently, different solutions differ by an element of \text{PSL}(2,\mathbb{C}), acting by fractional linear transformations, as we have just deduced. Once we know the nonlinearity, we can solve for f by f = \int e^{\int N}, the usual solution to a first order linear ODE (here f'' = Nf', thought of as an equation for f'). The Schwarzian of the function z^{\alpha/\pi} is (1-\alpha^2/\pi^2)/(2z^2). The advantage of expressing things in these terms is that the Schwarzian of a uniformizing map for a circular polygon Q with angles \alpha_i at the vertices has the form of a rational function, with principal parts a_i/(z-z_i)^2 + b_i/(z-z_i), where the a_i = (1-\alpha_i^2/\pi^2)/2 and the b_i and z_i depend (unfortunately in a very complicated way) on the edges of Q (for the ugly truth, see Nehari, chapter 5). 
To see this, observe that S(f) has a pole of order two at each of finitely many points z_i (the preimages of the vertices of Q under the uniformizing map) but is otherwise holomorphic. Moreover, the map f can be analytically continued into the lower half plane across the interval between successive z_i, by reflecting the image across each circular edge. After reflecting twice, the image of Q is transformed by a fractional linear transformation, so S(f) has an analytic continuation which is single valued on the entire Riemann sphere, with finitely many isolated poles, and is therefore a rational function! When the edges of the polygon are straight, a simpler formula involving the nonlinearity specializes to the “familiar” Schwarz-Christoffel formula.
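The invariance properties above are easy to test numerically. The following is a minimal sketch under my own assumptions, not a definitive implementation: a function is represented by its value and first three derivatives at a single point, the power z^k uses Python's principal branch, and post-composition with a fractional linear transformation is carried out by the chain rule. The names `schwarzian`, `power_jet` and `mobius_after` are hypothetical.

```python
def schwarzian(d1, d2, d3):
    """S(f) = f'''/f' - (3/2)(f''/f')^2 from the first three derivatives at a point."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def power_jet(k, z):
    """Value and first three derivatives of z^k at z (principal branch)."""
    return (z**k, k * z**(k - 1), k * (k - 1) * z**(k - 2),
            k * (k - 1) * (k - 2) * z**(k - 3))

def mobius_after(jet, a, b, c, d):
    """Value and first three derivatives of M(f), M(u) = (au+b)/(cu+d), by the chain rule."""
    f0, f1, f2, f3 = jet
    det = a * d - b * c
    q = c * f0 + d
    m1 = det / q**2                 # M'(f0)
    m2 = -2 * c * det / q**3        # M''(f0)
    m3 = 6 * c**2 * det / q**4      # M'''(f0)
    return ((a * f0 + b) / q, m1 * f1, m2 * f1**2 + m1 * f2,
            m3 * f1**3 + 3 * m2 * f1 * f2 + m1 * f3)

z = 0.7 + 0.4j        # a sample point in the upper half-plane
k = 1 / 3             # k = alpha/pi for a vertex angle alpha = pi/3
jet = power_jet(k, z)

s_power = schwarzian(*jet[1:])
s_expected = (1 - k**2) / (2 * z**2)                         # formula quoted in the text
s_mobius = schwarzian(*mobius_after(jet, 2, 1j, 1, 3)[1:])   # arbitrary sample Möbius map

inv = mobius_after(jet, 0, 1, 1, 0)                          # the inversion u -> 1/u
n_inverted = inv[2] / inv[1]                                 # N(1/f)
n_expected = jet[2] / jet[1] - 2 * jet[1] / jet[0]           # N(f) - 2f'/f

print(s_power, s_expected, s_mobius)
```

With k = 1/3 the coefficient (1-k^2)/2 equals 4/9, which is the value of a_i at a vertex of angle \pi/3.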

(Update 10/22): In fact, I went to the library to refresh myself on the contents of Nehari, chapter 5. The first thing I noticed — which I had forgotten — was that if f is the uniformizing map from the upper half-plane to a polygon Q with spherical arcs, then S(f) is real-valued on the real axis. Since it is a rational function, this implies that its nonsingular part is actually a constant; i.e.

S(f) = \sum_i \left( \frac{a_i}{(z-z_i)^2} + \frac{b_i}{z-z_i} \right) + c

where a_i is as above, and z_i,b_i,c are real constants (which satisfy some further conditions — really see Nehari this time for more details).

The other thing that struck me was the first paragraph of the preface, which touches on some of the issues I alluded to above:

In the preface to the first edition of Courant-Hilbert’s “Methoden der mathematischen Physik”, R. Courant warned against a trend discernible in modern mathematics in which he saw a menace to the future development of mathematical analysis. He was referring to the tendency of many workers in this field to lose sight of the roots of mathematical analysis in physical and geometric intuition and to concentrate their efforts on the refinement and the extreme generalization of existing concepts.

Instead of using a word like “menace”, I would rather take this as a lesson about the value of returning to the points of view that led to the creation of the mathematical objects we study every day; which was (to some approximation) the point I was trying to illustrate in this post.

Quadratic forms (i.e. homogeneous polynomials of degree two) are fundamental mathematical objects. For the ancient Greeks, quadratic forms manifested in the geometry of conic sections, and in Pythagoras’ theorem. Riemann recognized the importance of studying abstract smooth manifolds equipped with a field of infinitesimal quadratic forms (i.e. a Riemannian metric), giving rise to the theory of Riemannian manifolds. In contrast to more general norms, an inner product on a vector space enjoys a big group of symmetries; thus infinitesimal Riemannian geometry inherits all the richness of the representation theory of orthogonal groups, which organizes the various curvature tensors and Weitzenböck formulae. It is natural that quadratic forms should come up in so many distinct ways in differential geometry: one uses calculus to approximate a smooth object near some point by a linear object, and the “difference” is a second-order term, which can often be interpreted as a quadratic form. For example:

  1. If M is a Riemannian manifold, at any point p one can choose an orthonormal frame for T_p M, and exponentiate to obtain geodesic normal co-ordinates. In such local co-ordinates, the metric tensor g_{ij} satisfies g_{ij}(p)=\delta_{ij} and \partial_kg_{ij}(p) = 0. The second order derivatives can be expressed in terms of the Riemann curvature tensor at p.
  2. If S is an immersed submanifold of Euclidean space, at every point p \in S there is a unique linear subspace that is tangent to S at p. The second order difference between these two spaces is measured by the second fundamental form of S, a quadratic form (with coefficients in the normal bundle) whose eigenvectors are the directions of (extrinsic) principal curvature. If S has codimension one, the second fundamental form is easily described in terms of the Gauss map g: S \to S^{n-1} taking each point on S to the unique unit normal to S at that point, and using the flatness of the ambient Euclidean space to identify the normal spheres at different points with “the” standard sphere. The second fundamental form is then defined by the formula II(v,w) = \langle dg(v),w \rangle. For higher codimension, one considers Gauss maps with values in an appropriate Grassmannian.
  3. If f is a smooth function on a manifold M, a critical point p of f is a point at which df=0 (i.e. at which all the partial derivatives of f in some local coordinates vanish). At such a point, one defines the Hessian Hf, which is a quadratic form on T_pM, determined by the second partial derivatives of f at such a point. If \nabla is a Levi-Civita connection on T^*M (determined by a Riemannian metric on M compatible with the smooth structure) then Hf = \nabla df. The condition that the Levi-Civita connection is torsion-free translates into the fact that the antisymmetric part of \nabla \theta is equal to d\theta for any 1-form \theta; in this context, this means that the antisymmetric part of the Hessian vanishes, i.e. that it is symmetric (and therefore a quadratic form). If \nabla' is a different connection, then \nabla' df = \nabla df + \alpha \wedge df for some 1-form \alpha, and therefore their values at p agree, and Hf is well-defined, independent of a choice of metric.

By contrast, cubic forms are less often encountered, either in geometry or in other parts of mathematics; their appearance is often indicative of unusual richness. For example: Lie groups arise as the subgroups of automorphisms of vector spaces preserving certain structure. Orthogonal and symplectic groups are those that preserve certain (symmetric or alternating) quadratic forms. The exceptional Lie group G_2 is the group of automorphisms of \mathbb{R}^7 that preserves a generic (i.e. nondegenerate) alternating 3-form. One expects to encounter cubic forms most often in flavors of geometry in which the local transformation pseudogroups are bigger than the orthogonal group.

One example is that of 1-dimensional complex projective geometry. If U is a domain in the Riemann sphere, one can think of U as a geometric space in at least two natural ways: by considering the local pseudogroup of all holomorphic self-maps between open subsets of the Riemann sphere, restricted to U (i.e. all holomorphic functions), or by considering only those holomorphic maps that extend to the entire Riemann sphere (i.e. the projective transformations: z \to \frac {az+b} {cz+d}). The difference between these two geometric structures is measured by a third-order term, called the Schwarzian derivative. If U is homeomorphic to a disk, then we can think of U as the image of the round unit disk D under a uniformizing map f. At every point p \in D there is a unique projective transformation f_p that osculates to f to second order at p (i.e. has the same value, first derivative, and second derivative as f at the point p); the (scaled) third derivative is the Schwarzian of f at p. In local co-ordinates, Sf = f'''/f' - \frac {3} {2} \left( f''/f'\right)^2. Actually, although the Schwarzian is sensitive to third-order information, it should really be thought of as a quadratic form on the (one-dimensional) complex tangent space to p.
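Here is a small numerical illustration of the osculation statement, under assumptions of my own choosing: f = exp (so that f, f', f'' and f''' all coincide at p), and the osculating projective transformation written in the explicit form M(z) = f(p) + f'(p)t/(1 - ct) with t = z - p and c = f''(p)/(2f'(p)). The gap f - M should then begin with the term f'(p)S(f)(p)t^3/6; the helper names are hypothetical.

```python
import cmath

def osculating_mobius(f0, f1, f2, p):
    """Fractional linear map agreeing with the 2-jet (f0, f1, f2) at p:
    M(z) = f0 + f1*t/(1 - c*t), where t = z - p and c = f2/(2*f1)."""
    c = f2 / (2 * f1)
    def M(z):
        t = z - p
        return f0 + f1 * t / (1 - c * t)
    return M

def schwarzian(d1, d2, d3):
    """S(f) = f'''/f' - (3/2)(f''/f')^2 from the first three derivatives at a point."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

p = 0.3 + 0.2j
e = cmath.exp(p)                 # exp and all of its derivatives take this value at p
M = osculating_mobius(e, e, e, p)

t = 1e-2                         # small displacement from p
third_order_gap = (cmath.exp(p + t) - M(p + t)) / t**3
predicted = e * schwarzian(e, e, e) / 6   # f'(p) * S(f)(p) / 3!

print(third_order_gap, predicted)
```

The two numbers agree up to a relative error of order t, which reflects the neglected fourth order term. For the exponential function the Schwarzian is the constant -1/2.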

Real projective geometry gives rise to similar invariants. Consider an immersed curve in the (real projective) plane. At every point, there is a unique osculating conic, that agrees with the immersed curve to second order. The projective curvature (really a cubic form) measures the third order deviation between these two immersed submanifolds at this point. See e.g. the book by Ovsienko and Tabachnikov for more details.

Another example is the so-called symplectic curvature. Let X be a flat symplectic space; this could be ordinary Euclidean space \mathbb{R}^{2n} with its standard symplectic form, or a quotient of such a space by a discrete group of translations. A linear subspace \pi of \mathbb{R}^{2n} through the origin is a Lagrangian subspace if it has (maximal) dimension n, and the restriction of the symplectic form to \pi is identically zero. A smooth submanifold L of dimension n is Lagrangian if its tangent space at every point is a Lagrangian subspace. A Lagrangian submanifold of a flat symplectic space inherits a natural cubic form on the tangent space at every point, which can be defined in any of the following equivalent ways:

  1. If W is a symplectic manifold and L is a Lagrangian submanifold, then near any point p one can find a neighborhood U and choose symplectic coordinates so that U is symplectomorphic to a neighborhood of some point in T^*L. Moreover, every other Lagrangian submanifold L' sufficiently close (in C^1) to L can be taken in some possibly smaller neighborhood to be of the form df, where f is a smooth function on L (well-defined up to a constant), thought of as a section of T^*L. In the context above, choose local symplectic coordinates (by a linear symplectic transformation) for which the flat space looks locally like T^*\pi and L looks locally like df. The condition that \pi and L are tangent at the origin means that the 2-jet of f vanishes. The first nonvanishing terms are the third partial derivatives of f, which can be thought of as the coefficients of a (symmetric) cubic form on \pi.
  2. If we choose a Euclidean metric on X compatible with the flat symplectic structure, the second fundamental form of L at some point is a quadratic form on \pi with coefficients in the normal bundle to \pi. The symplectic form identifies the normal \pi^\perp to \pi with the dual \pi^*, so by contracting indices, one obtains a cubic form on \pi. This form does not depend on the choice of Euclidean metric, since a different metric skews the normal bundle \pi^\perp replacing it with \pi^\perp + \alpha\pi. But since \pi is Lagrangian, the identification of this normal bundle with \pi^* is insensitive to the skewed term, and therefore independent of the choices.
  3. The space of all Lagrangian subspaces \Lambda of \mathbb{R}^{2n} is a symmetric space, homeomorphic to U(n)/O(n), sometimes called the Shilov boundary of the Siegel upper half-space. If \pi \in \Lambda and \pi'_0 is a tangent vector to \pi in \Lambda, then one obtains a symmetric quadratic form on \pi in the following way. If \sigma is a transverse Lagrangian to \pi, and \pi_t is a 1-parameter family of Lagrangians starting at \pi, then for small t the Lagrangians \pi_t and \sigma are transverse, and span \mathbb{R}^{2n}. For any v \in \mathbb{R}^{2n} there is a unique decomposition v = v(\pi_t) + v(\sigma). Define q_t(v,w) = \omega(v(\pi_t),w(\sigma)). Then q'_0 is a symmetric bilinear form that vanishes on \sigma, and therefore descends to a form on \pi that depends only on \pi'_0. A Lagrangian submanifold L maps to \Lambda by the Gauss map g. One obtains a cubic form on \pi associated to L as follows: if u,v,w \in \pi then dg(u) is a tangent vector to \pi in \Lambda, and therefore determines a quadratic form on \pi; this form is then evaluated on the vectors v,w.
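In the simplest case n = 1 the first and third descriptions can be compared by hand, and the following sketch does so numerically. Everything specific here is an assumption made for illustration: the symplectic form is dx \wedge dy on \mathbb{R}^2, \pi is the x-axis, \sigma is the y-axis, L is the graph of f' for the sample function f(x) = x^3 + x^4 (so the 2-jet of f vanishes at 0 and f'''(0) = 6), and the t-derivative of q_t is taken by a central difference. With these conventions the cubic form comes out as -f'''(0)uvw; the overall sign depends on the conventions chosen.

```python
def omega(v, w):
    """Standard symplectic form dx ^ dy on R^2 (a convention chosen for this sketch)."""
    return v[0] * w[1] - v[1] * w[0]

def split(v, slope):
    """Decompose v = v_pi + v_sigma, where pi_t is the line of the given slope
    through the origin and sigma is the y-axis."""
    v_pi = (v[0], slope * v[0])
    v_sigma = (0.0, v[1] - slope * v[0])
    return v_pi, v_sigma

def q(v, w, slope):
    """q_t(v, w) = omega(v(pi_t), w(sigma)) for the Lagrangian line pi_t of the given slope."""
    v_pi, _ = split(v, slope)
    _, w_sigma = split(w, slope)
    return omega(v_pi, w_sigma)

def f2(x):
    """Second derivative of the sample generating function f(x) = x^3 + x^4;
    this is the slope of the tangent line to L = graph(f') over the point x."""
    return 6 * x + 12 * x**2

def cubic_form(u, v, w, h=1e-5):
    """Move along L in the direction u, so that pi_t has slope f2(t*u); differentiate
    q_t at t = 0 by a central difference and evaluate on v, w in pi (the x-axis)."""
    V, W = (v, 0.0), (w, 0.0)
    return (q(V, W, f2(h * u)) - q(V, W, f2(-h * u))) / (2 * h)

c_111 = cubic_form(1.0, 1.0, 1.0)
c_235 = cubic_form(2.0, 3.0, 5.0)
c_523 = cubic_form(5.0, 2.0, 3.0)
print(c_111, c_235, c_523)
```

The value on (1,1,1) is -6 = -f'''(0), and the values on (2,3,5) and (5,2,3) coincide, as they must for a symmetric cubic form (in dimension one this is of course automatic).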

One application of symplectic curvature is to homological mirror symmetry, where the symplectic curvature associated to a Lagrangian family of Calabi-Yau 3-folds Y in H^3(Y) determines the so-called “Yukawa 3-differential”, whose expression in a certain local coordinate gives the generating function for the number of rational curves of degree d in a generic quintic hypersurface in \mathbb{CP}^4. This geometric picture is described explicitly in the work of Givental (e.g. here). In another more recent paper, Givental shows how the topological recursion relations, the string equation and the dilaton equation in Gromov-Witten theory can be reformulated in terms of the geometry of a certain Lagrangian cone in a formal loop space (the geometric property of this cone is that it is overruled — i.e. each tangent space L is tangent to the cone exactly along zL, where z is a formal variable). This geometric condition translates into properties of the symplectic curvature of the Lagrangian cone, from which one can read off the “gravitational descendents” in the theory (let me add that this subject is quite far from my area of expertise, and that I come to this material as an interested outsider).

Cubic forms occur naturally in other “special” geometric contexts, e.g. holomorphic symplectic geometry (Rozansky-Witten invariants), affine differential geometry (related to the discussion of the Schwarzian above), etc. Each of these contexts is the start of a long story, which is best kept for another post.

