The title of this post is a nod to the excellent and well-known Div, grad, curl and all that by Harry Schey (and perhaps also to the lesser-known sequel to one of the more consoling histories of Great Britain), and the purpose is to explain how to generalize these differential operators (familiar to electrical engineers and undergraduates taking vector calculus) and a few other ones from Euclidean 3-space to arbitrary Riemannian manifolds. I have a complicated relationship with the subject of Riemannian geometry; when I reviewed Dominic Joyce’s book Riemannian holonomy groups and calibrated geometry for SIAM Review a few years ago, I began my review with the following sentences:

Riemannian manifolds are not primitive mathematical objects, like numbers, or functions, or graphs. They represent a compromise between local Euclidean geometry and global smooth topology, and another sort of compromise between precognitive geometric intuition and precise mathematical formalism.

Don’t ask me precisely what I meant by that; rather observe the repeated use of the key word compromise. The study of Riemannian geometry is — at least to me — fraught with compromise, a compromise which begins with language and notation. On the one hand, one would like a language and a formalism which treats Riemannian manifolds on their own terms, without introducing superfluous extra structure, and in which the fundamental objects and their properties are highlighted; on the other hand, in order to actually compute or to use the all-important tools of vector calculus and analysis one must introduce coordinates, indices, and cryptic notation which trips up beginners and experts alike.

Actually, my complicated relationship began the first time I was introduced to vectors. It was 1986, I was at a training camp for Australian mathematics olympiad hopefuls, and Ben Robinson gave me a 2 minute introduction to the subject over lunch. I found the notation overwhelming, and there was no connection in my mind between the letters and subscripts on one side of the page and the squiggly arrows and parallelograms on the other side. By the time the subject came up again a few years later in high school, somehow the mystery had faded, and the vocabulary and meaning of vectors, inner products, determinants etc. was crystal clear. I think that the difference this time around was that I concentrated first on learning what vectors were, and only when I had gotten the point did I engage with the question of how to represent them or calculate with them. In a similar way, my introduction to div, grad and curl was equally painless, since we learned the subject in physics class (in the last couple of years of high school) in the context of classical electrodynamics. I might have been challenged to grasp the abstract idea of a “vector field” as it is introduced in some textbooks, but those little pictures of lines of force running from positive to negative charges made immediate and intuitive sense. In fact, the whole idea of describing a vector field as a partial differential operator such as \frac {\partial} {\partial x_i} obscures an enormous complexity; it’s easy enough to compute with an expression like this, but as a mathematical object itself it is quite sophisticated, since even to define it we need not just one coordinate x_i but an entire system of coordinates on some nearby smooth patch. Contrast this with the intuitive idea of a particle moving along a line of force, and being subjected to some influence which varies along the trajectory. 
I’m grateful to whoever designed the Melbourne high school science curriculum in the late 1980s for integrating the maths and physics curricula so successfully.

A few years later, as an undergraduate at the University of Melbourne, I was attending Marty Ross’ reading group as we attempted to go through Cheeger and Ebin’s Comparison theorems in Riemannian geometry, and the confusion was back. Noel Hicks’ MathSciNet review calls this book a “tight, elegant, and delightful addition to the literature on global Riemannian geometry”, although he remarks that the “tightness of the exposition and a few misprints leave the reader with some challenging work”. Today I love this book, and recommend it to anyone; but at the time it was a terrible book to learn Riemannian geometry from for the first time (actually, since I was not a maths major, my confusion was amplified by many gaps in my intermediate education). Some aspects of the book I could appreciate — at least we were not drowning in indices, and the formulae were almost readable. But I was simply at a loss to understand the rules of the game — what sort of manipulations of formulae were allowed? how do you contract a vector field with a form? why am I allowed to choose coordinates at this point so that everything magically simplifies? how would anyone ever stumble on the formula for the Ricci curvature and see that it was invariant and had such nice properties? and so on.

And yet again, the passage of a couple of years made a world of difference. As a graduate student at Berkeley taking classes from Shoshichi Kobayashi and Sasha Givental, suddenly everything made sense (well, not everything, but at least the rudiments of Riemannian geometry). The difference again was that the notation and the calculations followed a discussion of what the objects were, and what information they contained and why you might want to use them or talk about them. And, crucially, this initial discussion was carried out first informally in words rather than by beginning with a formal definition or a formula.

So with this backstory in mind, I hope it might be useful to the graduate student out there who is struggling with the elements of the tensor calculus to go through a brief informal discussion of the meaning of some of the basic differential operators, which are the ingredients out of which much of the beauty of the subject can be synthesized.

Let’s get down to brass tacks. We start with a smooth manifold M and a vector field X. What is a vector field? I always think of it dynamically as a flow: the manifold is something like a fluid, and an object in M will be swept along by this flow and moved along the flowlines, or integral curves of the vector field. On a smooth manifold without a metric it doesn’t make sense to talk about whether the flow is moving “fast” or “slow”, but it does make sense to look at the places where it is stationary (the zeros of the vector field) and see whether the zeros are isolated or not, stable or unstable, or come in families. If f is a smooth function on M, the value of f varies along the integral curves of the vector field, and we can look at the rate at which the value changes; this is the derivative of f in the direction X, and is denoted Xf. It is a smooth function on M; we can iterate this procedure and compute X(Xf), X(X(Xf)) and so on. The level sets of a smooth function f are (generically) smooth manifolds, and the whole idea of calculus is to approximate smooth things locally by linear things; thus generically through most points we can look at the level set of f through that point, and the tangent space to that level set. This is a hyperplane, and is spanned locally by the vector fields for which Xf is zero at the given point. More precisely, we can define a 1-form df just by setting df(X) = Xf; where df is nonzero, the kernel of df is the tangent space to the level set of f as described above.
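To make the dynamical picture concrete, here is a minimal numerical sketch. The field X (rigid rotation of the plane) and the functions f and h are illustrative choices, not taken from the discussion above; the point is only that Xf is computed by differentiating f along the flow, and that it vanishes exactly when X is tangent to the level sets of f.

```python
import math

def X(p):
    """Hypothetical example field on the plane: rigid rotation about the origin."""
    x, y = p
    return (-y, x)

def f(p):
    """Squared distance from the origin; its level sets are circles."""
    x, y = p
    return x * x + y * y

def h(p):
    """The coordinate function x, which is not constant along the rotation."""
    return p[0]

def flow(p, t):
    """Exact time-t flow of X: rotation through angle t."""
    x, y = p
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

def derivative_along_flow(func, p, eps=1e-6):
    """Central-difference estimate of (X func)(p) = d/dt func(flow(p, t)) at t = 0."""
    return (func(flow(p, eps)) - func(flow(p, -eps))) / (2 * eps)

p = (1.0, 2.0)
Xf_at_p = derivative_along_flow(f, p)  # X is tangent to the circles, so this is ~0
Xh_at_p = derivative_along_flow(h, p)  # dh(X) is the x-component of X at p
```

Here Xf is (numerically) zero because the flowlines are the level sets of f, while Xh agrees with the first component of X(p), as df(X) = Xf says it should.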

Grad. Now we introduce a Riemannian metric, which is a smooth choice of inner product on the tangent space at each point. It does two things for us: first, it lets us talk about the speed of a flow generated by a vector field X (or equivalently, the size of the vectors); and second, it lets us measure the angle between two vectors at each point, in particular it lets us say what it means for vectors to be perpendicular. If f is a smooth function on a Riemannian manifold, we can do more than just construct the level sets of f; we can ask in which direction the value of f increases the fastest (and we can further ask how fast it increases in that direction). The answer to this question is the gradient; the gradient of f is a vector field which points always in the direction in which f increases the fastest, and with a magnitude proportional to the rate at which it increases there. In terms of the level sets of the function f, any vector field can be decomposed into a part which is tangent to the level sets (this is the part of the vector field whose flow keeps f unchanged) and a part which is perpendicular to it; the gradient is thus everywhere perpendicular to the level sets of f.

The inner product \langle \cdot,\cdot\rangle lets us give isomorphisms between vector fields and 1-forms called the sharp and flat isomorphisms. If \alpha is a 1-form, and X is a vector field, we define the vector field \alpha^\sharp and the 1-form X^\flat by the formulae

\langle \alpha^\sharp,X\rangle = \alpha(X) and X^\flat(Y) = \langle X,Y\rangle

Sharp and flat are inverse operations. In words, a vector field and a 1-form are related by these operations if at each point they have the same magnitude, and the direction of the vector field is perpendicular to the kernel of the 1-form (i.e. the tangent space on which the 1-form vanishes). Using these isomorphisms, the gradient of a function f is just the vector field obtained by applying the sharp isomorphism to the 1-form df. In other words, it is the unique vector field such that for any other vector field X there is an identity

\langle \text{grad}(f),X\rangle = df(X)

The zeros of the gradient are the critical points of f; for instance, the gradient vanishes at the minimum and the maximum of f.
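In coordinates, applying sharp amounts to multiplying the components of df by the inverse of the matrix of the metric. The sketch below assumes the flat plane in polar coordinates (r, theta), with metric dr^2 + r^2 dtheta^2, and the example function f = r cos(theta); all function names are hypothetical. Since f is just the Cartesian coordinate x, the gradient should come out as the constant field (1, 0).

```python
import math

def metric_polar(r, theta):
    """Matrix of the Euclidean metric in polar coordinates: dr^2 + r^2 dtheta^2."""
    return [[1.0, 0.0], [0.0, r * r]]

def df_polar(r, theta):
    """Components of df for f = r cos(theta), i.e. the Cartesian coordinate x."""
    return [math.cos(theta), -r * math.sin(theta)]

def sharp(g, alpha):
    """Raise an index: solve g v = alpha for a 2x2 metric g (Cramer's rule)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [(g[1][1] * alpha[0] - g[0][1] * alpha[1]) / det,
            (g[0][0] * alpha[1] - g[1][0] * alpha[0]) / det]

def to_cartesian(r, theta, v):
    """Rewrite v = v_r d/dr + v_theta d/dtheta in the Cartesian frame."""
    vr, vt = v
    return (vr * math.cos(theta) - vt * r * math.sin(theta),
            vr * math.sin(theta) + vt * r * math.cos(theta))

r, theta = 2.0, 0.7
grad_polar = sharp(metric_polar(r, theta), df_polar(r, theta))
grad_xy = to_cartesian(r, theta, grad_polar)  # expected: the constant field (1, 0)
```

The factor 1/r^2 coming from the inverse metric is exactly what is needed to turn the coordinate-dependent components of df into a vector field that does not depend on the choice of coordinates.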

Div. In Euclidean space of some dimension n, a collection of n linearly independent vectors forms the edges of a parallelepiped. The (signed) volume of the parallelepiped is the determinant of the matrix whose columns are the given vectors. Actually there is a subtlety here: we need to choose an ordering of the vectors to take the determinant, and an odd permutation of the vectors changes the determinant by a factor of -1. On an oriented Riemannian n-manifold, if we have n vectors at a point, we can convert them to 1-forms and wedge them together; the result is an n-form. On an n-dimensional vector space, any two nonzero n-forms are proportional. Wedging together the 1-forms associated to a positively oriented basis of perpendicular vectors of length 1 (an orthonormal collection) gives an n-form at each point which we call the volume form, and denote dvol. For any other n-tuple of vectors the signed volume of the parallelepiped is equal to the ratio of the n-form they determine (by taking flat and wedging) and the volume form.

Now, there is an operator called Hodge star which acts on differential forms as follows. A k-form \alpha can be wedged with an (n-k) form \beta to make an n-form, and this n-form can be compared in size to the volume form. We define the (n-k) form *\alpha to be the smallest form such that

\alpha \wedge *\alpha = \|\alpha\|^2 dvol

In other words, *\alpha is perpendicular to the subspace of forms \beta with \alpha \wedge \beta = 0. With this notation *dvol is the constant function equal to 1 everywhere; conversely for any smooth function f we have *f = fdvol.
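As a concrete check, assume Euclidean 3-space with coordinates x, y, z and dvol = dx \wedge dy \wedge dz (an illustrative special case, not a new definition). The defining property \alpha \wedge *\alpha = \|\alpha\|^2 dvol then gives the following table on the standard basis forms:

```latex
\begin{aligned}
*1 &= dx\wedge dy\wedge dz, & *(dx\wedge dy\wedge dz) &= 1,\\
*dx &= dy\wedge dz, & *dy &= dz\wedge dx, & *dz &= dx\wedge dy,\\
*(dy\wedge dz) &= dx, & *(dz\wedge dx) &= dy, & *(dx\wedge dy) &= dz.
\end{aligned}
```

For instance dy \wedge *dy = dy \wedge dz \wedge dx = dx \wedge dy \wedge dz, since a cyclic permutation of three factors is even.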

If X is a vector field, the flow generated by X carries along not just points, but tensor fields of all kinds. Contravariant tensor fields (such as vector fields) are pushed forward by the flow, covariant ones (such as differential forms) are pulled back. Thus a stationary observer at a point in M sees a one-parameter family of tensors of some fixed kind flowing through their point, and they may differentiate this family. The result is the Lie derivative of the tensor field, and is denoted \mathcal{L}_X. The divergence of a vector field X measures the extent to which the flow generated by X does or does not preserve volume. It is a function which vanishes where the field infinitesimally preserves volume, and is biggest where the flow expands volume the most and smallest where the flow compresses volume the most.

The Lie derivative of the volume form is an n-form; taking Hodge star gives a function, and this function is the divergence. Thus:

\text{div}(X) = *(\mathcal{L}_X dvol)

In terms of the operators we have described above, applying flat to a vector field X gives a 1-form X^\flat. Applying Hodge star to this 1-form gives an (n-1)-form, then applying d gives an n-form, and this n-form (finally) is precisely \mathcal{L}_X dvol. Thus,

\text{div}(X) = *\, d * (X^\flat)
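In local coordinates the formula above unwinds to the familiar expression div(X) = (1/sqrt(g)) sum_i d/dx^i (sqrt(g) X^i), where sqrt(g) is the density of dvol. Here is a small finite-difference sketch of that expression, assuming the flat plane in polar coordinates (so sqrt(g) = r) and the hypothetical example field X = r d/dr, whose flow dilates the plane.

```python
def sqrt_det_g(r, theta):
    """Volume density of the flat metric dr^2 + r^2 dtheta^2 in polar coordinates."""
    return r

def X(r, theta):
    """Hypothetical example: the radial field r d/dr, which is x d/dx + y d/dy."""
    return (r, 0.0)

def divergence(r, theta, h=1e-5):
    """div X = (1/sqrt g) * sum_i d/dx^i (sqrt g * X^i), by central differences."""
    def flux_r(rr):
        return sqrt_det_g(rr, theta) * X(rr, theta)[0]

    def flux_t(tt):
        return sqrt_det_g(r, tt) * X(r, tt)[1]

    d_r = (flux_r(r + h) - flux_r(r - h)) / (2 * h)
    d_t = (flux_t(theta + h) - flux_t(theta - h)) / (2 * h)
    return (d_r + d_t) / sqrt_det_g(r, theta)

div_at_point = divergence(1.5, 0.3)  # the flow dilates area uniformly, so div = 2
```

The answer 2 matches the Cartesian computation for x d/dx + y d/dy, and reflects the fact that the time-t flow multiplies areas by e^{2t}.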

Gradient and divergence are “almost” dual to each other under Hodge star, in the following sense. Let’s suppose we have some function f and some vector field X. We can take the gradient and form \text{grad}(f), and then we can look at the inner product of the gradient with X to obtain a function, and then integrate this function over the manifold. I.e.

\int_M\langle X,\text{grad}(f)\rangle dvol = \int_M df(X)dvol = \int_M df\wedge *(X^\flat)

But

d(f*(X^\flat)) = df\wedge *(X^\flat) + fd\,*(X^\flat) = df\wedge *(X^\flat) + f\text{div}(X)dvol

If M is closed, the integral of an exact form over M is zero, so we deduce that

\int_M \langle X,\text{grad}(f)\rangle dvol = \int_M -f \text{div}(X) dvol

so that -div is a formal adjoint to grad.
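The adjointness can be checked numerically on the simplest closed manifold, the circle with its standard metric. In the sketch below (all names and the particular f and X are illustrative assumptions) a vector field is a(theta) d/dtheta, the gradient of f is f'(theta) d/dtheta, and the divergence of X is a'(theta); both integrals are approximated by Riemann sums on a uniform grid.

```python
import math

N = 400
thetas = [2 * math.pi * k / N for k in range(N)]
dtheta = 2 * math.pi / N

def f(t):
    return math.sin(t)

def grad_f(t):
    """Coefficient of grad f = f'(theta) d/dtheta."""
    return math.cos(t)

def a(t):
    """Coefficient of the hypothetical example field X = a(theta) d/dtheta."""
    return math.cos(t) + 0.5 * math.sin(2 * t)

def div_X(t):
    """div X = a'(theta) on the standard circle."""
    return -math.sin(t) + math.cos(2 * t)

lhs = sum(a(t) * grad_f(t) for t in thetas) * dtheta   # integral of <X, grad f>
rhs = sum(-f(t) * div_X(t) for t in thetas) * dtheta   # integral of -f div X
```

Both sums come out equal to pi (up to rounding), as integration by parts on the circle predicts.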

Laplacian. If f is a function, we can first apply the gradient and then the divergence to obtain another function; this composition (or rather its negative) is the Laplacian, and is denoted \Delta. In other words,

\Delta f = -\text{div} \, \text{grad}(f) = -*d*df

Note that there are competing conventions here: it is common to denote the negative of this quantity (i.e. the composition div grad itself) as the Laplacian. But this convention is also common, and has the advantage that the Laplacian is a non-negative self-adjoint operator. The Laplacian governs the flow of heat in the manifold; if we imagine our manifold is filled with some collection of microscopic particles buzzing around randomly at great speed and carrying kinetic energy around, then the temperature is a measure of the amount of energy per unit of volume. If the temperature is constant, then although the particles can move from point to point, on average for each particle that moves out of a small box, there will be another particle that moves in from the outside; thus the ensemble of particles is in “thermal equilibrium”. However, if there is a local hot spot — i.e. a concentration of high energy particles — then these particles will have a tendency to spread out, in the sense that the average number of particles that leave the small hot box will exceed the number of particles that enter from neighboring cooler boxes. Thus, heat will tend to spread out by the vector field which is its negative gradient, and where this vector field diverges, the heat will dissipate and the temperature will cool. In other words, if f is the temperature, then the derivative of temperature over time satisfies the heat equation f' = -\Delta f. Actually, since heat can come in or out from any direction, what is important is how the heat at a point deviates from the average of the heat at nearby points. The stationary heat distributions — i.e. the functions f with \Delta f=0 — are therefore the functions which satisfy an (infinitesimal) mean value property. These functions are called harmonic.
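The cooling of a hot spot is easy to watch in a discretized model. The following sketch assumes a circle of circumference 2*pi sampled at N points, the usual three-point stencil for div grad, and forward-Euler time stepping; these are illustrative numerical choices, not part of the discussion above.

```python
import math

N = 64                   # grid points on a circle of circumference 2*pi
dx = 2 * math.pi / N
dt = 0.2 * dx * dx       # below the explicit-scheme stability bound dx^2 / 2

def minus_laplacian(u):
    """Discrete div grad on the circle, i.e. -Delta in the sign convention used here."""
    n = len(u)
    return [(u[(i + 1) % n] - 2 * u[i] + u[i - 1]) / (dx * dx) for i in range(n)]

def heat_step(u):
    """One forward-Euler step of the heat equation f' = -Delta f."""
    lap = minus_laplacian(u)
    return [ui + dt * li for ui, li in zip(u, lap)]

# A hot spot: temperature 1 on a short arc, 0 elsewhere.
u0 = [1.0 if 28 <= i < 36 else 0.0 for i in range(N)]
u = u0
for _ in range(500):
    u = heat_step(u)

total_before, total_after = sum(u0) * dx, sum(u) * dx  # total heat is conserved
peak_before, peak_after = max(u0), max(u)              # the hot spot cools
```

Each step replaces the value at a point by a weighted average of itself and its two neighbours, which is the discrete shadow of the mean value property: the total heat is unchanged while the peak temperature drops.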

The erratic motion of the infinitesimal particles as they bump into each other and drift around is called Brownian motion, after the botanist Robert Brown, who is known to Australians for being the naturalist on the scientific voyage of the Investigator which sailed to Western Australia in 1801. Later, in 1827, he observed the jittery motion of minute particles ejected from pollen grains, and the phenomenon came to be named after him. Thus, a function on a Riemannian manifold is harmonic if its expected value stays constant under random Brownian motion, and the Laplacian describes the way that the expected value of the function changes under such motion.
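A lattice random walk gives a crude but exactly computable stand-in for Brownian motion. The sketch below (a discrete analogue under my own choice of functions and starting point, not a simulation of Brownian motion on a manifold) propagates the exact distribution of a simple random walk on the integer lattice in the plane and compares the expected value of a discrete-harmonic function with that of a non-harmonic one.

```python
from collections import defaultdict

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def walk_distribution(start, n):
    """Exact distribution of a simple random walk on Z^2 after n steps."""
    dist = {start: 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for (x, y), p in dist.items():
            for dx, dy in STEPS:
                nxt[(x + dx, y + dy)] += p / 4
        dist = dict(nxt)
    return dist

def expected_value(func, start, n):
    """Expectation of func at the walker's position after n steps."""
    return sum(p * func(x, y) for (x, y), p in walk_distribution(start, n).items())

def harmonic(x, y):
    """x^2 - y^2 equals the average of its values at the four neighbours."""
    return x * x - y * y

def not_harmonic(x, y):
    """x^2 + y^2: the neighbour average exceeds the value by exactly 1."""
    return x * x + y * y

start = (3, 1)
e_harm = expected_value(harmonic, start, 10)     # stays at 3^2 - 1^2 = 8
e_non = expected_value(not_harmonic, start, 10)  # grows by 1 per step: 10 + 10 = 20
```

The expected value of the harmonic function never moves, while the expected value of x^2 + y^2 drifts at a constant rate determined by its (discrete) Laplacian.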

Curl. After converting a vector field to a 1-form with the flat operator, one can apply the operator d to obtain a closed 2-form. On an arbitrary Riemannian manifold, this is more or less the end of the story, but on a 3-manifold, applying Hodge star to a 2-form gives back a 1-form, which can then be converted back to a vector field with the sharp operator. This composition is the curl of a vector field; i.e.

\text{curl}(X) = (*d(X^\flat))^\sharp

Notice that this satisfies the identities

\text{div}\, \text{curl}(X) = * d * * d (X^\flat) = 0 and \text{curl}\, \text{grad} (f) = (* d df)^\sharp = 0
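To see that this recovers the curl of vector calculus, assume Euclidean 3-space with coordinates x, y, z and write X = P \partial_x + Q \partial_y + R \partial_z (the component names P, Q, R are just notation for this example, with subscripts denoting partial derivatives). Then:

```latex
\begin{aligned}
X^\flat &= P\,dx + Q\,dy + R\,dz,\\
d(X^\flat) &= (R_y - Q_z)\,dy\wedge dz + (P_z - R_x)\,dz\wedge dx + (Q_x - P_y)\,dx\wedge dy,\\
\text{curl}(X) &= (*d(X^\flat))^\sharp = (R_y - Q_z)\,\partial_x + (P_z - R_x)\,\partial_y + (Q_x - P_y)\,\partial_z.
\end{aligned}
```

For the rigid rotation X = -y \partial_x + x \partial_y this gives \text{curl}(X) = 2\partial_z: the curl points along the axis of rotation, with magnitude twice the angular speed.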

Thus one of the functions of the curl operator is to give a necessary condition on a vector field to arise as the gradient of some function; such a function, if it exists, is called a potential for the vector field. Since a gradient flows from places where the function is small to where it is large, it does not recur or circulate; hence in a sense the curl measures the tendency of the vector field to circulate, or to form closed orbits. Actually there is a subtlety here, which is that the curl vanishes precisely on vector fields which are locally the gradient of a smooth function. The topology of M (in particular its first cohomology group with real coefficients) parameterizes curl-free vector fields modulo those which are gradients of smooth functions.

As mentioned above, the curl measures the tendency of the vector field to spiral around an axis (locally); the direction of this axis of spiraling is the direction of the vector field \text{curl}(X), and the magnitude is the rate of twisting. Another way to say this is that the magnitude of the curl measures the tendency of flowlines of the vector field to wind positively around each other. A vector field and its curl can be proportional; such vector fields are called Beltrami fields and they arise (up to rescaling) as the Reeb flows associated to contact structures.

On an arbitrary Riemannian n-manifold it is still possible to interpret the curl in terms of rotation or twisting. Using the sharp and flat isomorphisms, a 2-form \theta determines at each point a skew-symmetric endomorphism of the tangent space. The endomorphism applies to a vector by first contracting it with the 2-form to produce a 1-form, then using the sharp operator to transform it back to a vector. The skew-symmetry of this endomorphism is equivalent to the alternating property of forms. Now, a skew-symmetric endomorphism of a vector space can be thought of as an infinitesimal rotation, since the Lie algebra of the orthogonal group consists precisely of skew-symmetric matrices. Thus a vector field X on a Riemannian manifold determines a field of infinitesimal rotations, and this field is one way of thinking of \text{curl}(X). On a 3-manifold, a rotation has a unique axis, and this axis points in the direction of the vector field \text{curl}(X). On a Kähler manifold, the Kähler form determines a field of infinitesimal rotations which rotate the complex directions at constant speed.
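The passage from a 2-form to an infinitesimal rotation can be made explicit at a single point. The sketch below assumes an orthonormal basis of a 3-dimensional tangent space and the example 2-form w dx ^ dy; the helper names are hypothetical, and the matrix exponential is computed by a truncated power series, which is adequate only for small matrices like this one.

```python
import math

def endomorphism_from_2form(theta):
    """Matrix of v -> (theta(v, .))^sharp in an orthonormal basis.

    theta[i][j] holds theta(e_i, e_j); the image of e_j has i-th
    component theta(e_j, e_i), so the matrix is the transpose of theta.
    """
    n = len(theta)
    return [[theta[j][i] for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=30):
    """exp(a) by its power series, truncated after a fixed number of terms."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

w = 0.5
theta = [[0.0, w, 0.0], [-w, 0.0, 0.0], [0.0, 0.0, 0.0]]  # theta = w dx ^ dy
A = endomorphism_from_2form(theta)  # skew-symmetric endomorphism
R = mat_exp(A)                      # rotation through angle w about the z-axis
```

Exponentiating the skew-symmetric matrix A produces an honest rotation fixing the z-axis, which is the axis singled out by the Hodge star of theta in dimension 3.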

Strain. Actually, the curl, the divergence, and a third operator called the strain can all be put on a uniform footing, as follows. We continue to think of a vector field X as a flow on a smooth manifold M. Tensor fields are pushed or pulled around by X, and an observer at a fixed point sees a 1-parameter family of tensors (of a fixed kind) evolving over time. But we would like to be able to study the effect of X on an object which is carried about and distorted by the flow; for example, we might have a curve or a submanifold in M, and we might want to understand how the geometry of this submanifold is preserved or distorted as it is carried along by the flow. Calculus takes place in a fixed vector space, and the flow is moving our object along the flowlines. We need some way to bring the object back along the flowline to a fixed reference frame so that we can understand how it is being transformed by the flow. On a Riemannian manifold there is a canonical way to move tensor fields along flowlines: we move them by parallel transport. There is a unique connection on the manifold called the Levi-Civita connection \nabla which preserves the metric, and is torsion-free. The first condition just means that parallel transport is an isometry from one tangent space to the other. The second condition is more subtle, and it means (roughly) that there is no “unnecessary twisting” of the tangent space as it is transported around (no yaw, in aviation terms). Think of a car moving down a straight freeway; the geometry of the car is (hopefully!) not distorted by its motion, and the occupants of the car are not unnecessarily rotated or twisted. When the car hits some ice, it begins to skid and twist; the occupants are still moved in roughly the same overall direction, and the geometry is still not distorted (until a collision, anyway), but there is unnecessary twisting — the “torsion” of the connection.

So on a Riemannian manifold, we can flow objects away by a vector field X, and then parallel transport them back along the flowlines with the Levi-Civita connection. Now “the same” tensor experiences the effect of the vector field X while staying in “the same” vector space, so that we can compute the derivative to determine the infinitesimal effect of the flow. This derivative is the operator denoted A_X:=\mathcal{L}_X -\nabla_X by Kobayashi-Nomizu, and it is easy to check that it is itself a tensor field for any fixed X, and therefore determines a section A_X of the bundle of endomorphisms of the tangent bundle.
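The "easy check" can be spelled out on vector fields. Using only the facts that the Lie derivative of a vector field is the bracket and that the Levi-Civita connection is torsion-free, one finds for any vector field Y:

```latex
\begin{aligned}
A_X Y &= \mathcal{L}_X Y - \nabla_X Y \\
      &= [X,Y] - \nabla_X Y \\
      &= (\nabla_X Y - \nabla_Y X) - \nabla_X Y && \text{(torsion-free)} \\
      &= -\nabla_Y X .
\end{aligned}
```

The right hand side is linear over smooth functions in Y, so the value of A_X Y at a point depends only on the value of Y at that point; this is the tensoriality being claimed.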

On a Riemannian manifold, the space of endomorphisms of the tangent space at each point is a module for the Lie algebra of the orthogonal group, and it makes sense to decompose an endomorphism into components which correspond to the irreducible factors. Said more prosaically, an endomorphism is expressed (in terms of an orthonormal basis) as a matrix, and we can decompose this matrix into an antisymmetric and a symmetric part. Further, the symmetric part can be decomposed into its trace (a diagonal matrix, up to scale) and a trace-free part.

In this language,

  1. the divergence of X is the negative of the trace of A_X;
  2. the curl of X is the skew-symmetric part of A_X; and
  3. the strain of X is the trace-free symmetric part of A_X.
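Here is a small sketch of this decomposition in the simplest possible setting. It assumes the flat plane and a linear vector field X(x) = Bx (the matrix B is a hypothetical example); in that case the covariant derivative of X in the direction Y is BY, so A_X is the constant matrix -B.

```python
def decompose(A):
    """Split a square matrix into (trace, skew part, trace-free symmetric part)."""
    n = len(A)
    trace = sum(A[i][i] for i in range(n))
    skew = [[(A[i][j] - A[j][i]) / 2 for j in range(n)] for i in range(n)]
    trace_free_sym = [[(A[i][j] + A[j][i]) / 2 - (trace / n if i == j else 0.0)
                       for j in range(n)] for i in range(n)]
    return trace, skew, trace_free_sym

# Hypothetical linear field X(x) = B x on the flat plane, for which A_X = -B.
B = [[1.0, -2.0], [2.0, 3.0]]
A_X = [[-b for b in row] for row in B]

trace, curl_part, strain = decompose(A_X)
divergence = -trace   # equals the trace of B, here 4
```

For this B the skew part records a rotation, the divergence 4 records uniform expansion, and the strain diag(1, -1) records that the flow stretches the y direction more than the x direction.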

The strain measures the infinitesimal failure of flow by X to be conformal. Under a conformal transformation, lengths might change but angles are preserved. The strain measures the extent to which some directions are pushed and pulled by the flow of X more than others; in general relativity, this is expressed by talking about the tidal force of the gravitational field. An extreme example of tidal forces is the spaghettification experienced (briefly) by an observer falling into a black hole. In the theory of quasiconformal analysis, a Beltrami field prescribes the strain of a smooth mapping between domains.

And so on. This is a far from exhaustive survey of some of the key players in Riemannian geometry, and yet strangely I am temporarily exhausted. It is hard work to unpack the telegraphic beauty of Levi-Civita’s calculus into a collection of stories. And this is the undeniable advantage of the notational formalism: its concision. A geometric formula can (and often does) contain an enormous amount of information, much of it explicit, but some of it implicit, and depending on the reader to be familiar with a host of conventions, simplifications, abbreviations, and even ad hoc identifications which might depend on context. Maybe the trick is to learn to read more slowly. Or if you have a couple of years to spare, you can always do what I did, and go away and come back later when the material is ready for you. For the curious, I have a few notes on my webpage, including notes from a class on Riemannian geometry I taught in Spring 2013, and notes from a class on minimal surfaces that I’m teaching right now (much of this blog post is adapted from the introduction to the latter). Bear in mind that these notes are not very polished in places, and the minimal surface notes are very rudimentary and only cover a couple of topics as of this writing.