The development and scope of modern biology are often held out as a fantastic opportunity for mathematicians. The accumulation of vast amounts of biological data, and the development of new tools for the manipulation of biological organisms at microscopic levels and with unprecedented accuracy, invite the development of new mathematical tools for their analysis and exploitation. I know of several examples of mathematicians who have dipped a toe, or sometimes some more substantial organ, into the water. But it has struck me that I know (personally) few mathematicians who believe they have something substantial to learn from the biologists, despite the existence of several famous historical examples. This strikes me as odd; my instinctive feeling has always been that intellectual ruts develop so easily, so deeply, and so invisibly, that continual cross-fertilization of ideas is essential to escape ossification (if I may mix biological metaphors . . .).
It is not necessarily easy to come up with profound examples of biological ideas or principles that can be easily translated into mathematical ones, but it is sometimes possible to come up with suggestive ones. Let me try to give a tentative example.
Deoxyribonucleic acid (DNA) is a nucleic acid that contains the genetic blueprint for all known living things. This blueprint takes the form of a code: a molecule of DNA is a long polymer strand composed of simple units called nucleotides, and such a molecule is typically imagined as a string in a four-character alphabet $\{\mathtt{A}, \mathtt{T}, \mathtt{G}, \mathtt{C}\}$, whose letters stand for the nucleotides Adenine, Thymine, Guanine, and Cytosine. These molecular strands like to arrange themselves in tightly bound, oppositely aligned pairs, matching up nucleotides in one string with complementary nucleotides in the other, so that $\mathtt{A}$ matches with $\mathtt{T}$, and $\mathtt{C}$ with $\mathtt{G}$.
The geometry of a strand of DNA is very complicated: strands can be tangled, knotted, linked in complicated ways, and the fundamental interactions between strands (e.g. transcription, recombination) are facilitated or obstructed by mechanical processes depending on this geometry. Topology, especially knot theory, has been used in the study of some of these processes; the virtues of topological methods in this context include their robustness (fault-tolerance) and the discreteness of their invariants (similar virtues motivate some efforts to build topological quantum computers). A complete mathematical description of the salient biochemistry, mechanics, and semantic content of a configuration of DNA in a single cell is an unrealistic goal for the foreseeable future, and therefore attempts to model such systems depend on ignoring, or treating statistically, certain features of the system. One such framework ignores the ambient geometry entirely, and treats the system using symbolic, or combinatorial, methods which have some of the flavor of geometric group theory.
One interesting approach is to consider a mapping from the alphabet of nucleotides to a standard generating set for $F_2$, the free group on two generators; for example, one can take the mapping $\mathtt{T} \to a$, $\mathtt{A} \to A$, $\mathtt{C} \to b$, $\mathtt{G} \to B$, where $a, b$ are free generators for $F_2$, and $A, B$ denote their inverses. Then a pair of oppositely aligned strands of DNA translates into an edge of a van Kampen diagram: the “words” obtained by reading the letters along an edge on either side are inverse in $F_2$.
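The translation from strands to words can be sketched in a few lines of Python. This is only an illustration: the particular assignment of nucleotides to generators (lowercase for a generator, uppercase for its inverse) is an assumption, and any assignment that sends complementary nucleotides to inverse letters would do equally well. The function names are mine.

```python
# Assumed assignment of nucleotides to letters of F_2:
# lowercase = generator, uppercase = its inverse.
NUCLEOTIDE_TO_LETTER = {"T": "a", "A": "A", "C": "b", "G": "B"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def strand_to_word(strand):
    """Translate a DNA string into a word in the letters a, b, A, B."""
    return "".join(NUCLEOTIDE_TO_LETTER[n] for n in strand)


def paired_strand(strand):
    """Complementary strand, read in the opposite direction."""
    return "".join(COMPLEMENT[n] for n in reversed(strand))


def free_reduce(word):
    """Cancel adjacent inverse pairs (aA, Aa, bB, Bb) using a stack."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)


strand = "ATTGCAC"
word = strand_to_word(strand)
partner = strand_to_word(paired_strand(strand))
# The two words read along a paired edge multiply to the identity.
print(word, partner, repr(free_reduce(word + partner)))
```

Reading the partner strand in the opposite direction is what turns complementarity of letters into inversion of words.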
Strands of DNA in a configuration are not always paired along their lengths; sometimes junctions of three or more strands can form; certain mobile four-strand junctions, so-called “Holliday junctions”, perform important functions in the process of genetic recombination, and are found in a wide variety of organisms. A configuration of several strands with junctions of varying valences corresponds in the language of van Kampen diagrams to a fatgraph, i.e. a graph together with a choice of cyclic ordering of edges at each vertex, with edges labeled by inverse pairs of words in $F_2$ (note that this is quite different from the fatgraph model of proteins developed by Penner-Knudsen-Wiuf-Andersen). The energy landscape for branch migration (i.e. the process by which DNA strands separate or join along some segment) is very complicated, and it is challenging to model it thermodynamically. It is therefore not easy to predict in advance what kinds of fatgraphs are more or less likely to arise spontaneously in a prepared “soup” of free DNA strands.
As a thought experiment, consider the following “toy” model, which I do not suggest is physically realistic. We make the assumption that the energy cost of forming a junction of valence $v$ is $(v-2)\epsilon$ for some fixed constant $\epsilon$. Consequently, the energy of a configuration is proportional to $-\chi$, i.e. the negative of the Euler characteristic of the underlying graph (each edge is counted at both of its ends, so summing $(v-2)\epsilon$ over the vertices gives $-2\chi\epsilon$). Let $w$ be a reduced word, representing an element of $F_2$, and imagine a soup containing some large number of copies of the strand of DNA corresponding to the string $w$. In thermodynamic equilibrium, the partition function has the form

$Z = \sum_i e^{-E_i/k_B T}$

where $k_B$ is Boltzmann’s constant, $T$ is temperature, and $E_i$ is the energy of a configuration (which by hypothesis is proportional to $-\chi$). At low temperature, minimal energy configurations tend to dominate; these are those that minimize $-\chi$ per unit “volume”. Topologically, a fatgraph corresponding to such a configuration can be thickened to a surface with boundary. The words along the edges determine a homotopy class of map from such a surface to a $K(F_2,1)$ (e.g. a once-punctured torus) whose boundary components wrap multiply around the free homotopy class corresponding to the conjugacy class of $w$. The infimum of $-\chi/2n$, where $n$ is the winding degree on the boundary, taken over all configurations, is precisely the stable commutator length of $w$; see e.g. here for a definition.
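To make the toy model concrete, here is a minimal sketch in Python. It assumes the junction cost is $(v-2)\epsilon$ and checks on two small graphs that the total energy equals $-2\chi\epsilon$; it also computes normalized Boltzmann weights for a short, made-up list of energies, to show how the lowest-energy configuration dominates at low temperature. All function names, and the sample energies, are hypothetical.

```python
import math


def euler_characteristic(valences):
    """chi = V - E for a graph whose vertices have the given valences."""
    total = sum(valences)
    if total % 2:
        raise ValueError("the valences of a graph must sum to an even number")
    return len(valences) - total // 2


def junction_energy(valences, eps=1.0):
    """Toy energy: a junction of valence v costs (v - 2) * eps (assumed form)."""
    return eps * sum(v - 2 for v in valences)


def boltzmann_weights(energies, kT):
    """Normalized weights exp(-E / kT) / Z for a finite list of configurations."""
    lowest = min(energies)  # shifting by the minimum leaves the ratios unchanged
    raw = [math.exp(-(E - lowest) / kT) for E in energies]
    total = sum(raw)
    return [r / total for r in raw]


theta = [3, 3]        # two trivalent junctions joined by three edges
figure_eight = [4]    # one four-valent junction with two loops
for valences in (theta, figure_eight):
    print(valences, euler_characteristic(valences), junction_energy(valences))

# Made-up energies, low temperature: almost all weight sits on the minimum.
print(boltzmann_weights([2.0, 4.0, 6.0], kT=0.1))
```

The real sum in the partition function runs over all configurations of the soup, which this sketch does not attempt to enumerate.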
Anyway, this example is perhaps a bit strained (and maybe it owes more to thermodynamics than to biology), but already it suggests a new mathematical object of study, namely the partition function as above, and one is already inclined to look for examples for which the partition function obeys a symmetry like that enjoyed by the Riemann zeta function, or to specialize temperature to other values, as in random matrix theory. The introduction of new methods into the study of a classical object — for example, the decision to use thermodynamic methods to organize the study of van Kampen diagrams — bends the focus of the investigation towards those examples and contexts where the methods and tools are most informative. Phenomena familiar in one context (power laws, frequency locking, phase transitions etc.) suggest new questions and modes of enquiry in another. Uninspired or predictable research programs can benefit tremendously from such infusions, whether the new methods are borrowed from other intellectual disciplines (biology, physics), or depend on new technology (computers), or new methods of indexing (google) or collaboration (polymath).
One of my intellectual heroes — Wolfgang Haken — worked for eight years in R+D for Siemens in Munich after completing his PhD. I have a conceit (unsubstantiated as far as I know by biographical facts) that his experience working for a big engineering firm colored his approach to mathematics, and made it possible for him to imagine using industrial-scale “engineering” tools (e.g. integer programming, exhaustive computer search of combinatorial possibilities) to solve two of the most significant “pure” mathematical open problems in topology at the time — the knot recognition problem, and the four-color theorem. It is an interesting exercise to try to imagine (fantastic) variations. If I sit down and decide to try to prove (for example) Cannon’s conjecture, I am liable to try minor variations on things I have tried before, appeal for my intuition to examples that I understand well, read papers by others working in similar ways on the problem, etc. If I imagine that I have been given a billion dollars to prove the conjecture, I am almost certain to prioritize the task in different ways, and to entertain (and perhaps create) much more ambitious or innovative research programs to tackle the task. This is the way in which I understand the following quote by John Dewey, which I used as the colophon of my first book:
Every great advance in science has issued from a new audacity of the imagination.
Amenability of Thompson’s group F?
Geometric group theory is not a coherent and unified field of enquiry so much as a collection of overlapping methods, examples, and contexts. The most important examples of groups are those that arise in nature: free groups and fundamental groups of surfaces, the automorphism groups of the same, lattices, Coxeter and Artin groups, and so on; whereas the most important properties of groups are those that lend themselves to applications or can be used in certain proof templates: linearity, hyperbolicity, orderability, property (T), coherence, amenability, etc. It is natural to confront examples arising in one context with properties that arise in the other, and this is the source of a wealth of (usually very difficult) problems; e.g. do mapping class groups have property (T)? (no, by Andersen) or: is every lattice in
virtually orderable?
As remarked above, it is natural to formulate these questions, but not necessarily productive. Gromov, in his essay Spaces and Questions remarks that
A famous question of the kind Gromov warns against is the following:
Question: Is Thompson’s group $F$ amenable?
Recall that Thompson’s group $F$ is the group of (orientation-preserving) PL homeomorphisms of the unit interval with breakpoints at dyadic rationals (i.e. rational numbers of the form $p/2^q$ for integers $p, q$) and derivatives all powers of $2$. This group is a rich source of examples/counterexamples in geometric group theory: it is finitely presented (in fact $FP_\infty$) but “looks like” a transformation group; it contains no nonabelian free group (by Brin-Squier), but obeys no law. It is not elementary amenable (i.e. it cannot be built up from finite or abelian groups by elementary operations: subgroups, quotients, extensions, directed unions), so it is “natural” to wonder whether it is amenable at all, or whether it is one of the rare examples of nonamenable groups without nonabelian free subgroups (see this post for a discussion of amenability versus the existence of free subgroups, and von Neumann’s conjecture). This question has attracted a great deal of attention, possibly because of its long historical pedigree, rather than because of the potential applications of a positive (or negative) answer.
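The definition of Thompson's group is concrete enough to test by machine. Here is a minimal sketch, using exact rational arithmetic, that checks whether a piecewise linear map given by its list of breakpoints satisfies the conditions above (dyadic breakpoints, slopes powers of $2$). The representation by breakpoint lists and the function names are my own choices; the example map is one of the two standard generators.

```python
from fractions import Fraction


def is_power_of_two(n):
    """True for the positive integers 1, 2, 4, 8, ..."""
    return n > 0 and n & (n - 1) == 0


def is_dyadic(x):
    """True if the reduced denominator of the rational x is a power of 2."""
    return is_power_of_two(Fraction(x).denominator)


def slope_is_power_of_two(s):
    """True if the rational s equals 2**k for some integer k (possibly negative)."""
    s = Fraction(s)
    return s > 0 and is_power_of_two(s.numerator) and is_power_of_two(s.denominator)


def in_thompson_F(points):
    """points: breakpoints (x, y) of a PL map, listed in order from (0, 0) to (1, 1)."""
    pts = [(Fraction(x), Fraction(y)) for x, y in points]
    if pts[0] != (0, 0) or pts[-1] != (1, 1):
        return False
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x1 <= x0 or y1 <= y0:  # the map must be increasing
            return False
        if not (is_dyadic(x1) and is_dyadic(y1)):
            return False
        if not slope_is_power_of_two((y1 - y0) / (x1 - x0)):
            return False
    return True


half, quarter = Fraction(1, 2), Fraction(1, 4)
# Slopes 1/2, 1, 2 on [0, 1/2], [1/2, 3/4], [3/4, 1].
generator = [(0, 0), (half, quarter), (3 * quarter, half), (1, 1)]
print(in_thompson_F(generator))
```

A map with a breakpoint at $1/3$, or with a slope of $3/2$, is rejected by the same test.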
Recently, two papers were posted on the arXiv, promising competing resolutions of the question. In February, Azer Akhmedov posted a preprint claiming to show that the group $F$ is not amenable. This preprint was revised, withdrawn, then revised again, and as of the end of April, Akhmedov continues to press his claim. Akhmedov’s argument depends on a new geometric criterion for nonamenability: roughly speaking, the existence of a $2$-generator subgroup and a subadditive non-negative function on the group whose values grow at a definite rate on words in the subgroup whose exponents satisfy suitable parity conditions and inequalities. The non-negative function (Akhmedov calls it a “height function”) certifies the existence of a sufficiently “bushy” subset of the group to violate Følner’s criterion for amenability. Akhmedov’s paper reads like a “conventional” paper in geometric group theory, using methods from coarse geometry, careful combinatorial and counting arguments to establish the existence of a geometric object with certain large-scale properties, and an appeal to a standard geometric criterion to obtain the desired result. Akhmedov’s paper is part of a series, relating (non)amenability to certain other interesting geometric properties, some related to the so-called “traveling salesman” property, introduced earlier by Akhmedov.
On the other hand, in May, E. Shavgulidze posted a preprint claiming to show that the group $F$ is amenable. Interestingly enough, Shavgulidze’s argument does not apply to the slightly more general class of Stein-Thompson groups in which slopes and denominators of break points can be divisible by an arbitrary (but prescribed) finite set of prime numbers. Moreover, his methods are very unlike any that one would expect to find in the typical geometric group theory paper. The argument depends on the construction, going back (in some sense) to a paper of Shavgulidze from 1978, of a measure on the space $C[0,1]$ of continuous functions on the interval which is quasi-invariant under the natural action of the group of diffeomorphisms of the interval of regularity at least $C^3$.

In more detail, let $D^n$ denote the group of diffeomorphisms of the interval of regularity at least $C^n$ for each $n$, and let $C_0[0,1]$ denote the Banach space of continuous functions on the interval that vanish at the origin. Define $A : D^1 \to C_0[0,1]$ by the formula $A(f)(t) = \log(f'(t)) - \log(f'(0))$. The space $C_0[0,1]$ can be equipped with a natural measure, the Wiener measure $w_\sigma$ of variance $\sigma$, and this measure can be pulled back by $A$ to $D^1$, which is thought of as a topological space with the $C^1$ topology. Shavgulidze shows that the left action of $D^3$ on $D^1$ quasi-preserves this measure.

Here the Wiener measure on $C_0[0,1]$ is the probability measure associated to Brownian motion (with given variance). A “sample” trajectory $W$ from $C_0[0,1]$ is characterized by three properties: that it starts at the origin (i.e. $W(0) = 0$), that it is continuous almost surely (this is already implicit in the fact that the measure is supported on the space $C_0[0,1]$ and not some more general space), and that increments are independent, with the distribution of $W(t) - W(s)$ equal to a Gaussian with mean zero and variance $\sigma(t-s)$.

Shavgulidze’s argument depends first on an argument of Ghys-Sergiescu that shows Thompson’s group is conjugate (by a homeomorphism) to a discrete subgroup of the group of $C^\infty$ diffeomorphisms of the interval. A bounded function $f$ on $F$ determines a continuous bounded function
on
(for
) by a certain convolution trick, using both the group structure of $F$, and its discreteness in $D^1$. Roughly, given an element $g$ of $D^1$, the set of elements of $F$ whose (group) composition with $g$ is uniformly bounded in the $C^1$ norm is finite; so the value of the new function at $g$ is obtained by taking a suitable average of the values of the original function on this finite subset of $F$. This reduces the problem of the amenability of $F$ to the existence of a suitable functional on the space of bounded continuous functions on $D^1$, which is constructed via the pulled back Wiener measure as above.
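The three defining properties of a Wiener sample path listed above translate directly into a simulation. The following is a minimal sketch in Python, assuming a uniform time grid on the unit interval; the function names and the parameter `sigma` (the variance per unit time) are mine, and nothing here is taken from Shavgulidze's preprint.

```python
import random


def sample_path(sigma, steps, rng):
    """Return W(0), W(1/steps), ..., W(1) for Brownian motion of variance sigma."""
    dt = 1.0 / steps
    path = [0.0]  # property 1: the path starts at the origin
    for _ in range(steps):
        # property 3: independent Gaussian increments, mean 0, variance sigma * dt
        path.append(path[-1] + rng.gauss(0.0, (sigma * dt) ** 0.5))
    return path


def empirical_variance_at_one(sigma, steps, trials, seed=0):
    """Estimate the variance of W(1) from independent sample paths."""
    rng = random.Random(seed)
    endpoints = [sample_path(sigma, steps, rng)[-1] for _ in range(trials)]
    mean = sum(endpoints) / trials
    return sum((x - mean) ** 2 for x in endpoints) / trials


# Should be close to sigma * (1 - 0) = 2.0, up to sampling error.
print(empirical_variance_at_one(sigma=2.0, steps=8, trials=20000))
```

Continuity (the second property) is only visible in the limit of a fine grid: the increments shrink like the square root of the step size.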
There are several distinctive features of Shavgulidze’s preprint. One of the most striking is that it depends on very delicate analytic features of the Wiener measure, and the way it transforms under the action of $D^3$ on $D^1$ (a transformation law involving the Schwarzian derivative); this suggests that certain parts of the argument could be clarified (at least from the point of view of a topologist?) by using projective geometry and Sturm-Liouville theory. Another is that the crucial analytic quality, namely differentiability of class $C^{1+1/2}$, is also crucial for many other natural problems in $1$-dimensional analysis and geometry, from regularity estimates in the thin obstacle problem, to Navas’ work on actions of property (T) groups on the circle. At least one of the preprints by Akhmedov and Shavgulidze must be in error (in fact, a real skeptic’s skeptic such as Michael Aschbacher is not even willing to concede that much . . .) but even if wrong, it is possible that they contain things more valuable than a resolution of the question that prompted them.
Update (7/6): Azer Akhmedov sent me a construction of a (nonabelian) free subgroup of $D^1$ that is discrete in the $C^1$ topology. This is not quite enough regularity to intersect with Shavgulidze’s program, but it is interesting, and worth explaining. This is my (minor) modification of Azer’s construction (any errors are due to me):
Proposition: The group $D^1$ contains a discrete nonabelian free subgroup.
Sketch of Proof: First, decompose the interval $[0,1]$ into countably many disjoint subintervals accumulating only at the endpoints. Choose a free action on two generators by doing something generic on each subinterval, in such a way that the derivative is equal to $1$ at the endpoints. This can certainly be accomplished; for concreteness, choose the action so that for each subinterval $I_i$ there is a point $p_i$ in the interior of $I_i$ whose stabilizer is trivial.
Second, for each pair of distinct words in the generators, choose a subinterval and modify the action there so that the derivatives of those words in that subinterval differ by at least some definite constant $\epsilon$ at some point. In more detail: enumerate the pairs of words somehow as $P_1, P_2, \ldots$, where each $P_i$ is a pair of words $u_i, v_i$ in the generators, and modify the action on the subinterval $I_i$ so the words in $P_i$ differ by at least $\epsilon$ in the $C^1$ norm on the interval $I_i$. Since we are modifying the generators infinitely many times, but in such a way that the support of the modification exits any compact subset of the interior, we just need to check that the modifications are $C^1$. Since there are only finitely many pairs of words, both of which are of bounded length (for any given bound), when $i$ is sufficiently big, one of the words $u_i$, $v_i$ has length at least $n_i$, where $n_i$ goes to infinity as $i$ goes to infinity. Without loss of generality, we can order the pairs so that $u_i$ is the “long” word.
Now this is how we modify the action in $I_i$. Recall that the point $p_i$ has trivial stabilizer, so the translates $q_j$ of $p_i$ under the suffixes of $u_i$ are distinct. Take disjoint intervals about the $q_j$ and observe that each $q_j$ is taken to $q_{j+1}$ by one of the generators. Modify this generator inside this disjoint neighborhood so that $q_j$ is still taken to $q_{j+1}$, but the derivative at that point is multiplied by $1 + \epsilon/n_i$, and the derivative at nearby points is not multiplied by more than $1 + \epsilon/n_i$. Since the neighborhoods of the $q_j$ are disjoint, these modifications are all compatible, and the derivative of the generators does not change by more than $1 + \epsilon/n_i$ at any point. Since $n_i$ goes to infinity as $i$ goes to infinity, we can perform such modifications for each $i$, and the resulting action is still $C^1$. But now the derivative of $u_i$ at $p_i$ has been multiplied by at least $(1 + \epsilon/n_i)^{n_i} \ge 1 + \epsilon$, so $u_i$ and $v_i$ differ by at least $\epsilon$ in the $C^1$ norm. qed.
It is interesting to observe that this construction, while $C^1$, is not $C^{1+\alpha}$ for any $\alpha > 0$
. For big $i$, we have $|I_i| = o(1/i)$ whereas $n_i \sim \log i$. Introducing a “bump” which modifies the derivative by $1/\log i$ in a subinterval of size $o(1/i)$ will blow up every Hölder norm.
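A quick numerical sanity check of the two estimates in the sketch can be done in Python. The quantitative assumptions here are mine rather than anything stated in the construction: each of the $n$ corrections multiplies a derivative by $1 + \epsilon/n$, the word length grows like $\log i$, and the $i$-th subinterval has length at most $1/i$. The function names are hypothetical.

```python
import math


def accumulated_factor(eps, n):
    """Product of n corrections, each multiplying the derivative by 1 + eps / n."""
    return (1.0 + eps / n) ** n


def holder_quotient(i, alpha):
    """(size of the jump in the derivative) / (width of the bump) ** alpha.

    Assumes a jump of size 1 / log(i) across a bump of width 1 / i.
    """
    jump = 1.0 / math.log(i)
    width = 1.0 / i
    return jump / width ** alpha


# Each correction tends to 1, but the product stays above 1 + eps.
for n in (1, 10, 100, 1000):
    print(n, 1.0 + 0.5 / n, accumulated_factor(0.5, n))

# The Holder quotient eventually grows without bound for a fixed alpha > 0.
for i in (10 ** 3, 10 ** 6, 10 ** 9):
    print(i, holder_quotient(i, 0.5))
```

For small $\alpha$ the growth only sets in once $\log i$ exceeds $1/\alpha$, so the quotient can dip before it starts to climb.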
(Update 8/10): Mark Sapir has created a webpage to discuss Shavgulidze’s paper here. Also, Matt Brin has posted notes on Shavgulidze’s paper here. The notes are very nice, and go into great detail, as far as they go. Matt promises to update the notes periodically.
(Update 11/18): Matt Brin has let me know by email that a significant gap has emerged in Shavgulidze’s argument. He writes:
In light of this, it would seem to be reasonable to consider the question of whether $F$ is amenable as wide open.
(Update 9/21/2012): Justin Moore has posted a preprint on the arXiv claiming to prove amenability of $F$. It is too early to suggest that there is expert consensus on the correctness of the proof, but certainly everything I have heard is promising. I have not had time to look carefully at the argument yet, but hope to get a chance to do so before too long.
(Update 10/2/2012): Justin has withdrawn his claim of a proof. A gap was found by Akhmedov.