You are currently browsing the tag archive for the ‘finite state automaton’ tag.
The purpose of this post is to discuss my recent paper with Koji Fujiwara, which will shortly appear in Ergodic Theory and Dynamical Systems, both for its own sake, and in order to motivate some open questions that I find very intriguing. The content of the paper is a mixture of ergodic theory, geometric group theory, and computer science, and was partly inspired by a paper of Jean-Claude Picaud. To state the results of the paper, I must first introduce a few definitions and some background.
Let be a finite directed graph (hereafter a digraph) with an initial vertex, and edges labeled by elements of a finite set in such a way that each vertex has at most one outgoing edge with any given label. A finite directed path in starting at the initial vertex determines a word in the alphabet , by reading the labels on the edges traversed (in order). The set of words obtained in this way is an example of what is called a regular language, and is said to be parameterized by . Note that this is not the most general kind of regular language; in particular, any language of this kind will necessarily be prefix-closed (i.e. if then every prefix of is also in ). Note also that different digraphs might parameterize the same (prefix-closed) regular language .
If is a set of generators for a group , there is an obvious map called the evaluation map that takes a word to the element of represented by that word.
Definition: Let be a group, and a finite generating set. A combing of is a (prefix-closed) regular language for which the evaluation map is a bijection, and such that every represents a geodesic in .
The intuition behind this definition is that the set of words in determines a directed spanning tree in the Cayley graph starting at , and such that every directed path in the tree is a geodesic in . Note that there are other definitions of combing in the literature; for example, some authors do not require the evaluation map to be a bijection, but only a coarse bijection.
Fundamental to the theory of combings is the following Theorem, which paraphrases one of the main results of this paper:
Theorem: (Cannon) Let be a hyperbolic group, and let be a finite generating set. Choose a total order on the elements of . Then the language of lexicographically first geodesics in is a combing.
The language described in this theorem is obviously geodesic and prefix-closed, and the evaluation map is bijective; the content of the theorem is that is regular, and parameterized by some finite digraph . In the sequel, we restrict attention exclusively to hyperbolic groups .
Given a (hyperbolic) group , a generating set , a combing , one makes the following definition:
Definition: A function is weakly combable (with respect to ) if there is a digraph parameterizing and a function from the vertices of to so that for any , corresponding to a path in , there is an equality .
In other words, a function is weakly combable if it can be obtained by “integrating” a function along the paths of a combing. One furthermore says that a function is combable if it changes by a bounded amount under right-multiplication by an element of , and bicombable if it changes by a bounded amount under either left or right multiplication by an element of . The property of being (bi-)combable does not depend on the choice of a generating set or a combing .
Example: Word length (with respect to a given generating set ) is bicombable.
Example: Let be a homomorphism. Then is bicombable.
Example: The Brooks counting quasimorphisms (on a free group) and the Epstein-Fujiwara counting quasimorphisms are bicombable.
Example: The sum or difference of two (bi-)combable functions is (bi-)combable.
A particularly interesting example is the following:
Example: Let be a finite set which generates as a semigroup. Let denote word length with respect to , and denote word length with respect to (which also generates as a semigroup). Then the difference is a bicombable quasimorphism.
The main theorem proved in the paper concerns the statistical distribution of values of a bicombable function.
Theorem: Let be a hyperbolic group, and let be a bicombable function on . Let be the value of on a random word in of length (with respect to a certain measure depending on a choice of generating set). Then there are algebraic numbers and so that as distributions, converges to a normal distribution with standard deviation .
One interesting corollary concerns the length of typical words in one generating set versus another. The first thing that every geometric group theorist learns is that if are two finite generating sets for a group , then there is a constant so that every word of length in one generating set has length at most and at least in the other generating set. If one considers an example like , one sees that this is the best possible estimate, even statistically. However, if one restricts attention to a hyperbolic group , then one can do much better for typical words:
Corollary: Let be hyperbolic, and let be two finite generating sets. There is an algebraic number so that almost all words of length with respect to the generating set have length almost equal to with respect to the generating set, with error of size .
Let me indicate very briefly how the proof of the theorem goes.
Sketch of Proof: Let be bicombable, and let be a function from the vertices of to , where is a digraph parameterizing . There is a bijection between the set of elements in of word length and the set of directed paths in of length that start at the initial vertex. So to understand the distribution of , we need to understand the behaviour of a typical long path in .
Define a component of to be a maximal subgraph with the property that there is a directed path (in the component) from any vertex to any other vertex. One can define a new digraph without loops, with one vertex for each component of , in an obvious way. Each component determines an adjacency matrix , with -entry equal to if there is a directed edge from vertex to vertex , and equal to otherwise. A component is big if the biggest real eigenvalue of is at least as big as the biggest real eigenvalue of the matrices associated to every other component. A random long walk in will spend most of its time entirely in big components, so these are the only components we need to consider to understand the statistical distribution of .
A theorem of Coornaert implies that there are no big components of in series; i.e. there are no directed paths in from one big component to another (one also says that the big components do not communicate). This means that a typical long walk in is entirely contained in a single big component, except for a (relatively short) path at the start and the end of the walk. So the distribution of gets independent contributions, one from each big component.
The contribution from an individual big component is not hard to understand: the central limit theorem for stationary Markov chains says that for elements of corresponding to paths that spend almost all their time in a given big component there is a central limit theorem where the mean and standard deviation depend only on . The problem is to show that the means and standard deviations associated to different big components are the same. Everything up to this point only depends on weak combability of ; to finish the proof one must use bicombability.
It is not hard to show that if is a typical infinite walk in a component , then the subpaths of of length are distributed like random walks of length in . What this means is that the mean and standard deviation associated to a big component can be recovered from the distribution of on a single infinite “typical” path in . Such an infinite path corresponds to an infinite geodesic in , converging to a definite point in the Gromov boundary . Another theorem of Coornaert (from the same paper) says that the action of on its boundary is ergodic with respect to a certain natural measure called a Patterson-Sullivan measure (see Coornaert’s paper for details). This means that there are typical infinite geodesics associated to components and for which some takes to a geodesic ending at the same point in as . Bicombability implies that the values of on and differ by a bounded amount. Moreover, since and are asymptotic to the same point at infinity, combability implies that the values of on and also differ by a bounded amount. This is enough to deduce that and , and one obtains a (global) central limit theorem for on . qed.
This obviously raises several questions, some of which seem very hard, including:
Question 1: Let be an arbitrary quasimorphism on a hyperbolic group (even the case is free is interesting). Does satisfy a central limit theorem?
Question 2: Let be an arbitrary quasimorphism on a hyperbolic group . Does satisfy a central limit theorem with respect to a random walk on ? (i.e. one considers the distribution of values of not on the set of elements of of word length , but on the set of elements obtained by a random walk on of length , and lets go to infinity)
All bicombable quasimorphisms satisfy an important property which is essential to our proof of the central limit theorem: they are local, which is to say, they are defined as a sum of local contributions. In the continuous world, they are the analogue of the so-called de Rham quasimorphisms on where is a closed negatively curved Riemannian manifold; such quasimorphisms are defined by choosing a -form , and defining to be equal to the integral , where is the closed oriented based geodesic in in the homotopy class of . De Rham quasimorphisms, being local, also satisfy a central limit theorem.
This locality manifests itself in another way, in terms of defects. Let be a quasimorphism on a hyperbolic group . Recall that the defect is the supremum of over all pairs of elements . A quasimorphism is further said to be homogeneous if for all integers . If is an arbitrary quasimorphism, one may homogenize it by taking a limit ; one says that is the homogenization of in this case. Homogenization typically does not preserve defects; however, there is an inequality . If is local, one expects this inequality to be an equality. For, in a hyperbolic group, the contribution to the defect of a local quasimorphism all arises from the interaction of the suffix of (a geodesic word representing the element) with the prefix of (with notation as above). When one homogenizes, one picks up another contribution to the defect from the interaction of the prefix of with the suffix of ; since these two contributions are essentially independent, one expects that homogenizing a local quasimorphism should exactly double the defect. This is the case for bicombable and de Rham quasimorphisms, and can perhaps be used to define locality for a quasimorphism on an arbitrary group.
This discussion provokes the following key question:
Question 3: Let be a group, and let be a homogeneous quasimorphism. Is there a quasimorphism with homogenization , satisfying ?
Example: The answer to question 3 is “yes” if is the rotation quasimorphism associated to an action of on by orientation-preserving homeomorphisms (this is nontrivial; see Proposition 4.70 from my monograph).
Example: Let be any homologically trivial group -boundary. Then there is some extremal homogeneous quasimorphism for (i.e. a quasimorphism achieving equality under generalized Bavard duality; see this post) for which there is with homogenization satisfying . Consequently, if every point in the boundary of the unit ball in the norm is contained in a unique supporting hyperplane, the answer to question 3 is “yes” for any quasimorphism on .
Any quasimorphism on can be pulled back to a quasimorphism on a free group, but this does not seem to make anything easier. In particular, question 3 is completely open (as far as I know) when is a free group. An interesting test case might be the homogenization of an infinite sum of Brooks functions for some infinite non-nested family of words .
If the answer to this question is false, and one can find a homogeneous quasimorphism which is not the homogenization of any “local” quasimorphism, then perhaps does not satisfy a central limit theorem. One can try to approach this problem from the other direction:
Question 4: Given a function defined on the ball of radius in a free group , one defines the defect in the usual way, restricted to pairs of elements for which are all of length at most . Under what conditions can be extended to a function on the ball of radius without increasing the defect?
If one had a good procedure for building a quasimorphism “by hand” (so to speak), one could try to build a quasimorphism that failed to satisfy a central limit theorem, or perhaps find reasons why this was impossible.