Roth’s theorem

I am in Kyoto right now, attending the twenty-first Nevanlinna colloquium (update: took a while to write this post – now I’m in Sydney for the Clay lectures). Yesterday, Junjiro Noguchi gave a plenary talk on Nevanlinna theory in higher dimensions and related Diophantine problems. The talk was quite technical, and I did not understand it very well; however, he said a few suggestive things early on which struck a chord.

The talk started quite accessibly, being concerned with the fundamental equation

a + b = c

where a,b,c are coprime positive integers. The abc conjecture, formulated by Oesterlé and Masser, says that for any positive real number \epsilon, there is a constant C_\epsilon so that

\max(a,b,c) \le C_\epsilon\text{rad}(abc)^{1+\epsilon}

where \text{rad}(abc) is the product of the distinct primes appearing in the product abc. Informally, this conjecture says that for triples a,b,c satisfying the fundamental equation, the numbers a,b,c are not divisible by “too high” powers of a prime. The abc conjecture is known to imply many interesting number theoretic statements, including (famously) Fermat’s Last Theorem (for sufficiently large exponents), and Roth’s theorem on diophantine approximation (as observed by Bombieri).
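To get a feel for the inequality, here is a minimal Python sketch that computes \text{rad}(abc) for a few coprime triples and compares it with c. The helper names `rad` and `quality`, and the particular triples, are illustrative choices and not part of the talk; "quality" here just means the ratio \log c / \log \text{rad}(abc), which the conjecture says exceeds 1+\epsilon only finitely often.

```python
from math import gcd, log

def rad(n: int) -> int:
    """Product of the distinct primes dividing n (trial division, fine for small n)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # whatever is left is a single prime factor
        result *= n
    return result

def quality(a: int, b: int) -> float:
    """log(c) / log(rad(abc)) for the coprime triple a + b = c."""
    assert gcd(a, b) == 1, "a and b must be coprime"
    c = a + b
    return log(c) / log(rad(a * b * c))

# Illustrative triples: 1 + 8 = 9, 3 + 125 = 128, 2 + 3^10 * 109 = 23^5
for a, b in [(1, 8), (3, 125), (2, 3**10 * 109)]:
    c = a + b
    print(a, b, c, rad(a * b * c), round(quality(a, b), 4))
```

In each of these triples c is larger than \text{rad}(abc), which is only possible because high prime powers divide a, b or c.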

Roth’s theorem is the following statement:

Theorem (Roth, 1955): Let \alpha be a real algebraic number. Then for any \epsilon>0, the inequality |\alpha - p/q| < q^{-(2+\epsilon)} has only finitely many solutions in coprime integers p,q.

This inequality is best possible, in the sense that every irrational number can be approximated by infinitely many rationals p/q to within 1/2q^2. In fact, of any two consecutive convergents of the continued fraction expansion of \alpha, at least one has this property. There is a very short and illuminating geometric proof of this fact.

In the plane, construct a circle packing with one circle of radius 1/2q^2 centered at (p/q, 1/2q^2) for each coprime pair p,q of integers.

[Figure: circles_1]

This circle packing nests down on the x-axis, and any vertical line with irrational x-co-ordinate intersects infinitely many circles. If the x-co-ordinate of a vertical line is \alpha, every circle the line intersects gives a rational p/q which approximates \alpha to within 1/2q^2. qed.
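The picture can be checked numerically. The sketch below lists, for a given \alpha and a bound on q, the reduced fractions p/q whose circle the vertical line over \alpha crosses, i.e. those with |\alpha - p/q| < 1/2q^2. The function name `circle_hits` and the choice \alpha = \sqrt{2} are assumptions made for illustration, and floating point is accurate enough only for modest q.

```python
from math import gcd, sqrt

def circle_hits(alpha: float, max_q: int) -> list[tuple[int, int]]:
    """Reduced fractions p/q with q <= max_q and |alpha - p/q| < 1/(2 q^2)."""
    hits = []
    for q in range(1, max_q + 1):
        p = round(alpha * q)  # nearest numerator; the only candidate for this q
        if gcd(p, q) == 1 and abs(alpha - p / q) < 1 / (2 * q * q):
            hits.append((p, q))
    return hits

# Circles over sqrt(2) with denominator at most 1000
for p, q in circle_hits(sqrt(2), 1000):
    print(f"{p}/{q}", abs(sqrt(2) - p / q) * q * q)
```

For \sqrt{2} the fractions found are 1/1, 3/2, 7/5, 17/12 and so on, which are the continued fraction convergents.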

On the other hand, consider the corresponding collection of circles with radius 1/2q^{2+\epsilon}. Some “space” appears between neighboring circles, and they no longer pack tightly (the following picture shows \epsilon = 0.2).

[Figure: circles_2]

The total cross-sectional width of these circles, restricted to fractions p/q in the interval [0,1), can be estimated as follows. Each p/q contributes a width (the diameter of its circle) of q^{-(2+\epsilon)}. Ignoring the coprime condition, there are q fractions of the form p/q in the interval [0,1), so the total width is less than \sum_q q^{-1-\epsilon}, which converges for positive \epsilon. In other words, the total cross-sectional width of all circles is finite. It follows (by the Borel–Cantelli lemma) that almost every vertical line intersects only finitely many circles.
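A quick numerical sanity check of this estimate, as a sketch: the code sums the widths over reduced fractions in [0,1), taking the width of a circle to be its diameter q^{-(2+\epsilon)} (an assumption of this sketch; a constant factor makes no difference to convergence), and compares the partial sums with the integral-test bound 1 + 1/\epsilon for \sum_q q^{-1-\epsilon}. The name `total_width` is made up for the example.

```python
from math import gcd

def total_width(eps: float, max_q: int) -> float:
    """Sum of diameters q**-(2+eps) over reduced p/q in [0,1) with q <= max_q."""
    total = 0.0
    for q in range(1, max_q + 1):
        # number of reduced fractions p/q in [0,1); never more than q
        reduced = sum(1 for p in range(q) if gcd(p, q) == 1)
        total += reduced * q ** -(2 + eps)
    return total

eps = 0.2
bound = 1 + 1 / eps  # sum over q of q**-(1+eps) is at most 1 + 1/eps
for max_q in (10, 100, 1000):
    print(max_q, round(total_width(eps, max_q), 4), "<=", bound)
```

The partial sums increase with the cutoff but stay below the bound, in line with the convergence claimed above.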

Some vertical lines do, in fact, intersect infinitely many circles; i.e. some real numbers are approximated by infinitely many rationals to better than quadratic accuracy; for example, a Liouville number like \sum_{n=1}^\infty 10^{-n!}.
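For the Liouville number this can be verified with exact rational arithmetic. In the sketch below the N-th partial sum has denominator q = 10^{N!}, and its distance to the number is below (10/9) q^{-(N+1)} by comparison with a geometric series, so the exponent of approximation grows without bound. The sixth partial sum stands in for the number itself (an assumption of the sketch; the neglected tail is smaller than 10^{-5000}), and the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial

def liouville_partial(N: int) -> Fraction:
    """Exact partial sum of 10**(-n!) for n = 1..N."""
    return sum((Fraction(1, 10 ** factorial(n)) for n in range(1, N + 1)),
               Fraction(0))

# Stand-in for the full number; the tail beyond n = 6 is below 10**-5000.
reference = liouville_partial(6)

for N in range(1, 5):
    approx = liouville_partial(N)
    q = approx.denominator  # equals 10**(N!)
    err = reference - approx
    # Check the error against (10/9) * q**-(N+1)
    print(N, q == 10 ** factorial(N), err < Fraction(10, 9) / q ** (N + 1))
```

Since N+1 eventually exceeds 2+\epsilon for any fixed \epsilon, infinitely many of these partial sums beat the exponent in Roth's theorem, so the number cannot be algebraic.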

Some special cases of Roth’s theorem are much easier than others. For instance, it is very easy to give a proof when \alpha is a quadratic irrational; i.e. an element of \mathbb{Q}(\sqrt{d}) for some integer d. Quadratic irrationals are characterized by the fact that their continued fraction expansions are eventually periodic.

One can think of this geometrically as follows. The group \text{PSL}(2,\mathbb{Z}) acts on the upper half-plane, which we think of now as the complex numbers with positive imaginary part, by fractional linear transformations z \to (az+b)/(cz+d). The quotient is a hyperbolic triangle orbifold, with a cusp. A vertical line in the plane ending at a point \alpha on the x-axis projects to a geodesic ray in the triangle orbifold. A rational number p/q approximating \alpha to within 1/2q^2 is detected by the geodesic entering a horoball centered at the cusp. If \alpha is a quadratic irrational, the corresponding geodesic ray eventually winds around a periodic geodesic (this is the periodicity of the continued fraction expansion), so it never gets too deep into the cusp, and the rational approximations to \alpha never get better than C/2q^2 for some constant C depending on \alpha, as required.

A different vertical line intersecting the x-axis at some \beta corresponds to a different geodesic ray; the existence of good rational approximations to \beta corresponds to the condition that the corresponding geodesic goes deeper and deeper into the cusp infinitely often at a definite rate (i.e. at a distance which is at least some fixed (fractional) power of time). A “random” geodesic on a cusped hyperbolic surface takes time n to go distance \log{n} out the cusp (this is a kind of equidistribution fact – the thickness of the cusp goes to zero like e^{-t}, so if one chooses a sequence of points in a hyperbolic surface at random with respect to the uniform (area) measure, it takes about n points to find one that is distance \log{n} out the cusp). If one expects that every geodesic ray corresponding to an algebraic number looks like a “typical” random geodesic, one would conjecture (and in fact, Lang did conjecture) that there are only finitely many p/q for which |p/q - \alpha| < q^{-2}(\log{q})^{-1-\epsilon} for any \epsilon > 0.
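The eventual periodicity of the continued fraction of a quadratic irrational is easy to observe in the simplest case \alpha = \sqrt{d}. The sketch below uses the standard integer recurrence for the partial quotients of \sqrt{d}; it is offered as an illustration rather than as the argument above, and the function name is an assumption.

```python
from math import isqrt

def sqrt_continued_fraction(d: int, n_terms: int) -> list[int]:
    """First n_terms partial quotients of sqrt(d), for d not a perfect square."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    # Invariant: the current complete quotient is (m + sqrt(d)) / den
    m, den, a = 0, 1, a0
    terms = [a0]
    for _ in range(n_terms - 1):
        m = den * a - m
        den = (d - m * m) // den  # exact division, by the invariant
        a = (a0 + m) // den
        terms.append(a)
    return terms

for d in (2, 7, 13):
    print(d, sqrt_continued_fraction(d, 12))
```

Because only finitely many pairs (m, den) can occur, the state must repeat, and the repeating block of partial quotients is bounded; bounded partial quotients are what keep the approximations from being better than quadratic.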

A slightly different (though related) geometric way to see the periodicity of the continued fraction expansion of a quadratic irrational is to use diophantine geometry. This is best illustrated with an example. Consider the golden number \alpha = (1+\sqrt{5})/2. The matrix A=\left( \begin{smallmatrix} 2 & 1 \\ 1 & 1 \end{smallmatrix} \right) has \left( \begin{smallmatrix} \alpha \\ 1 \end{smallmatrix} \right) and \left( \begin{smallmatrix} \bar{\alpha} \\ 1 \end{smallmatrix} \right) as eigenvectors (here \bar{\alpha} denotes the “conjugate” 1-\alpha), and thus preserves a “wedge” in \mathbb{R}^2 bounded by lines with slopes \alpha and \bar{\alpha}. The set of integer lattice points in this wedge is permuted by A, and therefore so is the boundary of the convex hull of this set (the sail of the cone). Lattice points on the sail correspond to rational approximations to the boundary slopes; the fact that A permutes this set corresponds to the periodicity of the continued fraction expansion of \alpha (and certifies the fact that \alpha cannot be approximated better than quadratically by rational numbers).
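One way to see the "certificate" concretely, as a sketch: A preserves the integer quadratic form x^2 - xy - y^2 = (x - \alpha y)(x - \bar{\alpha} y), so along the orbit of the lattice point (1,0) the form stays equal to 1, which forces |\alpha - x/y| to be comparable to 1/y^2 and no smaller. The starting point (1,0) and the helper names are choices made for this illustration.

```python
from math import sqrt

def apply_A(v: tuple[int, int]) -> tuple[int, int]:
    """Apply A = [[2, 1], [1, 1]] to the integer vector v = (x, y)."""
    x, y = v
    return (2 * x + y, x + y)

def form(v: tuple[int, int]) -> int:
    """x^2 - x*y - y^2, which factors as (x - alpha*y)(x - conj(alpha)*y)."""
    x, y = v
    return x * x - x * y - y * y

alpha = (1 + sqrt(5)) / 2
v = (1, 0)
orbit = []
for _ in range(8):
    v = apply_A(v)
    orbit.append(v)

for x, y in orbit:
    # the form is invariant; the scaled error y^2 * |alpha - x/y| stays bounded below
    print((x, y), form((x, y)), y * y * abs(alpha - x / y))
```

The orbit runs through every second pair of consecutive Fibonacci numbers, and the scaled error in the last column stays bounded away from zero, which is the quadratic lower bound on approximation in this example.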

There is an analogue of this construction in higher dimensions: let A be an n\times n integer matrix whose eigenvalues are all real, positive, irrational and distinct. A collection of n suitable eigenvectors spans a polyhedral cone which is invariant under A. The convex hull of the set of integer lattice points in this cone is a polyhedron, and the vertices of this polyhedron (the vertices on the sail) are the “best” integral approximations to the eigenvectors. In fact, there is a \mathbb{Z}^{n-1} subgroup of \text{SL}(n,\mathbb{Z}) consisting of matrices with the same set of eigenvectors (this is a consequence of Dirichlet’s theorem on the structure of the group of units in the integers in a number field). Hence there is a group that acts discretely and co-compactly on the vertices of the sail, and one gets a priori estimates on how well the eigenvectors can be approximated by integral vectors. It is interesting to ask whether one can give a proof of Roth’s theorem along these lines, at least for algebraic numbers in totally real fields, but I don’t know the answer.


10 Responses to Roth’s theorem

  1. Err, $$\sum_{n=1}^\infty 10^{-n}$$ is equal to $$1/9$$…

  2. Anonymous says:

    finally a post i can understand a little bit.

  3. It’s very nice to see a geometer’s perspective on this type of questions!

    • Danny Calegari says:

      Dear Emmanuel – I warmly invite you (and any other reader) to give other perspectives (a geometer’s perspective is the only one I’m capable of giving . . . :)



      • My first exposure to the elementary parts was from reading Hardy and Wright’s “Introduction to the theory of numbers” (chapter X on continued fractions and chapter XI on approximation of reals by rational numbers); I remember thinking that the proof of periodicity of continued fractions there was rather unenlightening…

  4. I like your polyhedron picture (although I’m having trouble seeing any manifest periodicity).

    Minor clarification: In the original Ford circle picture, a vertical line intersects infinitely many circles if and only if its x-coordinate is irrational.

    • Thanks for the correction.

      Maybe it’s worth saying explicitly that what is periodic in the polyhedron picture is the sail (the boundary of the convex hull of the set of interior lattice points), and it is periodic with respect to the affine action (on the plane) of the 2×2 matrix whose eigenvectors span the extremal rays of the cone; i.e. the sail is a particular piecewise-linear curve in the plane on which the matrix acts by “translation”; the continued fraction expansion can be recovered from the geometry of the sail, so the existence of a Z family of symmetries implies that the continued fraction expansion is periodic.

      Best, Danny

  5. Tushar Das says:

    This was a wonderful post :O)

    Those interested may want to look at –

    L. R. Ford,
    The American Mathematical Monthly, Vol. 45, No. 9 (Nov., 1938), pp. 586-601.

    D. Sullivan,
    Disjoint spheres, approximation by imaginary quadratic numbers, and the logarithmic law for geodesics,
    Acta. Math. 149 (1982), p. 215–237

    and also the book by Conway –
    The sensual (quadratic) form
    By John Horton Conway, Francis Yein Chei Fung

  6. Pingback: Percentages for sceptics: part III | Out of the Norm
