Large Deviations – Cramer’s theorem

Last summer I attended a conference on “Dynamics and Numbers” at the Max Planck Institute in Bonn. Vincent Delecroix gave a nice talk on large deviations for the Teichmuller flow, and I decided to learn a bit about this concept. I wrote the notes below at that time.

In this post, I will prove the simplest example of large deviations – Cramer’s theorem. Here is a simple example. Consider a random coin toss, and at each step you either win or loose {1} dollar. After {N} steps, consider the probability of having more than {aN} dollars, where {a>0} is fixed. Of course, it goes to zero as {N{\rightarrow} \infty}, but how fast? Playing around with Stirling’s formula reveals that the probability decays exponentially fast.

The theory of large deviations is concerned with these kinds of exponentially unlikely events. We now move on to a more precise setup. Continue reading


Entropy and set projections

Let {A\subset {\mathbb Z}^3} be a finite set, and let {A_{xy}, A_{yz}, A_{xz} } be its projections to the corresponding two-planes. Denoting by {\# S} the cardinality of a set {S}, we have the following inequality:

\displaystyle  \left(\# A\right)^2 \leq \left(\# A_{xy}\right)\left(\# A_{yz}\right) \left(\# A_{xy}\right)

This generalizes to sets in higher dimensions and projections to subspaces of possibly different dimensions. As long as each coordinate appears at least {n} times on the right, the size of {\left(\# A\right)^n} is bounded by the product of the projections.

One proof of the above inequality is via a mixed version of Cauchy-Schwartz or Holder-type inequalities. For {f,g,h} functions of two variables, we have

\displaystyle  \begin{array}{rcl}  \left(\int \int \int f\left(x,y\right) g\left(y,z\right) h\left(x,z\right) dx\, dy\, dz\right)^2 \leq\\ \leq\left(\int \int f\left(x,y\right)dx\, dy\right) \cdot \left(\int \int g\left(y,z\right)dy\, dz\right) \left(\int \int h\left(x,z\right)dx \, dz\right) \end{array}

Taking {f,g,h} to be indicators of projections of {A} to the {2}-planes, the bound follows. However, proving the inequality for {f,g,h} requires some work.

There is a different inequality, involving entropy of random variables, whose proof (and generalizations) are much more conceptual. Namely, if {X,Y,Z} are random variables and {H\left(-\right)} denotes entropy, we have

\displaystyle  2H\left(X,Y,Z\right) \leq H\left(X,Y\right) + H\left(Y,Z\right) + H\left(X,Z\right)

Below the fold are the definitions, a proof, and an application. Continue reading