Entropy and set projections

Let {A\subset {\mathbb Z}^3} be a finite set, and let {A_{xy}, A_{yz}, A_{xz} } be its projections to the corresponding coordinate {2}-planes. Denoting by {\# S} the cardinality of a set {S}, we have the following inequality:

\displaystyle  \left(\# A\right)^2 \leq \left(\# A_{xy}\right)\left(\# A_{yz}\right) \left(\# A_{xz}\right)

This generalizes to sets in higher dimensions and to projections onto coordinate subspaces of possibly different dimensions: as long as each coordinate appears at least {n} times among the projections on the right, {\left(\# A\right)^n} is bounded by the product of the sizes of the projections.
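As a quick sanity check (not part of the argument), here is a short Python sketch that verifies the inequality for a randomly generated finite set; the seed, set size, and coordinate range are arbitrary choices.

```python
# Sanity check of (#A)^2 <= (#A_xy)(#A_yz)(#A_xz) for a random finite
# subset of Z^3.  Seed, set size, and coordinate range are arbitrary.
import random

random.seed(0)
A = {(random.randint(0, 4), random.randint(0, 4), random.randint(0, 4))
     for _ in range(60)}

A_xy = {(x, y) for (x, y, z) in A}
A_yz = {(y, z) for (x, y, z) in A}
A_xz = {(x, z) for (x, y, z) in A}

lhs = len(A) ** 2
rhs = len(A_xy) * len(A_yz) * len(A_xz)
print(lhs, "<=", rhs, ":", lhs <= rhs)
```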

One proof of the above inequality is via a mixed version of Cauchy-Schwarz or Hölder-type inequalities. For {f,g,h} nonnegative functions of two variables, we have

\displaystyle  \begin{array}{rcl}  \left(\int \int \int f\left(x,y\right) g\left(y,z\right) h\left(x,z\right) dx\, dy\, dz\right)^2 \leq\\ \leq\left(\int \int f\left(x,y\right)^2dx\, dy\right) \cdot \left(\int \int g\left(y,z\right)^2dy\, dz\right) \cdot \left(\int \int h\left(x,z\right)^2dx \, dz\right) \end{array}

Taking {f,g,h} to be the indicators of the projections of {A} to the three coordinate {2}-planes (for which {f^2=f}), the bound follows. However, proving the inequality for general {f,g,h} requires some work.
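A discrete analogue of the functional inequality is also easy to test numerically. The sketch below uses random nonnegative matrices standing in for {f,g,h} (the sizes and seed are arbitrary) and compares the squared triple sum with the product of the sums of squares.

```python
# Discrete check of (sum f(x,y) g(y,z) h(x,z))^2 <= (sum f^2)(sum g^2)(sum h^2)
# for random nonnegative "functions" of two variables, stored as matrices.
import numpy as np

rng = np.random.default_rng(0)
n = 5                       # arbitrary size of each coordinate range
f = rng.random((n, n))      # f[x, y]
g = rng.random((n, n))      # g[y, z]
h = rng.random((n, n))      # h[x, z]

triple = np.einsum("xy,yz,xz->", f, g, h)   # sum over all x, y, z
lhs = triple ** 2
rhs = (f**2).sum() * (g**2).sum() * (h**2).sum()
print(lhs, "<=", rhs, ":", lhs <= rhs)
```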

There is a different inequality, involving the entropy of random variables, whose proof and generalizations are much more conceptual. Namely, if {X,Y,Z} are random variables and {H\left(-\right)} denotes entropy, we have

\displaystyle  2H\left(X,Y,Z\right) \leq H\left(X,Y\right) + H\left(Y,Z\right) + H\left(X,Z\right)

Below the fold are the definitions, a proof, and an application.

— Entropy —

Let {X} be a random variable taking values in a finite set {S}. Define its entropy to be

\displaystyle  H\left(X\right):=\sum_{s\in S} -{\mathbb P}\left(X=s\right)\cdot \log \left({\mathbb P}\left(X=s\right)\right)

If {X} and {Y} are random variables defined on the same probability space, taking values in finite sets {S} and {T} respectively, define the conditional entropy of {X} given {Y} as

\displaystyle  H\left(X|Y\right) := \sum_{t\in T} {\mathbb P}\left(Y=t\right) \sum_{s\in S} -{\mathbb P}\left(X=s|Y=t\right) \log \left({\mathbb P}\left(X=s|Y=t\right)\right)

The joint entropy of {X} and {Y}, denoted {H\left(X,Y\right)}, is simply the entropy of the variable {X\times Y}.
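For concreteness, here is a minimal Python sketch of these definitions, using the natural logarithm (any fixed base would do); the small joint distribution at the end is an arbitrary example.

```python
# Entropy, conditional entropy, and joint entropy, computed directly
# from the definitions above.  Logarithms are natural; any base works.
import numpy as np

def entropy(p):
    """Entropy of a finite distribution given as an array of probabilities."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def conditional_entropy(pxy):
    """H(X|Y) from the joint table pxy[i, j] = P(X = s_i, Y = t_j)."""
    pxy = np.asarray(pxy, dtype=float)
    py = pxy.sum(axis=0)                      # marginal distribution of Y
    return sum(pyj * entropy(pxy[:, j] / pyj)
               for j, pyj in enumerate(py) if pyj > 0)

# An arbitrary example joint distribution of (X, Y).
pxy = np.array([[0.25, 0.25],
                [0.40, 0.10]])
print("H(X)   =", entropy(pxy.sum(axis=1)))
print("H(X,Y) =", entropy(pxy))               # joint entropy of the pair
print("H(X|Y) =", conditional_entropy(pxy))
```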

— Properties —

A heuristic for the definition of entropy is that it is “the amount of information learned when the value of {X} is revealed”. Similarly, the conditional entropy {H\left(X|Y\right)} is the amount of information learned when {X} is revealed, given that we already know the value of {Y}.

  • {0\leq H\left(X\right)\leq \log \left(\# S\right)}
  • {H\left(X,Y\right) = H\left(X|Y\right)+H\left(Y\right)}
  • {H\left(X|Y\right) \leq H\left(X\right)}

The interpretation of the first is that entropy is maximized when all values are equally likely. For the second, the information contained in {X} and {Y} together is the same as the information in {Y}, plus the information in {X} once {Y} is already known. The third property follows from Jensen’s inequality, while its information-theoretic content is nearly obvious: knowing {Y} cannot increase the amount left to learn from {X}.
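These three properties can be checked numerically; the following self-contained sketch does so for a randomly generated joint distribution of {X} and {Y} (the sizes and seed are arbitrary choices).

```python
# Numerical check of the three properties for a random joint
# distribution of (X, Y).
import numpy as np

def H(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(1)
pxy = rng.random((4, 3))
pxy /= pxy.sum()                    # joint distribution of (X, Y)
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

H_xy, H_x, H_y = H(pxy), H(px), H(py)
# H(X|Y) computed from the definition: average entropy of the conditionals.
H_x_given_y = sum(pyj * H(pxy[:, j] / pyj) for j, pyj in enumerate(py) if pyj > 0)

print(0 <= H_x <= np.log(len(px)))             # 0 <= H(X) <= log(#S)
print(np.isclose(H_xy, H_x_given_y + H_y))     # H(X,Y) = H(X|Y) + H(Y)
print(H_x_given_y <= H_x + 1e-12)              # H(X|Y) <= H(X)
```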

Theorem 1 The following inequality between entropies holds:

\displaystyle  2H\left(X,Y,Z\right) \leq H\left(X,Y\right) + H\left(Y,Z\right) + H\left(X,Z\right)

Proof: We have

\displaystyle  \begin{array}{rcl}  H\left(X,Y,Z\right) - H\left(Y,Z\right) = H\left(X|Y,Z\right)\\ H\left(X,Y,Z\right) - H\left(X,Z\right) = H\left(Y|X,Z\right) \end{array}

Summing the above two equations, it suffices to show that

\displaystyle  H\left(X|Y,Z\right) + H\left(Y|X,Z\right) \leq H\left(X,Y\right)

However, we have that

\displaystyle  \begin{array}{rcl}  H\left(X|Y,Z\right) &\leq H\left(X|Y\right)\\ H\left(Y|X,Z\right) & \leq H\left(Y\right) \end{array}

Combined with the equality {H\left(X,Y\right)=H\left(X|Y\right)+H\left(Y\right)}, the claim follows. \Box
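As with the earlier properties, the theorem itself is easy to check numerically; the sketch below does so for a random joint distribution of {X,Y,Z} on a small product of finite sets (sizes and seed are arbitrary).

```python
# Check 2 H(X,Y,Z) <= H(X,Y) + H(Y,Z) + H(X,Z) for a random joint
# distribution on a 3 x 4 x 5 grid.
import numpy as np

def H(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(2)
pxyz = rng.random((3, 4, 5))
pxyz /= pxyz.sum()                  # joint distribution of (X, Y, Z)

lhs = 2 * H(pxyz)
rhs = H(pxyz.sum(axis=2)) + H(pxyz.sum(axis=0)) + H(pxyz.sum(axis=1))
#     H(X,Y)                H(Y,Z)                H(X,Z)
print(lhs, "<=", rhs, ":", lhs <= rhs + 1e-12)
```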

— Application —

Let now {A\subset {\mathbb Z}^3} be the finite set from the beginning of the post, viewed as a probability space with each point having equal probability {1/\left(\# A\right)}. Define the random variables {X,Y,Z} to be the coordinates of a randomly chosen point. Then we have that

\displaystyle  H\left(X,Y,Z\right) = \log \left(\# A\right)

Moreover, we have the inequality

\displaystyle  H\left(X,Y\right) \leq \log \left(\# A_{xy}\right)

Indeed, the pair {\left(X,Y\right)} takes values in the projection {A_{xy}}, so by the first property above its entropy is at most {\log \left(\# A_{xy}\right)}, with equality exactly for the uniform distribution on {A_{xy}}. A similar bound holds for the other two pairs of coordinates, so we have the initial inequality in logarithmic form:

\displaystyle  2 \log \left(\# A\right) \leq \log \left(\# A_{xy}\right) + \log \left(\# A_{yz}\right) + \log \left(\# A_{xz}\right)

Exponentiating, we get the inequality from the beginning of the post.
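Putting the application into code: for a randomly generated finite set {A} and {X,Y,Z} the coordinates of a uniformly chosen point of {A}, the sketch below confirms that {H\left(X,Y,Z\right)=\log \left(\# A\right)} and that {H\left(X,Y\right)\leq \log \left(\# A_{xy}\right)} (the set and seed are arbitrary).

```python
# For X, Y, Z the coordinates of a uniform random point of a random
# finite set A in Z^3: H(X,Y,Z) = log(#A) and H(X,Y) <= log(#A_xy).
import math
import random
from collections import Counter

random.seed(3)
A = {(random.randint(0, 4), random.randint(0, 4), random.randint(0, 4))
     for _ in range(60)}

def H(counts):
    """Entropy of the distribution proportional to the given counts."""
    total = sum(counts)
    return -sum(c / total * math.log(c / total) for c in counts if c > 0)

H_xyz = H([1] * len(A))                                      # uniform on A
H_xy = H(list(Counter((x, y) for (x, y, z) in A).values()))  # law of (X, Y)
A_xy = {(x, y) for (x, y, z) in A}

print(math.isclose(H_xyz, math.log(len(A))))
print(H_xy <= math.log(len(A_xy)) + 1e-12)
```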
