# Entropy and set projections

Let ${A\subset {\mathbb Z}^3}$ be a finite set, and let ${A_{xy}, A_{yz}, A_{xz} }$ be its projections to the corresponding coordinate ${2}$-planes. Denoting by ${\# S}$ the cardinality of a set ${S}$, we have the following inequality:

$\displaystyle \left(\# A\right)^2 \leq \left(\# A_{xy}\right)\left(\# A_{yz}\right) \left(\# A_{xz}\right)$

This generalizes to sets in higher dimensions and projections to subspaces of possibly different dimensions. As long as each coordinate appears at least ${n}$ times among the projections on the right-hand side, ${\left(\# A\right)^n}$ is bounded by the product of the sizes of the projections.
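To make the inequality concrete, here is a small numerical check on a random finite subset of ${{\mathbb Z}^3}$ (a minimal sketch; the set and variable names are illustrative):

```python
import random

# Sample a random finite subset of Z^3 inside a small cube.
random.seed(0)
A = {(random.randrange(5), random.randrange(5), random.randrange(5))
     for _ in range(40)}

# Projections onto the three coordinate 2-planes.
A_xy = {(x, y) for (x, y, z) in A}
A_yz = {(y, z) for (x, y, z) in A}
A_xz = {(x, z) for (x, y, z) in A}

# The projection inequality: (#A)^2 <= #A_xy * #A_yz * #A_xz.
assert len(A) ** 2 <= len(A_xy) * len(A_yz) * len(A_xz)
```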

One proof of the above inequality is via a mixed version of Cauchy-Schwarz or Hölder-type inequalities. For ${f,g,h}$ nonnegative functions of two variables, we have

$\displaystyle \begin{array}{rcl} \left(\int \int \int f\left(x,y\right) g\left(y,z\right) h\left(x,z\right) dx\, dy\, dz\right)^2 \leq\\ \leq\left(\int \int f\left(x,y\right)^2dx\, dy\right) \cdot \left(\int \int g\left(y,z\right)^2dy\, dz\right) \cdot \left(\int \int h\left(x,z\right)^2dx \, dz\right) \end{array}$

Taking ${f,g,h}$ to be the indicators of the projections of ${A}$ to the ${2}$-planes (so that ${f^2=f}$, and similarly for ${g,h}$), the bound follows. However, proving the inequality for general ${f,g,h}$ requires some work.
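A discrete version of this kind of inequality can be checked numerically. The sketch below verifies the Cauchy-Schwarz form ${\left(\sum fgh\right)^2 \leq \left(\sum f^2\right)\left(\sum g^2\right)\left(\sum h^2\right)}$ on random nonnegative arrays, with sums playing the role of integrals (all names are illustrative):

```python
import random

random.seed(1)
n = 6
# Random nonnegative functions of two variables, as n x n arrays.
f = [[random.random() for _ in range(n)] for _ in range(n)]
g = [[random.random() for _ in range(n)] for _ in range(n)]
h = [[random.random() for _ in range(n)] for _ in range(n)]

# Left side: (sum over x,y,z of f(x,y) g(y,z) h(x,z))^2.
lhs = sum(f[x][y] * g[y][z] * h[x][z]
          for x in range(n) for y in range(n) for z in range(n)) ** 2

# Right side: product of the three sums of squares.
rhs = (sum(v * v for row in f for v in row)
       * sum(v * v for row in g for v in row)
       * sum(v * v for row in h for v in row))

assert lhs <= rhs
```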

There is a different inequality, involving the entropy of random variables, whose proof and generalizations are much more conceptual. Namely, if ${X,Y,Z}$ are random variables and ${H\left(-\right)}$ denotes entropy, we have

$\displaystyle 2H\left(X,Y,Z\right) \leq H\left(X,Y\right) + H\left(Y,Z\right) + H\left(X,Z\right)$

Below the fold are the definitions, a proof, and an application.

— Entropy —

Let ${X}$ be a random variable taking values in a finite set ${S}$. Define its entropy to be

$\displaystyle H\left(X\right):=\sum_{s\in S} -{\mathbb P}\left(X=s\right)\cdot \log \left({\mathbb P}\left(X=s\right)\right)$

with the convention that terms with ${{\mathbb P}\left(X=s\right)=0}$ contribute ${0}$ to the sum.
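In code, the definition might read as follows (a minimal sketch; zero-probability values are skipped, so that ${0\cdot\log 0}$ contributes nothing):

```python
from math import log

def entropy(dist):
    """Entropy of a distribution given as {value: probability}."""
    # Terms with probability 0 contribute 0, by convention.
    return sum(-p * log(p) for p in dist.values() if p > 0)

# A uniform distribution on 4 values has entropy log(4),
# the maximum over all distributions on 4 values.
uniform = {s: 0.25 for s in "abcd"}
assert abs(entropy(uniform) - log(4)) < 1e-12
```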

If ${X}$ and ${Y}$ are variables defined on the same probability space, with ${Y}$ taking values in a finite set ${T}$, define the conditional entropy of ${X}$, given ${Y}$, as

$\displaystyle H\left(X|Y\right) := \sum_{t\in T} {\mathbb P}\left(Y=t\right) \sum_{s\in S} -{\mathbb P}\left(X=s|Y=t\right) \log \left({\mathbb P}\left(X=s|Y=t\right)\right)$

The joint entropy of ${X}$ and ${Y}$, denoted ${H\left(X,Y\right)}$, is simply the entropy of the variable ${X\times Y}$.

— Properties —

A heuristic for the definition of entropy is that it is “the amount of information learned, when the value of ${X}$ is revealed”. Similarly, the conditional entropy ${H\left(X|Y\right)}$ is the amount of information learned, if we already know the value of ${Y}$ and ${X}$ is revealed.

• ${0\leq H\left(X\right)\leq \log \left(\# S\right)}$
• ${H\left(X,Y\right) = H\left(X|Y\right)+H\left(Y\right)}$
• ${H\left(X|Y\right) \leq H\left(X\right)}$

The interpretation of the first is that entropy is maximized when all values are equally likely. For the second, the information contained in ${X}$ and ${Y}$ together is the same as the information in ${Y}$, plus the information in ${X}$ when ${Y}$ is already known. The third property follows from Jensen’s inequality, while its information content is nearly obvious: knowing ${Y}$ cannot increase the amount of information revealed by ${X}$.
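These three properties can be verified numerically. The sketch below builds a random joint distribution of two variables and checks each property in turn (all function and variable names are illustrative):

```python
import random
from math import log

def H(dist):
    """Entropy of a distribution {value: probability}."""
    return sum(-p * log(p) for p in dist.values() if p > 0)

random.seed(2)
# Random joint distribution of (X, Y) on {0,1,2} x {0,1,2}.
w = {(s, t): random.random() for s in range(3) for t in range(3)}
total = sum(w.values())
joint = {k: v / total for k, v in w.items()}

# Marginal distributions of X and Y.
px = {s: sum(p for (a, t), p in joint.items() if a == s) for s in range(3)}
py = {t: sum(p for (s, b), p in joint.items() if b == t) for t in range(3)}

# Conditional entropy H(X|Y) = sum_t P(Y=t) * H(X | Y=t).
H_X_given_Y = sum(
    py[t] * H({s: joint[(s, t)] / py[t] for s in range(3)})
    for t in range(3) if py[t] > 0)

assert 0 <= H(px) <= log(3)                          # 0 <= H(X) <= log(#S)
assert abs(H(joint) - (H_X_given_Y + H(py))) < 1e-9  # H(X,Y) = H(X|Y) + H(Y)
assert H_X_given_Y <= H(px) + 1e-12                  # H(X|Y) <= H(X)
```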

Theorem 1 The following inequality between entropies holds:

$\displaystyle 2H\left(X,Y,Z\right) \leq H\left(X,Y\right) + H\left(Y,Z\right) + H\left(X,Z\right)$

Proof: We have

$\displaystyle \begin{array}{rcl} H\left(X,Y,Z\right) - H\left(Y,Z\right) &=& H\left(X|Y,Z\right)\\ H\left(X,Y,Z\right) - H\left(X,Z\right) &=& H\left(Y|X,Z\right) \end{array}$

Summing the above two equations, we see that it suffices to show that

$\displaystyle H\left(X|Y,Z\right) + H\left(Y|X,Z\right) \leq H\left(X,Y\right)$

However, we have that

$\displaystyle \begin{array}{rcl} H\left(X|Y,Z\right) &\leq& H\left(X|Y\right)\\ H\left(Y|X,Z\right) &\leq& H\left(Y\right) \end{array}$

Combined with the equality ${H\left(X,Y\right)=H\left(X|Y\right)+H\left(Y\right)}$, the claim follows. $\Box$
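Theorem 1 can also be checked numerically. The sketch below draws a random joint distribution of three variables and verifies the entropy inequality (all names are illustrative):

```python
import random
from math import log

def H(dist):
    """Entropy of a distribution {value: probability}."""
    return sum(-p * log(p) for p in dist.values() if p > 0)

random.seed(3)
# Random joint distribution of (X, Y, Z) on {0,1} x {0,1} x {0,1}.
w = {(x, y, z): random.random()
     for x in range(2) for y in range(2) for z in range(2)}
total = sum(w.values())
pxyz = {k: v / total for k, v in w.items()}

def marginal(keep):
    """Marginal distribution over the coordinates listed in `keep`."""
    out = {}
    for k, p in pxyz.items():
        key = tuple(k[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

# 2 H(X,Y,Z) <= H(X,Y) + H(Y,Z) + H(X,Z)
lhs = 2 * H(pxyz)
rhs = H(marginal((0, 1))) + H(marginal((1, 2))) + H(marginal((0, 2)))
assert lhs <= rhs + 1e-12
```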

— Application —

Let now ${A\subset {\mathbb Z}^3}$ be the finite set from the introduction, viewed as a probability space in which each point has equal probability ${1/\left(\# A\right)}$. Define the random variables ${X,Y,Z}$ to be the coordinates of a randomly chosen point of ${A}$. Then we have that

$\displaystyle H\left(X,Y,Z\right) = \log \left(\# A\right)$

Moreover, we have the inequality

$\displaystyle H\left(X,Y\right) \leq \log \left(\# A_{xy}\right)$

Indeed, the pair ${\left(X,Y\right)}$ takes values in the projection ${A_{xy}}$, and the entropy of a variable is at most the logarithm of the number of its possible values, with equality for the uniform distribution. A similar bound holds for each pair of projections, so we have the initial inequality in logarithmic form:

$\displaystyle 2 \log \left(\# A\right) \leq \log \left(\# A_{xy}\right) + \log \left(\# A_{yz}\right) + \log \left(\# A_{xz}\right)$

Exponentiating, we get the inequality from the beginning of the post.
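Tying the argument together, one can check on a concrete set that the uniform measure on ${A}$ realizes ${H\left(X,Y,Z\right)=\log\left(\# A\right)}$ while ${H\left(X,Y\right)\leq \log\left(\# A_{xy}\right)}$ (a minimal sketch with illustrative names):

```python
from math import log

# A concrete finite subset of Z^3.
A = {(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1), (2, 0, 1)}

# Uniform distribution of (X, Y, Z) on A.
pxyz = {a: 1 / len(A) for a in A}

def H(dist):
    """Entropy of a distribution {value: probability}."""
    return sum(-p * log(p) for p in dist.values() if p > 0)

def pair_marginal(i, j):
    """Distribution of the (i-th, j-th) coordinates of a random point."""
    out = {}
    for a, p in pxyz.items():
        key = (a[i], a[j])
        out[key] = out.get(key, 0.0) + p
    return out

A_xy = {(x, y) for (x, y, z) in A}
# H(X,Y,Z) = log(#A) for the uniform measure on A ...
assert abs(H(pxyz) - log(len(A))) < 1e-12
# ... while H(X,Y) <= log(#A_xy), the logarithmic projection bound.
assert H(pair_marginal(0, 1)) <= log(len(A_xy)) + 1e-12
```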