Let $A \subset {\mathbb Z}^3$ be a finite set, and let $A_{12}, A_{13}, A_{23}$ be its projections to the corresponding two-planes. Denoting by $|S|$ the cardinality of a set $S$, we have the following inequality:

$\displaystyle |A|^2 \le |A_{12}| \cdot |A_{13}| \cdot |A_{23}|.$
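As a quick numerical sanity check (not part of the argument below), the inequality can be verified on a random finite subset of ${\mathbb Z}^3$; the helper name in this Python sketch is mine:

```python
import itertools
import random

def loomis_whitney_holds(A):
    """Check |A|^2 <= |A_12| * |A_13| * |A_23| for a finite set A of integer triples."""
    A_12 = {(x, y) for (x, y, z) in A}  # projection to the (1,2)-plane
    A_13 = {(x, z) for (x, y, z) in A}  # projection to the (1,3)-plane
    A_23 = {(y, z) for (x, y, z) in A}  # projection to the (2,3)-plane
    return len(A) ** 2 <= len(A_12) * len(A_13) * len(A_23)

random.seed(0)
points = list(itertools.product(range(5), repeat=3))
A = set(random.sample(points, 40))
print(loomis_whitney_holds(A))  # True
```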
This generalizes to sets in higher dimensions and projections to subspaces of possibly different dimensions: as long as each coordinate appears at least $k$ times among the projections on the right-hand side, $|A|^k$ is bounded by the product of the sizes of the projections.
One proof of the above inequality is via a mixed version of Cauchy–Schwarz or Hölder-type inequalities. For functions of two variables, we have

$\displaystyle \sum_{x,y,z} f(x,y)\, g(x,z)\, h(y,z) \le \|f\|_2 \, \|g\|_2 \, \|h\|_2.$
Taking $f, g, h$ to be the indicator functions of the projections of $A$ to the coordinate planes, the bound follows. However, proving the inequality in higher dimensions requires some work.
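The mixed Cauchy–Schwarz inequality above holds for arbitrary nonnegative functions, so it too can be spot-checked numerically; this Python sketch uses random values (all names are mine):

```python
import math
import random

random.seed(1)
n = 4
# Random nonnegative functions of two variables on {0, ..., n-1}^2.
f = {(x, y): random.random() for x in range(n) for y in range(n)}
g = {(x, z): random.random() for x in range(n) for z in range(n)}
h = {(y, z): random.random() for y in range(n) for z in range(n)}

def l2_norm(u):
    """The l^2 norm (sum of squares, square-rooted) of a function given as a dict."""
    return math.sqrt(sum(v * v for v in u.values()))

lhs = sum(f[x, y] * g[x, z] * h[y, z]
          for x in range(n) for y in range(n) for z in range(n))
rhs = l2_norm(f) * l2_norm(g) * l2_norm(h)
print(lhs <= rhs)  # True
```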
There is a different inequality, involving the entropy of random variables, whose proof and generalizations are much more conceptual. Namely, if $X, Y, Z$ are random variables and $H$ denotes entropy, we have

$\displaystyle 2H(X,Y,Z) \le H(X,Y) + H(X,Z) + H(Y,Z).$
Below the fold are the definitions, a proof, and an application.
— Entropy —
Let $X$ be a random variable taking values in a finite set $S$. Define its entropy to be

$\displaystyle H(X) = -\sum_{s \in S} \Pr[X = s] \log \Pr[X = s],$

with the convention that $0 \log 0 = 0$.
If $X$ and $Y$ are variables defined on the same probability space, define the conditional entropy of $X$, given $Y$, as

$\displaystyle H(X \mid Y) = \sum_{y} \Pr[Y = y] \, H(X \mid Y = y),$

where $H(X \mid Y = y)$ is the entropy of $X$ with respect to the conditional probability $\Pr[\,\cdot \mid Y = y]$.
The joint entropy of $X$ and $Y$, denoted $H(X,Y)$, is simply the entropy of the variable $(X,Y)$.
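The three definitions translate directly into code. Here is a minimal Python sketch computing them for the empirical distribution of a list of observations (the function names are mine):

```python
import math
from collections import Counter

def entropy(samples):
    """Empirical entropy H(X) = -sum_s p(s) log p(s) of a list of observations."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())

def joint_entropy(xs, ys):
    """H(X, Y): the entropy of the paired variable (X, Y)."""
    return entropy(list(zip(xs, ys)))

def cond_entropy(xs, ys):
    """H(X | Y) = sum_y Pr[Y = y] H(X | Y = y), computed from the definition."""
    n = len(ys)
    total = 0.0
    for y, count_y in Counter(ys).items():
        total += (count_y / n) * entropy([x for x, yy in zip(xs, ys) if yy == y])
    return total

# A fair coin has entropy log 2.
print(entropy([0, 1, 0, 1]))  # 0.693... = log 2
```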
— Properties —
A heuristic for the definition of entropy is that it is “the amount of information learned when the value of $X$ is revealed”. Similarly, the conditional entropy $H(X \mid Y)$ is the amount of information learned if we already know the value of $Y$ and $X$ is revealed.

The basic properties of entropy are:

$\displaystyle H(X) \le \log |S|,$

$\displaystyle H(X,Y) = H(Y) + H(X \mid Y),$

$\displaystyle H(X \mid Y) \le H(X).$

The interpretation of the first is that entropy is maximized when all values are equally likely. For the second, the information contained in $X$ and $Y$ is the same as that in $Y$, plus the one in $X$ if we already know $Y$. The third property follows from Jensen’s inequality, while its information content is nearly obvious.
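These three properties (maximal entropy at the uniform distribution, the chain rule, and the fact that conditioning does not increase entropy) hold for every joint distribution, so they can be verified on empirical data; a self-contained Python sketch with my own helper names:

```python
import math
import random
from collections import Counter

def H(samples):
    """Empirical entropy of a list of observed values."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())

def H_cond(xs, ys):
    """H(X | Y) = sum_y Pr[Y = y] H(X | Y = y), from the definition."""
    n = len(ys)
    return sum((cy / n) * H([x for x, yy in zip(xs, ys) if yy == y])
               for y, cy in Counter(ys).items())

random.seed(2)
xs = [random.randrange(3) for _ in range(1000)]
ys = [(x + random.randrange(2)) % 3 for x in xs]  # Y correlated with X
xy = list(zip(xs, ys))

assert H(xs) <= math.log(3) + 1e-12                    # H(X) <= log|S|
assert abs(H(xy) - (H(ys) + H_cond(xs, ys))) < 1e-9    # chain rule
assert H_cond(xs, ys) <= H(xs) + 1e-12                 # conditioning reduces entropy
print("all three properties hold")
```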
Theorem 1 The following inequality between entropies holds:

$\displaystyle 2H(X,Y,Z) \le H(X,Y) + H(X,Z) + H(Y,Z).$
Proof: We have

$\displaystyle H(X,Y,Z) = H(X,Y) + H(Z \mid X,Y),$

$\displaystyle H(X,Y,Z) = H(X,Z) + H(Y \mid X,Z).$

Summing the above two equations, it suffices to show that

$\displaystyle H(Z \mid X,Y) + H(Y \mid X,Z) \le H(Y,Z).$

However, we have that

$\displaystyle H(Z \mid X,Y) \le H(Z \mid Y) \quad \text{and} \quad H(Y \mid X,Z) \le H(Y),$

since conditioning on more variables does not increase entropy. Combined with the equality $H(Y,Z) = H(Y) + H(Z \mid Y)$, the claim follows.
— Application —
Let now $A \subset {\mathbb Z}^3$ be the finite set, viewed as a probability space with each point having equal probability $1/|A|$. Define the random variables $X, Y, Z$ to be the coordinates of a randomly chosen point. Then we have that

$\displaystyle H(X,Y,Z) = \log |A|.$
Moreover, we have the inequality

$\displaystyle H(X,Y) \le \log |A_{12}|.$
Indeed, the uniform distribution over the projection to the $(1,2)$-plane would correspond to maximal entropy for $X$ and $Y$. A similar bound holds for any pair of projections, so we have the initial inequality in logarithmic form:

$\displaystyle 2 \log |A| \le \log |A_{12}| + \log |A_{13}| + \log |A_{23}|.$
Exponentiating, we get the inequality from the beginning of the post.
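The two facts used in this application, $H(X,Y,Z) = \log|A|$ and $H(X,Y) \le \log|A_{12}|$, can be checked on a small explicit set; a Python sketch with an example set $A$ of my choosing:

```python
import math
from collections import Counter

# A small explicit set A in Z^3; X, Y, Z are the coordinates of a uniform point of A.
A = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)]

def H(values):
    """Empirical entropy of a list of observed values."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())

pairs_12 = [(x, y) for (x, y, z) in A]  # the variable (X, Y)
proj_12 = set(pairs_12)                 # the projection A_12

assert abs(H(A) - math.log(len(A))) < 1e-12        # H(X,Y,Z) = log|A|
assert H(pairs_12) <= math.log(len(proj_12)) + 1e-12  # H(X,Y) <= log|A_12|
print("entropy bounds verified")
```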