Information projection
In information theory, the information projection or I-projection of a probability distribution q onto a set of distributions P is
$$p^* = \underset{p \in P}{\arg\min}\; D_{\mathrm{KL}}(p \,\|\, q),$$
where $D_{\mathrm{KL}}(p \,\|\, q)$ is the Kullback–Leibler divergence from p to q. Viewing the Kullback–Leibler divergence as a measure of distance, the I-projection $p^*$ is the "closest" distribution to q of all the distributions in P.
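As an illustration, the following minimal sketch computes an I-projection numerically for one particular choice of P, namely the linear family of distributions on {0, 1, 2} with a prescribed mean. It relies on the standard fact that, for such a family, the minimiser of $D_{\mathrm{KL}}(p \,\|\, q)$ is an exponentially tilted version of q. The family, the example values of q and the target mean, and the helper names (`kl`, `tilt`, `mean`, `i_projection_mean`) are assumptions chosen for the example, not part of the definition.

```python
import math

def kl(p, q):
    """D_KL(p || q) in nats for discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def tilt(q, lam):
    """Tilted distribution with p(x) proportional to q(x) * exp(lam * x)."""
    weights = [qi * math.exp(lam * x) for x, qi in enumerate(q)]
    z = sum(weights)
    return [w / z for w in weights]

def mean(p):
    """Mean of X under p, where X takes the values 0, 1, ..., len(p) - 1."""
    return sum(x * pi for x, pi in enumerate(p))

def i_projection_mean(q, m, lo=-50.0, hi=50.0, iters=200):
    """I-projection of q onto {p : E_p[X] = m}, found by bisection on lam.

    The mean of the tilted distribution increases with lam, so bisection
    locates the tilt whose mean equals m (assumed to lie strictly between
    the smallest and largest values of X).
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean(tilt(q, mid)) < m:
            lo = mid
        else:
            hi = mid
    return tilt(q, (lo + hi) / 2)

q = [0.5, 0.3, 0.2]                  # reference distribution, mean 0.7
p_star = i_projection_mean(q, 1.0)   # closest distribution with mean 1.0

# Any other member of the family should be at least as far from q.
other = [0.25, 0.5, 0.25]            # also has mean 1.0
print(p_star, kl(p_star, q), kl(other, q))
```

Comparing `kl(p_star, q)` with the divergence of other members of the family, such as `other`, gives a quick sanity check that the computed distribution is indeed the minimiser.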
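The I-projection is useful in setting up information geometry, notably because of the following inequality, which holds for every p in P when P is convex and $p^*$ denotes the I-projection of q onto P:[1]
$$D_{\mathrm{KL}}(p \,\|\, q) \geq D_{\mathrm{KL}}(p \,\|\, p^*) + D_{\mathrm{KL}}(p^* \,\|\, q).$$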
This inequality can be interpreted as an information-geometric analogue of the Pythagorean theorem, stated as an inequality, in which the KL divergence plays the role of squared distance in a Euclidean space.
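The inequality $D_{\mathrm{KL}}(p \,\|\, q) \geq D_{\mathrm{KL}}(p \,\|\, p^*) + D_{\mathrm{KL}}(p^* \,\|\, q)$ can be checked numerically on a small case. The sketch below assumes, purely for illustration, that q is a Bernoulli distribution with parameter 0.2 and that P is the convex set of Bernoulli distributions with parameter in [0.5, 0.9]; the helper names `kl_bern` and `project` are hypothetical.

```python
import math

def kl_bern(a, b):
    """D_KL(Bernoulli(a) || Bernoulli(b)) in nats, assuming 0 < b < 1."""
    total = 0.0
    if a > 0:
        total += a * math.log(a / b)
    if a < 1:
        total += (1 - a) * math.log((1 - a) / (1 - b))
    return total

def project(q, lo, hi, iters=200):
    """Minimise kl_bern(theta, q) over theta in [lo, hi] by ternary search.

    This works because the divergence is convex in its first argument.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if kl_bern(m1, q) < kl_bern(m2, q):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

q = 0.2
lo, hi = 0.5, 0.9
theta_star = project(q, lo, hi)   # parameter of the I-projection of q onto P

# Gap between the two sides of the inequality for a grid of members of P.
thetas = [lo + (hi - lo) * k / 20 for k in range(21)]
gaps = [
    kl_bern(t, q) - kl_bern(t, theta_star) - kl_bern(theta_star, q)
    for t in thetas
]
print(theta_star, min(gaps))
```

Up to floating-point error, every entry of `gaps` should be non-negative, with the gap vanishing at the projection itself.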
It is worthwhile to note that since $D_{\mathrm{KL}}(p \,\|\, q) \geq 0$ and is continuous in p, if P is closed and non-empty, then there exists at least one minimizer of the optimization problem framed above. Furthermore, if P is convex, then the optimal distribution is unique.
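The reverse I-projection, also known as the moment projection or M-projection, is
$$p^* = \underset{p \in P}{\arg\min}\; D_{\mathrm{KL}}(q \,\|\, p).$$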
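A simple case where the reverse projection, which minimises $D_{\mathrm{KL}}(q \,\|\, p)$ over p in P, has a closed form is when P is the family of product (independent) distributions: the minimiser is then the product of the marginals of q. The sketch below checks this on a 2×2 joint distribution by brute-force grid search. The example values of q, the choice of family, and the helper names `kl` and `product` are assumptions made for illustration.

```python
import math

def kl(p, q):
    """D_KL(p || q) in nats for discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def product(a, b):
    """Joint law of independent binary X and Y with P(X=1)=a and P(Y=1)=b.

    Flattened as [P(0,0), P(0,1), P(1,0), P(1,1)].
    """
    return [(1 - a) * (1 - b), (1 - a) * b, a * (1 - b), a * b]

# A joint distribution of (X, Y) in which X and Y are dependent.
q = [0.4, 0.1, 0.2, 0.3]

# Marginals of q: P(X=1) and P(Y=1).
a_star = q[2] + q[3]
b_star = q[1] + q[3]
m_proj = product(a_star, b_star)   # candidate reverse projection

# Brute-force search over product distributions on a grid of parameters.
grid = [k / 50 for k in range(1, 50)]
best = min((kl(q, product(a, b)), a, b) for a in grid for b in grid)
print(best, kl(q, m_proj))
```

The grid minimum should land on the marginal parameters of q, in agreement with the closed form.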
References
- Cover, Thomas M.; Thomas, Joy A. (2006). Elements of Information Theory (2nd ed.). Hoboken, New Jersey: Wiley-Interscience. p. 367 (Theorem 11.6.1).