[physics/0004057] The information bottleneck method (1999)
> We define the relevant information in a signal x ∈ X as being the information that this signal provides about another signal y ∈ Y. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. **Understanding the signal x requires more than just predicting y, it also requires specifying which features of X play a role in the prediction. We formalize this problem as that of finding a short code for X that preserves the maximum information about Y.** That is, we squeeze the information that X provides about Y through a ‘bottleneck’ formed by a limited set of codewords X̃... This approach yields an exact set of self-consistent equations for the coding rules X → X̃ and X̃ → Y.

(from the intro): how do we define "meaningful / relevant" information? This issue was left out of information theory by Shannon, who focused on the problem of transmitting information rather than judging its value to the recipient -> this leads one to treat statistical and information-theoretic principles as almost irrelevant to the question of meaning.

> In contrast, **we argue here that information theory, in particular lossy source compression, provides a natural quantitative approach to the question of “relevant information.”** Specifically, we formulate a **variational principle** for the extraction or efficient representation of relevant information.
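The self-consistent equations mentioned above can be solved by alternating updates of p(x̃|x), p(x̃), and p(y|x̃). A minimal numpy sketch of that iterative scheme (function and variable names are my own, not from the paper; `beta` is the trade-off parameter of the variational principle):

```python
import numpy as np

def information_bottleneck(p_xy, n_clusters, beta, n_iter=200, seed=0):
    """Iterate the self-consistent IB updates for a joint distribution p(x, y).

    p(x~|x) ∝ p(x~) exp(-beta * KL[p(y|x) || p(y|x~)])
    p(x~)   = sum_x p(x) p(x~|x)
    p(y|x~) = sum_x p(y|x) p(x|x~)
    """
    eps = 1e-12
    rng = np.random.default_rng(seed)
    n_x, n_y = p_xy.shape
    p_x = p_xy.sum(axis=1)                         # marginal p(x)
    p_y_given_x = p_xy / (p_x[:, None] + eps)      # conditional p(y|x)

    # random soft assignment p(x~|x), rows sum to 1
    q = rng.random((n_x, n_clusters))
    q /= q.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        p_t = q.T @ p_x                            # marginal p(x~)
        # p(y|x~) = sum_x p(y|x) p(x~|x) p(x) / p(x~)
        p_y_given_t = (q * p_x[:, None]).T @ p_y_given_x
        p_y_given_t /= (p_t[:, None] + eps)
        # KL[p(y|x) || p(y|x~)] for every (x, x~) pair
        kl = (p_y_given_x[:, None, :] *
              (np.log(p_y_given_x[:, None, :] + eps) -
               np.log(p_y_given_t[None, :, :] + eps))).sum(axis=2)
        # self-consistent update of the soft assignment p(x~|x)
        logits = np.log(p_t + eps)[None, :] - beta * kl
        q = np.exp(logits - logits.max(axis=1, keepdims=True))
        q /= q.sum(axis=1, keepdims=True)
    return q, p_t, p_y_given_t
```

Small `beta` favors compression (coarse codewords X̃); large `beta` favors preserving information about Y, matching the bottleneck picture in the abstract.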