Likelihood Ratio Exponential Families
This work offers a unified theoretical framework for several information-theoretic and MCMC methods, connecting concepts from machine learning and statistical physics that were previously treated separately.
This paper extends likelihood ratio exponential families to include solutions for rate-distortion optimization, the information bottleneck method, and rate-distortion-classification approaches. This provides a unified mathematical framework for these methods through the conjugate duality of exponential families and hypothesis testing.
The exponential family is well known in machine learning and statistical physics as the maximum entropy distribution subject to a set of observed constraints, while the geometric mixture path is common in MCMC methods such as annealed importance sampling. Linking these two ideas, recent work has interpreted the geometric mixture path as an exponential family of distributions to analyze the thermodynamic variational objective (TVO). We extend these likelihood ratio exponential families to include solutions to rate-distortion (RD) optimization, the information bottleneck (IB) method, and recent rate-distortion-classification approaches that combine RD and IB. This provides a common mathematical framework for understanding these methods via the conjugate duality of exponential families and hypothesis testing. Further, we collect existing results to provide a variational representation of intermediate RD or TVO distributions as minimizing an expectation of KL divergences. This solution also corresponds to a size-power tradeoff using the likelihood ratio test and the Neyman-Pearson lemma. In thermodynamic integration bounds such as the TVO, we identify the intermediate distribution whose expected sufficient statistics match the log partition function.
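As a rough illustration of the ideas in the abstract, the following minimal sketch builds a geometric mixture path between two discrete distributions and checks three properties numerically: the derivative of the log partition function equals the expected log likelihood ratio, the intermediate distribution minimizes a weighted sum of KL divergences, and there is an intermediate point whose expected sufficient statistic equals the log partition function at the endpoint. The toy distributions, the names `geometric_path`, `p0`, `p1_tilde`, and `beta_star`, and the use of bisection are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def geometric_path(p0, p1_tilde, beta):
    """Return (pi_beta, psi) with pi_beta proportional to p0^(1-beta) * p1_tilde^beta."""
    log_unnorm = (1.0 - beta) * np.log(p0) + beta * np.log(p1_tilde)
    m = log_unnorm.max()
    psi = m + np.log(np.exp(log_unnorm - m).sum())  # log partition function
    return np.exp(log_unnorm - psi), psi

def kl(r, p):
    """KL divergence between strictly positive discrete distributions."""
    return float(np.sum(r * np.log(r / p)))

# Hypothetical toy endpoints: a normalized base and an unnormalized target (Z = 3).
rng = np.random.default_rng(0)
p0 = rng.dirichlet(np.ones(5))
p1_tilde = 3.0 * rng.dirichlet(np.ones(5))
T = np.log(p1_tilde / p0)  # sufficient statistic: log likelihood ratio

# 1) Exponential family identity: d psi / d beta = E_{pi_beta}[T].
beta, eps = 0.4, 1e-5
pi_b, _ = geometric_path(p0, p1_tilde, beta)
psi_hi = geometric_path(p0, p1_tilde, beta + eps)[1]
psi_lo = geometric_path(p0, p1_tilde, beta - eps)[1]
fd_grad = (psi_hi - psi_lo) / (2 * eps)  # central finite difference
mean_T = float(pi_b @ T)

# 2) Variational representation: pi_beta minimizes
#    (1 - beta) * KL(r || p0) + beta * KL(r || p1) over distributions r.
p1 = p1_tilde / p1_tilde.sum()

def objective(r):
    return (1.0 - beta) * kl(r, p0) + beta * kl(r, p1)

obj_at_path = objective(pi_b)
obj_random = min(objective(rng.dirichlet(np.ones(5))) for _ in range(200))

# 3) Since psi(0) = 0 for normalized p0, the mean value theorem gives a
#    beta_star in [0, 1] with E_{pi_beta_star}[T] = psi(1) = log Z.
#    E[T] is nondecreasing in beta, so bisection finds it.
log_Z = geometric_path(p0, p1_tilde, 1.0)[1]
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if float(geometric_path(p0, p1_tilde, mid)[0] @ T) < log_Z:
        lo = mid
    else:
        hi = mid
beta_star = 0.5 * (lo + hi)

print(f"d psi/d beta = {fd_grad:.6f}, E[T] = {mean_T:.6f}")
print(f"objective at pi_beta = {obj_at_path:.6f}, best random r = {obj_random:.6f}")
print(f"beta_star = {beta_star:.6f}, log Z = {log_Z:.6f}")
```

In this toy setting the endpoints at beta = 0 and beta = 1 give the usual lower and upper bounds on log Z, and `beta_star` marks the point on the path where the integrand of thermodynamic integration equals log Z exactly.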