LG OC MLNov 7, 2013

The Maximum Entropy Relaxation Path

Moshe Dubiner, Matan Gavish, Yoram Singer

arXiv:1311.1644v1

Originality Incremental advance

AI Analysis

This work addresses the problem of efficiently solving relaxed maximum entropy problems for researchers in machine learning and statistics, with potential applications in n-gram language model estimation, but it is incremental as it builds on existing maximum entropy frameworks.

The paper studies the entire relaxation path for the relaxed maximum entropy problem, showing it has a planar geometric description as an increasing, piecewise linear function and deriving fast algorithms with O(n log(n)) operations for distributions on n points, enabling handling of large problems and reducing admissible models to a small discrete set.

The relaxed maximum entropy problem is concerned with finding a probability distribution on a finite set that minimizes the relative entropy to a given prior distribution, while satisfying relaxed max-norm constraints with respect to a third observed multinomial distribution. We study the entire relaxation path for this problem in detail. We show existence and a geometric description of the relaxation path. Specifically, we show that the maximum entropy relaxation path admits a planar geometric description as an increasing, piecewise linear function in the inverse relaxation parameter. We derive fast algorithms for tracking the path. In various realistic settings, our algorithms require $O(n\log(n))$ operations for probability distributions on $n$ points, making it possible to handle large problems. Once the path has been recovered, we show that given a validation set, the family of admissible models is reduced from an infinite family to a small, discrete set. We demonstrate the merits of our approach in experiments with synthetic data and discuss its potential for the estimation of compact n-gram language models.

View on arXiv PDF

Similar