Minimal Achievable Sufficient Statistic Learning
This work addresses the challenge of learning minimal representations in deep networks. It offers an incremental improvement for machine learning practitioners focused on efficiency and interpretability.
The paper tackles the problem of training machine learning models to produce minimal sufficient statistics. It introduces MASS Learning, together with Conserved Differential Information (CDI), a quantity for handling the deterministic dependencies that arise in deep networks. In experiments, deep networks trained with MASS Learning achieve competitive performance on supervised learning and uncertainty quantification benchmarks.
We introduce Minimal Achievable Sufficient Statistic (MASS) Learning, a training method for machine learning models that attempts to produce minimal sufficient statistics with respect to a class of functions (e.g., deep networks) being optimized over. In deriving MASS Learning, we also introduce Conserved Differential Information (CDI), an information-theoretic quantity that, unlike standard mutual information, can be usefully applied to deterministically dependent continuous random variables like the input and output of a deep network. In a series of experiments, we show that deep networks trained with MASS Learning achieve competitive performance on supervised learning and uncertainty quantification benchmarks.
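
To illustrate why standard mutual information is awkward for deterministically dependent continuous variables, the following minimal sketch estimates the mutual information between X and Y = f(X) from binned samples. It is an illustration only, not part of MASS Learning or the definition of CDI: the function name `binned_mutual_information`, the choice f(x) = x^2, and the bin counts are all assumptions made for this example. Because Y is a deterministic function of X, the true mutual information is infinite, so the binned estimate would be expected to keep growing as the bins get finer instead of converging.

```python
import math
import random
from collections import Counter


def binned_mutual_information(xs, ys, bins):
    """Plug-in mutual information estimate (in nats) for samples in [0, 1).

    Hypothetical helper for illustration: both variables are discretized
    into `bins` equal-width cells before the estimate is computed.
    """
    n = len(xs)
    bx = [min(int(x * bins), bins - 1) for x in xs]
    by = [min(int(y * bins), bins - 1) for y in ys]
    joint = Counter(zip(bx, by))
    px = Counter(bx)
    py = Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        # p(i, j) * log(p(i, j) / (p(i) * p(j))), written with raw counts
        mi += (count / n) * math.log(count * n / (px[i] * py[j]))
    return mi


rng = random.Random(0)
xs = [rng.random() for _ in range(20000)]
ys = [x * x for x in xs]  # Y is a deterministic function of X

# Each bin count refines the previous one by a factor of 4.
estimates = {k: binned_mutual_information(xs, ys, k) for k in (4, 16, 64, 256)}

for k, value in estimates.items():
    print(f"bins={k:4d}  estimated MI={value:.3f} nats")
```

The estimate depends on the discretization and not only on the relationship between X and Y, which is the sense in which mutual information is not directly useful for the input and output of a deterministic network.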