ML | IT | LG | ST | Feb 26, 2020

Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Discrete Distributions

arXiv:2002.11665v11 citations
AI Analysis

The paper provides a unifying theoretical framework for estimation, inference, and compression of discrete distributions, three tasks that are foundational to machine learning and statistics.

The paper introduces profile entropy as a fundamental measure for discrete distributions, showing it determines the speed of distribution estimation relative to the best natural estimator, characterizes inference rates for symmetric properties, and serves as the limit of profile compression with optimal algorithms.

The profile of a sample is the multiset of its symbol frequencies. We show that for samples of discrete distributions, profile entropy is a fundamental measure unifying the concepts of estimation, inference, and compression. Specifically, profile entropy a) determines the speed of estimating the distribution relative to the best natural estimator; b) characterizes the rate of inferring all symmetric properties compared with the best estimator over any label-invariant distribution collection; c) serves as the limit of profile compression, for which we derive optimal near-linear-time block and sequential algorithms. To further our understanding of profile entropy, we investigate its attributes, provide algorithms for approximating its value, and determine its magnitude for numerous structural distribution families.
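To make the definition of a profile concrete, here is a minimal Python sketch. It only illustrates the definition given above (the multiset of symbol frequencies); it is not the paper's compression or estimation algorithm. The function name `profile` and the choice to represent the multiset as a `Counter` mapping each frequency to the number of symbols with that frequency are assumptions made for illustration.

```python
from collections import Counter


def profile(sample):
    """Return the profile of a sample as a Counter.

    Hypothetical helper: keys are frequencies, values are how many
    distinct symbols occur with that frequency in the sample.
    """
    symbol_counts = Counter(sample)          # symbol -> frequency
    return Counter(symbol_counts.values())   # frequency -> multiplicity


# Two samples that differ only by a relabeling of symbols
# (a->x, b->y, r->z, c->q, d->w) share the same profile.
sample_a = "abracadabra"
sample_b = "xyzxqxwxyzx"

print(sorted(Counter(sample_a).values()))  # frequencies as a sorted multiset
print(profile(sample_a) == profile(sample_b))
```

Because the profile discards symbol labels, any quantity computed from it is label-invariant, which is why it is the natural statistic for the symmetric properties mentioned in the abstract.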

