Separable Computation of Information Measures
This work addresses a foundational problem in machine learning, the efficient computation of information measures, which is relevant to both researchers and practitioners. The contribution appears incremental: it builds on existing concepts and draws new connections among them.
The paper proposes a separable design that computes information measures from learned feature representations instead of raw data, and shows that, under mild assumptions, this approach applies to measures such as mutual information and the information bottleneck. The characterizations also give theoretical guarantees for estimating these measures in practice through representation learning.
We study a separable design for computing information measures, where the information measure is computed from learned feature representations instead of raw data. Under mild assumptions on the feature representations, we demonstrate that a class of information measures admits such separable computation, including mutual information, $f$-information, Wyner's common information, Gács--Körner common information, and Tishby's information bottleneck. Our development establishes several new connections between information measures and the statistical dependence structure. The characterizations also provide theoretical guarantees of practical designs for estimating information measures through representation learning.
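As a rough illustration of what computing an information measure from features rather than raw data can mean, the sketch below uses a toy discrete example. It is not the paper's construction: the joint distribution, the feature maps `f_sufficient` and `f_lossy`, and the helper names `mutual_information` and `push_forward` are all assumptions made for this example. It relies only on the standard fact that mutual information is preserved when the feature of $X$ is a sufficient statistic for $Y$, and can only decrease otherwise.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """Return I(A;B) in nats for a joint pmf given as {(a, b): probability}."""
    pa = defaultdict(float)
    pb = defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(
        p * math.log(p / (pa[a] * pb[b]))
        for (a, b), p in joint.items()
        if p > 0
    )


def push_forward(joint, f, g):
    """Joint pmf of (f(X), g(Y)) induced by the joint pmf of (X, Y)."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        out[(f(x), g(y))] += p
    return dict(out)


# Hypothetical toy distribution: X in {0, 1, 2, 3}, Y in {0, 1}.
# P(Y=1 | X=x) depends on x only through x // 2.
p_x = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}
p_y1_given_x = {0: 0.2, 1: 0.2, 2: 0.7, 3: 0.7}

joint = {}
for x, px in p_x.items():
    joint[(x, 1)] = px * p_y1_given_x[x]
    joint[(x, 0)] = px * (1.0 - p_y1_given_x[x])


def f_sufficient(x):
    # Keeps exactly the part of x that Y depends on.
    return x // 2


def f_lossy(x):
    # Discards part of the dependence between X and Y.
    return x % 2


def identity(y):
    return y


mi_raw = mutual_information(joint)
mi_feature = mutual_information(push_forward(joint, f_sufficient, identity))
mi_lossy = mutual_information(push_forward(joint, f_lossy, identity))

print(f"I(X;Y)       = {mi_raw:.6f} nats")
print(f"I(f(X);Y)    = {mi_feature:.6f} nats (sufficient feature)")
print(f"I(f'(X);Y)   = {mi_lossy:.6f} nats (lossy feature)")
```

In this toy case the sufficient feature reproduces the mutual information of the raw variables, while the lossy feature gives a strictly smaller value. The abstract's claim is broader: it concerns learned representations and several other measures, under assumptions stated in the paper itself.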