Neural Methods for Point-wise Dependency Estimation
This work addresses the need for point-wise dependency measures in machine learning, offering incremental improvements over existing neural mutual information methods.
The paper tackles the problem of estimating point-wise dependency (PD) between events, rather than aggregate mutual information, by developing two methods: Probabilistic Classifier and Density-Ratio Fitting. The authors report that both methods are effective for mutual information estimation, self-supervised representation learning, and cross-modal retrieval.
Since its inception, the neural estimation of mutual information (MI) has demonstrated the empirical success of modeling expected dependency between high-dimensional random variables. However, MI is an aggregate statistic and cannot be used to measure point-wise dependency between different events. In this work, instead of estimating the expected dependency, we focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes are to co-occur. We show that we can naturally obtain PD when we are optimizing MI neural variational bounds. However, optimizing these bounds is challenging because of their large variance in practice. To address this issue, we develop two methods (free of optimizing MI variational bounds): Probabilistic Classifier and Density-Ratio Fitting. We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) cross-modal retrieval.
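To make the distinction between PD and MI concrete, the following is a minimal sketch on a small discrete distribution. It assumes PD is taken to be the ratio p(x, y) / (p(x) p(y)), so that MI is the expected log PD under the joint. It also illustrates the idea behind a probabilistic-classifier estimator using the standard density-ratio identity: a classifier that separates pairs drawn from the joint from pairs drawn from the product of marginals yields PD through its posterior odds. The joint table, the function names, and the use of an exact Bayes-optimal posterior in place of a trained neural classifier are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

# Hypothetical joint distribution p(x, y) over 2 x 3 outcomes (illustrative values).
joint = np.array([[0.30, 0.10, 0.10],
                  [0.05, 0.15, 0.30]])


def pointwise_dependency(joint):
    """Exact PD table: p(x, y) / (p(x) p(y)) for every outcome pair."""
    px = joint.sum(axis=1, keepdims=True)  # marginal p(x), shape (2, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal p(y), shape (1, 3)
    return joint / (px * py)


def bayes_posterior(joint, prior_joint=0.5):
    """Posterior p(c=1 | x, y) of an ideal classifier, where c=1 marks a pair
    drawn from the joint and c=0 a pair drawn from the product of marginals.
    A trained classifier would only approximate this quantity."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    from_joint = prior_joint * joint
    from_product = (1.0 - prior_joint) * (px * py)
    return from_joint / (from_joint + from_product)


def pd_from_classifier(posterior, prior_joint=0.5):
    """Recover PD from classifier output: posterior odds times the
    inverse prior odds, p(c=0) / p(c=1)."""
    odds = posterior / (1.0 - posterior)
    return odds * (1.0 - prior_joint) / prior_joint


pd = pointwise_dependency(joint)

# MI aggregates the point-wise values: E_{p(x,y)}[log PD(x, y)].
mi = float(np.sum(joint * np.log(pd)))

# With an ideal classifier, the odds-based estimate matches the exact PD.
pd_hat = pd_from_classifier(bayes_posterior(joint))

print("PD table:\n", pd)
print("MI (nats):", mi)
print("max |PD - PD_hat|:", np.abs(pd - pd_hat).max())
```

Entries of the PD table above 1 mark outcome pairs that co-occur more often than independence would predict, and entries below 1 mark pairs that co-occur less often. MI collapses the whole table into a single number, which is why it cannot rank individual pairs.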