Ranking by Dependence - A Fair Criteria
This work addresses a problem that machine-learning researchers and practitioners encounter often, but the contribution appears incremental: it builds on existing methods and adds specific improvements.
The paper tackles the problem of estimating and ranking dependences between random variables. It shows that the p-value and the mutual information can fail in simple cases and proposes a new regularized dependence measure, which it compares with well-established model-selection criteria. It also derives an analytical approximation for the optimal equivalent sample size in a graphical model; in the experiments, this approximation agrees well with the more involved Bayesian approach.
Estimating the dependences between random variables, and ranking them accordingly, is a prevalent problem in machine learning. Pursuing frequentist and information-theoretic approaches, we first show that the p-value and the mutual information can fail even in simplistic situations. We then propose two conditions for regularizing an estimator of dependence, which leads to a simple yet effective new measure. We discuss its advantages and compare it to well-established model-selection criteria. Apart from that, we derive a simple constraint for regularizing parameter estimates in a graphical model. This results in an analytical approximation for the optimal value of the equivalent sample size, which agrees very well with the more involved Bayesian approach in our experiments.
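To make the failure mode of the mutual information concrete, the following minimal sketch computes the plug-in (empirical) mutual information and the related G statistic, G = 2N·MI, from a list of samples. It illustrates a well-known property of the plug-in estimator, namely that it is biased upward for independent variables and that the bias grows with the number of states. It is not the regularized measure proposed in the paper, which the abstract does not specify. The function names (`empirical_mi`, `g_statistic`, `mean_mi_independent`) and the simulation settings are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def empirical_mi(pairs):
    """Plug-in mutual information (in nats) from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def g_statistic(pairs):
    """Likelihood-ratio (G) statistic, which equals 2 * N * MI in nats."""
    return 2.0 * len(pairs) * empirical_mi(pairs)


def mean_mi_independent(states, n, trials, seed=0):
    """Average plug-in MI over simulated samples of two INDEPENDENT
    uniform variables with `states` states each (illustrative setup)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pairs = [(rng.randrange(states), rng.randrange(states))
                 for _ in range(n)]
        total += empirical_mi(pairs)
    return total / trials


if __name__ == "__main__":
    # Both pairs of variables are truly independent, yet the pair with
    # more states receives a larger average MI from the same sample size.
    print("2 states :", mean_mi_independent(2, 50, 200))
    print("10 states:", mean_mi_independent(10, 50, 200))
```

Ranking variable pairs by the raw plug-in value would therefore tend to favor pairs with many states even when no dependence exists, which is the kind of unfairness a regularized measure is meant to counteract.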