MLLGMay 7, 2023

Classification Tree Pruning Under Covariate Shift

arXiv:2305.04335v23 citations
Originality Incremental advance
AI Analysis

This addresses a practical issue for machine learning practitioners dealing with distribution shifts, but it is incremental as it builds on existing transfer-exponent concepts.

The paper tackles the problem of pruning classification trees under covariate shift, where training data comes from a different distribution than the target, by introducing an efficient procedure for optimal pruning based on a relaxed average discrepancy measure, achieving optimality in this setting.

We consider the problem of \emph{pruning} a classification tree, that is, selecting a suitable subtree that balances bias and variance, in common situations with inhomogeneous training data. Namely, assuming access to mostly data from a distribution $P_{X, Y}$, but little data from a desired distribution $Q_{X, Y}$ with different $X$-marginals, we present the first efficient procedure for optimal pruning in such situations, when cross-validation and other penalized variants are grossly inadequate. Optimality is derived with respect to a notion of \emph{average discrepancy} $P_{X} \to Q_{X}$ (averaged over $X$ space) which significantly relaxes a recent notion -- termed \emph{transfer-exponent} -- shown to tightly capture the limits of classification under such a distribution shift. Our relaxed notion can be viewed as a measure of \emph{relative dimension} between distributions, as it relates to existing notions of information such as the Minkowski and Renyi dimensions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes