CV · Nov 25, 2020

Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer

arXiv:2011.12454v4 · 11 citations
AI Analysis

This work tackles severe class imbalance in computer vision tasks, a common issue for practitioners working with long-tailed datasets, and offers an incremental solution.

This paper addresses the challenge of severe class imbalance in real-world applications, particularly for minority class classification. The authors propose a causal data inflation procedure that leverages knowledge transfer from dominant to under-represented classes, even with feature distribution disparities, to enlarge minority class representations. The method is validated on synthetic and real-world computer vision tasks against state-of-the-art solutions.

Dealing with severe class imbalance poses a major challenge for real-world applications, especially when the accurate classification and generalization of minority classes are of primary interest. In computer vision, learning from long-tailed datasets is a recurring theme, especially for natural image datasets. While existing solutions mostly appeal to sampling or weighting adjustments to alleviate the pathological imbalance, or impose inductive biases to prioritize non-spurious associations, we take a novel perspective to promote sample efficiency and model generalization based on the invariance principles of causality. Our proposal posits a meta-distributional scenario, where the data generating mechanism is invariant across the label-conditional feature distributions. Such a causal assumption enables efficient knowledge transfer from the dominant classes to their under-represented counterparts, even if the respective feature distributions show apparent disparities. This allows us to leverage a causal data inflation procedure to enlarge the representation of minority classes. Our development is orthogonal to the existing extreme classification techniques and thus can be seamlessly integrated with them. The utility of our proposal is validated with an extensive set of synthetic and real-world computer vision tasks against SOTA solutions.
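
The invariance assumption in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a linear mechanism x = A s that is shared by all classes and already known, with independent Gaussian sources per class. In the paper the mechanism would have to be learned. The names `inflate_minority` and `mixing` are hypothetical and used only for illustration.

```python
import numpy as np

def inflate_minority(x_minority, mixing, n_new, seed=0):
    """Draw synthetic minority samples through a shared mixing map (toy)."""
    rng = np.random.default_rng(seed)
    # Map observed features back to the assumed source space: s = A^{-1} x.
    sources = np.linalg.solve(mixing, x_minority.T).T
    # Fit an independent Gaussian to each source coordinate of this class.
    mean = sources.mean(axis=0)
    std = sources.std(axis=0, ddof=1)
    # Resample sources, then push them through the shared mechanism: x = A s.
    new_sources = rng.normal(mean, std, size=(n_new, sources.shape[1]))
    return new_sources @ mixing.T

rng = np.random.default_rng(42)
# Assumption: the shared mechanism A is given (hypothetical values).
mixing = np.array([[1.0, 0.5], [0.2, 1.5]])
# Only 5 minority samples are observed.
minority_sources = rng.normal([2.0, -1.0], [0.3, 0.8], size=(5, 2))
x_minority = minority_sources @ mixing.T
x_inflated = inflate_minority(x_minority, mixing, n_new=100)
print(x_inflated.shape)
```

Because the map is shared across classes, only the low-dimensional class-specific source statistics need to be estimated from the few minority samples, which is the intuition behind transferring knowledge from the dominant classes.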

Code Implementations: 1 repo
