CVOct 14, 2024

RICASSO: Reinforced Imbalance Learning with Class-Aware Self-Supervised Outliers Exposure

arXiv:2410.10548v1h-index: 1
Originality Highly original
AI Analysis

This addresses challenges in real-world scenarios where models face both data imbalance and OOD detection, offering a holistic solution without needing real OOD data.

The paper tackles the joint problem of imbalanced (long-tailed) and out-of-distribution (OOD) data in deep learning by proposing RICASSO, a framework that uses data mixing to generate pseudo-OOD data from in-distribution data, achieving state-of-the-art performance with a 27% improvement in AUROC and 61% reduction in FPR on iNaturalist2018.

In real-world scenarios, deep learning models often face challenges from both imbalanced (long-tailed) and out-of-distribution (OOD) data. However, existing joint methods rely on real OOD data, which leads to unnecessary trade-offs. In contrast, our research shows that data mixing, a potent augmentation technique for long-tailed recognition, can generate pseudo-OOD data that exhibit the features of both in-distribution (ID) data and OOD data. Therefore, by using mixed data instead of real OOD data, we can address long-tailed recognition and OOD detection holistically. We propose a unified framework called Reinforced Imbalance Learning with Class-Aware Self-Supervised Outliers Exposure (RICASSO), where "self-supervised" denotes that we only use ID data for outlier exposure. RICASSO includes three main strategies: Norm-Odd-Duality-Based Outlier Exposure: Uses mixed data as pseudo-OOD data, enabling simultaneous ID data rebalancing and outlier exposure through a single loss function. Ambiguity-Aware Logits Adjustment: Utilizes the ambiguity of ID data to adaptively recalibrate logits. Contrastive Boundary-Center Learning: Combines Virtual Boundary Learning and Dual-Entropy Center Learning to use mixed data for better feature separation and clustering, with Representation Consistency Learning for robustness. Extensive experiments demonstrate that RICASSO achieves state-of-the-art performance in long-tailed recognition and significantly improves OOD detection compared to our baseline (27% improvement in AUROC and 61% reduction in FPR on the iNaturalist2018 dataset). On iNaturalist2018, we even outperforms methods using real OOD data. The code will be made public soon.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes