LG | AI | CV | Mar 17, 2022

On Multi-Domain Long-Tailed Recognition, Imbalanced Domain Generalization and Beyond

arXiv:2203.09513v30.4356 citations | h-index: 93 | Has Code
AI Analysis: 55

The paper addresses data imbalance across domains, which can be crucial for improving generalization to unseen domains. Its contribution is incremental, however, since it builds on existing multi-domain and data-imbalance studies.

The paper tackles learning from multi-domain imbalanced data, which involves label imbalance, domain shift, and label distributions that diverge across domains. It proposes BoDA, which outperforms the compared algorithms on five benchmarks and sets a new state of the art on Domain Generalization benchmarks.

Real-world data often exhibit imbalanced label distributions. Existing studies on data imbalance focus on single-domain settings, i.e., samples are from the same data distribution. However, natural data can originate from distinct domains, where a minority class in one domain could have abundant instances from other domains. We formalize the task of Multi-Domain Long-Tailed Recognition (MDLT), which learns from multi-domain imbalanced data, addresses label imbalance, domain shift, and divergent label distributions across domains, and generalizes to all domain-class pairs. We first develop the domain-class transferability graph, and show that such transferability governs the success of learning in MDLT. We then propose BoDA, a theoretically grounded learning strategy that tracks the upper bound of transferability statistics, and ensures balanced alignment and calibration across imbalanced domain-class distributions. We curate five MDLT benchmarks based on widely-used multi-domain datasets, and compare BoDA to twenty algorithms that span different learning strategies. Extensive and rigorous experiments verify the superior performance of BoDA. Further, as a byproduct, BoDA establishes new state-of-the-art on Domain Generalization benchmarks, highlighting the importance of addressing data imbalance across domains, which can be crucial for improving generalization to unseen domains. Code and data are available at: https://github.com/YyzHarry/multi-domain-imbalance.
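To make the setting in the abstract concrete, the following is a minimal sketch on made-up toy data. It counts samples per (domain, class) pair, showing how a class that is a minority in one domain can be abundant in another, and it computes a simple distance between the features of one pair and the mean feature of another. That distance is only an illustrative stand-in for the idea of transferability between domain-class pairs. It is not the paper's definition and not BoDA itself, and the domain names, class names, feature values, and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from math import dist

# Hypothetical toy data: (domain, class, 2-D feature vector).
samples = [
    ("photo", "dog", (0.9, 0.1)),
    ("photo", "dog", (1.1, -0.1)),
    ("photo", "dog", (1.0, 0.0)),
    ("photo", "cat", (0.0, 1.0)),
    ("sketch", "dog", (2.0, 0.0)),
    ("sketch", "cat", (0.0, 2.2)),
    ("sketch", "cat", (0.2, 2.0)),
    ("sketch", "cat", (-0.2, 1.8)),
]


def pair_counts(data):
    """Number of samples for every (domain, class) pair."""
    return Counter((d, c) for d, c, _ in data)


def pair_means(data):
    """Mean feature vector for every (domain, class) pair."""
    groups = defaultdict(list)
    for d, c, z in data:
        groups[(d, c)].append(z)
    return {
        pair: tuple(sum(col) / len(col) for col in zip(*feats))
        for pair, feats in groups.items()
    }


def mean_distance_to(data, means, src, dst):
    """Average Euclidean distance from the features of pair `src`
    to the mean feature of pair `dst` (an assumed, simplified proxy)."""
    feats = [z for d, c, z in data if (d, c) == src]
    return sum(dist(z, means[dst]) for z in feats) / len(feats)


counts = pair_counts(samples)
means = pair_means(samples)

# "dog" is the majority class in "photo" but a minority in "sketch".
print(counts[("photo", "dog")], counts[("sketch", "dog")])

# Compare the rare pair (sketch, dog) against the same class in another
# domain and against a different class in the same domain.
same_class = mean_distance_to(samples, means, ("sketch", "dog"), ("photo", "dog"))
other_class = mean_distance_to(samples, means, ("sketch", "dog"), ("sketch", "cat"))
print(same_class < other_class)
```

In this toy data the rare pair (sketch, dog) lies closer to the well-populated pair (photo, dog) than to (sketch, cat), which is the kind of cross-domain relationship a multi-domain method could exploit for minority classes.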

Code Implementations: 1 repo