CV | LG | Oct 23, 2021

Domain Adaptation for Rare Classes Augmented with Synthetic Samples

arXiv:2110.12216v1 | 6 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of improving classification for rare classes in imbalanced datasets, a common problem in domains like wildlife monitoring. The contribution is incremental: it adapts existing domain adaptation methods to the specific setting in which only one rare class is augmented.

The paper tackles the problem of low classification performance on rare classes in imbalanced datasets by augmenting a single rare class with synthetic samples and applying domain adaptation to reduce discrepancies between real and synthetic data. Experiments on a camera trap animal dataset with a rare deer class show that the proposed DeerDANN method improves deer classification accuracy by 24.0% over the baseline, compared with 22.4% for the second proposed method, DeerCORAL. Both methods reach these accuracies with fewer synthetic samples than the 10k used by the baseline.

To alleviate lower classification performance on rare classes in imbalanced datasets, a possible solution is to augment the underrepresented classes with synthetic samples. Domain adaptation can be incorporated in a classifier to decrease the domain discrepancy between real and synthetic samples. While domain adaptation is generally applied to completely synthetic source domains and real target domains, we explore how domain adaptation can be applied when only a single rare class is augmented with simulated samples. As a testbed, we use a camera trap animal dataset with a rare deer class, which is augmented with synthetic deer samples. We adapt existing domain adaptation methods to two new methods for the single rare class setting: DeerDANN, based on the Domain-Adversarial Neural Network (DANN), and DeerCORAL, based on deep correlation alignment (Deep CORAL) architectures. Experiments show that DeerDANN yields the largest improvement in deer classification accuracy over the baseline, 24.0%, versus 22.4% for DeerCORAL. Further, both methods achieve these higher accuracies with fewer synthetic samples than the 10k used by the baseline. DeerCORAL requires the fewest synthetic samples (2k deer), followed by DeerDANN (8k deer).
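
For readers unfamiliar with correlation alignment, the following is a minimal NumPy sketch of the standard CORAL loss that Deep CORAL adds to a classifier's training objective: the squared Frobenius distance between the feature covariances of two domains, scaled by 1/(4d^2). It is not the paper's DeerCORAL implementation. The function name `coral_loss`, the random feature arrays, and the choice of which samples are compared are assumptions made for illustration, since the abstract does not say how the alignment is restricted to the deer class.

```python
import numpy as np


def coral_loss(source_feats: np.ndarray, target_feats: np.ndarray) -> float:
    """Standard CORAL loss between two (n_samples, d) feature batches.

    Squared Frobenius norm of the covariance difference, divided by 4 * d^2.
    """
    d = source_feats.shape[1]
    # rowvar=False: rows are samples, columns are feature dimensions.
    cov_s = np.atleast_2d(np.cov(source_feats, rowvar=False))
    cov_t = np.atleast_2d(np.cov(target_feats, rowvar=False))
    return float(np.sum((cov_s - cov_t) ** 2) / (4 * d * d))


# Hypothetical stand-ins for network features of synthetic and real images.
rng = np.random.default_rng(0)
synthetic = rng.normal(scale=2.0, size=(512, 16))  # wider spread than real
real = rng.normal(scale=1.0, size=(512, 16))

loss_mismatched = coral_loss(synthetic, real)  # covariances differ
loss_matched = coral_loss(real, real)          # identical covariances
loss_shifted = coral_loss(real + 5.0, real)    # mean shift only
```

In this sketch the loss is zero when the two batches share the same covariance and grows as the second-order statistics drift apart. A pure mean shift leaves it essentially unchanged, because covariances are computed on centered features. During training, a term like this would be weighted and added to the classification loss so that features of real and synthetic images become harder to tell apart.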

