LG AI CV

Aug 13, 2022

Combating Label Distribution Shift for Active Domain Adaptation

Sehyun Hwang, Sohyun Lee, Sungyeon Kim, Jungseul Ok, Suha Kwak

arXiv:2208.06604v1

13.626 citations

h-index: 33

Originality: Highly original

AI Analysis

This work addresses label distribution shift between the source and target domains in active domain adaptation, and reports a method that outperforms existing approaches on four public benchmarks.

The paper tackles the problem of label distribution mismatch in active domain adaptation by introducing a novel sampling strategy that selects target data to approximate the target distribution while being representative, diverse, and uncertain, leading to substantial performance improvements on four public benchmarks.

We consider the problem of active domain adaptation (ADA) to unlabeled target data, a subset of which is actively selected and labeled under a budget constraint. Inspired by recent analysis of a critical issue arising from label distribution mismatch between source and target in domain adaptation, we devise a method that addresses this issue for the first time in ADA. At its heart lies a novel sampling strategy, which seeks target data that best approximate the entire target distribution while also being representative, diverse, and uncertain. The sampled target data are then used not only for supervised learning but also for matching the label distributions of the source and target domains, leading to remarkable performance improvement. On four public benchmarks, our method substantially outperforms existing methods in every adaptation scenario.
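The abstract does not spell out how the labeled target samples are used to match label distributions. As a rough illustration of the general idea only, the sketch below uses class-ratio importance weighting, a common technique for label shift that is swapped in here and is not claimed to be the paper's procedure. The function names, the additive smoothing, and the toy labels are all assumptions made for this example.

```python
from collections import Counter


def label_distribution(labels, num_classes, smoothing=1.0):
    """Estimate a class distribution from integer labels.

    Additive smoothing (an assumption of this sketch) keeps every class
    probability non-zero even when a class was never queried.
    """
    counts = Counter(labels)
    total = len(labels) + smoothing * num_classes
    return [(counts.get(c, 0) + smoothing) / total for c in range(num_classes)]


def class_weights(source_labels, target_labels, num_classes, smoothing=1.0):
    """Per-class weights w[c] = p_target(c) / p_source(c).

    Weighting each source example by w[label] makes the weighted source
    label distribution resemble the estimated target one.
    """
    p_src = label_distribution(source_labels, num_classes, smoothing)
    p_tgt = label_distribution(target_labels, num_classes, smoothing)
    return [t / s for s, t in zip(p_src, p_tgt)]


# Toy data (hypothetical): source is dominated by class 0, while the
# actively labeled target subset is dominated by class 1.
source_labels = [0, 0, 0, 0, 1, 1, 2, 2]
queried_target_labels = [0, 1, 1, 1, 2, 2]

weights = class_weights(source_labels, queried_target_labels, num_classes=3)
```

In this toy case the weight for class 0 falls below 1 and the weight for class 1 rises above 1, so source examples of the over-represented class count less in a weighted loss. The quality of such weights depends on how well the queried subset reflects the target label distribution, which is the property the abstract's sampling strategy is said to pursue.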
