Mining Label Distribution Drift in Unsupervised Domain Adaptation
This work is aimed at machine learning practitioners working on domain adaptation and focuses on an under-explored aspect, although the contribution appears incremental because it builds on existing adversarial network frameworks.
The paper tackles label distribution drift in unsupervised domain adaptation, a problem that is often overlooked, and proposes a method that jointly addresses data and label distribution shifts, reporting superior performance in experiments with considerable label distribution drift.
Unsupervised domain adaptation aims to transfer task-related knowledge from a labeled source domain to an unlabeled target domain. Although tremendous effort has gone into minimizing domain divergence, most existing methods only partially succeed by aligning feature representations across domains. Beyond the discrepancy in data distribution, the gap between the source and target label distributions, known as label distribution drift, is another crucial factor that increases domain divergence, and it has not been sufficiently explored. From this perspective, we first reveal how label distribution drift negatively affects adaptation. Next, we propose the Label distribution Matching Domain Adversarial Network (LMDAN) to handle data distribution shift and label distribution drift jointly. In LMDAN, label distribution drift is addressed by a source sample weighting strategy, which selects samples that contribute to positive adaptation and avoids the adverse effects of mismatched samples. Experiments show that LMDAN delivers superior performance under considerable label distribution drift.
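The abstract does not state how the source sample weights are computed, so the following is only a minimal sketch of one common way to weight source samples under label distribution drift, not LMDAN's actual weighting rule. It assumes that the target label distribution can be estimated by averaging a classifier's predicted class probabilities on the unlabeled target data, and it weights each source sample by the ratio of the estimated target class prior to the empirical source class prior. The function name `class_ratio_weights` and its parameters are hypothetical.

```python
import numpy as np


def class_ratio_weights(source_labels, target_probs, eps=1e-8):
    """Illustrative source sample weights (an assumption, not LMDAN's rule).

    source_labels: integer class labels of the source samples, shape (n_s,)
    target_probs:  predicted class probabilities on target samples, shape (n_t, C)
    """
    source_labels = np.asarray(source_labels)
    target_probs = np.asarray(target_probs, dtype=float)
    num_classes = target_probs.shape[1]

    # Empirical source label distribution from the ground-truth labels.
    source_prior = np.bincount(source_labels, minlength=num_classes) / len(source_labels)

    # Estimated target label distribution: mean predicted probability per class.
    target_prior = target_probs.mean(axis=0)

    # Classes that are rarer in the target than in the source get small weights,
    # so their (mismatched) source samples contribute less to adaptation.
    ratio = target_prior / (source_prior + eps)
    weights = ratio[source_labels]

    # Normalize so the weights average to 1 and the loss scale is preserved.
    return weights / weights.mean()


# Toy example: class 0 dominates the source, class 1 dominates the target.
source_labels = [0, 0, 0, 1]
target_probs = [[0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
weights = class_ratio_weights(source_labels, target_probs)
```

In this toy case the single class-1 source sample receives a larger weight than the class-0 samples, because class 1 is estimated to be more frequent in the target domain. Such weights would typically multiply the per-sample source classification and domain adversarial losses during training.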