LG ML Aug 4, 2020

Learning from a Complementary-label Source Domain: Theory and Algorithms

Yiyang Zhang, Feng Liu, Zhen Fang, Bo Yuan, Guangquan Zhang, Jie Lu

arXiv:2008.01454v17.973 citations Has Code

Originality: Highly original

AI Analysis

This paper addresses the cost of data labeling in unsupervised domain adaptation (UDA), offering machine learning practitioners a cheaper alternative to fully labeled source data, though the contribution is incremental in that it builds on existing UDA frameworks.

The paper tackles the high cost of collecting fully labeled source data in unsupervised domain adaptation. It proposes a novel setting in which the source domain uses complementary labels (each indicating a class a pattern does not belong to), proves a theoretical bound for that setting, and introduces a method called CLARINET that outperforms baselines on handwritten-digit and object recognition tasks.

In unsupervised domain adaptation (UDA), a classifier for the target domain is trained with massive true-label data from the source domain and unlabeled data from the target domain. However, collecting fully true-labeled data in the source domain is costly and sometimes impossible. In contrast to a true label, a complementary label specifies a class that a pattern does not belong to, so collecting complementary labels is less laborious than collecting true labels. In this paper, we therefore propose a novel setting in which the source domain is composed of complementary-label data, and we first prove a theoretical bound for it. We consider two cases of this setting. In the first, the source domain contains only complementary-label data (completely complementary unsupervised domain adaptation, CC-UDA). In the second, the source domain has plenty of complementary-label data and a small amount of true-label data (partly complementary unsupervised domain adaptation, PC-UDA). To this end, a complementary label adversarial network (CLARINET) is proposed to solve CC-UDA and PC-UDA problems. CLARINET maintains two deep networks simultaneously, where one focuses on classifying complementary-label source data and the other takes care of source-to-target distributional adaptation. Experiments show that CLARINET significantly outperforms a series of competent baselines on handwritten-digit recognition and object recognition tasks.
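To make the notion of a complementary label concrete, the following is a minimal Python sketch, not the CLARINET method itself. It assumes complementary labels are drawn uniformly from the classes an example does not belong to, and it uses a simple illustrative surrogate loss, -log(1 - p), on the excluded class; the abstract does not specify the paper's actual loss. The function names `sample_complementary_label` and `complementary_loss` are hypothetical and chosen only for this example.

```python
import math
import random


def sample_complementary_label(true_label, num_classes, rng):
    """Pick, uniformly at random, one class the example does NOT belong to.

    Assumption: uniform sampling over the wrong classes.
    """
    candidates = [k for k in range(num_classes) if k != true_label]
    return rng.choice(candidates)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def complementary_loss(logits, comp_label, eps=1e-12):
    """Illustrative surrogate: -log(1 - p[comp_label]).

    The loss is small when the model assigns little probability
    to the class the example is known not to belong to.
    """
    probs = softmax(logits)
    return -math.log(max(1.0 - probs[comp_label], eps))


rng = random.Random(0)
comp = sample_complementary_label(true_label=3, num_classes=10, rng=rng)

# A model that strongly favors class 3.
logits = [0.0] * 10
logits[3] = 5.0

loss_consistent = complementary_loss(logits, comp)  # excluded class gets little mass
loss_violated = complementary_loss(logits, 3)       # as if class 3 were the excluded one
```

In this sketch the loss stays low as long as the classifier avoids the excluded class, which is all a complementary label can say about an example. It rises sharply when the classifier concentrates probability on that class.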
