LG ML, Sep 29, 2020

Ensemble Multi-Source Domain Adaptation with Pseudolabels

arXiv:2009.14248v11.2

Originality: Incremental advance

AI Analysis

The paper addresses the problem of training a model on unlabeled target data using multiple labeled source datasets, which matters for applications where privacy constraints make target labels unavailable. Its contribution is an incremental improvement over existing MSDA frameworks.

The paper tackles multi-source domain adaptation (MSDA) by proposing EnMDAP, which uses label-wise moment matching with pseudolabels and ensemble learning to align conditional distributions, achieving state-of-the-art performance in image and text domains.

Given multiple source datasets with labels, how can we train a target model with no labeled data? Multi-source domain adaptation (MSDA) aims to train a model using multiple source datasets that differ from the target dataset, in the absence of target data labels. MSDA is a crucial problem applicable to many practical cases where labels for the target data are unavailable due to privacy issues. Existing MSDA frameworks are limited because they align data without considering the conditional distributions p(x|y) of each domain. They also miss much of the target label information because they do not consider the target labels at all and rely on only one feature extractor. In this paper, we propose Ensemble Multi-source Domain Adaptation with Pseudolabels (EnMDAP), a novel method for multi-source domain adaptation. EnMDAP exploits label-wise moment matching to align the conditional distributions p(x|y), using pseudolabels in place of the unavailable target labels, and introduces an ensemble learning theme by using multiple feature extractors for accurate domain adaptation. Extensive experiments show that EnMDAP provides state-of-the-art performance for multi-source domain adaptation tasks in both image and text domains.
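To make the idea of label-wise moment matching with pseudolabels more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`labelwise_moments`, `labelwise_moment_distance`, `pseudolabels`), the use of raw moments up to order 2, and the squared-difference penalty averaged over domain pairs are all hypothetical choices. The abstract does not specify the exact loss, or how the outputs of the multiple feature extractors are combined, so the ensemble part is left out.

```python
import numpy as np


def labelwise_moments(features, labels, num_classes, order=2):
    """Per-class raw moments E[f^k | y = c] for k = 1..order.

    Returns an array of shape (num_classes, order, dim) and a boolean
    array marking which classes actually occur in `labels`.
    """
    dim = features.shape[1]
    moments = np.zeros((num_classes, order, dim))
    present = np.zeros(num_classes, dtype=bool)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            present[c] = True
            for k in range(1, order + 1):
                moments[c, k - 1] = (features[mask] ** k).mean(axis=0)
    return moments, present


def labelwise_moment_distance(domain_feats, domain_labels, num_classes, order=2):
    """Mean squared gap between per-class moments, averaged over domain pairs.

    Classes missing from either domain of a pair are skipped for that pair.
    """
    stats = [
        labelwise_moments(f, y, num_classes, order)
        for f, y in zip(domain_feats, domain_labels)
    ]
    total, count = 0.0, 0
    for i in range(len(stats)):
        for j in range(i + 1, len(stats)):
            (m_i, p_i), (m_j, p_j) = stats[i], stats[j]
            shared = p_i & p_j
            if shared.any():
                total += float(np.mean((m_i[shared] - m_j[shared]) ** 2))
                count += 1
    return total / count if count else 0.0


def pseudolabels(probs):
    """Assumed pseudolabel rule: take the most probable class per target sample."""
    return probs.argmax(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_classes, dim, n = 3, 4, 300
    centers = rng.normal(size=(num_classes, dim)) * 3.0

    def make_domain(shift):
        y = rng.integers(0, num_classes, size=n)
        x = centers[y] + shift + rng.normal(size=(n, dim))
        return x, y

    (x_s1, y_s1), (x_s2, y_s2), (x_t, _) = make_domain(0.0), make_domain(0.5), make_domain(1.0)

    # Stand-in classifier (an assumption): softmax over negative squared
    # distances to the class centers, used only to produce target pseudolabels.
    d2 = ((x_t[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    logits = -(d2 - d2.min(axis=1, keepdims=True))
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    y_t_pseudo = pseudolabels(probs)

    loss = labelwise_moment_distance(
        [x_s1, x_s2, x_t], [y_s1, y_s2, y_t_pseudo], num_classes
    )
    print("label-wise moment distance:", loss)
```

In a training loop, a quantity like this would typically be added to the classification loss on the source domains so that the feature extractor is pushed to produce class-conditional feature statistics that agree across domains. The sketch only computes the distance on fixed features.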
