Sample-to-Sample Correspondence for Unsupervised Domain Adaptation
The work addresses domain shift in real-world machine learning applications where labeled target data is unavailable, though its contribution is incremental because it builds on existing local domain-adaptation approaches.
The paper tackles unsupervised domain adaptation by finding sample-to-sample correspondences between the source and target domains through graph matching posed as a convex optimization problem. In simulations on synthetic, image classification, and sentiment classification datasets, the method outperforms traditional moment-matching methods and is competitive with current local methods.
The assumption that training and testing samples are generated from the same distribution does not always hold in real-world machine-learning applications. Addressing this discrepancy between the training (source) and testing (target) domains is known as domain adaptation. We propose an unsupervised domain adaptation method that assumes only unlabeled data is available in the target domain. Our approach centers on finding correspondences between the samples of the two domains. The correspondences are obtained by treating the source and target samples as graphs and matching them with a convex criterion. The criterion combines first-order and second-order similarities between the graphs with a class-based regularization. We have also developed a computationally efficient routine for the convex optimization, which allows the proposed method to be used widely. To verify the effectiveness of the proposed method, computer simulations were conducted on synthetic, image classification, and sentiment classification datasets. The results show that the proposed local sample-to-sample matching method outperforms traditional moment-matching methods and is competitive with current local domain-adaptation methods.
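To make the idea of convex graph matching concrete, the following is a minimal sketch and not the paper's actual formulation or solver. It assumes a relaxed correspondence matrix `C` whose rows lie on the probability simplex, a first-order term that compares node features, a second-order term that compares pairwise-distance (edge) matrices, and a Frank-Wolfe loop with exact line search as the optimizer. The function names, the weight `lam`, and the choice of Euclidean distances are all hypothetical, and the class-based regularization mentioned in the abstract is omitted.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between rows of X (assumed edge weights)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.sqrt(np.maximum(d2, 0.0))


def objective(C, Xs, Xt, Ds, Dt, lam):
    """Convex quadratic criterion in C (an assumed form, not the paper's)."""
    first = np.sum((C @ Xt - Xs) ** 2)       # first-order: node mismatch
    second = np.sum((Ds @ C - C @ Dt) ** 2)  # second-order: edge mismatch
    return first + lam * second


def gradient(C, Xs, Xt, Ds, Dt, lam):
    """Gradient of the objective above with respect to C."""
    r1 = C @ Xt - Xs
    r2 = Ds @ C - C @ Dt
    return 2.0 * r1 @ Xt.T + 2.0 * lam * (Ds.T @ r2 - r2 @ Dt.T)


def match_samples(Xs, Xt, lam=0.1, n_iter=200):
    """Frank-Wolfe over row-stochastic matrices; returns C of shape (ns, nt)."""
    ns, nt = Xs.shape[0], Xt.shape[0]
    Ds, Dt = pairwise_distances(Xs), pairwise_distances(Xt)
    C = np.full((ns, nt), 1.0 / nt)  # start from uniform correspondences
    for _ in range(n_iter):
        G = gradient(C, Xs, Xt, Ds, Dt, lam)
        # Linear minimization step: each row picks its lowest-gradient column.
        S = np.zeros_like(C)
        S[np.arange(ns), np.argmin(G, axis=1)] = 1.0
        D = S - C
        slope = np.sum(G * D)
        curv = np.sum((D @ Xt) ** 2) + lam * np.sum((Ds @ D - D @ Dt) ** 2)
        if slope >= 0.0 or curv <= 0.0:
            break  # no descent direction left within the feasible set
        # Exact line search for a quadratic, clipped to stay feasible.
        gamma = min(1.0, -slope / (2.0 * curv))
        C = C + gamma * D
    return C


# Toy data: the target is a shuffled, shifted copy of the source.
rng = np.random.default_rng(0)
Xs = rng.normal(size=(20, 2))
Xt = Xs[rng.permutation(20)] + np.array([3.0, -1.0])
C = match_samples(Xs, Xt)
Xs_mapped = C @ Xt  # each source sample expressed as a mix of target samples
```

Under these assumptions, each row of `C` is a soft assignment of one source sample to the target samples, so `C @ Xt` moves the source samples into the target domain, where a classifier could then be trained on them with the original source labels.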