CV · Sep 10, 2018

Improved Techniques for Adversarial Discriminative Domain Adaptation

arXiv:1809.03625v3 · 55 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of domain adaptation for image classification, particularly in scenarios with limited labeled data, by improving upon existing adversarial methods, though it appears incremental as it builds on ADDA and semi-supervised GANs.

The paper tackles unsupervised domain adaptation for image classification by proposing a new framework and loss formulations that extend adversarial discriminative domain adaptation (ADDA). It draws on semi-supervised GANs and adds maximum mean discrepancy (MMD) and reconstruction-based losses to align the target domain with the source domain. The results show that the proposal competes with or outperforms state-of-the-art methods on standard datasets such as SVHN and MNIST, as well as on a neuromorphic vision sensing sign language recognition dataset.

Adversarial discriminative domain adaptation (ADDA) is an efficient framework for unsupervised domain adaptation in image classification, where the source and target domains are assumed to have the same classes, but no labels are available for the target domain. We investigate whether we can improve performance of ADDA with a new framework and new loss formulations. Following the framework of semi-supervised GANs, we first extend the discriminator output over the source classes, in order to model the joint distribution over domain and task. We thus leverage the distribution over the source encoder posteriors (which is fixed during adversarial training) and propose maximum mean discrepancy (MMD) and reconstruction-based loss functions for aligning the target encoder distribution to the source domain. We compare and provide a comprehensive analysis of how our framework and loss formulations extend over simple multi-class extensions of ADDA and other discriminative variants of semi-supervised GANs. In addition, we introduce various forms of regularization for stabilizing training, including treating the discriminator as a denoising autoencoder and regularizing the target encoder with source examples to reduce overfitting under a contraction mapping (i.e., when the target per-class distributions are contracting during alignment with the source). Finally, we validate our framework on standard domain adaptation datasets, such as SVHN and MNIST. We also examine how our framework benefits recognition problems based on modalities that lack training data, by introducing and evaluating on a neuromorphic vision sensing (NVS) sign language recognition dataset, where the source and target domains constitute emulated and real neuromorphic spike events respectively. Our results on all datasets show that our proposal competes with or outperforms the state-of-the-art in unsupervised domain adaptation.
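
The abstract mentions an MMD loss for aligning the target encoder distribution with the fixed source encoder posteriors. As a rough illustration of what such a discrepancy measures, the sketch below computes a biased estimate of squared MMD between two batches of class-posterior vectors. The Gaussian kernel, the bandwidth value, the function names, and the synthetic Dirichlet samples are all assumptions made for this example. It is not the paper's exact loss formulation.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two samples."""
    k_ss = gaussian_kernel(source, source, bandwidth)
    k_tt = gaussian_kernel(target, target, bandwidth)
    k_st = gaussian_kernel(source, target, bandwidth)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


# Synthetic stand-ins for encoder posteriors over 10 classes (assumed data,
# not taken from the paper): each row is a probability vector.
rng = np.random.default_rng(0)
source_posteriors = rng.dirichlet(np.ones(10), size=64)
target_posteriors = rng.dirichlet(np.ones(10) * 0.2, size=64)

# Discrepancy between the two batches; a training loop would minimize a
# differentiable version of this quantity with respect to the target encoder.
discrepancy = mmd_squared(source_posteriors, target_posteriors)
```

In an actual training setup the same computation would be written in an automatic-differentiation framework so that gradients flow into the target encoder while the source posteriors stay fixed, as the abstract describes.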

