CLSep 7, 2018

Adversarial Domain Adaptation for Duplicate Question Detection

arXiv:1809.02255v11130 citations
Originality Incremental advance
AI Analysis

This addresses the problem of automating duplicate detection for forum moderators and users, but it is incremental as it builds on existing domain adaptation methods.

The paper tackled duplicate question detection in forums by using adversarial domain adaptation to transfer knowledge from annotated to unannotated domains, achieving an average improvement of 5.6% over baselines on StackExchange data.

We address the problem of detecting duplicate questions in forums, which is an important step towards automating the process of answering new questions. As finding and annotating such potential duplicates manually is very tedious and costly, automatic methods based on machine learning are a viable alternative. However, many forums do not have annotated data, i.e., questions labeled by experts as duplicates, and thus a promising solution is to use domain adaptation from another forum that has such annotations. Here we focus on adversarial domain adaptation, deriving important findings about when it performs well and what properties of the domains are important in this regard. Our experiments with StackExchange data show an average improvement of 5.6% over the best baseline across multiple pairs of domains.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes