CL · Sep 20, 2024

Unsupervised Domain Adaptation for Keyphrase Generation using Citation Contexts

arXiv:2409.13266v2 · 22 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses the cost of expert annotation when adapting keyphrase generation models to new domains. The advance is incremental, since it builds on existing unsupervised methods.

The paper tackles the problem of adapting keyphrase generation models to new domains without labeled data. It uses citation contexts to create synthetic training data and reports significant and consistent performance improvements across three domains.

Adapting keyphrase generation models to new domains typically involves few-shot fine-tuning with in-domain labeled data. However, annotating documents with keyphrases is often prohibitively expensive and impractical, requiring expert annotators. This paper presents silk, an unsupervised method designed to address this issue by extracting silver-standard keyphrases from citation contexts to create synthetic labeled data for domain adaptation. Extensive experiments across three distinct domains demonstrate that our method yields high-quality synthetic samples, resulting in significant and consistent improvements in in-domain performance over strong baselines.
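
To make the idea of "silver-standard keyphrases from citation contexts" concrete, here is a minimal sketch of one way such synthetic labels could be built. It is not the silk implementation: the candidate extraction (stopword-free n-grams), the ranking rule (number of citation contexts containing the phrase), and every name and parameter (`candidate_phrases`, `silver_keyphrases`, `make_synthetic_sample`, `top_k`, `min_contexts`) are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = frozenset({
    "a", "an", "and", "the", "of", "on", "in", "for", "to", "with",
    "we", "our", "their", "is", "are", "et", "al", "by", "as", "that",
})


def candidate_phrases(text, max_len=3):
    """Return the set of stopword-free n-grams (1..max_len words) in text."""
    tokens = re.findall(r"[a-z][a-z-]*", text.lower())
    phrases = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(tok in STOPWORDS for tok in gram):
                phrases.add(" ".join(gram))
    return phrases


def silver_keyphrases(citation_contexts, top_k=5, min_contexts=2):
    """Rank candidates by how many citation contexts mention them.

    Hypothetical heuristic: a phrase that several citing papers use when
    describing a document is treated as a silver-standard keyphrase.
    """
    counts = Counter()
    for context in citation_contexts:
        # A set per context, so each context votes at most once per phrase.
        counts.update(candidate_phrases(context))
    kept = [(p, c) for p, c in counts.items() if c >= min_contexts]
    # Most contexts first, then longer phrases, then alphabetical order.
    kept.sort(key=lambda pc: (-pc[1], -len(pc[0].split()), pc[0]))
    return [p for p, _ in kept[:top_k]]


def make_synthetic_sample(document_text, citation_contexts):
    """Pair a document with its silver keyphrases as one training example."""
    return {
        "document": document_text,
        "keyphrases": silver_keyphrases(citation_contexts),
    }
```

Samples produced this way could then serve as in-domain training data for fine-tuning a keyphrase generation model, which is the role the abstract assigns to the synthetic labeled data.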

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
