CL · Jun 28, 2024

MetaKP: On-Demand Keyphrase Generation

arXiv:2407.00191v2 · 23 citations
AI Analysis

This work addresses a limitation of traditional keyphrase prediction by catering to diverse user needs in NLP applications, though it is incremental in that it builds on existing methods such as prompting and fine-tuning.

The paper tackles the problem of generating keyphrases tailored to specific user goals or intents, introducing a new paradigm called on-demand keyphrase generation. It shows that a self-consistency prompting method with GPT-4o achieves a SemF1 score of 0.548, outperforming a fully fine-tuned BART-base model.

Traditional keyphrase prediction methods predict a single set of keyphrases per document, failing to cater to the diverse needs of users and downstream applications. To bridge the gap, we introduce on-demand keyphrase generation, a novel paradigm that requires keyphrases that conform to specific high-level goals or intents. For this task, we present MetaKP, a large-scale benchmark comprising four datasets, 7500 documents, and 3760 goals across news and biomedical domains with human-annotated keyphrases. Leveraging MetaKP, we design both supervised and unsupervised methods, including a multi-task fine-tuning approach and a self-consistency prompting method with large language models. The results highlight the challenges of supervised fine-tuning, whose performance is not robust to distribution shifts. By contrast, the proposed self-consistency prompting approach greatly improves the performance of large language models, enabling GPT-4o to achieve 0.548 SemF1, surpassing the performance of a fully fine-tuned BART-base model. Finally, we demonstrate the potential of our method to serve as a general NLP infrastructure, exemplified by its application in epidemic event detection from social media.
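The abstract names self-consistency prompting without describing its mechanics. As a rough illustration only, the sketch below shows the generic idea of self-consistency applied to keyphrases: sample several candidate keyphrase lists for the same document and goal, then keep the phrases that recur across samples. This is not the paper's implementation. The function names, the `min_votes` threshold, and the lowercase/whitespace normalization are all assumptions made for the example, and the sampled lists are assumed to have been obtained from a language model beforehand.

```python
from collections import Counter


def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace so near-identical phrases share a vote."""
    return " ".join(phrase.lower().split())


def self_consistent_keyphrases(samples, min_votes=2, top_k=None):
    """Aggregate several sampled keyphrase lists by simple voting.

    samples:   list of keyphrase lists, one per sampled model output
               (hypothetical input; how they are sampled is out of scope here).
    min_votes: minimum number of samples a phrase must appear in (assumed rule).
    top_k:     optional cap on the number of returned phrases.
    """
    votes = Counter()
    surface = {}  # first-seen surface form for each normalized phrase
    for sample in samples:
        seen = set()  # count each phrase at most once per sample
        for phrase in sample:
            key = normalize(phrase)
            if not key or key in seen:
                continue
            seen.add(key)
            votes[key] += 1
            surface.setdefault(key, phrase.strip())
    # most_common() sorts by vote count, highest first
    ranked = [key for key, count in votes.most_common() if count >= min_votes]
    return [surface[key] for key in ranked[:top_k]]


if __name__ == "__main__":
    sampled = [
        ["COVID-19", "vaccine rollout"],
        ["covid-19", "Vaccine  Rollout", "lockdown"],
        ["COVID-19", "mask mandate"],
    ]
    print(self_consistent_keyphrases(sampled, min_votes=2))
```

Voting over exact normalized strings is the simplest possible aggregation rule. A semantic matching step would be a natural alternative, but none is specified in the text above.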

Code Implementations: 1 repo
