Keyphrase Extraction using Sequential Labeling
This paper addresses keyphrase extraction for document summarization and retrieval, offering an incremental improvement over existing techniques.
The paper tackles keyphrase extraction by framing it as a sequential labeling task, which overcomes the limitations of phrase-level methods and handles keyphrases of varying lengths. It shows that this approach yields significant performance benefits over state-of-the-art methods.
Keyphrases efficiently summarize a document's content and are used in various document processing and retrieval tasks. Several unsupervised techniques and classifiers exist for extracting keyphrases from text documents. Most of these methods operate at the phrase level and rely on part-of-speech (POS) filters for candidate phrase generation. In addition, they do not directly handle keyphrases of varying lengths. In this paper, we overcome these modeling shortcomings by treating keyphrase extraction as a sequential labeling task. To train our taggers, we explore a basic set of features commonly used in NLP tasks as well as predictions from various unsupervised methods. In addition to providing a more natural model of the keyphrase extraction problem, tagging models yield significant performance benefits over existing state-of-the-art extraction methods.
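To make the sequential labeling framing concrete, the following is a minimal sketch of how a token sequence and its keyphrases could be converted to per-token labels and back. The abstract does not state which tag set is used, so the B/I/O scheme here is an assumption, and the function names `bio_encode` and `bio_decode` are hypothetical. The sketch covers only the label representation, not the taggers or features themselves.

```python
def bio_encode(tokens, keyphrases):
    """Tag each token as B (begins a keyphrase), I (inside one), or O (outside).

    Assumption: a B/I/O tag set; the source abstract does not name one.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Match longer phrases first so a shorter phrase cannot claim part of a longer one.
    for phrase in sorted(keyphrases, key=lambda p: -len(p.split())):
        words = phrase.lower().split()
        n = len(words)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            span_is_free = all(t == "O" for t in tags[i:i + n])
            if lowered[i:i + n] == words and span_is_free:
                tags[i] = "B"
                for j in range(i + 1, i + n):
                    tags[j] = "I"
    return tags


def bio_decode(tokens, tags):
    """Recover keyphrases of any length from a B/I/O tag sequence."""
    phrases, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new phrase starts; close any phrase still open.
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # O tag (or a stray I with no open phrase) ends the current phrase.
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


tokens = "We treat keyphrase extraction as a sequential labeling task".split()
tags = bio_encode(tokens, ["keyphrase extraction", "sequential labeling"])
recovered = bio_decode(tokens, tags)
```

Because every token receives its own label, a phrase of any length is simply a B followed by zero or more I tags, so no POS-based candidate generation or fixed phrase length is needed at decoding time.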