Biased TextRank: Unsupervised Graph-Based Content Extraction
This work addresses the need for efficient, unsupervised methods for extracting task-relevant content in NLP. Its contribution is incremental, since it builds directly on the existing TextRank algorithm.
The paper tackles focused content extraction from text by introducing Biased TextRank, a modification of TextRank that prioritizes text spans relevant to a given focus. The authors report improved performance on summarization and explanation extraction, with significant gains in ROUGE-N scores.
We introduce Biased TextRank, a graph-based content extraction method inspired by the popular TextRank algorithm that ranks text spans according to their importance for language processing tasks and according to their relevance to an input "focus." Biased TextRank enables focused content extraction for text by modifying the random restarts in the execution of TextRank. The random restart probabilities are assigned based on the relevance of the graph nodes to the focus of the task. We present two applications of Biased TextRank: focused summarization and explanation extraction, and show that our algorithm leads to improved performance on two different datasets by significant ROUGE-N score margins. Much like its predecessor, Biased TextRank is unsupervised, easy to implement and orders of magnitude faster and lighter than current state-of-the-art Natural Language Processing methods for similar tasks.
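The restart mechanism described above can be read as a personalized PageRank over a sentence graph. The following is a minimal sketch of that idea, not the authors' implementation. Several pieces are assumptions made only for illustration: the function names (`bow`, `cosine`, `biased_textrank`), the bag-of-words cosine similarity used in place of a real sentence embedding, the damping factor of 0.85, and the fixed number of power-iteration steps.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words counts. Assumption: a stand-in for a real sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def biased_textrank(sentences, focus, damping=0.85, iterations=50):
    """Score sentences with restart probabilities biased toward `focus`.

    `damping` and `iterations` are illustrative defaults, not values
    taken from the paper.
    """
    n = len(sentences)
    if n == 0:
        return []
    vecs = [bow(s) for s in sentences]

    # Edge weights: pairwise similarity between sentences, no self-loops.
    weights = [
        [cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in weights]

    # Restart distribution: similarity of each sentence to the focus,
    # normalized to sum to 1. Falls back to uniform (plain TextRank)
    # when no sentence overlaps with the focus.
    focus_vec = bow(focus)
    bias = [cosine(v, focus_vec) for v in vecs]
    total = sum(bias)
    bias = [b / total for b in bias] if total else [1.0 / n] * n

    # Power iteration. Nodes with no outgoing weight pass nothing on,
    # so their score mass is dropped in this simplified version.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) * bias[i]
            + damping * sum(
                weights[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


if __name__ == "__main__":
    docs = [
        "The council approved the new budget on Monday.",
        "Vaccines reduce the spread of measles in children.",
        "The budget includes funds for road repairs.",
    ]
    for sentence, score in zip(docs, biased_textrank(docs, "measles vaccines")):
        print(f"{score:.3f}  {sentence}")
```

The only difference from plain TextRank in this sketch is the `bias` vector: with a uniform `bias` the update reduces to the unbiased ranking, while a focus-weighted `bias` raises the scores of sentences related to the focus and, through the graph edges, of their neighbors. A well-connected sentence can still outrank a focus-matching one at high damping values, so the damping factor controls how strongly the focus shapes the result.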