IR · Nov 27, 2018

sCAKE: Semantic Connectivity Aware Keyword Extraction

arXiv:1811.10831v1 · 61 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses keyword extraction challenges for text analysis, particularly benefiting languages with limited NLP resources, though it appears incremental as it builds on existing graph-based methods.

The authors tackled the problem of keyword extraction by proposing sCAKE, a parameterless graph-based method that captures semantic connectivity between words, and demonstrated its superiority over state-of-the-art algorithms. They also introduced LAKE, a language-agnostic version that eliminates the need for NLP tools, showing it is effective for languages lacking such support.

Keyword Extraction is an important task in several text analysis endeavors. In this paper, we present a critical discussion of the issues and challenges in graph-based keyword extraction methods, along with comprehensive empirical analysis. We propose a parameterless method for constructing graph of text that captures the contextual relation between words. A novel word scoring method is also proposed based on the connection between concepts. We demonstrate that both proposals are individually superior to those followed by the state-of-the-art graph-based keyword extraction algorithms. Combination of the proposed graph construction and scoring methods leads to a novel, parameterless keyword extraction method (sCAKE) based on semantic connectivity of words in the document. Motivated by limited availability of NLP tools for several languages, we also design and present a language-agnostic keyword extraction (LAKE) method. We eliminate the need of NLP tools by using a statistical filter to identify candidate keywords before constructing the graph. We show that the resulting method is a competent solution for extracting keywords from documents of languages lacking sophisticated NLP support.
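For readers unfamiliar with the family of methods the abstract refers to, the following is a minimal sketch of a generic graph-based keyword extractor. It is not the sCAKE or LAKE algorithm: it assumes a simple sentence-level co-occurrence graph and scores each word by its weighted degree, a common baseline. The function names, the tiny stopword list, and the `top_k` argument are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "is", "are",
             "to", "for", "we", "that", "this", "with"}


def build_cooccurrence_graph(text):
    """Link words that share a sentence; edge weight = number of shared sentences."""
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in re.split(r"[.!?]+", text.lower()):
        # Unique non-stopword tokens of this sentence.
        words = {w for w in re.findall(r"[a-z]+", sentence) if w not in STOPWORDS}
        for u, v in combinations(sorted(words), 2):
            graph[u][v] += 1
            graph[v][u] += 1
    return graph


def rank_keywords(text, top_k=5):
    """Rank words by weighted degree (a baseline score, not the paper's scoring)."""
    graph = build_cooccurrence_graph(text)
    scores = {word: sum(neighbours.values()) for word, neighbours in graph.items()}
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


if __name__ == "__main__":
    sample = "Graph methods extract keywords. Keywords summarize text. Graph keywords help."
    print(rank_keywords(sample, top_k=2))
```

The stopword list stands in for the candidate-selection step. The abstract describes LAKE as replacing NLP-based filtering with a statistical filter, which this sketch does not attempt to reproduce.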

Code Implementations: 6 repos