CL, Aug 9, 2017

Identifying Reference Spans: Topic Modeling and Word Embeddings help IR

arXiv:1708.02989v1
Originality: Synthesis-oriented
AI Analysis

This work addresses a specific citation analysis task in information retrieval, with incremental improvements over existing methods.

The paper tackles the problem of identifying the text spans in a document that a citation refers to, using the CL-SciSumm 2016 dataset, and shows that topic models and word embeddings lift performance above that of the previously best-performing system.

The CL-SciSumm 2016 shared task introduced an interesting problem: given a document D and a piece of text that cites D, how do we identify the text spans of D being referenced by the piece of text? The shared task provided the first annotated dataset for studying this problem. We present an analysis of our continued work on improving our system's performance on this task. We demonstrate how topic models and word embeddings can be used to surpass the previously best-performing system.
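To make the task concrete, the following is a minimal sketch of the problem setup only, not the authors' system. It assumes that D has already been split into sentences and ranks them by TF-IDF cosine similarity to the citing text, a plain lexical baseline swapped in here in place of the topic-model and word-embedding features the abstract mentions. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_reference_spans`) and the toy sentences are hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (a dict) per token list."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Smoothed IDF: terms that occur everywhere keep a small positive weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_reference_spans(reference_sentences, citing_text, top_k=1):
    """Return (sentence_index, score) pairs for the top_k candidate spans."""
    # Assumption: each sentence of D is one candidate span.
    token_lists = [tokenize(s) for s in reference_sentences]
    token_lists.append(tokenize(citing_text))
    vectors = tfidf_vectors(token_lists)
    query = vectors[-1]
    scored = [(cosine(query, vec), idx) for idx, vec in enumerate(vectors[:-1])]
    # Highest score first; ties are broken by position in the document.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(idx, score) for score, idx in scored[:top_k]]


# Toy data (invented for illustration, not taken from CL-SciSumm).
reference_sentences = [
    "We train a parser on newswire text.",
    "Topic models capture the themes shared by a sentence and a citation.",
    "Results are reported on the test set.",
]
citing_text = "Prior work used topic models to relate a citation to the cited sentence."
best = rank_reference_spans(reference_sentences, citing_text, top_k=1)
print(best)
```

A purely lexical score like this one fails when the citing text paraphrases the cited span with different vocabulary. That gap is presumably where topic models and word embeddings help, since both can relate texts that share few exact words.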

