CL, May 6, 2018

Construction of the Literature Graph in Semantic Scholar

arXiv:1805.02262v1, 1207 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for scalable systems that structure scientific literature for researchers and developers. Its contribution is incremental in that it reduces graph construction to existing NLP tasks rather than introducing new ones.

The authors tackled the problem of organizing scientific literature into a heterogeneous graph to enable algorithmic manipulation and discovery, resulting in a graph with over 280M nodes including papers, authors, and entities.

We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery. The resulting literature graph consists of more than 280M nodes, representing papers, authors, entities and various interactions between them (e.g., authorships, citations, entity mentions). We reduce literature graph construction into familiar NLP tasks (e.g., entity extraction and linking), point out research challenges due to differences from standard formulations of these tasks, and report empirical results for each task. The methods described in this paper are used to enable semantic features in www.semanticscholar.org.
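To make the idea of a heterogeneous graph concrete, here is a minimal Python sketch of typed nodes (papers, authors, entities) joined by typed edges (authorships, citations, entity mentions), as named in the abstract. It is an illustration under stated assumptions, not the system the authors deployed: the class name `LiteratureGraph`, the edge schema, the identifier format, and the toy data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Node and edge types named in the abstract. The (source, target) schema
# below is an assumption made for this sketch.
NODE_TYPES = {"paper", "author", "entity"}
EDGE_TYPES = {
    "authorship": ("author", "paper"),
    "citation": ("paper", "paper"),
    "entity_mention": ("paper", "entity"),
}


@dataclass
class LiteratureGraph:
    """Toy heterogeneous graph: every node and every edge carries a type."""

    nodes: dict = field(default_factory=dict)  # node_id -> node type
    edges: dict = field(default_factory=lambda: defaultdict(set))  # (src, edge type) -> targets

    def add_node(self, node_id, node_type):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.nodes[node_id] = node_type

    def add_edge(self, src, edge_type, dst):
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        src_type, dst_type = EDGE_TYPES[edge_type]
        # Reject edges whose endpoints are missing or have the wrong type.
        if self.nodes.get(src) != src_type or self.nodes.get(dst) != dst_type:
            raise ValueError(f"{edge_type} must link {src_type} -> {dst_type}")
        self.edges[(src, edge_type)].add(dst)

    def neighbors(self, src, edge_type):
        # Sorted so that results are deterministic.
        return sorted(self.edges.get((src, edge_type), set()))


def build_toy_graph():
    """Build a four-node example; all identifiers are made up."""
    g = LiteratureGraph()
    g.add_node("paper:A", "paper")
    g.add_node("paper:B", "paper")
    g.add_node("author:1", "author")
    g.add_node("entity:entity-linking", "entity")
    g.add_edge("author:1", "authorship", "paper:A")
    g.add_edge("paper:A", "citation", "paper:B")
    g.add_edge("paper:A", "entity_mention", "entity:entity-linking")
    return g
```

Calling `build_toy_graph().neighbors("paper:A", "citation")` lists the papers that the toy paper A cites. In this framing, entity extraction and linking would be the step that produces the `entity_mention` edges; the sketch simply adds them by hand.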

