CL · Aug 29, 2022

Extracting Mathematical Concepts from Text

arXiv:2208.13830v1583 citations · h-index: 33
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of processing noisy domain text, which matters to researchers in mathematics and in knowledge graph construction. Its contribution is incremental: it reports a small experiment that compares existing methods.

The paper tackles the problem of extracting mathematical entities from English texts in category theory, as a first step toward building a mathematical knowledge graph. It compares four term extractors, highlights issues in constructing and evaluating the extracted terms, and provides two open corpora: 755 abstracts from the journal TAC and 15,000 sentences from the nLab community wiki.

We investigate different systems for extracting mathematical entities from English texts in the mathematical field of category theory as a first step for constructing a mathematical knowledge graph. We consider four different term extractors and compare their results. This small experiment showcases some of the issues with the construction and evaluation of terms extracted from noisy domain text. We also make available two open corpora in research mathematics, in particular in category theory: a small corpus of 755 abstracts from the journal TAC (3188 sentences), and a larger corpus from the nLab community wiki (15,000 sentences).
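As an illustration of what comparing term extractors can involve, the sketch below computes pairwise Jaccard overlap between extractor outputs and precision/recall against a reference term list. It is not the authors' evaluation procedure: the extractor names, the toy terms, the gold list, and the normalization rule are all hypothetical, chosen only to show the mechanics.

```python
from __future__ import annotations

from itertools import combinations


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so surface variants compare equal."""
    return " ".join(term.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two term sets; taken to be 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def precision_recall(predicted: set[str], gold: set[str]) -> tuple[float, float]:
    """Exact-match precision and recall of predicted terms against a gold list."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def pairwise_overlap(outputs: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Jaccard overlap for every unordered pair of extractors."""
    return {
        (x, y): jaccard(outputs[x], outputs[y])
        for x, y in combinations(sorted(outputs), 2)
    }


# Hypothetical toy data: two made-up extractors and a made-up gold list.
outputs = {
    "extractor_a": {normalize(t) for t in ["Monoidal category", "functor", "the paper"]},
    "extractor_b": {normalize(t) for t in ["monoidal  category", "natural transformation"]},
}
gold = {normalize(t) for t in ["monoidal category", "functor", "natural transformation"]}

if __name__ == "__main__":
    for pair, score in pairwise_overlap(outputs).items():
        print(pair, round(score, 3))
    for name, terms in outputs.items():
        p, r = precision_recall(terms, gold)
        print(name, round(p, 3), round(r, 3))
```

Exact string matching after light normalization is the simplest possible choice. On noisy text it treats spurious phrases such as "the paper" as ordinary false positives and misses partial matches, which gives a sense of why evaluating extracted terms is harder than it first appears.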
