DL, CL, IR | Jun 11, 2014

POS Tagging and its Applications for Mathematics

arXiv:1406.2880v1 | 20 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses content analysis for scientific information services in mathematics. Its contribution is incremental: it adapts existing NLP methods rather than introducing new paradigms.

The paper adapts Natural Language Processing (NLP) methods, specifically part-of-speech tagging, to mathematical publications so that they can handle mathematical formulae, and demonstrates their use for key phrase extraction and classification in the zbMATH database.

Content analysis of scientific publications is a nontrivial task, but a useful and important one for scientific information services. In the Gutenberg era it was a domain of human experts; in the digital age many machine-based methods, e.g., graph analysis tools and machine-learning techniques, have been developed for it. Natural Language Processing (NLP) is a powerful machine-learning approach to semiautomatic speech and language processing, which is also applicable to mathematics. The well-established methods of NLP have to be adjusted for the special needs of mathematics, in particular for handling mathematical formulae. We demonstrate a mathematics-aware part-of-speech tagger and give a short overview of our adaptation of NLP methods for mathematical publications. We show the use of the tools developed for key phrase extraction and classification in the database zbMATH.
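As a rough illustration of what "mathematics-aware" tagging can mean, the sketch below keeps each formula as a single token with its own tag and then collects adjective/noun runs as candidate key phrases. It is not the authors' tagger. The assumptions are that formulae appear as inline LaTeX between single dollar signs, that a tiny hypothetical lexicon (`LEXICON`) stands in for a trained tagger, and that the tag name `FORMULA` and the functions `tag` and `key_phrases` are invented for this example.

```python
import re

# Assumption: formulae are inline LaTeX delimited by single dollar signs.
FORMULA = re.compile(r"\$[^$]+\$")
# A token is a whole formula, a word, or a single punctuation character.
TOKEN = re.compile(r"\$[^$]+\$|\w+|[^\w\s]")

# Hypothetical toy lexicon; a real tagger would be trained on a corpus.
LEXICON = {
    "let": "VB", "be": "VB", "is": "VBZ", "a": "DT", "the": "DT",
    "of": "IN", "on": "IN", "compact": "JJ", "continuous": "JJ",
    "function": "NN", "space": "NN", "group": "NN",
}


def tag(sentence):
    """Tokenize, keeping each formula as one token, then tag every token."""
    tagged = []
    for tok in TOKEN.findall(sentence):
        if FORMULA.fullmatch(tok):
            tagged.append((tok, "FORMULA"))
        elif not tok[0].isalnum():
            tagged.append((tok, "PUNCT"))
        else:
            # Unknown words default to noun, a common fallback heuristic.
            tagged.append((tok, LEXICON.get(tok.lower(), "NN")))
    return tagged


def key_phrases(tagged):
    """Return maximal adjective/noun runs that end in a noun."""
    phrases, run = [], []
    for tok, pos in tagged + [("", "END")]:  # sentinel flushes the last run
        if pos in ("JJ", "NN"):
            run.append((tok, pos))
            continue
        while run and run[-1][1] != "NN":  # drop trailing adjectives
            run.pop()
        if run:
            phrases.append(" ".join(t for t, _ in run))
        run = []
    return phrases


sentence = "Let $f$ be a continuous function on the compact space $X$."
tagged = tag(sentence)
phrases = key_phrases(tagged)
```

Treating a formula as one opaque token prevents the tokenizer from splitting it into symbols that a general-purpose tagger would mislabel. Using the formula tag as a phrase boundary is a simplification made here, not a claim about how zbMATH handles formulae inside key phrases.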

