CL, Nov 23, 2016

ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala

arXiv:1611.07804v1, 52 citations, Has Code
Originality: Synthesis-oriented
AI Analysis

ATR4S provides a modular and scalable toolkit for researchers and practitioners in natural language processing, but its contribution is incremental: it consolidates existing methods rather than introducing new ones.

The authors tackled the lack of standardized implementations and comparisons for automatic terminology recognition methods by developing ATR4S, an open-source Scala toolkit with over 15 methods, and found that no single method performed best across 7 datasets in terms of average precision.

Automatically recognized terminology is widely used in domain-specific text processing tasks such as machine translation, information retrieval, and sentiment analysis. However, there is still no agreement on which methods are best suited to particular settings and, moreover, there is no reliable comparison of the methods already developed. We believe that one of the main reasons is the lack of implementations of state-of-the-art methods, which are usually non-trivial to recreate. To address these issues, we present ATR4S, open-source software written in Scala that comprises more than 15 methods for automatic terminology recognition (ATR) and implements the whole pipeline: text document preprocessing, term candidate collection, term candidate scoring, and finally term candidate ranking. It is a highly scalable, modular, and configurable tool with support for automatic caching. We also compare 10 state-of-the-art methods on 7 open datasets by average precision and processing time. The experimental comparison reveals that no single method achieves the best average precision on all datasets and that other available ATR tools do not contain the best methods.
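
To make the evaluation metric concrete, here is a minimal sketch of average precision over a ranked list of term candidates. It is not taken from ATR4S and does not use its API: the function name `average_precision`, the example candidates, and the gold set are all hypothetical, and the sketch assumes the score is normalized by the number of correct terms found in the ranked list, which may differ from the exact variant used in the paper.

```python
from typing import Iterable, Set


def average_precision(ranked_candidates: Iterable[str], gold_terms: Set[str]) -> float:
    """Average precision of a ranked candidate list against a gold term set.

    Assumption: normalized by the number of gold terms that actually
    appear in the ranked list (not by the size of the gold set).
    """
    hits = 0
    precision_sum = 0.0
    for rank, candidate in enumerate(ranked_candidates, start=1):
        if candidate in gold_terms:
            hits += 1
            # Precision at this rank, counted only at ranks holding a correct term.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Hypothetical ranked output of some scoring method, best candidate first.
ranked = ["neural network", "the method", "term extraction", "last year"]
gold = {"neural network", "term extraction"}

score = average_precision(ranked, gold)
print(f"{score:.4f}")
```

In this toy ranking the correct terms sit at ranks 1 and 3, so the score averages the precision values 1/1 and 2/3. A method that pushes correct terms toward the top of the list scores higher, which is why the metric suits a comparison of ranking-based ATR methods.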

Code Implementations: 3 repos