IR · Sep 24, 2020

Automatic Extraction of Agriculture Terms from Domain Text: A Survey of Tools and Techniques

arXiv:2009.11796v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the challenge of selecting an effective term extraction tool for automating the population of agricultural knowledge resources. Its contribution is incremental, since it builds on existing methods.

It analyzes three commonly used term extraction tools (RAKE, TerMine, and TermRaider) and compares their precision and recall with those of a newer tool (RENT) when extracting agriculture terms from text.

Agriculture is a key component in any country's development. Domain-specific knowledge resources help users gain insight into the domain. Existing knowledge resources such as AGROVOC and the NAL Thesaurus are developed and maintained by domain experts. The population of terms into these knowledge resources can be automated by using automatic term extraction tools to process unstructured agricultural text. Automatic term extraction is also a key component in many semantic web applications, such as ontology creation, recommendation systems, sentiment classification, and query expansion, among others. The primary goal of an automatic term extraction system is to maximize the number of valid terms and minimize the number of invalid terms extracted from the input set of documents. Despite its importance in various applications, the availability of online tools for this purpose is rather limited. Moreover, the performance of the most popular ones varies significantly. As a consequence, selecting the right term extraction tool is perceived as a serious problem for different knowledge-based applications. This paper presents an analysis of three commonly used term extraction tools, namely RAKE, TerMine, and TermRaider, and compares their performance in terms of precision and recall with that of RENT, a more recent term extractor developed by the authors for the agriculture domain.
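
The evaluation criterion described above, maximizing valid terms while minimizing invalid ones, is what precision and recall measure. The following is a minimal sketch of that computation, not the authors' evaluation code: the function name `precision_recall`, the case and whitespace normalization, and the toy term lists are all assumptions made for illustration, and none of the data comes from the paper.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted terms against a gold term list.

    Assumption: a term counts as valid if it matches a gold term exactly
    after lowercasing and collapsing whitespace.
    """
    def normalize(term):
        return " ".join(term.lower().split())

    extracted_set = {normalize(t) for t in extracted}
    gold_set = {normalize(t) for t in gold}
    valid = extracted_set & gold_set  # extracted terms that are in the gold list

    # Precision: share of extracted terms that are valid.
    precision = len(valid) / len(extracted_set) if extracted_set else 0.0
    # Recall: share of gold terms that were extracted.
    recall = len(valid) / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical toy data, invented for this example.
gold_terms = ["crop rotation", "soil moisture", "drip irrigation", "pest control"]
tool_output = ["Crop rotation", "soil moisture", "the farmer",
               "drip irrigation", "last year"]

p, r = precision_recall(tool_output, gold_terms)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.60 recall=0.75
```

In this toy case three of the five extracted terms are valid (precision 0.60) and three of the four gold terms are found (recall 0.75). Running the same function over the output of each tool on the same corpus would give directly comparable scores, assuming a shared gold term list is available.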
