CL, Aug 30, 2017

TANKER: Distributed Architecture for Named Entity Recognition and Disambiguation

arXiv:1708.09230v3, 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses the inefficient and unreliable integration of NERD systems at companies that need large-scale processing. The contribution is incremental: it builds on existing NERD methods and adds a new architectural approach.

The paper tackles the lack of standardization and scalability in Named Entity Recognition and Disambiguation (NERD) systems for industrial use. It introduces TANKER, a distributed micro-services architecture that exposes a standardized API for combining multiple NERD systems, with the aim of improving scalability, reliability, and failure tolerance.

Named Entity Recognition and Disambiguation (NERD) systems have recently been widely researched to deal with the significant growth of the Web. NERD systems are crucial for several Natural Language Processing (NLP) tasks such as summarization, understanding, and machine translation. However, there is no standard interface specification, i.e., these systems may vary significantly in how they accept inputs and export outputs. Thus, when a company wants to deploy more than one NERD system, the process is laborious and prone to failure. In addition, industrial solutions impose critical requirements, e.g., large-scale processing, completeness, versatility, and licensing. These requirements often act as a barrier, causing companies to ignore otherwise good NERD models. This paper presents TANKER, a distributed architecture that aims to overcome the scalability, reliability, and failure tolerance limitations related to industrial needs by combining NERD systems. To this end, TANKER relies on a micro-services oriented architecture, which enables agile development and delivery of complex enterprise applications. In addition, TANKER provides a standardized API that makes it possible to combine several NERD systems at once.
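As a rough illustration of the idea of a standardized API over several NERD systems, the sketch below wraps each system in an adapter that returns annotations in one common shape, and runs all adapters so that a failing system does not break the others. This is not TANKER's actual API: the `Annotation` schema, `annotate_all`, and both toy adapters are hypothetical names invented for this example, and a thread pool stands in for real micro-services.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Annotation:
    """One entity mention in a common shape (hypothetical schema, not TANKER's)."""
    start: int        # character offset where the mention begins
    end: int          # character offset just past the mention
    surface: str      # the mention text
    entity_id: str    # identifier of the linked entity
    source: str       # name of the system that produced the annotation


# An adapter wraps one NERD system and converts its output to Annotation objects.
Adapter = Callable[[str], List[Annotation]]


def annotate_all(text: str, adapters: Dict[str, Adapter]) -> Dict[str, List[Annotation]]:
    """Run every adapter on the text; an adapter that fails yields an empty list."""
    results: Dict[str, List[Annotation]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(adapters))) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in adapters.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=5.0)
            except Exception:
                # Isolate the failure so the remaining systems still contribute.
                results[name] = []
    return results


def toy_gazetteer(text: str) -> List[Annotation]:
    """Toy stand-in for a real NERD system: links the string 'Paris' only."""
    start = text.find("Paris")
    if start < 0:
        return []
    return [Annotation(start, start + 5, "Paris", "Q90", "toy_gazetteer")]


def broken_system(text: str) -> List[Annotation]:
    """Toy stand-in for a system that is currently unavailable."""
    raise RuntimeError("service unavailable")


if __name__ == "__main__":
    output = annotate_all(
        "I love Paris.",
        {"toy_gazetteer": toy_gazetteer, "broken_system": broken_system},
    )
    for system, annotations in output.items():
        print(system, annotations)
```

In a deployment closer to what the abstract describes, each adapter would presumably call a separate service over the network rather than a local function; the common output shape and the per-system failure isolation are the points this sketch is meant to show.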
