DC CL Feb 10, 2018

Distributed NLP

arXiv:1802.03606v1 1 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the efficiency of NLP for Turkish-language text, but it is incremental: it applies existing methods to a new dataset.

The paper addresses the processing of Turkish scientific papers using distributed NLP with MapReduce on a Hadoop cluster. It compares performance against single-machine setups, though no concrete numbers are provided.

In this paper we present the performance of parallel text processing with MapReduce on a cloud platform. Scientific papers in the Turkish language are processed using the Zemberek NLP library. Experiments were run on a Hadoop cluster and compared with single-machine performance.
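The map and reduce steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `analyze` function is a hypothetical stand-in for a Zemberek call (it only splits on whitespace), and a local thread pool stands in for Hadoop cluster workers. All function names here are assumptions made for illustration.

```python
from collections import Counter
from functools import reduce
from multiprocessing.dummy import Pool  # thread pool; stands in for cluster workers


def analyze(text):
    # Hypothetical stand-in for a Zemberek NLP call: whitespace tokenization only.
    return text.split()


def map_phase(document):
    # Map: one document in, a table of per-document token counts out.
    return Counter(analyze(document))


def reduce_phase(left, right):
    # Reduce: merge two partial count tables into one.
    return left + right


def count_tokens(documents, workers=4):
    # Run the map step in parallel, then fold the partial results together.
    with Pool(workers) as pool:
        partials = pool.map(map_phase, documents)
    return reduce(reduce_phase, partials, Counter())
```

Calling `count_tokens(["bu bir makale", "bu bir deney"])` returns a `Counter` of token frequencies across both documents. On a real cluster the map tasks would be scheduled across nodes and the per-document analysis would be far heavier than a whitespace split.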
