CLLGMay 7, 2019

Development of Deep Learning Based Natural Language Processing Model for Turkish

arXiv:1905.05699v11 citations
Originality Synthesis-oriented
AI Analysis

This work addresses a domain-specific need for Turkish natural language processing researchers by providing a platform for POS tagging, but it is incremental as it applies an existing deep learning method to a new language context.

The researchers tackled the problem of part-of-speech tagging for Turkish, a morphologically rich language, by proposing a model using Bidirectional Long-Short Term Memory (BLSTM), which reduced the error rate through expert feedback during development.

Natural language is one of the most fundamental features that distinguish people from other living things and enable people to communicate each other. Language is a tool that enables people to express their feelings and thoughts and to transfers cultures through generations. Texts and audio are examples of natural language in daily life. In the natural language, many words disappear in time, on the other hand new words are derived. Therefore, while the process of natural language processing (NLP) is complex even for human, it is difficult to process in computer system. The area of linguistics examines how people use language. NLP, which requires the collaboration of linguists and computer scientists, plays an important role in human computer interaction. Studies in NLP have increased with the use of artificial intelligence technologies in the field of linguistics. With the deep learning methods which are one of the artificial intelligence study areas, platforms close to natural language are being developed. Developed platforms for language comprehension, machine translation and part of speech (POS) tagging benefit from deep learning methods. Recurrent Neural Network (RNN), one of the deep learning architectures, is preferred for processing sequential data such as text or audio data. In this study, Turkish POS tagging model has been proposed by using Bidirectional Long-Short Term Memory (BLSTM) which is an RNN type. The proposed POS tagging model is provided to natural language researchers with a platform that allows them to perform and use their own analysis. In the development phase of the platform developed by using BLSTM, the error rate of the POS tagger has been reduced by taking feedback with expert opinion.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes