CL, Feb 14, 2014

Machine Learning of Phonologically Conditioned Noun Declensions For Tamil Morphological Generators

arXiv:1402.3382v1, 3 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses a practical issue in natural language generation for Tamil, but it is incremental: it applies existing machine learning methods to a specific linguistic domain.

The paper tackles the problem of generating word forms for Tamil, an agglutinative language, by learning morphophonemic rules from training data without explicit rule specification, and reports successful results with decision trees and Bayesian algorithms.

This paper presents machine learning solutions to a practical problem of Natural Language Generation (NLG): supervised word formation in agglutinative languages such as Tamil. The morphological generator is an important component of Natural Language Processing in Artificial Intelligence; it generates word forms given a root and affixes. Morphophonemic changes such as addition, deletion, and alternation occur when two or more morphemes or words are joined together. In rule-based morphological analyzers and generators, these sandhi rules must be specified explicitly. In a machine learning framework, the system can learn the rules automatically from training samples and then apply them to new inputs. In this paper we propose machine learning models that learn the morphophonemic rules for noun declensions from the given training data. The models are trained to learn sandhi rules using various learning algorithms, and the performance of those algorithms is presented. From this we conclude that morphological processing such as word form generation can be learned successfully in a supervised manner, without explicit description of rules. The performance of decision tree and Bayesian machine learning algorithms on noun declensions is discussed.
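To make the idea of learning sandhi rules from examples concrete, here is a minimal sketch of a Bayesian classifier that predicts which morphophonemic change a noun root undergoes before a case suffix. Everything in it is an assumption made for illustration and is not taken from the paper: the informal romanization, the six-item toy training set, the rule labels (`am_to_att`, `double_t`, `no_change`), the two suffix features, and the names `NaiveBayes`, `apply_rule`, and `generate`. The paper's actual features, data, and algorithm settings are not described in this abstract.

```python
import math
from collections import Counter, defaultdict

# Toy training pairs (hypothetical, informally romanized): root -> sandhi rule label.
TRAIN = [
    ("maram", "am_to_att"),   # maram + ai -> marattai
    ("palam", "am_to_att"),   # palam + ai -> palattai
    ("viitu", "double_t"),    # viitu + ai -> viittai
    ("naatu", "double_t"),    # naatu + ai -> naattai
    ("paiyan", "no_change"),  # paiyan + ai -> paiyanai
    ("miin", "no_change"),    # miin + ai -> miinai
]

def features(root):
    """Phonological context: the last one and last two letters of the root."""
    return {"last1": root[-1:], "last2": root[-2:]}

class NaiveBayes:
    """Categorical Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feat_counts = defaultdict(Counter)  # (label, feature name) -> value counts
        self.vocab = defaultdict(set)            # feature name -> values seen in training

    def fit(self, samples):
        for root, label in samples:
            self.class_counts[label] += 1
            for name, value in features(root).items():
                self.feat_counts[(label, name)][value] += 1
                self.vocab[name].add(value)
        return self

    def predict(self, root):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            for name, value in features(root).items():
                count = self.feat_counts[(label, name)][value]
                size = len(self.vocab[name]) + 1  # one extra slot for unseen values
                score += math.log((count + self.alpha) / (n + self.alpha * size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def apply_rule(root, label, suffix):
    """Apply the predicted morphophonemic change, then attach the suffix."""
    if label == "am_to_att":
        return root[:-2] + "att" + suffix  # final -am becomes -att-
    if label == "double_t":
        return root[:-1] + "t" + suffix    # drop final -u, double the t
    return root + suffix                   # plain concatenation

model = NaiveBayes().fit(TRAIN)

def generate(root, suffix="ai"):
    """Generate a declined form; the default suffix is the accusative -ai."""
    return apply_rule(root, model.predict(root), suffix)

for unseen in ("kulam", "kaatu", "avan"):
    print(unseen, "->", generate(unseen))
```

The classifier never sees a hand-written rule such as "final -am becomes -att- before a vowel"; it only sees labelled roots and generalizes from their endings. A decision tree trained on the same features would be a drop-in alternative to the Bayesian model used here.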
