CL · Aug 18, 2017

Syllable-level Neural Language Model for Agglutinative Language

arXiv:1708.05515v1 · 1089 citations
Originality: Incremental advance
AI Analysis

This paper addresses language modeling challenges for agglutinative languages and has a commercial application, but it is an incremental advance because it builds on existing embedding methods.

The paper tackles the problem of out-of-vocabulary words in language models for agglutinative languages by introducing syllable and morpheme embeddings. The resulting model, with 9.50M parameters, improves perplexity by 16.87 over character-level embeddings and achieves state-of-the-art performance in Key Stroke Saving.

Language models for agglutinative languages have long been hindered by the myriad agglutinations that affixes make possible for any given word. We propose a method that reduces the problem of out-of-vocabulary words by introducing an embedding derived from syllables and morphemes, which leverages the agglutinative property. Our model outperforms character-level embedding in perplexity by 16.87 with 9.50M parameters. The proposed method achieves state-of-the-art performance over existing input prediction methods in terms of Key Stroke Saving and has been commercialized.
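
The intuition behind building word representations from syllables can be illustrated with a toy sketch. It is not the paper's model: the abstract does not say how syllable vectors are combined, so simple averaging is used here as a stand-in, the vectors are random rather than learned, and the Turkish words, their syllable splits, and the function names are all illustrative assumptions.

```python
import random

DIM = 4  # toy embedding size, chosen only for illustration


def build_syllable_table(words, dim=DIM, seed=0):
    """Assign a random vector to every syllable seen in the training words."""
    rng = random.Random(seed)
    table = {}
    for syllables in words:
        for s in syllables:
            if s not in table:
                table[s] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return table


def embed_word(syllables, table, dim=DIM):
    """Average the vectors of the known syllables (assumed pooling rule).

    Returns an all-zero vector if none of the syllables are known.
    """
    known = [table[s] for s in syllables if s in table]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


# Training words, already split into syllables (hypothetical segmentation):
# "evde" (at home) and "evler" (houses).
train_words = [["ev", "de"], ["ev", "ler"]]
table = build_syllable_table(train_words)

# "evlerde" (in the houses) never appeared as a whole word, so a word-level
# vocabulary would treat it as out-of-vocabulary. Every one of its syllables
# was seen, so it still receives a non-trivial representation.
unseen = embed_word(["ev", "ler", "de"], table)
print(unseen)
```

A word-level vocabulary built from the two training words would map "evlerde" to an unknown token, whereas the syllable table covers it completely. A real system would learn the syllable vectors jointly with the language model rather than sampling them at random.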

