CL
Feb 7, 2017

Fixing the Infix: Unsupervised Discovery of Root-and-Pattern Morphology

arXiv:1702.02211v2
1 citation
Originality: Highly original
AI Analysis

The work addresses a long-standing challenge in natural language processing for Semitic languages, offering an unsupervised and language-agnostic solution.

The paper tackles the unsupervised discovery of root-and-pattern morphology in Semitic languages, which prior unsupervised approaches had not handled, and shows that its root extractor compares favorably with the widely used ISRI extractor.

We present an unsupervised and language-agnostic method for learning root-and-pattern morphology in Semitic languages. This form of morphology, abundant in Semitic languages, has not been handled in prior unsupervised approaches. We harness the syntactico-semantic information in distributed word representations to solve the long-standing problem of root-and-pattern discovery in Semitic languages. Moreover, we construct an unsupervised root extractor based on the learned rules. We prove the validity of the learned rules across Arabic, Hebrew, and Amharic, and show that our root extractor compares favorably with a widely used, carefully engineered root extractor: ISRI.
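
To make the term "root-and-pattern" concrete, the following minimal sketch shows how a consonantal root interleaves with a vocalic pattern, and how a root can be read back off a word once a pattern is known. It is only an illustration of the concept, not the paper's method: the paper learns its rules without supervision from distributed word representations, which this sketch does not attempt. The digit-slot pattern notation, the function names `apply_pattern` and `extract_root`, and the use of Latin transliterations of Arabic are all assumptions made for the example.

```python
import re


def apply_pattern(root, pattern):
    """Fill the digit slots of `pattern` with the consonants of `root`.

    Slot "1" takes the first root consonant, "2" the second, and so on.
    Any non-digit character in the pattern is copied through unchanged.
    """
    return "".join(
        root[int(ch) - 1] if ch.isdigit() else ch for ch in pattern
    )


def extract_root(word, pattern):
    """Return the root consonants of `word` under `pattern`, or None.

    Each digit slot is turned into a one-character capture group; the
    literal parts of the pattern must match the word exactly.
    """
    regex = "".join(
        "(.)" if ch.isdigit() else re.escape(ch) for ch in pattern
    )
    match = re.fullmatch(regex, word)
    if match is None:
        return None

    slots = [int(ch) for ch in pattern if ch.isdigit()]
    root = {}
    for slot, char in zip(slots, match.groups()):
        # A slot that appears twice (e.g. a doubled consonant) must
        # capture the same character both times.
        if root.setdefault(slot, char) != char:
            return None
    return "".join(root[slot] for slot in sorted(root))


if __name__ == "__main__":
    # Transliterated Arabic forms built on the root k-t-b ("write").
    for pattern in ("1a2a3a", "1aa2i3", "ma12uu3"):
        word = apply_pattern("ktb", pattern)
        print(pattern, "->", word, "->", extract_root(word, pattern))
```

In this toy setting the patterns are supplied by hand; the difficulty the abstract describes lies in discovering such patterns and roots from raw text without any labelled examples.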
