CL · Nov 24, 2025

A symbolic Perl algorithm for the unification of Nahuatl word spellings

arXiv:2511.19118v1
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific problem for researchers and practitioners working with Nahuatl language texts, but it is incremental as it builds on existing algorithms and corpora.

The paper tackles the problem of automatically unifying orthographic variations in Nahuatl text documents using a symbolic algorithm based on linguistic rules and regular expressions, achieving encouraging results in a manual evaluation of the unified sentences' semantic quality.

In this paper, we describe a symbolic model for the automatic orthographic unification of Nawatl text documents. Our model is based on algorithms that we have previously used to analyze sentences in Nawatl, and on the π-yalli corpus, which consists of texts in several Nawatl orthographies. Our automatic unification algorithm implements linguistic rules as symbolic regular expressions. We also present a manual evaluation protocol that we have proposed and implemented to assess the quality of the unified sentences generated by our algorithm, tested on a sentence-level semantic task. We have obtained encouraging results from the evaluators for most of the desired features of our artificially unified sentences.
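To illustrate what implementing linguistic rules as regular expressions can look like, here is a minimal sketch in Python rather than the paper's Perl. The three rewrite rules, the rule order, and the names `RULES`, `unify_word`, and `unify_sentence` are assumptions made for illustration. They are not taken from the paper, and a real rule set would be far larger and linguistically validated.

```python
import re

# Hypothetical rewrite rules as (pattern, replacement) pairs, applied in order.
# Illustrative only: these are NOT the rules used in the paper.
RULES = [
    (re.compile(r"hu(?=[aeio])"), "w"),  # "hu" before a vowel -> "w"
    (re.compile(r"qu(?=[ei])"), "k"),    # "qu" before e/i -> "k"
    (re.compile(r"c(?=[ao])"), "k"),     # "c" before a/o -> "k"
]


def unify_word(word: str) -> str:
    """Lowercase a word and apply every rewrite rule in sequence."""
    out = word.lower()
    for pattern, replacement in RULES:
        out = pattern.sub(replacement, out)
    return out


def unify_sentence(sentence: str) -> str:
    """Unify each whitespace-separated token of a sentence."""
    return " ".join(unify_word(token) for token in sentence.split())
```

Under these assumed rules, the two spellings that appear on this page, "Nahuatl" and "Nawatl", map to the same unified form, which is the behaviour an orthographic unifier is meant to provide.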

