CL · Jul 12, 2025

Swa-bhasha Resource Hub: Romanized Sinhala to Sinhala Transliteration Systems and Data Resources

arXiv:2507.09245v12 citations · h-index: 13
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need for accessible tools and data in Sinhala NLP, particularly for researchers and developers working with Romanized Sinhala. Its contribution is incremental: it compiles existing resources rather than introducing new methods.

The paper tackles Romanized Sinhala to Sinhala transliteration by providing a resource hub of data sets and algorithms developed between 2020 and 2025. According to the authors, these resources have advanced Sinhala NLP research and supported the development of transliteration models and applications.

The Swa-bhasha Resource Hub provides a comprehensive collection of data resources and algorithms developed for Romanized Sinhala to Sinhala transliteration between 2020 and 2025. These resources have played a significant role in advancing research in Sinhala Natural Language Processing (NLP), particularly in training transliteration models and developing applications involving Romanized Sinhala. The current openly accessible data sets and corresponding tools are made publicly available through this hub. This paper presents a detailed overview of the resources contributed by the authors and includes a comparative analysis of existing transliteration applications in the domain.

