CL · Mar 25

PINGALA: Prosody-Aware Decoding for Sanskrit Poetry Generation

arXiv:2603.24413 · 30.5 · h-index: 8
AI Analysis

This paper addresses the problem of generating high-quality Sanskrit poetry under strict prosodic rules. The contribution is incremental: it builds on existing language models with targeted decoding enhancements.

The paper tackles Sanskrit poetry generation by improving semantic coherence and metrical adherence. Segmenting verses into grouped lines yields a 10% improvement in semantic coherence, and phonetic transliteration with SLP1 yields a 46% increase in metrical alignment, alongside a prosody-aware decoding approach.

Poetry generation in Sanskrit typically requires the verse to be semantically coherent and to adhere to strict prosodic rules. In Sanskrit prosody, every line of a verse is typically a fixed-length sequence of syllables adhering to prescribed binary patterns of syllable weights. We observe that, instead of treating a verse as a monolithic sequence, segmenting it into grouped lines improves semantic coherence by 10% with comparable metrical adherence. Specifically, PINGALA, our proposed decoding approach, is designed to encourage every line to have well-formed words, and our token selection biases the model towards this by preferring longer tokens. Sanskrit writing follows a phonemic orthography; accordingly, using a phonetically aware transliteration scheme, SLP1, increased metrical alignment by 46% with comparable semantic similarity for instruction fine-tuned large language models like Phi-4. We also introduce a new approach for reference-free evaluation using cross-encoders, which achieved better alignment with true poetry instances.
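
To make the "binary patterns of syllable weights" concrete, here is a minimal Python sketch that scans a line written in SLP1 and labels each syllable light (L) or heavy (G). It is an illustration only, not the paper's implementation. The function names `syllable_weights` and `matches_meter` are hypothetical, and the rules are the standard textbook ones: a syllable is heavy if its vowel is long, or if a short vowel is followed by anusvara, visarga, or a cluster of two or more consonants. Clusters are counted across word boundaries, and the common convention that a line-final syllable may optionally count as heavy is not modelled.

```python
# SLP1 uses one ASCII character per phoneme, so weights can be read
# off character by character (no digraphs such as "kh" or "ai").
SHORT_VOWELS = set("aiufx")
LONG_VOWELS = set("AIUFXeEoO")
VOWELS = SHORT_VOWELS | LONG_VOWELS
CONSONANTS = set("kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh")
MARKS = set("MH")  # anusvara and visarga


def syllable_weights(line: str) -> str:
    """Return one 'L' (light) or 'G' (heavy) per syllable of an SLP1 line.

    Hypothetical helper: simplified textbook rules, no special handling
    of the line-final syllable.
    """
    # Drop spaces and punctuation so clusters spanning words are seen.
    phonemes = [ch for ch in line
                if ch in VOWELS or ch in CONSONANTS or ch in MARKS]
    weights = []
    for i, ch in enumerate(phonemes):
        if ch not in VOWELS:
            continue  # each vowel is the nucleus of exactly one syllable
        if ch in LONG_VOWELS:
            weights.append("G")
            continue
        j = i + 1
        # Short vowel followed by anusvara or visarga is heavy.
        if j < len(phonemes) and phonemes[j] in MARKS:
            weights.append("G")
            continue
        # Short vowel followed by two or more consonants is heavy.
        n_consonants = 0
        while j < len(phonemes) and phonemes[j] in CONSONANTS:
            n_consonants += 1
            j += 1
        weights.append("G" if n_consonants >= 2 else "L")
    return "".join(weights)


def matches_meter(line: str, pattern: str) -> bool:
    """Check a line against a fixed L/G pattern (hypothetical helper)."""
    return syllable_weights(line) == pattern


if __name__ == "__main__":
    # "dharmakshetre kurukshetre" in SLP1: eight syllables.
    print(syllable_weights("Darmakzetre kurukzetre"))
```

Because SLP1 maps each phoneme to a single character, vowel length and consonant clusters are visible directly in the string, which is one plausible reason a phonetically aware scheme helps with metrical alignment.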

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
