AS AI Jun 5, 2025

Intelligibility of Text-to-Speech Systems for Mathematical Expressions

Sujoy Roychowdhury, H. G. Ranjani, Sumit Soman, Nishtha Paul, Subhadip Bandyopadhyay, Siddhanth Iyengar

arXiv:2506.11086v14.33 citations h-index: 11 INTERSPEECH

Originality Synthesis-oriented

AI Analysis

This work addresses the problem of making mathematical content accessible via TTS for visually impaired users, but it is incremental as it primarily benchmarks existing models.

The study evaluated the intelligibility of five Text-to-Speech (TTS) models for mathematical expressions, finding that their outputs are often not intelligible, with performance significantly worse than human expert renditions for most categories.

There has been limited evaluation of advanced Text-to-Speech (TTS) models with Mathematical eXpressions (MX) as inputs. In this work, we design experiments to evaluate the quality and intelligibility of five TTS models through listening and transcribing tests for various categories of MX. We use two Large Language Models (LLMs) to generate English pronunciation from LaTeX MX, as TTS models cannot process LaTeX directly. We use the Mean Opinion Score from user ratings and quantify intelligibility through transcription correctness using three metrics. We also compare listener preference for TTS outputs with respect to a human expert rendition of the same MX. Results establish that the output of TTS models for MX is not necessarily intelligible; the gap in intelligibility varies across TTS models and MX categories. For most categories, the performance of TTS models is significantly worse than that of the expert rendition. The effect of the choice of LLM is limited. This establishes the need to improve TTS models for MX.
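
To make the two kinds of measurement in the abstract concrete, here is a minimal sketch in Python. It is an illustration, not the authors' code. The abstract does not name its three transcription metrics, so word error rate is used here purely as an assumed stand-in, and the function names `mean_opinion_score` and `word_error_rate` are hypothetical.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average of listener ratings (an assumed numeric scale, e.g. 1 to 5)."""
    if not ratings:
        raise ValueError("ratings must be non-empty")
    return mean(ratings)


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Assumption: this is a common transcription metric, used here only as
    an example; the abstract does not say which metrics the paper uses.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # reference word dropped
                            curr[j - 1] + 1,       # extra word transcribed
                            prev[j - 1] + cost))   # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

For instance, comparing the spoken form "x squared plus one" with a listener transcription "x square plus one" gives one substituted word out of four reference words.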
