Everyday Speech in the Indian Subcontinent
This work addresses everyday multilingual communication for populations in the Indian subcontinent and represents an incremental improvement in speech synthesis.
The paper tackled the challenge of multilingual speech synthesis for India's many languages by developing a Common Label Set based on phonetics, enabling seamless code-switching across 13 Indian languages and English in a single voice.
India has 1369 languages, of which 22 are official. About 13 different scripts are used to represent these languages. A Common Label Set (CLS), based on phonetics, was developed to address the large vocabulary of units that the End-to-End (E2E) framework requires for multilingual synthesis. Indian-language text is first converted to CLS. This approach enables seamless code-switching across 13 Indian languages and English in a given native speaker's voice, which corresponds to everyday speech in the Indian subcontinent, where the population is multilingual.
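The conversion step can be illustrated with a minimal sketch. It assumes a tiny, hypothetical lookup table (`TOY_CLS`) covering three Devanagari and three Tamil characters; the label names (`k`, `m`, `aa`) and the function `to_common_labels` are invented for illustration and are not the actual CLS inventory or the authors' implementation. The sketch also ignores inherent vowels and other script-specific rules that a real converter would need to handle.

```python
# Hypothetical toy table: labels and coverage are illustrative only,
# not the actual Common Label Set.
TOY_CLS = {
    # Devanagari (used for Hindi, among others)
    "\u0915": "k",   # letter KA
    "\u092E": "m",   # letter MA
    "\u093E": "aa",  # vowel sign AA
    # Tamil
    "\u0B95": "k",   # letter KA
    "\u0BAE": "m",   # letter MA
    "\u0BBE": "aa",  # vowel sign AA
}


def to_common_labels(text):
    """Map each character to a shared phonetic label.

    Characters missing from the toy table are passed through unchanged.
    """
    return [TOY_CLS.get(ch, ch) for ch in text]


# The same sound sequence written in two different scripts.
devanagari_text = "\u092E\u093E\u0915\u093E"
tamil_text = "\u0BAE\u0BBE\u0B95\u0BBE"

devanagari_labels = to_common_labels(devanagari_text)
tamil_labels = to_common_labels(tamil_text)
```

Under these assumptions, both inputs reduce to the same label sequence, which is the property that lets a single synthesis model share one unit vocabulary across scripts instead of learning a separate symbol set per language.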