Development and Transcription of Assamese Speech Corpus
This work addresses the problem of enabling speech processing tasks for Assamese speakers and researchers. Its contribution is incremental, since it reports only the initial stage of corpus development and presents no results.
The authors address the lack of a speech corpus for Assamese, a language that has received little computational attention, by developing an initial balanced speech corpus and reporting the issues and challenges encountered during this first effort.
A balanced speech corpus is a basic requirement for any speech processing task. In this paper we describe our effort to develop an Assamese speech corpus, focusing mainly on the issues and challenges faced during its development. Because Assamese has received little computational attention, this is the first effort to develop a speech corpus for the language. As corpus development is an ongoing process, we report only the initial work here.