CLJun 29, 2020
Towards the Study of Morphological Processing of the Tangkhul LanguageMirinso Shadang, Navanath Saharia, Thoudam Doren Singh
There is no or little work on natural language processing of Tangkhul language. The current work is a humble beginning of morphological processing of this language using an unsupervised approach. We use a small corpus collected from different sources of text books, short stories and articles of other topics. Based on the experiments carried out, the morpheme identification task using morphessor gives reasonable and interesting output despite using a small corpus.
HCOct 31, 2018
A speech-based driver assisting module for Intelligent Transport SystemHimangshu Sarma, Navanath Saharia
Aim of this research is to transform images of roadside traffic panels to speech to assist the vehicle driver, which is a new approach in the state-of-the-art of the advanced driver assistance systems. The designed system comprises of three modules, where the first module is used to capture and detect the text area in traffic panels, second module is responsible for converting the image of the detected text area to editable text and the last module is responsible for transforming the text to speech. Additionally, during experiments, we developed a corpus of 250 images of traffic panels for two Indian languages.
CLApr 22, 2014
Morphological Analysis of the Bishnupriya Manipuri Language using Finite State TransducersNayan Jyoti Kalita, Navanath Saharia, Smriti Kumar Sinha
In this work we present a morphological analysis of Bishnupriya Manipuri language, an Indo-Aryan language spoken in the north eastern India. As of now, there is no computational work available for the language. Finite state morphology is one of the successful approaches applied in a wide variety of languages over the year. Therefore we adapted the finite state approach to analyse morphology of the Bishnupriya Manipuri language.
CLDec 11, 2013
Towards The Development of a Bishnupriya Manipuri CorpusNayan Jyoti Kalita, Navanath Saharia, Smriti Kumar Sinha
For any deep computational processing of language we need evidences, and one such set of evidences is corpus. This paper describes the development of a text-based corpus for the Bishnupriya Manipuri language. A Corpus is considered as a building block for any language processing tasks. Due to the lack of awareness like other Indian languages, it is also studied less frequently. As a result the language still lacks a good corpus and basic language processing tools. As per our knowledge this is the first effort to develop a corpus for Bishnupriya Manipuri language.
CLSep 27, 2013
Development and Transcription of Assamese Speech CorpusHimangshu Sarma, Navanath Saharia, Utpal Sharma et al.
A balanced speech corpus is the basic need for any speech processing task. In this report we describe our effort on development of Assamese speech corpus. We mainly focused on some issues and challenges faced during development of the corpus. Being a less computationally aware language, this is the first effort to develop speech corpus for Assamese. As corpus development is an ongoing process, in this paper we report only the initial task.