ASLG Jul 12, 2020

NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling

arXiv:2007.06021v1, 21 citations
Originality: Synthesis-oriented
AI Analysis

This dataset fills a gap for speech-technology researchers and practitioners by enabling more robust speaker profiling, though the contribution is incremental, since it builds on existing data collection efforts.

The authors address the lack of comprehensive speaker-profiling datasets by creating the NISP dataset, which contains speech in five Indian languages and English together with linguistic, regional, and physical metadata about the speakers. They also report baseline results for profiling tasks.

Many commercial and forensic applications of speech demand the extraction of information about speaker characteristics, which falls into the broad category of speaker profiling. The characteristics needed for profiling include physical traits of the speaker, such as height, age, and gender, along with the speaker's native language. Many of the available datasets have only partial information for speaker profiling. In this paper, we attempt to overcome this limitation by developing a new dataset which has speech data from five different Indian languages along with English. Metadata for speaker profiling applications, such as linguistic information, regional information, and physical characteristics of a speaker, is also collected. We call this the NITK-IISc Multilingual Multi-accent Speaker Profiling (NISP) dataset. The description of the dataset, potential applications, and baseline results for speaker profiling on this dataset are provided in this paper.

Code Implementations: 1 repo
