Hemant Kumar Yadav

h-index: 17

Papers: 3

Citations: 1,580

Novelty: 27%

AI Score: 25

Ranked #165,288 of 194,257 authors (top 85%); #28,084 in CL (top 91%)

3 Papers

31.3 | CL | Feb 25, 2022

A Survey of Multilingual Models for Automatic Speech Recognition

Hemant Yadav, Sunayana Sitaram

Although Automatic Speech Recognition (ASR) systems have achieved human-like performance for a few languages, the majority of the world's languages do not have usable systems due to the lack of large speech datasets to train these models. Cross-lingual transfer is an attractive solution to this problem, because low-resource languages can potentially benefit from higher-resource languages either through transfer learning or by being jointly trained in the same multilingual model. The problem of cross-lingual transfer has been well studied in ASR; however, recent advances in Self Supervised Learning are opening up avenues for unlabeled speech data to be used in multilingual ASR models, which can pave the way for improved performance on low-resource languages. In this paper, we survey the state of the art in multilingual ASR models that are built with cross-lingual transfer in mind. We present best practices for building multilingual models from research across diverse languages and techniques, discuss open questions, and provide recommendations for future work.

1.9 | SD | Nov 25, 2020 | Code

mask-Net: Learning Context Aware Invariant Features using Adversarial Forgetting (Student Abstract)

Hemant Yadav, Atul Anshuman Singh, Rachit Mittal et al.

Training a robust system, e.g., Speech to Text (STT), requires large datasets. Variability present in the dataset, such as unwanted nuisances and biases, is the reason large datasets are needed to learn general representations. In this work, we propose a novel approach to induce invariance using adversarial forgetting (AF). Our initial experiments on learning features invariant to factors such as accent on the STT task achieve better generalization in terms of word error rate (WER) compared to the traditional models. We observe an absolute improvement of 2.2% and 1.3% on out-of-distribution and in-distribution test sets, respectively.

31.0 | CL | Sep 6, 2020

MIDAS at SemEval-2020 Task 10: Emphasis Selection using Label Distribution Learning and Contextual Embeddings

Sarthak Anand, Pradyumna Gupta, Hemant Yadav et al.

This paper presents our submission to the SemEval 2020 - Task 10 on emphasis selection in written text. We approach this emphasis selection problem as a sequence labeling task where we represent the underlying text with various contextual embedding models. We also employ label distribution learning to account for annotator disagreements. We experiment with the choice of model architectures, trainability of layers, and different contextual embeddings. Our best performing architecture is an ensemble of different models, which achieved an overall matching score of 0.783, placing us 15th out of 31 participating teams. Lastly, we analyze the results in terms of parts of speech tags, sentence lengths, and word ordering.