LG Mar 28, 2023
On Feature Scaling of Recursive Feature Machines
Arunav Gupta, Rohit Mishra, William Luu et al.
In this technical report, we explore the behavior of Recursive Feature Machines (RFMs), a novel type of kernel machine that recursively learns features via the average gradient outer product, through a series of experiments on regression datasets. When successively adding random noise features to a dataset, we observe intriguing patterns in the Mean Squared Error (MSE) curves, with the test MSE exhibiting a decrease-increase-decrease pattern. This behavior is consistent across different dataset sizes, noise parameters, and target functions. Interestingly, the observed MSE curves resemble the "double descent" phenomenon observed in deep neural networks, hinting at a new connection between RFMs and neural network behavior. This report lays the groundwork for future research into this peculiar behavior.
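To make the average gradient outer product (AGOP) concrete, here is a minimal sketch that estimates it with central finite differences. It is not the report's implementation: the function name `agop`, the stand-in predictor `target`, and the sample sizes are assumptions chosen for illustration. In an RFM the gradients would come from the trained kernel predictor rather than from a known function.

```python
import numpy as np

def agop(f, X, eps=1e-5):
    """Estimate the average gradient outer product of f over the rows of X.

    Gradients are approximated with central finite differences.
    """
    n, d = X.shape
    M = np.zeros((d, d))
    for x in X:
        grad = np.zeros(d)
        for j in range(d):
            step = np.zeros(d)
            step[j] = eps
            grad[j] = (f(x + step) - f(x - step)) / (2 * eps)
        M += np.outer(grad, grad)
    return M / n

# Hypothetical data: one informative feature followed by three noise features.
rng = np.random.default_rng(0)
signal = rng.normal(size=(200, 1))
noise = rng.normal(size=(200, 3))
X = np.hstack([signal, noise])

def target(x):
    # Stand-in predictor (an assumption): depends only on the first coordinate.
    return np.sin(x[0])

M = agop(target, X)
print(np.round(M, 3))
```

Because the stand-in predictor ignores the noise coordinates, only the top-left entry of `M` is non-zero, which illustrates how the AGOP can down-weight appended noise features.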
CL Jul 21, 2025
Enhancing Hindi NER in Low Context: A Comparative study of Transformer-based models with vs. without Retrieval Augmentation
Sumit Singh, Rohit Mishra, Uma Shanker Tiwary
One major challenge in natural language processing is named entity recognition (NER), which identifies and categorises named entities in textual input. To improve NER, this study investigates a Hindi NER technique that uses Hindi-specific pretrained encoders (MuRIL and XLM-R) and generative models (Llama-2-7B-chat-hf (Llama2-7B), Llama-2-70B-chat-hf (Llama2-70B), Llama-3-70B-Instruct (Llama3-70B) and GPT3.5-turbo), and applies retrieval augmentation (RA), enriching the data with context retrieved from relevant external sources, notably Wikipedia. We fine-tuned MuRIL, XLM-R and Llama2-7B with and without RA, whereas Llama2-70B, Llama3-70B and GPT3.5-turbo are used for few-shot NER generation. Our investigation shows that, in most cases, these language models (LMs) with RA outperform baseline methods that do not incorporate RA. The macro F1 scores for MuRIL and XLM-R are 0.69 and 0.495, respectively, without RA and increase to 0.70 and 0.71, respectively, with RA. Fine-tuned Llama2-7B outperforms Llama2-7B without fine-tuning by a significant margin. On the other hand, the generative models that are not fine-tuned also perform better with augmented data. GPT3.5-turbo adopted RA well; however, Llama2-70B and Llama3-70B did not adopt RA with our retrieval context. The findings show that RA significantly improves performance, especially for low-context data. This study adds significant knowledge about how best to use data augmentation methods and pretrained models to enhance NER performance, particularly in languages with limited resources.
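For readers unfamiliar with the metric quoted above, the following is a minimal sketch of a macro F1 computation: the per-class F1 scores are averaged with equal weight. It assumes token-level labels, which may differ from the entity-level scoring the study used; the function name `macro_f1` and the toy labels are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example with hypothetical tags (not data from the study).
y_true = ["PER", "LOC", "O", "PER"]
y_pred = ["PER", "O", "O", "LOC"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Because every class counts equally, a rare entity type that is missed entirely lowers the macro score as much as a frequent one would.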
CL Jan 30, 2017
Structural Analysis of Hindi Phonetics and A Method for Extraction of Phonetically Rich Sentences from a Very Large Hindi Text Corpus
Shrikant Malviya, Rohit Mishra, Uma Shanker Tiwary
Automatic speech recognition (ASR) and text-to-speech (TTS) are two prominent areas of research in human-computer interaction (HCI). A set of phonetically rich sentences is important for developing these two interactive HCI modules. Essentially, the set of phonetically rich sentences has to cover all possible phone units, distributed uniformly. Selecting such a set from a large corpus while maintaining similarity in phonetic characteristics is still a challenging problem. The major objective of this paper is to devise a criterion for selecting a set of sentences that encompasses all phonetic aspects of a corpus while keeping the set as small as possible. First, this paper presents a statistical analysis of Hindi phonetics by observing its structural characteristics. Further, a two-stage algorithm is proposed to extract phonetically rich sentences with a high variety of triphones from the EMILLE Hindi corpus. The algorithm includes a distance-measuring criterion that selects a sentence so as to improve the triphone distribution. Moreover, a special preprocessing method is proposed that scores each triphone by its inverse probability in order to speed up the algorithm. The results show that the approach efficiently builds a uniformly distributed, phonetically rich corpus with an optimal number of sentences.
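The idea of scoring triphones by inverse probability and picking sentences that improve coverage can be sketched as follows. This is a simple greedy coverage heuristic, not the paper's two-stage algorithm or its distance criterion. The function names, the space-separated phone notation, and the toy sentences are all assumptions.

```python
from collections import Counter

def triphones(sentence):
    """Set of consecutive phone triples; phones are assumed space-separated."""
    units = sentence.split()
    return {tuple(units[i:i + 3]) for i in range(len(units) - 2)}

def select_rich_sentences(sentences):
    """Greedily pick sentence indices until every triphone is covered."""
    tri_sets = [triphones(s) for s in sentences]
    counts = Counter(t for ts in tri_sets for t in ts)
    total = sum(counts.values())
    # Inverse-probability score: rare triphones are worth more.
    weight = {t: total / c for t, c in counts.items()}
    uncovered = set(counts)
    chosen = []
    while uncovered:
        best = max(
            range(len(sentences)),
            key=lambda i: sum(weight[t] for t in tri_sets[i] & uncovered),
        )
        chosen.append(best)
        uncovered -= tri_sets[best]
    return chosen

# Hypothetical phone strings, not drawn from the EMILLE corpus.
corpus = ["k a m a l", "a m a l a", "k a m", "n a m a k"]
chosen = select_rich_sentences(corpus)
print(chosen)
```

Weighting by inverse probability steers the early picks toward sentences that contain rare triphones, which is one plausible reading of why such scoring speeds up selection.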
CR Sep 11, 2012
Two Way Concurrent Buffer System without Deadlock in Various Time Models Using Timed Automata
Rohit Mishra, Md Zeeshan, Sanjay Singh
A two-way buffer system transfers data using two buffers concurrently. It includes processes that synchronize to exchange data with each other and execute certain delays between these synchronizations. In the existing Tiny Two Way Buffer System, both generators produce packets in a half-duplex manner under the no-time, deterministic-time, and nondeterministic-time models. Analysis of the model under these time options shows that it reaches deadlock. The model can avoid deadlock if its timings are arranged in an alternating fashion. The generators produce packets after a delay of 10 seconds. However, generator one has an initial shift of 5 seconds, after which it begins sending a packet every 10 seconds. Hence, the initial delay is 15 seconds for generator one and 10 seconds for generator two. Due to this initial shift, the generators produce packets alternately and the system is deadlock free, as the packets never meet at the same time instant. Moreover, the existing system model is not concurrent and hence takes more time for packet transfer in every iteration. In this paper we propose a model of the buffer system that uses an additional dummy buffer for the transfer of data packets, and we check the model under the no-time, deterministic-time, and nondeterministic-time models. Under these time models, the proposed model also reaches deadlock. We achieve a deadlock-free situation by introducing appropriate nondeterministic delays in the various buffers of the proposed system. The new approach speeds up the transfer of packets; as a result, the transfer of data becomes concurrent and deadlock free, and the proposed model is time efficient. Simulation results show that the proposed two-way buffer system is fully concurrent and time efficient compared to the existing buffer system.
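The timing argument for the existing system (initial delays of 15 and 10 seconds, then one packet every 10 seconds) can be checked with simple arithmetic. The sketch below is not a timed-automata model and does not reproduce the paper's simulation. The helper `emission_times` and the 60-second horizon are assumptions made for illustration.

```python
def emission_times(initial_delay, period, horizon):
    """Instants (in seconds) at which a generator emits, up to the horizon."""
    return list(range(initial_delay, horizon + 1, period))

HORIZON = 60  # assumed observation window in seconds

gen_one = emission_times(15, 10, HORIZON)  # 5 s shift plus the 10 s delay
gen_two = emission_times(10, 10, HORIZON)

# Packets collide only if both generators emit at the same instant.
collisions = set(gen_one) & set(gen_two)

# Merge both schedules in time order to inspect the interleaving.
merged = sorted([(t, "G1") for t in gen_one] + [(t, "G2") for t in gen_two])
sources = [name for _, name in merged]

print("collisions:", collisions)
print("order:", sources)
```

Within the window, the two schedules never share an instant and the generators strictly alternate, which matches the abstract's explanation of why the 5-second shift avoids the deadlock.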