Dhruv Rathi

CV
4 papers
181 citations
Novelty 45%
AI Score 39

4 Papers

CL Mar 1
Towards Orthographically-Informed Evaluation of Speech Recognition Systems for Indian Languages

Kaushal Santosh Bhogale, Tahir Javed, Greeshma Susan John et al.

Evaluating ASR systems for Indian languages is challenging due to spelling variations, suffix splitting flexibility, and non-standard spellings in code-mixed words. Traditional Word Error Rate (WER) often presents a bleaker picture of system performance than what human users perceive. Better aligning evaluation with real-world performance requires capturing permissible orthographic variations, which is extremely challenging for under-resourced Indian languages. Leveraging recent advances in LLMs, we propose a framework for creating benchmarks that capture permissible variations. Through extensive experiments, we demonstrate that OIWER, by accounting for orthographic variations, reduces pessimistic error rates (an average improvement of 6.3 points), narrows inflated model gaps (e.g., Gemini-Canary performance difference drops from 18.1 to 11.5 points), and aligns more closely with human perception than prior methods like WER-SN by 4.9 points.
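The idea of scoring against permissible spellings can be illustrated with a minimal sketch. This is not the paper's OIWER definition, which the abstract does not spell out; it is a plain word-level edit distance in which a hypothesis word counts as correct if it appears in a set of accepted variants for the reference word. The function name `wer`, the `variants` mapping, and the English toy sentence are assumptions made for illustration only, and the sketch does not handle suffix splitting (one word written as two).

```python
from typing import Dict, List, Optional, Set


def wer(reference: List[str],
        hypothesis: List[str],
        variants: Optional[Dict[str, Set[str]]] = None) -> float:
    """Word error rate; `variants` (hypothetical) maps a reference word
    to alternative spellings that should be accepted as correct."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    variants = variants or {}

    def matches(ref_word: str, hyp_word: str) -> bool:
        return ref_word == hyp_word or hyp_word in variants.get(ref_word, set())

    n, m = len(reference), len(hypothesis)
    # dp[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words.
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i  # i deletions
    for j in range(1, m + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if matches(reference[i - 1], hypothesis[j - 1]) else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # match or substitution
    return dp[n][m] / n


# Toy example: "color" is an accepted spelling, "teh" is a real error.
ref = "the colour of the sky".split()
hyp = "the color of teh sky".split()
plain = wer(ref, hyp)
lenient = wer(ref, hyp, {"colour": {"color"}})
```

With the variant table supplied, the spelling difference no longer counts as an error while the genuine misrecognition still does, which is the kind of reduction in pessimistic error rates the abstract describes.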

CV May 25, 2018
Underwater Fish Species Classification using Convolutional Neural Network and Deep Learning

Dhruv Rathi, Sushant Jain, Dr. S. Indu

This paper proposes a method for automated classification of fish species. High-accuracy fish classification is required for a greater understanding of fish behavior in ichthyology and by marine biologists. Concerned institutions need to maintain a ledger of the number of fish per species and to mark endangered species in large and small water bodies. The majority of available methods focus on classifying fish outside of water, because underwater classification poses challenges such as background noise, image distortion, the presence of other water bodies in images, image quality, and occlusion. This method uses a novel technique based on Convolutional Neural Networks, Deep Learning, and Image Processing to achieve an accuracy of 96.29%, a considerable improvement in discrimination accuracy over previously proposed methods.

CR May 17, 2018
DroidMark: A Tool for Android Malware Detection using Taint Analysis and Bayesian Network

Dhruv Rathi, Rajni Jindal

With the growing user base of Android devices and the advent of technologies such as Internet banking, sensitive user data is prone to misuse by malware and spyware applications. As the app developer community grows, quality cannot be assured for every application, and the possibility of data leakage arises. In this research, to verify application authenticity, Deep Learning methods and Taint Analysis are applied to the applications. The detection system, named DroidMark, looks for possible sources and sinks of data leakage by modelling the Android lifecycle and callbacks; this is done by reverse engineering the APK, then monitoring the suspected processes and collecting data in different states of the application. DroidMark thus extracts features from the applications, which are fed to a trained Bayesian Network that classifies them as malicious or regular. The results indicate an accuracy of 96.87% and an error rate of 3.13% in detecting malware on Android devices.
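The final step described above, feeding extracted features to a Bayesian classifier, can be sketched with naive Bayes, which is the simplest Bayesian network (the class node is the only parent of every feature). This is an illustrative stand-in, not DroidMark's actual network structure, which the abstract does not give. The feature names, the four training samples, and the function names are all hypothetical toy values.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical binary features a taint analysis might emit per app.
FEATURES = ["reads_contacts", "sends_sms", "network_sink"]


def train_naive_bayes(samples: List[Tuple[Dict[str, int], str]]):
    """Estimate P(label) and P(feature=1 | label) with Laplace smoothing."""
    label_counts: Dict[str, int] = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for feats, label in samples:
        label_counts[label] += 1
        for name in FEATURES:
            if feats.get(name, 0):
                feature_counts[label][name] += 1
    total = len(samples)
    model = {}
    for label, count in label_counts.items():
        prior = count / total
        # Add-one smoothing over the two outcomes (0 or 1) of each feature.
        likelihood = {name: (feature_counts[label][name] + 1) / (count + 2)
                      for name in FEATURES}
        model[label] = (prior, likelihood)
    return model


def classify(model, feats: Dict[str, int]) -> str:
    """Return the label with the highest log-posterior score."""
    best_label, best_score = None, -math.inf
    for label, (prior, likelihood) in model.items():
        score = math.log(prior)
        for name in FEATURES:
            p = likelihood[name]
            score += math.log(p if feats.get(name, 0) else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data (invented for illustration, not from the paper).
training = [
    ({"reads_contacts": 1, "sends_sms": 1, "network_sink": 1}, "malicious"),
    ({"reads_contacts": 1, "network_sink": 1}, "malicious"),
    ({"network_sink": 1}, "regular"),
    ({}, "regular"),
]
model = train_naive_bayes(training)
suspicious = classify(model, {"reads_contacts": 1, "sends_sms": 1, "network_sink": 1})
benign = classify(model, {})
```

A full Bayesian network would also model dependencies between features (for example, between a source and the sink it flows to), which naive Bayes deliberately ignores.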

CV May 17, 2018
Optimization of Transfer Learning for Sign Language Recognition Targeting Mobile Platform

Dhruv Rathi

The aim of this research is to experiment with, iterate on, and recommend a system that successfully recognizes American Sign Language (ASL). It is a challenging and interesting problem that, if solved, will bring a leap in social and technological aspects alike. In this paper, we propose a real-time ASL recognizer that runs on a mobile platform, making it more accessible and easier to use. The technique is transfer learning: new data of hand gestures for ASL alphabet letters is modelled on various pre-trained high-end models, and the best model is optimized to run on a mobile platform, taking the platform's limitations into account. The data consists of 27,455 images covering 24 letters of the ASL alphabet. The optimized model, when run in a memory-efficient mobile application, achieves a recognition accuracy of 95.03% with an average recognition time of 2.42 seconds. This is a considerable improvement in accuracy and recognition time over previous research.