T. V. Prabhakar

AS
h-index: 4
10 papers
43 citations
Novelty: 38%
AI Score: 33

10 Papers

CL · Nov 3, 2022
H_eval: A new hybrid evaluation metric for automatic speech recognition tasks

Zitha Sasindran, Harsha Yelchuri, T. V. Prabhakar et al.

Many studies have examined the shortcomings of word error rate (WER) as an evaluation metric for automatic speech recognition (ASR) systems. Since WER considers only literal word-level correctness, new evaluation metrics based on semantic similarity, such as semantic distance (SD) and BERTScore, have been developed. However, we found that these metrics have their own limitations, such as a tendency to overly prioritise keywords. We propose H_eval, a new hybrid evaluation metric for ASR systems that considers both semantic correctness and error rate and performs well in scenarios where WER and SD perform poorly. Because it is computationally lighter than BERTScore, it offers a 49-fold reduction in metric computation time. Furthermore, we show that H_eval correlates strongly with downstream NLP tasks. Also, to reduce the metric calculation time, we built multiple fast and lightweight models using distillation techniques.
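For readers unfamiliar with the baseline, the sketch below illustrates what "literal word-level correctness" means for WER. It is not H_eval and is not taken from the paper: it is the standard word-level edit distance divided by the reference length, and the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length.

    Illustrative helper (hypothetical name), not the H_eval metric.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Two hypotheses with one substituted word each receive the same WER,
# although only the first one reverses the meaning of the command.
meaning_changed = word_error_rate("turn on the light", "turn off the light")
meaning_kept = word_error_rate("turn on the light", "turn on a light")
```

Both hypotheses score 0.25 here, which is the kind of case that motivates semantic and hybrid metrics.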

DC · Jul 14, 2023
Ed-Fed: A generic federated learning framework with resource-aware client selection for edge devices

Zitha Sasindran, Harsha Yelchuri, T. V. Prabhakar

Federated learning (FL) has evolved as a prominent method for edge devices to cooperatively create a unified prediction model while keeping their sensitive training data local to the device. Despite the existence of numerous research frameworks for simulating FL algorithms, they do not facilitate comprehensive deployment for automatic speech recognition tasks on heterogeneous edge devices. This is where Ed-Fed, a comprehensive and generic FL framework, comes in as a foundation for future practical FL system research. We also propose a novel resource-aware client selection algorithm to optimise the waiting time in FL settings. We show that our approach can handle straggler devices and dynamically set the training time for the selected devices in a round. Our evaluation has shown that the proposed approach significantly optimises waiting time in FL compared to conventional random client selection methods.

LG · Nov 21, 2022
PreMa: Predictive Maintenance of Solenoid Valve in Real-Time at Embedded Edge-Level

Prajwal BN, Harsha Yelchuri, Vishwanath Shastry et al.

In industrial process automation, sensors (pressure, temperature, etc.), controllers, and actuators (solenoid valves, electro-mechanical relays, circuit breakers, motors, etc.) make sure that production lines are working under the pre-defined conditions. When these systems malfunction or sometimes completely fail, alerts have to be generated in real-time to ensure not only that production quality is not compromised but also that the safety of humans and equipment is assured. In this work, we describe the construction of a smart and real-time edge-based electronic product called PreMa, which is basically a sensor for monitoring the health of a Solenoid Valve (SV). PreMa is compact, low power, easy to install, and cost effective. It has data fidelity and measurement accuracy comparable to signals captured using high end equipment. The smart solenoid sensor runs TinyML, a compact version of the TensorFlow (a.k.a. TFLite) machine learning framework. While fault detection inferencing is in-situ, model training uses mobile phones to accomplish the 'on-device' training. Our product evaluation shows that the sensor is able to differentiate between the distinct types of faults. These faults include: (a) spool stuck, (b) spring failure, and (c) under voltage. Furthermore, the product provides maintenance personnel with the remaining useful life (RUL) of the SV. The RUL helps them decide whether to replace the valve. We perform an extensive evaluation on optimizing metrics related to performance of the entire system (i.e. embedded platform and the neural network model). The proposed implementation is such that, given any electro-mechanical actuator with similar transient response to that of the SV, the system is capable of condition monitoring, hence presenting a first of its kind generic infrastructure.

AS · Jun 15, 2023
MobileASR: A resource-aware on-device learning framework for user voice personalization applications on mobile phones

Zitha Sasindran, Harsha Yelchuri, Pooja Rao et al.

We describe a comprehensive methodology for developing user-voice personalized automatic speech recognition (ASR) models by effectively training models on mobile phones, allowing user data and models to be stored and used locally. To achieve this, we propose a resource-aware sub-model-based training approach that considers the RAM and battery capabilities of mobile phones. By considering the evaluation metric and resource constraints of the mobile phones, we are able to perform efficient training and halt the process accordingly. To simulate real users, we use speakers with various accents. The entire on-device training and evaluation framework was then tested on various mobile phones across brands. We show that fine-tuning the models and selecting the right hyperparameter values is a trade-off between the lowest achievable performance metric, on-device training time, and memory consumption. Overall, our methodology offers a comprehensive solution for developing personalized ASR models while leveraging the capabilities of mobile phones and balancing the need for accuracy with resource constraints.

CR · Feb 5, 2019 · Code
PUTWorkbench: Analysing Privacy in AI-intensive Systems

Saurabh Srivastava, Vinay P. Namboodiri, T. V. Prabhakar

AI-intensive systems that operate upon user data face the challenge of balancing data utility with privacy concerns. We propose the idea and present the prototype of an open-source tool called Privacy Utility Trade-off (PUT) Workbench, which seeks to aid software practitioners in making such crucial decisions. We pick a simple privacy model that doesn't require any background knowledge in Data Science and show how even that can achieve significant results over standard and real-life datasets. The tool and the source code are made freely available for extensions and usage.

ET · Jul 3, 2025
Vertiport Terminal Scheduling and Throughput Analysis for Multiple Surface Directions

Ravi Raj Saxena, T. V. Prabhakar, Joy Kuri et al.

Vertical Take-Off and Landing (VTOL) vehicles are gaining traction in both the delivery drone market and passenger transportation, driving the development of Urban Air Mobility (UAM) systems. UAM seeks to alleviate road congestion in dense urban areas by leveraging urban airspace. To handle UAM traffic, vertiport terminals (vertiminals) play a critical role in supporting VTOL vehicle operations such as take-offs, landings, taxiing, passenger boarding, refueling or charging, and maintenance. Efficient scheduling algorithms are essential to manage these operations and optimize vertiminal throughput while adhering to safety protocols. Unlike fixed-wing aircraft, which rely on runways for take-off and climbing in fixed directions, VTOL vehicles can utilize multiple surface directions for climbing and approach. This flexibility necessitates specialized scheduling methods. We propose a Mixed Integer Linear Program (MILP) formulation to holistically optimize vertiminal operations, including taxiing, climbing (or approach) using multiple directions, and turnaround at gates. The proposed MILP reduces delays by up to 50%. Additionally, we derive equations to compute upper bounds of the throughput capacity of vertiminals, considering their core elements: the TLOF pad system, taxiway system, and gate system. Our results demonstrate that the MILP achieves throughput levels consistent with the theoretical maximum derived from these equations. We also validate our framework through a case study using a well-established vertiminal topology from the literature. Our MILP can be used to find the optimal configuration of a vertiminal. This dual approach, MILP and throughput analysis, allows for comprehensive capacity analysis without requiring simulations while enabling efficient scheduling through the MILP formulation.

AS · Jan 15, 2024
SeMaScore : a new evaluation metric for automatic speech recognition tasks

Zitha Sasindran, Harsha Yelchuri, T. V. Prabhakar

In this study, we present SeMaScore, generated using a segment-wise mapping and scoring algorithm that serves as an evaluation metric for automatic speech recognition tasks. SeMaScore leverages both the error rate and a more robust similarity score. We show that our algorithm's score generation improves upon the state-of-the-art BERTScore. Our experimental results show that SeMaScore corresponds well with expert human assessments, signal-to-noise ratio levels, and other natural language metrics. We outperform BERTScore by 41x in metric computation speed. Overall, we demonstrate that SeMaScore serves as a more dependable evaluation metric, particularly in real-world situations involving atypical speech patterns.

AS · Dec 7, 2021
Training end-to-end speech-to-text models on mobile phones

Zitha S, Raghavendra Rao Suresh, Pooja Rao et al.

Training state-of-the-art speech-to-text (STT) models on mobile devices is challenging due to their limited resources relative to a server environment. In addition, these models are trained on generic datasets that are not exhaustive in capturing user-specific characteristics. Recently, on-device personalization techniques have been making strides in mitigating the problem. Although many current works have already explored the effectiveness of on-device personalization, the majority of their findings are limited to simulation settings or a specific smartphone. In this paper, we develop and provide a detailed explanation of our framework to train end-to-end models on mobile phones. To keep it simple, we considered a model based on connectionist temporal classification (CTC) loss. We evaluated the framework on various mobile phones from different brands and reported the results. We provide enough evidence that fine-tuning the models and choosing the right hyperparameter values is a trade-off between the lowest WER achievable, training time on-device, and memory consumption. Hence, this is vital for a successful deployment of on-device training onto a resource-limited environment like mobile phones. We use training sets from speakers with different accents and record a 7.6% decrease in average word error rate (WER). We also report the associated computational cost measurements with respect to time, memory usage, and CPU utilization on mobile phones in real-time.

LG · Apr 13, 2016
Animation and Chirplet-Based Development of a PIR Sensor Array for Intruder Classification in an Outdoor Environment

Raviteja Upadrashta, Tarun Choubisa, A. Praneeth et al.

This paper presents the development of a passive infra-red sensor tower platform along with a classification algorithm to distinguish between human intrusion, animal intrusion and clutter arising from wind-blown vegetative movement in an outdoor environment. The research was aimed at exploring the potential use of wireless sensor networks as an early-warning system to help mitigate human-wildlife conflicts occurring at the edge of a forest. There are three important features to the development. Firstly, the sensor platform employs multiple sensors arranged in the form of a two-dimensional array to give it a key spatial-resolution capability that aids in classification. Secondly, given the challenges of collecting data involving animal intrusion, an Animation-based Simulation tool for Passive Infra-Red sEnsor (ASPIRE) was developed that simulates signals corresponding to human and animal intrusion and some limited models of vegetative clutter. This sped up the process of algorithm development by allowing us to test different hypotheses in a time-efficient manner. Finally, a chirplet-based model for the intruder signal was developed that significantly helped boost classification accuracy despite drawing data from a smaller number of sensors. An SVM-based classifier was used which made use of chirplet, energy and signal cross-correlation-based features. The average accuracy obtained for intruder detection and classification on real-world and simulated data sets was in excess of 97%.

SE · May 21, 2012
Examining the Impact of Platform Properties on Quality Attributes

Balwinder Sodhi, T. V. Prabhakar

We examine and bring out the architecturally significant characteristics of various virtualization and cloud-oriented platforms. The impact of such characteristics on the ability of guest applications to achieve various quality attributes (QA) has also been determined by examining the existing body of architecture knowledge. We observe from our findings that efficiency, resource elasticity and security are among the most impacted QAs, and virtualization platforms exhibit the maximum impact on various QAs.