79.2 CL Apr 21
Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India
Kaushal Bhogale, Manas Dhir, Amritansh Walecha et al.
Existing Indic ASR benchmarks often use scripted, clean speech and leaderboard-driven evaluation that encourages dataset-specific overfitting. In addition, strict single-reference WER penalizes natural spelling variation in Indian languages, including non-standardized spellings of code-mixed words of English origin. To address these limitations, we introduce Voice of India, a closed-source benchmark built from unscripted telephonic conversations covering 15 major Indian languages across 139 regional clusters. The dataset contains 306,230 utterances, totaling 536 hours of speech from 36,691 speakers, with transcripts that account for spelling variations. We also analyze performance geographically at the district level, revealing disparities across regions. Finally, we provide detailed analysis across factors such as audio quality, speaking rate, gender, and device type, highlighting where current ASR systems struggle and offering insights for improving real-world Indic ASR systems.
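To illustrate why transcripts that account for spelling variation matter, here is a minimal sketch of a variant-aware word error rate. It assumes each reference position carries a set of acceptable spellings; the function name `variant_aware_wer`, that data layout, and the romanized example words are hypothetical and not taken from the benchmark, whose actual scoring procedure is not described in the abstract.

```python
from typing import List, Set


def variant_aware_wer(reference: List[Set[str]], hypothesis: List[str]) -> float:
    """Word error rate where each reference slot lists acceptable spellings.

    Assumption (not from the source): a hypothesis word counts as correct
    if it matches any spelling in the aligned reference slot.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one word slot")
    # prev[j] holds the edit distance between the reference slots seen so far
    # and the first j hypothesis words.
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            sub_cost = 0 if hypothesis[j - 1] in reference[i - 1] else 1
            curr[j] = min(
                prev[j] + 1,             # deletion of a reference slot
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + sub_cost,  # match or substitution
            )
        prev = curr
    return prev[m] / n


# Hypothetical romanized example with two spelling-variant slots.
hypothesis = ["mera", "fon", "kharaab", "hai"]
strict = [{"mera"}, {"phone"}, {"kharab"}, {"hai"}]
lenient = [{"mera"}, {"phone", "fon"}, {"kharab", "kharaab"}, {"hai"}]
print(variant_aware_wer(strict, hypothesis))   # two substitutions out of four words
print(variant_aware_wer(lenient, hypothesis))  # both variants accepted
```

With a single strict reference the two variant spellings are scored as substitutions, while the variant-aware reference accepts them.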
53.1 QUANT-PH Apr 3
Recurrent Quantum Feature Maps for Reservoir Computing
Utkarsh Singh, Aaron Z. Goldberg, Christoph Simon et al.
Reservoir computing promises a fast method for handling large amounts of temporal data. This hinges on constructing a good reservoir--a dynamical system capable of transforming inputs into a high-dimensional representation while remembering properties of earlier data. In this work, we introduce a reservoir based on recurrent quantum feature maps where a fixed quantum circuit is reused to encode both current inputs and a classical feedback signal derived from previous outputs. We evaluate the model on the Mackey-Glass time-series prediction task using our recently introduced CP feature map, and find that it achieves lower mean squared error than standard classical baselines, including echo state networks and multilayer perceptrons, while maintaining compact circuit depth and qubit requirements. We further analyze memory capacity and show that the model effectively retains temporal information, consistent with its forecasting accuracy. Finally, we study the impact of realistic noise and find that performance is robust to several noise channels but remains sensitive to two-qubit gate errors, identifying a key limitation for near-term implementations.
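For readers unfamiliar with the Mackey-Glass benchmark mentioned above, the sketch below generates the series with a simple unit-step Euler discretization of the delay differential equation. The parameter values (tau = 17, beta = 0.2, gamma = 0.1, n = 10) are the commonly used chaotic-regime settings and are an assumption here; the abstract does not state which settings or integration scheme the authors use, and the function name `mackey_glass` is hypothetical.

```python
from typing import List


def mackey_glass(length: int, tau: int = 17, beta: float = 0.2,
                 gamma: float = 0.1, n: int = 10, x0: float = 1.2) -> List[float]:
    """Generate a Mackey-Glass series with a unit-step Euler update.

    Update rule: x[t+1] = x[t] + beta * x[t-tau] / (1 + x[t-tau]**n) - gamma * x[t].
    The history before t = 0 is assumed constant and equal to x0.
    """
    history = [x0] * (tau + 1)
    for _ in range(length):
        x_t = history[-1]
        x_delayed = history[-1 - tau]
        history.append(x_t + beta * x_delayed / (1.0 + x_delayed ** n) - gamma * x_t)
    # Drop the constant warm-up history and return only the generated samples.
    return history[tau + 1:]


series = mackey_glass(500)
# One-step-ahead prediction pairs: each input sample is paired with the next value.
inputs, targets = series[:-1], series[1:]
print(len(series), min(series) > 0.0, max(series) < 1.5)
```

A reservoir model, classical or quantum, would then be trained to map `inputs` to `targets`; that part is left out because the abstract does not specify the circuit or readout.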
QUANT-PH Feb 1, 2024
Coherent Feed Forward Quantum Neural Network
Utkarsh Singh, Aaron Z. Goldberg, Khabat Heshami
Quantum machine learning, focusing on quantum neural networks (QNNs), remains a largely uncharted field of study. Current QNN models primarily employ variational circuits on an ansatz or a quantum feature map, often requiring multiple entanglement layers. This methodology not only increases the computational cost of the circuit beyond what is practical on near-term quantum devices but also misleadingly labels these models as neural networks, given their divergence from the structure of a typical feed-forward neural network (FFNN). Moreover, the circuit depth and qubit needs of these models scale poorly with the number of data features, resulting in an efficiency challenge for real-world machine-learning tasks. We introduce a bona fide QNN model that matches the versatility of a traditional FFNN in terms of its adaptable intermediate layers and nodes, and that is free of intermediate measurements, so that the entire model is coherent. The model requires a lower circuit depth and fewer C-NOT gates than prevailing QNN models while outperforming them. Furthermore, the qubit count in our model is independent of the number of data features. We test our proposed model on various benchmarking datasets such as the diagnostic breast cancer (Wisconsin) and credit card fraud detection datasets. We compare the outcomes of our model with those of existing QNN methods to showcase the efficacy of our approach, even with a reduced requirement on quantum resources. Our model paves the way for the application of quantum neural networks to real, relevant machine learning problems.
CV Jan 5
Car Drag Coefficient Prediction from 3D Point Clouds Using a Slice-Based Surrogate Model
Utkarsh Singh, Absaar Ali, Adarsh Roy
The automotive industry's pursuit of enhanced fuel economy and performance necessitates efficient aerodynamic design. However, traditional evaluation methods such as computational fluid dynamics (CFD) and wind tunnel testing are resource intensive, hindering rapid iteration in the early design stages. Machine learning-based surrogate models offer a promising alternative, yet many existing approaches suffer from high computational complexity, limited interpretability, or insufficient accuracy for detailed geometric inputs. This paper introduces a novel lightweight surrogate model for predicting the aerodynamic drag coefficient ($C_d$) based on sequential slice-wise processing of the 3D vehicle geometry. Inspired by medical imaging, 3D point clouds of vehicles are decomposed into an ordered sequence of 2D cross-sectional slices along the stream-wise axis. Each slice is encoded by a lightweight PointNet2D module, and the sequence of slice embeddings is processed by a bidirectional LSTM to capture longitudinal geometric evolution. The model, trained and evaluated on the DrivAerNet++ dataset, achieves a high coefficient of determination ($R^2 > 0.9528$) and a low mean absolute error (MAE $\approx 6.046 \times 10^{-3}$) in $C_d$ prediction. With an inference time of approximately 0.025 seconds per sample on a consumer-grade GPU, our approach provides fast, accurate, and interpretable aerodynamic feedback, facilitating more agile and informed automotive design exploration.
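The slicing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes the stream-wise axis is x, uses equal-width bins, and keeps each slice as a list of (y, z) points. The function name `slice_point_cloud` is hypothetical, and the PointNet2D encoder and bidirectional LSTM are not reproduced here.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def slice_point_cloud(points: List[Point3D], num_slices: int) -> List[List[Point2D]]:
    """Split a 3D point cloud into ordered 2D cross-sections along x.

    Assumption (not from the source): x is the stream-wise axis and slices
    are equal-width bins between the minimum and maximum x coordinates.
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    slices: List[List[Point2D]] = [[] for _ in range(num_slices)]
    if not points:
        return slices
    xs = [p[0] for p in points]
    x_min, x_max = min(xs), max(xs)
    # Fall back to a unit width when all points share the same x coordinate.
    width = (x_max - x_min) / num_slices or 1.0
    for x, y, z in points:
        # Clamp so that points at x_max land in the last slice.
        idx = min(int((x - x_min) / width), num_slices - 1)
        slices[idx].append((y, z))
    return slices


cloud = [(0.0, 1.0, 2.0), (0.4, 3.0, 4.0), (1.0, 5.0, 6.0), (2.0, 7.0, 8.0)]
front, rear = slice_point_cloud(cloud, num_slices=2)
print(front)  # points with x in [0, 1)
print(rear)   # points with x in [1, 2]
```

Each slice in the returned list would then be passed, in order, to a per-slice encoder and a sequence model.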
CL Aug 6, 2025
Lightweight Transformers for Zero-Shot and Fine-Tuned Text-to-SQL Generation Using Spider
Chirag Seth, Utkarsh Singh
Text-to-SQL translation enables non-expert users to query relational databases using natural language, with applications in education and business intelligence. This study evaluates three lightweight transformer models - T5-Small, BART-Small, and GPT-2 - on the Spider dataset, focusing on low-resource settings. We developed a reusable, model-agnostic pipeline that tailors schema formatting to each model's architecture, training them across 1000 to 5000 iterations and evaluating on 1000 test samples using Logical Form Accuracy (LFAcc), BLEU, and Exact Match (EM) metrics. Fine-tuned T5-Small achieves the highest LFAcc (27.8%), outperforming BART-Small (23.98%) and GPT-2 (20.1%), highlighting encoder-decoder models' superiority in schema-aware SQL generation. Despite resource constraints limiting performance, our pipeline's modularity supports future enhancements, such as advanced schema linking or alternative base models. This work underscores the potential of compact transformers for accessible text-to-SQL solutions in resource-scarce environments.
QUANT-PH Jul 4, 2025
A Resource Efficient Quantum Kernel
Utkarsh Singh, Jean-Frédéric Laprade, Aaron Z. Goldberg et al.
Quantum processors may enhance machine learning by mapping high-dimensional data onto quantum systems for processing. Conventional feature maps for encoding data onto a quantum circuit are currently impractical, as the number of entangling gates scales quadratically with the dimension of the dataset and the number of qubits. In this work, we introduce a quantum feature map designed to handle high-dimensional data with a significantly reduced number of qubits and entangling operations. Our approach preserves essential data characteristics while promoting computational efficiency, as evidenced by extensive experiments on benchmark datasets that demonstrate a marked improvement in both accuracy and resource utilization when using our feature map as a kernel for characterization, as compared to state-of-the-art quantum feature maps. Our noisy simulation results, combined with lower resource requirements, highlight our map's ability to function within the constraints of noisy intermediate-scale quantum devices. Through numerical simulations and a small-scale implementation on a superconducting circuit quantum computing platform, we demonstrate that our scheme performs on par with or better than a set of classical algorithms for classification. While quantum kernels are typically stymied by exponential concentration, our approach is affected at a slower rate with respect to both the number of qubits and the number of features, which allows practical applications to remain within reach. Our findings herald a promising avenue for the practical implementation of quantum machine learning algorithms on near-future quantum computing platforms.
SY Oct 20, 2020
Monitoring Large Crowds With WiFi: A Privacy-Preserving Approach
Jean-François Determe, Sophia Azzagnuni, Utkarsh Singh et al.
This paper presents a crowd monitoring system based on the passive detection of probe requests. The system meets strict privacy requirements and is suited to monitoring events or buildings with at least a few hundred attendees. We present our counting process and an associated mathematical model. From this model, we derive a concentration inequality that highlights the accuracy of our crowd count estimator. Then, we describe our system. We present and discuss our sensor hardware, our computing system architecture, and an efficient implementation of our counting algorithm, as well as its space and time complexity. We also show how our system ensures the privacy of people in the monitored area. Finally, we validate our system using nine weeks of data from a public library equipped with a camera-based counting system, which generates counts against which we compare those of our counting system. This comparison empirically quantifies the accuracy of our counting system, thereby showing it to be suitable for monitoring public areas. Similarly, the concentration inequality provides a theoretical validation of the system.
APP-PH Oct 28, 2019
Comparing domain wall synapse with other Non Volatile Memory devices for on-chip learning in Analog Hardware Neural Network
Divya Kaushik, Utkarsh Singh, Upasana Sahu et al.
Resistive Random Access Memory (RRAM) and Phase Change Memory (PCM) devices have been widely used as synapses in crossbar-array-based analog Neural Network (NN) circuits to achieve more energy- and time-efficient data classification than conventional computers. Here we demonstrate the advantages of the recently proposed spin-orbit-torque-driven Domain Wall (DW) device as a synapse, compared to RRAM and PCM devices, with respect to on-chip learning (training in hardware) in such NNs. The synaptic characteristic of the DW synapse, which we obtain from micromagnetic modeling, turns out to be much more linear and symmetric (between positive and negative updates) than that of RRAM and PCM synapses. This makes the design of peripheral analog circuits for on-chip learning much easier in a DW-synapse-based NN than in one based on RRAM or PCM synapses. We next incorporate the DW synapse as a Verilog-A model in the crossbar-array-based NN circuit that we design in a SPICE circuit simulator. Successful on-chip learning is demonstrated through SPICE simulations on the popular Fisher's Iris dataset. The time and energy required for learning turn out to be orders of magnitude lower for the DW-synapse-based NN circuit than for RRAM- and PCM-synapse-based NN circuits.
NE Jul 1, 2019
On-chip learning in a conventional silicon MOSFET based Analog Hardware Neural Network
Nilabjo Dey, Janak Sharda, Utkarsh Saxena et al.
On-chip learning in a crossbar-array-based analog hardware Neural Network (NN) has been shown to have major advantages in terms of speed and energy compared to training an NN on a traditional computer. However, analog hardware NN proposals and implementations thus far have mostly involved Non Volatile Memory (NVM) devices like Resistive Random Access Memory (RRAM), Phase Change Memory (PCM), spintronic devices, or floating gate transistors as synapses. Fabricating systems based on RRAM, PCM, or spintronic devices requires in-house laboratory facilities and cannot be done through merchant foundries, unlike conventional silicon-based CMOS chips. Floating gate transistors need large voltage pulses for weight update, making on-chip learning in such systems energy inefficient. This paper proposes and implements, through SPICE simulations, on-chip learning in an analog hardware NN using only conventional silicon-based MOSFETs (without any floating gate) as synapses, since they are easy to fabricate. We first model the synaptic characteristic of our single-transistor synapse using a SPICE circuit simulator and benchmark it against experimentally obtained current-voltage characteristics of a transistor. Next, we design a Fully Connected Neural Network (FCNN) crossbar array using such transistor synapses. We also design analog peripheral circuits for the neuron and for synaptic weight update calculation, needed for on-chip learning, again using conventional transistors. Simulating the entire system in the SPICE simulator, we obtain high training and test accuracy on the standard Fisher's Iris dataset, widely used in machine learning. We also compare the speed and energy performance of our transistor-based implementation of an analog hardware NN with some previous NN implementations based on NVM devices and show comparable performance with respect to on-chip learning.