HCFeb 11, 2023
Synthesizing Human Gaze Feedback for Improved NLP PerformanceVarun Khurana, Yaman Kumar Singla, Nora Hollenstein et al. · eth-zurich
Integrating human feedback in models can improve the performance of natural language processing (NLP) models. Feedback can be either explicit (e.g. ranking used in training language models) or implicit (e.g. using human cognitive signals in the form of eyetracking). Prior eye tracking and NLP research reveal that cognitive processes, such as human scanpaths, gleaned from human gaze patterns aid in the understanding and performance of NLP models. However, the collection of real eyetracking data for NLP tasks is challenging due to the requirement of expensive and precise equipment coupled with privacy invasion issues. To address this challenge, we propose ScanTextGAN, a novel model for generating human scanpaths over text. We show that ScanTextGAN-generated scanpaths can approximate meaningful cognitive signals in human gaze patterns. We include synthetically generated scanpaths in four popular NLP tasks spanning six different datasets as proof of concept and show that the models augmented with generated scanpaths improve the performance of all downstream NLP tasks.
CROct 2, 2022
GANTouch: An Attack-Resilient Framework for Touch-based Continuous Authentication SystemMohit Agrawal, Pragyan Mehrotra, Rajesh Kumar et al.
Previous studies have shown that commonly studied (vanilla) implementations of touch-based continuous authentication systems (V-TCAS) are susceptible to active adversarial attempts. This study presents a novel Generative Adversarial Network assisted TCAS (G-TCAS) framework and compares it to the V-TCAS under three active adversarial environments viz. Zero-effort, Population, and Random-vector. The Zero-effort environment was implemented in two variations viz. Zero-effort (same-dataset) and Zero-effort (cross-dataset). The first involved a Zero-effort attack from the same dataset, while the second used three different datasets. G-TCAS showed more resilience than V-TCAS under the Population and Random-vector, the more damaging adversarial scenarios than the Zero-effort. On average, the increase in the false accept rates (FARs) for V-TCAS was much higher (27.5% and 21.5%) than for G-TCAS (14% and 12.5%) for Population and Random-vector attacks, respectively. Moreover, we performed a fairness analysis of TCAS for different genders and found TCAS to be fair across genders. The findings suggest that we should evaluate TCAS under active adversarial environments and affirm the usefulness of GANs in the TCAS pipeline.
SEApr 13Code
AgentForge: Execution-Grounded Multi-Agent LLM Framework for Autonomous Software EngineeringRajesh Kumar, Waqar Ali, Junaid Ahmed et al.
Large language models generate plausible code but cannot verify correctness. Existing multi-agent systems simulate execution or leave verification optional. We introduce execution-grounded verification as a first-class principle: every code change must survive sandboxed execution before propagation. We instantiate this principle in AGENTFORGE, a multi-agent framework where Planner, Coder, Tester, Debugger, and Critic agents coordinate through shared memory and a mandatory Docker sandbox. We formalize software engineering with LLMs as an iterative decision process over repository states, where execution feedback provides a stronger supervision signal than next-token likelihood. AGENTFORGE achieves 40.0\% resolution on SWE-BENCH Lite, outperforming single-agent baselines by 26--28 points. Ablations confirm that execution feedback and role decomposition each independently drive performance. The framework is open-source at https://github.com/raja21068/AutoCodeAI.
CROct 2, 2022
iCTGAN--An Attack Mitigation Technique for Random-vector Attack on Accelerometer-based Gait Authentication SystemsJun Hyung Mo, Rajesh Kumar
A recent study showed that commonly (vanilla) studied implementations of accelerometer-based gait authentication systems ($v$ABGait) are susceptible to random-vector attack. The same study proposed a beta noise-assisted implementation ($β$ABGait) to mitigate the attack. In this paper, we assess the effectiveness of the random-vector attack on both $v$ABGait and $β$ABGait using three accelerometer-based gait datasets. In addition, we propose $i$ABGait, an alternative implementation of ABGait, which uses a Conditional Tabular Generative Adversarial Network. Then we evaluate $i$ABGait's resilience against the traditional zero-effort and random-vector attacks. The results show that $i$ABGait mitigates the impact of the random-vector attack to a reasonable extent and outperforms $β$ABGait in most experimental settings.
HCApr 14
Detecting LLM-Assisted Academic Dishonesty using Keystroke DynamicsAtharva Mehta, Rajesh Kumar, Aman Singla et al.
The rapid adoption of generative AI tools has heightened concerns regarding academic integrity, as students increasingly engage in dishonest practices by copying or paraphrasing AI-generated content. Existing plagiarism detection systems, which rely primarily on text-intrinsic features, are ineffective at identifying AI-assisted or paraphrased submissions. Our prior conference work introduced a behavioral detection approach that leverages how text is produced, captured through keystroke dynamics, in addition to what is written, enabling discrimination between genuine and assisted writing. That study, conducted on keystroke data from 40 participants, demonstrated promising performance. This paper substantially extends and systemizes the prior work by: (1) expanding the dataset with 90 additional participants and introducing an explicit paraphrasing condition to model realistic plagiarism strategies; (2) formalizing a threat model and evaluating detection under adversarial and deception-oriented scenarios; and (3) performing a comprehensive empirical comparison against state-of-the-art text-only detectors and human evaluators. Experimental results demonstrate that keystroke-based models significantly outperform text-based approaches in practical deployment settings, while revealing limitations under more challenging adversarial conditions.
CRSep 21, 2023
Dictionary Attack on IMU-based Gait AuthenticationRajesh Kumar, Can Isik, Chilukuri K. Mohan
We present a novel adversarial model for authentication systems that use gait patterns recorded by the inertial measurement unit (IMU) built into smartphones. The attack idea is inspired by and named after the concept of a dictionary attack on knowledge (PIN or password) based authentication systems. In particular, this work investigates whether it is possible to build a dictionary of IMUGait patterns and use it to launch an attack or find an imitator who can actively reproduce IMUGait patterns that match the target's IMUGait pattern. Nine physically and demographically diverse individuals walked at various levels of four predefined controllable and adaptable gait factors (speed, step length, step width, and thigh-lift), producing 178 unique IMUGait patterns. Each pattern attacked a wide variety of user authentication models. The deeper analysis of error rates (before and after the attack) challenges the belief that authentication systems based on IMUGait patterns are the most difficult to spoof; further research is needed on adversarial models and associated countermeasures.
CVApr 8
VGGT-SLAM++Avilasha Mandal, Rajesh Kumar, Sudarshan Sunil Harithas et al.
We introduce VGGT-SLAM++, a complete visual SLAM system that leverages the geometry-rich outputs of the Visual Geometry Grounded Transformer (VGGT). The system comprises a visual odometry (front-end) fusing the VGGT feed-forward transformer and a Sim(3) solution, a Digital Elevation Map (DEM)-based graph construction module, and a back-end that jointly enable accurate large-scale mapping with bounded memory. While prior transformer-based SLAM pipelines such as VGGT-SLAM rely primarily on sparse loop closures or global Sim(3) manifold constraints - allowing short-horizon pose drift - VGGT-SLAM++ restores high-cadence local bundle adjustment (LBA) through a spatially corrective back-end. For each VGGT submap, we construct a dense planar-canonical DEM, partition it into patches, and compute their DINOv2 embeddings to integrate the submap into a covisibility graph. Spatial neighbors are retrieved using a Visual Place Recognition (VPR) module within the covisibility window, triggering frequent local optimization that stabilizes trajectories. Across standard SLAM benchmarks, VGGT-SLAM++ achieves state-of-the-art accuracy, substantially reducing short-term drift, accelerating graph convergence, and maintaining global consistency with compact DEM tiles and sublinear retrieval.
SEMay 11
Deterministic vs. LLM-Controlled Orchestration for COBOL-to-Python ModernizationNaing Oo Lwin, Rajesh Kumar
Modernizing legacy COBOL systems remains difficult due to scarce expertise, large and long-lived codebases, and strict correctness requirements. Recent large language model (LLM)-based modernization systems increasingly rely on agentic workflows in which the model controls multi-step tool execution. However, it remains unclear whether delegating execution control to the LLM improves correctness, robustness, or efficiency in structured software engineering workflows. We present a controlled empirical study of deterministic and LLM-controlled orchestration for COBOL-to-Python modernization. Using a unified experimental framework, we hold the language models, prompts, tools, configurations, and source programs constant while varying only the execution control strategy. This isolates orchestration as the sole experimental variable. We evaluate both approaches using functional correctness, robustness across repeated stochastic runs, and computational efficiency. Across multiple models, deterministic orchestration achieves comparable computational accuracy to LLM-controlled orchestration while improving worst-case robustness and reducing performance variability across runs. Deterministic execution also reduces token consumption by up to 3.5x, leading to substantially lower operational cost. These results suggest that, in structured modernization workflows with explicit validation stages, fixed execution policies provide more stable and cost-efficient behavior than fully agentic orchestration without reducing translation quality.
ROMar 29, 2022
Sparse Image based Navigation Architecture to Mitigate the need of precise Localization in Mobile RobotsPranay Mathur, Rajesh Kumar, Sarthak Upadhyay
Traditional simultaneous localization and mapping (SLAM) methods focus on improvement in the robot's localization under environment and sensor uncertainty. This paper, however, focuses on mitigating the need for exact localization of a mobile robot to pursue autonomous navigation using a sparse set of images. The proposed method consists of a model architecture - RoomNet, for unsupervised learning resulting in a coarse identification of the environment and a separate local navigation policy for local identification and navigation. The former learns and predicts the scene based on the short term image sequences seen by the robot along with the transition image scenarios using long term image sequences. The latter uses sparse image matching to characterise the similarity of frames achieved vis-a-vis the frames viewed by the robot during the mapping and training stage. A sparse graph of the image sequence is created which is then used to carry out robust navigation purely on the basis of visual goals. The proposed approach is evaluated on two robots in a test environment and demonstrates the ability to navigate in dynamic environments where landmarks are obscured and classical localization methods fail.
LGFeb 12, 2025
Advancing machine fault diagnosis: A detailed examination of convolutional neural networksGovind Vashishtha, Sumika Chauhan, Mert Sehri et al.
The growing complexity of machinery and the increasing demand for operational efficiency and safety have driven the development of advanced fault diagnosis techniques. Among these, convolutional neural networks (CNNs) have emerged as a powerful tool, offering robust and accurate fault detection and classification capabilities. This comprehensive review delves into the application of CNNs in machine fault diagnosis, covering its theoretical foundation, architectural variations, and practical implementations. The strengths and limitations of CNNs are analyzed in this domain, discussing their effectiveness in handling various fault types, data complexities, and operational environments. Furthermore, we explore the evolving landscape of CNN-based fault diagnosis, examining recent advancements in data augmentation, transfer learning, and hybrid architectures. Finally, we highlight future research directions and potential challenges to further enhance the application of CNNs for reliable and proactive machine fault diagnosis.
LGJul 29, 2025
LLM-Assisted Cheating Detection in Korean Language via KeystrokesDong Hyun Roh, Rajesh Kumar, An Ngo
This paper presents a keystroke-based framework for detecting LLM-assisted cheating in Korean, addressing key gaps in prior research regarding language coverage, cognitive context, and the granularity of LLM involvement. Our proposed dataset includes 69 participants who completed writing tasks under three conditions: Bona fide writing, paraphrasing ChatGPT responses, and transcribing ChatGPT responses. Each task spans six cognitive processes defined in Bloom's Taxonomy (remember, understand, apply, analyze, evaluate, and create). We extract interpretable temporal and rhythmic features and evaluate multiple classifiers under both Cognition-Aware and Cognition-Unaware settings. Temporal features perform well under Cognition-Aware evaluation scenarios, while rhythmic features generalize better under cross-cognition scenarios. Moreover, detecting bona fide and transcribed responses was easier than paraphrased ones for both the proposed models and human evaluators, with the models significantly outperforming the humans. Our findings affirm that keystroke dynamics facilitate reliable detection of LLM-assisted writing across varying cognitive demands and writing strategies, including paraphrasing and transcribing LLM-generated responses.
CLJan 8, 2025
Enhancing Financial VQA in Vision Language Models using Intermediate Structured RepresentationsArchita Srivastava, Abhas Kumar, Rajesh Kumar et al.
Chart interpretation is crucial for visual data analysis, but accurately extracting information from charts poses significant challenges for automated models. This study investigates the fine-tuning of DEPLOT, a modality conversion module that translates the image of a plot or chart to a linearized table, on a custom dataset of 50,000 bar charts. The dataset comprises simple, stacked, and grouped bar charts, targeting the unique structural features of these visualizations. The finetuned DEPLOT model is evaluated against its base version using a test set of 1,000 images and two metrics: Relative Mapping Similarity (RMS), which measures categorical mapping accuracy, and Relative Number Set Similarity (RNSS), which evaluates numerical interpretation accuracy. To further explore the reasoning capabilities of large language models (LLMs), we curate an additional set of 100 bar chart images paired with question answer sets. Our findings demonstrate that providing a structured intermediate table alongside the image significantly enhances LLM reasoning performance compared to direct image queries.
CLSep 29, 2025
How Well Do LLMs Imitate Human Writing Style?Rebira Jemama, Rajesh Kumar
Large language models (LLMs) can generate fluent text, but their ability to replicate the distinctive style of a specific human author remains unclear. We present a fast, training-free framework for authorship verification and style imitation analysis. The method integrates TF-IDF character n-grams with transformer embeddings and classifies text pairs through empirical distance distributions, eliminating the need for supervised training or threshold tuning. It achieves 97.5\% accuracy on academic essays and 94.5\% in cross-domain evaluation, while reducing training time by 91.8\% and memory usage by 59\% relative to parameter-based baselines. Using this framework, we evaluate five LLMs from three separate families (Llama, Qwen, Mixtral) across four prompting strategies - zero-shot, one-shot, few-shot, and text completion. Results show that the prompting strategy has a more substantial influence on style fidelity than model size: few-shot prompting yields up to 23.5x higher style-matching accuracy than zero-shot, and completion prompting reaches 99.9\% agreement with the original author's style. Crucially, high-fidelity imitation does not imply human-like unpredictability - human essays average a perplexity of 29.5, whereas matched LLM outputs average only 15.2. These findings demonstrate that stylistic fidelity and statistical detectability are separable, establishing a reproducible basis for future work in authorship modeling, detection, and identity-conditioned generation.
AIAug 7, 2025
Large Language Models Transform Organic Synthesis From Reaction Prediction to AutomationKartar Kumar Lohana Tharwani, Rajesh Kumar, Sumita et al.
Large language models (LLMs) are beginning to reshape how chemists plan and run reactions in organic synthesis. Trained on millions of reported transformations, these text-based models can propose synthetic routes, forecast reaction outcomes and even instruct robots that execute experiments without human supervision. Here we survey the milestones that turned LLMs from speculative tools into practical lab partners. We show how coupling LLMs with graph neural networks, quantum calculations and real-time spectroscopy shrinks discovery cycles and supports greener, data-driven chemistry. We discuss limitations, including biased datasets, opaque reasoning and the need for safety gates that prevent unintentional hazards. Finally, we outline community initiatives open benchmarks, federated learning and explainable interfaces that aim to democratize access while keeping humans firmly in control. These advances chart a path towards rapid, reliable and inclusive molecular innovation powered by artificial intelligence and automation.
IVMay 18, 2024
Medical Image Analysis for Detection, Treatment and Planning of Disease using Artificial Intelligence ApproachesNand Lal Yadav, Satyendra Singh, Rajesh Kumar et al.
X-ray is one of the prevalent image modalities for the detection and diagnosis of the human body. X-ray provides an actual anatomical structure of an organ present with disease or absence of disease. Segmentation of disease in chest X-ray images is essential for the diagnosis and treatment. In this paper, a framework for the segmentation of X-ray images using artificial intelligence techniques has been discussed. Here data has been pre-processed and cleaned followed by segmentation using SegNet and Residual Net approaches to X-ray images. Finally, segmentation has been evaluated using well known metrics like Loss, Dice Coefficient, Jaccard Coefficient, Precision, Recall, Binary Accuracy, and Validation Accuracy. The experimental results reveal that the proposed approach performs better in all respect of well-known parameters with 16 batch size and 50 epochs. The value of validation accuracy, precision, and recall of SegNet and Residual Unet models are 0.9815, 0.9699, 0.9574, and 0.9901, 0.9864, 0.9750 respectively.
RONov 22, 2025
A Coordinated Dual-Arm Framework for Delicate Snap-Fit AssembliesShreyas Kumar, Barat S, Debojit Das et al.
Delicate snap-fit assemblies, such as inserting a lens into an eye-wear frame or during electronics assembly, demand timely engagement detection and rapid force attenuation to prevent overshoot-induced component damage or assembly failure. We address these challenges with two key contributions. First, we introduce SnapNet, a lightweight neural network that detects snap-fit engagement from joint-velocity transients in real-time, showing that reliable detection can be achieved using proprioceptive signals without external sensors. Second, we present a dynamical-systems-based dual-arm coordination framework that integrates SnapNet driven detection with an event-triggered impedance modulation, enabling accurate alignment and compliant insertion during delicate snap-fit assemblies. Experiments across diverse geometries on a heterogeneous bimanual platform demonstrate high detection accuracy (over 96% recall) and up to a 30% reduction in peak impact forces compared to standard impedance control.
SIJul 21, 2025
Weak Links in LinkedIn: Enhancing Fake Profile Detection in the Age of LLMsApoorva Gulati, Rajesh Kumar, Vinti Agarwal et al.
Large Language Models (LLMs) have made it easier to create realistic fake profiles on platforms like LinkedIn. This poses a significant risk for text-based fake profile detectors. In this study, we evaluate the robustness of existing detectors against LLM-generated profiles. While highly effective in detecting manually created fake profiles (False Accept Rate: 6-7%), the existing detectors fail to identify GPT-generated profiles (False Accept Rate: 42-52%). We propose GPT-assisted adversarial training as a countermeasure, restoring the False Accept Rate to between 1-7% without impacting the False Reject Rates (0.5-2%). Ablation studies revealed that detectors trained on combined numerical and textual embeddings exhibit the highest robustness, followed by those using numerical-only embeddings, and lastly those using textual-only embeddings. Complementary analysis on the ability of prompt-based GPT-4Turbo and human evaluators affirms the need for robust automated detectors such as the one proposed in this study.
CVJun 21, 2024
Keystroke Dynamics Against Academic Dishonesty in the Age of LLMsDebnath Kundu, Atharva Mehta, Rajesh Kumar et al.
The transition to online examinations and assignments raises significant concerns about academic integrity. Traditional plagiarism detection systems often struggle to identify instances of intelligent cheating, particularly when students utilize advanced generative AI tools to craft their responses. This study proposes a keystroke dynamics-based method to differentiate between bona fide and assisted writing within academic contexts. To facilitate this, a dataset was developed to capture the keystroke patterns of individuals engaged in writing tasks, both with and without the assistance of generative AI. The detector, trained using a modified TypeNet architecture, achieved accuracies ranging from 74.98% to 85.72% in condition-specific scenarios and from 52.24% to 80.54% in condition-agnostic scenarios. The findings highlight significant differences in keystroke dynamics between genuine and assisted writing. The outcomes of this study enhance our understanding of how users interact with generative AI and have implications for improving the reliability of digital educational platforms.
SPMay 6, 2024
MEET: Mixture of Experts Extra Tree-Based sEMG Hand Gesture IdentificationNaveen Gehlot, Ashutosh Jena, Rajesh Kumar et al.
Artificial intelligence (AI) has made significant advances in recent years and opened up new possibilities in exploring applications in various fields such as biomedical, robotics, education, industry, etc. Among these fields, human hand gesture recognition is a subject of study that has recently emerged as a research interest in robotic hand control using electromyography (EMG). Surface electromyography (sEMG) is a primary technique used in EMG, which is popular due to its non-invasive nature and is used to capture gesture movements using signal acquisition devices placed on the surface of the forearm. Moreover, these signals are pre-processed to extract significant handcrafted features through time and frequency domain analysis. These are helpful and act as input to machine learning (ML) models to identify hand gestures. However, handling multiple classes and biases are major limitations that can affect the performance of an ML model. Therefore, to address this issue, a new mixture of experts extra tree (MEET) model is proposed to identify more accurate and effective hand gesture movements. This model combines individual ML models referred to as experts, each focusing on a minimal class of two. Moreover, a fully trained model known as the gate is employed to weigh the output of individual expert models. This amalgamation of the expert models with the gate model is known as a mixture of experts extra tree (MEET) model. In this study, four subjects with six hand gesture movements have been considered and their identification is evaluated among eleven models, including the MEET classifier. Results elucidate that the MEET classifier performed best among other algorithms and identified hand gesture movement accurately.
CLFeb 19, 2024
Can LLMs Compute with Reasons?Harshit Sandilya, Peehu Raj, Jainit Sushil Bafna et al.
Large language models (LLMs) often struggle with complex mathematical tasks, prone to "hallucinating" incorrect answers due to their reliance on statistical patterns. This limitation is further amplified in average Small LangSLMs with limited context and training data. To address this challenge, we propose an "Inductive Learning" approach utilizing a distributed network of SLMs. This network leverages error-based learning and hint incorporation to refine the reasoning capabilities of SLMs. Our goal is to provide a framework that empowers SLMs to approach the level of logic-based applications achieved by high-parameter models, potentially benefiting any language model. Ultimately, this novel concept paves the way for bridging the logical gap between humans and LLMs across various fields.
ROOct 15, 2021
Attention-based Estimation and Prediction of Human Intent to augment Haptic Glove aided Control of Robotic HandMuneeb Ahmed, Rajesh Kumar, Qaim Abbas et al.
The letter focuses on Haptic Glove (HG) based control of a Robotic Hand (RH) executing in-hand manipulation of certain objects of interest. The high dimensional motion signals in HG and RH possess intrinsic variability of kinematics resulting in difficulty to establish a direct mapping of the motion signals from HG onto the RH. An estimation mechanism is proposed to quantify the motion signal acquired from the human controller in relation to the intended goal pose of the object being held by the robotic hand. A control algorithm is presented to transform the synthesized intent at the RH and allow relocation of the object to the expected goal pose. The lag in synthesis of the intent in the presence of communication delay leads to a requirement of predicting the estimated intent. We leverage an attention-based convolutional neural network encoder to predict the trajectory of intent for a certain lookahead to compensate for the delays. The proposed methodology is evaluated across objects of different shapes, mass, and materials. We present a comparative performance of the estimation and prediction mechanisms on 5G-driven real-world robotic setup against benchmark methodologies. The test-MSE in prediction of human intent is reported to yield ~ 97.3 -98.7% improvement of accuracy in comparison to LSTM-based benchmark
CRJun 15, 2021
Defending Touch-based Continuous Authentication Systems from Active Adversaries Using Generative Adversarial NetworksMohit Agrawal, Pragyan Mehrotra, Rajesh Kumar et al.
Previous studies have demonstrated that commonly studied (vanilla) touch-based continuous authentication systems (V-TCAS) are susceptible to population attack. This paper proposes a novel Generative Adversarial Network assisted TCAS (G-TCAS) framework, which showed more resilience to the population attack. G-TCAS framework was tested on a dataset of 117 users who interacted with a smartphone and tablet pair. On average, the increase in the false accept rates (FARs) for V-TCAS was much higher (22%) than G-TCAS (13%) for the smartphone. Likewise, the increase in the FARs for V-TCAS was 25% compared to G-TCAS (6%) for the tablet.
HCJun 14, 2021
Communication is the universal solvent: atreya bot -- an interactive bot for chemical scientistsMahak Sharma, Abhishek Kaushik, Rajesh Kumar et al.
Conversational agents are a recent trend in human-computer interaction, deployed in multidisciplinary applications to assist the users. In this paper, we introduce "Atreya", an interactive bot for chemistry enthusiasts, researchers, and students to study the ChEMBL database. Atreya is hosted by Telegram, a popular cloud-based instant messaging application. This user-friendly bot queries the ChEMBL database, retrieves the drug details for a particular disease, targets associated with that drug, etc. This paper explores the potential of using a conversational agent to assist chemistry students and chemical scientist in complex information seeking process.
CRApr 22, 2021
Blockchain based Privacy-Preserved Federated Learning for Medical Images: A Case Study of COVID-19 CT ScansRajesh Kumar, WenYong Wang, Cheng Yuan et al.
Medical health care centers are envisioned as a promising paradigm to handle the massive volume of data of COVID-19 patients using artificial intelligence (AI). Traditionally, AI techniques often require centralized data collection and training the model in a single organization, which is most common weakness due to the privacy and security of raw data communication. To solve this challenging task, we propose a blockchain-based federated learning framework that provides collaborative data training solutions by coordinating multiple hospitals to train and share encrypted federated models without leakage of data privacy. The blockchain ledger technology provides the decentralization of federated learning model without any central server. The proposed homomorphic encryption scheme encrypts and decrypts the gradients of model to preserve the privacy. More precisely, the proposed framework: i) train the local model by a novel capsule network to segmentation and classify COVID-19 images, ii) then use the homomorphic encryption scheme to secure the local model that encrypts and decrypts the gradients, and finally the model is shared over a decentralized platform through proposed blockchain-based federated learning algorithm. The integration of blockchain and federated learning leads to a new paradigm for medical image data sharing in the decentralized network. The conducted experimental resultsdemonstrate the performance of the proposed scheme.
CRFeb 26, 2021
Collective Intelligence: Decentralized Learning for Android Malware Detection in IoT with BlockchainRajesh Kumar, WenYong Wang, Jay Kumar et al.
The widespread significance of Android IoT devices is due to its flexibility and hardware support features which revolutionized the digital world by introducing exciting applications almost in all walks of daily life, such as healthcare, smart cities, smart environments, safety, remote sensing, and many more. Such versatile applicability gives incentive for more malware attacks. In this paper, we propose a framework which continuously aggregates multiple user trained models on non-overlapping data into single model. Specifically for malware detection task, (i) we propose a novel user (local) neural network (LNN) which trains on local distribution and (ii) then to assure the model authenticity and quality, we propose a novel smart contract which enable aggregation process over blokchain platform. The LNN model analyzes various static and dynamic features of both malware and benign whereas the smart contract verifies the malicious applications both for uploading and downloading processes in the network using stored aggregated features of local models. In this way, the proposed model not only improves malware detection accuracy using decentralized model network but also model efficacy with blockchain. We evaluate our approach with three state-of-the-art models and performed deep analyses of extracted features of the relative model.
CVFeb 19, 2021
Trends in Vehicle Re-identification Past, Present, and Future: A Comprehensive ReviewZakria, Jianhua Deng, Muhammad Saddam Khokhar et al.
Vehicle Re-identification (re-id) over surveillance camera network with non-overlapping field of view is an exciting and challenging task in intelligent transportation systems (ITS). Due to its versatile applicability in metropolitan cities, it gained significant attention. Vehicle re-id matches targeted vehicle over non-overlapping views in multiple camera network. However, it becomes more difficult due to inter-class similarity, intra-class variability, viewpoint changes, and spatio-temporal uncertainty. In order to draw a detailed picture of vehicle re-id research, this paper gives a comprehensive description of the various vehicle re-id technologies, applicability, datasets, and a brief comparison of different methodologies. Our paper specifically focuses on vision-based vehicle re-id approaches, including vehicle appearance, license plate, and spatio-temporal characteristics. In addition, we explore the main challenges as well as a variety of applications in different domains. Lastly, a detailed comparison of current state-of-the-art methods performances over VeRi-776 and VehicleID datasets is summarized with future directions. We aim to facilitate future research by reviewing the work being done on vehicle re-id till to date.
ROJan 29, 2021
A note on synthesizing geodesic based contact curvesRajesh Kumar, Sudipto Mukherjee
The paper focuses on synthesizing optimal contact curves that can be used to ensure a rolling constraint between two bodies in relative motion. We show that geodesic based contact curves generated on both the contacting surfaces are sufficient conditions to ensure rolling. The differential geodesic equations, when modified, can ensure proper disturbance rejection in case the system of interacting bodies is perturbed from the desired curve. A corollary states that geodesic curves are generated on the surface if rolling constraints are satisfied. Simulations in the context of in-hand manipulations of the objects are used as examples.
CRDec 17, 2020
Treadmill Assisted Gait Spoofing (TAGS): An Emerging Threat to wearable Sensor-based Gait AuthenticationRajesh Kumar, Can Isik, Vir V Phoha
In this work, we examine the impact of Treadmill Assisted Gait Spoofing (TAGS) on Wearable Sensor-based Gait Authentication (WSGait). We consider more realistic implementation and deployment scenarios than the previous study, which focused only on the accelerometer sensor and a fixed set of features. Specifically, we consider the situations in which the implementation of WSGait could be using one or more sensors embedded into modern smartphones. Besides, it could be using different sets of features or different classification algorithms, or both. Despite the use of a variety of sensors, feature sets (ranked by mutual information), and six different classification algorithms, TAGS was able to increase the average False Accept Rate (FAR) from 4% to 26%. Such a considerable increase in the average FAR, especially under the stringent implementation and deployment scenarios considered in this study, calls for a further investigation into the design of evaluations of WSGait before its deployment for public use.
IVJul 10, 2020
Blockchain-Federated-Learning and Deep Learning Models for COVID-19 detection using CT ImagingRajesh Kumar, Abdullah Aman Khan, Sinmin Zhang et al.
With the increase of COVID-19 cases worldwide, an effective way is required to diagnose COVID-19 patients. The primary problem in diagnosing COVID-19 patients is the shortage and reliability of testing kits, due to the quick spread of the virus, medical practitioners are facing difficulty identifying the positive cases. The second real-world problem is to share the data among the hospitals globally while keeping in view the privacy concerns of the organizations. Building a collaborative model and preserving privacy are major concerns for training a global deep learning model. This paper proposes a framework that collects a small amount of data from different sources (various hospitals) and trains a global deep learning model using blockchain based federated learning. Blockchain technology authenticates the data and federated learning trains the model globally while preserving the privacy of the organization. First, we propose a data normalization technique that deals with the heterogeneity of data as the data is gathered from different hospitals having different kinds of CT scanners. Secondly, we use Capsule Network-based segmentation and classification to detect COVID-19 patients. Thirdly, we design a method that can collaboratively train a global model using blockchain technology with federated learning while preserving privacy. Additionally, we collected real-life COVID-19 patients data, which is, open to the research community. The proposed framework can utilize up-to-date data which improves the recognition of computed tomography (CT) images. Finally, our results demonstrate a better performance to detect COVID-19 patients.
CVJun 16, 2020
On the Inference of Soft Biometrics from Typing Patterns Collected in a Multi-device EnvironmentVishaal Udandarao, Mohit Agrawal, Rajesh Kumar et al.
In this paper, we study the inference of gender, major/minor (computer science, non-computer science), typing style, age, and height from the typing patterns collected from 117 individuals in a multi-device environment. The inference of the first three identifiers was considered as classification tasks, while the rest as regression tasks. For classification tasks, we benchmark the performance of six classical machine learning (ML) and four deep learning (DL) classifiers. On the other hand, for regression tasks, we evaluated three ML and four DL-based regressors. The overall experiment consisted of two text-entry (free and fixed) and four device (Desktop, Tablet, Phone, and Combined) configurations. The best arrangements achieved accuracies of 96.15%, 93.02%, and 87.80% for typing style, gender, and major/minor, respectively, and mean absolute errors of 1.77 years and 2.65 inches for age and height, respectively. The results are promising considering the variety of application scenarios that we have listed in this work.
HCJan 24, 2020
Touchless Typing Using Head Movement-based GesturesShivam Rustagi, Aakash Garg, Pranay Raj Anand et al.
In this paper, we propose a novel touchless typing interface that makes use of an on-screen QWERTY keyboard and a smartphone camera. The keyboard was divided into nine color-coded clusters. The user moved their head toward clusters, which contained the letters that they wanted to type. A front-facing smartphone camera recorded the head movements. A bidirectional GRU based model which used pre-trained embedding rich in head pose features was employed to translate the recordings into cluster sequences. The model achieved an accuracy of 96.78% and 86.81% under intra- and inter-user scenarios, respectively, over a dataset of 2234 video sequences collected from 22 users.
CVNov 12, 2018
Identification of Internal Faults in Indirect Symmetrical Phase Shift Transformers Using Ensemble LearningPallav Kumar Bera, Rajesh Kumar, Can Isik
This paper proposes methods to identify 40 different types of internal faults in an Indirect Symmetrical Phase Shift Transformer (ISPST). The ISPST was modeled using Power System Computer Aided Design (PSCAD)/ Electromagnetic Transients including DC (EMTDC). The internal faults were simulated by varying the transformer tapping, backward and forward phase shifts, loading, and percentage of winding faulted. Data for 960 cases of each type of fault was recorded. A series of features were extracted for a, b, and c phases from time, frequency, time-frequency, and information theory domains. The importance of the extracted features was evaluated through univariate tests which helped to reduce the number of features. The selected features were then used for training five state-of-the-art machine learning classifiers. Extremely Random Trees and Random Forest, the ensemble-based learners, achieved the accuracy of 98.76% and 97.54% respectively outperforming Multilayer Perceptron (96.13%), Logistic Regression (93.54%), and Support Vector Machines (92.60%)
CVOct 30, 2017
Continuous Authentication Using One-class Classifiers and their FusionRajesh Kumar, Partha Pratim Kundu, Vir V. Phoha
While developing continuous authentication systems (CAS), we generally assume that samples from both genuine and impostor classes are readily available. However, the assumption may not be true in certain circumstances. Therefore, we explore the possibility of implementing CAS using only genuine samples. Specifically, we investigate the usefulness of four one-class classifiers OCC (elliptic envelope, isolation forest, local outliers factor, and one-class support vector machines) and their fusion. The performance of these classifiers was evaluated on four distinct behavioral biometric datasets, and compared with eight multi-class classifiers (MCC). The results demonstrate that if we have sufficient training data from the genuine user the OCC, and their fusion can closely match the performance of the majority of MCC. Our findings encourage the research community to use OCC in order to build CAS as they do not require knowledge of impostor class during the enrollment process.
HCAug 15, 2017
Continuous User Authentication via Unlabeled Phone Movement PatternsRajesh Kumar, Partha Pratim Kundu, Diksha Shukla et al.
In this paper, we propose a novel continuous authentication system for smartphone users. The proposed system entirely relies on unlabeled phone movement patterns collected through smartphone accelerometer. The data was collected in a completely unconstrained environment over five to twelve days. The contexts of phone usage were identified using k-means clustering. Multiple profiles, one for each context, were created for every user. Five machine learning algorithms were employed for classification of genuine and impostors. The performance of the system was evaluated over a diverse population of 57 users. The mean equal error rates achieved by Logistic Regression, Neural Network, kNN, SVM, and Random Forest were 13.7%, 13.5%, 12.1%, 10.7%, and 5.6% respectively. A series of statistical tests were conducted to compare the performance of the classifiers. The suitability of the proposed system for different types of users was also investigated using the failure to enroll policy.
CVMar 7, 2016
Authenticating users through their arm movement patternsRajesh Kumar, Vir V Phoha, Rahul Raina
In this paper, we propose four continuous authentication designs by using the characteristics of arm movements while individuals walk. The first design uses acceleration of arms captured by a smartwatch's accelerometer sensor, the second design uses the rotation of arms captured by a smartwatch's gyroscope sensor, third uses the fusion of both acceleration and rotation at the feature-level and fourth uses the fusion at score-level. Each of these designs is implemented by using four classifiers, namely, k nearest neighbors (k-NN) with Euclidean distance, Logistic Regression, Multilayer Perceptrons, and Random Forest resulting in a total of sixteen authentication mechanisms. These authentication mechanisms are tested under three different environments, namely an intra-session, inter-session on a dataset of 40 users and an inter-phase on a dataset of 12 users. The sessions of data collection were separated by at least ten minutes, whereas the phases of data collection were separated by at least three months. Under the intra-session environment, all of the twelve authentication mechanisms achieve a mean dynamic false accept rate (DFAR) of 0% and dynamic false reject rate (DFRR) of 0%. For the inter-session environment, feature level fusion-based design with classifier k-NN achieves the best error rates that are a mean DFAR of 2.2% and DFRR of 4.2%. The DFAR and DFRR increased from 5.68% and 4.23% to 15.03% and 14.62% respectively when feature level fusion-based design with classifier k-NN was tested under the inter-phase environment on a dataset of 12 users.
CRSep 30, 2015
Time Dependent Analysis with Dynamic Counter Measure TreesRajesh Kumar, Dennis Guck, Marielle Stoelinga
The success of a security attack crucially depends on time: the more time available to the attacker, the higher the probability of a successful attack. Formalisms such as Reliability block diagrams, Reliability graphs and Attack Countermeasure trees provide quantitative information about attack scenarios, but they are provably insufficient to model dependent actions which involve costs, skills, and time. In this presentation, we extend the Attack Countermeasure trees with a notion of time; inspired by the fact that there is a strong correlation between the amount of resources in which the attacker invests (in this case time) and probability that an attacker succeeds. This allows for an effective selection of countermeasures and rank them according to their resource consumption in terms of costs/skills of installing them and effectiveness in preventing an attack
CRFeb 13, 2014
A Fast Compressive Sensing Based Digital Image Encryption Technique using Structurally Random Matrices and Arnold TransformNitin Rawat, Pavel Ni, Rajesh Kumar
A new digital image encryption method based on fast compressed sensing approach using structurally random matrices and Arnold transform is proposed. Considering the natural images to be compressed in any domain, the fast compressed sensing based approach saves computational time, increases the quality of the image and reduces the dimension of the digital image by choosing even 25 % of the measurements. First, dimension reduction is utilized to compress the digital image with scrambling effect. Second, Arnold transformation is used to give the reduced digital image into more complex form. Then, the complex image is again encrypted by double random phase encoding process embedded with a host image; two random keys with fractional Fourier transform are been used as a secret keys. At the receiver, the decryption process is recovered by using TwIST algorithm. Experimental results including peak-to-peak signal-to-noise ratio between the original and reconstructed image are shown to analyze the validity of this technique and demonstrated our proposed method to be secure, fast, complex and robust.