RO Jul 21, 2022
Towards Robust On-Ramp Merging via Augmented Multimodal Reinforcement Learning
Gaurav Bagwe, Jian Li, Xiaoyong Yuan et al.
Despite the success of AI-enabled onboard perception, on-ramp merging remains one of the main challenges for autonomous driving. Because onboard sensors have a limited sensing range, a merging vehicle can hardly observe main-road conditions and merge properly. By leveraging wireless communication between connected and automated vehicles (CAVs), a merging CAV has the potential to proactively obtain the intentions of nearby vehicles. However, CAVs can be prone to inaccurate observations, such as noisy basic safety messages (BSM) and poor-quality surveillance images. In this paper, we present a novel approach for Robust on-ramp merging of CAVs via Augmented and Multi-modal Reinforcement Learning, named RAMRL. Specifically, we formulate the on-ramp merging problem as a Markov decision process (MDP) that takes driving safety, comfortable driving behavior, and traffic efficiency into account. To provide reliable merging maneuvers, we simultaneously leverage BSM and surveillance images as a multi-modal observation, which is used to learn a policy model through proximal policy optimization (PPO). Moreover, to improve data efficiency and generalization performance, we train the policy model with augmented data (e.g., noisy BSM and noisy surveillance images). Extensive experiments are conducted on the Simulation of Urban MObility (SUMO) platform under two typical merging scenarios. Experimental results demonstrate the effectiveness and efficiency of our robust on-ramp merging design.
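As a rough illustration of the augmentation idea mentioned in this abstract (training on noisy copies of BSM observations), the following minimal sketch adds zero-mean Gaussian noise to a BSM feature vector. The function names, the feature layout, the noise model, and the `sigma` value are all assumptions made for illustration; the abstract does not specify how RAMRL generates its augmented data.

```python
import random
from typing import List, Optional


def augment_bsm(bsm: List[float], sigma: float = 0.1,
                rng: Optional[random.Random] = None) -> List[float]:
    """Return a copy of a BSM feature vector with zero-mean Gaussian noise added.

    The Gaussian noise model and sigma are illustrative assumptions,
    not the augmentation used in the paper.
    """
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, sigma) for x in bsm]


def augmented_batch(bsm: List[float], copies: int = 4,
                    sigma: float = 0.1, seed: int = 0) -> List[List[float]]:
    """Return the clean observation followed by `copies` noisy variants."""
    rng = random.Random(seed)
    return [list(bsm)] + [augment_bsm(bsm, sigma, rng) for _ in range(copies)]


if __name__ == "__main__":
    # Hypothetical features: [speed (m/s), acceleration (m/s^2), gap to lead vehicle (m)]
    for row in augmented_batch([25.0, 0.5, 12.0], copies=3, sigma=0.2):
        print(row)
```

Each noisy row would be fed to the policy alongside the clean one, so that the learned policy sees the same traffic state under several plausible measurement errors.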
LG Jul 10, 2023
Fed-CPrompt: Contrastive Prompt for Rehearsal-Free Federated Continual Learning
Gaurav Bagwe, Xiaoyong Yuan, Miao Pan et al.
Federated continual learning (FCL) learns incremental tasks over time from confidential datasets distributed across clients. This paper focuses on rehearsal-free FCL, which suffers from severe forgetting when learning new tasks because it has no access to historical task data. To address this issue, we propose Fed-CPrompt, which builds on prompt learning techniques to obtain task-specific prompts in a communication-efficient way. Fed-CPrompt introduces two key components, asynchronous prompt learning and a contrastive continual loss, to handle asynchronous task arrival and heterogeneous data distributions in FCL, respectively. Extensive experiments demonstrate the effectiveness of Fed-CPrompt in achieving SOTA rehearsal-free FCL performance.
CV Nov 13, 2025
MOBA: A Material-Oriented Backdoor Attack against LiDAR-based 3D Object Detection Systems
Saket S. Chaturvedi, Gaurav Bagwe, Lan Zhang et al.
LiDAR-based 3D object detection is widely used in safety-critical systems. However, these systems remain vulnerable to backdoor attacks that embed hidden malicious behaviors during training. A key limitation of existing backdoor attacks is their lack of physical realizability, primarily due to the digital-to-physical domain gap. Digital triggers often fail in real-world settings because they overlook material-dependent LiDAR reflection properties. On the other hand, physically constructed triggers are often unoptimized, leading to low effectiveness or easy detectability. This paper introduces Material-Oriented Backdoor Attack (MOBA), a novel framework that bridges the digital-physical gap by explicitly modeling the material properties of real-world triggers. MOBA tackles two key challenges in physical backdoor design: (1) robustness of the trigger material under diverse environmental conditions, and (2) alignment between the physical trigger's behavior and its digital simulation. First, we propose a systematic approach to selecting robust trigger materials, identifying titanium dioxide (TiO_2) for its high diffuse reflectivity and environmental resilience. Second, to ensure the digital trigger accurately mimics the physical behavior of the material-based trigger, we develop a novel simulation pipeline that features: (1) an angle-independent approximation of the Oren-Nayar BRDF model to generate realistic LiDAR intensities, and (2) a distance-aware scaling mechanism to maintain spatial consistency across varying depths. We conduct extensive experiments on state-of-the-art LiDAR-based and Camera-LiDAR fusion models, showing that MOBA achieves a 93.50% attack success rate, outperforming prior methods by over 41%. Our work reveals a new class of physically realizable threats and underscores the urgent need for defenses that account for material-level properties in real-world environments.
CV Sep 18, 2025
AIP: Subverting Retrieval-Augmented Generation via Adversarial Instructional Prompt
Saket S. Chaturvedi, Gaurav Bagwe, Lan Zhang et al.
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by retrieving relevant documents from external sources to improve factual accuracy and verifiability. However, this reliance introduces new attack surfaces within the retrieval pipeline, beyond the LLM itself. While prior RAG attacks have exposed such vulnerabilities, they largely rely on manipulating user queries, which is often infeasible in practice due to fixed or protected user inputs. This narrow focus overlooks a more realistic and stealthy vector: instructional prompts, which are widely reused, publicly shared, and rarely audited. Their implicit trust makes them a compelling target for adversaries to manipulate RAG behavior covertly. We introduce Adversarial Instructional Prompt (AIP), a novel attack that exploits adversarial instructional prompts to manipulate RAG outputs by subtly altering retrieval behavior. By shifting the attack surface to the instructional prompts, AIP reveals how trusted yet seemingly benign interface components can be weaponized to degrade system integrity. The attack is crafted to achieve three goals: (1) naturalness, to evade user detection; (2) utility, to encourage use of prompts; and (3) robustness, to remain effective across diverse query variations. We propose a diverse query generation strategy that simulates realistic linguistic variation in user queries, enabling the discovery of prompts that generalize across paraphrases and rephrasings. Building on this, a genetic algorithm-based joint optimization is developed to evolve adversarial prompts by balancing attack success, clean-task utility, and stealthiness. Experimental results show that AIP achieves an attack success rate (ASR) of up to 95.23% while preserving benign functionality. These findings uncover a critical and previously overlooked vulnerability in RAG systems, emphasizing the need to reassess shared instructional prompts.
LG Jul 8, 2025
Sample-Efficient Reinforcement Learning Controller for Deep Brain Stimulation in Parkinson's Disease
Harsh Ravivarapu, Gaurav Bagwe, Xiaoyong Yuan et al.
Deep brain stimulation (DBS) is an established intervention for Parkinson's disease (PD), but conventional open-loop systems lack adaptability, are energy-inefficient due to continuous stimulation, and provide limited personalization to individual neural dynamics. Adaptive DBS (aDBS) offers a closed-loop alternative, using biomarkers such as beta-band oscillations to dynamically modulate stimulation. While reinforcement learning (RL) holds promise for personalized aDBS control, existing methods suffer from high sample complexity, unstable exploration in binary action spaces, and limited deployability on resource-constrained hardware. We propose SEA-DBS, a sample-efficient actor-critic framework that addresses the core challenges of RL-based adaptive neurostimulation. SEA-DBS integrates a predictive reward model to reduce reliance on real-time feedback and employs Gumbel Softmax-based exploration for stable, differentiable policy updates in binary action spaces. Together, these components improve sample efficiency, exploration robustness, and compatibility with resource-constrained neuromodulatory hardware. We evaluate SEA-DBS on a biologically realistic simulation of Parkinsonian basal ganglia activity, demonstrating faster convergence, stronger suppression of pathological beta-band power, and resilience to post-training FP16 quantization. Our results show that SEA-DBS offers a practical and effective RL-based aDBS framework for real-time, resource-constrained neuromodulation.
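The Gumbel-Softmax exploration mentioned in this abstract can be illustrated with a small sketch of the general technique for a binary (stimulate / do not stimulate) action. This shows the standard Gumbel-Softmax relaxation in plain Python, not the SEA-DBS implementation; the logits, the temperature `tau`, and the function names are assumptions chosen for illustration.

```python
import math
import random
from typing import List


def sample_gumbel(rng: random.Random) -> float:
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = max(rng.random(), 1e-12)  # keep u strictly inside (0, 1)
    return -math.log(-math.log(u))


def gumbel_softmax(logits: List[float], tau: float,
                   rng: random.Random) -> List[float]:
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / tau)."""
    noisy = [(l + sample_gumbel(rng)) / tau for l in logits]
    m = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits: List[float], tau: float, rng: random.Random) -> int:
    """Discretize the relaxed sample: 0 = no stimulation, 1 = stimulate."""
    probs = gumbel_softmax(logits, tau, rng)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.0, 1.0]  # hypothetical actor outputs for [off, on]
    print(gumbel_softmax(logits, tau=0.5, rng=rng))
    print([sample_action(logits, 0.5, rng) for _ in range(10)])
```

Lower values of `tau` push the relaxed sample toward a one-hot vector, while higher values make it smoother. In an autodiff framework the relaxed sample stays differentiable with respect to the logits, which is the property that makes this trick attractive for discrete action spaces.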