SYFeb 26, 2018
A Learning-based Stochastic MPC Design for Cooperative Adaptive Cruise Control to Handle Interfering VehiclesHadi Kazemi, Hossein Nourkhiz Mahjoub, Amin Tahmasbi-Sarvestani et al.
Vehicle to Vehicle (V2V) communication has a great potential to improve reaction accuracy of different driver assistance systems in critical driving situations. Cooperative Adaptive Cruise Control (CACC), which is an automated application, provides drivers with extra benefits such as traffic throughput maximization and collision avoidance. CACC systems must be designed in a way that are sufficiently robust against all special maneuvers such as cutting-into the CACC platoons by interfering vehicles or hard braking by leading cars. To address this problem, a Neural- Network (NN)-based cut-in detection and trajectory prediction scheme is proposed in the first part of this paper. Next, a probabilistic framework is developed in which the cut-in probability is calculated based on the output of the mentioned cut-in prediction block. Finally, a specific Stochastic Model Predictive Controller (SMPC) is designed which incorporates this cut-in probability to enhance its reaction against the detected dangerous cut-in maneuver. The overall system is implemented and its performance is evaluated using realistic driving scenarios from Safety Pilot Model Deployment (SPMD).
SPJul 11, 2018
A Driver Behavior Modeling Structure Based on Non-parametric Bayesian Stochastic Hybrid ArchitectureHossein Nourkhiz Mahjoub, Behrad Toghi, Yaser P. Fallah
Heterogeneous nature of the vehicular networks, which results from the co-existence of human-driven, semi-automated, and fully autonomous vehicles, is a challenging phenomenon toward the realization of the intelligent transportation systems with an acceptable level of safety, comfort, and efficiency. Safety applications highly suffer from communication resource limitations, specifically in dense and congested vehicular networks. The idea of model-based communication (MBC) has been recently proposed to address this issue. In this work, we propose Gaussian Process-based Stochastic Hybrid System with Cumulative Relevant History (CRH-GP-SHS) framework, which is a hierarchical stochastic hybrid modeling structure, built upon a non-parametric Bayesian inference method, i.e. Gaussian processes. This framework is proposed in order to be employed within the MBC context to jointly model driver/vehicle behavior as a stochastic object. Non-parametric Bayesian methods relieve the limitations imposed by non-evolutionary model structures and enable the proposed framework to properly capture different stochastic behaviors. The performance of the proposed CRH-GP-SHS framework at the inter-mode level has been evaluated over a set of realistic lane change maneuvers from NGSIM-US101 dataset. The results show a noticeable performance improvement for GP in comparison to the baseline constant speed model, specifically in critical situations such as highly congested networks. Moreover, an augmented model has also been proposed which is a composition of GP and constant speed models and capable of capturing the driver behavior under various network reliability conditions.
RODec 24, 2022
Context-Aware Target Classification with Hybrid Gaussian Process prediction for Cooperative Vehicle Safety systemsRodolfo Valiente, Arash Raftari, Hossein Nourkhiz Mahjoub et al.
Vehicle-to-Everything (V2X) communication has been proposed as a potential solution to improve the robustness and safety of autonomous vehicles by improving coordination and removing the barrier of non-line-of-sight sensing. Cooperative Vehicle Safety (CVS) applications are tightly dependent on the reliability of the underneath data system, which can suffer from loss of information due to the inherent issues of their different components, such as sensors failures or the poor performance of V2X technologies under dense communication channel load. Particularly, information loss affects the target classification module and, subsequently, the safety application performance. To enable reliable and robust CVS systems that mitigate the effect of information loss, we proposed a Context-Aware Target Classification (CA-TC) module coupled with a hybrid learning-based predictive modeling technique for CVS systems. The CA-TC consists of two modules: A Context-Aware Map (CAM), and a Hybrid Gaussian Process (HGP) prediction system. Consequently, the vehicle safety applications use the information from the CA-TC, making them more robust and reliable. The CAM leverages vehicles path history, road geometry, tracking, and prediction; and the HGP is utilized to provide accurate vehicles' trajectory predictions to compensate for data loss (due to communication congestion) or sensor measurements' inaccuracies. Based on offline real-world data, we learn a finite bank of driver models that represent the joint dynamics of the vehicle and the drivers' behavior. We combine offline training and online model updates with on-the-fly forecasting to account for new possible driver behaviors. Finally, our framework is validated using simulation and realistic driving scenarios to confirm its potential in enhancing the robustness and reliability of CVS systems.
LGJan 30
Learning Robust Reasoning through Guided Adversarial Self-PlayShuozhe Li, Vaishnav Tadiparthi, Kwonjoon Lee et al.
Reinforcement learning from verifiable rewards (RLVR) produces strong reasoning models, yet they can fail catastrophically when the conditioning context is fallible (e.g., corrupted chain-of-thought, misleading partial solutions, or mild input perturbations), since standard RLVR optimizes final-answer correctness only under clean conditioning. We introduce GASP (Guided Adversarial Self-Play), a robustification method that explicitly trains detect-and-repair capabilities using only outcome verification. Without human labels or external teachers, GASP forms an adversarial self-play game within a single model: a polluter learns to induce failure via locally coherent corruptions, while an agent learns to diagnose and recover under the same corrupted conditioning. To address the scarcity of successful recoveries early in training, we propose in-distribution repair guidance, an imitation term on self-generated repairs that increases recovery probability while preserving previously acquired capabilities. Across four open-weight models (1.5B--8B), GASP transforms strong-but-brittle reasoners into robust ones that withstand misleading and perturbed context while often improving clean accuracy. Further analysis shows that adversarial corruptions induce an effective curriculum, and in-distribution guidance enables rapid recovery learning with minimal representational drift.
CVMar 4
SSR: A Generic Framework for Text-Aided Map Compression for LocalizationMohammad Omama, Po-han Li, Harsh Goel et al.
Mapping is crucial in robotics for localization and downstream decision-making. As robots are deployed in ever-broader settings, the maps they rely on continue to increase in size. However, storing these maps indefinitely (cold storage), transferring them across networks, or sending localization queries to cloud-hosted maps imposes prohibitive memory and bandwidth costs. We propose a text-enhanced compression framework that reduces both memory and bandwidth footprints while retaining high-fidelity localization. The key idea is to treat text as an alternative modality: one that can be losslessly compressed with large language models. We propose leveraging lightweight text descriptions combined with very small image feature vectors, which capture "complementary information" as a compact representation for the mapping task. Building on this, our novel technique, Similarity Space Replication (SSR), learns an adaptive image embedding in one shot that captures only the information "complementary" to the text descriptions. We validate our compression framework on multiple downstream localization tasks, including Visual Place Recognition as well as object-centric Monte Carlo localization in both indoor and outdoor settings. SSR achieves 2 times better compression than competing baselines on state-of-the-art datasets, including TokyoVal, Pittsburgh30k, Replica, and KITTI.
95.4LGApr 3
Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning ModelsGengwei Zhang, Jie Peng, Zhen Tan et al.
The recent success of reinforcement learning (RL) in large reasoning models has inspired the growing adoption of RL for post-training Multimodal Large Language Models (MLLMs) to enhance their visual reasoning capabilities. Although many studies have reported improved performance, it remains unclear whether RL training truly enables models to learn from visual information. In this work, we propose the Hallucination-as-Cue Framework, an analytical framework designed to investigate the effects of RL-based post-training on multimodal reasoning models from the perspective of model hallucination. Specifically, we introduce hallucination-inductive, modality-specific corruptions that remove or replace essential information required to derive correct answers, thereby forcing the model to reason by hallucination. By applying these corruptions during both training and evaluation, our framework provides a unique perspective for diagnosing RL training dynamics and understanding the intrinsic properties of datasets. Through extensive experiments and analyses across multiple multimodal reasoning benchmarks, we reveal that the role of model hallucination for RL-training is more significant than previously recognized. For instance, we find that RL post-training under purely hallucination-inductive settings can still significantly improve models' reasoning performance, and in some cases even outperform standard training. These findings challenge prevailing assumptions about MLLM reasoning training and motivate the development of more modality-aware RL-based training designs.
ROJan 2, 2025
In Search of a Lost Metric: Human Empowerment as a Pillar of Socially Conscious NavigationVasanth Reddy Baddam, Behdad Chalaki, Vaishnav Tadiparthi et al.
In social robot navigation, traditional metrics like proxemics and behavior naturalness emphasize human comfort and adherence to social norms but often fail to capture an agent's autonomy and adaptability in dynamic environments. This paper introduces human empowerment, an information-theoretic concept that measures a human's ability to influence their future states and observe those changes, as a complementary metric for evaluating social compliance. This metric reveals how robot navigation policies can indirectly impact human empowerment. We present a framework that integrates human empowerment into the evaluation of social performance in navigation tasks. Through numerical simulations, we demonstrate that human empowerment as a metric not only aligns with intuitive social behavior, but also shows statistically significant differences across various robot navigation policies. These results provide a deeper understanding of how different policies affect social compliance, highlighting the potential of human empowerment as a complementary metric for future research in social navigation.
AIOct 16, 2025
Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution ReconstructionXu Shen, Qi Zhang, Song Wang et al.
Large Language Model based multi-agent systems (MAS) excel at collaborative problem solving but remain brittle to cascading errors: a single faulty step can propagate across agents and disrupt the trajectory. In this paper, we present MASC, a metacognitive framework that endows MAS with real-time, unsupervised, step-level error detection and self-correction. MASC rethinks detection as history-conditioned anomaly scoring via two complementary designs: (1) Next-Execution Reconstruction, which predicts the embedding of the next step from the query and interaction history to capture causal consistency, and (2) Prototype-Guided Enhancement, which learns a prototype prior over normal-step embeddings and uses it to stabilize reconstruction and anomaly scoring under sparse context (e.g., early steps). When an anomaly step is flagged, MASC triggers a correction agent to revise the acting agent's output before information flows downstream. On the Who&When benchmark, MASC consistently outperforms all baselines, improving step-level error detection by up to 8.47% AUC-ROC ; When plugged into diverse MAS frameworks, it delivers consistent end-to-end gains across architectures, confirming that our metacognitive monitoring and targeted correction can mitigate error propagation with minimal overhead.
AIJul 15, 2025
MR-LDM -- The Merge-Reactive Longitudinal Decision Model: Game Theoretic Human Decision Modeling for Interactive Sim AgentsDustin Holley, Jovin D'sa, Hossein Nourkhiz Mahjoub et al.
Enhancing simulation environments to replicate real-world driver behavior, i.e., more humanlike sim agents, is essential for developing autonomous vehicle technology. In the context of highway merging, previous works have studied the operational-level yielding dynamics of lag vehicles in response to a merging car at highway on-ramps. Other works focusing on tactical decision modeling generally consider limited action sets or utilize payoff functions with large parameter sets and limited payoff bounds. In this work, we aim to improve the simulation of the highway merge scenario by targeting a game theoretic model for tactical decision-making with improved payoff functions and lag actions. We couple this with an underlying dynamics model to have a unified decision and dynamics model that can capture merging interactions and simulate more realistic interactions in an explainable and interpretable fashion. The proposed model demonstrated good reproducibility of complex interactions when validated on a real-world dataset. The model was finally integrated into a high fidelity simulation environment and confirmed to have adequate computation time efficiency for use in large-scale simulations to support autonomous vehicle development.
CVOct 30, 2024
Symbolic Graph Inference for Compound Scene UnderstandingFNU Aryan, Simon Stepputtis, Sarthak Bhagat et al.
Scene understanding is a fundamental capability needed in many domains, ranging from question-answering to robotics. Unlike recent end-to-end approaches that must explicitly learn varying compositions of the same scene, our method reasons over their constituent objects and analyzes their arrangement to infer a scene's meaning. We propose a novel approach that reasons over a scene's scene- and knowledge-graph, capturing spatial information while being able to utilize general domain knowledge in a joint graph search. Empirically, we demonstrate the feasibility of our method on the ADE20K dataset and compare it to current scene understanding approaches.
ITMay 24, 2023
Task-aware Distributed Source Coding under Dynamic BandwidthPo-han Li, Sravan Kumar Ankireddy, Ruihan Zhao et al.
Efficient compression of correlated data is essential to minimize communication overload in multi-sensor networks. In such networks, each sensor independently compresses the data and transmits them to a central node due to limited communication bandwidth. A decoder at the central node decompresses and passes the data to a pre-trained machine learning-based task to generate the final output. Thus, it is important to compress the features that are relevant to the task. Additionally, the final performance depends heavily on the total available bandwidth. In practice, it is common to encounter varying availability in bandwidth, and higher bandwidth results in better performance of the task. We design a novel distributed compression framework composed of independent encoders and a joint decoder, which we call neural distributed principal component analysis (NDPCA). NDPCA flexibly compresses data from multiple sources to any available bandwidth with a single model, reducing computing and storage overhead. NDPCA achieves this by learning low-rank task representations and efficiently distributing bandwidth among sensors, thus providing a graceful trade-off between performance and bandwidth. Experiments show that NDPCA improves the success rate of multi-view robotic arm manipulation by 9% and the accuracy of object detection tasks on satellite imagery by 14% compared to an autoencoder with uniform bandwidth allocation.
SPMar 4, 2019
V2X System Architecture Utilizing Hybrid Gaussian Process-based Model StructuresHossein Nourkhiz Mahjoub, Behrad Toghi, S M Osman Gani et al.
Scalable communication is of utmost importance for reliable dissemination of time-sensitive information in cooperative vehicular ad-hoc networks (VANETs), which is, in turn, an essential prerequisite for the proper operation of the critical cooperative safety applications. The model-based communication (MBC) is a recently-explored scalability solution proposed in the literature, which has shown a promising potential to reduce the channel congestion to a great extent. In this work, based on the MBC notion, a technology-agnostic hybrid model selection policy for Vehicle-to-Everything (V2X) communication is proposed which benefits from the characteristics of the non-parametric Bayesian inference techniques, specifically Gaussian Processes. The results show the effectiveness of the proposed communication architecture on both reducing the required message exchange rate and increasing the remote agent tracking precision.
ROAug 1, 2018
A Learning-Based Framework for Two-Dimensional Vehicle Maneuver Prediction over V2V NetworksHossein Nourkhiz Mahjoub, Amin Tahmasbi-Sarvestani, Hadi Kazemi et al.
Situational awareness in vehicular networks could be substantially improved utilizing reliable trajectory prediction methods. More precise situational awareness, in turn, results in notably better performance of critical safety applications, such as Forward Collision Warning (FCW), as well as comfort applications like Cooperative Adaptive Cruise Control (CACC). Therefore, vehicle trajectory prediction problem needs to be deeply investigated in order to come up with an end to end framework with enough precision required by the safety applications' controllers. This problem has been tackled in the literature using different methods. However, machine learning, which is a promising and emerging field with remarkable potential for time series prediction, has not been explored enough for this purpose. In this paper, a two-layer neural network-based system is developed which predicts the future values of vehicle parameters, such as velocity, acceleration, and yaw rate, in the first layer and then predicts the two-dimensional, i.e. longitudinal and lateral, trajectory points based on the first layer's outputs. The performance of the proposed framework has been evaluated in realistic cut-in scenarios from Safety Pilot Model Deployment (SPMD) dataset and the results show a noticeable improvement in the prediction accuracy in comparison with the kinematics model which is the dominant employed model by the automotive industry. Both ideal and nonideal communication circumstances have been investigated for our system evaluation. For non-ideal case, an estimation step is included in the framework before the parameter prediction block to handle the drawbacks of packet drops or sensor failures and reconstruct the time series of vehicle parameters at a desirable frequency.