RO Jun 11, 2023
Digital Twin-Enhanced Wireless Indoor Navigation: Achieving Efficient Environment Sensing with Zero-Shot Reinforcement Learning
Tao Li, Haozhe Lei, Hao Guo et al.
Millimeter-wave (mmWave) communication is a vital component of future generations of mobile networks, offering not only high data rates but also precise beams, making it ideal for indoor navigation in complex environments. However, the challenges of multipath propagation and noisy signal measurements in indoor spaces complicate the use of mmWave signals for navigation tasks. Traditional physics-based methods, such as following the angle of arrival (AoA), often fall short in complex scenarios, highlighting the need for more sophisticated approaches. Digital twins, as virtual replicas of physical environments, offer a powerful tool for simulating and optimizing mmWave signal propagation in such settings. By creating detailed, physics-based models of real-world spaces, digital twins enable the training of machine learning algorithms in virtual environments, reducing the costs and limitations of physical testing. Despite their advantages, current machine learning models trained in digital twins often overfit specific virtual environments and require costly retraining when applied to new scenarios. In this paper, we propose a Physics-Informed Reinforcement Learning (PIRL) approach that leverages the physical insights provided by digital twins to shape the reinforcement learning (RL) reward function. By integrating physics-based metrics such as signal strength, AoA, and path reflections into the learning process, PIRL enables efficient learning and improved generalization to new environments without retraining. Our experiments demonstrate that the proposed PIRL, supported by digital twin simulations, outperforms traditional heuristics and standard RL models, achieving zero-shot generalization in unseen environments and offering a cost-effective, scalable solution for wireless indoor navigation.
SY Aug 5, 2024
Multi-level Traffic-Responsive Tilt Camera Surveillance through Predictive Correlated Online Learning
Tao Li, Zilin Bian, Haozhe Lei et al.
In urban traffic management, the primary challenge of dynamically and efficiently monitoring traffic conditions is compounded by the insufficient utilization of thousands of surveillance cameras along the intelligent transportation system. This paper introduces the multi-level Traffic-responsive Tilt Camera surveillance system (TTC-X), a novel framework designed for dynamic and efficient monitoring and management of traffic in urban networks. By leveraging widely deployed pan-tilt-cameras (PTCs), TTC-X overcomes the limitations of a fixed field of view in traditional surveillance systems by providing mobilized and 360-degree coverage. The innovation of TTC-X lies in the integration of advanced machine learning modules, including a detector-predictor-controller structure, with a novel Predictive Correlated Online Learning (PiCOL) methodology and the Spatial-Temporal Graph Predictor (STGP) for real-time traffic estimation and PTC control. TTC-X is tested and evaluated under three experimental scenarios (maximum traffic flow capture, dynamic route planning, and traffic state estimation) based on a simulation environment calibrated using real-world traffic data in Brooklyn, New York. The experimental results showed that TTC-X captured over 60% of the total number of vehicles at the network level, dynamically adjusted its route recommendation in reaction to unexpected full-lane closure events, and reconstructed link-level traffic states with a best MAE below 1.25 vehicles/hour. Demonstrating scalability, cost-efficiency, and adaptability, TTC-X emerges as a powerful solution for urban traffic management in both cyber-physical and real-world environments.
SOC-PH Jul 3, 2024
Digital Twin-based Driver Risk-Aware Intelligent Mobility Analytics for Urban Transportation Management
Tao Li, Zilin Bian, Haozhe Lei et al.
Traditional mobility management strategies emphasize macro-level mobility oversight from traffic-sensing infrastructures, often overlooking safety risks that directly affect road users. To address this, we propose a Digital Twin-based Driver Risk-Aware Intelligent Mobility Analytics (DT-DIMA) system. The DT-DIMA system integrates real-time traffic information from pan-tilt-cameras (PTCs), synchronizes this data into a digital twin to accurately replicate the physical world, and predicts network-wide mobility and safety risks in real time. The system's innovation lies in its integration of spatial-temporal modeling, simulation, and online control modules. Tested and evaluated under normal traffic conditions and incidental situations (e.g., unexpected accidents, pre-planned work zones) in a simulated testbed in Brooklyn, New York, DT-DIMA demonstrated mean absolute percentage errors (MAPEs) ranging from 8.40% to 15.11% in estimating network-level traffic volume and MAPEs from 0.85% to 12.97% in network-level safety risk prediction. In addition, the highly accurate safety risk prediction enables PTCs to preemptively monitor road segments with high driving risks before incidents take place. Such proactive PTC surveillance creates around a 5-minute lead time in capturing traffic incidents. The DT-DIMA system enables transportation managers to understand mobility not only in terms of traffic patterns but also driver-experienced safety risks, allowing for proactive resource allocation in response to various traffic situations. To the authors' best knowledge, DT-DIMA is the first urban mobility management system that considers both mobility and safety risks based on digital twin architecture.
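For readers unfamiliar with the metric, the MAPE figures quoted in the abstract above can be read against the standard definition of mean absolute percentage error. The sketch below is a minimal illustration of that definition only; the function name `mape` and the sample numbers are assumptions for illustration and are not taken from the DT-DIMA paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Standard definition: 100 * mean(|a - p| / |a|).
    Hypothetical helper for illustration; not code from the paper.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up traffic volumes (vehicles per interval), for illustration only.
observed = [100.0, 200.0]
estimated = [90.0, 220.0]
error_pct = mape(observed, estimated)  # each estimate is off by 10 percent
```

Because each term is divided by the observed value, MAPE is undefined for zero observations, which is why the sketch rejects them rather than silently skipping them.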
LG Jul 29, 2022
Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation and Complexity Analysis
Tao Li, Haozhe Lei, Quanyan Zhu
Meta reinforcement learning (meta RL), as a combination of meta-learning ideas and reinforcement learning (RL), enables the agent to adapt to different tasks using a few samples. However, this sampling-based adaptation also makes meta RL vulnerable to adversarial attacks. By manipulating the reward feedback from sampling processes in meta RL, an attacker can mislead the agent into building wrong knowledge from training experience, which deteriorates the agent's performance when dealing with different tasks after adaptation. This paper provides a game-theoretical underpinning for understanding this type of security risk. In particular, we formally define the sampling attack model as a Stackelberg game between the attacker and the agent, which yields a minimax formulation. It leads to two online attack schemes: Intermittent Attack and Persistent Attack, which enable the attacker to learn an optimal sampling attack, defined by an $ε$-first-order stationary point, within $\mathcal{O}(ε^{-2})$ iterations. These attack schemes freeride the learning progress concurrently without extra interactions with the environment. By corroborating the convergence results with numerical experiments, we observe that a minor effort of the attacker can significantly deteriorate the learning performance, and the minimax approach can also help robustify the meta RL algorithms.
RO Dec 17, 2022
Cognitive Level-$k$ Meta-Learning for Safe and Pedestrian-Aware Autonomous Driving
Haozhe Lei, Quanyan Zhu
The potential market for modern self-driving cars is enormous, as they are developing remarkably rapidly. At the same time, however, pedestrian fatalities caused by autonomous vehicles have been recorded in street-crossing cases. To ensure traffic safety in self-driving environments and respond to vehicle-human interaction challenges such as jaywalking, we propose the Level-$k$ Meta Reinforcement Learning (LK-MRL) algorithm. It takes into account the cognitive hierarchy of pedestrian responses and enables self-driving vehicles to adapt to various human behaviors. As a self-driving vehicle algorithm, LK-MRL incorporates level-$k$ thinking into MAML to prepare for heterogeneous pedestrians and improve intersection safety, combining meta-reinforcement learning with the human cognitive hierarchy framework. We evaluate the algorithm in two cognitive confrontation hierarchy scenarios in an urban traffic simulator and illustrate its role in ensuring road safety by demonstrating its capability of conjectural and higher-level reasoning.
OC Feb 11
Distributed Online Convex Optimization with Nonseparable Costs and Constraints
Zhaoye Pan, Haozhe Lei, Fan Zuo et al.
This paper studies distributed online convex optimization with time-varying coupled constraints, motivated by distributed online control in network systems. Most prior work assumes a separability condition: the global objective and coupled constraint functions are sums of local costs and individual constraints. In contrast, we study a group of agents, networked via a communication graph, that collectively select actions to minimize a sequence of nonseparable global cost functions and to satisfy nonseparable long-term constraints based on full-information feedback and inter-agent communication. We propose a distributed online primal-dual belief consensus algorithm, where each agent maintains and updates a local belief of the global collective decisions, which are repeatedly exchanged with neighboring agents. Unlike the previous consensus primal-dual algorithms under separability that ask agents to only communicate their local decisions, our belief-sharing protocol eliminates coupling between the primal consensus disagreement and the dual constraint violation, yielding sublinear regret and cumulative constraint violation (CCV) bounds, both in $O(T^{1/2})$, where $T$ denotes the time horizon. Such a result breaks the long-standing $O(T^{3/4})$ barrier for CCV and matches the lower bound of online constrained convex optimization, indicating the online learning efficiency at the cost of communication overhead.
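The two quantities bounded in the abstract above, regret and cumulative constraint violation (CCV), can be made concrete with a small sketch. It assumes the definitions commonly used in online constrained convex optimization: static regret against a fixed comparator, and CCV as the sum of the positive parts of the constraint values. The paper may use a different variant, and the function names and toy scalar problem below are hypothetical.

```python
def static_regret(costs, decisions, comparator):
    """Cumulative cost of the played decisions minus that of a fixed comparator.

    costs: list of callables f_t; decisions: list of x_t; comparator: fixed x*.
    Assumed (common) definition; not taken from the paper.
    """
    incurred = sum(f(x) for f, x in zip(costs, decisions))
    benchmark = sum(f(comparator) for f in costs)
    return incurred - benchmark


def cumulative_constraint_violation(constraints, decisions):
    """Sum over rounds of max(g_t(x_t), 0), where g_t(x) <= 0 means feasible.

    Clipping at zero means strictly feasible rounds cannot cancel violations.
    """
    return sum(max(g(x), 0.0) for g, x in zip(constraints, decisions))


# Toy scalar example over T = 3 rounds (illustrative values only).
T = 3
costs = [lambda x: (x - 1.0) ** 2] * T   # same quadratic cost each round
constraints = [lambda x: x - 0.5] * T    # feasible when x <= 0.5
decisions = [0.0, 0.5, 1.0]

regret = static_regret(costs, decisions, comparator=1.0)
ccv = cumulative_constraint_violation(constraints, decisions)
```

A bound of $O(T^{1/2})$ on both quantities means that the per-round averages, regret divided by $T$ and CCV divided by $T$, vanish as the horizon grows.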
RO Sep 5, 2023
Neurosymbolic Meta-Reinforcement Lookahead Learning Achieves Safe Self-Driving in Non-Stationary Environments
Haozhe Lei, Quanyan Zhu
In the area of learning-driven artificial intelligence advancement, the integration of machine learning (ML) into self-driving (SD) technology stands as an impressive engineering feat. Yet, in real-world applications outside the confines of controlled laboratory scenarios, the deployment of self-driving technology assumes a life-critical role, necessitating heightened attention from researchers towards both safety and efficiency. To illustrate, when a self-driving model encounters an unfamiliar environment in real-time execution, the focus must not solely revolve around enhancing its anticipated performance; equal consideration must be given to ensuring its execution or real-time adaptation maintains a requisite level of safety. This study introduces an algorithm for online meta-reinforcement learning, employing lookahead symbolic constraints based on Neurosymbolic Meta-Reinforcement Lookahead Learning (NUMERLA). NUMERLA proposes a lookahead updating mechanism that harmonizes the efficiency of online adaptations with the overarching goal of ensuring long-term safety. Experimental results demonstrate that NUMERLA endows the self-driving agent with the capacity for real-time adaptability, leading to safe and self-adaptive driving under non-stationary urban human-vehicle interaction scenarios.
CR Oct 31, 2024
ADAPT: A Game-Theoretic and Neuro-Symbolic Framework for Automated Distributed Adaptive Penetration Testing
Haozhe Lei, Yunfei Ge, Quanyan Zhu
The integration of AI into modern critical infrastructure systems, such as healthcare, has introduced new vulnerabilities that can significantly impact workflow, efficiency, and safety. Additionally, the increased connectivity has made traditional human-driven penetration testing insufficient for assessing risks and developing remediation strategies. Consequently, there is a pressing need for a distributed, adaptive, and efficient automated penetration testing framework that not only identifies vulnerabilities but also provides countermeasures to enhance security posture. This work presents ADAPT, a game-theoretic and neuro-symbolic framework for automated distributed adaptive penetration testing, specifically designed to address the unique cybersecurity challenges of AI-enabled healthcare infrastructure networks. We use a healthcare system case study to illustrate the methodologies within ADAPT. The proposed solution enables a learning-based risk assessment. Numerical experiments are used to demonstrate effective countermeasures against various tactical techniques employed by adversarial AI.
LG Jun 27, 2025
Reinforcement Learning with Physics-Informed Symbolic Program Priors for Zero-Shot Wireless Indoor Navigation
Tao Li, Haozhe Lei, Mingsheng Yin et al.
When using reinforcement learning (RL) to tackle physical control tasks, inductive biases that encode physics priors can help improve sample efficiency during training and enhance generalization in testing. However, the current practice of incorporating these helpful physics-informed inductive biases inevitably runs into significant manual labor and domain expertise, making them prohibitive for general users. This work explores a symbolic approach to distill physics-informed inductive biases into RL agents, where the physics priors are expressed in a domain-specific language (DSL) that is human-readable and naturally explainable. Yet, the DSL priors do not translate directly into an implementable policy due to partial and noisy observations and additional physical constraints in navigation tasks. To address this gap, we develop a physics-informed program-guided RL (PiPRL) framework with applications to indoor navigation. PiPRL adopts a hierarchical and modularized neuro-symbolic integration, where a meta symbolic program receives semantically meaningful features from a neural perception module, which form the bases for symbolic programming that encodes physics priors and guides the RL process of a low-level neural controller. Extensive experiments demonstrate that PiPRL consistently outperforms purely symbolic or neural policies and reduces training time by over 26% with the help of the program-based inductive biases.
SP Sep 30, 2025
Transformer-Based Rate Prediction for Multi-Band Cellular Handsets
Ruibin Chen, Haozhe Lei, Hao Guo et al.
Cellular wireless systems are witnessing the proliferation of frequency bands over a wide spectrum, particularly with the expansion of new bands in FR3. These bands must be supported in user equipment (UE) handsets with multiple antennas in a constrained form factor. Rapid variations in channel quality across the bands from motion and hand blockage, limited field-of-view of antennas, and hardware and power-constrained measurement sparsity pose significant challenges to reliable multi-band channel tracking. This paper formulates the problem of predicting achievable rates across multiple antenna arrays and bands with sparse historical measurements. We propose a transformer-based neural architecture that takes asynchronous rate histories as input and outputs per-array rate predictions. Evaluated on ray-traced simulations in a dense urban micro-cellular setting with FR1 and FR3 arrays, our method demonstrates superior performance over baseline predictors, enabling more informed band selection under realistic mobility and hardware constraints.
OC Apr 15, 2025
Traffic Adaptive Moving-window Service Patrolling for Real-time Incident Management during High-impact Events
Haozhe Lei, Ya-Ting Yang, Tao Li et al.
This paper presents the Traffic Adaptive Moving-window Patrolling Algorithm (TAMPA), designed to improve real-time incident management during major events like sports tournaments and concerts. Such events significantly stress transportation networks, requiring efficient and adaptive patrol solutions. TAMPA integrates predictive traffic modeling and real-time complaint estimation, dynamically optimizing patrol deployment. Using dynamic programming, the algorithm continuously adjusts patrol strategies within short planning windows, effectively balancing immediate response and efficient routing. Leveraging the Dvoretzky-Kiefer-Wolfowitz inequality, TAMPA detects significant shifts in complaint patterns, triggering proactive adjustments in patrol routes. Theoretical analyses ensure performance remains closely aligned with optimal solutions. Simulation results from an urban traffic network demonstrate TAMPA's superior performance, showing improvements of approximately 87.5% over stationary methods and 114.2% over random strategies. Future work includes enhancing adaptability and incorporating digital twin technology for improved predictive accuracy, particularly relevant for events like the 2026 FIFA World Cup at MetLife Stadium.
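The abstract above does not spell out how the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality is turned into a shift detector. One generic way to do so is sketched below, as an assumption rather than TAMPA's actual procedure. DKW states that the empirical CDF $F_n$ of $n$ i.i.d. samples satisfies $P(\sup_x |F_n(x) - F(x)| > \epsilon) \le 2e^{-2n\epsilon^2}$. The sketch compares a reference window with a recent window and flags a shift when their empirical CDFs differ by more than the sum of the two DKW half-widths. All function names and sample data are hypothetical.

```python
import math
from bisect import bisect_right


def dkw_epsilon(n, alpha):
    """Half-width eps with P(sup|F_n - F| > eps) <= alpha under DKW."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


def sup_ecdf_gap(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b.

    Both ECDFs are right-continuous step functions, so the supremum is
    attained at one of the sample points.
    """
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        gap = max(gap, abs(fa - fb))
    return gap


def shift_detected(reference, recent, alpha=0.05):
    """Flag a distribution shift between two windows.

    If both windows share one distribution, the triangle inequality and a
    union bound keep the gap below the threshold with probability at
    least 1 - 2 * alpha.
    """
    threshold = dkw_epsilon(len(reference), alpha) + dkw_epsilon(len(recent), alpha)
    return sup_ecdf_gap(reference, recent) > threshold


# Made-up complaint counts per interval, for illustration only.
baseline = [float(i % 10) for i in range(100)]
surge = [v + 50.0 for v in baseline]
```

With these made-up windows, `shift_detected(baseline, surge)` evaluates to `True` and `shift_detected(baseline, baseline)` to `False`. The threshold shrinks at rate $1/\sqrt{n}$, so longer windows detect smaller shifts at the cost of slower reaction.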
LG Sep 30, 2025
Beyond Point Estimates: Likelihood-Based Full-Posterior Wireless Localization
Haozhe Lei, Hao Guo, Tommy Svensson et al.
Modern wireless systems require not only position estimates, but also quantified uncertainty to support planning, control, and radio resource management. We formulate localization as posterior inference of an unknown transmitter location from receiver measurements. We propose Monte Carlo Candidate-Likelihood Estimation (MC-CLE), which trains a neural scoring network using Monte Carlo sampling to compare true and candidate transmitter locations. We show that in line-of-sight simulations with a multi-antenna receiver, MC-CLE learns critical properties including angular ambiguity and front-to-back antenna patterns. MC-CLE also achieves lower cross-entropy loss relative to a uniform baseline and Gaussian posteriors.
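To make the comparison against a uniform baseline concrete, the sketch below shows one way a set of candidate scores could be turned into a discrete posterior and evaluated by cross-entropy. It assumes the scores are normalized with a softmax over a finite candidate set in which one entry is the true location. That setup is an illustrative assumption, not a description of the MC-CLE training objective, and all names and scores are hypothetical.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of real-valued scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def cross_entropy(scores, true_index):
    """Negative log-probability assigned to the true candidate."""
    return -log_softmax(scores)[true_index]


def uniform_cross_entropy(num_candidates):
    """Cross-entropy of a posterior that spreads mass evenly: log(N)."""
    return math.log(num_candidates)


# Hypothetical scores for four candidate locations; index 0 is the true one.
scores = [5.0, 0.0, 0.0, 0.0]
loss = cross_entropy(scores, true_index=0)
baseline = uniform_cross_entropy(len(scores))
```

A scorer that ranks the true location above the others yields a loss below $\log N$, while equal scores for all candidates reproduce the uniform baseline exactly.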