MA, Mar 13, 2025
H2-MARL: Multi-Agent Reinforcement Learning for Pareto Optimality in Hospital Capacity Strain and Human Mobility during Epidemic
Xueting Luo, Hao Deng, Jihong Yang et al.
The need to balance the losses caused by restricting human mobility against the protection of hospital capacity has gained significant attention in the aftermath of COVID-19. Reinforcement learning (RL)-based strategies for human mobility management have recently advanced in addressing the dynamic evolution of cities and epidemics; however, they still face challenges in achieving coordinated control at the township level and adapting to cities of varying scales. To address these issues, we propose a multi-agent RL approach that achieves Pareto optimality in managing hospital capacity and human mobility (H2-MARL), applicable across cities of different scales. We first develop a township-level infection model with online-updatable parameters to simulate disease transmission and construct a city-wide dynamic spatiotemporal epidemic simulator. On this basis, H2-MARL treats each division as an agent, uses a dual-objective trade-off reward function, and builds an experience replay buffer enriched with expert knowledge. To evaluate the effectiveness of the model, we construct a township-level human mobility dataset containing over one billion records from four representative cities of varying scales. Extensive experiments demonstrate that H2-MARL achieves the best dual-objective trade-off, minimizing hospital capacity strain while also minimizing the loss from human mobility restrictions. The experiments also verify the applicability of the proposed model to epidemic control in cities of varying scales, which showcases its feasibility and versatility in practical applications.
LG, Feb 25, 2025
PVBF: A Framework for Mitigating Parameter Variation Imbalance in Online Continual Learning
Zelin Tao, Hao Deng, Mingqing Liu et al.
Online continual learning (OCL), which enables AI systems to adaptively learn from non-stationary data streams, is commonly achieved using experience replay (ER)-based methods that retain knowledge by replaying stored past samples during training. However, these methods suffer from prediction bias, which stems from deviations in parameter update directions during task transitions. This paper identifies parameter variation imbalance as a critical factor contributing to prediction bias in ER-based OCL. Specifically, using the proposed parameter variation evaluation method, we highlight two types of imbalance: correlation-induced imbalance, where certain parameters are disproportionately updated across tasks, and layer-wise imbalance, where output layer parameters update faster than those in preceding layers. To mitigate these imbalances, we propose the Parameter Variation Balancing Framework (PVBF), which incorporates: 1) a novel method to compute parameter correlations with previous tasks based on parameter variations, 2) an encourage-and-consolidate (E&C) method that uses parameter correlations to perform gradient adjustments across all parameters during training, and 3) a dual-layer copy weights with reinit (D-CWR) strategy to slowly update output layer parameters for frequently occurring sample categories. Experiments on short and long task sequences demonstrate that PVBF significantly reduces prediction bias and improves OCL performance, achieving up to 47% higher accuracy compared to existing ER-based methods.
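For readers unfamiliar with the ER baseline that the abstract builds on, the following is a minimal sketch of a replay memory, assuming reservoir sampling as the storage rule (a common choice, not something the abstract specifies). It does not implement PVBF, E&C, or D-CWR; the class name `ReplayBuffer` and the toy `stream` are hypothetical and serve only as illustration.

```python
import random


class ReplayBuffer:
    """Fixed-size memory filled by reservoir sampling over a data stream."""

    def __init__(self, capacity: int, seed: int = 0):
        self.capacity = capacity
        self.items = []          # stored past samples
        self.seen = 0            # number of stream samples observed so far
        self.rng = random.Random(seed)

    def add(self, sample) -> None:
        """Keep each observed sample with probability capacity / seen."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = sample

    def sample(self, k: int) -> list:
        """Draw up to k stored samples without replacement."""
        return self.rng.sample(self.items, min(k, len(self.items)))


# Toy non-stationary stream: task_a samples first, then task_b (assumed data).
stream = [("task_a", i) for i in range(6)] + [("task_b", i) for i in range(6)]
buffer = ReplayBuffer(capacity=4)

for incoming in stream:
    replayed = buffer.sample(2)
    # A model update would consume the new sample together with replayed ones.
    training_batch = [incoming] + replayed
    buffer.add(incoming)
```

Mixing each incoming sample with replayed ones is what lets an ER method revisit earlier tasks; the parameter variation imbalance described above arises in the updates driven by such mixed batches.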
CV, Apr 19, 2020
Lightweight Mask R-CNN for Long-Range Wireless Power Transfer Systems
Hao Li, Aozhou Wu, Wen Fang et al.
Resonant Beam Charging (RBC) is a wireless charging technology that supports multi-watt power transfer over meter-level distances. Its safety, mobility, and simultaneous charging capability enable RBC to charge multiple mobile devices safely at the same time. To detect the devices that need to be charged, a Mask R-CNN based detection model was proposed in previous work. However, given the constraints of the RBC system, Mask R-CNN is hard to deploy on lightweight hardware-embedded devices because of its large model size and heavy computation. Thus, we propose a machine learning detection approach that provides a lighter and faster model based on the traditional Mask R-CNN. The proposed approach makes object detection much easier to port to mobile devices and reduces the hardware computation burden. By adjusting the structure of the backbone and the head of Mask R-CNN, we reduce the average detection time from $1.02\mbox{s}$ per image to $0.6132\mbox{s}$, and reduce the model size from $245\mbox{MB}$ to $47.1\mbox{MB}$. The improved model is much better suited to application in the RBC system.
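To put the reported figures in relative terms, the short calculation below derives the percentage reductions and ratios from the four numbers quoted in the abstract. The helper name `relative_reduction` is introduced here for illustration only.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


# Figures quoted in the abstract.
time_before_s, time_after_s = 1.02, 0.6132
size_before_mb, size_after_mb = 245.0, 47.1

time_cut = relative_reduction(time_before_s, time_after_s)
size_cut = relative_reduction(size_before_mb, size_after_mb)
speedup = time_before_s / time_after_s
shrink = size_before_mb / size_after_mb

print(f"detection time: {time_cut:.1%} lower ({speedup:.2f}x faster)")
print(f"model size: {size_cut:.1%} smaller ({shrink:.2f}x smaller)")
# detection time: 39.9% lower (1.66x faster)
# model size: 80.8% smaller (5.20x smaller)
```

In other words, the lightweight model cuts per-image detection time by roughly 40% and model size by roughly 81%.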
SY, Sep 25, 2018
Optimal Resonant Beam Charging for Electronic Vehicles in Internet of Intelligent Vehicles
Qingqing Zhang, Mingqing Liu, Xing Lin et al.
To enable electric vehicles (EVs) to access the internet of intelligent vehicles (IoIV), charging EVs wirelessly anytime and anywhere has become an urgent need. Resonant beam charging (RBC) technology can provide high-power, long-range wireless energy for EVs. However, the RBC system is inefficient. To improve the RBC power transmission efficiency, the adaptive resonant beam charging (ARBC) technology was introduced. In this paper, after analyzing the modular model of the ARBC system, we obtain the closed-form formula of the end-to-end power transmission efficiency. Then, we prove that the optimal power transmission efficiency uniquely exists. Moreover, we analyze the relationships among the optimal power transmission efficiency, the source power, the output power, and the beam transmission efficiency, which provide guidelines for optimal ARBC system design and implementation. Hence, virtually perpetual energy can be supplied to EVs in the IoIV.
SY, Sep 25, 2018
Earning Maximization with Quality of Charging Service Guarantee for IoT Devices
Wen Fang, Qingqing Zhang, Mingqing Liu et al.
Resonant Beam Charging (RBC) is a promising Wireless Power Transfer (WPT) technology to provide long-range, high-power, mobile and safe wireless power for the Internet of Things (IoT) devices. The Point-to-Multipoint (PtMP) RBC system can charge multiple receivers simultaneously similar to WiFi communications. To guarantee the Quality of Charging Service (QoCS) for each receiver and maximize the overall earning in the PtMP RBC service, we specify the Charging Pricing Strategy (CPS) and develop the High Priority Charge (HPC) scheduling algorithm to control the charging order and power allocation. Each receiver is assigned a priority, which is updated dynamically based on its State of Charging (SOC) and specified charging power. The receivers with high priorities are scheduled to be charged in each time slot. We present the pseudo code of the HPC algorithm based on quantifying the receiver's SOC, discharging energy and various relevant parameters. Relying on simulation analysis, we demonstrate that the HPC algorithm can achieve better QoCS and earning than the Round-Robin Charge (RRC) scheduling algorithm. Based on the performance evaluation, we illustrate that the methods to improve the PtMP RBC service are: 1) limiting the receiver number within a reasonable range and 2) prolonging the charging duration as long as possible. In summary, the HPC scheduling algorithm provides a practical strategy to maximize the earning of the PtMP RBC service with each receiver's QoCS guarantee.
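The per-slot scheduling idea described above can be sketched as follows. This is an illustrative sketch only, not the paper's HPC pseudo code: the abstract does not give the priority formula, so the rule in `priority` (emptier batteries and larger power requests rank higher), the `Receiver` fields, and the example devices are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    name: str
    soc: float              # state of charge, in [0, 1]
    requested_power: float  # specified charging power, in watts
    capacity_wh: float      # battery capacity, in watt-hours


def priority(r: Receiver) -> float:
    # Hypothetical rule (not from the paper): lower SOC and a larger
    # power request both raise the priority.
    return (1.0 - r.soc) * r.requested_power


def schedule_slot(receivers, max_served: int, slot_hours: float):
    """Charge the `max_served` highest-priority receivers for one time slot."""
    ranked = sorted(receivers, key=priority, reverse=True)
    served = [r for r in ranked[:max_served] if r.soc < 1.0]
    for r in served:
        gained = r.requested_power * slot_hours / r.capacity_wh
        r.soc = min(1.0, r.soc + gained)
    return [r.name for r in served]


# Assumed example devices; priorities are recomputed from the SOC each slot.
receivers = [
    Receiver("phone", soc=0.2, requested_power=5.0, capacity_wh=10.0),
    Receiver("watch", soc=0.9, requested_power=1.0, capacity_wh=1.0),
    Receiver("tablet", soc=0.5, requested_power=10.0, capacity_wh=30.0),
]
first_slot = schedule_slot(receivers, max_served=2, slot_hours=0.5)
print(first_slot)  # ['tablet', 'phone']
```

Because the priority is recomputed from the current SOC at every slot, a receiver that has just been charged drops in rank, which is the dynamic update the abstract refers to. A round-robin scheduler, by contrast, would cycle through receivers regardless of their SOC.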