Junsuo Qu

2 Papers

SY, Mar 9

Joint Trajectory, RIS, and Computation Offloading Optimization via Decentralized Model-Based PPO in Urban Multi-UAV Mobile Edge Computing

Liangshun Wu, Jianbo Du, Junsuo Qu

Efficient computation offloading in multi-UAV edge networks is particularly challenging in dense urban areas, where line-of-sight (LoS) links are frequently blocked and user demand varies rapidly. Reconfigurable intelligent surfaces (RISs) can mitigate blockage by creating controllable reflected links, but realizing their potential requires tightly coupled decisions on UAV trajectories, offloading schedules, and RIS phase configurations. This joint optimization is hard to solve in practice because multiple UAVs must coordinate under limited information exchange, and purely model-free multi-agent reinforcement learning (MARL) often learns too slowly in highly dynamic environments. To address these challenges, we propose a decentralized model-based MARL framework. Each UAV optimizes mobility and offloading using observations from neighbors within several hops, and submits an RIS phase proposal that is aggregated by a lightweight RIS controller. To boost sample efficiency and stability, agents learn local dynamics models and perform short-horizon branched rollouts for proximal policy optimization (PPO) updates. Simulations show near-centralized performance with improved throughput and energy efficiency at scale.
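The abstract does not say how the RIS controller aggregates the per-UAV phase proposals. As a minimal sketch of one plausible rule, the Python below combines proposals with a weighted circular mean per RIS element, which avoids the wrap-around problem of averaging angles directly. The function name `aggregate_phase_proposals`, the `weights` parameter, and the choice of a circular mean are all assumptions made for illustration, not the paper's method.

```python
import cmath
import math


def aggregate_phase_proposals(proposals, weights=None):
    """Combine per-UAV RIS phase proposals with a weighted circular mean.

    proposals[k][n] is UAV k's proposed phase shift (radians) for RIS element n.
    weights is an optional list of non-negative per-UAV weights (hypothetical).
    Returns one phase per element, wrapped to [0, 2*pi).
    """
    if not proposals:
        raise ValueError("need at least one proposal")
    num_elements = len(proposals[0])
    if any(len(p) != num_elements for p in proposals):
        raise ValueError("all proposals must cover the same RIS elements")
    if weights is None:
        weights = [1.0] * len(proposals)

    aggregated = []
    for n in range(num_elements):
        # Sum unit phasors so that phases near 0 and 2*pi average correctly.
        resultant = sum(w * cmath.exp(1j * p[n]) for w, p in zip(weights, proposals))
        aggregated.append(cmath.phase(resultant) % (2 * math.pi))
    return aggregated
```

For example, `aggregate_phase_proposals([[0.1, 3.0], [0.3, 3.2]])` returns phases close to 0.2 and 3.1 radians. A real controller would likely also quantize the result to the discrete phase levels the RIS hardware supports.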

NI, Mar 23, 2021

Fully-echoed Q-routing with Simulated Annealing Inference for Flying Adhoc Networks

Arnau Rovira-Sugranes, Fatemeh Afghah, Junsuo Qu et al.

Current networking protocols are deemed inefficient in addressing the two key challenges of Unmanned Aerial Vehicle (UAV) networks, namely loss of network connectivity and energy limitations. One approach to these issues is to use learning-based routing protocols in which network nodes make close-to-optimal local decisions; Q-routing is a prominent example of such protocols. However, the performance of current Q-routing implementations is not yet satisfactory, mainly because they adapt poorly to continual topology changes. In this paper, we propose a full-echo Q-routing algorithm with a self-adaptive learning rate that uses Simulated Annealing (SA) optimization to control the exploration rate of the algorithm through the temperature decline rate, which in turn is regulated by the experienced variation rate of the Q-values. Our results show that our method adapts to network dynamics without the need for manual re-initialization at transition points (abrupt network topology changes). Our method reduces energy consumption by between 7% and 82% and achieves a 2.6-fold gain in successful packet delivery rate compared to state-of-the-art Q-routing protocols.
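The interplay described above (full-echo updates, SA-style exploration, and a cooling rate tied to how much the Q-values are changing) can be illustrated with a small Python sketch. It models a single node's table toward a single destination. The class `SAQRouter`, its parameters, and the specific cooling formula are hypothetical choices for illustration; the paper's exact update and temperature rules, including its self-adaptive learning rate, are not reproduced here.

```python
import math
import random


class SAQRouter:
    """One node's Q-table toward a single destination (illustrative only)."""

    def __init__(self, neighbors, learning_rate=0.5, temperature=1.0):
        # q[n] estimates the delivery time when forwarding via neighbor n.
        self.q = {n: 0.0 for n in neighbors}
        self.learning_rate = learning_rate  # fixed here; the paper adapts it
        self.temperature = temperature

    def full_echo_update(self, link_delay, neighbor_estimate):
        """Update the Q-value of every neighbor from its echoed estimate.

        Returns the mean absolute change, used as a Q-value variation signal.
        """
        total_change = 0.0
        for n in self.q:
            target = link_delay[n] + neighbor_estimate[n]
            delta = self.learning_rate * (target - self.q[n])
            self.q[n] += delta
            total_change += abs(delta)
        return total_change / len(self.q)

    def cool(self, variation, base_decay=0.9, sensitivity=1.0):
        """Cool more slowly while Q-values are still changing (assumed rule)."""
        decay = 1.0 - (1.0 - base_decay) / (1.0 + sensitivity * variation)
        self.temperature = max(self.temperature * decay, 1e-6)

    def choose_next_hop(self, rng=random):
        """Metropolis-style choice between the best neighbor and a random one."""
        best = min(self.q, key=self.q.get)
        candidate = rng.choice(list(self.q))
        extra_cost = self.q[candidate] - self.q[best]
        if extra_cost <= 0 or rng.random() < math.exp(-extra_cost / self.temperature):
            return candidate
        return best
```

In this sketch a high temperature lets the node try worse-looking neighbors often, which helps after a topology change, while a low temperature makes it settle on the current best route. Because large Q-value changes slow the cooling, exploration persists while the network is in flux without any manual reset.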