LGFeb 15, 2024

Non-orthogonal Age-Optimal Information Dissemination in Vehicular Networks: A Meta Multi-Objective Reinforcement Learning Approach

arXiv:2402.12260v16.48 citationsh-index: 10IEEE Trans Mob Comput

Originality Incremental advance

AI Analysis

It addresses timely and efficient communication for vehicles, but is incremental as it builds on existing reinforcement learning methods.

This paper tackles minimizing age-of-information and transmit power in vehicular networks using non-orthogonal transmission, achieving a high-quality Pareto frontier with reduced training time compared to benchmarks.

This paper considers minimizing the age-of-information (AoI) and transmit power consumption in a vehicular network, where a roadside unit (RSU) provides timely updates about a set of physical processes to vehicles. We consider non-orthogonal multi-modal information dissemination, which is based on superposed message transmission from RSU and successive interference cancellation (SIC) at vehicles. The formulated problem is a multi-objective mixed-integer nonlinear programming problem; thus, a Pareto-optimal front is very challenging to obtain. First, we leverage the weighted-sum approach to decompose the multi-objective problem into a set of multiple single-objective sub-problems corresponding to each predefined objective preference weight. Then, we develop a hybrid deep Q-network (DQN)-deep deterministic policy gradient (DDPG) model to solve each optimization sub-problem respective to predefined objective-preference weight. The DQN optimizes the decoding order, while the DDPG solves the continuous power allocation. The model needs to be retrained for each sub-problem. We then present a two-stage meta-multi-objective reinforcement learning solution to estimate the Pareto front with a few fine-tuning update steps without retraining the model for each sub-problem. Simulation results illustrate the efficacy of the proposed solutions compared to the existing benchmarks and that the meta-multi-objective reinforcement learning model estimates a high-quality Pareto frontier with reduced training time.

View on arXiv PDF

Similar