LG NI SP | Feb 12, 2024

Leveraging Digital Cousins for Ensemble Q-Learning in Large-Scale Wireless Networks

arXiv:2402.08022v19.210 citations | h-index: 5 | Has Code | IEEE Transactions on Signal Processing

Originality: Highly original

AI Analysis

This work addresses performance and complexity issues in wireless network optimization, offering a domain-specific improvement for network management.

The paper tackles the optimization of large-scale wireless networks with a novel ensemble Q-learning algorithm. It introduces digital cousins: multiple synthetic Markovian environments on which Q-learning algorithms run in parallel, with their outputs fused into a single Q-function. The authors report up to 50% lower average policy error and up to 40% lower runtime complexity than state-of-the-art reinforcement learning methods.

Optimizing large-scale wireless networks, including optimal resource management, power allocation, and throughput maximization, is inherently challenging due to their non-observable system dynamics and heterogeneous and complex nature. Herein, a novel ensemble Q-learning algorithm that addresses the performance and complexity challenges of the traditional Q-learning algorithm for optimizing wireless networks is presented. Ensemble learning with synthetic Markov Decision Processes is tailored to wireless networks via new models for approximating large state-space observable wireless networks. In particular, digital cousins are proposed as an extension of the traditional digital twin concept wherein multiple Q-learning algorithms on multiple synthetic Markovian environments are run in parallel and their outputs are fused into a single Q-function. Convergence analyses of key statistics and Q-functions and derivations of upper bounds on the estimation bias and variance are provided. Numerical results across a variety of real-world wireless networks show that the proposed algorithm can achieve up to 50% less average policy error with up to 40% less runtime complexity than the state-of-the-art reinforcement learning algorithms. It is also shown that theoretical results properly predict trends in the experimental results.
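The core idea in the abstract (several Q-learning runs on several synthetic environments, fused into one Q-function) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy random MDP, the `perturb` helper that builds each synthetic environment by mixing the transition rows with a uniform distribution, the hyperparameters, and the uniform fusion weights. The paper's own environment construction and fusion rule may differ. The runs are independent, so they could be executed in parallel; the sketch runs them sequentially for simplicity.

```python
import random

def make_mdp(n_states, n_actions, rng):
    """Toy random MDP (assumption): P[s][a] is a next-state distribution, R[s][a] a reward in [0, 1)."""
    P, R = [], []
    for _ in range(n_states):
        P_s, R_s = [], []
        for _ in range(n_actions):
            w = [rng.random() for _ in range(n_states)]
            total = sum(w)
            P_s.append([x / total for x in w])
            R_s.append(rng.random())
        P.append(P_s)
        R.append(R_s)
    return P, R

def perturb(P, eps):
    """Hypothetical synthetic environment: mix every transition row with the uniform distribution."""
    n = len(P)
    return [[[(1 - eps) * p + eps / n for p in row] for row in P_s] for P_s in P]

def q_learning(P, R, rng, steps=20000, alpha=0.1, gamma=0.9, explore=0.2):
    """Standard tabular Q-learning with epsilon-greedy exploration on one environment."""
    n_s, n_a = len(P), len(P[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    s = rng.randrange(n_s)
    for _ in range(steps):
        if rng.random() < explore:
            a = rng.randrange(n_a)
        else:
            a = max(range(n_a), key=lambda i: Q[s][i])
        s_next = rng.choices(range(n_s), weights=P[s][a])[0]
        target = R[s][a] + gamma * max(Q[s_next])
        Q[s][a] += alpha * (target - Q[s][a])
        s = s_next
    return Q

def fuse(q_tables, weights):
    """Fuse several Q-tables into one by a weighted average (weights are an assumption)."""
    n_s, n_a = len(q_tables[0]), len(q_tables[0][0])
    return [[sum(w * Q[s][a] for w, Q in zip(weights, q_tables))
             for a in range(n_a)] for s in range(n_s)]

def ensemble_q(n_envs=4, seed=0):
    """Run Q-learning on the base MDP plus perturbed copies, then fuse the results."""
    rng = random.Random(seed)
    P, R = make_mdp(5, 2, rng)
    envs = [perturb(P, 0.05 * k) for k in range(n_envs)]  # k = 0 is the unperturbed MDP
    q_tables = [q_learning(P_k, R, random.Random(seed + k + 1))
                for k, P_k in enumerate(envs)]
    weights = [1.0 / n_envs] * n_envs
    return fuse(q_tables, weights)

if __name__ == "__main__":
    Q = ensemble_q()
    policy = [max(range(len(row)), key=lambda a: row[a]) for row in Q]
    print("greedy policy from fused Q-function:", policy)
```

Averaging independently learned Q-tables is one simple way to reduce the variance of a single run; a weighting scheme that favors more reliable environments would be a natural refinement of this sketch.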
