IT ITMar 16

Timely Best Arm Identification in Restless Shared Networks

Mengqiu Zhou, Vincent Y. F. Tan, Meng Zhang

arXiv:2603.1507818.7h-index: 3

AI Analysis

This work addresses the challenge of ensuring timely data freshness in edge computing networks for real-time applications, representing an incremental advance by applying restless bandit models to age of information optimization.

The paper tackles the problem of identifying the optimal edge node for real-time status updates under uncertain and time-varying dynamics, framed as a best arm identification problem in a restless multi-armed bandit model to minimize long-term average age of information. The result shows that the proposed age-aware LUCB algorithm significantly reduces sampling costs compared to existing benchmarks, with numerical results indicating particular effectiveness under stringent confidence levels.

Real-time status updating applications increasingly rely on networks of devices and edge nodes to maintain data freshness, as quantified by the age of information (AoI) metric. Given that edge computing nodes exhibit uncertain and time-varying dynamics, it is essential to identify the optimal edge node with high confidence and sample efficiency, even without prior knowledge of these dynamics, to ensure timely updates. To address this challenge, we introduce the first best arm identification (BAI) problem aimed at minimizing the long-term average AoI under a fixed confidence setting, framed within the context of a restless multi-armed bandit (RMAB) model. In this model, each arm evolves independently according to an unknown Markov chain over time, regardless of whether it is selected. To capture the temporal trajectories of AoI in the presence of unknown restless dynamics, we develop an age-aware LUCB algorithm that incorporates Markovian sampling. Additionally, we establish an instance-dependent upper bound on the sample complexity, which captures the difficulty of the problem as a function of the underlying Markov mixing behavior. Moreover, we derive an information-theoretic lower bound to characterize the fundamental challenges of the problem. We show that the sample complexity is influenced by the temporal correlation of the Markov dynamics, aligning with the intuition offered by the upper bound. Our numerical results show that, compared to existing benchmarks, the proposed scheme significantly reduces sampling costs, particularly under more stringent confidence levels.

View on arXiv PDF

Similar