Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks
This work addresses decentralized estimation in dynamic wireless networks, offering incremental improvements in policy optimization across structurally similar graphs.
The paper tackles real-time sampling and estimation of autoregressive Markovian sources in multi-hop wireless networks by proposing a graphical multi-agent reinforcement learning framework for decentralized policy optimization, achieving performance gains over state-of-the-art baselines with transferability to larger networks.
We address real-time sampling and estimation of autoregressive Markovian sources in dynamic yet structurally similar multi-hop wireless networks. Each node caches samples from others and communicates over wireless collision channels, aiming to minimize time-average estimation error via decentralized policies. Due to the high dimensionality of the action spaces and the complexity of the network topologies, deriving optimal policies analytically is intractable. To address this, we propose a graphical multi-agent reinforcement learning framework for policy optimization. Theoretically, we demonstrate that our proposed policies are transferable, allowing a policy trained on one graph to be effectively applied to structurally similar graphs. Numerical experiments demonstrate that (i) our proposed policy outperforms state-of-the-art baselines; (ii) the trained policies are transferable to larger networks, with performance gains increasing with the number of agents; (iii) the graphical training procedure withstands non-stationarity, even when using independent learning techniques; and (iv) recurrence is pivotal in both independent learning and centralized training with decentralized execution, and improves resilience to non-stationarity.
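To make the objective concrete, the following is a minimal sketch of the time-average estimation error for a single source and a single remote estimator. It is not the paper's model or method. The scalar AR(1) form, the Bernoulli delivery probability `p_update` (standing in for a collision-free transmission), the estimator that propagates its last sample through the known dynamics, and all parameter values are illustrative assumptions.

```python
import random


def simulate_estimation_error(a=0.9, noise_std=1.0, p_update=0.3,
                              steps=10_000, seed=0):
    """Time-average squared error of a remote estimate of an AR(1) source.

    Assumed (hypothetical) setup: x[t+1] = a * x[t] + w[t], with
    w[t] ~ N(0, noise_std^2). In each slot a fresh sample reaches the
    estimator with probability p_update; otherwise the estimator
    propagates its previous estimate through the known dynamics.
    """
    rng = random.Random(seed)
    x = 0.0        # true source state
    x_hat = 0.0    # estimate held at the remote node
    total_sq_err = 0.0
    for _ in range(steps):
        # Source evolves by one step of the AR(1) recursion.
        x = a * x + rng.gauss(0.0, noise_std)
        if rng.random() < p_update:
            x_hat = x            # fresh sample delivered this slot
        else:
            x_hat = a * x_hat    # no delivery: predict from the stale sample
        total_sq_err += (x - x_hat) ** 2
    return total_sq_err / steps


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8):
        print(p, simulate_estimation_error(p_update=p))
```

Under these assumptions the error shrinks as deliveries become more frequent. In the multi-hop setting described above, the delivery pattern is not a fixed probability but the outcome of the nodes' decentralized sampling and transmission policies, which is what the learning framework optimizes.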