Deep Reinforcement Learning for IRS Phase Shift Design in Spatiotemporally Correlated Environments
This work addresses the challenge of optimizing IRS configurations for mobile receivers in correlated channels. The contribution is incremental: it builds on prior deep reinforcement learning methods for IRS design.
The paper tackles the design of Intelligent Reflecting Surface (IRS) phase shifts in spatiotemporally correlated MISO communication environments, with the goal of maximizing the expected sum of SNRs over an infinite time horizon. It proposes a deep actor-critic algorithm whose critic input is preprocessed with a Fourier kernel, which enables stable value learning, and it shows empirically that including the SNR in the state representation inhibits convergence in such environments.
We study the problem of designing the Intelligent Reflecting Surface (IRS) phase shifts for Multiple-Input Single-Output (MISO) communication systems in spatiotemporally correlated channel environments, where the destination can move within a confined area. The objective is to maximize the expected sum of SNRs at the receiver over an infinite time horizon. This formulation gives rise to a Markov Decision Process (MDP). We propose a deep actor-critic algorithm that accounts for channel correlations and destination motion by constructing a state representation that includes the current position of the receiver together with the phase shift values and receiver positions from a window of previous time steps. The channel variability induces high-frequency components in the spectrum of the underlying value function. To handle them, we preprocess the critic's input with a Fourier kernel, which enables stable value learning. Finally, we investigate the use of the destination SNR as a component of the MDP state, which is common practice in previous work. We provide empirical evidence that, when the channels are spatiotemporally correlated, including the SNR in the state representation interacts with function approximation in ways that inhibit convergence.
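The state construction and the critic preprocessing described above can be illustrated with a minimal sketch. The abstract does not give the exact form of the Fourier kernel, so this sketch assumes a random Gaussian Fourier feature mapping, x -> [cos(2*pi*Bx), sin(2*pi*Bx)]. It also assumes that the critic receives the state concatenated with the action. All names and sizes (`build_state`, `make_fourier_kernel`, `fourier_features`, `NUM_ELEMENTS`, `WINDOW`, `NUM_FEATURES`) are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import deque


def build_state(position, phase_window, position_window):
    """Concatenate the current receiver position with a window of past
    phase shift vectors and past receiver positions.

    The SNR is deliberately left out of the state, in line with the
    abstract's observation about convergence.
    """
    state = list(position)
    for phases, past_position in zip(phase_window, position_window):
        state.extend(phases)
        state.extend(past_position)
    return state


def make_fourier_kernel(input_dim, num_features, scale, seed=0):
    """Sample a fixed projection matrix B with Gaussian entries.

    Assumption: a random Gaussian kernel; the paper may use another form.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(input_dim)]
            for _ in range(num_features)]


def fourier_features(x, kernel):
    """Map x to [cos(2*pi*Bx), sin(2*pi*Bx)]."""
    projections = [2.0 * math.pi * sum(b * xi for b, xi in zip(row, x))
                   for row in kernel]
    return ([math.cos(p) for p in projections]
            + [math.sin(p) for p in projections])


# Hypothetical problem sizes, not taken from the paper.
NUM_ELEMENTS = 4    # number of IRS elements
WINDOW = 3          # number of previous time steps kept in the state
NUM_FEATURES = 16   # number of rows in the Fourier kernel

rng = random.Random(1)
phase_window = deque(maxlen=WINDOW)
position_window = deque(maxlen=WINDOW)
for _ in range(WINDOW):
    # Placeholder history: random phases in [0, 2*pi) and 2-D positions.
    phase_window.append(
        [rng.uniform(0.0, 2.0 * math.pi) for _ in range(NUM_ELEMENTS)])
    position_window.append([rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)])

current_position = [0.4, 0.7]
state = build_state(current_position, phase_window, position_window)

# Placeholder action: the phase shifts chosen for the next time step.
action = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(NUM_ELEMENTS)]

# The critic sees Fourier features of (state, action), not the raw vector.
raw_critic_input = state + action
kernel = make_fourier_kernel(len(raw_critic_input), NUM_FEATURES, scale=1.0)
critic_input = fourier_features(raw_critic_input, kernel)
```

In this sketch the kernel is sampled once and then held fixed, so the critic network only has to fit a function of bounded sinusoidal features. The `scale` parameter controls how high the frequencies in those features are and would need tuning for a given channel model.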