Distributed Noncoherent Joint Transmission Based on Multi-Agent Reinforcement Learning for Dense Small Cell MISO Systems
This work addresses implementation challenges in dense small cell networks for improved capacity, though it is incremental as it builds on existing noncoherent joint transmission techniques.
The paper tackles the nonconvex and NP-hard sum rate maximization problem in dense small cell networks using noncoherent joint transmission by proving and exploiting an optimal beamforming structure, and proposes a distributed multi-agent reinforcement learning scheme that achieves comparable performance with lower complexity and overhead than centralized methods.
We consider a dense small cell (DSC) network where multi-antenna small cell base stations (SBSs) transmit data to single-antenna users over a shared frequency band. To enhance capacity, a state-of-the-art technique known as noncoherent joint transmission (JT) is applied, enabling users to receive data from multiple coordinated SBSs. However, the sum rate maximization problem with noncoherent JT is inherently nonconvex and NP-hard. While existing optimization-based noncoherent JT algorithms can provide near-optimal performance, they require global channel state information (CSI) and multiple iterations, which makes them difficult to be implemeted in DSC networks.To overcome these challenges, we first prove that the optimal beamforming structure is the same for both the power minimization problem and the sum rate maximization problem, and then mathematically derive the optimal beamforming structure for both problems by solving the power minimization problem.The optimal beamforming structure can effectively reduces the variable dimensions.By exploiting the optimal beamforming structure, we propose a deep deterministic policy gradient-based distributed noncoherent JT scheme to maximize the system sum rate.In the proposed scheme, each SBS utilizes global information for training and uses local CSI to determine beamforming vectors. Simulation results demonstrate that the proposed scheme achieves comparable performance with considerably lower computational complexity and information overhead compared to centralized iterative optimization-based techniques, making it more attractive for practical deployment.