Joint Power Allocation and Beamformer for mmW-NOMA Downlink Systems by Deep Reinforcement Learning
This work addresses data-rate demands in next-generation wireless communication, though it appears incremental because it applies an existing deep reinforcement learning (DRL) method to a specific optimization problem.
The paper tackles joint power allocation and beamforming in millimeter-wave NOMA downlink systems with the goal of maximizing the user sum-rate. In simulations, it reports a higher sum-rate than TDMA and another optimized NOMA strategy.
The high data-rate demand of next-generation wireless communication can be met by the Non-Orthogonal Multiple Access (NOMA) approach in the millimetre-wave (mmW) frequency band. Joint power allocation and beamforming is essential in mmW-NOMA systems and can be addressed with optimization approaches. To this end, we exploit a Deep Reinforcement Learning (DRL) approach, because it learns a policy that leads to an optimized sum-rate of the users. An actor-critic scheme is used to evaluate the immediate reward and to provide the new action that maximizes the overall Q-value of the network. The immediate reward is defined as the sum of the rates of the two users, with the minimum guaranteed rate for each user and the total consumed power as constraints. The simulation results show that the proposed approach outperforms Time-Division Multiple Access (TDMA) and another optimized NOMA strategy in terms of the sum-rate of the users.
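The immediate reward described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption: the textbook two-user downlink NOMA rate expressions with successive interference cancellation (SIC) at the strong user, effective channel gains `g1` and `g2` that already include the beamforming gain, and a reward of zero whenever a constraint is violated. The names `noma_rates`, `immediate_reward`, `r_min`, and `p_max` are hypothetical and are not taken from the paper.

```python
import math


def noma_rates(p1, p2, g1, g2, noise=1.0):
    """Rates in bit/s/Hz for a two-user downlink NOMA pair.

    Assumption: user 1 is the strong user (g1 >= g2) and removes
    user 2's signal via SIC before decoding its own signal.
    g1 and g2 are effective gains (channel plus beamforming).
    """
    # Strong user: interference from user 2 is cancelled by SIC.
    r1 = math.log2(1.0 + p1 * g1 / noise)
    # Weak user: treats user 1's signal as additional noise.
    r2 = math.log2(1.0 + p2 * g2 / (p1 * g2 + noise))
    return r1, r2


def immediate_reward(p1, p2, g1, g2, r_min, p_max, noise=1.0):
    """Sum-rate if all constraints hold, otherwise 0.0.

    Assumption: constraint violations are handled by a zero reward;
    the paper may use a different penalty.
    """
    r1, r2 = noma_rates(p1, p2, g1, g2, noise)
    feasible = (r1 >= r_min) and (r2 >= r_min) and (p1 + p2 <= p_max)
    return r1 + r2 if feasible else 0.0


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    print(immediate_reward(p1=1.0, p2=3.0, g1=4.0, g2=1.0,
                           r_min=0.5, p_max=4.0))
```

In a DRL loop, an actor would propose the powers (and the beamforming weights that determine `g1` and `g2`), and a function of this kind would supply the scalar reward that the critic uses to estimate the Q-value.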