LG May 27, 2022
Deep Reinforcement Learning for Distributed and Uncoordinated Cognitive Radios Resource Allocation
Ankita Tondwalkar, Andres Kwasinski
This paper presents a novel deep reinforcement learning-based resource allocation technique for the multi-agent environment of a cognitive radio network, where the interactions of the agents during learning may lead to a non-stationary environment. The technique is distributed and does not require coordination with other agents. By considering aspects specific to deep reinforcement learning, it is shown that the presented algorithm converges in an arbitrarily long time to equilibrium policies in the non-stationary multi-agent environment that results from the uncoordinated dynamic interaction between radios through the shared wireless environment. Simulation results show that the presented technique learns faster than an equivalent table-based Q-learning algorithm and finds the optimal policy in 99% of cases for a sufficiently long learning time. In addition, simulations show that our deep Q-learning (DQL) approach requires less than half the number of learning steps to achieve the same performance as an equivalent table-based implementation. Moreover, it is shown that a standard single-agent deep reinforcement learning approach may not achieve convergence when used in an uncoordinated interacting multi-radio scenario.
NI Jun 28, 2018
Neural Network Cognitive Engine for Autonomous and Distributed Underlay Dynamic Spectrum Access
Fatemeh Shah-Mohammadi, Andres Kwasinski
Two key challenges in underlay dynamic spectrum access (DSA) are how to establish an interference limit from the primary network (PN) and how cognitive radios (CRs) in the secondary network (SN) become aware of the interference they create on the PN, especially when there is no exchange of information between the two networks. This paper addresses these challenges by presenting a fully autonomous and distributed underlay DSA scheme in which each CR operates by predicting the effect of its transmission on the PN. The scheme is based on a cognitive engine with an artificial neural network that predicts, without exchanging information between the networks, the adaptive modulation and coding (AMC) configuration of the primary link nearest to a transmitting CR. By managing the effect of the SN on the PN, the presented technique keeps the relative average throughput change in the PN within a prescribed maximum value, while also finding transmit settings for the CRs that yield a throughput as large as the PN interference limit allows. Simulation results show that the ability of the cognitive engine to estimate the effect of a CR transmission on the full AMC mode leads to much finer underlay transmit power control. This ability also provides more transmission opportunities for the CRs than a scheme that can only estimate the modulation scheme used at the PN link.