SY IT LG SP | Aug 5, 2020

Learning Power Control from a Fixed Batch of Data

Mohammad G. Khoshkholgh, Halim Yanikomeroglu

arXiv:2008.02669v11.22 citations | h-index: 62

Originality: Incremental advance

AI Analysis

This paper addresses power control in wireless networks and offers a data-efficient solution, but the advance is incremental because it builds on existing offline deep reinforcement learning methods.

The paper tackles the problem of learning power control policies for an unexplored environment using only a fixed batch of data from a monitored environment. It reports that the agent learns quickly even when the two environments have dissimilar objectives, and that only about one third of the batch needs to be high-quality data.

We address how to exploit power control data, gathered from a monitored environment, for performing power control in an unexplored environment. We adopt offline deep reinforcement learning, whereby the agent learns the policy to produce the transmission powers solely by using the data. Experiments demonstrate that despite discrepancies between the monitored and unexplored environments, the agent successfully learns the power control very quickly, even if the objective functions in the monitored and unexplored environments are dissimilar. It suffices for about one third of the collected data to be of high quality; the rest can come from any sub-optimal algorithm.
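To make the idea of learning "solely by using the data" concrete, here is a minimal sketch of offline learning from a fixed batch. It is not the paper's method: tabular batch Q-learning stands in for the deep offline reinforcement learning the abstract describes, and the environment is a hypothetical toy. The power levels, channel gains, noise level, power price, and the deterministic cycling of the channel state are all assumptions chosen for illustration. The batch mixes roughly one third transitions logged from a good policy with two thirds logged from random power choices, loosely mirroring the data mix mentioned in the abstract.

```python
import math
import random

# All numbers below are hypothetical and chosen only for illustration.
POWERS = [0.1, 0.5, 1.0, 1.5]   # discrete transmit power levels
GAINS = [0.5, 2.0, 10.0]        # channel gain in the bad / medium / good state
NOISE, PRICE, GAMMA = 1.0, 1.0, 0.9

def reward(state, action):
    """Toy objective: spectral efficiency minus a linear price on power."""
    p = POWERS[action]
    return math.log2(1.0 + GAINS[state] * p / NOISE) - PRICE * p

def best_action(state):
    """Greedy power choice; plays the role of the high-quality logging policy."""
    return max(range(len(POWERS)), key=lambda a: reward(state, a))

def collect_batch(n, good_fraction, rng):
    """Log n transitions from the monitored toy environment."""
    batch, state = [], 0
    for _ in range(n):
        if rng.random() < good_fraction:
            action = best_action(state)            # high-quality sample
        else:
            action = rng.randrange(len(POWERS))    # sub-optimal (random) sample
        next_state = (state + 1) % len(GAINS)      # assumed cyclic channel
        batch.append((state, action, reward(state, action), next_state))
        state = next_state
    return batch

def fit_q(batch, sweeps=200, lr=0.1):
    """Batch Q-learning: repeated sweeps over the fixed data, no new interaction."""
    q = [[0.0] * len(POWERS) for _ in GAINS]
    for _ in range(sweeps):
        for s, a, r, s2 in batch:
            target = r + GAMMA * max(q[s2])
            q[s][a] += lr * (target - q[s][a])
    return q

rng = random.Random(0)
batch = collect_batch(3000, good_fraction=1 / 3, rng=rng)
q = fit_q(batch)

# Greedy policy read off the learned Q-table, one power index per channel state.
policy = [max(range(len(POWERS)), key=lambda a: q[s][a]) for s in range(len(GAINS))]
print([POWERS[a] for a in policy])
```

Because the toy channel and rewards are deterministic and every state-action pair appears in the batch, the learned greedy policy should coincide with `best_action` in each state. Real offline reinforcement learning has to cope with limited coverage and distribution shift, which this sketch deliberately avoids.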

