AI · Mar 10, 2017

Communications that Emerge through Reinforcement Learning Using a (Recurrent) Neural Network

arXiv:1703.03543v21.74 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of developing comprehensive communication systems for AI agents, showing incremental progress in applying end-to-end reinforcement learning to multi-agent and real-world scenarios.

The paper addresses how agents can learn to communicate through reinforcement learning with neural networks. It demonstrates that agents can negotiate to avoid conflicts, that continuous signals become binarized when noise is added, and that real-world inputs such as camera images can be used to control a robot. After learning, the agents made appropriate decisions following negotiation, and the robot reached the goal from any initial location.

Communication is not only an action of choosing a signal, but also needs to consider the context and sensor signals. It also needs to decide what information is communicated and how it is represented in or understood from signals. Therefore, communication should be realized comprehensively together with its purpose and other functions. The recent successful results in end-to-end reinforcement learning (RL) show the importance of comprehensive learning and the usefulness of end-to-end RL. Although it is little known, we have shown that a variety of communications emerge through RL using a (recurrent) neural network (NN). Here, three of them are introduced. In the first one, negotiation to avoid conflicts among 4 randomly-picked agents was learned. Each agent generates a binary signal from the output of its recurrent NN (RNN), and receives 4 signals from the agents three times. After learning, each agent made an appropriate final decision after negotiation for any combination of 4 agents. Differentiation of individuality among the agents could also be seen. The second one focused on discretization of the communication signal. A sender agent perceives the receiver's location and generates a continuous signal twice by its RNN. A receiver agent receives them sequentially, and moves according to its RNN's output to reach the sender's location. When noise was added to the signal, it was binarized through learning and 2-bit communication was established. The third one focused on end-to-end comprehensive communication. A sender receives a 1,785-pixel real camera image in which a real robot can be seen, and sends two sounds whose frequencies are computed by its NN. A receiver receives them, and two motion commands for the robot are generated by its NN. After learning, though some preliminary learning was necessary for the sender, the robot could reach the goal from any initial location.
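
The second experiment reports that the signal became binarized once noise was added to it. The following is a minimal sketch of why two-level signals can be more robust in such a channel; it is not the paper's learning procedure. The uniform noise model, the noise level of 0.5, the [-1, 1] signal range, the nearest-level decoder, and all function names are assumptions made purely for illustration.

```python
import random


def transmit(signal, noise_level, rng):
    """Add uniform noise (assumed model) and clip to the assumed [-1, 1] range."""
    noisy = signal + rng.uniform(-noise_level, noise_level)
    return max(-1.0, min(1.0, noisy))


def nearest(value, levels):
    """Decode a received value as the closest allowed signal level."""
    return min(levels, key=lambda level: abs(level - value))


def error_rate(levels, noise_level, trials=10000, seed=0):
    """Fraction of two-signal messages that are decoded incorrectly."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        # The sender emits two signals in sequence, as in the second experiment.
        sent = (rng.choice(levels), rng.choice(levels))
        received = tuple(
            nearest(transmit(s, noise_level, rng), levels) for s in sent
        )
        if received != sent:
            errors += 1
    return errors / trials


# Two saturated levels per signal: two signals carry 2 bits in total.
BINARY_LEVELS = [-1.0, 1.0]
# Four evenly spaced levels per signal: more information, smaller noise margin.
FOUR_LEVELS = [-1.0, -1.0 / 3.0, 1.0 / 3.0, 1.0]

binary_errors = error_rate(BINARY_LEVELS, noise_level=0.5)
graded_errors = error_rate(FOUR_LEVELS, noise_level=0.5)
print(f"binary: {binary_errors:.3f}, four-level: {graded_errors:.3f}")
```

Under these assumptions, noise of magnitude at most 0.5 can never push a signal of -1 or +1 across the decision boundary at 0, whereas four evenly spaced levels leave a margin of only 1/3. The sketch compares per-signal reliability only; it does not model the RNNs or the reward-driven learning through which binarization emerged in the paper.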

