NE AI Feb 10, 2020

Novelty Producing Synaptic Plasticity

Anil Yaman, Giovanni Iacca, Decebal Constantin Mocanu, George Fletcher, Mykola Pechenizkiy

arXiv:2002.03620v1 2.4

Originality: Incremental advance

AI Analysis

This paper addresses a challenge in reinforcement learning: training agents in environments where the goal is unknown. The contribution appears incremental, as it builds on novelty search methods.

The paper tackles learning in tasks such as maze navigation, where reinforcement signals are unavailable. It introduces novelty producing synaptic plasticity (NPSP), which evolves synaptic plasticity rules that generate novel behaviors. The results show that it produces many more novel behaviors than a random search baseline.

A learning process with the plasticity property often requires reinforcement signals to guide it. However, in some tasks (e.g., maze navigation), it is very difficult or impossible to measure the performance of an agent (i.e., a fitness value) and so provide reinforcement, since the position of the goal is not known. The correct behavior must then be found among a vast number of possible behaviors without any knowledge of the reinforcement signals. In these cases, an exhaustive search may be needed, but this might not be feasible, especially when optimizing artificial neural networks in continuous domains. In this work, we introduce novelty producing synaptic plasticity (NPSP), where we evolve synaptic plasticity rules to produce as many novel behaviors as possible in order to find a behavior that can solve the problem. We evaluate NPSP on maze navigation in deceptive maze environments that require complex actions and the achievement of subgoals to complete. Our results show that the search heuristic used with the proposed NPSP is capable of producing many more novel behaviors than a random search taken as a baseline.
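To make the idea of counting "novel behaviors" concrete, here is a minimal Python sketch of one common way to score novelty, borrowed from the novelty search literature: the mean distance from a behavior to its k nearest neighbors in an archive. This is an illustration under assumptions, not the paper's implementation. The behavior descriptor (an agent's final (x, y) position), the function names `novelty_score` and `count_novel`, and the `threshold` and `k` values are all hypothetical.

```python
import math


def novelty_score(behavior, archive, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest archive entries.

    Assumption: a behavior is a tuple of floats, e.g. a final (x, y) position.
    An empty archive makes any behavior maximally novel.
    """
    if not archive:
        return math.inf
    distances = sorted(math.dist(behavior, other) for other in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


def count_novel(behaviors, threshold=1.0, k=3):
    """Count behaviors that are novel relative to those already archived.

    A behavior is added to the archive only if its novelty score exceeds
    `threshold` (a hypothetical tuning parameter).
    """
    archive = []
    for behavior in behaviors:
        if novelty_score(behavior, archive, k) > threshold:
            archive.append(behavior)
    return len(archive)


if __name__ == "__main__":
    # Hypothetical final positions of an agent over several trials.
    clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (0.1, 0.1)]
    spread = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (5.0, 5.0)]
    print("clustered:", count_novel(clustered, threshold=1.0, k=1))
    print("spread:", count_novel(spread, threshold=1.0, k=1))
```

Under this kind of measure, a search procedure that keeps revisiting the same region of the maze adds few entries to the archive, while one that reaches new regions adds many. That count is the sort of quantity by which two search strategies could be compared.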
