LG · Aug 10, 2020
GRIMGEP: Learning Progress for Robust Goal Sampling in Visual Deep Reinforcement Learning
Grgur Kovač, Adrien Laversanne-Finot, Pierre-Yves Oudeyer
Designing agents capable of autonomously learning a wide range of skills is critical to increasing the scope of reinforcement learning. Doing so would both increase the diversity of learned skills and reduce the burden of manually designing reward functions for each skill. Self-supervised agents that set their own goals and try to maximize the diversity of those goals have shown great promise toward this end. However, a known limitation of agents that maximize the diversity of sampled goals is that they tend to be attracted to noise or, more generally, to parts of the environment that cannot be controlled (distractors). When agents have access to predefined goal features or expert knowledge, absolute Learning Progress (ALP) provides a way to distinguish between regions that can be controlled and those that cannot. However, those methods often fall short when the agents are provided only with raw sensory inputs such as images. In this work, we extend those concepts to unsupervised image-based goal exploration. We propose a framework that allows agents to autonomously identify and ignore noisy distracting regions while searching for novelty in the learnable regions, both improving overall performance and avoiding catastrophic forgetting. Our framework can be combined with any state-of-the-art novelty-seeking goal exploration approach. We construct a rich 3D image-based environment with distractors. Experiments in this environment show that agents using our framework successfully identify interesting regions of the environment, resulting in drastically improved performance. The source code is available at https://sites.google.com/view/grimgep.
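As a rough illustration of the absolute learning progress idea mentioned in the abstract above, the sketch below tracks a competence score per goal region and computes ALP as the absolute difference between the mean competence in a recent window and in the window before it. The class name `ALPTracker`, the window size, the region labels, and the epsilon-greedy sampling rule are all assumptions made for illustration; this is not the paper's implementation.

```python
import random
from collections import defaultdict, deque


class ALPTracker:
    """Hypothetical helper: per-region absolute learning progress (ALP)."""

    def __init__(self, window=10):
        self.window = window
        # Keep only the last 2 * window competence values for each region.
        self.history = defaultdict(lambda: deque(maxlen=2 * window))

    def record(self, region, competence):
        """Store the competence obtained on a goal sampled from `region`."""
        self.history[region].append(float(competence))

    def alp(self, region):
        """|mean(recent window) - mean(older window)|, or 0.0 if too little data."""
        values = list(self.history.get(region, ()))
        if len(values) < 2 * self.window:
            return 0.0
        older = values[: self.window]
        recent = values[self.window:]
        return abs(sum(recent) / self.window - sum(older) / self.window)

    def sample_region(self, rng=random, epsilon=0.2):
        """Sample a region proportionally to ALP, with epsilon uniform exploration."""
        regions = list(self.history)
        weights = [self.alp(r) for r in regions]
        if rng.random() < epsilon or sum(weights) == 0:
            return rng.choice(regions)
        return rng.choices(regions, weights=weights, k=1)[0]


tracker = ALPTracker(window=10)
for i in range(20):
    tracker.record("learnable", i / 20)  # competence steadily improves
    tracker.record("noise", i % 2)       # competence fluctuates without trend
```

In this toy setup the region whose competence fluctuates without any trend gets zero ALP, while the region with steadily improving competence gets a positive ALP and is therefore sampled preferentially.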
LG · Jun 10, 2019
Autonomous Goal Exploration using Learned Goal Spaces for Visuomotor Skill Acquisition in Robots
Adrien Laversanne-Finot, Alexandre Péré, Pierre-Yves Oudeyer
The automatic and efficient discovery of skills, without supervision, by long-lived autonomous agents remains a challenge in Artificial Intelligence. Intrinsically Motivated Goal Exploration Processes give learning agents a human-inspired mechanism for sequentially selecting goals to achieve. This approach offers a new perspective on the lifelong learning problem, with promising results in both simulated and real-world experiments. Until recently, those algorithms were restricted to domains with experimenter knowledge, since the Goal Space used by the agents was built on engineered feature extractors. Recent advances in deep representation learning enable new ways of designing those feature extractors directly from the agent's experience. Recent work has shown the potential of those methods on simple yet challenging simulated domains. In this paper, we present recent results showing the applicability of those principles to a real-world robotic setup, where a 6-joint robotic arm learns to manipulate a ball inside an arena by choosing goals in a space learned from its past experience.
LG · Jul 4, 2018
Curiosity Driven Exploration of Learned Disentangled Goal Spaces
Adrien Laversanne-Finot, Alexandre Péré, Pierre-Yves Oudeyer
Intrinsically motivated goal exploration processes enable agents to autonomously sample goals in order to efficiently explore complex environments with high-dimensional continuous actions. They have been applied successfully to real-world robots to discover repertoires of policies producing a wide diversity of effects. These algorithms often relied on engineered goal spaces, but it was recently shown that deep representation learning algorithms can be used to learn an adequate goal space in simple environments. However, in more complex environments containing multiple objects or distractors, efficient exploration requires that the structure of the goal space reflect that of the environment. In this paper, we show that using a disentangled goal space leads to better exploration performance than an entangled goal space. We further show that when the representation is disentangled, one can leverage it by sampling goals that maximize learning progress in a modular manner. Finally, we show that the measure of learning progress used to drive curiosity-driven exploration can be used simultaneously to discover abstract, independently controllable features of the environment.
CR · Mar 24, 2018
Blockclique: scaling blockchains through transaction sharding in a multithreaded block graph
Sébastien Forestier, Damir Vodenicarevic, Adrien Laversanne-Finot
Decentralized cryptocurrencies based on the blockchain architecture underutilize available network bandwidth, making them unable to scale to thousands of transactions per second. We define the Blockclique architecture, which addresses this limitation by sharding transactions in a block graph with a fixed number of threads. The architecture allows the creation of intrinsically compatible blocks in parallel, where each block references one previous block of each thread. The consistency of the Blockclique protocol is formally established in the presence of attackers. An experimental evaluation of the architecture's performance in large, realistic networks demonstrates an efficient use of available bandwidth and a throughput of thousands of transactions per second.
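The block-graph structure described in the abstract above can be sketched as a data structure in which every block belongs to one thread and carries exactly one parent reference per thread. This is a minimal sketch under stated assumptions: the thread count of 4, the rule that assigns a transaction to a thread by hashing its sender, and the names `Block`, `thread_of`, and `make_block` are all hypothetical and are not taken from the paper.

```python
import hashlib
from dataclasses import dataclass

NUM_THREADS = 4  # assumed value; the abstract only says the number is fixed


@dataclass(frozen=True)
class Block:
    thread: int          # thread this block belongs to
    parents: tuple       # one parent block id per thread (None before a thread's first block)
    transactions: tuple  # transactions sharded into this block's thread

    def block_id(self):
        """Hex digest identifying the block (illustrative, not a real wire format)."""
        payload = repr((self.thread, self.parents, self.transactions)).encode()
        return hashlib.sha256(payload).hexdigest()


def thread_of(sender):
    """Assumed sharding rule: map a sender address to a thread via its hash."""
    digest = hashlib.sha256(sender.encode()).digest()
    return digest[0] % NUM_THREADS


def make_block(thread, latest_ids, transactions):
    """Build a block in `thread` that references the latest block of every thread."""
    if len(latest_ids) != NUM_THREADS:
        raise ValueError("expected exactly one parent reference per thread")
    # Keep only the transactions whose sender maps to this block's thread.
    own = tuple(tx for tx in transactions if thread_of(tx[0]) == thread)
    return Block(thread=thread, parents=tuple(latest_ids), transactions=own)


# Usage: transactions are (sender, receiver, amount) tuples.
pending = [("alice", "bob", 5), ("carol", "dave", 1)]
latest = [None] * NUM_THREADS
block = make_block(thread_of("alice"), latest, pending)
latest[block.thread] = block.block_id()
```

Because each thread only accepts transactions from its own shard in this sketch, blocks created in parallel in different threads cannot spend from the same sender, which is one simple way to picture the "intrinsically compatible" property the abstract mentions.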