RO AI

Mar 9, 2021

A model-based framework for learning transparent swarm behaviors

Mario Coppola, Jian Guo, Eberhard Gill, Guido C. H. E. de Croon

arXiv:2103.05343v13.0

Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of creating understandable swarm behaviors for robotics researchers. The advance is incremental, since it builds on existing model-based and evolutionary methods.

The paper tackles the problem of designing transparent and verifiable behaviors for robot swarms. It proposes a model-based framework that learns a neural network model and a probabilistic state transition model from simulation data, then uses them to optimize policies, yielding effective controllers in aggregation and foraging tasks.

This paper proposes a model-based framework to automatically and efficiently design understandable and verifiable behaviors for swarms of robots. The framework is based on the automatic extraction of two distinct models: 1) a neural network model trained to estimate the relationship between the robots' sensor readings and the global performance of the swarm, and 2) a probabilistic state transition model that explicitly models the local state transitions (i.e., transitions in observations from the perspective of a single robot in the swarm) given a policy. The models can be trained from a data set of simulated runs featuring random policies. The first model is used to automatically extract a set of local states that are expected to maximize the global performance. These local states are referred to as desired local states. The second model is used to optimize a stochastic policy so as to increase the probability that the robots in the swarm observe one of the desired local states. Following these steps, the framework proposed in this paper can efficiently lead to effective controllers. This is tested on four case studies, featuring aggregation and foraging tasks. Importantly, thanks to the models, the framework allows us to understand and inspect a swarm's behavior. To this end, we propose verification checks to identify some potential issues that may prevent the swarm from achieving the desired global objective. In addition, we explore how the framework can be used in combination with a "standard" evolutionary robotics strategy (i.e., where performance is measured via simulation), or with online learning.
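The second model and the policy optimization step can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a small discrete set of local states and actions, estimates the transition model by simple counting over logged (state, action, next state) triples, and scores a stochastic policy by the probability mass that lands on the desired local states after a fixed number of steps. The set of desired states is taken as given here, whereas the paper extracts it with the neural network model. The random search optimizer and all names (`estimate_transitions`, `policy_chain`, `desired_mass`, `random_search`) are hypothetical choices made for illustration.

```python
import random


def estimate_transitions(log, n_states, n_actions):
    """Count-based estimate of P(s' | s, a) from logged (s, a, s') triples."""
    counts = [[[0.0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    for s, a, s_next in log:
        counts[s][a][s_next] += 1.0
    model = []
    for s in range(n_states):
        rows = []
        for a in range(n_actions):
            total = sum(counts[s][a])
            if total:
                rows.append([c / total for c in counts[s][a]])
            else:
                # Assumption: unseen (s, a) pairs fall back to a uniform guess.
                rows.append([1.0 / n_states] * n_states)
        model.append(rows)
    return model


def policy_chain(model, policy):
    """Mix the per-action transitions by the stochastic policy pi(a | s)."""
    n = len(model)
    return [[sum(policy[s][a] * model[s][a][t] for a in range(len(policy[s])))
             for t in range(n)] for s in range(n)]


def desired_mass(chain, desired, steps=200):
    """Probability of being in a desired state after `steps` steps, uniform start."""
    n = len(chain)
    dist = [1.0 / n] * n
    for _ in range(steps):
        dist = [sum(dist[s] * chain[s][t] for s in range(n)) for t in range(n)]
    return sum(dist[s] for s in desired)


def random_search(model, desired, n_actions, iters=300, seed=0):
    """Hill-climb over policies by resampling one state's action distribution."""
    rng = random.Random(seed)
    n = len(model)
    best = [[1.0 / n_actions] * n_actions for _ in range(n)]
    best_score = desired_mass(policy_chain(model, best), desired)
    for _ in range(iters):
        cand = [row[:] for row in best]
        s = rng.randrange(n)
        raw = [rng.random() for _ in range(n_actions)]
        z = sum(raw)
        cand[s] = [r / z for r in raw]
        score = desired_mass(policy_chain(model, cand), desired)
        if score > best_score:  # keep only strict improvements
            best, best_score = cand, score
    return best, best_score


# Toy data (hypothetical): 3 local states in a ring, action 0 stays, action 1 advances.
toy_log = [(s, 0, s) for s in range(3)] + [(s, 1, (s + 1) % 3) for s in range(3)]
toy_model = estimate_transitions(toy_log, n_states=3, n_actions=2)
toy_policy, toy_score = random_search(toy_model, desired={2}, n_actions=2)
print("optimized policy:", toy_policy)
print("probability of a desired local state:", toy_score)
```

Because the policy enters only through the explicit matrix returned by `policy_chain`, the resulting behavior can be inspected state by state, which is the kind of transparency the abstract emphasizes. A real use of the framework would replace the toy log with simulated runs under random policies and would likely use a more capable optimizer than this random search.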
