LG · MA · Dec 14, 2020

SAT-MARL: Specification Aware Training in Multi-Agent Reinforcement Learning

arXiv:2012.07949v1 · 3 citations
AI Analysis

This work is significant for industrial applications of MARL, where predictable and compliant system behavior is crucial for safety and operational integrity.

This paper addresses the problem of multi-agent reinforcement learning (MARL) agents developing unforeseen and potentially undesirable behaviors by explicitly transferring functional and non-functional requirements into shaped rewards. The proposed approach successfully achieves compliance with these constraints in a smart factory environment with up to eight agents.

A characteristic of reinforcement learning is the ability to develop unforeseen strategies when solving problems. While such strategies sometimes yield superior performance, they may also result in undesired or even dangerous behavior. In industrial scenarios, a system's behavior also needs to be predictable and lie within defined ranges. To enable the agents to learn (how) to align with a given specification, this paper proposes to explicitly transfer functional and non-functional requirements into shaped rewards. Experiments are carried out on the smart factory, a multi-agent environment modeling an industrial lot-size-one production facility, with up to eight agents and different multi-agent reinforcement learning algorithms. Results indicate that compliance with functional and non-functional constraints can be achieved by the proposed approach.
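As a rough illustration of what turning requirements into shaped rewards can look like, the sketch below subtracts penalties from a base task reward whenever a requirement is violated. It is not the paper's reward function: the `StepInfo` fields, the two example requirements (no collisions, a step budget), and the penalty weights are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-agent summary of one environment step (illustrative fields)."""
    task_reward: float      # base reward from the environment for this step
    collided: bool          # True if a hypothetical "no collisions" requirement was violated
    steps_over_budget: int  # steps taken beyond a hypothetical time budget (0 if within it)


def shaped_reward(info: StepInfo,
                  collision_penalty: float = 1.0,
                  overtime_penalty: float = 0.1) -> float:
    """Return the base reward minus penalties for violated requirements.

    The penalty weights are assumed values, not taken from the paper.
    """
    reward = info.task_reward
    if info.collided:
        reward -= collision_penalty
    # Penalize only the steps that exceed the budget; never reward being under it.
    reward -= overtime_penalty * max(0, info.steps_over_budget)
    return reward
```

An agent trained on `shaped_reward` instead of the raw `task_reward` receives a direct learning signal for each violation. How strongly compliance is traded off against task performance depends on the penalty weights, which would need tuning for each specification.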

