AI
Jan 4, 2019

Machine Teaching in Hierarchical Genetic Reinforcement Learning: Curriculum Design of Reward Functions for Swarm Shepherding

arXiv:1901.00949v1
22 citations
Originality: Incremental advance
AI Analysis

This paper addresses the lack of systematic methods for designing reward functions in reinforcement learning and offers researchers and practitioners a structured approach. The advance is incremental because it adapts existing educational concepts to a new domain.

The paper tackles the problem of designing reward functions in reinforcement learning by applying Systematic Instructional Design from human education to create a machine education methodology, resulting in a hierarchical genetic reinforcement learner that successfully evolves a swarm controller for shepherding.

The design of reward functions in reinforcement learning is a human skill that comes with experience. Unfortunately, there is not any methodology in the literature that could guide a human to design the reward function or to allow a human to transfer the skills developed in designing reward functions to another human and in a systematic manner. In this paper, we use Systematic Instructional Design, an approach in human education, to engineer a machine education methodology to design reward functions for reinforcement learning. We demonstrate the methodology in designing a hierarchical genetic reinforcement learner that adopts a neural network representation to evolve a swarm controller for an agent shepherding a boids-based swarm. The results reveal that the methodology is able to guide the design of hierarchical reinforcement learners, with each model in the hierarchy learning incrementally through a multi-part reward function. The hierarchy acts as a decision fusion function that combines the individual behaviours and skills learnt by each instruction to create a smart shepherd to control the swarm.
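As a rough illustration of the idea of learning incrementally through a multi-part reward, the sketch below evolves the weights of a small linear policy with a simple elitist genetic algorithm, one curriculum stage at a time. Everything in it is an assumption made for illustration: the two reward parts (`approach`, `drive`), the toy single-point "swarm", the function names, and all parameter values are hypothetical and are not taken from the paper. The hierarchy and decision fusion described in the abstract are not modelled here.

```python
import math
import random

def multi_part_reward(shepherd, centroid, goal, weights):
    # Two illustrative reward parts (assumed, not the paper's actual terms).
    approach = -math.dist(shepherd, centroid)  # lesson 1: reach the swarm
    drive = -math.dist(centroid, goal)         # lesson 2: move the swarm to the goal
    return weights[0] * approach + weights[1] * drive

def episode_fitness(genome, weights, steps=30):
    # Toy 2-D world: a single point stands in for the swarm centroid.
    shepherd, centroid, goal = [5.0, 5.0], [2.0, 2.0], [0.0, 0.0]
    total = 0.0
    for _ in range(steps):
        inputs = [centroid[0] - shepherd[0], centroid[1] - shepherd[1],
                  goal[0] - centroid[0], goal[1] - centroid[1]]
        # Linear policy: the genome is a flattened 2x4 weight matrix;
        # each velocity component is clipped to [-1, 1].
        for axis in range(2):
            v = sum(genome[4 * axis + i] * inputs[i] for i in range(4))
            shepherd[axis] += max(-1.0, min(1.0, v))
        # The swarm point is pushed directly away from a nearby shepherd.
        d = math.dist(shepherd, centroid)
        if 0.0 < d < 3.0:
            for axis in range(2):
                centroid[axis] += 0.5 * (centroid[axis] - shepherd[axis]) / d
        total += multi_part_reward(shepherd, centroid, goal, weights)
    return total

def evolve(population, weights, generations, rng, sigma=0.1):
    # Elitist, mutation-only genetic algorithm; returns best fitness per generation.
    history = []
    for _ in range(generations):
        population.sort(key=lambda g: episode_fitness(g, weights), reverse=True)
        history.append(episode_fitness(population[0], weights))
        elite = population[: len(population) // 4]
        # Keep the elite unchanged and refill with mutated copies of elite members.
        population[:] = elite + [
            [w + rng.gauss(0.0, sigma) for w in rng.choice(elite)]
            for _ in range(len(population) - len(elite))
        ]
    return history

rng = random.Random(0)
population = [[rng.gauss(0.0, 0.5) for _ in range(8)] for _ in range(20)]
# Curriculum: first reward only approaching the swarm, then mostly driving it.
curriculum = [(1.0, 0.0), (0.2, 1.0)]
histories = [evolve(population, w, generations=15, rng=rng) for w in curriculum]
```

The same population is carried from one stage to the next, so the second stage starts from controllers that already learnt the first lesson. Because the toy episode is deterministic and the elite is kept unchanged, the best fitness within each stage cannot decrease from one generation to the next.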

