Many Episode Learning in a Modular Embodied Agent via End-to-End Interaction
This work addresses the challenge of enabling modular agents to learn from end-to-end interactions, though it is incremental as it builds on existing methods for human-in-the-loop learning.
The paper tackles the problem of improving an embodied agent through iterative human-agent interactions: crowd-workers assign credit to module errors and label data, and the agent improves over multiple rounds.
In this work, we present a case study of an embodied machine-learning (ML) powered agent that improves itself through interactions with crowd-workers. The agent consists of a set of modules, some learned and others heuristic. While the agent is not "end-to-end" in the ML sense, end-to-end interaction is a vital part of the agent's learning mechanism. We describe how the design of the agent works together with the design of multiple annotation interfaces to allow crowd-workers to assign credit to module errors from end-to-end interactions and to label data for individual modules. Over multiple automated rounds of human-agent interaction, credit assignment, data annotation, and model re-training and re-deployment, we demonstrate agent improvement.
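To make the round structure concrete, the following is a minimal sketch of one interaction round, not the authors' implementation. All names here (`Episode`, `run_round`, the `"parser"` module label, and the stubbed crowd-worker callables) are hypothetical and chosen only for illustration; the sketch assumes that credit assignment returns the name of a single blamed module, or `None` when no error is reported.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class Episode:
    """One end-to-end interaction with a crowd-worker (hypothetical record)."""
    command: str
    blamed_module: Optional[str] = None  # set during credit assignment


def run_round(
    commands: List[str],
    assign_credit: Callable[[Episode], Optional[str]],
    annotate: Callable[[Episode], Tuple[str, str]],
    datasets: Dict[str, List[Tuple[str, str]]],
) -> Set[str]:
    """Run one round and return the names of modules that received new data."""
    episodes = [Episode(command=c) for c in commands]    # 1. interaction
    updated: Set[str] = set()
    for ep in episodes:
        ep.blamed_module = assign_credit(ep)             # 2. credit assignment
        if ep.blamed_module is None:
            continue                                     # no error reported
        datasets[ep.blamed_module].append(annotate(ep))  # 3. data annotation
        updated.add(ep.blamed_module)
    return updated                                       # 4. retrain and redeploy these


# Stubbed crowd-worker judgments (assumed data, for illustration only).
blame = {"build a red cube": "parser", "come here": None}
datasets: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
updated = run_round(
    list(blame),
    assign_credit=lambda ep: blame[ep.command],
    annotate=lambda ep: (ep.command, "corrected label"),
    datasets=datasets,
)
```

In this sketch only the modules named in `updated` would be retrained and redeployed before the next round, which mirrors the idea that end-to-end interactions are routed to per-module datasets rather than used to train a single end-to-end model.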