CLMay 25, 2018

Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation

arXiv:1805.10209v232.11101 citationsHas Code

Originality Incremental advance

AI Analysis

This addresses the challenge of interpreting sequential instructions in interactive environments for AI systems, representing an incremental improvement over prior methods.

The paper tackles the problem of mapping context-dependent sequential instructions to actions by proposing SESTRA, a learning algorithm that uses single-step reward observations and immediate expected reward maximization, achieving absolute accuracy improvements of 9.8%-25.3% over existing approaches on SCONE domains.

We propose a learning approach for mapping context-dependent sequential instructions to actions. We address the problem of discourse and state dependencies with an attention-based model that considers both the history of the interaction and the state of the world. To train from start and goal states without access to demonstrations, we propose SESTRA, a learning algorithm that takes advantage of single-step reward observations and immediate expected reward maximization. We evaluate on the SCONE domains, and show absolute accuracy improvements of 9.8%-25.3% across the domains over approaches that use high-level logical representations.

View on arXiv PDF Code

Similar