CL | AI | Sep 5, 2024

E2CL: Exploration-based Error Correction Learning for Embodied Agents

arXiv:2409.03256v2 | 9 citations | h-index: 10
AI Analysis

This paper addresses the challenge of environment alignment for embodied agents and offers a new approach to improving adaptability, though it appears incremental because it builds on existing exploration and feedback concepts.

The paper addresses the misalignment between a language model's intrinsic knowledge and environmental knowledge in embodied agents, which leads to infeasible actions. It proposes E2CL, a framework that uses exploration-induced errors and environmental feedback to improve alignment. In the VirtualHome environment, E2CL-trained agents outperform baseline-trained agents and show superior self-correction.

Language models are exhibiting increasing capability in knowledge utilization and reasoning. However, when applied as agents in embodied environments, they often suffer from misalignment between their intrinsic knowledge and environmental knowledge, leading to infeasible actions. Traditional environment alignment methods, such as supervised learning on expert trajectories and reinforcement learning, encounter limitations in covering environmental knowledge and achieving efficient convergence, respectively. Inspired by human learning, we propose Exploration-based Error Correction Learning (E2CL), a novel framework that leverages exploration-induced errors and environmental feedback to enhance environment alignment for embodied agents. E2CL incorporates teacher-guided and teacher-free explorations to gather environmental feedback and correct erroneous actions. The agent learns to provide feedback and self-correct, thereby enhancing its adaptability to target environments. Extensive experiments in the VirtualHome environment demonstrate that E2CL-trained agents outperform those trained by baseline methods and exhibit superior self-correction capabilities.
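
The core idea in the abstract (act, hit an infeasible action, read the environmental feedback, correct) can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the paper's implementation: `ToyEnv`, `PRECONDITIONS`, and `explore_and_correct` are hypothetical names, the rules are invented for illustration, and a hand-written repair rule stands in for the learned agent. The logged corrections stand in for the kind of error and feedback records an agent could later be trained on.

```python
from dataclasses import dataclass, field

# Hypothetical toy rules: each action maps to the action that must precede it.
PRECONDITIONS = {
    "grab milk": "open fridge",
    "open fridge": "walk to fridge",
}


@dataclass
class ToyEnv:
    """Stand-in environment (not VirtualHome) that rejects infeasible actions."""
    done: set = field(default_factory=set)

    def step(self, action):
        """Return (success, missing_prerequisite_or_None)."""
        required = PRECONDITIONS.get(action)
        if required is not None and required not in self.done:
            return False, required
        self.done.add(action)
        return True, None


def explore_and_correct(env, plan, max_steps=20):
    """Execute a plan, repairing infeasible actions using environment feedback.

    Returns the executed trajectory and a log of correction records.
    """
    trajectory, corrections = [], []
    queue = list(plan)
    steps = 0
    while queue and steps < max_steps:
        steps += 1
        action = queue.pop(0)
        ok, missing = env.step(action)
        if ok:
            trajectory.append(action)
            continue
        # Record the error, the feedback, and the corrective action.
        corrections.append({
            "error": action,
            "feedback": f"'{missing}' must come first",
            "correction": missing,
        })
        # Retry the failed action after its missing prerequisite.
        queue[:0] = [missing, action]
    return trajectory, corrections


trajectory, corrections = explore_and_correct(ToyEnv(), ["grab milk"])
```

Starting from the single-step plan `["grab milk"]`, the loop fails twice, inserts the two missing prerequisites, and ends with a feasible three-step trajectory plus two correction records.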

