DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning
This work addresses the performance degradation that ambiguous language instructions cause in reinforcement learning agents, proposing a novel method for a known bottleneck.
The paper tackles task ambiguity in language-conditioned reinforcement learning, where the flexibility of linguistic instructions degrades performance. It presents DAIL (Distributional Aligned Learning), which resolves these ambiguities and outperforms baseline methods on both structured and visual observation benchmarks.
Comprehending natural language and following human instructions are critical capabilities for intelligent agents. However, the flexibility of linguistic instructions induces substantial ambiguity across language-conditioned tasks, severely degrading algorithmic performance. To address this limitation, we present a novel method named DAIL (Distributional Aligned Learning), featuring two key components: a distributional policy and semantic alignment. Specifically, we provide theoretical results showing that the value distribution estimation mechanism enhances task differentiability. Meanwhile, the semantic alignment module captures the correspondence between trajectories and linguistic instructions. Extensive experimental results on both structured and visual observation benchmarks demonstrate that DAIL effectively resolves instruction ambiguities and outperforms baseline methods. Our implementation is available at https://github.com/RunpengXie/Distributional-Aligned-Learning.
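As a rough intuition for why estimating a value distribution can make tasks easier to tell apart than a scalar value, consider the toy sketch below. It is not taken from the DAIL implementation: the atom support `ATOMS`, the two return distributions `task_a` and `task_b`, and the choice of the 1-Wasserstein distance as the comparison metric are all illustrative assumptions. The two hypothetical tasks have the same expected return, so a scalar value estimate cannot separate them, while their return distributions differ.

```python
from itertools import accumulate

# Hypothetical shared support of discretised returns ("atoms"), in the style of
# categorical distributional RL. The values are illustrative assumptions only.
ATOMS = [0.0, 1.0, 2.0]

def expected_value(probs):
    """Scalar value estimate: the mean of the return distribution."""
    return sum(p * z for p, z in zip(probs, ATOMS))

def wasserstein1(p, q):
    """1-Wasserstein distance between two categorical distributions on ATOMS."""
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    # Integrate |CDF_p - CDF_q| over the gaps between consecutive atoms.
    return sum(
        abs(a - b) * (ATOMS[i + 1] - ATOMS[i])
        for i, (a, b) in enumerate(zip(cdf_p[:-1], cdf_q[:-1]))
    )

# Two made-up instructions whose returns share a mean but differ in shape.
task_a = [0.0, 1.0, 0.0]  # always yields a return of 1
task_b = [0.5, 0.0, 0.5]  # yields a return of 0 or 2 with equal probability

# Gap seen by a scalar critic versus gap seen by a distributional critic.
scalar_gap = abs(expected_value(task_a) - expected_value(task_b))
distribution_gap = wasserstein1(task_a, task_b)

print("scalar gap:", scalar_gap)
print("distributional gap:", distribution_gap)
```

In this toy case the scalar gap is zero while the distributional gap is positive, which is the kind of extra separation the abstract refers to as enhanced task differentiability. How DAIL parameterises and trains its value distribution is described in the paper and the linked repository, not here.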