AI | Jul 16, 2023

Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training

Yao Wei, Yanchao Sun, Ruijie Zheng, Sai Vemprala, Rogerio Bonatti, Shuhang Chen, Ratnesh Madaan, Zhongjie Ba, Ashish Kapoor, Shuang Ma

arXiv:2307.07909v316.820 citations | h-index: 58 | Has Code

Originality: Incremental advance

AI Analysis

The work addresses behavior overfitting and dependence on task-specific fine-tuning in AI agents across decision-making domains, a strong incremental advance in generalist agent design.

The paper introduces DualMind, a generalist decision-making agent trained with a dual-phase strategy: it first learns common knowledge through a self-supervised objective, then learns to imitate behaviors conditioned on prompts. DualMind outperforms other generalist agents by over 50% on Habitat and over 70% on MetaWorld, and reaches a 90% success rate on more than 30 of the 45 MetaWorld tasks.

We introduce DualMind, a generalist agent designed to tackle various decision-making tasks that addresses challenges posed by current methods, such as overfitting behaviors and dependence on task-specific fine-tuning. DualMind uses a novel "Dual-phase" training strategy that emulates how humans learn to act in the world. The model first learns fundamental common knowledge through a self-supervised objective tailored for control tasks and then learns how to make decisions based on different contexts through imitating behaviors conditioned on given prompts. DualMind can handle tasks across domains, scenes, and embodiments using just a single set of model weights and can execute zero-shot prompting without requiring task-specific fine-tuning. We evaluate DualMind on MetaWorld and Habitat through extensive experiments and demonstrate its superior generalizability compared to previous techniques, outperforming other generalist agents by over 50% and 70% on Habitat and MetaWorld, respectively. On the 45 tasks in MetaWorld, DualMind achieves over 30 tasks at a 90% success rate.
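
The two-phase idea in the abstract can be illustrated with a toy sketch. This is not DualMind's architecture or objective: the one-dimensional environment, the next-state prediction loss used for phase 1, the scalar "prompt" (a goal value), and all names below are assumptions chosen only to show a self-supervised phase followed by prompt-conditioned imitation on top of the frozen result.

```python
import random

random.seed(0)
A_TRUE = 0.8  # hidden dynamics coefficient of the toy environment (assumption)

def step(s, a):
    # Toy dynamics: next state depends linearly on state and action.
    return A_TRUE * s + a

def expert(s, goal):
    # Toy expert: picks the action that reaches the goal in one step.
    return goal - A_TRUE * s

def pretrain(steps=2000, lr=0.05):
    # Phase 1 (self-supervised): fit w by predicting the next state
    # from (state, action) pairs; no expert labels are used here.
    w = 0.0
    for _ in range(steps):
        s, a = random.uniform(-1, 1), random.uniform(-1, 1)
        err = (w * s + a) - step(s, a)
        w -= lr * err * s
    return w

def imitate(w, steps=5000, lr=0.05):
    # Phase 2 (imitation): keep w frozen and regress expert actions
    # from the prompt (goal) and the phase-1 feature w * s.
    theta_goal, theta_feat = 0.0, 0.0
    for _ in range(steps):
        s, goal = random.uniform(-1, 1), random.uniform(-1, 1)
        feat = w * s
        err = (theta_goal * goal + theta_feat * feat) - expert(s, goal)
        theta_goal -= lr * err * goal
        theta_feat -= lr * err * feat
    return theta_goal, theta_feat

w = pretrain()
theta_goal, theta_feat = imitate(w)

def policy(s, goal):
    # One set of weights serves every goal; the goal acts as the prompt.
    return theta_goal * goal + theta_feat * (w * s)
```

Calling `step(0.5, policy(0.5, 0.3))` should land close to the goal 0.3, even though that specific state and goal pair was never singled out during training. The only point of the sketch is the ordering: a representation is fitted without action labels first, and the imitation stage is then conditioned on a prompt.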
