AI, Jul 21, 2023

A Two-stage Fine-tuning Strategy for Generalizable Manipulation Skill of Embodied AI

arXiv:2307.11343v1, 5 citations, h-index: 6, Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of making Embodied AI more practical for real-world applications by improving generalization to unseen scenarios, though it appears incremental as it builds on an existing benchmark.

The paper tackles the problem of Embodied AI models requiring massive interactions for training by proposing a two-stage fine-tuning strategy to enhance generalization, achieving first prize in all three tracks of the ManiSkill2 Challenge.

The advent of ChatGPT has led to a surge of interest in Embodied AI. However, many existing Embodied AI models rely heavily on massive interactions with training environments, which may not be practical in real-world situations. To this end, ManiSkill2 has introduced a full-physics simulation benchmark for manipulating various 3D objects. This benchmark enables agents to be trained using diverse datasets of demonstrations and evaluates their ability to generalize to unseen scenarios in testing environments. In this paper, we propose a novel two-stage fine-tuning strategy that aims to further enhance the generalization capability of our model based on the ManiSkill2 benchmark. Through extensive experiments, we demonstrate the effectiveness of our approach by achieving the 1st prize in all three tracks of the ManiSkill2 Challenge. Our findings highlight the potential of our method to improve the generalization abilities of Embodied AI models and pave the way for their practical applications in real-world scenarios. All code and models for our solution are available at https://github.com/xtli12/GXU-LIPE.git

