MOPA: Modular Object Navigation with PointGoal Agents
This work addresses object navigation for Embodied AI systems, presenting an incremental modular improvement.
The authors tackle object navigation in Embodied AI with MOPA, a modular approach that reuses a pretrained PointGoal agent for navigation, and find that a simple uniform exploration strategy outperforms more advanced methods.
We propose a simple but effective modular approach, MOPA (Modular ObjectNav with PointGoal agents), to systematically investigate the inherent modularity of the object navigation task in Embodied AI. MOPA consists of four modules: (a) an object detection module trained to identify objects from RGB images, (b) a map building module that builds a semantic map of the observed objects, (c) an exploration module that enables the agent to explore the environment, and (d) a navigation module that moves the agent to identified target objects. We show that a pretrained PointGoal agent can be effectively reused as the navigation module instead of learning to navigate from scratch, which saves training time and compute. We also compare various exploration strategies for MOPA and find that a simple uniform strategy significantly outperforms more advanced exploration methods.
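
To make the four-module decomposition concrete, the following is a minimal sketch of how the modules could fit together in one decision step. It is not the authors' implementation. All names (`SemanticMap`, `uniform_exploration_goal`, `choose_point_goal`, `step`) are hypothetical, the detector output is assumed to arrive as a dictionary of category names to 2D map cells, the PointGoal agent is stood in for by an arbitrary callable, and "uniform" is assumed to mean sampling an exploration goal uniformly at random over the map grid, which the abstract does not spell out.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

Point = Tuple[int, int]  # (x, y) cell on a 2D map grid (assumed representation)


@dataclass
class SemanticMap:
    """Map building module (sketch): remembers where each category was seen."""
    objects: Dict[str, Point] = field(default_factory=dict)

    def update(self, detections: Dict[str, Point]) -> None:
        # Later detections of the same category overwrite earlier ones.
        self.objects.update(detections)

    def lookup(self, category: str) -> Optional[Point]:
        return self.objects.get(category)


def uniform_exploration_goal(map_size: Tuple[int, int], rng: random.Random) -> Point:
    """Exploration module (sketch): sample a goal cell uniformly over the grid."""
    width, height = map_size
    return (rng.randrange(width), rng.randrange(height))


def choose_point_goal(
    target: str,
    semantic_map: SemanticMap,
    map_size: Tuple[int, int],
    rng: random.Random,
) -> Tuple[Point, str]:
    """Return the next point goal and whether it is a navigate or explore goal."""
    location = semantic_map.lookup(target)
    if location is not None:
        return location, "navigate"
    return uniform_exploration_goal(map_size, rng), "explore"


def step(
    target: str,
    detections: Dict[str, Point],
    semantic_map: SemanticMap,
    map_size: Tuple[int, int],
    rng: random.Random,
    pointgoal_policy: Callable[[Point], str],
) -> str:
    """One decision step: update the map, pick a point goal, delegate to the policy.

    `detections` stands in for the object detection module's output, and
    `pointgoal_policy` stands in for the pretrained PointGoal agent.
    """
    semantic_map.update(detections)
    goal, _mode = choose_point_goal(target, semantic_map, map_size, rng)
    return pointgoal_policy(goal)
```

In this sketch both exploration and navigation reduce to handing a point goal to the same policy, which is one way to read the claim that a pretrained PointGoal agent can be reused rather than trained from scratch. The target is treated as found as soon as its category appears in the map; a real system would also need a stopping criterion, which is omitted here.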