CV | AI | Jun 15, 2022

Zero-shot object goal visual navigation

arXiv:2206.07423v3 | 64 citations | h-index: 33 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of real-world household navigation, where robots must handle numerous object classes not seen during training. The advance is incremental, as it builds on existing zero-shot and navigation methods.

The paper tackles the problem of enabling robots to navigate to objects from novel classes without prior training, proposing a semantic similarity network (SSNet) that uses detection results and semantic word embeddings to generalize to unseen classes, and demonstrates superior performance over baselines on the AI2-THOR platform.

Object goal visual navigation is a challenging task that aims to guide a robot to find the target object based on its visual observation, and the target is limited to the classes pre-defined in the training stage. However, in real households, there may exist numerous target classes that the robot needs to deal with, and it is hard to cover all of these classes in the training stage. To address this challenge, we study the zero-shot object goal visual navigation task, which aims at guiding robots to find targets belonging to novel classes without any training samples. To this end, we also propose a novel zero-shot object navigation framework called semantic similarity network (SSNet). Our framework uses the detection results and the cosine similarity between semantic word embeddings as input. This type of input data has a weak correlation with classes, and thus our framework has the ability to generalize the policy to novel classes. Extensive experiments on the AI2-THOR platform show that our model outperforms the baseline models in the zero-shot object navigation task, which demonstrates the generalization ability of our model. Our code is available at: https://github.com/pioneer-innovation/Zero-Shot-Object-Navigation.
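
To make the class-agnostic input described in the abstract concrete, here is a minimal sketch of pairing each detection with the cosine similarity between its label embedding and the target's embedding. It is not the SSNet implementation: the 3-dimensional embeddings, the detection dictionary layout, and the helper names `cosine_similarity` and `build_policy_input` are all assumptions made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy embeddings invented for this example; a real system would load
# pretrained word vectors instead.
TOY_EMBEDDINGS = {
    "mug": [0.9, 0.1, 0.0],
    "cup": [0.8, 0.2, 0.1],
    "sofa": [0.0, 0.1, 0.9],
}

def build_policy_input(detections, target_class, embeddings):
    """Return one feature row per detection: box coordinates, score, similarity.

    The class label itself is not part of the row; only its similarity to
    the target is, so the features stay meaningful for unseen target classes.
    """
    target_vec = embeddings[target_class]
    features = []
    for det in detections:
        sim = cosine_similarity(embeddings[det["label"]], target_vec)
        features.append(list(det["box"]) + [det["score"], sim])
    return features

# Hypothetical detector output for one frame (normalized box coordinates).
detections = [
    {"label": "cup", "box": (0.1, 0.2, 0.3, 0.4), "score": 0.95},
    {"label": "sofa", "box": (0.5, 0.5, 0.9, 0.9), "score": 0.80},
]
features = build_policy_input(detections, "mug", TOY_EMBEDDINGS)
```

In this toy setup the "cup" detection receives a much higher similarity to the unseen target "mug" than the "sofa" detection does, which is the kind of signal a navigation policy could exploit without ever having been trained on "mug".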

Code Implementations: 1 repo
