OrthographicNet: A Deep Transfer Learning Approach for 3D Object Recognition in Open-Ended Domains
The work addresses the need for robots to recognize previously unseen objects in real-world environments, but the contribution is incremental because it builds on existing CNN-based approaches.
The paper tackles 3D object recognition in open-ended domains for service robots. It proposes OrthographicNet, a CNN-based model that generates a global rotation- and scale-invariant representation of a 3D object, and reports significant improvements over previous state-of-the-art approaches in recognition performance and scalability, with three real-world demonstrations used to validate real-time performance.
Service robots are increasingly present in daily life. For this type of robot, open-ended object category learning and recognition is necessary: no matter how extensive the training data used for batch learning is, the robot may encounter a new object when operating in a real-world environment. In this work, we present OrthographicNet, a Convolutional Neural Network (CNN)-based model for 3D object recognition in open-ended domains. In particular, OrthographicNet generates a global rotation- and scale-invariant representation for a given 3D object, enabling robots to recognize the same or similar objects seen from different perspectives. Experimental results show that our approach yields significant improvements over previous state-of-the-art approaches in object recognition performance and scalability in open-ended scenarios. Moreover, OrthographicNet demonstrates the capability of learning new categories on-site from very few examples. Three real-world demonstrations validate the real-time performance of the proposed architecture.