RoboSherlock: Cognition-enabled Robot Perception for Everyday Manipulation Tasks
This work addresses the fragmented integration of perception and cognition in autonomous robots by proposing a framework for mobile robots performing human-scale manipulation tasks. The contribution appears incremental in that it builds on existing paradigms.
The paper tackles the challenge of integrating perception and cognition in robotic systems for everyday manipulation tasks. It introduces RoboSherlock, a knowledge-enabled cognitive perception framework that formulates scene interpretation as an unstructured information management problem, with the aim of boosting object recognition performance and answering task-relevant queries about objects in a scene.
A pressing question when designing intelligent autonomous systems is how to integrate the various subsystems concerned with complementary tasks. More specifically, robotic vision must provide task-relevant information about the environment and the objects in it to various planning-related modules. In most implementations of the traditional Perception-Cognition-Action paradigm, these tasks are treated as quasi-independent modules that function as black boxes for each other. It is our view that perception can benefit tremendously from a tight collaboration with cognition. We present RoboSherlock, a knowledge-enabled cognitive perception system for mobile robots performing human-scale everyday manipulation tasks. In RoboSherlock, the perception and interpretation of realistic scenes are formulated as an unstructured information management (UIM) problem. The application of the UIM principle supports the implementation of perception systems that can answer task-relevant queries about objects in a scene, boost object recognition performance by combining the strengths of multiple perception algorithms, support knowledge-enabled reasoning about objects, and enable automatic and knowledge-driven generation of processing pipelines. We demonstrate the potential of the proposed framework through feasibility studies of systems for real-world scene perception that have been built on top of the framework.
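To make the UIM idea more concrete, the following is a minimal illustrative sketch in Python of how a query might drive the selection of perception algorithms whose results are merged on shared object hypotheses. It is not the RoboSherlock implementation: the names `ObjectHypothesis`, `color_annotator`, `shape_annotator`, `build_pipeline`, and `answer_query` are hypothetical, and the annotator outputs are hard-coded stand-ins for real perception algorithms.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectHypothesis:
    """A candidate object in the scene plus every annotation attached to it."""
    cluster_id: int
    annotations: dict = field(default_factory=dict)


# Hypothetical annotators: each stands in for one perception algorithm and
# writes its result under a single key. The values are hard-coded placeholders.
def color_annotator(obj):
    obj.annotations["color"] = "red" if obj.cluster_id == 0 else "white"


def shape_annotator(obj):
    obj.annotations["shape"] = "cylinder" if obj.cluster_id == 0 else "box"


# Registry mapping each annotation key to the annotator that provides it.
ANNOTATORS = {"color": color_annotator, "shape": shape_annotator}


def build_pipeline(query):
    """Select only the annotators needed to answer the query."""
    return [ANNOTATORS[key] for key in query if key in ANNOTATORS]


def answer_query(scene, query):
    """Run the query-driven pipeline, then return the matching hypotheses."""
    pipeline = build_pipeline(query)
    for obj in scene:
        for annotate in pipeline:
            annotate(obj)
    return [
        obj for obj in scene
        if all(obj.annotations.get(k) == v for k, v in query.items())
    ]


scene = [ObjectHypothesis(0), ObjectHypothesis(1)]
matches = answer_query(scene, {"color": "red", "shape": "cylinder"})
print([obj.cluster_id for obj in matches])
```

In this sketch each algorithm contributes one kind of evidence to a common data structure, and the query determines both which algorithms run and which hypotheses are returned. A knowledge-driven system would presumably replace the fixed registry with reasoning over what each algorithm can perceive.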