Video Object Segmentation-based Visual Servo Control and Object Depth Estimation on a Mobile Robot
This work addresses the challenge of enabling robots to interact with real-world objects in everyday environments. It is an incremental improvement that builds on existing video object segmentation techniques.
The paper tackles the problem of identifying generic objects and locating them in 3D using a mobile robot with an RGB camera. It does so by introducing a video object segmentation-based approach to visual servo control and active perception, along with a new Hadamard-Broyden update formulation. The approach is validated in experiments on a mobile HSR robot, which identifies, locates, and grasps objects from the YCB dataset and tracks dynamic objects in real time.
To be useful in everyday environments, robots must be able to identify and locate real-world objects. In recent years, video object segmentation has made significant progress on densely separating such objects from the background in real and challenging videos. Building on this progress, this paper addresses the problem of identifying generic objects and locating them in 3D using a mobile robot with an RGB camera. We achieve this by, first, introducing a video object segmentation-based approach to visual servo control and active perception and, second, developing a new Hadamard-Broyden update formulation. Our segmentation-based methods are simple but effective, and our update formulation lets a robot quickly learn the relationship between actuators and visual features without any camera calibration. We validate our approach in experiments by learning a variety of actuator-camera configurations on a mobile HSR robot, which subsequently identifies, locates, and grasps objects from the YCB dataset and tracks people and other dynamic articulated objects in real time.
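The abstract does not state the update formula, so the following is only a minimal sketch of what a Hadamard-masked Broyden update could look like. It assumes the classical rank-one Broyden secant update of an actuator-to-feature Jacobian estimate, multiplied elementwise (Hadamard product) by a 0/1 matrix that selects which actuator-feature pairs are allowed to change. The function name `masked_broyden_update`, the mask `H`, and the step size `alpha` are hypothetical names chosen for illustration, not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def masked_broyden_update(
    J: Matrix, dq: Vector, de: Vector, H: Matrix, alpha: float = 1.0
) -> Matrix:
    """Return an updated Jacobian estimate after one observed motion.

    J     : current estimate mapping actuator changes to feature changes
    dq    : observed change in actuator values
    de    : observed change in visual features
    H     : 0/1 mask (assumed); entries set to 0 keep J[i][j] fixed
    alpha : step size (assumed); 1.0 gives the full Broyden correction
    """
    denom = sum(x * x for x in dq)
    if denom == 0.0:
        # No actuator motion, so there is nothing to learn from this step.
        return [row[:] for row in J]

    # Prediction error of the current estimate: de - J @ dq.
    residual = [
        de_i - sum(J_ij * dq_j for J_ij, dq_j in zip(row, dq))
        for de_i, row in zip(de, J)
    ]

    # Rank-one correction residual * dq^T / (dq^T dq), masked elementwise by H.
    return [
        [
            J[i][j] + alpha * H[i][j] * residual[i] * dq[j] / denom
            for j in range(len(dq))
        ]
        for i in range(len(J))
    ]
```

With `H` set to all ones and `alpha = 1.0`, this reduces to the classical Broyden update, whose result maps the observed `dq` exactly onto the observed `de`. Zero entries in `H` leave the corresponding Jacobian entries unchanged, which is one way to encode that a given actuator is assumed not to affect a given visual feature.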