Active Perception with A Monocular Camera for Multiscopic Vision
The proposed system provides a low-cost, robust depth estimation solution for robotic applications, though it is an incremental improvement over existing stereo methods.
The paper addresses accurate depth estimation for robotics with a multiscopic vision system that actively moves a monocular camera mounted on a robot arm to capture aligned images, reducing the average absolute error by 50.2% relative to two-frame stereo matching.
We design a multiscopic vision system that uses a low-cost monocular RGB camera to obtain accurate depth estimates for robotic applications. Unlike multi-view stereo, which uses images captured at unconstrained camera poses, the proposed system actively controls a robot arm with a mounted camera to capture a sequence of images at horizontally or vertically aligned positions with the same parallax. We compute cost volumes for stereo matching between the reference image and each of the surrounding images and combine them into a fused cost volume that is robust to outliers. Experiments on the Middlebury dataset and on a real robot show that the resulting disparity maps are more accurate than those from two-frame stereo matching: the average absolute error is reduced by 50.2% in our experiments.
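The cost-volume fusion described above can be illustrated with a minimal sketch. The abstract does not specify the matching cost or the fusion rule, so everything here is an assumption made for illustration: the function names (`cost_volume`, `fuse_min`, `disparity_wta`) are hypothetical, the matching cost is a per-pixel absolute difference on grayscale images, and the fusion is a per-pixel minimum across volumes. Because the surrounding images share the same parallax with the reference image, a single disparity index `d` refers to the same depth hypothesis in every volume, which is what makes element-wise fusion possible.

```python
import numpy as np

def cost_volume(ref, other, max_disp, direction):
    """Absolute-difference cost volume of shape (max_disp + 1, H, W).

    direction=+1 compares ref[y, x] with other[y, x + d];
    direction=-1 compares ref[y, x] with other[y, x - d].
    Assumes grayscale float images of equal shape and max_disp < W.
    """
    h, w = ref.shape
    # Infinite cost where the shifted pixel falls outside the image.
    vol = np.full((max_disp + 1, h, w), np.inf, dtype=np.float64)
    vol[0] = np.abs(ref - other)
    for d in range(1, max_disp + 1):
        if direction > 0:
            vol[d, :, :w - d] = np.abs(ref[:, :w - d] - other[:, d:])
        else:
            vol[d, :, d:] = np.abs(ref[:, d:] - other[:, :w - d])
    return vol

def fuse_min(volumes):
    """Fuse cost volumes by a per-element minimum (assumed fusion rule).

    A pixel that is occluded or corrupted in one view can still be
    matched through another view with a low cost.
    """
    return np.minimum.reduce(volumes)

def disparity_wta(volume):
    """Winner-takes-all disparity: index of the lowest cost per pixel."""
    return np.argmin(volume, axis=0)
```

With a center reference image and one image on each side, one would build `cost_volume(center, left, max_disp, +1)` and `cost_volume(center, right, max_disp, -1)`, pass both to `fuse_min`, and read the disparity map with `disparity_wta`. The minimum is only one possible choice; a sum or a median over more than two views would trade off noise suppression against occlusion handling differently.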