CV · IV · Aug 2, 2018

Object Localization and Size Estimation from RGB-D Images

arXiv:1808.00641v1
Originality: Synthesis-oriented
AI Analysis

This work addresses accurate object size estimation for robotics or AR applications, but its contribution is incremental: it builds on existing RGB-D fusion techniques.

The paper tackles object localization and size estimation by combining RGB and depth images from a Tango phone. It evaluates several height estimation methods under a variety of settings and reports sizes in real-world units (meters).

Depth-sensing cameras (e.g., Kinect sensor, Tango phone) can acquire color and depth images that are registered to a common viewpoint. This opens the possibility of developing algorithms that exploit the advantages of both sensing modalities. Traditionally, cues from color images have been used for object localization (e.g., YOLO). However, a depth image can additionally be used to segment image regions that would otherwise have identical color information. Further, the depth image can be used for object size (height/width) estimation in real-world measurement units, such as meters, whereas image-based segmentation alone only supports drawing bounding boxes around objects of interest. In this paper, we first collect color camera information along with depth information using a custom Android application on a Tango Phab2 phone. Second, we perform timing and spatial alignment between the two data sources. Finally, we evaluate several ways of measuring the height of the object of interest within the captured images under a variety of settings.
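As a rough illustration of the two technical steps in the abstract (timing alignment and metric height estimation), here is a minimal Python sketch. It is not the paper's method. It assumes timing alignment can be approximated by nearest-timestamp matching, that the depth image is already registered to the color image, that the object is roughly upright and parallel to the image plane, and that a detector has supplied a bounding box. The function names, the `box` layout, and the focal length `fy` (in pixels) are all hypothetical.

```python
from bisect import bisect_left
from statistics import median


def nearest_depth_frame(color_ts, depth_ts_sorted):
    """Return the index of the depth timestamp closest to color_ts.

    depth_ts_sorted must be a non-empty list sorted in ascending order.
    """
    i = bisect_left(depth_ts_sorted, color_ts)
    if i == 0:
        return 0
    if i == len(depth_ts_sorted):
        return len(depth_ts_sorted) - 1
    before, after = depth_ts_sorted[i - 1], depth_ts_sorted[i]
    # Ties go to the earlier frame.
    return i - 1 if color_ts - before <= after - color_ts else i


def estimate_height_m(depth_m, box, fy):
    """Estimate object height in meters with a pinhole camera model.

    depth_m: 2D list of per-pixel depths in meters (0 marks missing depth).
    box:     (left, top, right, bottom) in pixels; right and bottom exclusive.
    fy:      vertical focal length in pixels (assumed known from calibration).
    """
    left, top, right, bottom = box
    samples = [
        depth_m[v][u]
        for v in range(top, bottom)
        for u in range(left, right)
        if depth_m[v][u] > 0
    ]
    if not samples:
        raise ValueError("no valid depth inside the bounding box")
    # The median is less sensitive than the mean to background pixels in the box.
    z = median(samples)
    # Pinhole model: metric height = pixel height * depth / focal length.
    return (bottom - top) * z / fy


if __name__ == "__main__":
    depth_timestamps = [0.0, 0.1, 0.2]
    print(nearest_depth_frame(0.105, depth_timestamps))
    flat_scene = [[2.0] * 4 for _ in range(4)]
    print(estimate_height_m(flat_scene, (0, 1, 4, 3), fy=500.0))
```

Taking the median depth inside the box is only one of several plausible choices. A segmentation mask or the 3D points at the top and bottom of the object would handle tilted objects and cluttered backgrounds better.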
