CV, Nov 27, 2020

Efficient Scene Compression for Visual-based Localization

Marcela Mera-Trujillo, Benjamin Smith, Victor Fragoso

arXiv:2011.13894v19.119 citations, Has Code

Originality: Incremental advance

AI Analysis

This work provides a faster and easier-to-tune method for compressing 3D scene representations, benefiting mixed reality and robotics applications that require efficient visual-based localization under storage and bandwidth constraints.

This paper addresses the problem of efficient scene compression for visual-based localization, which is crucial for mixed reality and robotics applications with storage and bandwidth constraints. The authors propose a novel approach based on a constrained quadratic program, solved using a variant of sequential minimal optimization, to select a subset of 3D points for scene representation. Their method achieves fast compression and accurate pose estimates on public datasets.

Estimating the pose of a camera with respect to a 3D reconstruction or scene representation is a crucial step for many mixed reality and robotics applications. Given the vast amount of data now available, many applications constrain storage and/or bandwidth to work efficiently. To satisfy these constraints, many applications compress a scene representation by reducing its number of 3D points. While state-of-the-art methods use $K$-cover-based algorithms to compress a scene, they are slow and hard to tune. To enhance speed and facilitate parameter tuning, this work introduces a novel approach that compresses a scene representation by means of a constrained quadratic program (QP). Because this QP resembles a one-class support vector machine, we derive a variant of the sequential minimal optimization (SMO) algorithm to solve it. Our approach uses the points corresponding to the support vectors as the subset of points that represents the scene. We also present an efficient initialization method that allows our method to converge quickly. Our experiments on publicly available datasets show that our approach compresses a scene representation quickly while delivering accurate pose estimates.
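The abstract does not state the QP itself, so the following is only a minimal sketch of the general idea, assuming the textbook one-class SVM dual as a stand-in: minimize $\frac{1}{2}\alpha^\top K \alpha$ subject to $\sum_i \alpha_i = 1$ and $0 \le \alpha_i \le C$. The function names `smo_select` and `support_points`, the symmetric matrix `K`, the bound `C`, and the uniform starting point are all assumptions made for illustration; they are not the authors' formulation, solver, or initialization. Each iteration moves weight between one pair of variables, which keeps the sum constraint satisfied, and the points with nonzero weight play the role of the retained subset.

```python
def smo_select(K, C, tol=1e-9, max_iter=10000):
    """SMO-style solver for: min 0.5 * a^T K a, s.t. sum(a) = 1, 0 <= a_i <= C.

    K is a symmetric positive semidefinite matrix given as a list of lists.
    This is an illustrative stand-in, not the QP from the paper.
    """
    n = len(K)
    if C * n < 1.0:
        raise ValueError("infeasible: need C * n >= 1")
    # Assumed starting point: uniform weights (feasible because 1/n <= C).
    alpha = [1.0 / n] * n
    # Gradient of the objective, grad = K @ alpha.
    grad = [sum(K[k][m] * alpha[m] for m in range(n)) for k in range(n)]
    for _ in range(max_iter):
        up = [k for k in range(n) if alpha[k] < C - 1e-12]   # can still grow
        down = [k for k in range(n) if alpha[k] > 1e-12]     # can still shrink
        if not up or not down:
            break
        i = min(up, key=lambda k: grad[k])    # most beneficial to increase
        j = max(down, key=lambda k: grad[k])  # most beneficial to decrease
        gap = grad[j] - grad[i]
        if gap < tol:                         # KKT conditions met within tol
            break
        eta = K[i][i] + K[j][j] - 2.0 * K[i][j]  # curvature along the pair
        max_step = min(C - alpha[i], alpha[j])   # keep both inside the box
        step = max_step if eta <= 0.0 else min(gap / eta, max_step)
        alpha[i] += step
        alpha[j] -= step
        # Incremental gradient update; relies on K being symmetric.
        for k in range(n):
            grad[k] += step * (K[k][i] - K[k][j])
    return alpha


def support_points(alpha, eps=1e-8):
    """Indices with nonzero weight, i.e. the points kept in the compressed set."""
    return [k for k, a in enumerate(alpha) if a > eps]


if __name__ == "__main__":
    K = [[1.0, 2.0], [2.0, 5.0]]
    alpha = smo_select(K, C=1.0)
    print(alpha, support_points(alpha))
```

In this sketch the bound `C` controls how many points end up with nonzero weight: a smaller `C` forces the unit mass to spread over more points, so more of them are kept.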
