CV, Jun 4
T-FunS3D: Task-Driven Hierarchical Open-Vocabulary 3D Functionality Segmentation
Jingkun Feng, Reza Sabzevari
Open-vocabulary 3D functionality segmentation enables robots to localize functional object components in 3D scenes. It is a challenging task that requires both spatial understanding and task interpretation. Current open-vocabulary 3D segmentation methods primarily focus on object-level recognition, while scene-wide part segmentation methods attempt to segment the entire scene exhaustively, making them highly resource-intensive and time-consuming. Balancing segmentation performance in terms of granularity, accuracy, and speed remains a challenge. As one step towards alleviating this, we introduce T-FunS3D, a task-driven hierarchical open-vocabulary 3D functionality segmentation method that provides actionable perception for robotic applications. Our method takes as input the 3D point cloud and posed RGB-D images of an indoor scene. We construct an open-vocabulary scene graph by extracting the instances in the environment and their visual embeddings. Given a task description, T-FunS3D identifies the most relevant instances in the scene graph and locates their functional components using a vision-language model. Experiments on the SceneFun3D dataset demonstrate that T-FunS3D performs comparably to state-of-the-art methods in open-vocabulary 3D functionality segmentation, while achieving faster runtime and reduced memory usage.
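The instance-retrieval step described above can be illustrated with a minimal sketch. The abstract does not specify how relevance is scored, so this assumes cosine similarity between a task-description embedding and per-instance visual embeddings in a shared space; `rank_instances` and its arguments are hypothetical names, not part of T-FunS3D.

```python
import numpy as np

def rank_instances(task_embedding, instance_embeddings, top_k=3):
    """Rank scene-graph instances by cosine similarity to a task embedding.

    Assumption: both embeddings live in the same vision-language space.
    task_embedding: shape (d,); instance_embeddings: shape (n, d).
    Returns (indices, scores) of the top_k most similar instances.
    """
    # Normalize so that the dot product equals cosine similarity.
    t = task_embedding / np.linalg.norm(task_embedding)
    e = instance_embeddings / np.linalg.norm(
        instance_embeddings, axis=1, keepdims=True
    )
    scores = e @ t
    # Sort in descending order of similarity and keep the best top_k.
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]
```

Only the instances returned here would be passed to the more expensive part-level stage, which is one plausible reading of how a task-driven pipeline avoids segmenting the whole scene exhaustively.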
RO, Jun 16, 2015
LightPanel: Active Mobile Platform for Dense 3D Modelling
Jonas Schuler, Reza Sabzevari, Davide Scaramuzza
In this paper, we introduce a novel platform for dense 3D modelling. This platform is an active image acquisition setup assisted by a set of light sources and a distance sensor. The hardware setup is designed to be mounted on a mobile robot that is remotely driven to create accurate dense 3D models of out-of-reach objects. To this end, the object is actively illuminated by the imaging setup, and Photometric Stereo is used to recover the dense 3D model. We describe the proposed image acquisition setup, called LightPanel, from design to calibration, and discuss the practical challenges of using Photometric Stereo under uncontrolled lighting conditions.
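For readers unfamiliar with Photometric Stereo, the following is a minimal sketch of its classical Lambertian form, not the LightPanel pipeline itself. It assumes known, distant light directions, no shadows or specularities, and at least three non-coplanar lights; the function name and array layout are illustrative choices.

```python
import numpy as np

def photometric_stereo(intensities, light_dirs):
    """Classical Lambertian photometric stereo via least squares.

    intensities: shape (k, n_pixels), one row per light source.
    light_dirs:  shape (k, 3), unit light directions (assumed known).
    Returns (albedo, normals) with shapes (n_pixels,) and (n_pixels, 3).
    """
    # Lambertian model: I = L @ g, where g = albedo * normal per pixel.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, n_pixels)
    albedo = np.linalg.norm(g, axis=0)
    # Guard against division by zero for pixels with zero albedo.
    normals = (g / np.maximum(albedo, 1e-12)).T
    return albedo, normals
```

The recovered normal field would then be integrated into a depth map in a full system. Under uncontrolled ambient lighting, the Lambertian model above needs extra care, which is the kind of practical challenge the abstract refers to.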
CV, Mar 16, 2015
PiMPeR: Piecewise Dense 3D Reconstruction from Multi-View and Multi-Illumination Images
Reza Sabzevari, Vittorio Murino, Alessio Del Bue
In this paper, we address the problem of dense 3D reconstruction from multiple-view images subject to strong lighting variations. To this end, we propose a new piecewise framework that explicitly takes into account the change of illumination across several wide-baseline images. Unlike multi-view stereo and multi-view photometric stereo methods, this pipeline deals with wide-baseline images that are uncalibrated in terms of both camera parameters and lighting conditions. Such a scenario is meant to avoid the use of any specific imaging setup and to provide a tool for ordinary users without any expertise. To the best of our knowledge, this paper presents the first work that deals with such an unconstrained setting. We propose a coarse-to-fine approach, in which a coarse mesh is first created using a set of geometric constraints, and fine details are then recovered by exploiting photometric properties of the scene. The fine details are added to the coarse mesh in a final optimization step. Note that the method does not provide a generic solution to the multi-view photometric stereo problem, but it relaxes several common assumptions of this problem. Owing to its piecewise nature, the approach scales well with problem size, handling large-scale optimization and severe missing data. Experiments on the Robot benchmark dataset evaluate the method's performance against 3D ground truth.