CV · Dec 2, 2024

Semantic Scene Completion with Multi-Feature Data Balancing Network

arXiv:2412.01431v1 · h-index: 2
Originality: Incremental advance
AI Analysis

The paper addresses data imbalance and class ambiguity in 3D scene reconstruction for applications such as virtual reality, and it represents an incremental improvement over prior work.

The paper tackles the problem of Semantic Scene Completion (SSC) by proposing MDBNet, a dual-head model for RGB and depth data, which surpasses state-of-the-art methods on NYU datasets.

Semantic Scene Completion (SSC) is a critical task in computer vision that is used in applications such as virtual reality (VR). SSC aims to construct detailed 3D models from partial views by transforming a single 2D image into a 3D representation and assigning each voxel a semantic label. The main challenge lies in completing 3D volumes from limited information, compounded by data imbalance, inter-class ambiguity, and intra-class diversity in indoor scenes. To address this, we propose the Multi-Feature Data Balancing Network (MDBNet), a dual-head model for RGB and depth data (F-TSDF) inputs. Our hybrid encoder-decoder architecture with identity transformation in a pre-activation residual module (ITRM) effectively manages diverse signals within F-TSDF. We evaluate RGB feature fusion strategies and use a combined loss function: cross-entropy for 2D RGB features and weighted cross-entropy for 3D SSC predictions. MDBNet's results surpass comparable state-of-the-art (SOTA) methods on NYU datasets, demonstrating the effectiveness of our approach.
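The combined loss mentioned in the abstract can be illustrated with a minimal pure-Python sketch. The abstract only states that a cross-entropy term (2D RGB features) is combined with a weighted cross-entropy term (3D SSC predictions); everything else here is an assumption made for illustration: the inverse-frequency weighting scheme, the balancing factor `lam`, the normalization by the sum of weights, and all function names. It is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of per-class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits_batch, labels, class_weights=None):
    """Cross-entropy averaged over samples, optionally weighted per class.

    With class_weights=None this is plain cross-entropy. The weighted mean
    divides by the sum of the weights (an assumed normalization).
    """
    num, den = 0.0, 0.0
    for logits, y in zip(logits_batch, labels):
        w = 1.0 if class_weights is None else class_weights[y]
        num += -w * math.log(softmax(logits)[y])
        den += w
    return num / den


def inverse_frequency_weights(labels, num_classes):
    """Hypothetical weighting scheme: rare classes get larger weights."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return [n / (num_classes * c) if c > 0 else 0.0 for c in counts]


def combined_loss(rgb_logits, rgb_labels, voxel_logits, voxel_labels,
                  class_weights, lam=1.0):
    """Plain CE on the 2D head plus weighted CE on the 3D head.

    `lam` is an assumed balancing factor, not a value from the paper.
    """
    loss_2d = weighted_cross_entropy(rgb_logits, rgb_labels)
    loss_3d = weighted_cross_entropy(voxel_logits, voxel_labels, class_weights)
    return loss_2d + lam * loss_3d


# Toy example: 4 voxels, 2 classes, class 1 is rare.
voxel_labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(voxel_labels, num_classes=2)
voxel_logits = [[2.0, 0.0], [1.5, 0.5], [1.0, 0.0], [0.2, 0.1]]
rgb_logits, rgb_labels = [[0.3, 1.2], [2.0, 0.1]], [1, 0]
total = combined_loss(rgb_logits, rgb_labels, voxel_logits, voxel_labels, weights)
```

In this toy setup the rare class receives a weight three times larger than the frequent one, so a misclassified rare voxel contributes more to the 3D term. This is one common way to counter the class imbalance the abstract describes.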

