CV · Dec 2, 2024

Occam's LGS: An Efficient Approach for Language Gaussian Splatting

Jiahuan Cheng, Jan-Nico Zaech, Luc Van Gool, Danda Pani Paudel

arXiv:2412.01807v2 · 14.111 citations · h-index: 30

Originality: Incremental advance

AI Analysis

This work addresses efficiency issues in 3D scene representation for open-set tasks, offering a significant speed improvement for researchers and practitioners in computer vision and graphics.

The paper tackles the high computational cost and long training times of language 3D Gaussian Splatting. It proposes a probabilistic formulation with weighted multi-view feature aggregation and reports state-of-the-art results with a speed-up of two orders of magnitude.

TL;DR: Gaussian Splatting is a widely adopted approach for 3D scene representation, offering efficient, high-quality reconstruction and rendering. A key reason for its success is the simplicity of representing scenes with sets of Gaussians, making it interpretable and adaptable. To enhance understanding beyond visual representation, recent approaches extend Gaussian Splatting with semantic vision-language features, enabling open-set tasks. Typically, these language features are aggregated from multiple 2D views; however, existing methods rely on cumbersome techniques, resulting in high computational costs and longer training times. In this work, we show that the complicated pipelines for language 3D Gaussian Splatting are simply unnecessary. Instead, we follow a probabilistic formulation of Language Gaussian Splatting and apply Occam's razor to the task at hand, leading to a highly efficient weighted multi-view feature aggregation technique. Doing so offers us state-of-the-art results with a speed-up of two orders of magnitude without any compression, allowing for easy scene manipulation. Project Page: https://insait-institute.github.io/OccamLGS/
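To make the idea of weighted multi-view feature aggregation concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the weights are computed, so the sketch assumes each Gaussian has a non-negative contribution weight for every pixel it touches (for example, an alpha-blending weight from rendering), pooled over all views. The function name `aggregate_features` and its arguments are hypothetical.

```python
import numpy as np

def aggregate_features(weights, pixel_features, eps=1e-8):
    """Weighted average of 2D language features for each Gaussian.

    weights:        (num_gaussians, num_pixels) non-negative contribution of
                    each Gaussian to each pixel, pooled over all views.
                    Assumed input, e.g. alpha-blending weights.
    pixel_features: (num_pixels, feature_dim) 2D vision-language features.
    Returns an array of shape (num_gaussians, feature_dim).
    """
    weights = np.asarray(weights, dtype=np.float64)
    pixel_features = np.asarray(pixel_features, dtype=np.float64)

    # Weighted sum of pixel features per Gaussian: shape (G, D).
    numerator = weights @ pixel_features
    # Total weight each Gaussian received across all pixels: shape (G, 1).
    denominator = weights.sum(axis=1, keepdims=True)
    # Gaussians that no pixel observes get a zero feature vector.
    return numerator / np.maximum(denominator, eps)


# Toy example: 2 Gaussians, 3 pixels, 2-dimensional features.
weights = np.array([[1.0, 1.0, 0.0],
                    [0.0, 0.0, 2.0]])
pixel_features = np.array([[1.0, 0.0],
                           [0.0, 1.0],
                           [1.0, 1.0]])
gaussian_features = aggregate_features(weights, pixel_features)
```

Under these assumptions the aggregation is a single pass of matrix products with no per-scene optimization loop, and the features keep their full dimension, which is in the spirit of the "without any compression" claim above.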
