CV | Dec 2, 2024

Occam's LGS: An Efficient Approach for Language Gaussian Splatting

arXiv:2412.01807v2 | 11 citations | h-index: 30
AI Analysis

This work addresses the efficiency of language-augmented 3D scene representations for open-set tasks, reporting a speed-up of two orders of magnitude that is relevant to researchers and practitioners in computer vision and graphics.

The paper tackles the high computational cost and long training times of language 3D Gaussian Splatting by proposing a probabilistic formulation with weighted multi-view feature aggregation, achieving state-of-the-art results with a speed-up of two orders of magnitude.

TL;DR: Gaussian Splatting is a widely adopted approach for 3D scene representation, offering efficient, high-quality reconstruction and rendering. A key reason for its success is the simplicity of representing scenes with sets of Gaussians, which makes it interpretable and adaptable. To enhance understanding beyond visual representation, recent approaches extend Gaussian Splatting with semantic vision-language features, enabling open-set tasks. Typically, these language features are aggregated from multiple 2D views. However, existing methods rely on cumbersome techniques, resulting in high computational costs and longer training times. In this work, we show that the complicated pipelines for language 3D Gaussian Splatting are simply unnecessary. Instead, we follow a probabilistic formulation of Language Gaussian Splatting and apply Occam's razor to the task at hand, leading to a highly efficient weighted multi-view feature aggregation technique. Doing so offers us state-of-the-art results with a speed-up of two orders of magnitude without any compression, allowing for easy scene manipulation. Project Page: https://insait-institute.github.io/OccamLGS/
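To make "weighted multi-view feature aggregation" concrete, here is a minimal NumPy sketch of one plausible reading: each Gaussian receives the weighted average of the 2D language features at the pixels it contributes to, accumulated over all views. The summary above does not specify the weights or the data layout, so everything here is an assumption for illustration: the function name `aggregate_language_features`, the per-view tuple layout, and the use of per-pixel rendering contributions as weights are all hypothetical and should not be read as the authors' implementation.

```python
import numpy as np

def aggregate_language_features(views, num_gaussians, feat_dim, eps=1e-8):
    """Weighted average of 2D features per Gaussian across views (illustrative).

    views: iterable of (gaussian_ids, pixel_ids, weights, feature_map), where
      gaussian_ids[k], pixel_ids[k], weights[k] describe one contribution of a
      Gaussian to a pixel (hypothetical layout), and feature_map has shape
      (num_pixels, feat_dim).
    """
    numer = np.zeros((num_gaussians, feat_dim))
    denom = np.zeros(num_gaussians)
    for gaussian_ids, pixel_ids, weights, feature_map in views:
        # np.add.at accumulates correctly when the same Gaussian index repeats.
        np.add.at(numer, gaussian_ids, weights[:, None] * feature_map[pixel_ids])
        np.add.at(denom, gaussian_ids, weights)
    seen = denom > eps  # Gaussians that received at least some weight
    feats = np.zeros_like(numer)
    feats[seen] = numer[seen] / denom[seen, None]
    return feats, seen

# Toy usage: 3 Gaussians, 2 views, 2-dimensional features.
view1 = (np.array([0, 0, 1]), np.array([0, 1, 1]),
         np.array([0.5, 0.5, 1.0]), np.array([[1.0, 0.0], [0.0, 1.0]]))
view2 = (np.array([0]), np.array([0]),
         np.array([1.0]), np.array([[1.0, 0.0]]))
feats, seen = aggregate_language_features([view1, view2], num_gaussians=3, feat_dim=2)
```

In this sketch the aggregation is a single pass over the views with no optimization loop, which is the kind of simplification the abstract appeals to. Gaussians that are never observed keep a zero feature and are flagged in `seen`.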

