CV · Nov 20, 2024

FAST-Splat: Fast, Ambiguity-Free Semantics Transfer in Gaussian Splatting

arXiv:2411.13753v2 · 12 citations · h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses efficiency and ambiguity issues in 3D scene understanding for computer vision applications, representing an incremental improvement over prior semantic Gaussian Splatting methods.

The paper tackles slow training, high memory usage, and ambiguous semantic localization in semantic Gaussian Splatting by introducing FAST-Splat, which attaches semantic codes to individual Gaussians and uses a hash table to answer open-vocabulary queries. The authors report 6x to 8x faster training, 18x to 51x faster rendering, and about 6x lower GPU memory usage than the best-competing methods.

We present FAST-Splat for fast, ambiguity-free semantic Gaussian Splatting, which seeks to address the main limitations of existing semantic Gaussian Splatting methods, namely: slow training and rendering speeds; high memory usage; and ambiguous semantic object localization. We take a bottom-up approach in deriving FAST-Splat, dismantling the limitations of closed-set semantic distillation to enable open-set (open-vocabulary) semantic distillation. Ultimately, this key approach enables FAST-Splat to provide precise semantic object localization results, even when prompted with ambiguous user-provided natural-language queries. Further, by exploiting the explicit form of the Gaussian Splatting scene representation to the fullest extent, FAST-Splat retains the remarkable training and rendering speeds of Gaussian Splatting. Precisely, while existing semantic Gaussian Splatting methods distill semantics into a separate neural field or utilize neural models for dimensionality reduction, FAST-Splat directly augments each Gaussian with specific semantic codes, preserving the training, rendering, and memory-usage advantages of Gaussian Splatting over neural field methods. These Gaussian-specific semantic codes, together with a hash table, enable semantic similarity to be measured with open-vocabulary user prompts and further enable FAST-Splat to respond with unambiguous semantic object labels and 3D masks, unlike prior methods. In experiments, we demonstrate that FAST-Splat is 6x to 8x faster to train, renders 18x to 51x faster, and requires about 6x less GPU memory than the best-competing semantic Gaussian Splatting methods. Further, FAST-Splat achieves similar or better semantic segmentation performance compared to existing methods. After the review period, we will provide links to the project website and the codebase.
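
To make the idea of per-Gaussian semantic codes plus a hash table more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the code format, the table layout, or the text encoder, so everything below is an assumption for illustration. The table contents, the three-dimensional toy embeddings, and the names `SEMANTIC_TABLE`, `GAUSSIAN_CODES`, and `localize` are all hypothetical. A real system would presumably obtain embeddings from a vision-language model rather than hard-coding them.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical hash table: integer semantic code -> (label, toy embedding).
# The embeddings are made up; they stand in for text/image features.
SEMANTIC_TABLE = {
    0: ("mug", [0.9, 0.1, 0.0]),
    1: ("kettle", [0.6, 0.7, 0.1]),
    2: ("table", [0.0, 0.2, 0.9]),
}

# Assumed per-Gaussian storage: one integer code per Gaussian,
# instead of a high-dimensional feature vector per Gaussian.
GAUSSIAN_CODES = [0, 0, 2, 1, 2, 2, 1]


def localize(query_embedding, table, codes):
    """Return the best-matching label and a boolean mask over Gaussians."""
    # Compare the query against each table entry, not against every Gaussian.
    best_code = max(table, key=lambda c: cosine(query_embedding, table[c][1]))
    label = table[best_code][0]
    # The 3D mask is simply the set of Gaussians carrying the winning code.
    mask = [code == best_code for code in codes]
    return label, mask


# Toy embedding of an ambiguous prompt such as "something to drink from".
label, mask = localize([0.8, 0.3, 0.0], SEMANTIC_TABLE, GAUSSIAN_CODES)
print(label, mask)
```

In this sketch the query is compared against one entry per table key rather than against every Gaussian, and the answer is a single explicit label rather than a similarity heatmap. That is one plausible reading of how a lookup table could yield the unambiguous labels and 3D masks the abstract describes.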

