CV | Jan 8

DivAS: Interactive 3D Segmentation of NeRFs via Depth-Weighted Voxel Aggregation

arXiv:2601.04860v1
Originality: Incremental advance
AI Analysis

This paper addresses the need for faster, interactive 3D segmentation in computer vision applications. The advance is incremental, since it builds on existing 2D foundation models and NeRF techniques.

The paper tackles the slow, optimization-based segmentation of Neural Radiance Fields (NeRFs) by introducing DivAS, an optimization-free, interactive framework. DivAS uses depth priors and a custom CUDA kernel to achieve segmentation quality comparable to existing methods while running 2-2.5x faster end-to-end.

Existing methods for segmenting Neural Radiance Fields (NeRFs) are often optimization-based, requiring slow per-scene training that sacrifices the zero-shot capabilities of 2D foundation models. We introduce DivAS (Depth-interactive Voxel Aggregation Segmentation), an optimization-free, fully interactive framework that addresses these limitations. Our method operates via a fast GUI-based workflow where 2D SAM masks, generated from user point prompts, are refined using NeRF-derived depth priors to improve geometric accuracy and foreground-background separation. The core of our contribution is a custom CUDA kernel that aggregates these refined multi-view masks into a unified 3D voxel grid in under 200ms, enabling real-time visual feedback. This optimization-free design eliminates the need for per-scene training. Experiments on Mip-NeRF 360° and LLFF show that DivAS achieves segmentation quality comparable to optimization-based methods, while being 2-2.5x faster end-to-end, and up to an order of magnitude faster when excluding user prompting time.
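
The abstract does not spell out the aggregation rule, so the following is only a minimal sketch of what depth-weighted voxel aggregation could look like, not the authors' kernel. It assumes orthographic views along the grid axes, a Gaussian weight on the gap between a voxel's depth and the rendered depth, and a normalized vote threshold. The names `aggregate_masks`, `segment`, `sigma`, and `threshold` are hypothetical. A CUDA version would presumably assign one thread per voxel instead of looping.

```python
import math


def aggregate_masks(views, grid_size, sigma=1.0):
    """Accumulate depth-weighted mask votes into a dense voxel grid.

    Each view is a dict with (all assumed for illustration):
      "axis":  viewing axis (0, 1 or 2) of an orthographic camera
      "mask":  2D list of bools, the refined 2D mask
      "depth": 2D list of floats, rendered depth along the viewing axis
    """
    n = grid_size
    score = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for view in views:
        axis, mask, depth = view["axis"], view["mask"], view["depth"]
        for x in range(n):
            for y in range(n):
                for z in range(n):
                    voxel = (x, y, z)
                    # Orthographic projection: drop the viewing axis.
                    u, v = [voxel[a] for a in range(3) if a != axis]
                    if not mask[u][v]:
                        continue  # pixel is background in this view
                    # Weight the vote by how well the voxel's depth
                    # agrees with the rendered surface depth.
                    diff = voxel[axis] - depth[u][v]
                    score[x][y][z] += math.exp(-diff * diff / (2.0 * sigma * sigma))
    return score


def segment(score, num_views, threshold=0.5):
    """Keep voxels whose average weighted vote reaches the threshold."""
    n = len(score)
    return {
        (x, y, z)
        for x in range(n)
        for y in range(n)
        for z in range(n)
        if score[x][y][z] / num_views >= threshold
    }
```

In this sketch a voxel survives only if several views both mark its pixel as foreground and place the surface near its depth, which is one way depth priors could suppress background voxels that happen to lie along a masked ray.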

