CVAug 27, 2024
LapisGS: Layered Progressive 3D Gaussian Splatting for Adaptive StreamingYuang Shi, Géraldine Morin, Simone Gasparini et al.
The rise of Extended Reality (XR) requires efficient streaming of 3D online worlds, challenging current 3DGS representations to adapt to bandwidth-constrained environments. This paper proposes LapisGS, a layered 3DGS that supports adaptive streaming and progressive rendering. Our method constructs a layered structure for cumulative representation, incorporates dynamic opacity optimization to maintain visual fidelity, and utilizes occupancy maps to efficiently manage Gaussian splats. This proposed model offers a progressive representation supporting a continuous rendering quality adapted for bandwidth-aware streaming. Extensive experiments validate the effectiveness of our approach in balancing visual fidelity with the compactness of the model, with up to 50.71% improvement in SSIM, 286.53% improvement in LPIPS with 23% of the original model size, and shows its potential for bandwidth-adapted 3D streaming and rendering applications.
CVJan 8
Sketch&Patch++: Efficient Structure-Aware 3D Gaussian RepresentationYuang Shi, Géraldine Morin, Simone Gasparini et al.
We observe that Gaussians exhibit distinct roles and characteristics analogous to traditional artistic techniques -- like how artists first sketch outlines before filling in broader areas with color, some Gaussians capture high-frequency features such as edges and contours, while others represent broader, smoother regions analogous to brush strokes that add volume and depth. Based on this observation, we propose a hybrid representation that categorizes Gaussians into (i) Sketch Gaussians, which represent high-frequency, boundary-defining features, and (ii) Patch Gaussians, which cover low-frequency, smooth regions. This semantic separation naturally enables layered progressive streaming, where the compact Sketch Gaussians establish the structural skeleton before Patch Gaussians incrementally refine volumetric detail. In this work, we extend our previous method to arbitrary 3D scenes by proposing a novel hierarchical adaptive categorization framework that operates directly on the 3DGS representation. Our approach employs multi-criteria density-based clustering, combined with adaptive quality-driven refinement. This method eliminates dependency on external 3D line primitives while ensuring optimal parametric encoding effectiveness. Our comprehensive evaluation across diverse scenes, including both man-made and natural environments, demonstrates that our method achieves up to 1.74 dB improvement in PSNR, 6.7% in SSIM, and 41.4% in LPIPS at equivalent model sizes compared to uniform pruning baselines. For indoor scenes, our method can maintain visual quality with only 0.5\% of the original model size. This structure-aware representation enables efficient storage, adaptive streaming, and rendering of high-fidelity 3D content across bandwidth-constrained networks and resource-limited devices.
CVJan 22, 2025
Sketch and Patch: Efficient 3D Gaussian Representation for Man-Made ScenesYuang Shi, Simone Gasparini, Géraldine Morin et al.
3D Gaussian Splatting (3DGS) has emerged as a promising representation for photorealistic rendering of 3D scenes. However, its high storage requirements pose significant challenges for practical applications. We observe that Gaussians exhibit distinct roles and characteristics that are analogous to traditional artistic techniques -- Like how artists first sketch outlines before filling in broader areas with color, some Gaussians capture high-frequency features like edges and contours; While other Gaussians represent broader, smoother regions, that are analogous to broader brush strokes that add volume and depth to a painting. Based on this observation, we propose a novel hybrid representation that categorizes Gaussians into (i) Sketch Gaussians, which define scene boundaries, and (ii) Patch Gaussians, which cover smooth regions. Sketch Gaussians are efficiently encoded using parametric models, leveraging their geometric coherence, while Patch Gaussians undergo optimized pruning, retraining, and vector quantization to maintain volumetric consistency and storage efficiency. Our comprehensive evaluation across diverse indoor and outdoor scenes demonstrates that this structure-aware approach achieves up to 32.62% improvement in PSNR, 19.12% in SSIM, and 45.41% in LPIPS at equivalent model sizes, and correspondingly, for an indoor scene, our model maintains the visual quality with 2.3% of the original model size.
GROct 2, 2025
ROI-GS: Interest-based Local Quality 3D Gaussian SplattingQuoc-Anh Bui, Gilles Rougeron, Géraldine Morin et al.
We tackle the challenge of efficiently reconstructing 3D scenes with high detail on objects of interest. Existing 3D Gaussian Splatting (3DGS) methods allocate resources uniformly across the scene, limiting fine detail to Regions Of Interest (ROIs) and leading to inflated model size. We propose ROI-GS, an object-aware framework that enhances local details through object-guided camera selection, targeted Object training, and seamless integration of high-fidelity object of interest reconstructions into the global scene. Our method prioritizes higher resolution details on chosen objects while maintaining real-time performance. Experiments show that ROI-GS significantly improves local quality (up to 2.96 dB PSNR), while reducing overall model size by $\approx 17\%$ of baseline and achieving faster training for a scene with a single object of interest, outperforming existing methods.
CVFeb 18, 2025
ROI-NeRFs: Hi-Fi Visualization of Objects of Interest within a Scene by NeRFs CompositionQuoc-Anh Bui, Gilles Rougeron, Géraldine Morin et al.
Efficient and accurate 3D reconstruction is essential for applications in cultural heritage. This study addresses the challenge of visualizing objects within large-scale scenes at a high level of detail (LOD) using Neural Radiance Fields (NeRFs). The aim is to improve the visual fidelity of chosen objects while maintaining the efficiency of the computations by focusing on details only for relevant content. The proposed ROI-NeRFs framework divides the scene into a Scene NeRF, which represents the overall scene at moderate detail, and multiple ROI NeRFs that focus on user-defined objects of interest. An object-focused camera selection module automatically groups relevant cameras for each NeRF training during the decomposition phase. In the composition phase, a Ray-level Compositional Rendering technique combines information from the Scene NeRF and ROI NeRFs, allowing simultaneous multi-object rendering composition. Quantitative and qualitative experiments conducted on two real-world datasets, including one on a complex eighteen's century cultural heritage room, demonstrate superior performance compared to baseline methods, improving LOD for object regions, minimizing artifacts, and without significantly increasing inference time.
CVFeb 26, 2018
Using Curvilinear Features in Focus for Registering a Single Image to a 3D ObjectHatem A. Rashwan, Sylvie Chambon, Pierre Gurdjos et al.
In the context of 2D/3D registration, this paper introduces an approach that allows to match features detected in two different modalities: photographs and 3D models, by using a common 2D reprensentation. More precisely, 2D images are matched with a set of depth images, representing the 3D model. After introducing the concept of curvilinear saliency, related to curvature estimation, we propose a new ridge and valley detector for depth images rendered from 3D model. A variant of this detector is adapted to photographs, in particular by applying it in multi-scale and by combining this feature detector with the principle of focus curves. Finally, a registration algorithm for determining the correct viewpoint of the 3D model and thus the pose is proposed. It is based on using histogram of gradients features adapted to the features manipulated in 2D and in 3D, and the introduction of repeatability scores. The results presented highlight the quality of the features detected, in term of repeatability, and also the interest of the approach for registration and pose estimation.