73.2CVJun 1
WebSpline: Structure-Informed Splines for Real-Time 3D Gaussians from Monocular VideosJongmin Park, Jeonghwan Yun, Minh-Quan Viet Bui et al.
Dynamic scene reconstruction from monocular videos remains highly challenging, as existing methods often struggle to balance global structural coherence and local fine-grained details under limited multi-view cues. To address this challenge, we propose WebSpline, a novel dynamic 3D Gaussian framework that enables structurally coherent and high-fidelity reconstruction from monocular videos with fast rendering. The core of WebSpline is the Structure-Informed Spline (SIS) representation, which models each dynamic Gaussian trajectory using a learnable cubic Hermite spline whose motion is structurally organized with an auxiliary Structural Proxy Graph (SPG). The proposed framework is optimized in two stages: (i) in the first stage, the SPG is initialized from 2D point tracks and refined with temporal rigidity regularization to establish structural coherence for moving objects across the sequence; and (ii) in the second stage, the SIS representation is initialized from the refined SPG and optimized under both spatial and structural neighborhood constraints. At inference, Gaussian motion is obtained solely by evaluating the learned SIS, enabling fast rendering. Extensive experiments on the challenging monocular dynamic scene benchmarks, iPhone and NVIDIA, demonstrate that our WebSpline achieves state-of-the-art rendering quality while rendering over 10 times faster than WorldTree, the second-best method on the iPhone dataset.
CVDec 11, 2024
Diffusion-based Data Augmentation and Knowledge Distillation with Generated Soft Labels Solving Data Scarcity Problems of SAR Oil Spill SegmentationJaeho Moon, Jeonghwan Yun, Jaehyun Kim et al.
Oil spills pose severe environmental risks, making early detection crucial for effective response and mitigation. As Synthetic Aperture Radar (SAR) images operate under all-weather conditions, SAR-based oil spill segmentation enables fast and robust monitoring. However, when using deep learning models, SAR oil spill segmentation often struggles in training due to the scarcity of labeled data. To address this limitation, we propose a diffusion-based data augmentation with knowledge transfer (DAKTer) strategy. Our DAKTer strategy enables a diffusion model to generate SAR oil spill images along with soft label pairs, which offer richer class probability distributions than segmentation masks (i.e. hard labels). Also, for reliable joint generation of high-quality SAR images and well-aligned soft labels, we introduce an SNR-based balancing factor aligning the noise corruption process of both modalilties in diffusion models. By leveraging the generated SAR images and soft labels, a student segmentation model can learn robust feature representations without teacher models trained for the same task, improving its ability to segment oil spill regions. Extensive experiments demonstrate that our DAKTer strategy effectively transfers the knowledge of per-pixel class probabilities to the student segmentation model to distinguish the oil spill regions from other look-alike regions in the SAR images. Our DAKTer strategy boosts various segmentation models to achieve superior performance with large margins compared to other generative data augmentation methods.