Mingyao Hong

CVDec 31, 2025Code

Splatwizard: A Benchmark Toolkit for 3D Gaussian Splatting Compression

Xiang Liu, Yimin Zhou, Jinxiang Wang et al.

The recent advent of 3D Gaussian Splatting (3DGS) has marked a significant breakthrough in real-time novel view synthesis. However, the rapid proliferation of 3DGS-based algorithms has created a pressing need for standardized and comprehensive evaluation tools, especially for compression task. Existing benchmarks often lack the specific metrics necessary to holistically assess the unique characteristics of different methods, such as rendering speed, rate distortion trade-offs memory efficiency, and geometric accuracy. To address this gap, we introduce Splatwizard, a unified benchmark toolkit designed specifically for benchmarking 3DGS compression models. Splatwizard provides an easy-to-use framework to implement new 3DGS compression model and utilize state-of-the-art techniques proposed by previous work. Besides, an integrated pipeline that automates the calculation of key performance indicators, including image-based quality metrics, chamfer distance of reconstruct mesh, rendering frame rates, and computational resource consumption is included in the framework as well. Code is available at https://github.com/splatwizard/splatwizard

CVMay 9, 2025

FaSDiff: Balancing Perception and Semantics in Face Compression via Stable Diffusion Priors

Yimin Zhou, Yichong Xia, Bin Chen et al.

With the increasing deployment of facial image data across a wide range of applications, efficient compression tailored to facial semantics has become critical for both storage and transmission. While recent learning-based face image compression methods have achieved promising results, they often suffer from degraded reconstruction quality at low bit rates. Directly applying diffusion-based generative priors to this task leads to suboptimal performance in downstream machine vision tasks, primarily due to poor preservation of high-frequency details. In this work, we propose FaSDiff (\textbf{Fa}cial Image Compression with a \textbf{S}table \textbf{Diff}usion Prior), a novel diffusion-driven compression framework designed to enhance both visual fidelity and semantic consistency. FaSDiff incorporates a high-frequency-sensitive compressor to capture fine-grained details and generate robust visual prompts for guiding the diffusion model. To address low-frequency degradation, we further introduce a hybrid low-frequency enhancement module that disentangles and preserves semantic structures, enabling stable modulation of the diffusion prior during reconstruction. By jointly optimizing perceptual quality and semantic preservation, FaSDiff effectively balances human visual fidelity and machine vision accuracy. Extensive experiments demonstrate that FaSDiff outperforms state-of-the-art methods in both perceptual metrics and downstream task performance.

Mingyao Hong

2 Papers