Yuyang You

2 Papers

66.3 CV Mar 23
Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation

Yuyang You, Yongzhi Li, Jiahui Li et al.

Video generation has recently emerged as a central task in the field of generative AI. However, the substantial computational cost inherent in video synthesis makes model distillation a critical technique for efficient deployment. Despite its significance, there is a scarcity of methods specifically designed for video diffusion models. Prevailing approaches often directly adapt image distillation techniques, which frequently lead to artifacts such as oversaturation, temporal inconsistency, and mode collapse. To address these challenges, we propose a novel distillation framework tailored specifically for video diffusion models. Its core innovations include: (1) an adaptive regression loss that dynamically adjusts spatial supervision weights to prevent artifacts arising from excessive distribution shifts; (2) a temporal regularization loss to counteract temporal collapse, promoting smooth and physically plausible sampling trajectories; and (3) an inference-time frame interpolation strategy that reduces sampling overhead while preserving perceptual quality. Extensive experiments and ablation studies on the VBench and VBench2 benchmarks demonstrate that our method achieves stable few-step video synthesis, significantly enhancing perceptual fidelity and motion realism. It consistently outperforms existing distillation baselines across multiple metrics.

CV Jan 31, 2021
Spectral Roll-off Points Variations: Exploring Useful Information in Feature Maps by Its Variations

Yunkai Yu, Yuyang You, Zhihong Yang et al.

Useful information (UI) is an elusive concept in neural networks. A quantitative measurement of UI is absent, although variations in UI can be recognized from prior knowledge. The communication bandwidth of feature maps decreases after downscaling operations, but UI flows smoothly after training because of its lower Nyquist frequency. Inspired by the low-Nyquist-frequency nature of UI, we propose the use of spectral roll-off points (SROPs) to estimate variations in UI. The computation of an SROP is extended from a 1-D signal to a 2-D image by imposing the rotation invariance required in image classification tasks. SROP statistics across feature maps are used as layer-wise useful information estimates. We design sanity checks to explore SROP variations when UI variations are produced by variations in model input, model architecture, and training stage. SROP variations are synchronized with UI variations across various randomized and sufficiently trained model structures. Therefore, SROP variations are an accurate and convenient indicator of UI variations, which promotes the explainability of data representations with respect to frequency-domain knowledge.
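As a rough illustration of the idea in this abstract, the sketch below computes a rotation-invariant spectral roll-off point for a single 2-D feature map and averages it over the channels of a layer. It is not the authors' implementation: the function names `spectral_rolloff_2d` and `layer_srop`, the 85% energy threshold (a common convention for 1-D roll-off), the mean subtraction, and the use of the mean as the layer statistic are all assumptions made for this example.

```python
import numpy as np


def spectral_rolloff_2d(feature_map, fraction=0.85):
    """Radial frequency (cycles/pixel) below which `fraction` of the
    spectral energy of a 2-D array lies.

    Assumptions of this sketch: the mean is removed so the DC term does
    not dominate, and bins are ordered by radial frequency so that the
    result does not depend on the orientation of the pattern.
    """
    fmap = np.asarray(feature_map, dtype=float)
    fmap = fmap - fmap.mean()
    h, w = fmap.shape

    # Power spectrum and the radial frequency of every FFT bin.
    power = np.abs(np.fft.fft2(fmap)) ** 2
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2).ravel()

    # Accumulate energy from low to high radial frequency.
    order = np.argsort(radius, kind="stable")
    cumulative = np.cumsum(power.ravel()[order])
    total = cumulative[-1]
    if total == 0:
        return 0.0  # constant map: no energy outside DC

    idx = np.searchsorted(cumulative, fraction * total)
    idx = min(idx, cumulative.size - 1)
    return float(radius[order][idx])


def layer_srop(feature_maps, fraction=0.85):
    """Mean roll-off point over the channels of a (C, H, W) array.

    Using the mean as the layer-wise statistic is an assumption.
    """
    return float(np.mean([spectral_rolloff_2d(f, fraction) for f in feature_maps]))


if __name__ == "__main__":
    x = np.arange(64)
    smooth = np.tile(np.cos(2 * np.pi * 2 * x / 64), (64, 1))   # 2 cycles per row
    fine = np.tile(np.cos(2 * np.pi * 16 * x / 64), (64, 1))    # 16 cycles per row
    print(spectral_rolloff_2d(smooth), spectral_rolloff_2d(fine))
    print(layer_srop(np.stack([smooth, fine])))
```

Under these assumptions, a feature map dominated by coarse structure yields a small roll-off point and one dominated by fine detail yields a larger one, so tracking the layer-wise value across inputs, architectures, or training stages gives one scalar per layer to compare.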