CV · Oct 29, 2025

VADB: A Large-Scale Video Aesthetic Database with Professional and Multi-Dimensional Annotations

Qianqian Qiao, DanDan Zheng, Yihang Bo, Bao Peng, Heng Huang, Longteng Jiang, Huaye Wang, Jingdong Chen, Jun Zhou, Xin Jin

arXiv:2510.25238v2 · 3 citations · h-index: 7 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the limited progress in video aesthetic assessment, a concern for multimedia computing researchers. The contribution is incremental: it builds on existing methods, adding new data and a hybrid approach.

The study tackles the lack of standardized datasets and robust models for video aesthetic assessment by introducing two components: VADB, a large-scale database of 10,490 videos annotated by professionals across multiple dimensions, and VADB-Net, a dual-modal pre-training framework that outperforms existing video quality assessment models in scoring tasks.

Video aesthetic assessment, a vital area in multimedia computing, integrates computer vision with human cognition. Its progress is limited by the lack of standardized datasets and robust models, as the temporal dynamics of video and multimodal fusion challenges hinder direct application of image-based methods. This study introduces VADB, the largest video aesthetic database with 10,490 diverse videos annotated by 37 professionals across multiple aesthetic dimensions, including overall and attribute-specific aesthetic scores, rich language comments and objective tags. We propose VADB-Net, a dual-modal pre-training framework with a two-stage training strategy, which outperforms existing video quality assessment models in scoring tasks and supports downstream video aesthetic assessment tasks. The dataset and source code are available at https://github.com/BestiVictory/VADB.
