CV, AI, IV | Oct 16, 2025

Camera Movement Classification in Historical Footage: A Comparative Study of Deep Video Models

arXiv:2510.14713v1 | h-index: 31
Originality: Synthesis-oriented
AI Analysis

The paper addresses the challenge of adapting camera movement classification methods to archival film material. The contribution is incremental: it applies existing models to a new domain.

This paper tackles camera movement classification in historical footage by evaluating deep video models on the HISTORIAN dataset, finding that Video Swin Transformer achieves 80.25% accuracy.

Camera movement conveys spatial and narrative information essential for understanding video content. While recent camera movement classification (CMC) methods perform well on modern datasets, their generalization to historical footage remains unexplored. This paper presents the first systematic evaluation of deep video CMC models on archival film material. We summarize representative methods and datasets, highlighting differences in model design and label definitions. Five standard video classification models are assessed on the HISTORIAN dataset, which includes expert-annotated World War II footage. The best-performing model, Video Swin Transformer, achieves 80.25% accuracy, showing strong convergence despite limited training data. Our findings highlight the challenges and potential of adapting existing models to low-quality video and motivate future work combining diverse input modalities and temporal architectures.
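The headline figure above is a classification accuracy. As a minimal sketch of how such a score is typically computed, assuming top-1 accuracy over per-clip labels, the snippet below compares predicted and annotated movement classes. The label names (`pan`, `tilt`, `track`, `static`) and the helper functions are illustrative assumptions, not the HISTORIAN label set or the authors' evaluation code.

```python
from collections import Counter


def top1_accuracy(y_true, y_pred):
    """Fraction of clips whose predicted label equals the annotated label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per annotated class; useful when classes are imbalanced."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical annotations and predictions for five clips.
y_true = ["pan", "tilt", "track", "pan", "static"]
y_pred = ["pan", "tilt", "pan", "pan", "static"]

accuracy = top1_accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
```

Reporting per-class recall alongside overall accuracy is a common choice for small, imbalanced datasets, since a single accuracy number can hide weak performance on rare movement types.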

