CV | Oct 17, 2024

Pseudo Dataset Generation for Out-of-Domain Multi-Camera View Recommendation

arXiv:2410.13585v1 | 1 citation | h-index: 3 | VCIP
Originality: Incremental advance
AI Analysis

This work addresses the domain adaptation challenge that media production professionals face, though it is incremental because it builds on existing learning-based frameworks.

The paper tackles the problem that multi-camera view recommendation models struggle outside their training domains, a problem made worse by scarce labeled data. It generates pseudo-labeled datasets from regular videos in the target domain, which yields a 68% relative improvement in out-of-domain accuracy.

Multi-camera systems are indispensable in movies, TV shows, and other media. Selecting the appropriate camera at every timestamp has a decisive impact on production quality and audience preferences. Learning-based view recommendation frameworks can assist professionals in decision-making. However, they often struggle outside of their training domains. The scarcity of labeled multi-camera view recommendation datasets exacerbates the issue. Based on the insight that many videos are edited from the original multi-camera videos, we propose transforming regular videos into pseudo-labeled multi-camera view recommendation datasets. Promisingly, by training the model on pseudo-labeled datasets stemming from videos in the target domain, we achieve a 68% relative improvement in the model's accuracy in the target domain and bridge the accuracy gap between in-domain and never-before-seen domains.
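To make the "68% relative improvement" figure concrete, here is a minimal sketch of how a relative improvement is usually computed. The function name and the two accuracy values are hypothetical, chosen only so the arithmetic lands on 68%; they are not numbers reported by the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical accuracies (assumptions for illustration, not results from the paper).
baseline_acc = 0.25  # model trained only on the source domain
adapted_acc = 0.42   # model trained on pseudo-labeled target-domain data

gain = relative_improvement(baseline_acc, adapted_acc)
print(f"relative improvement: {gain:.0%}")  # prints "relative improvement: 68%"
```

Note that a relative improvement is measured against the baseline, so the same 68% can correspond to very different absolute gains in accuracy depending on where the baseline starts.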
