CV, Dec 11, 2025

ShotDirector: Directorially Controllable Multi-Shot Video Generation with Cinematographic Transitions

arXiv:2512.10286v1, 1 citation, h-index: 13
Originality: Incremental advance
AI Analysis

The paper addresses the need for more narrative and directorial control in video generation for filmmakers and content creators, offering a novel method for a known bottleneck.

The paper tackles the problem of generating multi-shot videos with intentional cinematographic transitions. It proposes ShotDirector, which integrates camera control with editing-pattern-aware prompting to achieve film-like, controllable shot transitions, as demonstrated through extensive experiments.

Shot transitions play a pivotal role in multi-shot video generation, as they determine the overall narrative expression and the directorial design of visual storytelling. However, recent progress has primarily focused on low-level visual consistency across shots, neglecting how transitions are designed and how cinematographic language contributes to coherent narrative expression. This often leads to mere sequential shot changes without intentional film-editing patterns. To address this limitation, we propose ShotDirector, an efficient framework that integrates parameter-level camera control and hierarchical editing-pattern-aware prompting. Specifically, we adopt a camera control module that incorporates 6-DoF poses and intrinsic settings to enable precise camera information injection. In addition, a shot-aware mask mechanism is employed to introduce hierarchical prompts aware of professional editing patterns, allowing fine-grained control over shot content. Through this design, our framework effectively combines parameter-level conditions with high-level semantic guidance, achieving film-like controllable shot transitions. To facilitate training and evaluation, we construct ShotWeaver40K, a dataset that captures the priors of film-like editing patterns, and develop a set of evaluation metrics for controllable multi-shot video generation. Extensive experiments demonstrate the effectiveness of our framework.

