CV · Jul 1, 2024

E.T. the Exceptional Trajectories: Text-to-camera-trajectory generation with character awareness

arXiv:2407.01516v1 · 22 citations · h-index: 11
Originality: Synthesis-oriented
AI Analysis

This work aims to automate camera trajectory generation for filmmakers and casual users, making cinematography more accessible. The contribution is incremental, since it builds on existing diffusion and contrastive learning methods.

The paper addresses camera trajectory generation for cinematography. It introduces the E.T. dataset, which pairs camera trajectories with character information and textual captions, and proposes DIRECTOR, a diffusion-based method that generates trajectories from text. DIRECTOR achieves competitive results on new benchmarks.

Stories and emotions in movies emerge through the effect of well-thought-out directing decisions, in particular camera placement and movement over time. Crafting compelling camera trajectories remains a complex iterative process, even for skilful artists. To tackle this, in this paper, we propose a dataset called the Exceptional Trajectories (E.T.) with camera trajectories along with character information and textual captions encompassing descriptions of both camera and character. To our knowledge, this is the first dataset of its kind. To show the potential applications of the E.T. dataset, we propose a diffusion-based approach, named DIRECTOR, which generates complex camera trajectories from textual captions that describe the relation and synchronisation between the camera and characters. To ensure robust and accurate evaluations, we train CLaTr, a Contrastive Language-Trajectory embedding for evaluation metrics, on the E.T. dataset. We posit that our proposed dataset and method significantly advance the democratization of cinematography, making it more accessible to common users.

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
