AI · MA · MM · Apr 27

Co-Director: Agentic Generative Video Storytelling

arXiv:2604.24842 · 95.5 · h-index: 5
AI Analysis

For AI video generation, Co-Director addresses semantic drift in agentic pipelines, offering a principled optimization approach that generalizes to cinematic narratives.

Co-Director introduces a hierarchical multi-agent framework that formalizes video storytelling as a global optimization problem, using a multi-armed bandit for global creative direction and a local self-refinement loop to maintain coherence. It outperforms SOTA baselines on a new 400-scenario advertising benchmark.

While diffusion models generate high-fidelity video clips, transforming them into coherent storytelling engines remains challenging. Current agentic pipelines automate this via chained modules but suffer from semantic drift and cascading failures due to independent, handcrafted prompting. We present Co-Director, a hierarchical multi-agent framework formalizing video storytelling as a global optimization problem. To ensure semantic coherence, we introduce hierarchical parameterization: a multi-armed bandit globally identifies promising creative directions, while a local multimodal self-refinement loop mitigates identity drift and ensures sequence-level consistency. This balances the exploration of novel narrative strategies with the exploitation of effective creative configurations. For evaluation, we introduce GenAD-Bench, a 400-scenario dataset of fictional products for personalized advertising. Experiments demonstrate that Co-Director significantly outperforms state-of-the-art baselines, offering a principled approach that seamlessly generalizes to broader cinematic narratives. Project Page: https://co-director-agent.github.io/
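The two-level structure described in the abstract, a global bandit over creative directions wrapped around a local self-refinement loop, can be illustrated with a small sketch. The abstract does not say which bandit algorithm, reward signal, or stopping rule Co-Director uses, so everything here is an assumption made for illustration: UCB1 as the bandit, scalar "quality" scores standing in for a multimodal critic, and the hypothetical names `ucb1_select`, `self_refine`, `run_bandit`, and `make_toy_producer`.

```python
import math
import random


def ucb1_select(counts, totals, t, c=math.sqrt(2)):
    """Return the arm index with the highest UCB1 score; untried arms go first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: totals[a] / counts[a] + c * math.sqrt(math.log(t) / counts[a]),
    )


def self_refine(generate, critique, revise, max_rounds=3, threshold=0.8):
    """Local loop: revise a draft until the critic's score reaches the threshold."""
    draft = generate()
    score = critique(draft)
    for _ in range(max_rounds):
        if score >= threshold:
            break
        draft = revise(draft)
        score = critique(draft)
    return draft, score


def run_bandit(directions, produce, rounds=60):
    """Global loop: spend `rounds` pulls, then return the best direction by mean reward."""
    counts = [0] * len(directions)
    totals = [0.0] * len(directions)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, totals, t)
        reward = produce(directions[arm])  # one refined video, scored in [0, 1]
        counts[arm] += 1
        totals[arm] += reward
    best = max(
        (a for a in range(len(directions)) if counts[a] > 0),
        key=lambda a: totals[a] / counts[a],
    )
    return directions[best], counts


def make_toy_producer(seed=0):
    """Toy stand-in for generation + critique (assumed values, not from the paper)."""
    rng = random.Random(seed)
    base = {"humor": 0.2, "nostalgia": 0.5, "suspense": 0.9}

    def produce(direction):
        _, score = self_refine(
            generate=lambda: base[direction] + rng.uniform(-0.05, 0.05),
            critique=lambda q: max(0.0, min(1.0, q)),  # the "draft" is just its quality
            revise=lambda q: q + 0.02,                  # each revision helps a little
        )
        return score

    return produce


if __name__ == "__main__":
    best, counts = run_bandit(["humor", "nostalgia", "suspense"], make_toy_producer())
    print(best, counts)
```

In this toy setup the bandit concentrates its pulls on the direction whose refined outputs score highest, while the inner loop only spends revision rounds on drafts that fall below the threshold. A real pipeline would replace the scalar drafts with generated clips and the clipping critic with a multimodal evaluator.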

