CV · Mar 9, 2023

Controllable Video Generation by Learning the Underlying Dynamical System with Neural ODE

arXiv:2303.05323v2 · 5 citations · h-index: 16
Originality: Highly original
AI Analysis

This work addresses the need for advanced controllable video generation models in computer vision, representing a significant step forward in handling complex and dynamic scenes.

The paper tackles the problem of generating controllable videos from a static image and text caption by learning the underlying dynamical system, resulting in a framework capable of producing highly controllable and visually consistent videos.

Videos depict the change of complex dynamical systems over time in the form of discrete image sequences. Generating controllable videos by learning the dynamical system is an important yet underexplored topic in the computer vision community. This paper presents a novel framework, TiV-ODE, to generate highly controllable videos from a static image and a text caption. Specifically, our framework leverages the ability of Neural Ordinary Differential Equations (Neural ODEs) to represent complex dynamical systems as a set of nonlinear ordinary differential equations. The resulting framework is capable of generating videos with both desired dynamics and content. Experiments demonstrate the ability of the proposed method in generating highly controllable and visually consistent videos, and its capability of modeling dynamical systems. Overall, this work is a significant step towards developing advanced controllable video generation models that can handle complex and dynamic scenes.
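
To make the Neural ODE idea in the abstract concrete, the following is a minimal sketch and not the TiV-ODE architecture itself. It assumes a latent state z(t) that evolves as dz/dt = f(z, c), where c is a fixed control vector standing in for a caption embedding, and it reads out one latent state per frame time. The names `make_dynamics`, `rk4_step`, and `rollout` are hypothetical, the weights are random rather than learned, and the image encoder and frame decoder a real system would need are omitted.

```python
import math
import random


def make_dynamics(dim, ctrl_dim, seed=0):
    """Return f(z, c): a tiny one-layer network standing in for a learned f_theta.

    Assumption: weights are random here; in a trained model they would be learned.
    """
    rng = random.Random(seed)
    weights = [
        [rng.uniform(-0.5, 0.5) for _ in range(dim + ctrl_dim)]
        for _ in range(dim)
    ]

    def f(z, c):
        x = list(z) + list(c)  # concatenate latent state and control vector
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

    return f


def rk4_step(f, z, c, dt):
    """Advance the latent state by one fixed step of classical Runge-Kutta (RK4)."""
    k1 = f(z, c)
    k2 = f([zi + 0.5 * dt * k for zi, k in zip(z, k1)], c)
    k3 = f([zi + 0.5 * dt * k for zi, k in zip(z, k2)], c)
    k4 = f([zi + dt * k for zi, k in zip(z, k3)], c)
    return [
        zi + dt / 6.0 * (a + 2.0 * b + 2.0 * g + d)
        for zi, a, b, g, d in zip(z, k1, k2, k3, k4)
    ]


def rollout(f, z0, c, num_frames, dt=0.1):
    """Integrate from z0 and return one latent state per frame, including z0."""
    states = [list(z0)]
    for _ in range(num_frames - 1):
        states.append(rk4_step(f, states[-1], c, dt))
    return states


# Same initial latent (the "static image"), two different controls (the "captions").
f = make_dynamics(dim=3, ctrl_dim=2)
z0 = [0.1, -0.2, 0.3]
frames_a = rollout(f, z0, c=[1.0, 0.0], num_frames=5)
frames_b = rollout(f, z0, c=[0.0, 1.0], num_frames=5)
```

Both rollouts start from the same latent state but follow different trajectories because the control enters the dynamics function, which is the sense in which conditioning on a caption can steer the generated sequence. A fixed-step RK4 integrator is used here only for simplicity; adaptive solvers are a common alternative.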

