Eulerian Motion Guidance: Robust Image Animation via Bidirectional Geometric Consistency
For researchers in image animation and video generation, this work offers a more robust and efficient training framework, though it is an incremental improvement over existing diffusion-based methods.
The paper introduces Eulerian motion guidance for image animation, using adjacent-frame motion fields instead of reference-frame optical flow, combined with bidirectional geometric consistency to handle occlusions. This approach accelerates training, improves temporal coherence, and reduces artifacts compared to prior methods.
Recent advances in image animation use diffusion models to bring static images to life. However, existing controllable frameworks typically rely on Lagrangian motion guidance, where optical flow is estimated relative to the initial frame. This paper revisits the same optical-flow primitive through a more local supervision design: we use adjacent-frame Eulerian motion fields to guide generation, so the motion signal always describes a short temporal hop. This shift enables parallelized training and provides bounded-error supervision throughout the generation process. To mitigate the drift artifacts common in adjacent-frame generation, we introduce a Bidirectional Geometric Consistency mechanism, which computes a forward-backward cycle check to identify and mask occluded regions, preventing the model from learning incorrect warping objectives. Extensive experiments demonstrate that our approach accelerates training, preserves temporal coherence, and reduces dynamic artifacts compared to reference-based baselines.
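To make the forward-backward cycle check concrete, the following is a minimal NumPy sketch of one common way to build such an occlusion mask; it is not taken from the paper. It assumes dense flows of shape (H, W, 2) stored as (dx, dy) in pixels between two adjacent frames. The function names `sample_bilinear` and `occlusion_mask`, the relative threshold form, and the values of `alpha` and `beta` are illustrative assumptions rather than the authors' settings.

```python
import numpy as np


def sample_bilinear(field, x, y):
    """Bilinearly sample an (H, W, C) field at float pixel coordinates x, y of shape (H, W)."""
    h, w = field.shape[:2]
    # Clamp so that lookups stay inside the array; out-of-frame pixels are handled by the caller.
    x = np.clip(x, 0, w - 1)
    y = np.clip(y, 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[..., None]
    wy = (y - y0)[..., None]
    top = field[y0, x0] * (1 - wx) + field[y0, x1] * wx
    bottom = field[y1, x0] * (1 - wx) + field[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def occlusion_mask(flow_fw, flow_bw, alpha=0.01, beta=0.5):
    """Return a boolean (H, W) mask that is True where the forward flow is cycle-consistent.

    flow_fw maps frame t to frame t+1, flow_bw maps frame t+1 back to frame t.
    alpha and beta are assumed threshold hyperparameters, not values from the paper.
    """
    h, w = flow_fw.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Where each pixel of frame t lands in frame t+1.
    x_tgt = xs + flow_fw[..., 0]
    y_tgt = ys + flow_fw[..., 1]

    # Backward flow read at the landing position; for a visible pixel it should cancel flow_fw.
    bw_at_tgt = sample_bilinear(flow_bw, x_tgt, y_tgt)
    cycle_error = np.sum((flow_fw + bw_at_tgt) ** 2, axis=-1)

    # Tolerance grows with motion magnitude, so fast but consistent motion is not rejected.
    magnitude = np.sum(flow_fw ** 2, axis=-1) + np.sum(bw_at_tgt ** 2, axis=-1)
    consistent = cycle_error <= alpha * magnitude + beta

    # Pixels that leave the frame have no valid correspondence either.
    in_frame = (x_tgt >= 0) & (x_tgt <= w - 1) & (y_tgt >= 0) & (y_tgt <= h - 1)
    return consistent & in_frame
```

In a training loop, a mask like this would typically multiply a per-pixel warping loss so that occluded or out-of-frame pixels contribute no gradient. How the mask enters the actual objective is not specified in the abstract, so that use is also an assumption here.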