CV · AI · May 26

Generative Animations: A Multi-Model Pipeline for Prompt-Driven Motion Synthesis

arXiv:2605.27203
4.3
AI Analysis

This system simplifies animation creation for designers by automating motion path generation from text, but it is an incremental application of existing models (LLMs and SAM) to a new domain.

Generative Animations transforms natural language prompts into production-ready animations by chaining LLMs for semantic parsing with SAM for visual grounding, automatically generating motion paths that respect scene geometry, handle depth-based occlusions, and honor 3D perspective transforms.

Animation elevates digital documents into immersive experiences, yet creating custom motion paths remains cumbersome, requiring designers to manually select presets, plot Bézier points, and configure timing properties. We introduce Generative Animations, a system that transforms natural language prompts into production-ready animations. By chaining Large Language Models (LLMs) for semantic parsing with the Segment Anything Model (SAM) for visual grounding, our pipeline automatically generates motion paths that respect scene geometry, handle depth-based occlusions, and honor 3D perspective transforms. We demonstrate the system through three use cases: contour-following trajectories, orbital animations with z-order awareness, and perspective-aligned motion on transformed objects.
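To make the second use case concrete, here is a minimal sketch of how an orbital path with z-order awareness could be sampled. It is not the authors' implementation: the function name `orbit_keyframes`, the keyframe dictionary layout, and the rule that the lower half of the ellipse counts as "in front" are all assumptions chosen for illustration. In the described pipeline, the orbit center and radii would presumably come from a SAM mask of the target object; here they are passed in directly.

```python
import math

def orbit_keyframes(cx, cy, rx, ry, steps=8):
    """Sample an elliptical orbit around (cx, cy).

    Hypothetical helper: each keyframe carries a z-order flag so a renderer
    can draw the moving object behind the target on the far half of the orbit.
    """
    frames = []
    for i in range(steps):
        theta = 2 * math.pi * i / steps
        x = cx + rx * math.cos(theta)
        y = cy + ry * math.sin(theta)
        # Assumption: screen y grows downward, so the lower half of the
        # ellipse (sin(theta) >= 0) is treated as nearer to the viewer.
        in_front = math.sin(theta) >= 0
        frames.append({
            "t": i / steps,  # normalized time in [0, 1)
            "x": x,
            "y": y,
            "z_order": "front" if in_front else "behind",
        })
    return frames

# Usage: a flattened orbit around an object centered at (100, 100).
frames = orbit_keyframes(100, 100, rx=50, ry=20, steps=4)
```

Flattening the ellipse (`ry` smaller than `rx`) gives the impression of an orbit seen at an angle, and switching the z-order flag halfway through is one simple way to approximate the depth-based occlusion the abstract mentions.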
