Responsive Action-based Video Synthesis
This work addresses the challenge that video artists and creators face in reusing and editing video clips interactively, though the contribution appears incremental because it builds on existing video synthesis and human-in-the-loop concepts.
The paper tackles the problem of enabling interactive video synthesis where users can loop, merge, and trigger video elements with semantic-level control, proposing a human-in-the-loop system that allows artists to author videos and video-performances through progressive creative control.
We propose technology to enable a new medium of expression, where video elements can be looped, merged, and triggered interactively. Like audio, video is easy to sample from the real world but hard to segment into clean reusable elements. Reusing a video clip means non-linear editing and compositing with novel footage. The new context dictates how carefully a clip must be prepared, so our end-to-end approach enables previewing and easy iteration. We convert static-camera videos into loopable sequences, synthesizing them in response to simple end-user requests. This is hard because a) users want essentially semantic-level control over the synthesized video content, and b) automatic loop-finding is brittle and leaves users with limited opportunity to work through problems. We propose a human-in-the-loop system where adding effort gives the user progressively more creative control. Artists help us evaluate how our trigger interfaces can be used for authoring videos and video-performances.
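To make the notion of automatic loop-finding concrete, the following is a minimal sketch of a naive baseline. It is not the method proposed in the paper. It assumes a static-camera clip stored as a NumPy array of frames and searches for the pair of frames whose appearance matches best, so that jumping from the end of the segment back to its start is least visible. The function name `find_best_loop` and the parameter `min_len` are hypothetical and introduced only for illustration.

```python
import numpy as np


def find_best_loop(frames, min_len=8):
    """Naive loop search (illustrative baseline, not the paper's method).

    frames: array of shape (n_frames, ...), e.g. (n, height, width).
    Returns (start, end, cost). Playing frames[start:end] on repeat jumps
    from frame end-1 back to frame start, so the seam is least visible
    when frame `end` looks like frame `start`.
    """
    frames = np.asarray(frames)
    n = len(frames)
    if n <= min_len:
        raise ValueError("clip is too short for the requested minimum loop length")

    # Flatten each frame to a vector so frames can be compared directly.
    flat = frames.reshape(n, -1).astype(np.float64)

    best_start, best_end, best_cost = None, None, np.inf
    for start in range(n - min_len):
        # Mean squared difference between frame `start` and every
        # candidate frame at least `min_len` frames later.
        candidates = flat[start + min_len:]
        diffs = np.mean((candidates - flat[start]) ** 2, axis=1)
        offset = int(np.argmin(diffs))
        cost = float(diffs[offset])
        if cost < best_cost:
            best_start, best_end, best_cost = start, start + min_len + offset, cost
    return best_start, best_end, best_cost


if __name__ == "__main__":
    # Synthetic clip whose content repeats every 10 frames.
    clip = np.stack([np.full((4, 4), (t % 10) * 10, dtype=np.uint8) for t in range(30)])
    start, end, cost = find_best_loop(clip, min_len=8)
    print(start, end, cost)
```

A single lowest-cost pair like this is sensitive to noise, lighting changes, and motion that never quite repeats, which is one way to read the abstract's remark that automatic loop-finding is brittle. A human-in-the-loop variant might instead present several low-cost candidates and let the user choose among them.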