CV, Feb 2

MLV-Edit: Towards Consistent and Highly Efficient Editing for Minute-Level Videos

arXiv:2602.02123v1
h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses the challenge of maintaining global temporal consistency in long-duration video editing for applications requiring efficient and artifact-free manipulation.

The paper tackles minute-level video editing with MLV-Edit, a training-free, flow-based framework designed to reduce computational overhead and preserve temporal consistency. The authors report that it outperforms state-of-the-art methods in temporal stability and semantic fidelity.

We propose MLV-Edit, a training-free, flow-based framework that addresses the unique challenges of minute-level video editing. While existing techniques excel in short-form video manipulation, scaling them to long-duration videos remains challenging due to prohibitive computational overhead and the difficulty of maintaining global temporal consistency across thousands of frames. To address this, MLV-Edit employs a divide-and-conquer strategy for segment-wise editing, facilitated by two core modules: Velocity Blend rectifies motion inconsistencies at segment boundaries by aligning the flow fields of adjacent chunks, eliminating flickering and boundary artifacts commonly observed in fragmented video processing; and Attention Sink anchors local segment features to global reference frames, effectively suppressing cumulative structural drift. Extensive quantitative and qualitative experiments demonstrate that MLV-Edit consistently outperforms state-of-the-art methods in terms of temporal stability and semantic fidelity.
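
The abstract does not specify how Velocity Blend aligns the flow fields of adjacent chunks, so the following is only a minimal sketch of the general idea under stated assumptions: the video is split into overlapping segments, and the per-frame velocity predictions from neighbouring segments are cross-faded inside each overlap. The function names `split_segments` and `blend_velocities`, the triangular weighting, and the use of a single float per frame in place of a velocity tensor are all hypothetical choices for illustration, not the authors' method.

```python
from typing import List, Tuple


def split_segments(num_frames: int, seg_len: int, overlap: int) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges; consecutive ranges share `overlap` frames."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if not 0 <= overlap < seg_len:
        raise ValueError("overlap must satisfy 0 <= overlap < seg_len")
    stride = seg_len - overlap
    segments = []
    start = 0
    while True:
        end = min(start + seg_len, num_frames)
        segments.append((start, end))
        if end == num_frames:
            break
        start += stride
    return segments


def blend_velocities(
    per_segment: List[List[float]],
    segments: List[Tuple[int, int]],
    num_frames: int,
) -> List[float]:
    """Cross-fade per-frame velocities predicted independently for each segment.

    per_segment[k][i] is the velocity for frame segments[k][0] + i. A float
    stands in for a full velocity tensor (assumption made for brevity).
    """
    total = [0.0] * num_frames
    weight = [0.0] * num_frames
    for (start, end), vel in zip(segments, per_segment):
        n = end - start
        for i in range(n):
            # Triangular weight: smallest at the segment edges, largest in the
            # middle, so each segment fades out as its neighbour fades in.
            w = min(i + 1, n - i)
            total[start + i] += w * vel[i]
            weight[start + i] += w
    return [t / w for t, w in zip(total, weight)]


segments = split_segments(num_frames=6, seg_len=4, overlap=2)
# Pretend the first segment predicts velocity 1.0 and the second predicts 3.0.
predictions = [[1.0] * 4, [3.0] * 4]
blended = blend_velocities(predictions, segments, num_frames=6)
```

Frames covered by a single segment keep that segment's prediction unchanged, while frames in the overlap move gradually from one prediction to the other, which is the kind of smooth hand-over at boundaries that the abstract describes.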

