CV | Dec 17, 2024

PRIMEdit: Probability Redistribution for Instance-aware Multi-object Video Editing with Benchmark Dataset

arXiv:2412.12877v2 | h-index: 10
Originality: Incremental advance
AI Analysis

The paper addresses localized, faithful editing of multiple objects in videos, a problem that matters to users in video production and editing. The advance appears incremental because it builds on existing zero-shot techniques.

The paper tackles unintended changes in multi-object video editing by proposing PRIMEdit, a zero-shot framework built on instance-centric probability redistribution and disentangled multi-instance sampling. The authors report that it significantly outperforms state-of-the-art methods in editing faithfulness, accuracy, and leakage prevention.

Recent AI-based video editing has enabled users to edit videos through simple text prompts, significantly simplifying the editing process. However, recent zero-shot video editing techniques primarily focus on global or single-object edits, which can lead to unintended changes in other parts of the video. When multiple objects require localized edits, existing methods face challenges, such as unfaithful editing, editing leakage, and lack of suitable evaluation datasets and metrics. To overcome these limitations, we propose Probability Redistribution for Instance-aware Multi-object Video Editing (PRIMEdit). PRIMEdit is a zero-shot framework that introduces two key modules: (i) Instance-centric Probability Redistribution (IPR) to ensure precise localization and faithful editing and (ii) Disentangled Multi-instance Sampling (DMS) to prevent editing leakage. Additionally, we present our new MIVE Dataset for video editing featuring diverse video scenarios, and introduce the Cross-Instance Accuracy (CIA) Score to evaluate editing leakage in multi-instance video editing tasks. Our extensive qualitative, quantitative, and user study evaluations demonstrate that PRIMEdit significantly outperforms recent state-of-the-art methods in terms of editing faithfulness, accuracy, and leakage prevention, setting a new benchmark for multi-instance video editing.
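
The abstract names the Cross-Instance Accuracy (CIA) Score but does not define it here. As a rough illustration only, the sketch below assumes one plausible reading of a leakage metric: given a precomputed similarity matrix between each edited instance crop and each instance's target caption, count how often a crop is best matched by its own caption rather than by another instance's. The function name, the matrix layout, and the row-wise argmax rule are all assumptions made for this example and should not be taken as the paper's definition.

```python
from typing import List


def cross_instance_accuracy(similarity: List[List[float]]) -> float:
    """Fraction of instances whose best-matching caption is their own.

    similarity[i][j] is assumed to be a precomputed image-text similarity
    between the edited crop of instance i and the target caption of
    instance j. How those similarities are obtained is outside this sketch.
    """
    n = len(similarity)
    if n == 0 or any(len(row) != n for row in similarity):
        raise ValueError("similarity must be a non-empty square matrix")

    hits = 0
    for i, row in enumerate(similarity):
        # Index of the caption with the highest similarity for this crop
        # (ties resolve to the lowest index).
        best = max(range(n), key=lambda j: row[j])
        if best == i:
            hits += 1
    return hits / n


# Hypothetical similarity values for three edited instances.
example = [
    [0.31, 0.22, 0.18],  # crop 0 matches its own caption best
    [0.20, 0.29, 0.25],  # crop 1 matches its own caption best
    [0.27, 0.21, 0.24],  # crop 2 looks more like caption 0: leakage
]
score = cross_instance_accuracy(example)
```

Under this reading, a score of 1.0 would mean every edited instance is most similar to its own target caption, and lower values would indicate that attributes of one edit have bled into another instance.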

