CV | Dec 15, 2025

CogniEdit: Dense Gradient Flow Optimization for Fine-Grained Image Editing

arXiv:2512.13276v1
h-index: 23
Originality: Incremental advance
AI Analysis

CogniEdit targets precise attribute control (colors, positions, quantities) in instruction-based image editing and is an incremental improvement over prior methods.

The paper tackles fine-grained image editing with diffusion models, where existing methods struggle to follow precise instructions. It proposes CogniEdit, which the authors report achieves state-of-the-art performance in balancing instruction following with visual quality.

Instruction-based image editing with diffusion models has achieved impressive results, yet existing methods struggle with fine-grained instructions specifying precise attributes such as colors, positions, and quantities. While recent approaches employ Group Relative Policy Optimization (GRPO) for alignment, they optimize only at individual sampling steps, providing sparse feedback that limits trajectory-level control. We propose a unified framework CogniEdit, combining multi-modal reasoning with dense reward optimization that propagates gradients across consecutive denoising steps, enabling trajectory-level gradient flow through the sampling process. Our method comprises three components: (1) Multi-modal Large Language Models for decomposing complex instructions into actionable directives, (2) Dynamic Token Focus Relocation that adaptively emphasizes fine-grained attributes, and (3) Dense GRPO-based optimization that propagates gradients across consecutive steps for trajectory-level supervision. Extensive experiments on benchmark datasets demonstrate that our CogniEdit achieves state-of-the-art performance in balancing fine-grained instruction following with visual quality and editability preservation.
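
The contrast the abstract draws between sparse, single-step feedback and dense feedback over consecutive steps can be illustrated with a small sketch. This is not the authors' implementation: the function names `group_relative_advantages` and `step_mask` are hypothetical, the use of the population standard deviation is an assumption (GRPO implementations differ on this), and the mask only marks which denoising steps would receive a training signal rather than propagating real gradients.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: score each sample relative to its own group.

    Assumption: population standard deviation is used for normalization.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def step_mask(num_steps, start, length):
    """Mark which denoising steps receive the training signal.

    length == 1 mimics sparse, single-step optimization;
    length > 1 mimics a window of consecutive steps (dense feedback).
    Hypothetical helper, for illustration only.
    """
    return [1 if start <= t < start + length else 0 for t in range(num_steps)]


# Three edited images sampled for the same instruction, with scalar rewards.
advantages = group_relative_advantages([1.0, 2.0, 3.0])

sparse = step_mask(num_steps=5, start=2, length=1)  # one step supervised
dense = step_mask(num_steps=5, start=1, length=3)   # consecutive steps supervised
```

In this toy setting the above-average sample gets a positive advantage and the below-average one a negative advantage; the dense mask simply covers more of the trajectory than the sparse one.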
