CV · May 21

EvoIR-Agent: Self-Evolving Image Restoration Agentic System via Experience-Driven Learning

arXiv:2605.22208 · 84.5

Predicted impact: top 23% in CV · last 90 days · Originality: Incremental advance

AI Analysis

This work addresses the dilemma between training-based and training-free methods for MLLM-driven image restoration, offering a solution that combines high efficiency with compatibility for new tools and degradations.

EvoIR-Agent is a self-evolving, training-free image restoration agent that uses a hierarchical experience pool to guide tool selection and removal order. It reports a Pareto-optimal balance between performance and efficiency, with significant improvements in full-reference metrics over state-of-the-art methods.

Multimodal Large Language Model (MLLM)-driven image restoration agents are effective in coupled-degradation scenarios because they flexibly select tools and determine removal orders. However, their zero-shot planning often fails without experience, incurring severe trial-and-error overhead before reaching satisfactory outcomes. Two paradigms currently address this issue, yet a dilemma persists. Training-based methods embed intrinsic experience into parameters, achieving high inference efficiency but lacking compatibility with new tools or degradations. In contrast, training-free methods store experience explicitly for compatibility but still incur trial-and-error overhead because the stored experience is naive. To resolve this dilemma, we propose EvoIR-Agent, which first systematically formulates the experience components of a training-free image restoration agent. A hierarchical experience pool is then constructed, enabling coarse-to-fine guidance over diverse tools and removal orders. Furthermore, a self-evolving mechanism updates the pool from scratch using accumulated records, greatly improving performance and efficiency. Extensive experiments show that EvoIR-Agent achieves a significant lead in full-reference metrics and a Pareto-optimal balance between performance and efficiency compared with state-of-the-art methods.
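The abstract does not specify how the experience pool is structured, so the following is only a minimal sketch of one way a two-level pool with record-driven updates could look. Everything in it is an assumption made for illustration: the `ExperiencePool` class, its `record` and `suggest` methods, the mean-score ranking rule, and the scalar quality score are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass, field


def _best(stats):
    """Return the key with the highest mean score, or None if stats is empty."""
    if not stats:
        return None
    return max(stats, key=lambda k: stats[k][0] / stats[k][1])


@dataclass
class ExperiencePool:
    """Hypothetical two-level pool (illustrative, not the paper's design)."""

    # Coarse level: set of degradations -> {removal order: (score sum, count)}
    order_stats: dict = field(default_factory=lambda: defaultdict(dict))
    # Fine level: single degradation -> {tool name: (score sum, count)}
    tool_stats: dict = field(default_factory=lambda: defaultdict(dict))

    def record(self, order, tools, score):
        """Accumulate one restoration attempt.

        order: degradations in the order they were removed
        tools: mapping from degradation to the tool that handled it
        score: assumed scalar quality score, higher is better
        """
        order = tuple(order)
        key = frozenset(order)
        total, count = self.order_stats[key].get(order, (0.0, 0))
        self.order_stats[key][order] = (total + score, count + 1)
        for degradation, tool in tools.items():
            total, count = self.tool_stats[degradation].get(tool, (0.0, 0))
            self.tool_stats[degradation][tool] = (total + score, count + 1)

    def suggest(self, degradations):
        """Return (order, tools) with the best mean score seen so far.

        Entries are None where the pool holds no experience yet.
        """
        order = _best(self.order_stats.get(frozenset(degradations), {}))
        tools = {d: _best(self.tool_stats.get(d, {})) for d in degradations}
        return order, tools
```

In this sketch the coarse level answers "in which order should these degradations be removed" and the fine level answers "which tool for each degradation". Each new record shifts both answers, which is a simplified stand-in for the self-evolving update described above.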
