CV · Oct 23, 2024

An Intelligent Agentic System for Complex Image Restoration Problems

arXiv:2410.17809v2 · 42 citations · h-index: 33 · ICLR
AI Analysis

This paper addresses the problem of integrating multiple specialized models to handle diverse image degradations, which it presents as an incremental step toward general intelligence in visual processing.

The paper tackles complex image restoration by proposing AgenticIR, an agentic system in which LLMs and VLMs dynamically combine specialized restoration models; experiments demonstrate its potential on complex restoration tasks.

Real-world image restoration (IR) is inherently complex and often requires combining multiple specialized models to address diverse degradations. Inspired by human problem-solving, we propose AgenticIR, an agentic system that mimics the human approach to image processing by following five key stages: Perception, Scheduling, Execution, Reflection, and Rescheduling. AgenticIR leverages large language models (LLMs) and vision-language models (VLMs) that interact via text generation to dynamically operate a toolbox of IR models. We fine-tune VLMs for image quality analysis and employ LLMs for reasoning, guiding the system step by step. To compensate for LLMs' lack of specific IR knowledge and experience, we introduce a self-exploration method, allowing the LLM to observe and summarize restoration results into referenceable documents. Experiments demonstrate AgenticIR's potential in handling complex IR tasks, representing a promising path toward achieving general intelligence in visual processing.
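The five stages named in the abstract can be pictured as a small control loop. The sketch below is a toy illustration only, not the authors' implementation: the `Image` class, `TOOLBOX`, `PRIORITY`, and every function name are hypothetical, an image is reduced to the set of degradations it still carries, and plain Python stubs stand in for the VLM (perception, reflection) and the LLM (scheduling). Rescheduling is simplified here to re-planning after every step.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    """Toy stand-in for an image: only the set of degradations it still has."""
    degradations: set = field(default_factory=set)


def make_tool(target, works=True):
    """Build a hypothetical IR tool; a tool with works=False changes nothing."""
    def tool(img):
        remaining = img.degradations - {target} if works else set(img.degradations)
        return Image(remaining)
    return tool


# Hypothetical toolbox: candidate tools per degradation, tried in order.
TOOLBOX = {
    "noise": [make_tool("noise", works=False), make_tool("noise")],
    "blur": [make_tool("blur")],
    "haze": [make_tool("haze")],
}

# Hypothetical stand-in for summarized experience: preferred restoration order.
PRIORITY = {"haze": 0, "noise": 1, "blur": 2}


def perceive(img):
    """Perception: report which degradations are present (VLM stand-in)."""
    return set(img.degradations)


def schedule(degradations):
    """Scheduling: order the subtasks (LLM stand-in using PRIORITY)."""
    return sorted(degradations, key=lambda d: PRIORITY.get(d, len(PRIORITY)))


def reflect(candidate, target):
    """Reflection: did the tool actually remove the target degradation?"""
    return target not in candidate.degradations


def restore(img):
    """Run perception, scheduling, execution, reflection, and rescheduling."""
    log, given_up = [], set()
    plan = schedule(perceive(img))
    while plan:
        target = plan[0]
        for index, tool in enumerate(TOOLBOX.get(target, [])):
            candidate = tool(img)                 # Execution
            if reflect(candidate, target):        # Reflection
                img = candidate
                log.append((target, index, "ok"))
                break
            log.append((target, index, "failed"))
        else:
            # No tool succeeded (or none exists): stop retrying this subtask.
            given_up.add(target)
            log.append((target, None, "unresolved"))
        # Rescheduling (simplified): re-perceive and re-plan what remains.
        plan = schedule(perceive(img) - given_up)
    return img, log
```

Calling `restore(Image({"noise", "blur", "haze"}))` walks the plan in the order haze, noise, blur, and the log records the first noise tool being rejected at the reflection step before the second one is accepted.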

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
