CV · Nov 2, 2025

MID: A Self-supervised Multimodal Iterative Denoising Framework

arXiv:2511.00997v1 · h-index: 6
Originality: Highly original
AI Analysis

This paper addresses the problem of data corruption in scientific and engineering domains, offering a novel self-supervised method that does not require paired clean-noisy datasets.

The paper tackles the challenge of denoising real-world data corrupted by complex, non-linear noise by proposing a self-supervised multimodal iterative denoising (MID) framework, which achieves state-of-the-art performance across four classic computer vision tasks and shows strong adaptability in biomedical and bioinformatics domains.

Data denoising is a persistent challenge across scientific and engineering domains. Real-world data is frequently corrupted by complex, non-linear noise, rendering traditional rule-based denoising methods inadequate. To overcome these obstacles, we propose a novel self-supervised multimodal iterative denoising (MID) framework. MID models the collected noisy data as a state within a continuous process of non-linear noise accumulation. By iteratively introducing further noise, MID learns two neural networks: one to estimate the current noise step and another to predict and subtract the corresponding noise increment. For complex non-linear contamination, MID employs a first-order Taylor expansion to locally linearize the noise process, enabling effective iterative removal. Crucially, MID does not require paired clean-noisy datasets, as it learns noise characteristics directly from the noisy inputs. Experiments across four classic computer vision tasks demonstrate MID's robustness, adaptability, and consistent state-of-the-art performance. Moreover, MID exhibits strong performance and adaptability in tasks within the biomedical and bioinformatics domains.
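The iterative idea in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation, and every name in it (`corrupt_step`, `fit_increment_predictor`, `iterative_denoise`, `RATE`, `TRUE_STEPS`) is hypothetical. It makes three assumptions: the corruption is a deterministic non-linear step, so the result can be checked; the network that predicts the noise increment is replaced by a polynomial regression fitted only on further-corrupted copies of the noisy data; and the network that estimates the current noise step is replaced by a step count assumed to be known.

```python
import numpy as np

RATE = 0.05  # size of one toy corruption step (assumed value)


def corrupt_step(x):
    # Hypothetical non-linear corruption step; a deterministic stand-in for real noise.
    return x + RATE * np.tanh(x)


def fit_increment_predictor(noisy, extra_steps=3, degree=7):
    # Build training pairs from noisy data only: corrupt it further and
    # record which increment each extra step added to which resulting state.
    states, increments = [], []
    current = noisy
    for _ in range(extra_steps):
        nxt = corrupt_step(current)
        states.append(nxt)
        increments.append(nxt - current)
        current = nxt
    # Polynomial regression stands in for the increment-prediction network.
    coeffs = np.polyfit(np.concatenate(states), np.concatenate(increments), degree)
    return lambda x: np.polyval(coeffs, x)


def iterative_denoise(noisy, predict_increment, num_steps):
    # Walk the corruption process backwards, subtracting one predicted increment per step.
    x = noisy
    for _ in range(num_steps):
        x = x - predict_increment(x)
    return x


rng = np.random.default_rng(0)
clean = rng.uniform(-1.0, 1.0, size=2000)  # never shown to the predictor

TRUE_STEPS = 5  # assumed known here; MID estimates the step with a second network
noisy = clean
for _ in range(TRUE_STEPS):
    noisy = corrupt_step(noisy)

predictor = fit_increment_predictor(noisy)
denoised = iterative_denoise(noisy, predictor, num_steps=TRUE_STEPS)

noisy_err = float(np.mean(np.abs(noisy - clean)))
denoised_err = float(np.mean(np.abs(denoised - clean)))
print(f"mean abs error before: {noisy_err:.4f}, after: {denoised_err:.6f}")
```

The clean signal is used only to measure the error at the end; the predictor is fitted on the noisy observation and its further-corrupted copies, which mirrors the claim that no paired clean-noisy data is needed. Real noise is stochastic, so a practical version would need learned models and could not expect the near-exact recovery this deterministic toy allows.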

