CV · Nov 2, 2025

MID: A Self-supervised Multimodal Iterative Denoising Framework

Chang Nie, Tianchen Deng, Zhe Liu, Hesheng Wang

arXiv:2511.00997v1 · 3.6 · h-index: 6

Originality: Highly original

AI Analysis

This paper addresses data corruption in scientific and engineering domains with a self-supervised method that does not require paired clean-noisy datasets.

The paper tackles the challenge of denoising real-world data corrupted by complex, non-linear noise by proposing a self-supervised multimodal iterative denoising (MID) framework, which achieves state-of-the-art performance across four classic computer vision tasks and shows strong adaptability in biomedical and bioinformatics domains.

Data denoising is a persistent challenge across scientific and engineering domains. Real-world data is frequently corrupted by complex, non-linear noise, rendering traditional rule-based denoising methods inadequate. To overcome these obstacles, we propose a novel self-supervised multimodal iterative denoising (MID) framework. MID models the collected noisy data as a state within a continuous process of non-linear noise accumulation. By iteratively introducing further noise, MID learns two neural networks: one to estimate the current noise step and another to predict and subtract the corresponding noise increment. For complex non-linear contamination, MID employs a first-order Taylor expansion to locally linearize the noise process, enabling effective iterative removal. Crucially, MID does not require paired clean-noisy datasets, as it learns noise characteristics directly from the noisy inputs. Experiments across four classic computer vision tasks demonstrate MID's robustness, adaptability, and consistent state-of-the-art performance. Moreover, MID exhibits strong performance and adaptability in tasks within the biomedical and bioinformatics domains.
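The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration of the control flow only, not the authors' implementation: `add_noise_step`, `make_training_pairs`, and `iterative_denoise` are hypothetical names, the signal-dependent noise model is an assumption chosen for the example, and the two neural networks are represented by plain callables (`estimate_step`, `predict_increment`) that a real system would have to train.

```python
import numpy as np


def add_noise_step(x, rng, sigma=0.1):
    """One hypothetical noise-accumulation step (assumed, not the paper's model).

    The noise scale depends on the signal, so the process is non-linear in x.
    """
    return x + sigma * (1.0 + np.abs(x)) * rng.standard_normal(x.shape)


def make_training_pairs(noisy, n_extra, rng):
    """Build self-supervised targets from noisy data alone.

    Starting from the collected noisy sample, inject further noise step by
    step. Each tuple holds (further-noised sample, number of extra steps,
    increment added at the last step). No clean reference is used.
    """
    pairs = []
    x = np.asarray(noisy, dtype=float)
    for k in range(1, n_extra + 1):
        x_next = add_noise_step(x, rng)
        pairs.append((x_next, k, x_next - x))
        x = x_next
    return pairs


def iterative_denoise(x, estimate_step, predict_increment, max_iters=100):
    """Remove noise one increment at a time.

    estimate_step(x) stands in for the step-estimation network and
    predict_increment(x, t) for the increment-prediction network.
    Each pass subtracts one locally estimated increment, which is where a
    first-order (locally linear) approximation of the noise process applies.
    """
    x = np.asarray(x, dtype=float).copy()
    for _ in range(max_iters):
        t = estimate_step(x)
        if t <= 0:  # estimator reports no remaining noise steps
            break
        x = x - predict_increment(x, t)
    return x
```

In this sketch the tuples from `make_training_pairs` would supply regression targets for the two models, and `iterative_denoise` would then be run with the trained models in place of the callables. How the paper indexes steps, parameterizes the noise, and decides when to stop is not stated in the abstract, so those choices here are placeholders.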
