Multi-modal Image Processing based on Coupled Dictionary Learning
This work addresses image processing challenges in real-world scenarios with multimodal data, although its contribution is incremental because it builds on existing dictionary learning methods.
The paper tackles the problem of processing heterogeneous images from different modalities by proposing a coupled dictionary learning framework that captures their shared attributes, which yields notable benefits for tasks such as denoising and super-resolution.
In real-world scenarios, many data processing problems involve heterogeneous images associated with different imaging modalities. Since these multimodal images originate from the same phenomenon, it is reasonable to assume that they share common attributes or characteristics. In this paper, we propose a multi-modal image processing framework based on coupled dictionary learning to capture similarities and disparities between different image modalities. In particular, our framework captures structural similarities across image modalities, such as edges, corners, and other elementary primitives, in a learned sparse transform domain rather than in the original pixel domain. These similarities can be used to improve a number of image processing tasks, such as denoising, inpainting, and super-resolution. Practical experiments demonstrate that incorporating multimodal information using our framework brings notable benefits.
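The abstract does not spell out the coupling model, so the following is only a minimal sketch of one simple way two modalities can share structure in a sparse domain. It assumes that a target patch x and a guide patch y are generated by the same sparse code z under two modality-specific dictionaries, and it recovers z with a plain Orthogonal Matching Pursuit on the stacked system. The dictionaries here are random rather than learned, the disparities between modalities are not modeled, and the names `omp` and `coupled_sparse_code` are hypothetical helpers, not part of the paper.

```python
import numpy as np

def omp(D, s, k):
    """Greedy Orthogonal Matching Pursuit: fit s with at most k columns of D."""
    residual = s.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0  # never reselect an atom already in use
        support.append(int(np.argmax(corr)))
        # Least-squares refit on all atoms selected so far
        coef, *_ = np.linalg.lstsq(D[:, support], s, rcond=None)
        residual = s - D[:, support] @ coef
    z = np.zeros(D.shape[1])
    z[support] = coef
    return z

def coupled_sparse_code(Dx, Dy, x, y, k):
    """Estimate one sparse code shared by both modalities (assumed model)."""
    D = np.vstack([Dx, Dy])           # stacked dictionary, one column per atom pair
    s = np.concatenate([x, y])        # stacked observation
    norms = np.linalg.norm(D, axis=0)
    z_scaled = omp(D / norms, s, k)   # OMP expects unit-norm atoms
    return z_scaled / norms           # undo the column scaling

# Toy guided-denoising demo with random (not learned) dictionaries
rng = np.random.default_rng(0)
n, K, k = 32, 48, 3
Dx = rng.standard_normal((n, K))
Dy = rng.standard_normal((n, K))
z_true = np.zeros(K)
z_true[rng.choice(K, size=k, replace=False)] = [1.5, -2.0, 1.0]

x_clean = Dx @ z_true
y = Dy @ z_true                                   # clean guide modality
x_noisy = x_clean + 0.1 * rng.standard_normal(n)  # noisy target modality

z_hat = coupled_sparse_code(Dx, Dy, x_noisy, y, k)
x_hat = Dx @ z_hat
print("noisy error:   ", np.linalg.norm(x_noisy - x_clean))
print("denoised error:", np.linalg.norm(x_hat - x_clean))
```

In this toy setup the guide modality helps because its clean measurements constrain the shared code that also explains the noisy target. A learned coupled dictionary would replace the random `Dx` and `Dy`, and a fuller model would add modality-specific components to account for disparities.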