IV CV | Oct 11, 2023

PtychoDV: Vision Transformer-Based Deep Unrolling Network for Ptychographic Image Reconstruction

Weijie Gan, Qiuchen Zhai, Michael Thompson McCann, Cristina Garcia Cardona, Ulugbek S. Kamilov, Brendt Wohlberg

arXiv:2310.07504v2 | 14.813 citations | h-index: 37 | Has Code

Originality: Highly original

AI Analysis

This work addresses the computational bottleneck in ptychography, a domain-specific imaging technique, with an incremental hybrid approach.

The paper tackles the high computational cost of iterative ptychographic image reconstruction by introducing PtychoDV, a deep model-based network that combines a vision transformer with deep unrolling. On simulated data, PtychoDV outperforms existing deep learning methods and significantly reduces computational cost relative to iterative methods, while maintaining competitive reconstruction quality.

Ptychography is an imaging technique that captures multiple overlapping snapshots of a sample, illuminated coherently by a moving localized probe. The image recovery from ptychographic data is generally achieved via an iterative algorithm that solves a nonlinear phase retrieval problem derived from measured diffraction patterns. However, these iterative approaches have high computational cost. In this paper, we introduce PtychoDV, a novel deep model-based network designed for efficient, high-quality ptychographic image reconstruction. PtychoDV comprises a vision transformer that generates an initial image from the set of raw measurements, taking into consideration their mutual correlations. This is followed by a deep unrolling network that refines the initial image using learnable convolutional priors and the ptychography measurement model. Experimental results on simulated data demonstrate that PtychoDV is capable of outperforming existing deep learning methods for this problem, and significantly reduces computational cost compared to iterative methodologies, while maintaining competitive performance.
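The measurement model and the kind of data-consistency update that an unrolled network interleaves with a learned prior can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the PtychoDV implementation: it assumes a far-field (Fourier) model with amplitude measurements, a known probe, and integer scan positions, and the names `forward` and `data_step` are hypothetical. The learned convolutional prior and the vision transformer initialization are omitted.

```python
import numpy as np

def forward(x, probe, positions):
    """Assumed model: diffraction amplitude |F(probe * patch)| per scan position."""
    h, w = probe.shape
    return np.stack([
        np.abs(np.fft.fft2(probe * x[r:r + h, c:c + w], norm="ortho"))
        for r, c in positions
    ])

def data_step(x, probe, positions, y, step=1.0):
    """One gradient step on 0.5 * sum_i || |F(probe * x_i)| - y_i ||^2.

    An unrolled network would alternate a step like this with a learned
    prior; only the physics-based part is shown here.
    """
    x = np.asarray(x, dtype=complex)
    h, w = probe.shape
    grad = np.zeros_like(x)
    weight = np.zeros(x.shape)  # overlap-weighted probe intensity per pixel
    for (r, c), y_i in zip(positions, y):
        z = np.fft.fft2(probe * x[r:r + h, c:c + w], norm="ortho")
        # Keep the current phase, impose the measured amplitude
        z_proj = y_i * np.exp(1j * np.angle(z))
        resid = np.fft.ifft2(z - z_proj, norm="ortho")
        grad[r:r + h, c:c + w] += np.conj(probe) * resid
        weight[r:r + h, c:c + w] += np.abs(probe) ** 2
    # Scale by the largest overlap weight so that step <= 1 does not increase the loss
    return x - (step / weight.max()) * grad
```

Overlapping scan positions are what make the problem well posed: each pixel is constrained by several diffraction patterns, which is why the gradient accumulates contributions from every patch that covers it.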
