IVApr 7, 2023
RED-PSM: Regularization by Denoising of Factorized Low Rank Models for Dynamic ImagingBerk Iskender, Marc L. Klasky, Yoram Bresler
Dynamic imaging addresses the recovery of a time-varying 2D or 3D object at each time instant using its undersampled measurements. In particular, in the case of dynamic tomography, only a single projection at a single view angle may be available at a time, making the problem severely ill-posed. We propose an approach, RED-PSM, which combines for the first time two powerful techniques to address this challenging imaging problem. The first, are non-parametric factorized low rank models, also known as partially separable models (PSMs), which have been used to efficiently introduce a low-rank prior for the spatio-temporal object. The second is the recent Regularization by Denoising (RED), which provides a flexible framework to exploit the impressive performance of state-of-the-art image denoising algorithms, for various inverse problems. We propose a partially separable objective with RED and a computationally efficient and scalable optimization scheme with variable splitting and ADMM. Theoretical analysis proves the convergence of our objective to a value corresponding to a stationary point satisfying the first-order optimality conditions. Convergence is accelerated by a particular projection-domain-based initialization. We demonstrate the performance and computational improvements of our proposed RED-PSM with a learned image denoiser by comparing it to a recent deep-prior-based method known as TD-DIP. Although the main focus is on dynamic tomography, we also show performance advantages of RED-PSM in a cardiac dynamic MRI setting.
CVOct 11, 2022
Improving Dense Contrastive Learning with Dense Negative PairsBerk Iskender, Zhenlin Xu, Simon Kornblith et al.
Many contrastive representation learning methods learn a single global representation of an entire image. However, dense contrastive representation learning methods such as DenseCL (Wang et al., 2021) can learn better representations for tasks requiring stronger spatial localization of features, such as multi-label classification, detection, and segmentation. In this work, we study how to improve the quality of the representations learned by DenseCL by modifying the training scheme and objective function, and propose DenseCL++. We also conduct several ablation studies to better understand the effects of: (i) various techniques to form dense negative pairs among augmentations of different images, (ii) cross-view dense negative and positive pairs, and (iii) an auxiliary reconstruction task. Our results show 3.5% and 4% mAP improvement over SimCLR (Chen et al., 2020a) andDenseCL in COCO multi-label classification. In COCO and VOC segmentation tasks, we achieve 1.8% and 0.7% mIoU improvements over SimCLR, respectively.
CVOct 23, 2024
Semantic Segmentation and Scene Reconstruction of RGB-D Image Frames: An End-to-End Modular Pipeline for Robotic ApplicationsZhiwu Zheng, Lauren Mentzer, Berk Iskender et al.
Robots operating in unstructured environments require a comprehensive understanding of their surroundings, necessitating geometric and semantic information from sensor data. Traditional RGB-D processing pipelines focus primarily on geometric reconstruction, limiting their ability to support advanced robotic perception, planning, and interaction. A key challenge is the lack of generalized methods for segmenting RGB-D data into semantically meaningful components while maintaining accurate geometric representations. We introduce a novel end-to-end modular pipeline that integrates state-of-the-art semantic segmentation, human tracking, point-cloud fusion, and scene reconstruction. Our approach improves semantic segmentation accuracy by leveraging the foundational segmentation model SAM2 with a hybrid method that combines its mask generation with a semantic classification model, resulting in sharper masks and high classification accuracy. Compared to SegFormer and OneFormer, our method achieves a similar semantic segmentation accuracy (mIoU of 47.0% vs 45.9% in the ADE20K dataset) but provides much more precise object boundaries. Additionally, our human tracking algorithm interacts with the segmentation enabling continuous tracking even when objects leave and re-enter the frame by object re-identification. Our point cloud fusion approach reduces computation time by 1.81x while maintaining a small mean reconstruction error of 25.3 mm by leveraging the semantic information. We validate our approach on benchmark datasets and real-world Kinect RGB-D data, demonstrating improved efficiency, accuracy, and usability. Our structured representation, stored in the Universal Scene Description (USD) format, supports efficient querying, visualization, and robotic simulation, making it practical for real-world deployment.
IVMar 13, 2025
RSR-NF: Neural Field Regularization by Static Restoration Priors for Dynamic ImagingBerk Iskender, Sushan Nakarmi, Nitin Daphalapurkar et al.
Dynamic imaging involves the reconstruction of a spatio-temporal object at all times using its undersampled measurements. In particular, in dynamic computed tomography (dCT), only a single projection at one view angle is available at a time, making the inverse problem very challenging. Moreover, ground-truth dynamic data is usually either unavailable or too scarce to be used for supervised learning techniques. To tackle this problem, we propose RSR-NF, which uses a neural field (NF) to represent the dynamic object and, using the Regularization-by-Denoising (RED) framework, incorporates an additional static deep spatial prior into a variational formulation via a learned restoration operator. We use an ADMM-based algorithm with variable splitting to efficiently optimize the variational objective. We compare RSR-NF to three alternatives: NF with only temporal regularization; a recent method combining a partially-separable low-rank representation with RED using a denoiser pretrained on static data; and a deep-image prior-based model. The first comparison demonstrates the reconstruction improvements achieved by combining the NF representation with static restoration priors, whereas the other two demonstrate the improvement over state-of-the art techniques for dCT.
MED-PHMar 21, 2021
Scatter Correction in X-ray CT by Physics-Inspired Deep LearningBerk Iskender, Yoram Bresler
Scatter due to interaction of photons with the imaged object is a fundamental problem in X-ray Computed Tomography (CT). It manifests as various artifacts in the reconstruction, making its abatement or correction critical for image quality. Despite success in specific settings, hardware-based methods require modification in the hardware, or increase in the scan time or dose. This accounts for the great interest in software-based methods, including Monte-Carlo based scatter estimation, analytical-numerical, and kernel-based methods, with data-driven learning-based approaches demonstrated recently. In this work, two novel physics-inspired deep-learning-based methods, PhILSCAT and OV-PhILSCAT, are proposed. The methods estimate and correct for the scatter in the acquired projection measurements. Different from previous works, they incorporate both an initial reconstruction of the object of interest and the scatter-corrupted measurements related to it, and use a deep neural network architecture and cost function, both specifically tailored to the problem. Numerical experiments with data generated by Monte-Carlo simulations of the imaging of phantoms reveal consistent improvement over a recent purely projection-domain deep neural network scatter correction method.