Xiao-Yun Zhou

22papers

344citations

Novelty42%

AI Score25

Ranked #174,271 of 205,806 authors (top 85%)#52,554 in CV (top 89%)

22 Papers

CVAug 25, 2022

A Compacted Structure for Cross-domain learning on Monocular Depth and Flow Estimation

Yu Chen, Xu Cao, Xiaoyi Lin et al.

Accurate motion and depth recovery is important for many robot vision tasks including autonomous driving. Most previous studies have achieved cooperative multi-task interaction via either pre-defined loss functions or cross-domain prediction. This paper presents a multi-task scheme that achieves mutual assistance by means of our Flow to Depth (F2D), Depth to Flow (D2F), and Exponential Moving Average (EMA). F2D and D2F mechanisms enable multi-scale information integration between optical flow and depth domain based on differentiable shallow nets. A dual-head mechanism is used to predict optical flow for rigid and non-rigid motion based on a divide-and-conquer manner, which significantly improves the optical flow estimation performance. Furthermore, to make the prediction more robust and stable, EMA is used for our multi-task training. Experimental results on KITTI datasets show that our multi-task scheme outperforms other multi-task schemes and provide marked improvements on the prediction results.

CVSep 16, 2018Code

3D Path Planning from a Single 2D Fluoroscopic Image for Robot Assisted Fenestrated Endovascular Aortic Repair

Jian-Qing Zheng, Xiao-Yun Zhou, Celia Riga et al.

The current standard of intra-operative navigation during Fenestrated Endovascular Aortic Repair (FEVAR) calls for need of 3D alignments between inserted devices and aortic branches. The navigation commonly via 2D fluoroscopic images, lacks anatomical information, resulting in longer operation hours and radiation exposure. In this paper, a framework for real-time 3D robotic path planning from a single 2D fluoroscopic image of Abdominal Aortic Aneurysm (AAA) is introduced. A graph matching method is proposed to establish the correspondence between the 3D preoperative and 2D intra-operative AAA skeletons, and then the two skeletons are registered by skeleton deformation and regularization in respect to skeleton length and smoothness. Furthermore, deep learning was used to segment 3D pre-operative AAA from Computed Tomography (CT) scans to facilitate the framework automation. Simulation, phantom and patient AAA data sets have been used to validate the proposed framework. 3D distance error of 2mm was achieved in the phantom setup. Performance advantages were also achieved in terms of accuracy, robustness and time-efficiency. All the code will be open source.

IVOct 17, 2021

A deep learning pipeline for localization, differentiation, and uncertainty estimation of liver lesions using multi-phasic and multi-sequence MRI

Peng Wang, Yuhsuan Wu, Bolin Lai et al.

Objectives: to propose a fully-automatic computer-aided diagnosis (CAD) solution for liver lesion characterization, with uncertainty estimation. Methods: we enrolled 400 patients who had either liver resection or a biopsy and was diagnosed with either hepatocellular carcinoma (HCC), intrahepatic cholangiocarcinoma, or secondary metastasis, from 2006 to 2019. Each patient was scanned with T1WI, T2WI, T1WI venous phase (T2WI-V), T1WI arterial phase (T1WI-A), and DWI MRI sequences. We propose a fully-automatic deep CAD pipeline that localizes lesions from 3D MRI studies using key-slice parsing and provides a confidence measure for its diagnoses. We evaluate using five-fold cross validation and compare performance against three radiologists, including a senior hepatology radiologist, a junior hepatology radiologist and an abdominal radiologist. Results: the proposed CAD solution achieves a mean F1 score of 0.62, outperforming the abdominal radiologist (0.47), matching the junior hepatology radiologist (0.61), and underperforming the senior hepatology radiologist (0.68). The CAD system can informatively assess its diagnostic confidence, i.e., when only evaluating on the 70% most confident cases the mean f1 score and sensitivity at 80% specificity for HCC vs. others are boosted from 0.62 to 0.71 and 0.84 to 0.92, respectively. Conclusion: the proposed fully-automatic CAD solution can provide good diagnostic performance with informative confidence assessments in finding and discriminating liver lesions from MRI studies.

CVApr 29, 2021

Scalable Semi-supervised Landmark Localization for X-ray Images using Few-shot Deep Adaptive Graph

Xiao-Yun Zhou, Bolin Lai, Weijian Li et al.

Landmark localization plays an important role in medical image analysis. Learning based methods, including CNN and GCN, have demonstrated the state-of-the-art performance. However, most of these methods are fully-supervised and heavily rely on manual labeling of a large training dataset. In this paper, based on a fully-supervised graph-based method, DAG, we proposed a semi-supervised extension of it, termed few-shot DAG, \ie five-shot DAG. It first trains a DAG model on the labeled data and then fine-tunes the pre-trained model on the unlabeled data with a teacher-student SSL mechanism. In addition to the semi-supervised loss, we propose another loss using JS divergence to regulate the consistency of the intermediate feature maps. We extensively evaluated our method on pelvis, hand and chest landmark detection tasks. Our experiment results demonstrate consistent and significant improvements over previous methods.

CVMar 24, 2021

Hetero-Modal Learning and Expansive Consistency Constraints for Semi-Supervised Detection from Multi-Sequence Data

Bolin Lai, Yuhsuan Wu, Xiao-Yun Zhou et al.

Lesion detection serves a critical role in early diagnosis and has been well explored in recent years due to methodological advancesand increased data availability. However, the high costs of annotations hinder the collection of large and completely labeled datasets, motivating semi-supervised detection approaches. In this paper, we introduce mean teacher hetero-modal detection (MTHD), which addresses two important gaps in current semi-supervised detection. First, it is not obvious how to enforce unlabeled consistency constraints across the very different outputs of various detectors, which has resulted in various compromises being used in the state of the art. Using an anchor-free framework, MTHD formulates a mean teacher approach without such compromises, enforcing consistency on the soft-output of object centers and size. Second, multi-sequence data is often critical, e.g., for abdominal lesion detection, but unlabeled data is often missing sequences. To deal with this, MTHD incorporates hetero-modal learning in its framework. Unlike prior art, MTHD is able to incorporate an expansive set of consistency constraints that include geometric transforms and random sequence combinations. We train and evaluate MTHD on liver lesion detection using the largest MR lesion dataset to date (1099 patients with >5000 volumes). MTHD surpasses the best fully-supervised and semi-supervised competitors by 10.1% and 3.5%, respectively, in average sensitivity.

CVDec 30, 2020

Knowledge Distillation with Adaptive Asymmetric Label Sharpening for Semi-supervised Fracture Detection in Chest X-rays

Yirui Wang, Kang Zheng, Chi-Tung Chang et al.

Exploiting available medical records to train high performance computer-aided diagnosis (CAD) models via the semi-supervised learning (SSL) setting is emerging to tackle the prohibitively high labor costs involved in large-scale medical image annotations. Despite the extensive attentions received on SSL, previous methods failed to 1) account for the low disease prevalence in medical records and 2) utilize the image-level diagnosis indicated from the medical records. Both issues are unique to SSL for CAD models. In this work, we propose a new knowledge distillation method that effectively exploits large-scale image-level labels extracted from the medical records, augmented with limited expert annotated region-level labels, to train a rib and clavicle fracture CAD model for chest X-ray (CXR). Our method leverages the teacher-student model paradigm and features a novel adaptive asymmetric label sharpening (AALS) algorithm to address the label imbalance problem that specially exists in medical domain. Our approach is extensively evaluated on all CXR (N = 65,845) from the trauma registry of anonymous hospital over a period of 9 years (2008-2016), on the most common rib and clavicle fractures. The experiment results demonstrate that our method achieves the state-of-the-art fracture detection performance, i.e., an area under receiver operating characteristic curve (AUROC) of 0.9318 and a free-response receiver operating characteristic (FROC) score of 0.8914 on the rib fractures, significantly outperforming previous approaches by an AUROC gap of 1.63% and an FROC improvement by 3.74%. Consistent performance gains are also observed for clavicle fracture detection.

LGDec 15, 2020

Amata: An Annealing Mechanism for Adversarial Training Acceleration

Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou et al.

Despite the empirical success in various domains, it has been revealed that deep neural networks are vulnerable to maliciously perturbed input data that much degrade their performance. This is known as adversarial attacks. To counter adversarial attacks, adversarial training formulated as a form of robust optimization has been demonstrated to be effective. However, conducting adversarial training brings much computational overhead compared with standard training. In order to reduce the computational cost, we propose an annealing mechanism, Amata, to reduce the overhead associated with adversarial training. The proposed Amata is provably convergent, well-motivated from the lens of optimal control theory and can be combined with existing acceleration methods to further enhance performance. It is demonstrated that on standard datasets, Amata can achieve similar or better robustness with around 1/3 to 1/2 the computational time compared with traditional methods. In addition, Amata can be incorporated into other adversarial training acceleration algorithms (e.g. YOPO, Free, Fast, and ATTA), which leads to further reduction in computational time on large-scale problems.

CVDec 13, 2020

Fully-Automated Liver Tumor Localization and Characterization from Multi-Phase MR Volumes Using Key-Slice ROI Parsing: A Physician-Inspired Approach

Bolin Lai, Yuhsuan Wu, Xiaoyu Bai et al.

Using radiological scans to identify liver tumors is crucial for proper patient treatment. This is highly challenging, as top radiologists only achieve F1 scores of roughly 80% (hepatocellular carcinoma (HCC) vs. others) with only moderate inter-rater agreement, even when using multi-phase magnetic resonance (MR) imagery. Thus, there is great impetus for computer-aided diagnosis (CAD) solutions. A critical challenge is to robustly parse a 3D MR volume to localize diagnosable regions of interest (ROI), especially for edge cases. In this paper, we break down this problem using a key-slice parser (KSP), which emulates physician workflows by first identifying key slices and then localizing their corresponding key ROIs. To achieve robustness, the KSP also uses curve-parsing and detection confidence re-weighting. We evaluate our approach on the largest multi-phase MR liver lesion test dataset to date (430 biopsy-confirmed patients). Experiments demonstrate that our KSP can localize diagnosable ROIs with high reliability: 87% patients have an average 3D overlap of >= 40% with the ground truth compared to only 79% using the best tested detector. When coupled with a classifier, we achieve an HCC vs. others F1 score of 0.801, providing a fully-automated CAD performance comparable to top human physicians.

LGDec 4, 2020

Batch Group Normalization

Xiao-Yun Zhou, Jiacheng Sun, Nanyang Ye et al.

Deep Convolutional Neural Networks (DCNNs) are hard and time-consuming to train. Normalization is one of the effective solutions. Among previous normalization methods, Batch Normalization (BN) performs well at medium and large batch sizes and is with good generalizability to multiple vision tasks, while its performance degrades significantly at small batch sizes. In this paper, we find that BN saturates at extreme large batch sizes, i.e., 128 images per worker, i.e., GPU, as well and propose that the degradation/saturation of BN at small/extreme large batch sizes is caused by noisy/confused statistic calculation. Hence without adding new trainable parameters, using multiple-layer or multi-iteration information, or introducing extra computation, Batch Group Normalization (BGN) is proposed to solve the noisy/confused statistic calculation of BN at small/extreme large batch sizes with introducing the channel, height and width dimension to compensate. The group technique in Group Normalization (GN) is used and a hyper-parameter G is used to control the number of feature instances used for statistic calculation, hence to offer neither noisy nor confused statistic for different batch sizes. We empirically demonstrate that BGN consistently outperforms BN, Instance Normalization (IN), Layer Normalization (LN), GN, and Positional Normalization (PN), across a wide spectrum of vision tasks, including image classification, Neural Architecture Search (NAS), adversarial learning, Few Shot Learning (FSL) and Unsupervised Domain Adaptation (UDA), indicating its good performance, robust stability to batch size and wide generalizability. For example, for training ResNet-50 on ImageNet with a batch size of 2, BN achieves Top1 accuracy of 66.512% while BGN achieves 76.096% with notable improvement.

IVNov 8, 2020

Real-time Surgical Environment Enhancement for Robot-Assisted Minimally Invasive Surgery Based on Super-Resolution

Ruoxi Wang, Dandan Zhang, Qingbiao Li et al.

In Robot-Assisted Minimally Invasive Surgery (RAMIS), a camera assistant is normally required to control the position and zooming ratio of the laparoscope, following the surgeon's instructions. However, moving the laparoscope frequently may lead to unstable and suboptimal views, while the adjustment of zooming ratio may interrupt the workflow of the surgical operation. To this end, we propose a multi-scale Generative Adversarial Network (GAN)-based video super-resolution method to construct a framework for automatic zooming ratio adjustment. It can provide automatic real-time zooming for high-quality visualization of the Region Of Interest (ROI) during the surgical operation. In the pipeline of the framework, the Kernel Correlation Filter (KCF) tracker is used for tracking the tips of the surgical tools, while the Semi-Global Block Matching (SGBM) based depth estimation and Recurrent Neural Network (RNN)-based context-awareness are developed to determine the upscaling ratio for zooming. The framework is validated with the JIGSAW dataset and Hamlyn Centre Laparoscopic/Endoscopic Video Datasets, with results demonstrating its practicability.

MED-PHDec 23, 2019

Artificial Intelligence in Surgery

Xiao-Yun Zhou, Yao Guo, Mali Shen et al.

Artificial Intelligence (AI) is gradually changing the practice of surgery with the advanced technological development of imaging, navigation and robotic intervention. In this article, the recent successful and influential applications of AI in surgery are reviewed from pre-operative planning and intra-operative guidance to the integration of surgical robots. We end with summarizing the current state, emerging trends and major challenges in the future development of AI in surgery.

IVSep 16, 2019

Instantiation-Net: 3D Mesh Reconstruction from Single 2D Image for Right Ventricle

Zhao-Yang Wang, Xiao-Yun Zhou, Peichao Li et al.

3D shape instantiation which reconstructs the 3D shape of a target from limited 2D images or projections is an emerging technique for surgical intervention. It improves the currently less-informative and insufficient 2D navigation schemes for robot-assisted Minimally Invasive Surgery (MIS) to 3D navigation. Previously, a general and registration-free framework was proposed for 3D shape instantiation based on Kernel Partial Least Square Regression (KPLSR), requiring manually segmented anatomical structures as the pre-requisite. Two hyper-parameters including the Gaussian width and component number also need to be carefully adjusted. Deep Convolutional Neural Network (DCNN) based framework has also been proposed to reconstruct a 3D point cloud from a single 2D image, with end-to-end and fully automatic learning. In this paper, an Instantiation-Net is proposed to reconstruct the 3D mesh of a target from its a single 2D image, by using DCNN to extract features from the 2D image and Graph Convolutional Network (GCN) to reconstruct the 3D mesh, and using Fully Connected (FC) layers to connect the DCNN to GCN. Detailed validation was performed to demonstrate the practical strength of the method and its potential clinical use.

IVSep 16, 2019

Z-Net: an Anisotropic 3D DCNN for Medical CT Volume Segmentation

Peichao Li, Xiao-Yun Zhou, Zhao-Yang Wang et al.

Accurate volume segmentation from the Computed Tomography (CT) scan is a common prerequisite for pre-operative planning, intra-operative guidance and quantitative assessment of therapeutic outcomes in robot-assisted Minimally Invasive Surgery (MIS). 3D Deep Convolutional Neural Network (DCNN) is a viable solution for this task, but is memory intensive. Small isotropic patches are cropped from the original and large CT volume to mitigate this issue in practice, but it may cause discontinuities between the adjacent patches and severe class-imbalances within individual sub-volumes. This paper presents a new 3D DCNN framework, namely Z-Net, to tackle the discontinuity and class-imbalance issue by preserving a full field-of-view of the objects in the XY planes using anisotropic spatial separable convolutions. The proposed Z-Net can be seamlessly integrated into existing 3D DCNNs with isotropic convolutions such as 3D U-Net and V-Net, with improved volume segmentation Intersection over Union (IoU) - up to $12.6\%$. Detailed validation of Z-Net is provided for CT aortic, liver and lung segmentation, demonstrating the effectiveness and practical value of Z-Net for intra-operative 3D navigation in robot-assisted MIS.

IVAug 21, 2019

U-Net Training with Instance-Layer Normalization

Xiao-Yun Zhou, Peichao Li, Zhao-Yang Wang et al.

Normalization layers are essential in a Deep Convolutional Neural Network (DCNN). Various normalization methods have been proposed. The statistics used to normalize the feature maps can be computed at batch, channel, or instance level. However, in most of existing methods, the normalization for each layer is fixed. Batch-Instance Normalization (BIN) is one of the first proposed methods that combines two different normalization methods and achieve diverse normalization for different layers. However, two potential issues exist in BIN: first, the Clip function is not differentiable at input values of 0 and 1; second, the combined feature map is not with a normalized distribution which is harmful for signal propagation in DCNN. In this paper, an Instance-Layer Normalization (ILN) layer is proposed by using the Sigmoid function for the feature map combination, and cascading group normalization. The performance of ILN is validated on image segmentation of the Right Ventricle (RV) and Left Ventricle (LV) using U-Net as the network architecture. The results show that the proposed ILN outperforms previous traditional and popular normalization methods with noticeable accuracy improvements for most validations, supporting the effectiveness of the proposed ILN.

CVJul 24, 2019

One-stage Shape Instantiation from a Single 2D Image to 3D Point Cloud

Xiao-Yun Zhou, Zhao-Yang Wang, Peichao Li et al.

Shape instantiation which predicts the 3D shape of a dynamic target from one or more 2D images is important for real-time intra-operative navigation. Previously, a general shape instantiation framework was proposed with manual image segmentation to generate a 2D Statistical Shape Model (SSM) and with Kernel Partial Least Square Regression (KPLSR) to learn the relationship between the 2D and 3D SSM for 3D shape prediction. In this paper, the two-stage shape instantiation is improved to be one-stage. PointOutNet with 19 convolutional layers and three fully-connected layers is used as the network structure and Chamfer distance is used as the loss function to predict the 3D target point cloud from a single 2D image. With the proposed one-stage shape instantiation algorithm, a spontaneous image-to-point cloud training and inference can be achieved. A dataset from 27 Right Ventricle (RV) subjects, indicating 609 experiments, were used to validate the proposed one-stage shape instantiation algorithm. An average point cloud-to-point cloud (PC-to-PC) error of 1.72mm has been achieved, which is comparable to the PLSR-based (1.42mm) and KPLSR-based (1.31mm) two-stage shape instantiation algorithm.

CVFeb 28, 2019

Real-time 3D Shape Instantiation for Partially-deployed Stent Segment from a Single 2D Fluoroscopic Image in Robot-assisted Fenestrated Endovascular Aortic Repair

Jian-Qing Zheng, Xiao-Yun Zhou, Guang-Zhong Yang

In robot-assisted Fenestrated Endovascular Aortic Repair (FEVAR), accurate alignment of stent graft fenestrations or scallops with aortic branches is essential for establishing complete blood flow perfusion. Current navigation is largely based on 2D fluoroscopic images, which lacks 3D anatomical information, thus causing longer operation time as well as high risks of radiation exposure. Previously, 3D shape instantiation frameworks for real-time 3D shape reconstruction of fully-deployed or fully-compressed stent graft from a single 2D fluoroscopic image have been proposed for 3D navigation in robot-assisted FEVAR. However, these methods could not instantiate partially-deployed stent segments, as the 3D marker references are unknown. In this paper, an adapted Graph Convolutional Network (GCN) is proposed to predict 3D marker references from 3D fully-deployed markers. As original GCN is for classification, in this paper, the coarsening layers are removed and the softmax function at the network end is replaced with linear mapping for the regression task. The derived 3D and the 2D marker references are used to instantiate partially-deployed stent segment shape with the existing 3D shape instantiation framework. Validations were performed on three commonly used stent grafts and five patient-specific 3D printed aortic aneurysm phantoms. Comparable performances with average mesh distance errors of 1$\sim$3mm and average angular errors around 7degree were achieved.

LGJan 26, 2019

ACNN: a Full Resolution DCNN for Medical Image Segmentation

Xiao-Yun Zhou, Jian-Qing Zheng, Peichao Li et al.

Deep Convolutional Neural Networks (DCNNs) are used extensively in medical image segmentation and hence 3D navigation for robot-assisted Minimally Invasive Surgeries (MISs). However, current DCNNs usually use down sampling layers for increasing the receptive field and gaining abstract semantic information. These down sampling layers decrease the spatial dimension of feature maps, which can be detrimental to image segmentation. Atrous convolution is an alternative for the down sampling layer. It increases the receptive field whilst maintains the spatial dimension of feature maps. In this paper, a method for effective atrous rate setting is proposed to achieve the largest and fully-covered receptive field with a minimum number of atrous convolutional layers. Furthermore, a new and full resolution DCNN - Atrous Convolutional Neural Network (ACNN), which incorporates cascaded atrous II-blocks, residual learning and Instance Normalization (IN) is proposed. Application results of the proposed ACNN to Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) image segmentation demonstrate that the proposed ACNN can achieve higher segmentation Intersection over Unions (IoUs) than U-Net and Deeplabv3+, but with reduced trainable parameters.

CVSep 11, 2018

Normalization in Training U-Net for 2D Biomedical Semantic Segmentation

Xiao-Yun Zhou, Guang-Zhong Yang

2D biomedical semantic segmentation is important for robotic vision in surgery. Segmentation methods based on Deep Convolutional Neural Network (DCNN) can out-perform conventional methods in terms of both accuracy and levels of automation. One common issue in training a DCNN for biomedical semantic segmentation is the internal covariate shift where the training of convolutional kernels is encumbered by the distribution change of input features, hence both the training speed and performance are decreased. Batch Normalization (BN) is the first proposed method for addressing internal covariate shift and is widely used. Instance Normalization (IN) and Layer Normalization (LN) have also been proposed. Group Normalization (GN) is proposed more recently and has not yet been applied to 2D biomedical semantic segmentation, however, no specific validations on GN were given. Most DCNNs for biomedical semantic segmentation adopt BN as the normalization method by default, without reviewing its performance. In this paper, four normalization methods - BN, IN, LN and GN are compared in details, specifically for 2D biomedical semantic segmentation. U-Net is adopted as the basic DCNN structure. Three datasets regarding the Right Ventricle (RV), aorta, and Left Ventricle (LV) are used for the validation. The results show that detailed subdivision of the feature map, i.e. GN with a large group number or IN, achieves higher accuracy. This accuracy improvement mainly comes from better model generalization. Codes are uploaded and maintained at Xiao-Yun Zhou's Github.

CVApr 19, 2018

Estimation of Tissue Oxygen Saturation from RGB Images based on Pixel-level Image Translation

Qing-Biao Li, Xiao-Yun Zhou, Jianyu Lin et al.

Intra-operative measurement of tissue oxygen saturation (StO2) has been widely explored by pulse oximetry or hyperspectral imaging (HSI) to assess the function and viability of tissue. In this paper we propose a pixel- level image-to-image translation approach based on conditional Generative Adversarial Networks (cGAN) to estimate tissue oxygen saturation (StO2) directly from RGB images. The real-time performance and non-reliance on additional hardware, enable a seamless integration of the proposed method into surgical and diagnostic workflows with standard endoscope systems. For validation, RGB images and StO2 ground truth were simulated and estimated from HSI images collected by a liquid crystal tuneable filter (LCTF) endoscope for three tissue types (porcine bowel, lamb uterus and rabbit uterus). The result show that the proposed method can achieve visually identical images with comparable accuracy.

CVApr 9, 2018

Abdominal Aortic Aneurysm Segmentation with a Small Number of Training Subjects

Jian-Qing Zheng, Xiao-Yun Zhou, Qing-Biao Li et al.

Pre-operative Abdominal Aortic Aneurysm (AAA) 3D shape is critical for customized stent-graft design in Fenestrated Endovascular Aortic Repair (FEVAR). Traditional segmentation approaches implement expert-designed feature extractors while recent deep neural networks extract features automatically with multiple non-linear modules. Usually, a large training dataset is essential for applying deep learning on AAA segmentation. In this paper, the AAA was segmented using U-net with a small number (two) of training subjects. Firstly, Computed Tomography Angiography (CTA) slices were augmented with gray value variation and translation to avoid the overfitting caused by the small number of training subjects. Then, U-net was trained to segment the AAA. Dice Similarity Coefficients (DSCs) over 0.8 were achieved on the testing subjects. The PLZ, DLZ and aortic branches are all reconstructed reasonably, which will facilitate stent graft customization and help shape instantiation for intra-operative surgery navigation in FEVAR.

CVDec 30, 2017

A Real-time and Registration-free Framework for Dynamic Shape Instantiation

Xiao-Yun Zhou, Guang-Zhong Yang, Su-Lin Lee

Real-time 3D navigation during minimally invasive procedures is an essential yet challenging task, especially when considerable tissue motion is involved. To balance image acquisition speed and resolution, only 2D images or low-resolution 3D volumes can be used clinically. In this paper, a real-time and registration-free framework for dynamic shape instantiation, generalizable to multiple anatomical applications, is proposed to instantiate high-resolution 3D shapes of an organ from a single 2D image intra-operatively. Firstly, an approximate optimal scan plane was determined by analyzing the pre-operative 3D statistical shape model (SSM) of the anatomy with sparse principal component analysis (SPCA) and considering practical constraints . Secondly, kernel partial least squares regression (KPLSR) was used to learn the relationship between the pre-operative 3D SSM and a synchronized 2D SSM constructed from 2D images obtained at the approximate optimal scan plane. Finally, the derived relationship was applied to the new intra-operative 2D image obtained at the same scan plane to predict the high-resolution 3D shape intra-operatively. A major feature of the proposed framework is that no extra registration between the pre-operative 3D SSM and the synchronized 2D SSM is required. Detailed validation was performed on studies including the liver and right ventricle (RV) of the heart. The derived results (mean accuracy of 2.19mm on patients and computation speed of 1ms) demonstrate its potential clinical value for real-time, high-resolution, dynamic and 3D interventional guidance.

ROSep 22, 2017

Real-time 3D Shape Instantiation from Single Fluoroscopy Projection for Fenestrated Stent Graft Deployment

Xiao-Yun Zhou, Jianyu Lin, Celia Riga et al.

Robot-assisted deployment of fenestrated stent grafts in Fenestrated Endovascular Aortic Repair (FEVAR) requires accurate geometrical alignment. Currently, this process is guided by 2D fluoroscopy, which is uninformative and error prone. In this paper, a real-time framework is proposed to instantiate the 3D shape of a fenestrated stent graft based on only a single low-dose 2D fluoroscopic image. Firstly, the fenestrated stent graft was placed with markers. Secondly, the 3D pose of each stent segment was instantiated by the RPnP (Robust Perspective-n-Point) method. Thirdly, the 3D shape of the whole stent graft was instantiated via graft gap interpolation. Focal-Unet was proposed to segment the markers from 2D fluoroscopic images to achieve semi-automatic marker detection. The proposed framework was validated on five patient-specific 3D printed phantoms of aortic aneurysms and three stent grafts with new marker placements, showing an average distance error of 1-3mm and an average angle error of 4 degree.