IV Apr 26, 2021
Recalibration of Aleatoric and Epistemic Regression Uncertainty in Medical Imaging
Max-Heinrich Laves, Sontje Ihler, Jacob F. Fast et al.
The consideration of predictive uncertainty in medical imaging with deep learning is of utmost importance. We apply estimation of both aleatoric and epistemic uncertainty by variational Bayesian inference with Monte Carlo dropout to regression tasks and show that predictive uncertainty is systematically underestimated. We apply $\sigma$ scaling with a single scalar value, a simple yet effective calibration method for both types of uncertainty. The performance of our approach is evaluated on a variety of common medical regression data sets using different state-of-the-art convolutional network architectures. In our experiments, $\sigma$ scaling is able to reliably recalibrate predictive uncertainty. It is easy to implement and maintains accuracy. Well-calibrated uncertainty in regression allows robust rejection of unreliable predictions or detection of out-of-distribution samples. Our source code is available at https://github.com/mlaves/well-calibrated-regression-uncertainty
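As a rough illustration of $\sigma$ scaling, the sketch below fits a single scalar on a held-out calibration set by minimizing the Gaussian negative log-likelihood, which has a closed-form solution. The abstract does not state the formula, so the closed form, the function names, and the toy numbers are assumptions made for this example; the authors' implementation is in the linked repository and may differ.

```python
import math

def gaussian_nll(mu, sigma, y):
    """Mean Gaussian negative log-likelihood of targets y under N(mu, sigma^2)."""
    total = 0.0
    for m, s, t in zip(mu, sigma, y):
        total += 0.5 * math.log(2.0 * math.pi * s * s) + (t - m) ** 2 / (2.0 * s * s)
    return total / len(y)

def fit_sigma_scale(mu, sigma, y):
    """Scalar s minimizing the NLL of N(mu, (s * sigma)^2) on a calibration set.

    Setting the derivative of the NLL with respect to s to zero gives
    s^2 = mean(((y - mu) / sigma)^2).
    """
    s_squared = sum(((t - m) / s) ** 2 for m, s, t in zip(mu, sigma, y)) / len(y)
    return math.sqrt(s_squared)

# Toy calibration set (hypothetical values): the predicted sigma is too small
# compared to the actual errors, i.e. uncertainty is underestimated.
mu = [0.0, 0.0, 0.0, 0.0]
sigma = [0.5, 0.5, 0.5, 0.5]
y = [1.0, -1.0, 2.0, -2.0]

s = fit_sigma_scale(mu, sigma, y)
sigma_calibrated = [s * v for v in sigma]
```

Only the predicted standard deviation is rescaled; the predicted means are untouched, which is consistent with the abstract's statement that accuracy is maintained.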
CV Mar 23, 2019
Retinal OCT disease classification with variational autoencoder regularization
Max-Heinrich Laves, Sontje Ihler, Lüder A. Kahrs et al.
According to the World Health Organization, 285 million people worldwide live with visual impairment. The most commonly used imaging technique for diagnosis in ophthalmology is optical coherence tomography (OCT). However, analysis of retinal OCT requires trained ophthalmologists and time, making a comprehensive early diagnosis unlikely. A recent study established a diagnostic tool based on convolutional neural networks (CNN), which was trained on a large database of retinal OCT images. The performance of the tool in classifying retinal conditions was on par with that of trained medical experts. However, the training of these networks is based on an enormous amount of labeled data, which is expensive and difficult to obtain. Therefore, this paper describes a method based on variational autoencoder regularization that improves classification performance when using a limited amount of labeled data. This work uses a two-path CNN model combining a classification network with an autoencoder (AE) for regularization. The key idea behind this is to prevent overfitting when using a limited training dataset with a small number of patients. Results show superior classification performance compared to a pre-trained and fully fine-tuned baseline ResNet-34. Clustering of the latent space in relation to the disease class is distinct. Neural networks for disease classification on OCTs can benefit from regularization using variational autoencoders when trained with a limited amount of patient data. Especially in the medical imaging domain, data annotated by experts is expensive to obtain.
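One common way to combine a classification path with a variational autoencoder path is a weighted sum of a cross-entropy term, a reconstruction term, and a KL divergence term. The sketch below shows that generic form for a single sample. The abstract does not specify the loss, so the mean-squared-error reconstruction term, the weight `beta`, and all function names are assumptions for illustration only.

```python
import math

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])

def mse(x, x_hat):
    """Mean squared reconstruction error between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def kl_to_standard_normal(mu, logvar):
    """KL divergence from a diagonal Gaussian N(mu, exp(logvar)) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

def regularized_loss(probs, label, x, x_hat, mu, logvar, beta=1.0):
    """Classification loss plus a beta-weighted VAE regularizer (hypothetical weighting)."""
    vae_term = mse(x, x_hat) + kl_to_standard_normal(mu, logvar)
    return cross_entropy(probs, label) + beta * vae_term
```

With a perfect reconstruction and a latent code that already matches the standard normal prior, the regularizer vanishes and only the classification term remains.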
CV Jan 19, 2019
Endoscopic vs. volumetric OCT imaging of mastoid bone structure for pose estimation in minimally invasive cochlear implant surgery
Max-Heinrich Laves, Sarah Latus, Jan Bergmeier et al.
Purpose: The facial recess is a delicate structure that must be protected in minimally invasive cochlear implant surgery. Current research estimates the drill trajectory by using endoscopy of the unique mastoid patterns. However, missing depth information limits available features for a registration to preoperative CT data. Therefore, this paper evaluates OCT for enhanced imaging of drill holes in mastoid bone and compares OCT data to original endoscopic images. Methods: A catheter-based OCT probe is inserted into a drill trajectory of a mastoid phantom in a translation-rotation manner to acquire the inner surface state. The images are undistorted and stitched to create volumetric data of the drill hole. The mastoid cell pattern is segmented automatically and compared to ground truth. Results: The mastoid pattern segmented on images acquired with OCT shows a similarity of J = 73.6 % to ground truth based on endoscopic images, measured with the Jaccard metric. Leveraged by additional depth information, automated segmentation tends to be more robust and fail-safe compared to endoscopic images. Conclusion: The feasibility of using a clinically approved OCT probe for imaging the drill hole in cochlear implantation is shown. The resulting volumetric images provide additional information on the shape of cavities in the bone structure, which will be useful for image-to-patient registration and to estimate the drill trajectory. This will be another step towards safe minimally invasive cochlear implantation.
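For reference, the Jaccard metric J between two binary segmentation masks is the size of their intersection divided by the size of their union. A minimal sketch on flattened masks follows; the function name and the convention of returning 1.0 for two empty masks are assumptions of this example, not taken from the paper.

```python
def jaccard_index(pred, truth):
    """Jaccard index |A ∩ B| / |A ∪ B| of two flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0

# Toy masks (hypothetical): one overlapping pixel out of three labeled pixels.
pred = [1, 1, 0, 0]
truth = [1, 0, 1, 0]
j = jaccard_index(pred, truth)
```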
CV Oct 26, 2018
Deep learning based 2.5D flow field estimation for maximum intensity projections of 4D optical coherence tomography
Max-Heinrich Laves, Lüder A. Kahrs, Tobias Ortmaier
In microsurgery, lasers have emerged as precise tools for bone ablation. A challenge is automatic control of laser bone ablation with 4D optical coherence tomography (OCT). As a high-resolution imaging modality, OCT provides volumetric images of tissue and information on bone position and orientation (pose) as well as thickness. However, existing approaches for OCT-based laser ablation control rely on external tracking systems or invasively ablated artificial landmarks for tracking the pose of the OCT probe relative to the tissue. This can be superseded by estimating the scene flow caused by relative movement between the OCT-based laser ablation system and the patient. Therefore, this paper deals with 2.5D scene flow estimation of volumetric OCT images for application in laser ablation. We present a semi-supervised convolutional neural network based tracking scheme for subsequent 3D OCT volumes and apply it to a realistic semi-synthetic data set of ex vivo human temporal bone specimens. The scene flow is estimated in a two-stage approach. In the first stage, 2D lateral scene flow is computed on census-transformed en-face arguments-of-maximum intensity projections. Subsequent to this, the projections are warped by the predicted lateral flow and 1D depth flow is estimated. The neural network is trained semi-supervised by combining error to ground truth and the reconstruction error of warped images with assumptions of spatial flow smoothness. Quantitative evaluation reveals a mean endpoint error of $(4.7 \pm 3.5)$ voxels or $(27.5 \pm 20.5)\,\mu\mathrm{m}$ for scene flow estimation caused by simulated relative movement between the OCT probe and bone. The scene flow estimation for 4D OCT enables its use for markerless tracking of mastoid bone structures for image guidance in general, and automated laser ablation control.
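The mean endpoint error reported above is commonly defined as the average Euclidean distance between predicted and ground-truth flow vectors. The sketch below follows that standard definition for 3D flow vectors given in voxels; the function name and the toy vectors are assumptions for illustration and are not taken from the paper's evaluation code.

```python
import math

def mean_endpoint_error(pred_flow, true_flow):
    """Average Euclidean distance between predicted and ground-truth flow vectors."""
    distances = [math.dist(p, t) for p, t in zip(pred_flow, true_flow)]
    return sum(distances) / len(distances)

# Toy flow field (hypothetical): two (dx, dy, dz) vectors in voxels.
pred_flow = [(3.0, 4.0, 0.0), (0.0, 0.0, 0.0)]
true_flow = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
epe = mean_endpoint_error(pred_flow, true_flow)
```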
CV Jul 16, 2018
A Dataset of Laryngeal Endoscopic Images with Comparative Study on Convolution Neural Network Based Semantic Segmentation
Max-Heinrich Laves, Jens Bicker, Lüder A. Kahrs et al.
Purpose: Automated segmentation of anatomical structures in medical image analysis is a prerequisite for autonomous diagnosis as well as various computer and robot aided interventions. Recent methods based on deep convolutional neural networks (CNN) have outperformed former heuristic methods. However, those methods were primarily evaluated on rigid, real-world environments. In this study, existing segmentation methods were evaluated for their use on a new dataset of transoral endoscopic exploration. Methods: Four machine-learning-based methods (SegNet, UNet, ENet and ErfNet) were trained with supervision on a novel 7-class dataset of the human larynx. The dataset contains 536 manually segmented images from two patients during laser incisions. The Intersection-over-Union (IoU) evaluation metric was used to measure the accuracy of each method. Data augmentation and network ensembling were employed to increase segmentation accuracy. Stochastic inference was used to show uncertainties of the individual models. Patient-to-patient transfer was investigated using patient-specific fine-tuning. Results: In this study, a weighted average ensemble network of UNet and ErfNet was best suited for the segmentation of laryngeal soft tissue with a mean IoU of 84.7 %. The highest efficiency was achieved by ENet with a mean inference time of 9.22 ms per image. It is shown that 10 additional images from a new patient are sufficient for patient-specific fine-tuning. Conclusion: CNN-based methods for semantic segmentation are applicable to endoscopic images of laryngeal soft tissue. The segmentation can be used for active constraints or to monitor morphological changes and autonomously detect pathologies. Further improvements could be achieved by using a larger dataset or training the models in a self-supervised manner on additional unlabeled data.
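A weighted average ensemble of two segmentation networks can be sketched as a per-pixel weighted mean of their class probabilities followed by an argmax. The abstract does not give the weights or the exact combination rule, so the weight `w_a`, the function name, and the toy probabilities below are assumptions for illustration only.

```python
def weighted_ensemble(prob_a, prob_b, w_a=0.5):
    """Per-pixel labels from a weighted average of two models' class probabilities.

    prob_a, prob_b: one list of class probabilities per pixel.
    w_a: weight of the first model (hypothetical); the second gets 1 - w_a.
    """
    w_b = 1.0 - w_a
    labels = []
    for pa, pb in zip(prob_a, prob_b):
        averaged = [w_a * a + w_b * b for a, b in zip(pa, pb)]
        # Index of the class with the highest averaged probability.
        labels.append(max(range(len(averaged)), key=averaged.__getitem__))
    return labels

# Toy example (hypothetical): two pixels, two classes, two models.
prob_a = [[0.9, 0.1], [0.4, 0.6]]
prob_b = [[0.2, 0.8], [0.3, 0.7]]
equal_weights = weighted_ensemble(prob_a, prob_b, w_a=0.5)
favor_b = weighted_ensemble(prob_a, prob_b, w_a=0.2)
```

Changing the weight can flip the label of pixels on which the two models disagree, which is why such weights are usually tuned on validation data.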