John A. Lee

h-index: 55

Papers: 11

Citations: 380

Novelty: 49%

AI Score: 39

Ranked #81,757 of 194,257 authors (top 42%) · #27,590 in CV (top 47%)

11 Papers

2.0 · LG · Oct 30, 2023

Can input reconstruction be used to directly estimate uncertainty of a regression U-Net model? -- Application to proton therapy dose prediction for head and neck cancer patients

Margerie Huet-Dastarac, Dan Nguyen, Steve Jiang et al.

Estimating the uncertainty of deep learning models in a reliable and efficient way remains an open problem, for which many different solutions have been proposed in the literature. The most common methods are based on Bayesian approximations, like Monte Carlo dropout (MCDO) or Deep ensembling (DE), but they have a high inference time (i.e. require multiple inference passes) and might not work for out-of-distribution (OOD) detection (i.e. similar uncertainty for in-distribution (ID) and OOD data). In safety-critical environments, like medical applications, accurate and fast uncertainty estimation methods that can detect OOD data are crucial, since wrong predictions can jeopardize patient safety. In this study, we present an alternative direct uncertainty estimation method and apply it to a regression U-Net architecture. The method consists in adding a branch from the bottleneck that reconstructs the input. The input reconstruction error can be used as a surrogate for the model uncertainty. As a proof of concept, our method is applied to proton therapy dose prediction in head and neck cancer patients. Accuracy, time gain, and OOD detection are analyzed for our method in this particular application and compared with the popular MCDO and DE. The input reconstruction method showed a higher Pearson correlation coefficient with the prediction error (0.620) than DE and MCDO (between 0.447 and 0.612). Moreover, our method allows easier identification of OOD data (Z-score of 34.05). It estimates the uncertainty simultaneously with the regression task and therefore requires less time or computational resources.
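
The two figures of merit quoted in this abstract (a Pearson correlation between an uncertainty surrogate and the prediction error, and a Z-score for an OOD case) can be illustrated with a minimal sketch. The per-patient values below are invented for illustration, the function names are hypothetical, and the abstract does not state exactly how the paper computes its Z-score, so the population-standard-deviation version here is an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def z_score(value, reference):
    """Distance of `value` from the mean of `reference`, in (population) standard deviations."""
    n = len(reference)
    mean = sum(reference) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in reference) / n)
    return (value - mean) / std

# Hypothetical in-distribution values, NOT taken from the paper.
recon_error = [0.10, 0.12, 0.15, 0.20, 0.26]  # input reconstruction error per patient
pred_error = [1.0, 1.1, 1.6, 1.9, 2.7]        # dose prediction error per patient

r = pearson(recon_error, pred_error)  # how well the surrogate tracks the true error
z = z_score(0.90, recon_error)        # a hypothetical OOD case with a large reconstruction error
```

A correlation close to 1 means the reconstruction error ranks cases much like the true prediction error does. A large Z-score means the OOD case sits far outside the in-distribution spread of reconstruction errors.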

4.1 · LG · Sep 9, 2025

FUnc-SNE: A flexible, Fast, and Unconstrained algorithm for neighbour embeddings

Pierre Lambert, Edouard Couplet, Michel Verleysen et al.

Neighbour embeddings (NE) allow the representation of high-dimensional datasets in lower-dimensional spaces and are often used in data visualisation. In practice, accelerated approximations are employed to handle very large datasets. Accelerating NE is challenging, and two main directions have been explored: very coarse approximations based on negative sampling (as in UMAP) achieve high effective speed but may lack quality in the extracted structures; less coarse approximations, as used in FIt-SNE or BH-t-SNE, offer better structure preservation at the cost of speed, while also restricting the target dimensionality to 2 or 3, limiting NE to visualisation. In some variants, the precision of these costlier accelerations also enables finer-grained control over the extracted structures through dedicated hyperparameters. This paper proposes to bridge the gap between both approaches by introducing a novel way to accelerate NE, requiring a small number of computations per iteration while maintaining good fine-grained structure preservation and flexibility through hyperparameter tuning, without limiting the dimensionality of the embedding space. The method was designed for interactive exploration of data; as such, it abandons the traditional two-phased approach of other NE methods, allowing instantaneous visual feedback when changing hyperparameters, even when these control processes happening on the high-dimensional side of the computations. Experiments using a publicly available, GPU-accelerated GUI integration of the method show promising results in terms of speed and flexibility in the extracted structures, and suggest potential uses in broader machine learning contexts with minimal algorithmic modifications. Central to this algorithm is a novel approach to iterative approximate nearest neighbour search, which shows promising results compared to nearest neighbour descent.

2.3 · OPTICS · Apr 22, 2021

Compressive lensless endoscopy with partial speckle scanning

Stéphanie Guérit, Siddharth Sivankutty, John Aldo Lee et al.

The lensless endoscope (LE) is a promising device to acquire in vivo images at a cellular scale. The tiny size of the probe enables a deep exploration of the tissues. Lensless endoscopy with a multicore fiber (MCF) commonly uses a spatial light modulator (SLM) to coherently combine, at the output of the MCF, a few hundred beamlets into a focus spot. This spot is subsequently scanned across the sample to generate a fluorescent image. We propose here a novel scanning scheme, partial speckle scanning (PSS), inspired by compressive sensing theory, that avoids the use of an SLM to perform fluorescent imaging in LE with reduced acquisition time. Such a strategy avoids photo-bleaching while keeping high reconstruction quality. We build our approach on two key properties of the LE: (i) the ability to easily generate speckles, and (ii) the memory effect in MCF, which allows the use of fast scan mirrors to shift light patterns. First, we show that speckles are sub-exponential random fields. Despite their granular structure, an appropriate choice of the reconstruction parameters makes them good candidates to build efficient sensing matrices. Then, we numerically validate our approach and apply it to experimental data. The proposed sensing technique outperforms conventional raster scanning: higher reconstruction quality is achieved with far fewer observations. For a fixed reconstruction quality, our speckle scanning approach is faster than compressive sensing schemes that require changing the speckle pattern for each observation.

3.3 · LG · Oct 3, 2020 · Code

Perplexity-free Parametric t-SNE

Francesco Crecchi, Cyril de Bodt, Michel Verleysen et al.

The t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm is a ubiquitously employed dimensionality reduction (DR) method. Its non-parametric nature and impressive efficacy motivated its parametric extension. It is, however, bound to a user-defined perplexity parameter, restricting its DR quality compared to recently developed multi-scale perplexity-free approaches. This paper hence proposes a multi-scale parametric t-SNE scheme, relieved of perplexity tuning and with a deep neural network implementing the mapping. It produces reliable embeddings with out-of-sample extensions, competitive with the best perplexity adjustments in terms of neighborhood preservation on multiple data sets.
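
For readers unfamiliar with the perplexity parameter that this paper removes, the sketch below shows how standard t-SNE uses it. For each point, a Gaussian precision is tuned by bisection so that the neighbour distribution reaches a user-chosen perplexity. This illustrates only the conventional single-scale step, not the paper's multi-scale scheme; the function names and the toy distances are assumptions made for illustration.

```python
import math

def conditional_probs(sq_dists, precision):
    """Gaussian neighbour probabilities p(j|i) from squared distances to point i."""
    d0 = min(sq_dists)  # shift by the smallest distance for numerical stability
    weights = [math.exp(-precision * (d - d0)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]

def perplexity(probs):
    """Perplexity = 2 ** (Shannon entropy in bits): the effective number of neighbours."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy

def precision_for_perplexity(sq_dists, target, iters=60):
    """Bisection (on a log scale) for the precision whose perplexity matches `target`."""
    lo, hi = 1e-10, 1e10
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if perplexity(conditional_probs(sq_dists, mid)) > target:
            lo = mid  # distribution too flat: sharpen it with a larger precision
        else:
            hi = mid  # distribution too peaked: flatten it with a smaller precision
    return math.sqrt(lo * hi)

# Hypothetical squared distances from one point to its six neighbours.
sq_dists = [1.0, 2.0, 4.0, 9.0, 16.0, 25.0]
beta = precision_for_perplexity(sq_dists, target=3.0)
achieved = perplexity(conditional_probs(sq_dists, beta))
```

Because the result depends on the chosen target, a single perplexity fixes one neighbourhood scale for the whole embedding, which is the tuning burden that multi-scale approaches aim to avoid.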

5.8 · CV · Sep 2, 2020

Deep Learning to Detect Bacterial Colonies for the Production of Vaccines

Thomas Beznik, Paul Smyth, Gaël de Lannoy et al.

During the development of vaccines, bacterial colony forming units (CFUs) are counted in order to quantify the yield in the fermentation process. This manual task is time-consuming and error-prone. In this work we test multiple segmentation algorithms based on the U-Net CNN architecture and show that these offer robust, automated CFU counting. We show that the multiclass generalisation with a bespoke loss function allows distinguishing virulent and avirulent colonies with acceptable accuracy. While many possibilities are left to explore, our results show the potential of deep learning for separating and classifying bacterial colonies.
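
The abstract does not say how a count is derived from a segmentation mask. One common and simple option, shown here purely as an assumption, is to count connected foreground regions in the binary mask. The function name and the toy mask are hypothetical.

```python
from collections import deque

def count_colonies(mask):
    """Count 4-connected foreground regions in a binary mask given as a list of rows."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1  # found a pixel of a region not visited yet
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill over the region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

# Toy 4x5 mask containing three separate blobs.
toy_mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
n_colonies = count_colonies(toy_mask)
```

Touching colonies merge into a single region under this scheme, which is one reason separating colonies is a harder problem than plain foreground segmentation.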

10.7 · CV · Dec 3, 2018 · Code

Knowing what you know in brain segmentation using Bayesian deep neural networks

Patrick McClure, Nao Rho, John A. Lee et al.

In this paper, we describe a Bayesian deep neural network (DNN) for predicting FreeSurfer segmentations of structural MRI volumes, in minutes rather than hours. The network was trained and evaluated on a large dataset (n = 11,480), obtained by combining data from more than a hundred different sites, and also evaluated on another completely held-out dataset (n = 418). The network was trained using a novel spike-and-slab dropout-based variational inference approach. We show that, on these datasets, the proposed Bayesian DNN outperforms previously proposed methods, in terms of the similarity between the segmentation predictions and the FreeSurfer labels, and the usefulness of the estimated uncertainty of these predictions. In particular, we demonstrate that the prediction uncertainty of this network at each voxel is a good indicator of whether the network has made an error and that the uncertainty across the whole brain can predict the manual quality control ratings of a scan. The proposed Bayesian DNN method should be applicable to any new network architecture for addressing the segmentation problem.
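
The abstract does not specify which per-voxel uncertainty measure is used. As a generic illustration for Bayesian networks that are sampled several times, the sketch below averages class probabilities over stochastic forward passes and takes the entropy of the result. The measure, the function names, and the sample values are all assumptions, not the paper's method.

```python
import math

def mean_probs(samples):
    """Average class probabilities over several stochastic forward passes for one voxel."""
    n = len(samples)
    k = len(samples[0])
    return [sum(s[c] for s in samples) / n for c in range(k)]

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a class distribution; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical softmax outputs for one voxel over three sampled networks.
agreeing = [[0.97, 0.02, 0.01], [0.95, 0.03, 0.02], [0.96, 0.03, 0.01]]
disagreeing = [[0.90, 0.05, 0.05], [0.10, 0.85, 0.05], [0.30, 0.30, 0.40]]

h_low = predictive_entropy(mean_probs(agreeing))
h_high = predictive_entropy(mean_probs(disagreeing))
```

When the sampled networks disagree, the averaged distribution flattens and its entropy rises, which is the kind of signal that can flag voxels where an error is more likely.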

0.9 · CV · May 29, 2018

Capturing Variabilities from Computed Tomography Images with Generative Adversarial Networks

Umair Javaid, John A. Lee

With the advent of Deep Learning (DL) techniques, especially Generative Adversarial Networks (GANs), data augmentation and generation are quickly evolving domains that have raised much interest recently. However, DL techniques are data-demanding and, since medical data is not easily accessible, they suffer from data insufficiency. To deal with this limitation, different data augmentation techniques are used. Here, we propose a novel unsupervised data-driven approach for data augmentation that can generate 2D Computed Tomography (CT) images using a simple GAN. The generated CT images have good global and local features of a real CT image and can be used to augment the training datasets for effective learning. In this proof-of-concept study, we show that our proposed solution using GANs is able to capture some of the global and local CT variabilities. Our network is able to generate visually realistic CT images, and we aim to further enhance its output by scaling it to a higher resolution and potentially from 2D to 3D.

10.1 · LG · May 28, 2018

Distributed Weight Consolidation: A Brain Segmentation Case Study

Patrick McClure, Charles Y. Zheng, Jakub R. Kaczmarzyk et al.

Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distributed data and combining the resulting networks is often viewed as continual learning, but these methods require networks to be trained sequentially. In this paper, we introduce distributed weight consolidation (DWC), a continual learning method to consolidate the weights of separate neural networks, each trained on an independent dataset. We evaluated DWC with a brain segmentation case study, where we consolidated dilated convolutional neural networks trained on independent structural magnetic resonance imaging (sMRI) datasets from different sites. We found that DWC led to increased performance on test sets from the different sites, while maintaining generalization performance for a very large and completely independent multi-site dataset, compared to an ensemble baseline.

16.0 · ML · Oct 25, 2017

Inversion using a new low-dimensional representation of complex binary geological media based on a deep neural network

Eric Laloy, Romain Hérault, John Lee et al.

Efficient and high-fidelity prior sampling and inversion for complex geological media is still a largely unsolved challenge. Here, we use a deep neural network of the variational autoencoder type to construct a parametric low-dimensional base model parameterization of complex binary geological media. For inversion purposes, it has the attractive feature that random draws from an uncorrelated standard normal distribution yield model realizations with spatial characteristics that are in agreement with the training set. In comparison with the most commonly used parametric representations in probabilistic inversion, we find that our dimensionality reduction (DR) approach outperforms principal component analysis (PCA), optimization-PCA (OPCA) and discrete cosine transform (DCT) DR techniques for unconditional geostatistical simulation of a channelized prior model. For the considered examples, substantial compression ratios (200-500) are achieved. Given that the construction of our parameterization requires a training set of several tens of thousands of prior model realizations, our DR approach is more suited for probabilistic (or deterministic) inversion than for unconditional (or point-conditioned) geostatistical simulation. Probabilistic inversions of 2D steady-state and 3D transient hydraulic tomography data are used to demonstrate the DR-based inversion. For the 2D case study, the performance is superior to that of current state-of-the-art multiple-point statistics inversion by sequential geostatistical resampling (SGR). Inversion results for the 3D application are also encouraging.

1.1 · CV · Aug 5, 2016

Blind Deconvolution of PET Images using Anatomical Priors

Stéphanie Guérit, Adriana González, Anne Bol et al.

Images from positron emission tomography (PET) provide metabolic information about the human body. They present, however, a spatial resolution that is limited by physical and instrumental factors often modeled by a blurring function. Since this function is typically unknown, blind deconvolution (BD) techniques are needed in order to produce a useful restored PET image. In this work, we propose a general BD technique that restores a low resolution blurry image using information from data acquired with a high resolution modality (e.g., CT-based delineation of regions with uniform activity in PET images). The proposed BD method is validated on synthetic and actual phantoms.

4.5 · CV · Jun 16, 2015

Post-Reconstruction Deconvolution of PET Images by Total Generalized Variation Regularization

Stéphanie Guérit, Laurent Jacques, Benoît Macq et al.

Improving the quality of positron emission tomography (PET) images, which are affected by low resolution and a high level of noise, is a challenging task in nuclear medicine and radiotherapy. This work proposes a restoration method, applied after tomographic reconstruction of the images and targeting clinical situations where raw data are often not accessible. Based on inverse problem methods, our contribution introduces the recently developed total generalized variation (TGV) norm to regularize PET image deconvolution. Moreover, we stabilize this procedure with additional image constraints such as positivity and photometry invariance. A criterion for automatically updating and adjusting the regularization parameter in the case of Poisson noise is also presented. Experiments are conducted on both synthetic data and real patient images.