CV Oct 24, 2023
Anatomically-aware Uncertainty for Semi-supervised Image Segmentation
Sukesh Adiga, Jose Dolz, Herve Lombaert
Semi-supervised learning relaxes the need of large pixel-wise labeled datasets for image segmentation by leveraging unlabeled data. A prominent way to exploit unlabeled data is to regularize model predictions. Since the predictions of unlabeled data can be unreliable, uncertainty-aware schemes are typically employed to gradually learn from meaningful and reliable predictions. Uncertainty estimation methods, however, rely on multiple inferences from the model predictions that must be computed for each training step, which is computationally expensive. Moreover, these uncertainty maps capture pixel-wise disparities and do not consider global information. This work proposes a novel method to estimate segmentation uncertainty by leveraging global information from the segmentation masks. More precisely, an anatomically-aware representation is first learnt to model the available segmentation masks. The learnt representation thereupon maps the prediction of a new segmentation into an anatomically-plausible segmentation. The deviation from the plausible segmentation aids in estimating the underlying pixel-level uncertainty in order to further guide the segmentation network. The proposed method consequently estimates the uncertainty using a single inference from our representation, thereby reducing the total computation. We evaluate our method on two publicly available segmentation datasets of left atria in cardiac MRIs and of multiple organs in abdominal CTs. Our anatomically-aware method improves the segmentation accuracy over the state-of-the-art semi-supervised methods in terms of two commonly used evaluation metrics.
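As a rough illustration of the idea above, the sketch below derives a pixel-wise uncertainty map from the deviation between a predicted segmentation and its anatomically plausible counterpart, then uses it to down-weight a consistency term. It assumes the plausible segmentation has already been produced by some learned representation (for example an autoencoder over masks). The function names, the squared-deviation measure, and the exponential weighting with `gamma` are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def deviation_uncertainty(pred_probs, plausible_probs):
    """Per-pixel uncertainty as the squared deviation, summed over classes.

    Both inputs have shape (C, H, W). `plausible_probs` is assumed to come
    from a single pass through a learned mask representation.
    """
    diff = pred_probs - plausible_probs
    return np.sum(diff ** 2, axis=0)

def uncertainty_weighted_consistency(student_probs, teacher_probs,
                                     uncertainty, gamma=1.0):
    """Consistency loss in which uncertain pixels contribute less.

    The weighting exp(-gamma * uncertainty) is an assumed choice.
    """
    weights = np.exp(-gamma * uncertainty)
    per_pixel = np.sum((student_probs - teacher_probs) ** 2, axis=0)
    return float(np.sum(weights * per_pixel) / (np.sum(weights) + 1e-8))
```

A prediction that already matches its plausible reconstruction yields zero uncertainty everywhere, so every pixel keeps full weight in the consistency term.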
CV Mar 10, 2022
Leveraging Labeling Representations in Uncertainty-based Semi-supervised Segmentation
Sukesh Adiga, Jose Dolz, Herve Lombaert
Semi-supervised segmentation tackles the scarcity of annotations by leveraging unlabeled data with a small amount of labeled data. A prominent way to utilize the unlabeled data is consistency training, which commonly uses a teacher-student network, where a teacher guides a student segmentation. Because the predictions on unlabeled data are not reliable, uncertainty-aware methods have been proposed to gradually learn from meaningful and reliable predictions. Uncertainty estimation, however, relies on multiple inferences from model predictions that need to be computed for each training step, which is computationally expensive. This work proposes a novel method to estimate the pixel-level uncertainty by leveraging the labeling representation of segmentation masks. First, a labeling representation is learnt to represent the available segmentation masks. The learnt labeling representation is then used to map the prediction of the segmentation into a set of plausible masks. Such a reconstructed segmentation mask aids in estimating the pixel-level uncertainty guiding the segmentation network. The proposed method estimates the uncertainty with a single inference from the labeling representation, thereby reducing the total computation. We evaluate our method on the 3D segmentation of the left atrium in MRI, and we show that the uncertainty estimates from our labeling representation improve the segmentation accuracy over state-of-the-art methods.
CV Jun 18, 2022
Attention-based Dynamic Subspace Learners for Medical Image Analysis
Sukesh Adiga, Jose Dolz, Herve Lombaert
Learning similarity is a key aspect in medical image analysis, particularly in recommendation systems or in uncovering the interpretation of anatomical data in images. Most existing methods learn such similarities in the embedding space over image sets using a single metric learner. Images, however, have a variety of object attributes such as color, shape, or artifacts. Encoding such attributes using a single metric learner is inadequate and may fail to generalize. Instead, multiple learners could focus on separate aspects of these attributes in subspaces of an overarching embedding. This, however, implies that the number of learners must be found empirically for each new dataset. This work, Dynamic Subspace Learners, proposes to dynamically exploit multiple learners by removing the need to know a priori the number of learners and by aggregating new subspace learners during training. Furthermore, the visual interpretability of such subspace learning is enforced by integrating an attention module into our method. This integrated attention mechanism provides a visual insight into the discriminative image features that contribute to the clustering of image sets and a visual explanation of the embedding features. The benefits of our attention-based dynamic subspace learners are evaluated in the application of image clustering, image retrieval, and weakly supervised segmentation. Our method achieves results competitive with multiple-learner baselines and significantly outperforms the classification network in terms of clustering and retrieval scores on three different public benchmark datasets. Moreover, our attention maps offer proxy labels, which improve the segmentation accuracy by up to 15% in Dice scores when compared to state-of-the-art interpretation techniques.
CV Apr 9, 2018
HyperDense-Net: A hyper-densely connected CNN for multi-modal image segmentation
Jose Dolz, Karthik Gopinath, Jing Yuan et al.
Recently, dense connections have attracted substantial attention in computer vision because they facilitate gradient flow and implicit deep supervision during training. Particularly, DenseNet, which connects each layer to every other layer in a feed-forward fashion, has shown impressive performances in natural image classification tasks. We propose HyperDenseNet, a 3D fully convolutional neural network that extends the definition of dense connectivity to multi-modal segmentation problems. Each imaging modality has a path, and dense connections occur not only between the pairs of layers within the same path, but also between those across different paths. This contrasts with the existing multi-modal CNN approaches, in which modeling several modalities relies entirely on a single joint layer (or level of abstraction) for fusion, typically either at the input or at the output of the network. Therefore, the proposed network has total freedom to learn more complex combinations between the modalities, within and in-between all the levels of abstraction, which increases significantly the learning representation. We report extensive evaluations over two different and highly competitive multi-modal brain tissue segmentation challenges, iSEG 2017 and MRBrainS 2013, with the former focusing on 6-month infant data and the latter on adult images. HyperDenseNet yielded significant improvements over many state-of-the-art segmentation networks, ranking at the top on both benchmarks. We further provide a comprehensive experimental analysis of features re-use, which confirms the importance of hyper-dense connections in multi-modal representation learning. Our code is publicly available at https://www.github.com/josedolz/HyperDenseNet.
CV May 3
Exploring Entropy-based Active Learning for Fair Brain Segmentation
Ghazal Danaee, Mélanie Gaillochet, Christian Desrosiers et al.
Active learning (AL) has emerged as a crucial strategy for reducing the prohibitive costs associated with medical image segmentation. However, standard uncertainty-based AL methods typically focus on maximizing performance metrics, ignoring performance disparities or fairness across groups with sensitive attributes. While fair active learning has been explored in classification tasks, its intersection with medical image segmentation remains unaddressed. In this work, we introduced a fairness-aware active learning framework with a Weighted Entropy selection strategy that modulates uncertainty based on current group-specific performance estimates on the labeled set. To decouple true epistemic uncertainty from anatomical volume variances, we further utilized a masked, scaled entropy restricted to the region of interest. The framework was evaluated on synthetic T1-weighted brain MRIs with controlled left caudate bias in both strong and weak bias settings. A 3D U-Net was trained to segment the left caudate under several AL strategies, starting from both demographically balanced and strongly imbalanced initial labeled sets. Experiments demonstrated that our method markedly reduces performance disparities between groups compared to random sampling and standard uncertainty sampling. By prioritizing poorly segmented subgroups during the AL cycles, our method consistently achieved the highest equity-scaled performance and reduced the disparity metric by 75% (strong bias) and 86% (weak bias) relative to standard entropy at the final budget. Overall, this work is among the first studies on fair AL for medical image segmentation, offering an efficient strategy to train more equitable models in resource-constrained environments.
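The selection strategy described above can be sketched roughly as follows: compute a binary entropy restricted to a region of interest, then scale each candidate's entropy by a weight that grows as its group's current performance drops. This is a minimal sketch under stated assumptions. The weight `1 - Dice`, the averaging over the ROI, and all function names are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def masked_mean_entropy(fg_prob, roi_mask, eps=1e-8):
    """Mean binary entropy of foreground probabilities inside a ROI mask.

    Averaging over the ROI (rather than summing over the volume) is an
    assumed way to reduce the influence of structure size.
    """
    p = np.clip(fg_prob, eps, 1.0 - eps)
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    n = roi_mask.sum()
    return float((entropy * roi_mask).sum() / n) if n > 0 else 0.0

def weighted_entropy_scores(entropies, groups, group_dice):
    """Scale each sample's entropy by (1 - Dice) of its group (assumed weight)."""
    weights = np.array([1.0 - group_dice[g] for g in groups])
    return weights * np.asarray(entropies, dtype=float)

def select_batch(scores, k):
    """Indices of the k highest-scoring unlabeled samples."""
    return np.argsort(-scores)[:k]
```

With equal entropies, a sample from the group with the lower current Dice receives the higher score and is queried first.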
CV Mar 9, 2025
Spectral State Space Model for Rotation-Invariant Visual Representation Learning
Sahar Dastani, Ali Bahri, Moslem Yazdanpanah et al.
State Space Models (SSMs) have recently emerged as an alternative to Vision Transformers (ViTs) due to their ability to model global relationships with linear complexity. SSMs are specifically designed to capture spatially proximate relationships of image patches. However, they fail to identify relationships between conceptually related yet not adjacent patches. This limitation arises from the non-causal nature of image data, which lacks inherent directional relationships. Additionally, current vision-based SSMs are highly sensitive to transformations such as rotation. Their predefined scanning directions depend on the original image orientation, which can cause the model to produce inconsistent patch-processing sequences after rotation. To address these limitations, we introduce Spectral VMamba, a novel approach that effectively captures the global structure within an image by leveraging spectral information derived from the graph Laplacian of image patches. Through spectral decomposition, our approach encodes patch relationships independently of image orientation, achieving rotation invariance with the aid of our Rotational Feature Normalizer (RFN) module. Our experiments on classification tasks show that Spectral VMamba outperforms the leading SSM models in vision, such as VMamba, while maintaining invariance to rotations and providing similar runtime efficiency.
CV Sep 26, 2025
TRUST: Test-Time Refinement using Uncertainty-Guided SSM Traverses
Sahar Dastani, Ali Bahri, Gustavo Adolfo Vargas Hakim et al.
State Space Models (SSMs) have emerged as efficient alternatives to Vision Transformers (ViTs), with VMamba standing out as a pioneering architecture designed for vision tasks. However, their generalization performance degrades significantly under distribution shifts. To address this limitation, we propose TRUST (Test-Time Refinement using Uncertainty-Guided SSM Traverses), a novel test-time adaptation (TTA) method that leverages diverse traversal permutations to generate multiple causal perspectives of the input image. Model predictions serve as pseudo-labels to guide updates of the Mamba-specific parameters, and the adapted weights are averaged to integrate the learned information across traversal scans. Altogether, TRUST is the first approach that explicitly leverages the unique architectural properties of SSMs for adaptation. Experiments on seven benchmarks show that TRUST consistently improves robustness and outperforms existing TTA methods.
CV Jun 11, 2024
Sparse Bayesian Networks: Efficient Uncertainty Quantification in Medical Image Analysis
Zeinab Abboud, Herve Lombaert, Samuel Kadoury
Efficiently quantifying predictive uncertainty in medical images remains a challenge. While Bayesian neural networks (BNN) offer predictive uncertainty, they require substantial computational resources to train. Although Bayesian approximations such as ensembles have shown promise, they still suffer from high training and inference costs. Existing approaches mainly address the costs of BNN inference post-training, with little focus on improving training efficiency and reducing parameter complexity. This study introduces a training procedure for a sparse (partial) Bayesian network. Our method selectively assigns a subset of parameters as Bayesian by assessing their deterministic saliency through gradient sensitivity analysis. The resulting network combines deterministic and Bayesian parameters, exploiting the advantages of both representations to achieve high task-specific performance and minimize predictive uncertainty. Demonstrated on multi-label ChestMNIST for classification and on ISIC and LIDC-IDRI for segmentation, our approach achieves competitive performance and predictive uncertainty estimation while reducing Bayesian parameters by over 95%, significantly reducing computational expenses compared to fully Bayesian and ensemble methods.
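One simple way to picture the parameter selection step is a mask that marks a small fraction of parameters as Bayesian according to a gradient-based saliency score. The sketch below is only an assumption-laden illustration: it uses the gradient magnitude as the saliency score and marks the most salient 5% of parameters, whereas the abstract does not specify the exact score or whether high- or low-saliency parameters are selected. The function name `select_bayesian_mask` is hypothetical.

```python
import numpy as np

def select_bayesian_mask(gradients, fraction=0.05):
    """Boolean mask marking the `fraction` of parameters with largest |gradient|.

    Assumption: saliency is the absolute gradient, and the most salient
    parameters are the ones treated as Bayesian.
    """
    saliency = np.abs(gradients).ravel()
    k = max(1, int(round(fraction * saliency.size)))
    top = np.argpartition(saliency, -k)[-k:]  # indices of the k largest values
    mask = np.zeros(saliency.size, dtype=bool)
    mask[top] = True
    return mask.reshape(gradients.shape)
```

Parameters where the mask is `False` would stay deterministic, which keeps at least 95% of the weights as point estimates when `fraction=0.05`.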
CV Nov 26, 2021
Medial Spectral Coordinates for 3D Shape Analysis
Morteza Rezanejad, Mohammad Khodadad, Hamidreza Mahyar et al.
In recent years there has been a resurgence of interest in our community in the shape analysis of 3D objects represented by surface meshes, their voxelized interiors, or surface point clouds. In part, this interest has been stimulated by the increased availability of RGBD cameras, and by applications of computer vision to autonomous driving, medical imaging, and robotics. In these settings, spectral coordinates have shown promise for shape representation due to their ability to incorporate both local and global shape properties in a manner that is qualitatively invariant to isometric transformations. Yet, surprisingly, such coordinates have thus far typically considered only local surface positional or derivative information. In the present article, we propose to equip spectral coordinates with medial (object width) information, so as to enrich them. The key idea is to couple surface points that share a medial ball, via the weights of the adjacency matrix. We develop a spectral feature using this idea, and the algorithms to compute it. The incorporation of object width and medial coupling has direct benefits, as illustrated by our experiments on object classification, object part segmentation, and surface point correspondence.
LG Sep 29, 2020
Realistic Image Normalization for Multi-Domain Segmentation
Pierre-Luc Delisle, Benoit Anctil-Robitaille, Christian Desrosiers et al.
Image normalization is a building block in medical image analysis. Conventional approaches are customarily utilized on a per-dataset basis. This strategy, however, prevents the current normalization algorithms from fully exploiting the complex joint information available across multiple datasets. Consequently, ignoring such joint information has a direct impact on the performance of segmentation algorithms. This paper proposes to revisit the conventional image normalization approach by instead learning a common normalizing function across multiple datasets. Jointly normalizing multiple datasets is shown to yield consistent normalized images as well as an improved image segmentation. To do so, a fully automated adversarial and task-driven normalization approach is employed as it facilitates the training of realistic and interpretable images while keeping performance on par with the state-of-the-art. The adversarial training of our network aims at finding the optimal transfer function to improve both the segmentation accuracy and the generation of realistic images. We evaluated the performance of our normalizer on both infant and adult brain images from the iSEG, MRBrainS and ABIDE datasets. Results reveal the potential of our normalization approach for segmentation, with Dice improvements of up to 57.5% over our baseline. Our method can also enhance data availability by increasing the number of samples available when learning from multiple imaging domains.
CV Jun 29, 2020
Medical Imaging with Deep Learning: MIDL 2020 -- Short Paper Track
Tal Arbel, Ismail Ben Ayed, Marleen de Bruijne et al.
This compendium gathers all the accepted extended abstracts from the Third International Conference on Medical Imaging with Deep Learning (MIDL 2020), held in Montreal, Canada, 6-9 July 2020. Note that only accepted extended abstracts are listed here; the Proceedings of the MIDL 2020 Full Paper Track are published in the Proceedings of Machine Learning Research (PMLR).
CV May 7, 2020
Source-Relaxed Domain Adaptation for Image Segmentation
Mathilde Bateson, Hoel Kervadec, Jose Dolz et al.
Domain adaptation (DA) has drawn considerable interest for its capacity to adapt a model trained on labeled source data to perform well on unlabeled or weakly labeled target data from a different domain. Most common DA techniques require concurrent access to the input images of both the source and target domains. However, in practice, it is common that the source images are not available in the adaptation phase. This is a very frequent DA scenario in medical imaging, for instance, when the source and target images come from different clinical sites. We propose a novel formulation for adapting segmentation networks, which relaxes such a constraint. Our formulation is based on minimizing a label-free entropy loss defined over target-domain data, which we further guide with a domain-invariant prior on the segmentation regions. Many priors can be used, derived from anatomical information. Here, a class-ratio prior is learned via an auxiliary network and integrated in the form of a Kullback-Leibler (KL) divergence in our overall loss function. We show the effectiveness of our prior-aware entropy minimization in adapting spine segmentation across different MRI modalities. Our method yields comparable results to several state-of-the-art adaptation techniques, even though it has access to less information, the source images being absent in the adaptation phase. Our straightforward adaptation strategy only uses one network, contrary to popular adversarial techniques, which cannot perform without the presence of the source images. Our framework can be readily used with various priors and segmentation problems.
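The loss described above combines a label-free entropy term on target predictions with a KL divergence between a class-ratio prior and the class ratio implied by the prediction. The following is a minimal sketch of that combination, assuming softmax outputs of shape (C, H, W). The KL direction, the balancing weight `lam`, and the function names are assumptions made for illustration, and the prior is passed in directly rather than estimated by an auxiliary network.

```python
import numpy as np

def entropy_loss(probs, eps=1e-8):
    """Mean per-pixel Shannon entropy of softmax outputs with shape (C, H, W)."""
    return float(-np.mean(np.sum(probs * np.log(probs + eps), axis=0)))

def class_ratio_kl(probs, prior_ratio, eps=1e-8):
    """KL(prior || predicted class ratio); the direction is an assumed choice."""
    pred_ratio = probs.mean(axis=(1, 2))  # fraction of the image per class
    prior_ratio = np.asarray(prior_ratio, dtype=float)
    return float(np.sum(prior_ratio * np.log((prior_ratio + eps) / (pred_ratio + eps))))

def prior_aware_entropy(probs, prior_ratio, lam=1.0):
    """Entropy term plus lam times the class-ratio KL term."""
    return entropy_loss(probs) + lam * class_ratio_kl(probs, prior_ratio)
```

Entropy minimization alone can collapse to a single confident class; the class-ratio term penalizes predictions whose region proportions drift away from the prior.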
CV Apr 7, 2020
Manifold-driven Attention Maps for Weakly Supervised Segmentation
Sukesh Adiga, Jose Dolz, Herve Lombaert
Segmentation using deep learning has shown promising directions in medical imaging as it aids in the analysis and diagnosis of diseases. Nevertheless, a main drawback of deep models is that they require a large amount of pixel-level labels, which are laborious and expensive to obtain. To mitigate this problem, weakly supervised learning has emerged as an efficient alternative, which employs image-level labels, scribbles, points, or bounding boxes as supervision. Among these, image-level labels are easier to obtain. However, since this type of annotation only contains object category information, the segmentation task under this learning paradigm is a challenging problem. To address this issue, visual salient regions derived from trained classification networks are typically used. Despite their success to identify important regions on classification tasks, these saliency regions only focus on the most discriminant areas of an image, limiting their use in semantic segmentation. In this work, we propose a manifold driven attention-based network to enhance visual salient regions, thereby improving segmentation accuracy in a weakly supervised setting. Our method generates superior attention maps directly during inference without the need of extra computations. We evaluate the benefits of our approach in the task of segmentation using a public benchmark on skin lesion images. Results demonstrate that our method outperforms the state-of-the-art GradCAM by a margin of ~22% in terms of Dice score.
IV Apr 1, 2020
Manifold-Aware CycleGAN for High-Resolution Structural-to-DTI Synthesis
Benoit Anctil-Robitaille, Christian Desrosiers, Herve Lombaert
Unpaired image-to-image translation has been applied successfully to natural images but has received very little attention for manifold-valued data such as in diffusion tensor imaging (DTI). The non-Euclidean nature of DTI prevents current generative adversarial networks (GANs) from generating plausible images and has mainly limited their application to diffusion MRI scalar maps, such as fractional anisotropy (FA) or mean diffusivity (MD). Even if these scalar maps are clinically useful, they mostly ignore fiber orientations and therefore have limited applications for analyzing brain fibers. Here, we propose a manifold-aware CycleGAN that learns the generation of high-resolution DTI from unpaired T1w images. We formulate the objective as a Wasserstein distance minimization problem of data distributions on a Riemannian manifold of symmetric positive definite 3x3 matrices SPD(3), using adversarial and cycle-consistency losses. To ensure that the generated diffusion tensors lie on the SPD(3) manifold, we exploit the theoretical properties of the exponential and logarithm maps of the Log-Euclidean metric. We demonstrate that, unlike standard GANs, our method is able to generate realistic high-resolution DTI that can be used to compute diffusion-based metrics and potentially run fiber tractography algorithms. To evaluate our model's performance, we compute the cosine similarity between the generated tensors' principal orientations and their ground-truth orientations, the mean squared error (MSE) of their derived FA values, and the Log-Euclidean distance between the tensors. We demonstrate that our method produces 2.5 times better FA MSE than a standard CycleGAN and up to 30% better cosine similarity than a manifold-aware Wasserstein GAN while synthesizing sharp high-resolution DTI.
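The Log-Euclidean machinery mentioned above can be illustrated on single 3x3 tensors. The sketch below, a minimal example and not the paper's network code, computes the matrix logarithm of an SPD matrix and the matrix exponential of a symmetric matrix through eigendecomposition, and from them the Log-Euclidean distance and the standard fractional anisotropy. The function names are chosen here for illustration.

```python
import numpy as np

def spd_log(S):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T  # V diag(log w) V^T

def sym_exp(A):
    """Matrix exponential of a symmetric matrix; the result is always SPD."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T  # V diag(exp w) V^T

def log_euclidean_distance(S1, S2):
    """Frobenius norm of the difference of the matrix logarithms."""
    return float(np.linalg.norm(spd_log(S1) - spd_log(S2), ord="fro"))

def fractional_anisotropy(S):
    """Standard FA computed from the eigenvalues of a diffusion tensor."""
    w = np.linalg.eigvalsh(S)
    md = w.mean()
    return float(np.sqrt(1.5 * np.sum((w - md) ** 2) / np.sum(w ** 2)))
```

Because `sym_exp` maps any symmetric matrix to an SPD matrix, a generator that outputs symmetric matrices in the log domain and applies the exponential map yields valid tensors by construction.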
IV Mar 31, 2020
Graph Domain Adaptation for Alignment-Invariant Brain Surface Segmentation
Karthik Gopinath, Christian Desrosiers, Herve Lombaert
The varying cortical geometry of the brain creates numerous challenges for its analysis. Recent developments have enabled learning surface data directly across multiple brain surfaces via graph convolutions on cortical data. However, current graph learning algorithms fail when brain surface data are misaligned across subjects, thereby affecting their ability to deal with data from multiple domains. Adversarial training is widely used for domain adaptation to improve the segmentation performance across domains. In this paper, adversarial training is exploited to learn surface data across inconsistent graph alignments. This novel approach comprises a segmentator that uses a set of graph convolution layers to enable parcellation directly across brain surfaces in a source domain, and a discriminator that predicts a graph domain from segmentations. More precisely, the proposed adversarial network learns to generalize a parcellation across both source and target domains. We demonstrate an 8% mean improvement in performance over a non-adversarial training strategy applied to multiple target domains extracted from MindBoggle, the largest publicly available manually-labeled brain surface dataset.
LG Dec 2, 2019
Adversarial normalization for multi domain image segmentation
Pierre-Luc Delisle, Benoit Anctil-Robitaille, Christian Desrosiers et al.
Image normalization is a critical step in medical imaging. This step is often done on a per-dataset basis, preventing current segmentation algorithms from fully exploiting jointly normalized information across multiple datasets. To solve this problem, we propose an adversarial normalization approach for image segmentation which learns common normalizing functions across multiple datasets while retaining image realism. The adversarial training provides an optimal normalizer that improves both the segmentation accuracy and the discrimination of unrealistic normalizing functions. Our contribution therefore leverages common imaging information from multiple domains. The optimality of our common normalizer is evaluated by combining brain images from both infants and adults. Results on the challenging iSEG and MRBrainS datasets reveal the potential of our adversarial normalization approach for segmentation, with Dice improvements of up to 59.6% over the baseline.
CV Nov 22, 2019
Learnable Pooling in Graph Convolution Networks for Brain Surface Analysis
Karthik Gopinath, Christian Desrosiers, Herve Lombaert
Brain surface analysis is essential to neuroscience; however, the complex geometry of the brain cortex hinders computational methods for this task. The difficulty arises from a discrepancy between 3D imaging data, which is represented in Euclidean space, and the non-Euclidean geometry of the highly-convoluted brain surface. Recent advances in machine learning have enabled the use of neural networks for non-Euclidean spaces. These facilitate the learning of surface data, yet pooling strategies often remain constrained to a single fixed graph. This paper proposes a new learnable graph pooling method for processing multiple surface-valued data to output subject-based information. The proposed method innovates by learning an intrinsic aggregation of graph nodes based on graph spectral embedding. We illustrate the advantages of our approach with in-depth experiments on two large-scale benchmark datasets. The flexibility of the pooling strategy is evaluated on four different prediction tasks, namely, subject-sex classification, regression of cortical region sizes, classification of Alzheimer's disease stages, and brain age regression. Our experiments demonstrate the superiority of our learnable pooling approach compared to other pooling techniques for graph convolution networks, with results improving the state-of-the-art in brain surface analysis.
IV Nov 22, 2019
Spectral Graph Transformer Networks for Brain Surface Parcellation
Ran He, Karthik Gopinath, Christian Desrosiers et al.
The analysis of the brain surface modeled as a graph mesh is a challenging task. Conventional deep learning approaches often rely on data lying in the Euclidean space. As an extension to irregular graphs, convolution operations are defined in the Fourier or spectral domain. This spectral domain is obtained by decomposing the graph Laplacian, which captures relevant shape information. However, the spectral decomposition across different brain graphs causes inconsistencies between the eigenvectors of individual spectral domains, causing the graph learning algorithm to fail. Current spectral graph convolution methods handle this variance by separately aligning the eigenvectors to a reference brain in a slow iterative step. This paper presents a novel approach for learning the transformation matrix required for aligning brain meshes using a direct data-driven approach. Our alignment and graph processing method provides a fast analysis of brain surfaces. The novel Spectral Graph Transformer (SGT) network proposed in this paper uses very few randomly sub-sampled nodes in the spectral domain to learn the alignment matrix for multiple brain surfaces. We validate the use of this SGT network along with a graph convolution network to perform cortical parcellation. Our method on 101 manually-labeled brain surfaces shows improved parcellation performance over a no-alignment strategy, gaining a significant speed (1400 fold) over traditional iterative alignment approaches.
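To make the spectral domain concrete, the sketch below computes a spectral embedding of a small graph by decomposing its unnormalized Laplacian L = D - A and keeping the first non-trivial eigenvectors. It is a generic illustration on a dense adjacency matrix, not the SGT network; the function name and the choice of the unnormalized Laplacian are assumptions.

```python
import numpy as np

def spectral_embedding(adjacency, k):
    """First k non-trivial eigenpairs of the unnormalized graph Laplacian.

    Returns (eigenvalues, eigenvectors); the constant eigenvector with
    eigenvalue 0 is skipped. Assumes a connected, undirected graph.
    """
    degree = adjacency.sum(axis=1)
    laplacian = np.diag(degree) - adjacency
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # ascending eigenvalues
    return eigvals[1:k + 1], eigvecs[:, 1:k + 1]
```

Each eigenvector is only defined up to sign (and up to rotation within repeated eigenvalues), so embeddings computed independently on two meshes generally disagree; this is the inconsistency that an alignment step has to resolve.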
CV Mar 27, 2018
Graph Convolutions on Spectral Embeddings: Learning of Cortical Surface Data
Karthik Gopinath, Christian Desrosiers, Herve Lombaert
Neuronal cell bodies mostly reside in the cerebral cortex. The study of this thin and highly convoluted surface is essential for understanding how the brain works. The analysis of surface data is, however, challenging due to the high variability of the cortical geometry. This paper presents a novel approach for learning and exploiting surface data directly across surface domains. Current approaches rely on geometrical simplifications, such as spherical inflations, a popular but costly process. For instance, the widely used FreeSurfer takes about 3 hours to parcellate brain surfaces on a standard machine. Direct learning of surface data via graph convolutions would provide a new family of fast algorithms for processing brain surfaces. However, the current limitation of existing state-of-the-art approaches is their inability to compare surface data across different surface domains. Surface bases are indeed incompatible between brain geometries. This paper leverages recent advances in spectral graph matching to transfer surface data across aligned spectral domains. This novel approach enables a direct learning of surface data across compatible surface bases. It exploits spectral filters over intrinsic representations of surface neighborhoods. We illustrate the benefits of this approach with an application to brain parcellation. We validate the algorithm over 101 manually labeled brain surfaces. The results show a significant improvement in labeling accuracy over recent Euclidean approaches, while gaining a drastic speed improvement over conventional methods.