CVSep 14, 2022Code
Landmark-free Statistical Shape Modeling via Neural Flow DeformationsDavid Lüdke, Tamaz Amiranashvili, Felix Ambellan et al.
Statistical shape modeling aims at capturing shape variations of an anatomical structure that occur within a given population. Shape models are employed in many tasks, such as shape reconstruction and image segmentation, but also shape generation and classification. Existing shape priors either require dense correspondence between training examples or lack robustness and topological guarantees. We present FlowSSM, a novel shape modeling approach that learns shape variability without requiring dense correspondence between training instances. It relies on a hierarchy of continuous deformation flows, which are parametrized by a neural network. Our model outperforms state-of-the-art methods in providing an expressive and robust shape prior for distal femur and liver. We show that the emerging latent representation is discriminative by separating healthy from pathological shapes. Ultimately, we demonstrate its effectiveness on two shape reconstruction tasks from partial data. Our source code is publicly available (https://github.com/davecasp/flowssm).
46.3CVApr 10
Globally Optimal Pose from Orthographic SilhouettesAgniva Sengupta, Dilara KuÅ, Jianning Li et al.
We solve the problem of determining the pose of known shapes in $\mathbb{R}^3$ from their unoccluded silhouettes. The pose is determined up to global optimality using a simple yet under-explored property of the area-of-silhouette: its continuity w.r.t trajectories in the rotation space. The proposed method utilises pre-computed silhouette-signatures, modelled as a response surface of the area-of-silhouettes. Querying this silhouette-signature response surface for pose estimation leads to a strong branching of the rotation search space, making resolution-guided candidate search feasible. Additionally, we utilise the aspect ratio of 2D ellipses fitted to projected silhouettes as an auxiliary global shape signature to accelerate the pose search. This combined strategy forms the first method to efficiently estimate globally optimal pose from just the silhouettes, without being guided by correspondences, for any shape, irrespective of its convexity and genus. We validate our method on synthetic and real examples, demonstrating significantly improved accuracy against comparable approaches. Code, data, and supplementary in: https://agnivsen.github.io/pose-from-silhouette/
IVJul 29, 2025Code
CADS: A Comprehensive Anatomical Dataset and Segmentation for Whole-Body Anatomy in Computed TomographyMurong Xu, Tamaz Amiranashvili, Fernando Navarro et al.
Accurate delineation of anatomical structures in volumetric CT scans is crucial for diagnosis and treatment planning. While AI has advanced automated segmentation, current approaches typically target individual structures, creating a fragmented landscape of incompatible models with varying performance and disparate evaluation protocols. Foundational segmentation models address these limitations by providing a holistic anatomical view through a single model. Yet, robust clinical deployment demands comprehensive training data, which is lacking in existing whole-body approaches, both in terms of data heterogeneity and, more importantly, anatomical coverage. In this work, rather than pursuing incremental optimizations in model architecture, we present CADS, an open-source framework that prioritizes the systematic integration, standardization, and labeling of heterogeneous data sources for whole-body CT segmentation. At its core is a large-scale dataset of 22,022 CT volumes with complete annotations for 167 anatomical structures, representing a significant advancement in both scale and coverage, with 18 times more scans than existing collections and 60% more distinct anatomical targets. Building on this diverse dataset, we develop the CADS-model using established architectures for accessible and automated full-body CT segmentation. Through comprehensive evaluation across 18 public datasets and an independent real-world hospital cohort, we demonstrate advantages over SoTA approaches. Notably, thorough testing of the model's performance in segmentation tasks from radiation oncology validates its direct utility for clinical interventions. By making our large-scale dataset, our segmentation models, and our clinical software tool publicly available, we aim to advance robust AI solutions in radiology and make comprehensive anatomical analysis accessible to clinicians and researchers alike.
CVJan 24, 2020Code
VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT ImagesAnjany Sekuboyina, Malek E. Husseini, Amirhossein Bayat et al.
Vertebral labelling and segmentation are two fundamental tasks in an automated spine processing pipeline. Reliable and accurate processing of spine images is expected to benefit clinical decision-support systems for diagnosis, surgery planning, and population-based analysis on spine and bone health. However, designing automated algorithms for spine processing is challenging predominantly due to considerable variations in anatomy and acquisition protocols and due to a severe shortage of publicly available data. Addressing these limitations, the Large Scale Vertebrae Segmentation Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020, with a call for algorithms towards labelling and segmentation of vertebrae. Two datasets containing a total of 374 multi-detector CT scans from 355 patients were prepared and 4505 vertebrae have individually been annotated at voxel-level by a human-machine hybrid algorithm (https://osf.io/nqjyw/, https://osf.io/t98fz/). A total of 25 algorithms were benchmarked on these datasets. In this work, we present the the results of this evaluation and further investigate the performance-variation at vertebra-level, scan-level, and at different fields-of-view. We also evaluate the generalisability of the approaches to an implicit domain shift in data by evaluating the top performing algorithms of one challenge iteration on data from the other iteration. The principal takeaway from VerSe: the performance of an algorithm in labelling and segmenting a spine scan hinges on its ability to correctly identify vertebrae in cases of rare anatomical variations. The content and code concerning VerSe can be accessed at: https://github.com/anjany/verse.
CVNov 5, 2025
Generalizing Shape-from-Template to Topological ChangesKevin Manogue, Tomasz M Schang, Dilara Kuş et al.
Reconstructing the surfaces of deformable objects from correspondences between a 3D template and a 2D image is well studied under Shape-from-Template (SfT) methods; however, existing approaches break down when topological changes accompany the deformation. We propose a principled extension of SfT that enables reconstruction in the presence of such changes. Our approach is initialized with a classical SfT solution and iteratively adapts the template by partitioning its spatial domain so as to minimize an energy functional that jointly encodes physical plausibility and reprojection consistency. We demonstrate that the method robustly captures a wide range of practically relevant topological events including tears and cuts on bounded 2D surfaces, thereby establishing the first general framework for topological-change-aware SfT. Experiments on both synthetic and real data confirm that our approach consistently outperforms baseline methods.
CVAug 19, 2025
Shape-from-Template with Generalised CameraAgniva Sengupta, Stefan Zachow
This article presents a new method for non-rigidly registering a 3D shape to 2D keypoints observed by a constellation of multiple cameras. Non-rigid registration of a 3D shape to observed 2D keypoints, i.e., Shape-from-Template (SfT), has been widely studied using single images, but SfT with information from multiple-cameras jointly opens new directions for extending the scope of known use-cases such as 3D shape registration in medical imaging and registration from hand-held cameras, to name a few. We represent such multi-camera setup with the generalised camera model; therefore any collection of perspective or orthographic cameras observing any deforming object can be registered. We propose multiple approaches for such SfT: the first approach where the corresponded keypoints lie on a direction vector from a known 3D point in space, the second approach where the corresponded keypoints lie on a direction vector from an unknown 3D point in space but with known orientation w.r.t some local reference frame, and a third approach where, apart from correspondences, the silhouette of the imaged object is also known. Together, these form the first set of solutions to the SfT problem with generalised cameras. The key idea behind SfT with generalised camera is the improved reconstruction accuracy from estimating deformed shape while utilising the additional information from the mutual constraints between multiple views of a deformed object. The correspondence-based approaches are solved with convex programming while the silhouette-based approach is an iterative refinement of the results from the convex solutions. We demonstrate the accuracy of our proposed methods on many synthetic and real data
CVMar 12, 2021
Geodesic B-Score for Improved Assessment of Knee OsteoarthritisFelix Ambellan, Stefan Zachow, Christoph von Tycowicz
Three-dimensional medical imaging enables detailed understanding of osteoarthritis structural status. However, there remains a vast need for automatic, thus, reader-independent measures that provide reliable assessment of subject-specific clinical outcomes. To this end, we derive a consistent generalization of the recently proposed B-score to Riemannian shape spaces. We further present an algorithmic treatment yielding simple, yet efficient computations allowing for analysis of large shape populations with several thousand samples. Our intrinsic formulation exhibits improved discrimination ability over its Euclidean counterpart, which we demonstrate for predictive validity on assessing risks of total knee replacement. This result highlights the potential of the geodesic B-score to enable improved personalized assessment and stratification for interventions.
CVMar 2, 2021
Simulation-to-Real domain adaptation with teacher-student learning for endoscopic instrument segmentationManish Sahu, Anirban Mukhopadhyay, Stefan Zachow
Purpose: Segmentation of surgical instruments in endoscopic videos is essential for automated surgical scene understanding and process modeling. However, relying on fully supervised deep learning for this task is challenging because manual annotation occupies valuable time of the clinical experts. Methods: We introduce a teacher-student learning approach that learns jointly from annotated simulation data and unlabeled real data to tackle the erroneous learning problem of the current consistency-based unsupervised domain adaptation framework. Results: Empirical results on three datasets highlight the effectiveness of the proposed framework over current approaches for the endoscopic instrument segmentation task. Additionally, we provide analysis of major factors affecting the performance on all datasets to highlight the strengths and failure modes of our approach. Conclusion: We show that our proposed approach can successfully exploit the unlabeled real endoscopic video frames and improve generalization performance over pure simulation-based training and the previous state-of-the-art. This takes us one step closer to effective segmentation of surgical tools in the annotation scarce setting.
CVJul 22, 2020
Endo-Sim2Real: Consistency learning-based domain adaptation for instrument segmentationManish Sahu, Ronja Strömsdörfer, Anirban Mukhopadhyay et al.
Surgical tool segmentation in endoscopic videos is an important component of computer assisted interventions systems. Recent success of image-based solutions using fully-supervised deep learning approaches can be attributed to the collection of big labeled datasets. However, the annotation of a big dataset of real videos can be prohibitively expensive and time consuming. Computer simulations could alleviate the manual labeling problem, however, models trained on simulated data do not generalize to real data. This work proposes a consistency-based framework for joint learning of simulated and real (unlabeled) endoscopic data to bridge this performance generalization issue. Empirical results on two data sets (15 videos of the Cholec80 and EndoVis'15 dataset) highlight the effectiveness of the proposed \emph{Endo-Sim2Real} method for instrument segmentation. We compare the segmentation of the proposed approach with state-of-the-art solutions and show that our method improves segmentation both in terms of quality and quantity.
CVOct 27, 2016
Tool and Phase recognition using contextual CNN featuresManish Sahu, Anirban Mukhopadhyay, Angelika Szengel et al.
A transfer learning method for generating features suitable for surgical tools and phase recognition from the ImageNet classification features [1] is proposed here. In addition, methods are developed for generating contextual features and combining them with time series analysis for final classification using multi-class random forest. The proposed pipeline is tested over the training and testing datasets of M2CAI16 challenges: tool and phase detection. Encouraging results are obtained by leave-one-out cross validation evaluation on the training dataset.
CVFeb 26, 2016
Shape-aware Surface Reconstruction from Sparse 3D Point-CloudsFlorian Bernard, Luis Salamanca, Johan Thunberg et al.
The reconstruction of an object's shape or surface from a set of 3D points plays an important role in medical image analysis, e.g. in anatomy reconstruction from tomographic measurements or in the process of aligning intra-operative navigation and preoperative planning data. In such scenarios, one usually has to deal with sparse data, which significantly aggravates the problem of reconstruction. However, medical applications often provide contextual information about the 3D point data that allow to incorporate prior knowledge about the shape that is to be reconstructed. To this end, we propose the use of a statistical shape model (SSM) as a prior for surface reconstruction. The SSM is represented by a point distribution model (PDM), which is associated with a surface mesh. Using the shape distribution that is modelled by the PDM, we formulate the problem of surface reconstruction from a probabilistic perspective based on a Gaussian Mixture Model (GMM). In order to do so, the given points are interpreted as samples of the GMM. By using mixture components with anisotropic covariances that are "oriented" according to the surface normals at the PDM points, a surface-based fitting is accomplished. Estimating the parameters of the GMM in a maximum a posteriori manner yields the reconstruction of the surface from the given data points. We compare our method to the extensively used Iterative Closest Points method on several different anatomical datasets/SSMs (brain, femur, tibia, hip, liver) and demonstrate superior accuracy and robustness on sparse data.