IV · Apr 12, 2023 · Code
SAMM (Segment Any Medical Model): A 3D Slicer Integration to SAM
Yihao Liu, Jiaming Zhang, Zhangcong She et al.
The Segment Anything Model (SAM) is a new image segmentation tool trained with the largest available segmentation dataset. The model has demonstrated that, with prompts, it can create high-quality masks for general images. However, the performance of the model on medical images requires further validation. To assist with the development, assessment, and application of SAM on medical images, we introduce Segment Any Medical Model (SAMM), an extension of SAM on 3D Slicer - an image processing and visualization software extensively used by the medical imaging community. This open-source extension to 3D Slicer and its demonstrations are posted on GitHub (https://github.com/bingogome/samm). SAMM achieves 0.6-second latency of a complete cycle and can infer image masks in nearly real-time.
LG · Apr 18, 2023
Pelphix: Surgical Phase Recognition from X-ray Images in Percutaneous Pelvic Fixation
Benjamin D. Killeen, Han Zhang, Jan Mangulabnan et al.
Surgical phase recognition (SPR) is a crucial element in the digital transformation of the modern operating theater. While SPR based on video sources is well-established, incorporation of interventional X-ray sequences has not yet been explored. This paper presents Pelphix, a first approach to SPR for X-ray-guided percutaneous pelvic fracture fixation, which models the procedure at four levels of granularity -- corridor, activity, view, and frame value -- simulating the pelvic fracture fixation workflow as a Markov process to provide fully annotated training data. Using added supervision from detection of bony corridors, tools, and anatomy, we learn image representations that are fed into a transformer model to regress surgical phases at the four granularity levels. Our approach demonstrates the feasibility of X-ray-based SPR, achieving an average accuracy of 93.8% on simulated sequences and 67.57% in cadaver across all granularity levels, with up to 88% accuracy for the target corridor in real data. This work constitutes the first step toward SPR for the X-ray domain, establishing an approach to categorizing phases in X-ray-guided surgery, simulating realistic image sequences to enable machine learning model development, and demonstrating that this approach is feasible for the analysis of real procedures. As X-ray-based SPR continues to mature, it will benefit procedures in orthopedic surgery, angiography, and interventional radiology by equipping intelligent surgical systems with situational awareness in the operating room.
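As a minimal sketch of what simulating a workflow as a Markov process can look like, the snippet below samples a labeled sequence from a small transition table. The state names and probabilities are hypothetical placeholders; the abstract does not list the actual Pelphix states or transition probabilities.

```python
import random

# Hypothetical activity-level states and transition probabilities (assumption:
# these are illustrative only and are not taken from the Pelphix paper).
TRANSITIONS = {
    "position_wire": [("position_wire", 0.6), ("insert_wire", 0.4)],
    "insert_wire": [("insert_wire", 0.5), ("insert_screw", 0.4), ("position_wire", 0.1)],
    "insert_screw": [("insert_screw", 0.7), ("done", 0.3)],
}


def simulate_workflow(start="position_wire", max_steps=200, seed=0):
    """Sample one phase sequence; each element doubles as a per-frame label."""
    rng = random.Random(seed)
    state, sequence = start, []
    while state != "done" and len(sequence) < max_steps:
        sequence.append(state)
        next_states, weights = zip(*TRANSITIONS[state])
        # Draw the next state according to the row of the transition table.
        state = rng.choices(next_states, weights=weights, k=1)[0]
    return sequence
```

Because every sampled state is known by construction, each simulated frame comes with its annotation for free, which is the property the abstract relies on for training data.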
IV · Jun 13, 2022
SyntheX: Scaling Up Learning-based X-ray Image Analysis Through In Silico Experiments
Cong Gao, Benjamin D. Killeen, Yicheng Hu et al.
Artificial intelligence (AI) now enables automated interpretation of medical images for clinical use. However, AI's potential use for interventional images (versus those involved in triage or diagnosis), such as for guidance during surgery, remains largely untapped. This is because surgical AI systems are currently trained using post hoc analysis of data collected during live surgeries, which has fundamental and practical limitations, including ethical considerations, expense, scalability, data integrity, and a lack of ground truth. Here, we demonstrate that creating realistic simulated images from human models is a viable alternative and complement to large-scale in situ data collection. We show that training AI image analysis models on realistically synthesized data, combined with contemporary domain generalization or adaptation techniques, results in models that on real data perform comparably to models trained on a precisely matched real data training set. Because synthetic generation of training data from human-based models scales easily, we find that our model transfer paradigm for X-ray image analysis, which we refer to as SyntheX, can even outperform real data-trained models due to the effectiveness of training on a larger dataset. We demonstrate the potential of SyntheX on three clinical tasks: Hip image analysis, surgical robotic tool detection, and COVID-19 lung lesion segmentation. SyntheX provides an opportunity to drastically accelerate the conception, design, and evaluation of intelligent systems for X-ray-based medicine. In addition, simulated image environments provide the opportunity to test novel instrumentation, design complementary surgical approaches, and envision novel techniques that improve outcomes, save time, or mitigate human error, freed from the ethical and practical considerations of live human data collection.
CV · Jul 18, 2023
Skin Lesion Correspondence Localization in Total Body Photography
Wei-Lun Huang, Davood Tashayyod, Jun Kang et al.
Longitudinal tracking of skin lesions - finding correspondence, changes in morphology, and texture - is beneficial to the early detection of melanoma. However, it has not been well investigated in the context of full-body imaging. We propose a novel framework combining geometric and texture information to localize skin lesion correspondence from a source scan to a target scan in total body photography (TBP). Body landmarks or sparse correspondence are first created on the source and target 3D textured meshes. Every vertex on each of the meshes is then mapped to a feature vector characterizing the geodesic distances to the landmarks on that mesh. Then, for each lesion of interest (LOI) on the source, its corresponding location on the target is first coarsely estimated using the geometric information encoded in the feature vectors and then refined using the texture information. We evaluated the framework quantitatively on both a public and a private dataset, for which our success rates (at 10 mm criterion) are comparable to the only reported longitudinal study. As full-body 3D capture becomes more prevalent and has higher quality, we expect the proposed method to constitute a valuable step in the longitudinal tracking of skin lesions.
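The coarse geometric step can be sketched as follows, under stated assumptions: geodesic distance is approximated by shortest paths along mesh edges (Dijkstra), and the coarse match is the target vertex whose landmark-distance vector is closest in Euclidean terms. The function names are hypothetical, and the texture-based refinement is omitted.

```python
import heapq


def geodesic_distances(adjacency, source):
    """Shortest edge-path distances from `source` (an approximation of geodesics)."""
    dist = {v: float("inf") for v in adjacency}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adjacency[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def vertex_features(adjacency, landmarks):
    """Map every vertex to its vector of distances to the landmarks."""
    per_landmark = [geodesic_distances(adjacency, lm) for lm in landmarks]
    return {v: [d[v] for d in per_landmark] for v in adjacency}


def coarse_match(source_feature, target_features):
    """Return the target vertex whose feature vector is nearest to the source's."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(target_features, key=lambda v: sq_dist(source_feature, target_features[v]))


# Toy "mesh": a path of four vertices with unit edge lengths, landmarks at both ends.
path = {0: [(1, 1.0)], 1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0), (3, 1.0)], 3: [(2, 1.0)]}
features = vertex_features(path, landmarks=[0, 3])
```

Here the same toy mesh plays both source and target, so a lesion at vertex 1 maps back to vertex 1; on real scans the two meshes differ and only the landmarks are shared.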
IV · Nov 12, 2025
DualVision ArthroNav: Investigating Opportunities to Enhance Localization and Reconstruction in Image-based Arthroscopy Navigation via External Cameras
Hongchao Shu, Lalithkumar Seenivasan, Mingxu Liu et al.
Arthroscopic procedures can greatly benefit from navigation systems that enhance spatial awareness, depth perception, and field of view. However, existing optical tracking solutions impose strict workspace constraints and disrupt surgical workflow. Vision-based alternatives, though less invasive, often rely solely on the monocular arthroscope camera, making them prone to drift, scale ambiguity, and sensitivity to rapid motion or occlusion. We propose DualVision ArthroNav, a multi-camera arthroscopy navigation system that integrates an external camera rigidly mounted on the arthroscope. The external camera provides stable visual odometry and absolute localization, while the monocular arthroscope video enables dense scene reconstruction. By combining these complementary views, our system resolves the scale ambiguity and long-term drift inherent in monocular SLAM and ensures robust relocalization. Experiments demonstrate that our system effectively compensates for calibration errors, achieving an average absolute trajectory error of 1.09 mm. The reconstructed scenes reach an average target registration error of 2.16 mm, with high visual fidelity (SSIM = 0.69, PSNR = 22.19). These results indicate that our system provides a practical and cost-efficient solution for arthroscopic navigation, bridging the gap between optical tracking and purely vision-based systems, and paving the way toward clinically deployable, fully vision-based arthroscopic guidance.
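For readers unfamiliar with the trajectory metric, here is a minimal sketch of an average absolute trajectory error. It assumes the estimated and reference trajectories are already time-synchronized and expressed in the same coordinate frame; the alignment procedure used in the paper is not described in the abstract.

```python
import math


def mean_absolute_trajectory_error(estimated, reference):
    """Mean Euclidean distance between corresponding camera positions."""
    if not estimated or len(estimated) != len(reference):
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(e, r) for e, r in zip(estimated, reference)) / len(estimated)
```

With positions given in millimetres, the returned value is directly comparable to figures such as the one quoted above.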
CV · Mar 12, 2024 · Code
FluoroSAM: A Language-promptable Foundation Model for Flexible X-ray Image Segmentation
Benjamin D. Killeen, Liam J. Wang, Blanca Inigo et al.
Language promptable X-ray image segmentation would enable greater flexibility for human-in-the-loop workflows in diagnostic and interventional precision medicine. Prior efforts have contributed task-specific models capable of solving problems within a narrow scope, but expanding to broader use requires additional data, annotations, and training time. Recently, language-aligned foundation models (LFMs) -- machine learning models trained on large amounts of highly variable image and text data thus enabling broad applicability -- have emerged as promising tools for automated image analysis. Existing foundation models for medical image analysis focus on scenarios and modalities where large, richly annotated datasets are available. However, the X-ray imaging modality features highly variable image appearance and applications, from diagnostic chest X-rays to interventional fluoroscopy, with varying availability of data. To pave the way toward an LFM for comprehensive and language-aligned analysis of arbitrary medical X-ray images, we introduce FluoroSAM, a language-promptable variant of the Segment Anything Model, trained from scratch on 3M synthetic X-ray images from a wide variety of human anatomies, imaging geometries, and viewing angles. These include pseudo-ground truth masks for 128 organ types and 464 tools with associated text descriptions. FluoroSAM is capable of segmenting myriad anatomical structures and tools based on natural language prompts, thanks to the novel incorporation of vector quantization (VQ) of text embeddings in the training process. We demonstrate FluoroSAM's performance quantitatively on real X-ray images and showcase on several applications how FluoroSAM is a key enabler for rich human-machine interaction in the X-ray image acquisition and analysis context. Code is available at https://github.com/arcadelab/fluorosam.
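Vector quantization, in its generic form, replaces an embedding with its nearest entry in a codebook. The sketch below shows only that lookup; the codebook values are made up, and how FluoroSAM learns and uses its codebook during training is not specified in the abstract.

```python
def quantize(embedding, codebook):
    """Return (index, vector) of the codebook entry nearest to `embedding`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    index = min(range(len(codebook)), key=lambda i: sq_dist(embedding, codebook[i]))
    return index, codebook[index]


# Hypothetical 2-D codebook; real text embeddings have far more dimensions.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
index, vector = quantize([0.9, 1.2], codebook)
```

One plausible reading of the design is that differently worded prompts for the same structure collapse onto the same codebook entry, but that interpretation is an assumption rather than a claim from the source.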
CV · Jul 21, 2024
A Novel Method to Improve Quality Surface Coverage in Multi-View Capture
Wei-Lun Huang, Davood Tashayyod, Amir Gandjbakhche et al.
The depth of field of a camera is a limiting factor for applications that require taking images at a short subject-to-camera distance or using a large focal length, such as total body photography, archaeology, and other close-range photogrammetry applications. Furthermore, in multi-view capture, where the target is larger than the camera's field of view, an efficient way to optimize surface coverage captured with quality remains a challenge. Given the 3D mesh of the target object and camera poses, we propose a novel method to derive a focus distance for each camera that optimizes the quality of the covered surface area. We first design an Expectation-Minimization (EM) algorithm to assign points on the mesh uniquely to cameras and then solve for a focus distance for each camera given the associated point set. We further improve the quality surface coverage by proposing a k-view algorithm that solves for the points assignment and focus distances by considering multiple views simultaneously. We demonstrate the effectiveness of the proposed method under various simulations for total body photography. The EM and k-view algorithms improve the relative cost of the baseline single-view methods by at least 24% and 28% respectively, corresponding to increasing the in-focus surface area by roughly 1550 cm² and 1780 cm². We believe the algorithms can be useful in a number of vision applications that require photogrammetric details but are limited by the depth of field.
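The alternation between assigning points to cameras and solving for each camera's focus distance can be illustrated with a simplified stand-in. This sketch is a k-medians-style loop, not the paper's algorithm: it assumes a hypothetical cost equal to the absolute difference between a point's distance to the camera and that camera's focus distance, and it ignores visibility, viewing angle, and the actual depth-of-field model.

```python
from statistics import median


def alternate_focus(depths, focus, iterations=10):
    """depths[c][p] is the distance from camera c to point p; focus[c] is an initial guess."""
    n_cams, n_pts = len(depths), len(depths[0])
    focus = list(focus)
    assignment = [0] * n_pts
    for _ in range(iterations):
        # Assignment step: give each point to the camera that focuses closest to it.
        assignment = [
            min(range(n_cams), key=lambda c: abs(depths[c][p] - focus[c]))
            for p in range(n_pts)
        ]
        # Update step: the median minimizes the summed absolute cost per camera.
        for c in range(n_cams):
            assigned = [depths[c][p] for p in range(n_pts) if assignment[p] == c]
            if assigned:
                focus[c] = median(assigned)
    return assignment, focus


# Two cameras, four points: each camera is close to a different pair of points.
depths = [[1.0, 1.1, 3.0, 3.2], [3.0, 3.1, 1.0, 1.2]]
assignment, focus = alternate_focus(depths, focus=[1.5, 1.5])
```

Each step can only lower or keep the total cost under this toy model, which is the property that makes such alternating schemes converge.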
CV · Mar 26, 2024
Segment Any Medical Model Extended
Yihao Liu, Jiaming Zhang, Andres Diaz-Pinto et al.
The Segment Anything Model (SAM) has drawn significant attention from researchers who work on medical image segmentation because of its generalizability. However, researchers have found that SAM may have limited performance on medical images compared to state-of-the-art non-foundation models. Regardless, the community sees potential in extending, fine-tuning, modifying, and evaluating SAM for analysis of medical imaging. An increasing number of works have been published focusing on the mentioned four directions, where variants of SAM are proposed. To this end, a unified platform helps push the boundary of the foundation model for medical images, facilitating the use, modification, and validation of SAM and its variants in medical image segmentation. In this work, we introduce SAMM Extended (SAMME), a platform that integrates new SAM variant models, adopts faster communication protocols, accommodates new interactive modes, and allows for fine-tuning of subcomponents of the models. These features can expand the potential of foundation models like SAM, and the results can be translated to applications such as image-guided therapy, mixed reality interaction, robotic navigation, and data augmentation.
RO · Mar 24, 2024
Realtime Robust Shape Estimation of Deformable Linear Object
Jiaming Zhang, Zhaomeng Zhang, Yihao Liu et al.
Realtime shape estimation of continuum objects and manipulators is essential for developing accurate planning and control paradigms. Existing methods that create dense point clouds from camera images and/or use distinguishable markers on a deformable body have limitations in realtime tracking of large continuum objects/manipulators. The physical occlusion of markers can often compromise accurate shape estimation. We propose a robust method to estimate the shape of linear deformable objects in realtime using scattered and unordered key points. By utilizing a robust probability-based labeling algorithm, our approach identifies the true order of the detected key points and then reconstructs the shape using piecewise spline interpolation. The approach relies only on knowing the number of key points and the interval between two neighboring points. We demonstrate the robustness of the method when key points are partially occluded. The proposed method is also integrated into a simulation in Unity for tracking the shape of a cable with a length of 1 m and a radius of 5 mm. The simulation results show that our proposed approach achieves an average length error of 1.07% over the continuum's centerline and an average cross-section error of 2.11 mm. The real-world experiments of tracking and estimating a heavy-load cable show that the proposed approach is robust under occlusion and complex entanglement scenarios.
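The order-then-interpolate pipeline can be sketched with two substitutions that should be read as assumptions: a greedy nearest-neighbor chain from a known endpoint stands in for the probability-based labeling algorithm, and a uniform Catmull-Rom spline stands in for the unspecified piecewise spline. Neither is the authors' method.

```python
import math


def order_keypoints(points, start_index=0):
    """Greedy nearest-neighbor ordering starting from a known endpoint (assumption)."""
    remaining = list(range(len(points)))
    order = [remaining.pop(start_index)]
    while remaining:
        last = points[order[-1]]
        nxt = min(remaining, key=lambda i: math.dist(last, points[i]))
        remaining.remove(nxt)
        order.append(nxt)
    return order


def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom segment between p1 (t=0) and p2 (t=1)."""
    return tuple(
        0.5 * (2 * b + (c - a) * t + (2 * a - 5 * b + 4 * c - d) * t ** 2
               + (-a + 3 * b - 3 * c + d) * t ** 3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def reconstruct(points, start_index=0, samples_per_segment=5):
    """Order the key points, then sample a smooth curve through them."""
    ordered = [points[i] for i in order_keypoints(points, start_index)]
    padded = [ordered[0]] + ordered + [ordered[-1]]  # duplicate the two endpoints
    curve = []
    for i in range(len(ordered) - 1):
        for s in range(samples_per_segment):
            curve.append(catmull_rom(*padded[i:i + 4], s / samples_per_segment))
    curve.append(tuple(float(x) for x in ordered[-1]))
    return curve
```

Greedy ordering fails when the object folds back on itself so that non-neighboring key points come close together, which is the kind of case a more robust labeling step is meant to handle.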
RO · Mar 7, 2025
dARt Vinci: Egocentric Data Collection for Surgical Robot Learning at Scale
Yihao Liu, Yu-Chun Ku, Jiaming Zhang et al.
Data scarcity has long been an issue in the robot learning community. Particularly, in safety-critical domains like surgical applications, obtaining high-quality data can be especially difficult. It poses challenges to researchers seeking to exploit recent advancements in reinforcement learning and imitation learning, which have greatly improved generalizability and enabled robots to conduct tasks autonomously. We introduce dARt Vinci, a scalable data collection platform for robot learning in surgical settings. The system uses Augmented Reality (AR) hand tracking and a high-fidelity physics engine to capture subtle maneuvers in primitive surgical tasks. By eliminating the need for a physical robot setup and providing flexibility in terms of time, space, and hardware resources (such as multiview sensors and actuators), specialized simulation is a viable alternative. At the same time, AR allows the robot data collection to be more egocentric, supported by its body tracking and content overlaying capabilities. Our user study confirms the proposed system's efficiency and usability, where we use widely-used primitive tasks for training teleoperation with da Vinci surgical robots. Data throughput improves across all tasks compared to real robot settings by 41% on average. The total experiment time is reduced by an average of 10%. The temporal demand in the task load survey is improved. These gains are statistically significant. Additionally, the collected data is over 400 times smaller in size, requiring far less storage while achieving double the frequency.
RO · Mar 21, 2024
A Roadmap Towards Automated and Regulated Robotic Systems
Yihao Liu, Mehran Armand
The rapid development of generative technology opens up the possibility of a higher level of automation, and artificial intelligence (AI) embodiment in robotic systems is imminent. However, due to the blackbox nature of the generative technology, the generation of the knowledge and workflow scheme is uncontrolled, especially in a dynamic environment and a complex scene. This poses challenges to regulations in safety-demanding applications such as medical scenes. We argue that unregulated generative processes from AI are suited to low-level end tasks, but intervention in the form of manual or automated regulation should happen post-workflow-generation and pre-robotic-execution. To address this, we propose a roadmap that can lead to fully automated and regulated robotic systems. In this paradigm, the high-level policies are generated as structured graph data, enabling regulatory oversight and reusability, while the code base for lower-level tasks is generated by generative models. Our approach aims at the transition from expert knowledge to regulated action, akin to the iterative processes of study, practice, scrutiny, and execution in human tasks. We identify the generative and deterministic processes in a design cycle, where generative processes serve as a text-based world simulator and the deterministic processes generate the executable system. We propose State Machine Serialization Language (SMSL) to be the conversion point between text simulator and executable workflow control. From there, we analyze the modules involved based on the current literature, and discuss human in the loop. As a roadmap, this work identifies the current possible implementation and future work. This work does not provide an implemented system but aims to inspire the researchers working on the direction in the roadmap. We implement the SMSL and D-SFO paradigm that serve as the starting point of the roadmap.
RO · Mar 7, 2025
Look Before You Leap: Using Serialized State Machine for Language Conditioned Robotic Manipulation
Tong Mu, Yihao Liu, Mehran Armand
Imitation learning frameworks for robotic manipulation have drawn attention in the recent development of language model grounded robotics. However, the success of the frameworks largely depends on the coverage of the demonstration cases: when the demonstration set does not include examples of how to act in all possible situations, the action may fail and can result in cascading errors. To solve this problem, we propose a framework that uses serialized Finite State Machine (FSM) to generate demonstrations and improve the success rate in manipulation tasks requiring a long sequence of precise interactions. To validate its effectiveness, we use environmentally evolving and long-horizon puzzles that require long sequential actions. Experimental results show that our approach achieves a success rate of up to 98% in these tasks, compared to the controlled condition using existing approaches, which only had a success rate of up to 60% and, in some tasks, almost failed completely.
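To make the idea of a serialized state machine concrete, the sketch below stores a small FSM as JSON and replays a sequence of events through it. The JSON schema, state names, and event names are all hypothetical; the abstract does not describe the serialization format the authors use.

```python
import json

# Hypothetical serialization: states, events, and layout are illustrative only.
FSM_JSON = """
{
  "initial": "reach",
  "terminal": ["done"],
  "transitions": {
    "reach": {"at_object": "grasp"},
    "grasp": {"grasped": "place", "slipped": "reach"},
    "place": {"placed": "done"}
  }
}
"""


def run_fsm(fsm, events):
    """Replay events through the FSM and return the visited states."""
    state = fsm["initial"]
    trace = [state]
    for event in events:
        if state in fsm["terminal"]:
            break
        # Events with no matching transition leave the state unchanged.
        state = fsm["transitions"][state].get(event, state)
        trace.append(state)
    return trace


fsm = json.loads(FSM_JSON)
trace = run_fsm(fsm, ["at_object", "slipped", "at_object", "grasped", "placed"])
```

Because the machine is plain data, a recovery branch such as the "slipped" transition can be inspected and enumerated before execution, which is one way to cover situations that a fixed demonstration set might miss.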
HC · Mar 9
Extend Your Horizon: A Device-Agnostic Surgical Tool Tracking Framework with Multi-View Optimization for Augmented Reality
Jiaming Zhang, Mingxu Liu, Hongchao Shu et al.
Surgical navigation provides real-time guidance by estimating the pose of patient anatomy and surgical instruments to visualize relevant intraoperative information. In conventional systems, instruments are typically tracked using fiducial markers and stationary optical tracking systems (OTS). Augmented reality (AR) has further enabled intuitive visualization and motivated tracking using sensors embedded in head-mounted displays (HMDs). However, most existing approaches rely on a clear line of sight, which is difficult to maintain in dynamic operating room environments due to frequent occlusions caused by equipment, surgical tools, and personnel. This work introduces a framework for tracking surgical instruments under occlusion by fusing multiple sensing modalities within a dynamic scene graph representation. The proposed approach integrates tracking systems with different accuracy levels and motion characteristics while estimating tracking reliability in real time. Experimental results demonstrate improved robustness and enhanced consistency of AR visualization in the presence of occlusions.
CV · May 22, 2025
A Shape-Aware Total Body Photography System for In-focus Surface Coverage Optimization
Wei-Lun Huang, Joshua Liu, Davood Tashayyod et al.
Total Body Photography (TBP) is becoming a useful screening tool for patients at high risk for skin cancer. While much progress has been made, existing TBP systems can be further improved for automatic detection and analysis of suspicious skin lesions, which is in part related to the resolution and sharpness of acquired images. This paper proposes a novel shape-aware TBP system automatically capturing full-body images while optimizing image quality in terms of resolution and sharpness over the body surface. The system uses depth and RGB cameras mounted on a 360-degree rotary beam, along with 3D body shape estimation and an in-focus surface optimization method to select the optimal focus distance for each camera pose. This allows for optimizing the focused coverage over the complex 3D geometry of the human body given the calibrated camera poses. We evaluate the effectiveness of the system in capturing high-fidelity body images. The proposed system achieves an average resolution of 0.068 mm/pixel and 0.0566 mm/pixel with approximately 85% and 95% of surface area in-focus, evaluated on simulation data of diverse body shapes and poses as well as a real scan of a mannequin respectively. Furthermore, the proposed shape-aware focus method outperforms existing focus protocols (e.g. auto-focus). We believe the high-fidelity imaging enabled by the proposed system will improve automated skin lesion analysis for skin cancer screening.
IV · Apr 3, 2025
Benchmark of Segmentation Techniques for Pelvic Fracture in CT and X-ray: Summary of the PENGWIN 2024 Challenge
Yudi Sang, Yanzhen Liu, Sutuke Yibulayimu et al.
The segmentation of pelvic fracture fragments in CT and X-ray images is crucial for trauma diagnosis, surgical planning, and intraoperative guidance. However, accurately and efficiently delineating the bone fragments remains a significant challenge due to complex anatomy and imaging limitations. The PENGWIN challenge, organized as a MICCAI 2024 satellite event, aimed to advance automated fracture segmentation by benchmarking state-of-the-art algorithms on these complex tasks. A diverse dataset of 150 CT scans was collected from multiple clinical centers, and a large set of simulated X-ray images was generated using the DeepDRR method. Final submissions from 16 teams worldwide were evaluated under a rigorous multi-metric testing scheme. The top-performing CT algorithm achieved an average fragment-wise intersection over union (IoU) of 0.930, demonstrating satisfactory accuracy. However, in the X-ray task, the best algorithm attained an IoU of 0.774, highlighting the greater challenges posed by overlapping anatomical structures. Beyond the quantitative evaluation, the challenge revealed methodological diversity in algorithm design. Variations in instance representation, such as primary-secondary classification versus boundary-core separation, led to differing segmentation strategies. Despite promising results, the challenge also exposed inherent uncertainties in fragment definition, particularly in cases of incomplete fractures. These findings suggest that interactive segmentation approaches, integrating human decision-making with task-relevant information, may be essential for improving model reliability and clinical applicability.
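A fragment-wise intersection over union can be computed per labeled fragment and then averaged, as in the minimal sketch below. It assumes that predicted fragment labels have already been matched to ground-truth labels and that label 0 is background; the challenge's actual matching and averaging protocol is not given in the abstract.

```python
def fragment_iou(gt, pred, label):
    """IoU of one fragment label between two equally sized 2-D label maps."""
    inter = union = 0
    for g_row, p_row in zip(gt, pred):
        for g, p in zip(g_row, p_row):
            in_g, in_p = g == label, p == label
            inter += int(in_g and in_p)
            union += int(in_g or in_p)
    return inter / union if union else 1.0


def mean_fragment_iou(gt, pred):
    """Average IoU over all non-background labels present in the ground truth."""
    labels = sorted({v for row in gt for v in row if v != 0})
    if not labels:
        return 1.0
    return sum(fragment_iou(gt, pred, lab) for lab in labels) / len(labels)


# Tiny example with two fragments (labels 1 and 2) on a 2x3 grid.
gt = [[1, 1, 0], [2, 2, 0]]
pred = [[1, 0, 0], [2, 2, 2]]
score = mean_fragment_iou(gt, pred)
```

In the example, fragment 1 overlaps on one of two pixels and fragment 2 on two of three, so averaging per fragment weights a small fragment as heavily as a large one.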
CV · Dec 10, 2024
Revisiting Lesion Tracking in 3D Total Body Photography
Wei-Lun Huang, Minghao Xue, Zhiyou Liu et al.
Melanoma is the most deadly form of skin cancer. Tracking the evolution of nevi and detecting new lesions across the body is essential for the early detection of melanoma. Despite prior work on longitudinal tracking of skin lesions in 3D total body photography, there are still several challenges, including 1) low accuracy for finding correct lesion pairs across scans, 2) sensitivity to noisy lesion detection, and 3) lack of large-scale datasets with numerous annotated lesion pairs. We propose a framework that takes in a pair of 3D textured meshes, matches lesions in the context of total body photography, and identifies unmatchable lesions. We start by computing correspondence maps bringing the source and target meshes to a template mesh. Using these maps to define source/target signals over the template domain, we construct a flow field aligning the mapped signals. The initial correspondence maps are then refined by advecting forward/backward along the vector field. Finally, lesion assignment is performed using the refined correspondence maps. We propose the first large-scale dataset for skin lesion tracking with 25K lesion pairs across 198 subjects. The proposed method achieves a success rate of 89.9% (at 10 mm criterion) for all pairs of annotated lesions and a matching accuracy of 98.2% for subjects with more than 200 lesions.
CV · Aug 4, 2021
The Impact of Machine Learning on 2D/3D Registration for Image-guided Interventions: A Systematic Review and Perspective
Mathias Unberath, Cong Gao, Yicheng Hu et al.
Image-based navigation is widely considered the next frontier of minimally invasive surgery. It is believed that image-based navigation will increase the access to reproducible, safe, and high-precision surgery as it may then be performed at acceptable costs and effort. This is because image-based techniques avoid the need for specialized equipment and seamlessly integrate with contemporary workflows. Further, it is expected that image-based navigation will play a major role in enabling mixed reality environments and autonomous, robotic workflows. A critical component of image guidance is 2D/3D registration, a technique to estimate the spatial relationships between 3D structures, e.g., volumetric imagery or tool models, and 2D images thereof, such as fluoroscopy or endoscopy. While image-based 2D/3D registration is a mature technique, its transition from the bench to the bedside has been restrained by well-known challenges, including brittleness of the optimization objective, hyperparameter selection, and initialization, difficulties around inconsistencies or multiple objects, and limited single-view performance. One reason these challenges persist today is that analytical solutions are likely inadequate considering the complexity, variability, and high-dimensionality of generic 2D/3D registration problems. The recent advent of machine learning-based approaches to imaging problems that, rather than specifying the desired functional mapping, approximate it using highly expressive parametric models holds promise for solving some of the notorious challenges in 2D/3D registration. In this manuscript, we review the impact of machine learning on 2D/3D registration to systematically summarize the recent advances made by the introduction of this novel technology. Grounded in these insights, we then offer our perspective on the most pressing needs, significant open problems, and possible next steps.
RO · Jan 12, 2021
A Robotic System for Implant Modification in Single-stage Cranioplasty
Shuya Liu, Wei-Lun Huang, Chad Gordon et al.
Craniomaxillofacial reconstruction with patient-specific customized craniofacial implants (CCIs) is most commonly performed for large-sized skeletal defects. Because the exact size of skull resection may not be known prior to the surgery, in the single-stage cranioplasty, a large CCI is prefabricated and resized intraoperatively with a manual-cutting process provided by a surgeon. The manual resizing, however, may be inaccurate and significantly add to the operating time. This paper introduces a fast and non-contact approach for intraoperatively determining the exact contour of the skull resection and automatically resizing the implant to fit the resection area. Our approach includes four steps: First, a patient's defect information is acquired by a 3D scanner. Second, the scanned defect is aligned to the CCI by registering the scanned defect to the reconstructed CT model. Third, a cutting toolpath is generated from the contour of the scanned defect. Lastly, the large CCI is resized by a cutting robot to fit the resection area according to the given toolpath. To evaluate the resizing performance of our method, six different resection shapes were used in the cutting experiments. We compared the performance of our method to the performances of surgeon's manual resizing and an existing technique which collects the defect contour with an optical tracking system and projects the contour on the CCI to guide the manual modification. The results show that our proposed method improves the resizing accuracy by 56% compared to the surgeon's manual modification and 42% compared to the projection method.
RO · May 5, 2020
A Versatile Data-Driven Framework for Model-Independent Control of Continuum Manipulators Interacting With Obstructed Environments With Unknown Geometry and Stiffness
Farshid Alambeigi, Zerui Wang, Yun-Hui Liu et al.
This paper addresses the problem of controlling a continuum manipulator (CM) in free or obstructed environments with no prior knowledge about the deformation behavior of the CM and the stiffness and geometry of the interacting obstructed environment. We propose a versatile data-driven priori-model-independent (PMI) control framework, in which various control paradigms (e.g. CM's position or shape control) can be defined based on the provided feedback. This optimal iterative algorithm learns the deformation behavior of the CM in interaction with an unknown environment, in real time, and then accomplishes the defined control objective. To evaluate the scalability of the proposed framework, we integrated two different CMs, designed for medical applications, with the da Vinci Research Kit (dVRK). The performance and learning capability of the framework were investigated in 11 sets of experiments including PMI position and shape control in free and unknown obstructed environments as well as during manipulation of an unknown deformable object. We also evaluated the performance of our algorithm in an ex-vivo experiment with a lamb heart. The theoretical and experimental results demonstrate the adaptivity, versatility, and accuracy of the proposed framework and, therefore, its suitability for a variety of applications involving continuum manipulators.
CV · Mar 24, 2020
Generalizing Spatial Transformers to Projective Geometry with Applications to 2D/3D Registration
Cong Gao, Xingtong Liu, Wenhao Gu et al.
Differentiable rendering is a technique to connect 3D scenes with corresponding 2D images. Since it is differentiable, processes during image formation can be learned. Previous approaches to differentiable rendering focus on mesh-based representations of 3D scenes, which is inappropriate for medical applications where volumetric, voxelized models are used to represent anatomy. We propose a novel Projective Spatial Transformer module that generalizes spatial transformers to projective geometry, thus enabling differentiable volume rendering. We demonstrate the usefulness of this architecture on the example of 2D/3D registration between radiographs and CT scans. Specifically, we show that our transformer enables end-to-end learning of an image processing and projection model that approximates an image similarity function that is convex with respect to the pose parameters, and can thus be optimized effectively using conventional gradient descent. To the best of our knowledge, this is the first time that spatial transformers have been described for projective geometry. The source code will be made public upon publication of this manuscript and we hope that our developments will benefit related 3D research applications.
IVMar 5, 2020
From Perspective X-ray Imaging to Parallax-Robust Orthographic StitchingJavad Fotouhi, Xingtong Liu, Mehran Armand et al.
Stitching images acquired under perspective projective geometry is a relevant topic in computer vision with multiple applications ranging from smartphone panoramas to the construction of digital maps. Image stitching is an equally prominent challenge in medical imaging, where the limited field-of-view captured by single images prohibits holistic analysis of patient anatomy. The barrier that prevents straightforward mosaicing of 2D images is depth mismatch due to parallax. In this work, we leverage the Fourier slice theorem to aggregate information from multiple transmission images in parallax-free domains using fundamental principles of X-ray image formation. The semantics of the stitched image are restored using a novel deep learning strategy that exploits similarity measures designed around frequency, as well as dense and sparse spatial image content. Our pipeline not only stitches images but also provides an orthographic reconstruction that enables metric measurements of clinically relevant quantities directly on the 2D image plane.
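For readers unfamiliar with the Fourier slice theorem mentioned above, here is a small numerical check of its discrete, parallel-beam form in 2D: the 1D Fourier transform of a projection equals the central slice of the 2D Fourier transform. This is a sketch of the theorem only, with a random array standing in for an attenuation map; it is not the stitching pipeline of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64))            # stand-in for a 2D attenuation map

# Parallel projection: integrate the image along axis 0.
projection = image.sum(axis=0)

# 1D spectrum of the projection ...
projection_spectrum = np.fft.fft(projection)
# ... matches the zero-frequency row (central slice) of the 2D spectrum.
central_slice = np.fft.fft2(image)[0, :]

max_difference = float(np.max(np.abs(projection_spectrum - central_slice)))
```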
CVMar 4, 2020
Exploring Partial Intrinsic and Extrinsic Symmetry in 3D Medical ImagingJavad Fotouhi, Giacomo Taylor, Mathias Unberath et al.
We present a novel methodology to detect imperfect bilateral symmetry in CT of human anatomy. In this paper, the structurally symmetric nature of the pelvic bone is explored and is used to provide interventional image augmentation for treatment of unilateral fractures in patients with traumatic injuries. The mathematical basis of our solution is the incorporation of attributes and characteristics that satisfy the properties of intrinsic and extrinsic symmetry and are robust to outliers. In the first step, feature points that satisfy intrinsic symmetry are automatically detected in the Möbius space defined on the CT data. These features are then pruned via a two-stage RANSAC to attain correspondences that also satisfy extrinsic symmetry. Then, a disparity function based on Tukey's biweight robust estimator is introduced and minimized to identify a symmetry plane parametrization that yields maximum contralateral similarity. Finally, a novel regularization term is introduced to enhance similarity between bone density histograms across the partial symmetry plane, relying on the important biological observation that, even if injured, the dislocated bone segments remain within the body. Our extensive evaluations on various cases of common fracture types demonstrate the validity of the novel concepts and the robustness and accuracy of the proposed method.
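Tukey's biweight loss, on which the disparity function above is based, grows like a squared error for small residuals and saturates at c²/6 once a residual exceeds the cutoff c, so gross outliers stop influencing the fit. The sketch below implements the standard loss; the cutoff value 4.685 is the usual textbook default for unit-variance noise and the sample residuals are made up, so neither reflects the paper's settings.

```python
import numpy as np

def tukey_biweight(residuals, c=4.685):
    """Tukey's biweight loss: near-quadratic close to zero, constant beyond c."""
    r = np.asarray(residuals, dtype=float)
    u = np.clip(np.abs(r) / c, 0.0, 1.0)
    return (c ** 2 / 6.0) * (1.0 - (1.0 - u ** 2) ** 3)

# Illustrative residuals with one gross outlier (e.g. a displaced fragment).
residuals = np.array([0.1, -0.3, 0.2, 25.0])
robust_cost = float(tukey_biweight(residuals).sum())
squared_cost = float(0.5 * np.sum(residuals ** 2))
```

With these values the outlier contributes at most c²/6 to `robust_cost`, whereas it dominates `squared_cost`.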
CVMar 4, 2020
Spatiotemporal-Aware Augmented Reality: Redefining HCI in Image-Guided TherapyJavad Fotouhi, Arian Mehrfard, Tianyu Song et al.
Suboptimal interaction with patient data and challenges in mastering 3D anatomy based on ill-posed 2D interventional images are essential concerns in image-guided therapies. Augmented reality (AR) has been introduced in the operating rooms in the last decade; however, in image-guided interventions, it has often only been considered as a visualization device improving traditional workflows. As a consequence, the technology is gaining minimum maturity that it requires to redefine new procedures, user interfaces, and interactions. The main contribution of this paper is to reveal how exemplary workflows are redefined by taking full advantage of head-mounted displays when entirely co-registered with the imaging system at all times. The proposed AR landscape is enabled by co-localizing the users and the imaging devices via the operating room environment and exploiting all involved frustums to move spatial information between different bodies. The system's awareness of the geometric and physical characteristics of X-ray imaging allows the redefinition of different human-machine interfaces. We demonstrate that this AR paradigm is generic, and can benefit a wide variety of procedures. Our system achieved an error of $4.76\pm2.91$ mm for placing K-wire in a fracture management procedure, and yielded errors of $1.57\pm1.16^\circ$ and $1.46\pm1.00^\circ$ in the abduction and anteversion angles, respectively, for total hip arthroplasty. We hope that our holistic approach towards improving the interface of surgery not only augments the surgeon's capabilities but also augments the surgical team's experience in carrying out an effective intervention with reduced complications and provides novel approaches to documenting procedures for training purposes.
CVNov 16, 2019
Automatic Annotation of Hip Anatomy in Fluoroscopy for Robust and Efficient 2D/3D RegistrationRobert Grupp, Mathias Unberath, Cong Gao et al.
Fluoroscopy is the standard imaging modality used to guide hip surgery and is therefore a natural sensor for computer-assisted navigation. In order to efficiently solve the complex registration problems presented during navigation, human-assisted annotations of the intraoperative image are typically required. This manual initialization interferes with the surgical workflow and diminishes any advantages gained from navigation. We propose a method for fully automatic registration using annotations produced by a neural network. Neural networks are trained to simultaneously segment anatomy and identify landmarks in fluoroscopy. Training data is obtained using an intraoperatively incompatible 2D/3D registration of hip anatomy. Ground truth 2D labels are established using projected 3D annotations. Intraoperative registration couples an intensity-based strategy with annotations inferred by the network and requires no human assistance. Ground truth labels were obtained in 366 fluoroscopic images across 6 cadaveric specimens. In a leave-one-subject-out experiment, networks obtained mean Dice coefficients for left and right hemipelves, left and right femurs of 0.86, 0.87, 0.90, and 0.84. The mean 2D landmark error was 5.0 mm. The pelvis was registered within 1 degree for 86% of the images when using the proposed intraoperative approach with an average runtime of 7 seconds. In comparison, an intensity-only approach without manual initialization registered the pelvis to 1 degree in 18% of images. We have created the first accurately annotated, non-synthetic dataset of hip fluoroscopy. By using these annotations as training data for neural networks, state-of-the-art performance in fluoroscopic segmentation and landmark localization was achieved. Integrating these annotations allows for a robust, fully automatic, and efficient intraoperative registration during fluoroscopic navigation of the hip.
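The Dice coefficients reported above measure the overlap between a predicted and a reference segmentation as twice the intersection divided by the sum of the two mask sizes. A minimal sketch on binary masks follows; the tiny masks are invented for illustration, and returning 1.0 for two empty masks is a convention chosen here, not one stated in the paper.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:               # both masks empty: treated as perfect agreement
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total

pred_mask = np.array([[1, 1, 0], [0, 1, 0]])
true_mask = np.array([[1, 0, 0], [0, 1, 1]])
score = dice_coefficient(pred_mask, true_mask)   # 2 * 2 / (3 + 3)
```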
CVOct 22, 2019
Fast and Automatic Periacetabular Osteotomy Fragment Pose Estimation Using Intraoperatively Implanted Fiducials and Single-View FluoroscopyRobert Grupp, Ryan Murphy, Rachel Hegeman et al.
Accurate and consistent mental interpretation of fluoroscopy to determine the position and orientation of acetabular bone fragments in 3D space is difficult. We propose a computer assisted approach that uses a single fluoroscopic view and quickly reports the pose of an acetabular fragment without any user input or initialization. Intraoperatively, but prior to any osteotomies, two constellations of metallic ball-bearings (BBs) are injected into the wing of a patient's ilium and lateral superior pubic ramus. One constellation is located on the expected acetabular fragment, and the other is located on the remaining, larger, pelvis fragment. The 3D locations of each BB are reconstructed using three fluoroscopic views and 2D/3D registrations to a preoperative CT scan of the pelvis. The relative pose of the fragment is established by estimating the movement of the two BB constellations using a single fluoroscopic view taken after osteotomy and fragment relocation. BB detection and inter-view correspondences are automatically computed throughout the processing pipeline. The proposed method was evaluated on a multitude of fluoroscopic images collected from six cadaveric surgeries performed bilaterally on three specimens. Mean fragment rotation error was 2.4 +/- 1.0 degrees, mean translation error was 2.1 +/- 0.6 mm, and mean 3D lateral center edge angle error was 1.0 +/- 0.5 degrees. The average runtime of the single-view pose estimation was 0.7 +/- 0.2 seconds. The proposed method demonstrates accuracy similar to other state of the art systems which require optical tracking systems or multiple-view 2D/3D registrations with manual input. The errors reported on fragment poses and lateral center edge angles are within the margins required for accurate intraoperative evaluation of femoral head coverage.
CVSep 23, 2019
Pelvis Surface Estimation From Partial CT for Computer-Aided Pelvic OsteotomiesRobert Grupp, Yoshito Otake, Ryan Murphy et al.
Computer-aided surgical systems commonly use preoperative CT scans when performing pelvic osteotomies for intraoperative navigation. These systems have the potential to improve the safety and accuracy of pelvic osteotomies; however, exposing the patient to radiation is a significant drawback. In order to reduce radiation exposure, we propose a new smooth extrapolation method leveraging a partial pelvis CT and a statistical shape model (SSM) of the full pelvis in order to estimate a patient's complete pelvis. An SSM of normal, complete, female pelvis anatomy was created and evaluated from 42 subjects. A leave-one-out test was performed to characterise the inherent generalisation capability of the SSM. An additional leave-one-out test was conducted to measure performance of the smooth extrapolation method and an existing "cut-and-paste" extrapolation method. Unknown anatomy was simulated by keeping the axial slices of the patient's acetabulum intact and varying the amount of the superior iliac crest retained, from 0% to 15% of the total pelvis extent. The smooth technique showed an average improvement over the cut-and-paste method of 1.31 mm and 3.61 mm, in RMS and maximum surface error, respectively. With 5% of the iliac crest retained, the smoothly estimated surface had an RMS surface error of 2.21 mm, an improvement of 1.25 mm over retaining none of the iliac crest. This anatomical estimation method creates the possibility of a patient and surgeon benefiting from the use of a CAS system and simultaneously reducing the patient's radiation exposure.
CVSep 23, 2019
Patch-Based Image Similarity for Intraoperative 2D/3D Pelvis Registration During Periacetabular OsteotomyRobert Grupp, Mehran Armand, Russell Taylor
Periacetabular osteotomy is a challenging surgical procedure for treating developmental hip dysplasia, providing greater coverage of the femoral head via relocation of a patient's acetabulum. Since fluoroscopic imaging is frequently used in the surgical workflow, computer-assisted X-Ray navigation of osteotomes and the relocated acetabular fragment should be feasible. We use intensity-based 2D/3D registration to estimate the pelvis pose with respect to fluoroscopic images, recover relative poses of multiple views, and triangulate landmarks which may be used for navigation. Existing similarity metrics are unable to consistently account for the inherent mismatch between the preoperative intact pelvis, and the intraoperative reality of a fractured pelvis. To mitigate the effect of this mismatch, we continuously estimate the relevance of each pixel to solving the registration and use these values as weightings in a patch-based similarity metric. Limiting computation to randomly selected subsets of patches results in faster runtimes than existing patch-based methods. A simulation study was conducted with random fragment shapes, relocations, and fluoroscopic views, and the proposed method achieved a 1.7 mm mean triangulation error over all landmarks, compared to mean errors of 3 mm and 2.8 mm for the non-patched and image-intensity-variance-weighted patch similarity metrics, respectively.
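Once the relative poses of several views are known, a landmark seen in each view can be triangulated. The sketch below uses the standard linear (DLT) triangulation from 3x4 projection matrices, which is a generic technique and not necessarily the one used in the paper. The intrinsics `K`, the 100 mm baseline, and the landmark position are made-up values.

```python
import numpy as np

def triangulate_landmark(projections, pixels):
    """Linear (DLT) triangulation of one 3D point from two or more views."""
    rows = []
    for P, (u, v) in zip(projections, pixels):
        rows.append(u * P[2] - P[0])   # constraint from the image x-coordinate
        rows.append(v * P[2] - P[1])   # constraint from the image y-coordinate
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]                         # null vector = homogeneous 3D point
    return X[:3] / X[3]

def project(P, point):
    """Pinhole projection of a 3D point with a 3x4 projection matrix."""
    h = P @ np.append(point, 1.0)
    return h[:2] / h[2]

# Hypothetical two-view setup: same intrinsics, second view shifted by 100 mm.
K = np.array([[1000.0, 0.0, 256.0], [0.0, 1000.0, 256.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-100.0], [0.0], [0.0]])])

landmark = np.array([10.0, -5.0, 800.0])
pixels = [project(P1, landmark), project(P2, landmark)]
recovered = triangulate_landmark([P1, P2], pixels)
```

With noise-free pixels the landmark is recovered up to numerical precision; with real detections the same routine returns a least-squares estimate in the algebraic sense.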
CVSep 23, 2019
Smooth Extrapolation of Unknown Anatomy via Statistical Shape ModelsRobert Grupp, Hsin-Hong Chiang, Yoshito Otake et al.
Several methods to perform extrapolation of unknown anatomy were evaluated. The primary application is to enhance surgical procedures that may use partial medical images or medical images of incomplete anatomy. Le Fort-based, face-jaw-teeth transplant is one such procedure. From CT data of 36 skulls and 21 mandibles separate Statistical Shape Models of the anatomical surfaces were created. Using the Statistical Shape Models, incomplete surfaces were projected to obtain complete surface estimates. The surface estimates exhibit non-zero error in regions where the true surface is known; it is desirable to keep the true surface and seamlessly merge the estimated unknown surface. Existing extrapolation techniques produce non-smooth transitions from the true surface to the estimated surface, resulting in additional error and a less aesthetically pleasing result. The three extrapolation techniques evaluated were: copying and pasting of the surface estimate (non-smooth baseline), a feathering between the patient surface and surface estimate, and an estimate generated via a Thin Plate Spline trained from displacements between the surface estimate and corresponding vertices of the known patient surface. Feathering and Thin Plate Spline approaches both yielded smooth transitions. However, feathering corrupted known vertex values. Leave-one-out analyses were conducted, with 5% to 50% of known anatomy removed from the left-out patient and estimated via the proposed approaches. The Thin Plate Spline approach yielded smaller errors than the other two approaches, with an average vertex error improvement of 1.46 mm and 1.38 mm for the skull and mandible respectively, over the baseline approach.
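The projection of an incomplete surface onto a Statistical Shape Model, and the copy-and-paste baseline described above, can be sketched as follows. This assumes a PCA-based model over flattened vertex coordinates and a plain least-squares fit on the known coordinates; the synthetic two-mode data and all names are hypothetical, and the feathering and Thin Plate Spline variants are not reproduced here.

```python
import numpy as np

def build_ssm(shapes, n_modes):
    """PCA shape model: mean shape and leading modes of variation (rows)."""
    mean = shapes.mean(axis=0)
    _, _, vt = np.linalg.svd(shapes - mean, full_matrices=False)
    return mean, vt[:n_modes]

def estimate_complete(mean, modes, partial, known):
    """Fit mode weights to the known coordinates, then predict every coordinate."""
    coeffs, *_ = np.linalg.lstsq(modes[:, known].T, partial - mean[known], rcond=None)
    return mean + coeffs @ modes

# Synthetic training set: 20 shapes with 30 coordinates each, two true modes.
rng = np.random.default_rng(1)
base = rng.normal(size=30)
basis = rng.normal(size=(2, 30))
training = base + rng.normal(size=(20, 2)) @ basis
mean, modes = build_ssm(training, n_modes=2)

# Left-out "patient" with the last third of the coordinates unknown.
patient = base + np.array([0.7, -1.2]) @ basis
known = np.arange(30) < 20
estimate = estimate_complete(mean, modes, patient[known], known)

# Copy-and-paste baseline: keep known values, paste the estimate elsewhere.
cut_and_paste = np.where(known, patient, estimate)
```

In this noise-free toy case the estimate matches the patient everywhere. On real anatomy the estimate deviates from the true surface even where it is known, which is why the transition between the kept and the pasted regions needs smoothing.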
ROAug 12, 2019
Learning to Detect Collisions for Continuum Manipulators without a Prior ModelShahriar Sefati, Shahin Sefati, Iulian Iordachita et al.
Due to their flexibility, dexterity, and compact size, Continuum Manipulators (CMs) can enhance minimally invasive interventions. In these procedures, the CM may be operated in proximity of sensitive organs and therefore requires accurate and appropriate feedback when colliding with its surroundings. Conventional CM collision detection algorithms rely on a combination of an exact constrained kinematics model of the CM, geometrical assumptions such as constant curvature behavior, a priori knowledge of the environmental constraint geometry, and/or additional sensors to scan the environment or sense contacts. In this paper, we propose a data-driven machine learning approach using only the available sensory information, without requiring any prior geometrical assumptions or a model of the CM or the surrounding environment. The proposed algorithm is implemented and evaluated on a non-constant curvature CM, equipped with Fiber Bragg Grating (FBG) optical sensors for shape sensing purposes. Results demonstrate successful detection of collisions in constrained environments with soft and hard obstacles with unknown stiffness and location.
ROJul 23, 2019
Reflective-AR Display: An Interaction Methodology for Virtual-Real Alignment in Medical RoboticsJavad Fotouhi, Tianyu Song, Arian Mehrfard et al.
Robot-assisted minimally invasive surgery has been shown to improve patient outcomes, as well as reduce complications and recovery time for several clinical applications. While increasingly configurable robotic arms can maximize reach and avoid collisions in cluttered environments, positioning them appropriately during surgery is complicated because safety regulations prevent automatic driving. We propose a head-mounted display (HMD) based augmented reality (AR) system designed to guide optimal surgical arm set up. A staff member equipped with the HMD aligns the robot with its planned virtual counterpart. In this user-centric setting, the main challenge is the perspective ambiguities that hinder such a collaborative robotic solution. To overcome this challenge, we introduce a novel registration concept for intuitive alignment of AR content to its physical counterpart by providing a multi-view AR experience via reflective-AR displays that simultaneously show the augmentations from multiple viewpoints. Using this system, users can visualize different perspectives while actively adjusting the pose to determine the registration transformation that most closely superimposes the virtual onto the real. The experimental results demonstrate improvement in the interactive alignment of a virtual and real robot when using a reflective-AR display. We also present measurements from configuring a robotic manipulator in a simulated trocar placement surgery using the AR guidance methodology.
CVMar 22, 2019
Pose Estimation of Periacetabular Osteotomy Fragments with Intraoperative X-Ray NavigationRobert B. Grupp, Rachel A. Hegeman, Ryan J. Murphy et al.
Objective: State of the art navigation systems for pelvic osteotomies use optical systems with external fiducials. We propose the use of X-Ray navigation for pose estimation of periacetabular fragments without fiducials. Methods: A 2D/3D registration pipeline was developed to recover fragment pose. This pipeline was tested through an extensive simulation study and 6 cadaveric surgeries. Using osteotomy boundaries in the fluoroscopic images, the preoperative plan is refined to more accurately match the intraoperative shape. Results: In simulation, average fragment pose errors were 1.3°/1.7 mm when the planned fragment matched the intraoperative fragment, 2.2°/2.1 mm when the plan was not updated to match the true shape, and 1.9°/2.0 mm when the fragment shape was intraoperatively estimated. In cadaver experiments, the average pose errors were 2.2°/2.2 mm, 3.8°/2.5 mm, and 3.5°/2.2 mm when registering with the actual fragment shape, a preoperative plan, and an intraoperatively refined plan, respectively. Average errors of the lateral center edge angle were less than 2° for all fragment shapes in simulation and cadaver experiments. Conclusion: The proposed pipeline is capable of accurately reporting femoral head coverage within a range clinically identified for long-term joint survivability. Significance: Human interpretation of fragment pose is challenging and usually restricted to rotation about a single anatomical axis. The proposed pipeline provides an intraoperative estimate of rigid pose with respect to all anatomical axes, is compatible with minimally invasive incisions, and has no dependence on external fiducials.
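Pose errors quoted as a rotation/translation pair, such as 1.3°/1.7 mm, are commonly computed from the residual transform between the estimated and the true pose. The sketch below uses the geodesic rotation angle and the norm of the residual translation. This is one common definition and an assumption here; the paper may compute its errors in a specific anatomical frame.

```python
import numpy as np

def pose_error(T_est, T_true):
    """Rotation (degrees) and translation magnitude of the residual transform."""
    delta = np.linalg.inv(T_true) @ T_est
    cos_angle = np.clip((np.trace(delta[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    rotation_deg = float(np.degrees(np.arccos(cos_angle)))
    translation = float(np.linalg.norm(delta[:3, 3]))
    return rotation_deg, translation

# Made-up example: estimate is off by 2 degrees about z and (1, 2, 2) mm.
angle = np.radians(2.0)
T_true = np.eye(4)
T_est = np.eye(4)
T_est[:3, :3] = [[np.cos(angle), -np.sin(angle), 0.0],
                 [np.sin(angle), np.cos(angle), 0.0],
                 [0.0, 0.0, 1.0]]
T_est[:3, 3] = [1.0, 2.0, 2.0]

rot_err_deg, trans_err_mm = pose_error(T_est, T_true)
```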
CVJan 20, 2019
Localizing dexterous surgical tools in X-ray for image-based navigationCong Gao, Mathias Unberath, Russell Taylor et al.
X-ray image based surgical tool navigation is fast and supplies accurate images of deep-seated structures. Typically, recovering the 6 DOF rigid pose and deformation of tools with respect to the X-ray camera can be accurately achieved through intensity-based 2D/3D registration of 3D images or models to 2D X-rays. However, the capture range of image-based 2D/3D registration is inconveniently small, suggesting that automatic and robust initialization strategies are of critical importance. This manuscript describes a first step towards leveraging semantic information of the imaged object to initialize 2D/3D registration within the capture range of image-based registration by performing concurrent segmentation and localization of dexterous surgical tools in X-ray images. We present a learning-based strategy to simultaneously localize and segment dexterous surgical tools in X-ray images and demonstrate promising performance on synthetic and ex vivo data. We are currently investigating methods to use semantic information extracted by the proposed network to reliably and robustly initialize image-based 2D/3D registration. While image-based 2D/3D registration has been an obvious focus of the CAI community, robust initialization thereof (albeit critical) has largely been neglected. This manuscript discusses learning-based retrieval of semantic information on imaged objects as a stepping stone for such initialization and may therefore be of interest to the IPCAI community. Since results are still preliminary and only focus on localization, we target the Long Abstract category.
RODec 20, 2018
FBG-Based Position Estimation of Highly Deformable Continuum Manipulators: Model-Dependent vs. Data-Driven ApproachesShahriar Sefati, Rachel Hegeman, Farshid Alambeigi et al.
Conventional shape sensing techniques using Fiber Bragg Grating (FBG) involve finding the curvature at discrete FBG active areas and integrating curvature over the length of the continuum dexterous manipulator (CDM) for tip position estimation (TPE). However, due to the limited number of sensing locations and many geometrical assumptions, these methods are prone to large error propagation, especially when the CDM undergoes large deflections. In this paper, we study the complications of using the conventional TPE methods that are dependent on sensor model and propose a new data-driven method that overcomes these challenges. The proposed method consists of a regression model that takes FBG wavelength raw data as input and directly estimates the CDM's tip position. This model is pre-operatively (off-line) trained on position information from optical trackers/cameras (as the ground truth) and it intra-operatively (on-line) estimates CDM tip position using only the FBG wavelength data. The method's performance is evaluated on a CDM developed for orthopedic applications, and the results are compared to conventional model-dependent methods during large deflection bendings. Mean absolute TPE error (and standard deviation) of 1.52 (0.67) mm and 0.11 (0.1) mm with maximum absolute errors of 3.63 mm and 0.62 mm for the conventional and the proposed data-driven techniques were obtained, respectively. These results demonstrate that the proposed data-driven approach significantly outperforms the conventional estimation technique.
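The conventional, model-dependent route described above can be illustrated with a planar sketch: interpolate curvature between the sensing locations, integrate it once to get the tangent angle, and integrate the tangent to get the tip position. The planar assumption, the linear interpolation, and the 35 mm length with three sensing nodes are all choices made for this example, not parameters from the paper.

```python
import numpy as np

def tip_from_curvature(s_nodes, kappa_nodes, n_steps=2000):
    """Planar tip position from curvature sampled at a few arc-length nodes."""
    s = np.linspace(0.0, s_nodes[-1], n_steps + 1)
    kappa = np.interp(s, s_nodes, kappa_nodes)   # assumed linear interpolation
    ds = s[1] - s[0]
    # Tangent angle: trapezoidal integral of curvature (base tangent along +x).
    theta = np.concatenate([[0.0], np.cumsum(0.5 * (kappa[1:] + kappa[:-1]) * ds)])
    # Tip position: trapezoidal integral of the unit tangent.
    x = np.sum(0.5 * (np.cos(theta[1:]) + np.cos(theta[:-1])) * ds)
    y = np.sum(0.5 * (np.sin(theta[1:]) + np.sin(theta[:-1])) * ds)
    return np.array([x, y])

# Constant curvature of 1/50 per mm over 35 mm, sensed at three locations.
s_nodes = np.array([0.0, 17.5, 35.0])
kappa_nodes = np.array([0.02, 0.02, 0.02])
tip = tip_from_curvature(s_nodes, kappa_nodes)

# Closed-form tip for a circular arc, used as a check.
expected_tip = np.array([np.sin(0.7), 1.0 - np.cos(0.7)]) / 0.02
```

Because the curvature between nodes is interpolated, any error at a node propagates through both integrations, which is the error growth under large deflection that motivates regressing tip position directly from the wavelength data.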
ROJul 1, 2018
FBG-Based Control of a Continuum Manipulator Interacting With ObstaclesShahriar Sefati, Ryan Murphy, Farshid Alambeigi et al.
Tracking and controlling the shape of continuum dexterous manipulators (CDM) in constrained environments is a challenging task. The imposed constraints and interaction with unknown obstacles may deform the CDM's shape and therefore demand shape sensing methods which do not rely on direct line of sight. To address these issues, we integrate a novel Fiber Bragg Grating (FBG) shape sensing unit into a CDM, reconstruct the shape in real-time, and develop an optimization-based control algorithm using FBG tip position feedback. The CDM is designed for less-invasive treatment of osteolysis (bone degradation). To evaluate the performance of the feedback control algorithm when the CDM interacts with obstacles, we perform a set of experiments similar to the real scenario of the CDM interaction with soft and hard lesions during the treatment of osteolysis. In addition, we propose methods for identification of the CDM collisions with soft or hard obstacles using the Jacobian information. Results demonstrate successful control of the CDM tip based on the FBG feedback and indicate repeatability and robustness of the proposed method when interacting with unknown obstacles.
ROJun 30, 2018
Inroads Toward Robot-Assisted Internal Fixation of Bone Fractures Using a Bendable Medical Screw and the Curved Drilling TechniqueFarshid Alambeigi, Mahsan Bakhtiarinejad, Armina Azizi et al.
Internal fixation is a common orthopedic procedure in which a rigid screw is used to fix fragments of a fractured bone together and expedite the healing process. However, the rigidity of the screw, geometry of the fractured anatomy (e.g. femur and pelvis), and patient age can cause an array of complications during screw placement, such as improper fracture healing due to misalignment of the bone fragments, lengthy procedure time and subsequently high radiation exposure. To address these issues, we propose a minimally invasive robot-assisted procedure comprising a continuum robot, called ortho-snake, together with a novel bendable medical screw (BMS) for fixating the fractures. We describe the implementation of a curved drilling technique and focus on the design, manufacturing, and evaluation of a novel BMS, which can passively morph into the drilled curved tunnels with various curvatures. We evaluate the performance and efficacy of the proposed BMS using both finite element simulations and experiments conducted on synthetic bone samples.
CVJun 22, 2018
Augmented Reality-based Feedback for Technician-in-the-loop C-arm RepositioningMathias Unberath, Javad Fotouhi, Jonas Hajek et al.
Interventional C-arm imaging is crucial to percutaneous orthopedic procedures as it enables the surgeon to monitor the progress of surgery on the anatomy level. Minimally invasive interventions require repeated acquisition of X-ray images from different anatomical views to verify tool placement. Achieving and reproducing these views often comes at the cost of increased surgical time and radiation dose to both patient and staff. This work proposes a marker-free "technician-in-the-loop" Augmented Reality (AR) solution for C-arm repositioning. The X-ray technician operating the C-arm interventionally is equipped with a head-mounted display capable of recording desired C-arm poses in 3D via an integrated infrared sensor. For C-arm repositioning to a particular target view, the recorded C-arm pose is restored as a virtual object and visualized in an AR environment, serving as a perceptual reference for the technician. We conduct experiments in a setting simulating orthopedic trauma surgery. Our proof-of-principle findings indicate that the proposed system can decrease the 2.76 X-ray images required per desired view down to zero, suggesting substantial reductions of radiation dose during C-arm repositioning. The proposed AR solution is a first step towards facilitating communication between the surgeon and the surgical staff, improving the quality of surgical image acquisition, and enabling context-aware guidance for surgery rooms of the future. The concept of technician-in-the-loop design will become relevant to various interventions considering the expected advancements of sensing and wearable computing in the near future.
CVApr 9, 2018
Exploiting Partial Structural Symmetry For Patient-Specific Image Augmentation in Trauma InterventionsJavad Fotouhi, Mathias Unberath, Giacomo Taylor et al.
In unilateral pelvic fracture reductions, surgeons attempt to reconstruct the bone fragments such that bilateral symmetry in the bony anatomy is restored. We propose to exploit this "structurally symmetric" nature of the pelvic bone, and provide intra-operative image augmentation to assist the surgeon in repairing dislocated fragments. The main challenge is to automatically estimate the desired plane of symmetry within the patient's pre-operative CT. We propose to estimate this plane using a non-linear optimization strategy, by minimizing Tukey's biweight robust estimator, relying on the partial symmetry of the anatomy. Moreover, a regularization term is designed to enforce the similarity of bone density histograms on both sides of this plane, relying on the biological fact that, even if injured, the dislocated bone segments remain within the body. The experimental results demonstrate the performance of the proposed method in estimating this "plane of partial symmetry" using CT images of both healthy and injured anatomy. Examples of unilateral pelvic fractures are used to show how intra-operative X-ray images could be augmented with the forward-projections of the mirrored anatomy, acting as objective road-map for fracture reduction procedures.
CVMar 22, 2018
Closing the Calibration Loop: An Inside-out-tracking Paradigm for Augmented Reality in Orthopedic SurgeryJonas Hajek, Mathias Unberath, Javad Fotouhi et al.
In percutaneous orthopedic interventions the surgeon attempts to reduce and fixate fractures in bony structures. The complexity of these interventions arises when the surgeon performs the challenging task of navigating surgical tools percutaneously only under the guidance of 2D interventional X-ray imaging. Moreover, the intra-operatively acquired data is only visualized indirectly on external displays. In this work, we propose a flexible Augmented Reality (AR) paradigm using optical see-through head mounted displays. The key technical contribution of this work includes the marker-less and dynamic tracking concept which closes the calibration loop between patient, C-arm and the surgeon. This calibration is enabled using Simultaneous Localization and Mapping of the environment of the operating theater. In return, the proposed solution provides in situ visualization of pre- and intra-operative 3D medical data directly at the surgical site. We demonstrate pre-clinical evaluation of a prototype system, and report errors for calibration and target registration. Finally, we demonstrate the usefulness of the proposed inside-out tracking system in achieving "bull's eye" view for C-arm-guided punctures. This AR solution provides an intuitive visualization of the anatomy and can simplify the hand-eye coordination for the orthopedic surgeon.
CVMar 22, 2018
X-ray-transform Invariant Anatomical Landmark Detection for Pelvic Trauma SurgeryBastian Bier, Mathias Unberath, Jan-Nico Zaech et al.
X-ray image guidance enables percutaneous alternatives to complex procedures. Unfortunately, the indirect view onto the anatomy in addition to projective simplification substantially increases the task-load for the surgeon. Additional 3D information such as knowledge of anatomical landmarks can benefit surgical decision making in complicated scenarios. Automatic detection of these landmarks in transmission imaging is challenging since image-domain features characteristic to a certain landmark change substantially depending on the viewing direction. Consequently, and to the best of our knowledge, the above problem has not yet been addressed. In this work, we present a method to automatically detect anatomical landmarks in X-ray images independent of the viewing direction. To this end, a sequential prediction framework based on convolutional layers is trained on synthetically generated data of the pelvic anatomy to predict 23 landmarks in single X-ray images. View independence is contingent on training conditions and, here, is achieved on a spherical segment covering (120 x 90) degrees in LAO/RAO and CRAN/CAUD, respectively, centered around AP. On synthetic data, the proposed approach achieves a mean prediction error of 5.6 ± 4.5 mm. We demonstrate that the proposed network is immediately applicable to clinically acquired data of the pelvis. In particular, we show that our intra-operative landmark detection together with pre-operative CT enables X-ray pose estimation which, ultimately, benefits initialization of image-based 2D/3D registration.
MED-PHMar 22, 2018
DeepDRR -- A Catalyst for Machine Learning in Fluoroscopy-guided ProceduresMathias Unberath, Jan-Nico Zaech, Sing Chun Lee et al.
Machine learning-based approaches outperform competing methods in most disciplines relevant to diagnostic radiology. Interventional radiology, however, has not yet benefited substantially from the advent of deep learning, in particular for two reasons: 1) Most images acquired during the procedure are never archived and are thus not available for learning, and 2) even if they were available, annotations would be a severe challenge due to the vast amounts of data. When considering fluoroscopy-guided procedures, an interesting alternative to true interventional fluoroscopy is in silico simulation of the procedure from 3D diagnostic CT. In this case, labeling is comparably easy and potentially readily available, yet the appropriateness of resulting synthetic data is dependent on the forward model. In this work, we propose DeepDRR, a framework for fast and realistic simulation of fluoroscopy and digital radiography from CT scans, tightly integrated with the software platforms native to deep learning. We use machine learning for material decomposition and scatter estimation in 3D and 2D, respectively, combined with analytic forward projection and noise injection to achieve the required performance. On the example of anatomical landmark detection in X-ray images of the pelvis, we demonstrate that machine learning models trained on DeepDRRs generalize to unseen clinically acquired data without the need for re-training or domain adaptation. Our results are promising and promote the establishment of machine learning in fluoroscopy-guided procedures.
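The analytic forward projection at the core of any digitally reconstructed radiograph rests on the Beer-Lambert law: detected intensity falls off exponentially with the line integral of attenuation along each ray. The sketch below is a deliberately reduced illustration that assumes a parallel beam, a single energy, and no scatter or noise. It is not the DeepDRR API, and `parallel_beam_drr` is a hypothetical helper written for this example.

```python
import numpy as np

def parallel_beam_drr(mu_per_mm, voxel_size_mm, axis=0, i0=1.0):
    """Monoenergetic parallel-beam radiograph via the Beer-Lambert law."""
    path_integral = mu_per_mm.sum(axis=axis) * voxel_size_mm
    return i0 * np.exp(-path_integral)

# Toy volume: a 10 mm deep block of attenuating material surrounded by air.
volume = np.zeros((10, 4, 4))
volume[:, 1:3, 1:3] = 0.02          # attenuation coefficient in 1/mm (made up)
image = parallel_beam_drr(volume, voxel_size_mm=1.0)
```

Rays through the block are attenuated to exp(-0.2) of the incident intensity, while rays through air are unchanged. A realistic simulator additionally needs cone-beam geometry, an energy spectrum, material-dependent attenuation, scatter, and noise, which is where the learned components described in the abstract come in.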
RO Jan 22, 2018
On The Effect of Vibration on Shape Sensing of Continuum Manipulators Using Fiber Bragg Gratings
Shahriar Sefati, Farshid Alambeigi, Iulian Iordachita et al.
Fiber Bragg Grating (FBG) sensing has shown great potential in shape and force sensing of continuum manipulators (CM) and biopsy needles. In recent years, many researchers have studied different manufacturing and modeling techniques of FBG-based force and shape sensors for medical applications. These studies mainly focus on obtaining shape and force information in a static (or quasi-static) environment. In this paper, however, we study and evaluate dynamic environments where the FBG data is affected by vibration caused by a harmonic force, e.g., a rotational debriding tool harmonically exciting the CM and the FBG-based shape sensor. In such situations, appropriate pre-processing of the FBG signal is necessary in order to infer correct information from the raw signal. We look at an example of such dynamic environments in the less invasive treatment of osteolysis by studying the FBG data in both the time and frequency domains in the presence of vibration due to a debriding tool rotating inside the lumen of the CM.
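As an illustration of what time- and frequency-domain pre-processing of a vibration-affected signal can look like, the sketch below finds the dominant vibration frequency with a plain discrete Fourier transform and then suppresses it with a moving average spanning one vibration period. The abstract does not state which pre-processing the authors used, so the filter choice, the function names, the sample rate, and the simulated wavelength-shift values are all assumptions.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Normalized DFT magnitudes for bins 0 .. N/2 (direct O(N^2) evaluation)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc) / n)
    return mags


def dominant_frequency(signal, sample_rate_hz):
    """Frequency in Hz of the strongest non-DC spectral component."""
    n = len(signal)
    mean = sum(signal) / n
    mags = dft_magnitudes([x - mean for x in signal])
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate_hz / n


def moving_average(signal, window):
    """Moving average; a window of exactly one period cancels that harmonic."""
    return [sum(signal[i:i + window]) / window
            for i in range(len(signal) - window + 1)]


# Simulated FBG wavelength shift (nm): a constant 0.5 nm bending-induced shift
# plus a 50 Hz harmonic disturbance. All values are hypothetical.
SAMPLE_RATE_HZ = 1000.0
signal = [0.5 + 0.1 * math.sin(2 * math.pi * 50.0 * t / SAMPLE_RATE_HZ)
          for t in range(200)]

f_vib = dominant_frequency(signal, SAMPLE_RATE_HZ)
window = round(SAMPLE_RATE_HZ / f_vib)  # samples per vibration period
filtered = moving_average(signal, window)
```

The moving average is used only because it is simple to verify; a notch or low-pass filter would be an equally plausible choice when the excitation frequency drifts.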
RO Jan 21, 2018
A Convex Optimization Framework for Constrained Concurrent Motion Control of a Hybrid Redundant Surgical System
Farshid Alambeigi, Shahriar Sefati, Mehran Armand
We present a constrained motion control framework for a redundant surgical system designed for minimally invasive treatment of pelvic osteolysis. The framework comprises a kinematics model of a six-degree-of-freedom (DoF) robotic arm integrated with a one-DoF continuum manipulator as well as a novel convex optimization redundancy resolution controller. To solve the redundancy resolution problem, formulated as a constrained l2-regularized quadratic minimization, we study and evaluate the potential use of an optimally tuned alternating direction method of multipliers (ADMM) algorithm. To this end, we prove global convergence of the algorithm at a linear rate and propose expressions for the involved parameters that yield fast convergence. Simulations on the robotic system verified our analytical derivations and showed the capability and robustness of the ADMM algorithm in constrained motion control of our redundant surgical system.
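The following is a minimal sketch of ADMM applied to one plausible instance of a constrained l2-regularized quadratic minimization: minimize 0.5*||J x - v||^2 + 0.5*lam*||x||^2 subject to box limits lo <= x <= hi on the joint velocities x. The exact cost, constraints, and parameter tuning used in the paper are not given in the abstract, so this formulation, the fixed penalty `rho`, the function names, and the toy 2x3 Jacobian are all assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def admm_box_ls(J, v, lo, hi, lam=1e-3, rho=1.0, iters=200):
    """ADMM with the splitting x = z, where z carries the box constraint."""
    rows, n = len(J), len(J[0])
    # A = J^T J + (lam + rho) I is constant across iterations.
    A = [[sum(J[r][i] * J[r][j] for r in range(rows))
          + ((lam + rho) if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    Jtv = [sum(J[r][i] * v[r] for r in range(rows)) for i in range(n)]
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: unconstrained regularized least squares.
        x = solve_linear(A, [Jtv[i] + rho * (z[i] - u[i]) for i in range(n)])
        # z-update: projection onto the box [lo, hi].
        z = [min(max(x[i] + u[i], lo[i]), hi[i]) for i in range(n)]
        # Dual update.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


# Hypothetical redundant system: 3 joints, 2 task-space velocity components.
J = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
v = [1.0, 0.5]
x_free = admm_box_ls(J, v, lo=[-0.6] * 3, hi=[0.6] * 3)   # limits inactive
x_tight = admm_box_ls(J, v, lo=[-0.4] * 3, hi=[0.4] * 3)  # limits active
```

With the looser limits the result stays close to the minimum-norm solution [0.5, 0, 0.5]; with the tighter limits the task cannot be met exactly and the solver returns a feasible compromise. A fixed `rho` is used for brevity, whereas the abstract refers to an optimally tuned variant.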
CV Jan 4, 2018
Plan in 2D, execute in 3D: An augmented reality solution for cup placement in total hip arthroplasty
Javad Fotouhi, Clayton P. Alexander, Mathias Unberath et al.
Reproducibly achieving proper implant alignment is a critical step in total hip arthroplasty (THA) procedures that has been shown to substantially affect patient outcome. In current practice, correct alignment of the acetabular cup is verified in C-arm X-ray images that are acquired in an anterior-posterior (AP) view. Favorable surgical outcome is, therefore, heavily dependent on the surgeon's experience in understanding the 3D orientation of a hemispheric implant from 2D AP projection images. This work proposes an easy-to-use intra-operative component planning system based on two C-arm X-ray images, combined with 3D augmented reality (AR) visualization that simplifies impactor and cup placement according to the plan by providing a real-time RGBD data overlay. We evaluate the feasibility of our system in a user study comprising four orthopedic surgeons at the Johns Hopkins Hospital, and also report errors in translation, anteversion, and abduction as low as 1.98 mm, 1.10 degrees, and 0.53 degrees, respectively. The promising performance of this AR solution shows that deploying this system could eliminate the need for excessive radiation, simplify the intervention, and enable reproducibly accurate placement of acetabular implants.