CVApr 4, 2023
EduceLab-Scrolls: Verifiable Recovery of Text from Herculaneum Papyri using X-ray CTStephen Parsons, C. Seth Parker, Christy Chapman et al.
We present a complete software pipeline for revealing the hidden texts of the Herculaneum papyri using X-ray CT images. This enhanced virtual unwrapping pipeline combines machine learning with a novel geometric framework linking 3D and 2D images. We also present EduceLab-Scrolls, a comprehensive open dataset representing two decades of research effort on this problem. EduceLab-Scrolls contains a set of volumetric X-ray CT images of both small fragments and intact, rolled scrolls. The dataset also contains 2D image labels that are used in the supervised training of an ink detection model. Labeling is enabled by aligning spectral photography of scroll fragments with X-ray CT images of the same fragments, thus creating a machine-learnable mapping between image spaces and modalities. This alignment permits supervised learning for the detection of "invisible" carbon ink in X-ray CT, a task that is "impossible" even for human expert labelers. To our knowledge, this is the first aligned dataset of its kind and is the largest dataset ever released in the heritage domain. Our method is capable of revealing accurate lines of text on scroll fragments with known ground truth. Revealed text is verified using visual confirmation, quantitative image metrics, and scholarly review. EduceLab-Scrolls has also enabled the discovery, for the first time, of hidden texts from the Herculaneum papyri, which we present here. We anticipate that the EduceLab-Scrolls dataset will generate more textual discovery as research continues.
CVMar 29
Ink Detection from Surface Topography of the Herculaneum PapyriGiorgio Angelotti, Federica Nicolardi, Paul Henderson et al.
Reading the Herculaneum papyri is challenging because both the scrolls and the ink, which is carbon-based, are carbonized. In X-ray radiography and tomography, ink detection typically relies on density- or composition-driven contrast, but carbon ink on carbonized papyrus provides little attenuation contrast. Building on the morphological hypothesis, we show that the surface morphology of written regions contains enough signal to distinguish ink from papyrus. To this end, we train machine learning models on three-dimensional optical profilometry from mechanically opened Herculaneum papyri to separate inked and uninked areas. We further quantify how lateral sampling governs learnability and how a native-resolution model behaves on coarsened inputs. We show that high-resolution topography alone contains a usable signal for ink detection. Diminishing segmentation performance with decreasing lateral resolution provides insight into the characteristic spatial scales that must be resolved on our dataset to exploit the morphological signal. These findings inform spatial resolution targets for morphology-based reading of closed scrolls through X-ray tomography.
CVFeb 25
A Framework for Cross-Domain Generalization in Coronary Artery Calcium Scoring Across Gated and Non-Gated Computed TomographyMahmut S. Gokmen, Moneera N. Haque, Steve W. Leung et al.
Coronary artery calcium (CAC) scoring is a key predictor of cardiovascular risk, but it relies on ECG-gated CT scans, restricting its use to specialized cardiac imaging settings. We introduce an automated framework for CAC detection and lesion-specific Agatston scoring that operates across both gated and non-gated CT scans. At its core is CARD-ViT, a self-supervised Vision Transformer trained exclusively on gated CT data using DINO. Without any non-gated training data, our framework achieves 0.707 accuracy and a Cohen's kappa of 0.528 on the Stanford non-gated dataset, matching models trained directly on non-gated scans. On gated test sets, the framework achieves 0.910 accuracy with Cohen's kappa scores of 0.871 and 0.874 across independent datasets, demonstrating robust risk stratification. These results demonstrate the feasibility of cross-domain CAC scoring from gated to non-gated domains, supporting scalable cardiovascular screening in routine chest imaging without additional scans or annotations.
IVNov 12, 2024
DINO-LG: A Task-Specific DINO Model for Coronary Calcium ScoringMahmut S. Gokmen, Caner Ozcan, Moneera N. Haque et al.
Coronary artery disease (CAD), one of the leading causes of mortality worldwide, necessitates effective risk assessment strategies, with coronary artery calcium (CAC) scoring via computed tomography (CT) being a key method for prevention. Traditional methods, primarily based on UNET architectures implemented on pre-built models, face challenges like the scarcity of annotated CT scans containing CAC and imbalanced datasets, leading to reduced performance in segmentation and scoring tasks. In this study, we address these limitations by incorporating the self-supervised learning (SSL) technique of DINO (self-distillation with no labels), which trains without requiring CAC-specific annotations, enhancing its robustness in generating distinct features. The DINO-LG model, which leverages label guidance to focus on calcified areas, achieves significant improvements, with a sensitivity of 89% and specificity of 90% for detecting CAC-containing CT slices, compared to the standard DINO model's sensitivity of 79% and specificity of 77%. Additionally, false-negative and false-positive rates are reduced by 49% and 59%, respectively, instilling greater confidence in clinicians when ruling out calcification in low-risk patients and minimizing unnecessary imaging reviews by radiologists. Further, CAC scoring and segmentation tasks are conducted using a basic UNET architecture, applied specifically to CT slices identified by the DINO-LG model as containing calcified areas. This targeted approach enhances CAC scoring accuracy by feeding the UNET model with relevant slices, significantly improving diagnostic precision, reducing both false positives and false negatives, and ultimately lowering overall healthcare costs by minimizing unnecessary tests and treatments, presenting a valuable advancement in CAD risk assessment.