CV, AI · Feb 3

Zero-shot large vision-language model prompting for automated bone identification in paleoradiology x-ray archives

Owen Dong, Lily Gao, Manish Kota, Bennett A. Landman, Jelena Bekvalac, Gaynor Western, Katherine D. Van Schaik

arXiv:2602.03750v1 · h-index: 2 · Medical Imaging 2026: Imaging Informatics

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck for paleoradiology experts: the time-consuming manual triage of X-ray archives in which bones are disarticulated and imaging conditions vary widely.

The researchers tackled the problem of identifying bones in heterogeneous paleoradiology X-ray images by developing a zero-shot prompting strategy using a Large Vision Language Model, achieving 92% accuracy for main bone identification, 80% for projection view, and 100% for laterality on a sample of 100 expert-reviewed images.

Paleoradiology, the use of modern imaging technologies to study archaeological and anthropological remains, offers new windows on millennial-scale patterns of human health. Unfortunately, the radiographs collected during field campaigns are heterogeneous: bones are disarticulated, positioning is ad hoc, and laterality markers are often absent. Additionally, factors such as age at death, age of bone, sex, and imaging equipment introduce high variability. Thus, content navigation, such as identifying a subset of images with a specific projection view, can be time-consuming and difficult, making efficient triaging a bottleneck for expert analysis. We report a zero-shot prompting strategy that leverages a state-of-the-art Large Vision Language Model (LVLM) to automatically identify the main bone, projection view, and laterality in such images. Our pipeline converts raw DICOM files to bone-windowed PNGs, submits them to the LVLM with a carefully engineered prompt, and receives structured JSON outputs, which are extracted and formatted into a spreadsheet in preparation for validation. On a random sample of 100 images reviewed by an expert board-certified paleoradiologist, the system achieved 92% main bone accuracy, 80% projection view accuracy, and 100% laterality accuracy, with low or medium confidence flags for ambiguous cases. These results suggest that LVLMs can substantially accelerate code word development for large paleoradiology datasets, allowing for efficient content navigation in future anthropology workflows.
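
The pipeline steps described in the abstract (intensity windowing, structured JSON replies, spreadsheet export) can be illustrated with a minimal sketch. This is not the authors' code: the field names in `FIELDS`, the window parameters, and the helper names are all hypothetical, and the DICOM decoding and the LVLM call itself are left out because the abstract does not specify them.

```python
import csv
import io
import json

# Hypothetical field names; the abstract does not give the actual JSON schema.
FIELDS = ["main_bone", "projection_view", "laterality", "confidence"]


def apply_window(pixels, center, width):
    """Map raw intensities to 0-255 with a linear display window.

    `center` and `width` are assumed parameters; the paper's bone-window
    settings are not stated in the abstract.
    """
    low = center - width / 2.0
    high = center + width / 2.0
    windowed = []
    for row in pixels:
        new_row = []
        for value in row:
            clipped = min(max(value, low), high)  # clamp to the window
            new_row.append(round((clipped - low) / (high - low) * 255))
        windowed.append(new_row)
    return windowed


def parse_reply(reply_text):
    """Parse the span from the first '{' to the last '}' as JSON.

    Returns empty strings for every field if no valid JSON object is found,
    so a malformed reply still yields a row for manual review.
    """
    empty = {name: "" for name in FIELDS}
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end < start:
        return empty
    try:
        data = json.loads(reply_text[start:end + 1])
    except json.JSONDecodeError:
        return empty
    return {name: str(data.get(name, "")) for name in FIELDS}


def to_csv(records):
    """Serialize (filename, parsed reply) pairs as CSV text for validation."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["file"] + FIELDS, lineterminator="\n"
    )
    writer.writeheader()
    for filename, parsed in records:
        writer.writerow({"file": filename, **parsed})
    return buffer.getvalue()
```

In this sketch a reply that cannot be parsed produces a blank row instead of raising an error, so every image still appears in the spreadsheet handed to the reviewer.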
