CV | Feb 4

A labeled dataset of simulated phlebotomy procedures for medical AI: polygon annotations for object detection and human-object interaction

Raúl Jiménez Cruz, César Torres-Huitzil, Marco Franceschetti, Ronny Seiger, Luciano García-Bañuelos, Barbara Weber

arXiv:2602.04624v1 | 2.81 citations | h-index: 17

Originality: Synthesis-oriented

AI Analysis

This dataset addresses the need for standardized data in medical AI for phlebotomy training. Its contribution is incremental, since it targets a single, narrow domain.

The authors created a labeled dataset of 11,884 images from simulated phlebotomy procedures, with polygon annotations for five medical objects, to advance research in medical training automation and human-object interaction.

This data article presents a dataset of 11,884 labeled images documenting a simulated blood extraction (phlebotomy) procedure performed on a training arm. Images were extracted from high-definition videos recorded under controlled conditions and curated to reduce redundancy using Structural Similarity Index Measure (SSIM) filtering. An automated face-anonymization step was applied to all videos prior to frame selection. Each image contains polygon annotations for five medically relevant classes: syringe, rubber band, disinfectant wipe, gloves, and training arm. The annotations were exported in a segmentation format compatible with modern object detection frameworks (e.g., YOLOv8), ensuring broad usability. This dataset is partitioned into training (70%), validation (15%), and test (15%) subsets and is designed to advance research in medical training automation and human-object interaction. It enables multiple applications, including phlebotomy tool detection, procedural step recognition, workflow analysis, conformance checking, and the development of educational systems that provide structured feedback to medical trainees. The data and accompanying label files are publicly available on Zenodo.
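As a rough illustration of what a polygon label in a YOLOv8-style segmentation format looks like to downstream code, the sketch below parses one label line (a class id followed by normalized x/y pairs) and converts the polygon to pixel coordinates. The class order in `ASSUMED_CLASS_NAMES`, the helper names, and the 1920x1080 frame size are assumptions made for illustration only. The abstract does not state the id-to-name mapping or the exact resolution, so the dataset's own configuration on Zenodo should be treated as authoritative.

```python
from typing import List, Tuple

# ASSUMPTION: the abstract names these five classes, but the real
# id-to-name order is defined by the dataset's own config file.
ASSUMED_CLASS_NAMES = [
    "syringe", "rubber band", "disinfectant wipe", "gloves", "training arm",
]

Point = Tuple[float, float]


def parse_seg_line(line: str) -> Tuple[int, List[Point]]:
    """Parse one line of a YOLO-style segmentation label file.

    Expected layout: <class_id> x1 y1 x2 y2 ... with every coordinate
    normalized to [0, 1] relative to the image width and height.
    """
    parts = line.split()
    # A polygon needs at least 3 points, i.e. 6 numbers after the class id.
    if len(parts) < 7 or (len(parts) - 1) % 2 != 0:
        raise ValueError("expected a class id followed by >= 3 x/y pairs")
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    polygon = list(zip(coords[0::2], coords[1::2]))
    return class_id, polygon


def to_pixels(polygon: List[Point], width: int, height: int) -> List[Tuple[int, int]]:
    """Scale normalized polygon points to integer pixel coordinates."""
    return [(round(x * width), round(y * height)) for x, y in polygon]


if __name__ == "__main__":
    # Hypothetical label line, not taken from the dataset.
    class_id, polygon = parse_seg_line("0 0.1 0.2 0.5 0.2 0.5 0.8")
    # ASSUMPTION: 1920x1080 frames ("high-definition" is all the source says).
    print(ASSUMED_CLASS_NAMES[class_id], to_pixels(polygon, 1920, 1080))
```

Keeping the coordinates normalized until the last step means the same label file works unchanged if the images are resized for training.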
