eSkinHealth: A Multimodal Dataset for Neglected Tropical Skin Diseases
The dataset addresses the lack of diverse dermatological data for underrepresented populations and could enable more equitable AI tools for global health, though the contribution is incremental, as it centers on dataset creation and annotation methods.
The authors address data scarcity in AI-driven diagnosis of skin Neglected Tropical Diseases (NTDs) by introducing eSkinHealth, a multimodal dataset of 5,623 images from 1,639 cases covering 47 skin diseases, collected in West Africa. They also propose an AI-expert collaboration framework for scalable annotation.
Skin Neglected Tropical Diseases (NTDs) impose severe health and socioeconomic burdens in impoverished tropical communities. Yet advancements in AI-driven diagnostic support are hindered by data scarcity, particularly for underrepresented populations and rare manifestations of NTDs. Existing dermatological datasets often lack the demographic diversity and disease spectrum crucial for developing reliable recognition models for NTDs. To address this, we introduce eSkinHealth, a novel dermatological dataset collected on-site in Côte d'Ivoire and Ghana. Specifically, eSkinHealth contains 5,623 images from 1,639 cases and encompasses 47 skin diseases, focusing uniquely on skin NTDs and rare conditions among West African populations. We further propose an AI-expert collaboration paradigm that applies foundation language and segmentation models to generate multimodal annotations efficiently under dermatologists' guidance. In addition to patient metadata and diagnosis labels, eSkinHealth includes semantic lesion masks, instance-specific visual captions, and clinical concepts. Overall, our work provides a valuable new resource and a scalable annotation framework, aiming to catalyze the development of more equitable, accurate, and interpretable AI tools for global dermatology.
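To make the annotation layers listed above concrete, the following is a minimal Python sketch of how a single case might be represented in memory. It is an illustration only: the class names (`SkinImage`, `SkinCase`), every field name, and the helper `images_per_case` are hypothetical and are not taken from the eSkinHealth release, whose actual file layout is not described here. The only figures drawn from the text are the image and case counts.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SkinImage:
    """One image with its per-image annotations (hypothetical field names)."""
    image_path: str
    mask_path: Optional[str] = None   # semantic lesion mask, if present
    caption: Optional[str] = None     # instance-specific visual caption
    concepts: List[str] = field(default_factory=list)  # clinical concepts


@dataclass
class SkinCase:
    """One case grouping several images (hypothetical field names)."""
    case_id: str
    diagnosis: str                                       # diagnosis label
    metadata: Dict[str, str] = field(default_factory=dict)  # patient metadata
    images: List[SkinImage] = field(default_factory=list)


def images_per_case(n_images: int, n_cases: int) -> float:
    """Return the mean number of images per case."""
    if n_cases <= 0:
        raise ValueError("n_cases must be positive")
    return n_images / n_cases


# Counts reported in the abstract: 5,623 images from 1,639 cases.
MEAN_IMAGES_PER_CASE = images_per_case(5623, 1639)
```

With the reported counts, the mean works out to roughly 3.4 images per case, which suggests that a case-level grouping like the one sketched here is a natural unit for train/test splitting, so that images from one case do not land on both sides of a split.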