CV · Apr 17, 2023

BenchMD: A Benchmark for Unified Learning on Medical Images and Sensors

Stanford
arXiv:2304.08486v2 · 5 citations · h-index: 38 · Has Code
Originality: Synthesis-oriented
AI Analysis

This benchmark addresses the problem of assessing AI robustness for medical applications with diverse data types and distribution shifts, though it is incremental as it builds on existing datasets and methods.

The authors tackled the challenge of evaluating unified, modality-agnostic AI methods on diverse medical data by creating BenchMD, a benchmark combining 19 datasets across 7 modalities, and found that no technique performed strongly across all modalities, leaving room for improvement.

Medical data poses a daunting challenge for AI algorithms: it exists in many different modalities, experiences frequent distribution shifts, and suffers from a scarcity of examples and labels. Recent advances, including transformers and self-supervised learning, promise a more universal approach that can be applied flexibly across these diverse conditions. To measure and drive progress in this direction, we present BenchMD: a benchmark that tests how well unified, modality-agnostic methods, including architectures and training techniques (e.g. self-supervised learning, ImageNet pretraining), perform on a diverse array of clinically-relevant medical tasks. BenchMD combines 19 publicly available datasets for 7 medical modalities, including 1D sensor data, 2D images, and 3D volumetric scans. Our benchmark reflects real-world data constraints by evaluating methods across a range of dataset sizes, including challenging few-shot settings that incentivize the use of pretraining. Finally, we evaluate performance on out-of-distribution data collected at different hospitals than the training data, representing naturally-occurring distribution shifts that frequently degrade the performance of medical AI models. Our baseline results demonstrate that no unified learning technique achieves strong performance across all modalities, leaving ample room for improvement on the benchmark. Code is released at https://github.com/rajpurkarlab/BenchMD.
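The evaluation protocol described in the abstract (training at several dataset sizes, then scoring on both in-distribution and out-of-distribution test data) can be illustrated with a toy sketch. This is not the BenchMD code or API: the synthetic one-feature data, the threshold "model", the training sizes, the shift value, and every function name below are assumptions chosen only to show the shape of such a loop.

```python
import random


def make_split(n, shift, rng):
    """Synthetic binary data (hypothetical): class means at 0 and 2, plus a site shift."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(2.0 * label + shift, 1.0)
        data.append((x, label))
    return data


def fit_threshold(train):
    """Toy stand-in for a model: threshold midway between the two class means."""
    means = []
    for c in (0, 1):
        xs = [x for x, y in train if y == c]
        # Arbitrary fallback to the nominal class mean if a tiny subset lacks a class.
        means.append(sum(xs) / len(xs) if xs else 2.0 * c)
    return (means[0] + means[1]) / 2.0


def accuracy(threshold, data):
    """Fraction of examples whose thresholded prediction matches the label."""
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)


def evaluate(train_sizes, seed=0):
    """Train on growing subsets of the source site; test in- and out-of-distribution."""
    rng = random.Random(seed)
    train_pool = make_split(2000, 0.0, rng)  # source "hospital"
    test_id = make_split(2000, 0.0, rng)     # same distribution as training
    test_ood = make_split(2000, 1.5, rng)    # shifted "hospital" (assumed shift)
    results = {}
    for n in train_sizes:
        threshold = fit_threshold(train_pool[:n])
        results[n] = {
            "in_dist": accuracy(threshold, test_id),
            "ood": accuracy(threshold, test_ood),
        }
    return results


if __name__ == "__main__":
    for size, scores in evaluate([8, 64, 2000]).items():
        print(size, scores)
```

Reporting the two scores side by side for each training size is what exposes the gap the abstract refers to: a method can look strong in-distribution while degrading on data from a different site.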

Code Implementations: 1 repo