CV Oct 7, 2021

Curating Subject ID Labels using Keypoint Signatures

arXiv:2110.04055v1
Originality: Incremental advance
AI Analysis

This work addresses data quality issues for researchers and clinicians who use medical image datasets, though the advance is incremental because it builds on existing keypoint methods.

The paper tackles subject ID label errors in medical image datasets, which can cause systematic errors in machine learning evaluation and potential patient misdiagnosis. It develops an efficient curation system based on the 3D image keypoint representation, which discovered previously unknown labeling errors in public brain MRI datasets.

Subject ID labels are unique, anonymized codes that can be used to group all images of a subject while maintaining anonymity. ID errors may be inadvertently introduced through manual error during enrollment, and may introduce systematic error into machine learning evaluation (e.g. due to double-dipping) or lead to patient misdiagnosis in clinical contexts. Here we describe a highly efficient system for curating subject ID labels in large generic medical image datasets, based on the 3D image keypoint representation, which recently led to the discovery of previously unknown labeling errors in widely-used public brain MRI datasets.
