CV · Mar 23, 2018

Generalizability vs. Robustness: Adversarial Examples for Medical Imaging

arXiv:1804.00504v1 · 102 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for more comprehensive evaluation methods in medical imaging to ensure model reliability. The advance is incremental, since it applies existing adversarial techniques to a specific domain.

The paper evaluates deep learning models for medical imaging on both generalizability and robustness, using adversarial examples as the robustness probe. The results show that models with similar generalizability can vary significantly in robustness, which produces performance gaps in extreme cases such as noise, outliers and ambiguous input data.

In this paper, for the first time, we propose an evaluation method for deep learning models that assesses the performance of a model not only in an unseen test scenario, but also in extreme cases of noise, outliers and ambiguous input data. To this end, we utilize adversarial examples, images that fool machine learning models, while looking imperceptibly different from original data, as a measure to evaluate the robustness of a variety of medical imaging models. Through extensive experiments on skin lesion classification and whole brain segmentation with state-of-the-art networks such as Inception and UNet, we show that models that achieve comparable performance regarding generalizability may have significant variations in their perception of the underlying data manifold, leading to an extensive performance gap in their robustness.
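
The idea of comparing clean accuracy with accuracy on adversarial inputs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which attacks the paper uses, so the well-known fast gradient sign method (FGSM) stands in here, and a toy logistic-regression classifier on synthetic 2-D data replaces the Inception and UNet models. The names `fgsm_perturb`, `accuracy` and `robustness_gap` are hypothetical helpers, not part of any library or of the paper's code.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping logits to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def fgsm_perturb(x, y, w, b, epsilon):
    """One-step FGSM against a logistic-regression model (illustrative stand-in).

    For p = sigmoid(x @ w + b) and binary cross-entropy loss, the gradient of
    the loss with respect to the input is (p - y) * w. FGSM moves each input
    by epsilon in the direction of the sign of that gradient.
    """
    p = sigmoid(x @ w + b)
    grad_x = (p - y)[:, None] * w[None, :]
    return x + epsilon * np.sign(grad_x)


def accuracy(x, y, w, b):
    """Fraction of samples whose thresholded prediction matches the label."""
    predictions = sigmoid(x @ w + b) >= 0.5
    return float(np.mean(predictions == (y == 1)))


def robustness_gap(x, y, w, b, epsilon):
    """Return (clean accuracy, adversarial accuracy, difference)."""
    clean = accuracy(x, y, w, b)
    adversarial = accuracy(fgsm_perturb(x, y, w, b, epsilon), y, w, b)
    return clean, adversarial, clean - adversarial


if __name__ == "__main__":
    # Synthetic data (assumption): two Gaussian blobs in 2-D, labels 0 and 1.
    rng = np.random.default_rng(0)
    x = np.vstack([rng.normal(-1.0, 0.5, size=(100, 2)),
                   rng.normal(1.0, 0.5, size=(100, 2))])
    y = np.concatenate([np.zeros(100), np.ones(100)])

    # Hand-picked weights (assumption) instead of a trained network.
    w = np.array([1.0, 1.0])
    b = 0.0

    for eps in (0.0, 0.25, 0.5, 1.0):
        clean, adv, gap = robustness_gap(x, y, w, b, eps)
        print(f"epsilon={eps:.2f} clean={clean:.3f} adversarial={adv:.3f} gap={gap:.3f}")
```

Two models with the same clean accuracy can report different gaps at the same `epsilon`, which is the kind of difference the abstract describes. For a real network the input gradient would come from automatic differentiation rather than the closed form used here.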

