Adversarial Attacks Against Medical Deep Learning Systems
This work examines security risks in medical deep learning systems, arguing that healthcare's financial incentives and technical vulnerabilities make these systems plausible targets for fraud. The contribution is domain-specific and incremental, but critical.
The paper demonstrates that adversarial examples can manipulate state-of-the-art deep learning classifiers in three clinical domains. Both white-box and black-box attacks achieve high success rates, exposing vulnerabilities in medical AI systems.
The discovery of adversarial examples has raised concerns about the practical deployment of deep learning systems. In this paper, we demonstrate that adversarial examples are capable of manipulating deep learning systems across three clinical domains. For each of our representative medical deep learning classifiers, both white-box and black-box attacks were highly successful. Our models are representative of the current state of the art in medical computer vision and, in some cases, directly reflect architectures already being deployed in real-world clinical settings. In addition to the technical contribution of our paper, we synthesize a large body of knowledge about the healthcare system to argue that medicine may be uniquely susceptible to adversarial attacks, in terms of both monetary incentives and technical vulnerability. To this end, we outline the healthcare economy and the incentives it creates for fraud, and we provide concrete examples of how and why such attacks could realistically be carried out. We urge practitioners to be aware of current vulnerabilities when deploying deep learning systems in clinical settings, and we encourage the machine learning community to further investigate the domain-specific characteristics of medical learning systems.
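For readers unfamiliar with the term, the following is a minimal sketch of how a white-box adversarial example can be constructed. It is an illustration under stated assumptions, not the method used in this paper: it applies the standard one-step fast gradient sign method (FGSM) to a toy logistic classifier that stands in for a deep network. The function `fgsm_perturb`, the weights `w` and `b`, the 64-feature input `x`, and the budget `eps` are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """One-step FGSM against a logistic classifier p = sigmoid(w . x + b).

    White-box setting: the attacker knows w and b, so the gradient of the
    cross-entropy loss with respect to the input is available in closed form.
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w                  # d(loss)/dx for cross-entropy loss
    x_adv = x + eps * np.sign(grad_x)     # step in the direction that raises the loss
    return np.clip(x_adv, 0.0, 1.0)       # keep values in a valid pixel range

# Toy stand-in for an image classifier (all values are assumptions).
w = np.linspace(-1.0, 1.0, 64)            # hypothetical learned weights
b = 0.2                                   # hypothetical bias
x = np.full(64, 0.5)                      # a "clean" input with true label 1
y = 1.0
eps = 0.05                                # maximum change allowed per feature

x_adv = fgsm_perturb(x, y, w, b, eps)
p_clean = sigmoid(w @ x + b)              # model confidence for class 1 on the clean input
p_adv = sigmoid(w @ x_adv + b)            # model confidence after the perturbation

print(f"clean: {p_clean:.3f}  adversarial: {p_adv:.3f}")
print(f"largest per-feature change: {np.max(np.abs(x_adv - x)):.3f}")
```

In this toy setting, a change of at most `eps` per feature is enough to move the prediction across the 0.5 decision threshold. A black-box attacker would not have access to `w` and `b` and would have to estimate the gradient by other means, for example by querying the model or attacking a substitute model.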