Adversarial Attacks Against Deep Learning Systems for ICD-9 Code Assignment
The paper examines the vulnerability of deep learning systems for automated ICD-9 code assignment, showing that a simple typo-based adversarial attack can significantly degrade model performance even when fewer than 3% of the words in a discharge summary are perturbed.
This highlights a security risk for healthcare systems that rely on automated coding, with potential effects on patient care and billing accuracy. The contribution is incremental, however, since it applies known adversarial attack methods to a new domain.
Manual annotation of ICD-9 codes is a time-consuming and error-prone process. Deep-learning-based systems for automated ICD-9 coding have achieved competitive performance. Given the growing proliferation of electronic medical records, such automated systems are expected to eventually replace human coders. In this work, we investigate how a simple typo-based adversarial attack strategy can impact the performance of state-of-the-art models on the task of predicting the 50 most frequent ICD-9 codes from discharge summaries. Preliminary results indicate that a malicious adversary, using gradient information, can craft specific perturbations that appear as regular human typos in less than 3% of the words in a discharge summary and thereby significantly affect the performance of the baseline model.
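To make the idea of a gradient-guided typo attack concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy bag-of-words logistic model for a single code, with hypothetical weights (`WEIGHTS`, `BIAS`) and hypothetical helper names (`predict`, `swap_typo`, `typo_attack`). For such a linear model, the gradient of the predicted probability with respect to a token's presence is proportional to that token's weight, so the sketch ranks tokens by this gradient and misspells the most influential ones within a 3% word budget. A misspelled token falls out of the vocabulary and loses its contribution. An attack on a neural model would instead need gradients taken through its embedding layer.

```python
import math

# Hypothetical weights for one ICD-9 code; illustrative only, not from the paper.
WEIGHTS = {"pneumonia": 2.5, "fever": 1.0, "cough": 0.8, "denies": -0.5}
BIAS = -2.0


def predict(tokens):
    """Probability of the code under a toy bag-of-words logistic model.

    Out-of-vocabulary tokens (including misspelled ones) contribute nothing.
    """
    logit = BIAS + sum(WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-logit))


def swap_typo(word):
    """Swap the first pair of differing adjacent interior characters."""
    for i in range(1, len(word) - 2):
        if word[i] != word[i + 1]:
            return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    return word  # too short or too uniform to perturb


def typo_attack(tokens, budget_pct=3):
    """Misspell the tokens with the largest positive gradient, within budget."""
    p = predict(tokens)
    # d(probability) / d(token presence) = p * (1 - p) * weight for this model.
    grads = [p * (1 - p) * WEIGHTS.get(t, 0.0) for t in tokens]
    max_edits = len(tokens) * budget_pct // 100
    ranked = sorted(range(len(tokens)), key=lambda i: grads[i], reverse=True)
    adversarial = list(tokens)
    for i in ranked[:max_edits]:
        if grads[i] > 0:  # only perturb tokens that support the prediction
            adversarial[i] = swap_typo(tokens[i])
    return adversarial


# A synthetic 100-token "discharge summary": 97 filler words and 3 informative ones.
doc = ["the"] * 97 + ["pneumonia", "fever", "cough"]
adv = typo_attack(doc)
changed = sum(a != b for a, b in zip(doc, adv))
print(changed, predict(doc) > 0.5, predict(adv) > 0.5)
```

In this toy setting, three misspellings out of 100 tokens are enough to move the prediction from above the 0.5 decision threshold to below it. The example only illustrates why a small perturbation budget can matter; it says nothing about the effect sizes reported in the paper.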