Maximum Mean Discrepancy Test is Aware of Adversarial Attacks
This work addresses the detection of adversarial attacks, a machine learning security problem. Its contribution is incremental, since it improves an existing test method.
The paper tackles the problem that the maximum mean discrepancy (MMD) test fails to detect distributional differences between natural and adversarial data. It addresses three key factors: replacing the Gaussian kernel with a deep kernel, maximizing test power, and handling non-independence with the wild bootstrap. With these changes, it verifies that the MMD test can become aware of adversarial attacks.
The maximum mean discrepancy (MMD) test could in principle detect any distributional discrepancy between two datasets. However, it has been shown that the MMD test is unaware of adversarial attacks -- the MMD test failed to detect the discrepancy between natural and adversarial data. Given this phenomenon, we raise a question: are natural and adversarial data really from different distributions? The answer is affirmative -- the previous use of the MMD test for this purpose missed three key factors, and accordingly, we propose three components. First, the Gaussian kernel has limited representation power, and we replace it with an effective deep kernel. Second, the test power of the MMD test was neglected, and we maximize it following asymptotic statistics. Finally, adversarial data may be non-independent, and we overcome this issue with the wild bootstrap. By taking care of the three factors, we verify that the MMD test is aware of adversarial attacks, which opens a new avenue for adversarial data detection based on two-sample tests.
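
To make the testing procedure concrete, the following is a minimal sketch of an MMD two-sample test whose null distribution is approximated with a wild bootstrap. It is not the authors' implementation: it uses a plain Gaussian kernel as a stand-in for the deep kernel, does not maximize test power, and assumes equal sample sizes. The function names, the autoregressive multiplier process, and the parameters `bandwidth`, `n_boot`, and `block` are illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_wild_bootstrap_test(x, y, bandwidth=1.0, n_boot=500, block=20.0, seed=0):
    """Return (MMD^2 V-statistic, wild-bootstrap p-value) for samples x and y.

    Assumption: x and y have the same shape (n, d). `block` controls how
    strongly the bootstrap multipliers are correlated along the sample index.
    """
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally sized samples")
    n = x.shape[0]
    rng = np.random.default_rng(seed)

    # h[i, j] = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i)
    k_xy = gaussian_kernel(x, y, bandwidth)
    h = (gaussian_kernel(x, x, bandwidth)
         + gaussian_kernel(y, y, bandwidth)
         - k_xy - k_xy.T)
    stat = h.mean()  # biased (V-statistic) estimate of MMD^2

    # Multipliers follow an AR(1) process with unit marginal variance,
    # so neighbouring samples receive correlated weights.
    a = np.exp(-1.0 / block)
    eps = rng.standard_normal((n_boot, n))
    w = np.empty((n_boot, n))
    w[:, 0] = eps[:, 0]
    for t in range(1, n):
        w[:, t] = a * w[:, t - 1] + np.sqrt(1.0 - a**2) * eps[:, t]
    w -= w.mean(axis=1, keepdims=True)  # centre each multiplier sequence

    # Bootstrapped statistics: (1 / n^2) * sum_ij w_i w_j h_ij, one per row of w.
    boot = ((w @ h) * w).sum(axis=1) / n**2
    p_value = (1 + np.sum(boot >= stat)) / (1 + n_boot)
    return stat, p_value
```

Calling `mmd_wild_bootstrap_test(natural, adversarial)` on two equally sized feature arrays (hypothetical names) returns the statistic and a p-value; the null hypothesis that both samples come from the same distribution would be rejected when the p-value falls below the chosen significance level. The correlated multipliers are what distinguish this from a permutation test, which assumes independent samples.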