Adversarially trained neural representations may already be as robust as corresponding biological neural representations
This finding challenges the assumption that biological vision is inherently more robust than artificial vision, and could redirect research on adversarial robustness in AI.
The authors test the belief that mimicking primate neural representations yields adversarial robustness. They develop a method for attacking primate brain activity directly and find that biological neurons show susceptibility to adversarial perturbations comparable in magnitude to that of robustly trained artificial neural networks.
Visual systems of primates are the gold standard of robust perception. There is thus a general belief that mimicking the neural representations that underlie those systems will yield artificial visual systems that are adversarially robust. In this work, we develop a method for performing adversarial visual attacks directly on primate brain activity. We then leverage this method to demonstrate that the above-mentioned belief might not be well founded. Specifically, we report that the biological neurons that make up visual systems of primates exhibit susceptibility to adversarial perturbations that is comparable in magnitude to that of existing (robustly trained) artificial neural networks.
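To make the notion of an adversarial perturbation concrete, the following is a minimal sketch of a norm-bounded attack on a single unit's response. It is not the method developed in this work: the `tanh` response model, the weights `w`, the budget `eps`, and the function names are all hypothetical, chosen only to illustrate projected gradient ascent under an L2 constraint.

```python
import numpy as np

def response(x, w):
    # Hypothetical differentiable stand-in for one unit's response (assumption).
    return np.tanh(w @ x)

def response_grad(x, w):
    # Analytic gradient of tanh(w @ x) with respect to the input x.
    return (1.0 - np.tanh(w @ x) ** 2) * w

def l2_pgd(x, w, eps, steps=50, lr=0.1):
    """Projected gradient ascent on the response, keeping ||delta||_2 <= eps."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        g = response_grad(x + delta, w)
        g_norm = np.linalg.norm(g)
        if g_norm == 0:
            break  # response is saturated; no ascent direction left
        # Take a normalized ascent step scaled by the budget.
        delta = delta + lr * eps * g / g_norm
        # Project back onto the L2 ball of radius eps.
        d_norm = np.linalg.norm(delta)
        if d_norm > eps:
            delta = delta * (eps / d_norm)
    return delta

rng = np.random.default_rng(0)
w = rng.normal(size=16)   # hypothetical unit weights
x = rng.normal(size=16)   # hypothetical clean input
eps = 0.5                 # hypothetical perturbation budget

delta = l2_pgd(x, w, eps)
clean = response(x, w)
attacked = response(x + delta, w)
print(f"clean={clean:.4f} attacked={attacked:.4f} ||delta||={np.linalg.norm(delta):.4f}")
```

Under these assumptions, susceptibility can be read as how much the response changes for a given budget `eps`; comparing that change across systems at matched budgets is one way to think about "comparable in magnitude".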