Geometric origin of adversarial vulnerability in deep learning
This work addresses adversarial vulnerability in deep learning systems, offering a method that enhances robustness while maintaining accuracy, which is crucial for real-world AI applications.
The paper tackles the challenge of balancing training accuracy and adversarial robustness in deep neural networks. It introduces a geometry-aware framework that uses layer-wise local training to sculpt internal representations, and it reports improved robustness against white-box and black-box attacks.
How to balance training accuracy and adversarial robustness has been a challenge since the birth of deep learning. Here, we introduce a geometry-aware deep learning framework that leverages layer-wise local training to sculpt the internal representations of deep neural networks. This framework promotes intra-class compactness and inter-class separation in feature space, leading to manifold smoothness and adversarial robustness against white-box and black-box attacks. The performance can be explained by an energy model with Hebbian coupling between elements of the hidden representation. Our results thus shed light on the physics of learning and point toward alignment between biological and artificial intelligence systems. Within this framework, a deep network can also assimilate new information into existing knowledge structures while reducing representation interference.
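To make "intra-class compactness and inter-class separation" concrete, the following is a minimal sketch of a loss that could be evaluated locally on one layer's hidden representation. It is an illustration under stated assumptions, not the loss used in the paper: the function name `local_geometry_loss`, the centroid-based formulation, and the `margin` parameter are all hypothetical choices made for this example.

```python
import numpy as np

def local_geometry_loss(h, y, margin=1.0):
    """Hypothetical layer-local objective (an assumption, not the paper's loss).

    h: (n_samples, n_features) hidden representation of one layer.
    y: (n_samples,) integer class labels.
    Returns (total, intra, inter).
    """
    h = np.asarray(h, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)  # sorted unique labels
    centroids = np.stack([h[y == c].mean(axis=0) for c in classes])

    # Intra-class compactness: mean squared distance of each sample
    # to the centroid of its own class (smaller is more compact).
    idx = np.searchsorted(classes, y)
    intra = np.mean(np.sum((h - centroids[idx]) ** 2, axis=1))

    # Inter-class separation: squared hinge penalty on every pair of
    # class centroids that lie closer together than `margin`.
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    i, j = np.triu_indices(len(classes), k=1)
    inter = np.mean(np.maximum(0.0, margin - dist[i, j]) ** 2) if len(i) else 0.0

    return intra + inter, intra, inter

# Two tight, well-separated clusters versus two fully overlapping classes.
y = np.array([0, 0, 1, 1])
h_separated = np.array([[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]])
h_overlapping = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])

loss_separated, _, _ = local_geometry_loss(h_separated, y)
loss_overlapping, _, _ = local_geometry_loss(h_overlapping, y)
print(loss_separated < loss_overlapping)
```

In this toy setting, the compact and well-separated representation receives the lower loss. In a layer-wise local training scheme, each layer would minimize such an objective on its own output rather than relying on a single loss backpropagated from the final layer; how the paper actually defines and optimizes its local objective is not specified in the abstract.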