Adversarial Defense Through Network Profiling Based Path Extraction
This work addresses the security threat that adversarial attacks pose to users of deep neural networks, and represents an incremental improvement over existing defense methods.
The paper tackles the problem of defending against adversarial attacks on deep neural networks by proposing a profiling-based method to decompose models into functional blocks and extract effective paths, achieving better accuracy and broader applicability than state-of-the-art techniques.
Researchers have recently started decomposing deep neural network (DNN) models according to their semantics or functions. Recent work has shown that decomposed functional blocks are effective for defending against adversarial attacks, which add small perturbations to the input image to fool DNN models. This work proposes a profiling-based method to decompose DNN models into different functional blocks, which leads to the effective path as a new approach to exploring DNNs' internal organization. Specifically, per-image effective paths can be aggregated into a class-level effective path, through which we observe that adversarial images activate effective paths different from those of normal images. We propose an effective path similarity-based method to detect adversarial images with an interpretable model, which achieves better accuracy and broader applicability than the state-of-the-art technique.
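To make the aggregation and similarity steps concrete, the following is a minimal sketch and not the paper's implementation. It assumes that an effective path can be represented as a binary mask over a fixed set of network units, that class-level aggregation is an element-wise OR, that similarity is the fraction of an image's active units that also lie on the class-level path, and that detection is a simple threshold. The function names, the threshold value, and the toy masks are all hypothetical.

```python
import numpy as np


def class_level_path(per_image_paths):
    """Aggregate per-image binary path masks into one class-level mask.

    Assumption: aggregation is an element-wise OR over images of one class.
    """
    return np.any(np.asarray(per_image_paths, dtype=bool), axis=0)


def path_similarity(image_path, class_path):
    """Fraction of the image's active units that also lie on the class path.

    Assumption: this overlap ratio stands in for the similarity measure.
    """
    image_path = np.asarray(image_path, dtype=bool)
    class_path = np.asarray(class_path, dtype=bool)
    active = image_path.sum()
    if active == 0:
        return 0.0  # an empty path shares nothing with the class path
    return float(np.logical_and(image_path, class_path).sum() / active)


def looks_adversarial(image_path, class_path, threshold=0.5):
    """Flag an image whose similarity falls below a hypothetical threshold."""
    return path_similarity(image_path, class_path) < threshold


# Toy masks over six units, made up for illustration only.
normal_paths = [[1, 1, 0, 0, 0, 0],
                [1, 0, 1, 0, 0, 0]]
cls_path = class_level_path(normal_paths)   # units 0, 1, 2 are on the path

benign = [1, 1, 0, 0, 0, 0]    # fully inside the class-level path
suspect = [1, 0, 0, 1, 1, 1]   # mostly outside the class-level path

print(path_similarity(benign, cls_path))     # 1.0
print(path_similarity(suspect, cls_path))    # 0.25
print(looks_adversarial(suspect, cls_path))  # True
```

A fixed threshold is used here only to keep the sketch short; the abstract describes detection with an interpretable model, which would replace `looks_adversarial` in practice.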