Overcomplete Representations Against Adversarial Videos
This work tackles the adversarial robustness of deep neural networks on video data, an area that has been explored far less than images.
This paper addresses the limited research on defending against adversarial videos by proposing OUDefend, a novel Over-and-Under complete restoration network. OUDefend balances local and global features by learning both undercomplete and overcomplete representations, enhancing robustness against various adversarial video attacks, including additive, multiplicative, and physically realizable types.
The adversarial robustness of deep neural networks has been studied extensively, and various methods have been proposed to defend against adversarial images. However, only a handful of defense methods have been developed for attacked videos. In this paper, we propose a novel Over-and-Under complete restoration network for Defending against adversarial videos (OUDefend). Most restoration networks adopt an encoder-decoder architecture that first shrinks the spatial dimensions and then expands them back. This approach learns undercomplete representations, which have large receptive fields that collect global information but overlook local details. Overcomplete representations, on the other hand, have the opposite properties. Hence, OUDefend is designed to balance local and global features by learning both representations. We attach OUDefend to target video recognition models as a feature restoration block and train the entire network end-to-end. Experimental results show that defenses designed for images may be ineffective on videos, while OUDefend enhances robustness against different types of adversarial videos, ranging from additive and multiplicative attacks to physically realizable attacks. Code: https://github.com/shaoyuanlo/OUDefend
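To make the receptive-field contrast concrete, the following is a minimal sketch in Python; it is not code from the OUDefend repository. It applies the standard receptive-field recurrence for stacked convolutions and, as a simplifying assumption, models 2x upsampling as halving the spacing between neighbouring features while ignoring the footprint of the interpolation itself. The function name `receptive_field`, the three-stage depth, and the 3x3 kernels are illustrative choices, not details taken from the paper.

```python
def receptive_field(layers):
    """Receptive field, in input pixels along one axis, of a layer stack.

    Each layer is a (kernel_size, scale) pair: a convolution followed by a
    resampling step. `scale` is the factor by which resampling multiplies the
    spacing between neighbouring features: 2.0 for 2x downsampling, 0.5 for
    2x upsampling (a simplifying assumption), 1.0 for no resampling.
    """
    rf = 1.0    # a single input pixel covers itself
    jump = 1.0  # spacing between adjacent features, in input pixels
    for kernel_size, scale in layers:
        rf += (kernel_size - 1) * jump  # the convolution widens the field
        jump *= scale                   # resampling changes the spacing
    return rf


# Three 3x3 convolutions per branch (hypothetical depth, for illustration).
undercomplete = [(3, 2.0)] * 3  # shrink the spatial size at every stage
overcomplete = [(3, 0.5)] * 3   # enlarge the spatial size at every stage
no_resampling = [(3, 1.0)] * 3  # baseline without any resampling

print(receptive_field(undercomplete))  # 15.0
print(receptive_field(overcomplete))   # 4.5
print(receptive_field(no_resampling))  # 7.0
```

Under these assumptions, three downsampling stages cover 15 input pixels per axis while three upsampling stages cover only 4.5, which is one way to read the statement that the two representations have opposite properties: the undercomplete branch gathers global context, and the overcomplete branch stays focused on local detail.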