Learning Robust Kernel Ensembles with Kernel Average Pooling
This paper addresses the adversarial vulnerability of neural networks, a problem of interest to machine learning practitioners, and offers an incremental improvement through a new architectural modification.
The paper improves neural network robustness against adversarial attacks by introducing Kernel Average Pooling (KAP), a building block that applies a mean filter along the kernel dimension of a layer's activation tensor. When trained on inputs perturbed with additive Gaussian noise, KAP models show substantial robustness gains on datasets such as CIFAR10 and ImageNet without training on adversarial examples.
Model ensembles have long been used in machine learning to reduce the variance in individual model predictions, making them more robust to input perturbations. Pseudo-ensemble methods like dropout have also been commonly used in deep learning models to improve generalization. However, the application of these techniques to improve neural networks' robustness against input perturbations remains underexplored. We introduce Kernel Average Pooling (KAP), a neural network building block that applies the mean filter along the kernel dimension of the layer activation tensor. We show that ensembles of kernels with similar functionality naturally emerge in convolutional neural networks equipped with KAP and trained with backpropagation. Moreover, we show that when trained on inputs perturbed with additive Gaussian noise, KAP models are remarkably robust against various forms of adversarial attacks. Empirical evaluations on CIFAR10, CIFAR100, TinyImagenet, and Imagenet datasets show substantial improvements in robustness against strong adversarial attacks such as AutoAttack without training on any adversarial examples.
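To make the operation concrete, the following is a minimal NumPy sketch of a mean filter applied along the kernel (channel) dimension of an activation tensor, together with the additive Gaussian noise perturbation mentioned above. It is an illustration under stated assumptions, not the authors' implementation: the function names `kernel_average_pool` and `add_gaussian_noise`, the `(N, C, H, W)` layout, stride 1, zero padding, the odd `pool_size`, and the value of `sigma` are all hypothetical choices made for this example.

```python
import numpy as np


def kernel_average_pool(x, pool_size=3):
    """Mean filter along the channel axis of an (N, C, H, W) tensor.

    Assumptions (not taken from the paper): stride 1, zero padding,
    and an odd pool_size so the number of channels is preserved.
    """
    if pool_size < 1 or pool_size % 2 == 0:
        raise ValueError("pool_size must be a positive odd integer")
    pad = pool_size // 2
    channels = x.shape[1]
    # Zero-pad only the channel axis.
    padded = np.pad(x, ((0, 0), (pad, pad), (0, 0), (0, 0)))
    out = np.zeros(x.shape, dtype=float)
    # Sum the pool_size shifted views, then divide to get the mean.
    for offset in range(pool_size):
        out += padded[:, offset:offset + channels]
    return out / pool_size


def add_gaussian_noise(x, sigma, rng):
    """Perturb inputs with additive zero-mean Gaussian noise."""
    return x + rng.normal(0.0, sigma, size=x.shape)


# Usage: four channels holding 0, 1, 2, 3 at a single spatial location.
activations = np.arange(4, dtype=float).reshape(1, 4, 1, 1)
pooled = kernel_average_pool(activations, pool_size=3)

rng = np.random.default_rng(0)
noisy_inputs = add_gaussian_noise(np.zeros((2, 3, 8, 8)), sigma=0.1, rng=rng)
```

Each output channel is the average of itself and its neighbours along the kernel axis, so neighbouring kernels contribute to the same output. This is the sense in which such a layer can act like a small ensemble of kernels.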