Neural Anisotropy Directions
This work helps researchers understand architectural biases in deep learning, offering insight into why CNN performance depends on the direction of the discriminative feature. The contribution is incremental, since it builds on existing analyses of inductive bias.
The paper analyzes how network architecture shapes the inductive bias of deep classifiers. It shows that many CNNs struggle to classify linearly separable distributions, depending on the direction of the discriminative feature, and it introduces neural anisotropy directions (NADs), vectors that encode this directional bias. For CIFAR-10, NADs characterize the features that CNNs use to discriminate between classes.
In this work, we analyze the role of the network architecture in shaping the inductive bias of deep classifiers. To that end, we start by focusing on a very simple problem, i.e., classifying a class of linearly separable distributions, and show that, depending on the direction of the discriminative feature of the distribution, many state-of-the-art deep convolutional neural networks (CNNs) have a surprisingly hard time solving this simple task. We then define neural anisotropy directions (NADs) as the vectors that encapsulate the directional inductive bias of an architecture. These vectors, which are specific to each architecture and hence act as a signature, encode the preference of a network to separate the input data based on some particular features. We provide an efficient method to identify NADs for several CNN architectures and thus reveal their directional inductive biases. Furthermore, we show that, for the CIFAR-10 dataset, NADs characterize the features used by CNNs to discriminate between different classes.
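To make the linearly separable task concrete, the following is a minimal sketch of one way such a dataset could be generated. It is not taken from the text above: the function name `sample_linearly_separable`, the parameters `epsilon` and `sigma`, and the choice of Gaussian noise restricted to the subspace orthogonal to the discriminative direction `v` are all assumptions made for illustration.

```python
import numpy as np

def sample_linearly_separable(n, v, epsilon=1.0, sigma=1.0, seed=0):
    """Draw n labeled points that are separable along direction v.

    Hypothetical helper: the construction (signal along v, Gaussian noise
    orthogonal to v) is an assumption, not the authors' stated procedure.
    """
    rng = np.random.default_rng(seed)
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)              # unit discriminative direction
    y = rng.choice([-1.0, 1.0], size=n)    # binary labels in {-1, +1}
    noise = sigma * rng.standard_normal((n, v.size))
    noise -= np.outer(noise @ v, v)        # remove the noise component along v
    x = epsilon * y[:, None] * v + noise   # class signal lives only along v
    return x, y

# Example: 16-dimensional inputs whose only discriminative feature is axis 3.
v = np.eye(16)[3]
x, y = sample_linearly_separable(1000, v)

# A linear classifier aligned with v separates the two classes exactly.
predictions = np.sign(x @ v)
```

Under these assumptions, changing `v` changes only the direction of the discriminative feature while the task stays equally easy for a linear classifier, which is the kind of controlled comparison the abstract describes for probing an architecture's directional bias.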