Improving Compositionality of Neural Networks by Decoding Representations to Inputs
This work addresses the problem of improving the interpretability and functionality of deep learning programs for researchers and practitioners, though it is a modest first step and incremental in nature.
The authors tackled the lack of compositionality in neural networks by introducing Decodable Neural Networks (DecNN), which jointly train a generative model to decode activations back to inputs, enabling applications like out-of-distribution detection and adversarial example detection while matching standard networks in accuracy.
In traditional software programs, it is easy to trace program logic from variables back to input, apply assertion statements to block erroneous behavior, and compose programs together. Although deep learning programs have demonstrated strong performance on novel applications, they sacrifice many of the functionalities of traditional software programs. With this as motivation, we take a modest first step towards improving deep learning programs by jointly training a generative model to constrain neural network activations to "decode" back to inputs. We call this design a Decodable Neural Network, or DecNN. Doing so enables a form of compositionality in neural networks, where one can recursively compose DecNN with itself to create an ensemble-like model with uncertainty. In our experiments, we demonstrate applications of this uncertainty to out-of-distribution detection, adversarial example detection, and calibration -- while matching standard neural networks in accuracy. We further explore this compositionality by combining DecNN with pretrained models, where we show promising results that neural networks can be regularized from using protected features.