Controllable Top-down Feature Transformer
This work addresses the challenge of interpreting CNN internal representations, offering computer vision researchers a controllable framework that incrementally improves the analysis of feature transformations.
The paper tackles the problem of understanding and controlling feature map transformations across convolutional network layers by introducing a top-down feature transformer (TFT) with explicit parameters. TFT captures data-independent transformations and shows advantages over data-driven methods in the case of spatial transformations.
We study the intrinsic transformation of feature maps across convolutional network layers under explicit top-down control. To this end, we develop a top-down feature transformer (TFT) with controllable parameters that accounts for the hidden-layer transformations while maintaining overall consistency across layers. The learned generators capture the underlying feature transformation processes, which are independent of the particular training images. The proposed TFT framework offers insight into an important problem: how CNN internal feature representations are formed and transformed under top-down processes. For spatial transformations, we demonstrate a significant advantage of TFT over existing data-driven approaches in building data-independent transformations. We also show that TFT can be adopted in other applications, such as data augmentation and image style transfer.