Towards a New Interpretation of Separable Convolutions
This work addresses a theoretical gap for researchers in deep learning, but it is incremental as it builds on existing interpretations without introducing new methods or data.
The paper tackles the problem of understanding the underlying mechanisms of separable convolutions in deep neural networks, proposing a hybrid interpretation to better explain their efficacy.
Separable convolutions have recently been explored in deep convolutional neural network architectures. Several researchers, most notably Chollet (2016) and Ghosh (2017), have used separable convolutions in their deep architectures and demonstrated state-of-the-art or near state-of-the-art performance. However, the underlying mechanism of action of separable convolutions is still not fully understood. Although their mathematical definition is well understood as a depthwise convolution followed by a pointwise convolution, deeper interpretations such as the extreme Inception hypothesis (Chollet, 2016) have failed to provide a thorough explanation of their efficacy. In this paper, we propose a hybrid interpretation that we believe is a better model for explaining the efficacy of separable convolutions.
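For readers who want the mathematical definition in concrete form, the following is a minimal NumPy sketch of a separable convolution as a depthwise convolution followed by a pointwise convolution. It is an illustration only, not the implementation used in the cited works. The function names, the channels-last layout, stride 1, "valid" padding, a depth multiplier of 1, and the use of cross-correlation (the usual deep learning convention) are all assumptions made for this example.

```python
import numpy as np


def depthwise_conv2d(x, dw_kernels):
    """Apply one spatial filter per input channel (no mixing across channels).

    x:          input of shape (H, W, C), channels-last (assumed layout)
    dw_kernels: filters of shape (kH, kW, C), one kH x kW filter per channel
    Assumes stride 1 and 'valid' padding.
    """
    H, W, C = x.shape
    kH, kW, kC = dw_kernels.shape
    assert kC == C, "need exactly one depthwise filter per input channel"
    out_h, out_w = H - kH + 1, W - kW + 1
    out = np.zeros((out_h, out_w, C), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + kH, j:j + kW, :]  # (kH, kW, C)
            # Sum over the spatial axes only; channels stay separate.
            out[i, j, :] = np.sum(patch * dw_kernels, axis=(0, 1))
    return out


def pointwise_conv2d(x, pw_weights):
    """1x1 convolution: mix channels independently at every spatial position.

    x:          input of shape (H, W, C)
    pw_weights: weights of shape (C, C_out)
    """
    return x @ pw_weights  # matmul broadcasts over the (H, W) axes


def separable_conv2d(x, dw_kernels, pw_weights):
    """Depthwise convolution followed by pointwise convolution."""
    return pointwise_conv2d(depthwise_conv2d(x, dw_kernels), pw_weights)


def parameter_counts(kH, kW, C, C_out):
    """Weight counts (biases ignored) for a regular vs. a separable layer."""
    regular = kH * kW * C * C_out
    separable = kH * kW * C + C * C_out
    return regular, separable
```

Under these assumptions, the composed operation equals a regular convolution whose kernel is constrained to the product of a per-channel spatial filter and a channel-mixing matrix, which is why it needs far fewer weights: `parameter_counts(3, 3, 32, 64)` gives 18432 for the regular layer and 2336 for the separable one.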