GroSS: Group-Size Series Decomposition for Grouped Architecture Search
This work addresses the challenge of efficiently searching for optimal grouped convolutional architectures. The contribution is incremental: it builds on existing grouped convolution methods by enabling more flexible and concurrent exploration of configurations.
The paper tackles the problem of exploring grouped convolution configurations in neural networks by introducing Group-size Series (GroSS) decomposition, a method that enables dynamic and differentiable selection of factorisation rank. This allows differing numbers of groups, and their combinations across layers, to be trained simultaneously. The paper demonstrates the method's effectiveness through architecture searches on multiple datasets and networks.
We present a novel approach that explores the configuration of grouped convolutions within neural networks. Group-size Series (GroSS) decomposition is a mathematical formulation of tensor factorisation into a series of approximation terms of increasing rank. GroSS allows for dynamic and differentiable selection of factorisation rank, which is analogous to a grouped convolution. To the best of our knowledge, GroSS is therefore the first method to enable simultaneous training of differing numbers of groups within a single layer, as well as all possible combinations between layers. In doing so, GroSS is able to train an entire grouped convolution architecture search-space concurrently. We demonstrate this through architecture searches with performance objectives on multiple datasets and networks. GroSS enables more effective and efficient search for grouped convolutional architectures.
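
The idea of a series of terms whose partial sums correspond to grouped convolutions of increasing group size can be illustrated with a small sketch. It is not the authors' formulation; it rests on several simplifying assumptions: a 1x1 convolution is treated as a square channel-mixing matrix, "group size" means channels per group, the group sizes are listed in ascending order with each dividing the next, and the terms are built by a simple greedy residual scheme. The function names `block_mask`, `series_terms` and `partial_sum` are hypothetical.

```python
def block_mask(channels, group_size):
    """Return a 0/1 matrix that is 1 where row and column share a group."""
    return [[1.0 if r // group_size == c // group_size else 0.0
             for c in range(channels)] for r in range(channels)]


def series_terms(weight, group_sizes):
    """Split a square `weight` into additive terms, one per group size.

    Assumes `group_sizes` is ascending and each size divides the next, so
    the block-diagonal masks are nested. Each term stores only what the
    previous, more tightly grouped partial sum could not represent.
    """
    n = len(weight)
    running = [[0.0] * n for _ in range(n)]
    terms = []
    for g in group_sizes:
        mask = block_mask(n, g)
        term = [[(weight[r][c] - running[r][c]) * mask[r][c]
                 for c in range(n)] for r in range(n)]
        running = [[running[r][c] + term[r][c] for c in range(n)]
                   for r in range(n)]
        terms.append(term)
    return terms


def partial_sum(terms, k):
    """Sum the first k terms; the result is block-diagonal for group k."""
    n = len(terms[0])
    total = [[0.0] * n for _ in range(n)]
    for term in terms[:k]:
        total = [[total[r][c] + term[r][c] for c in range(n)]
                 for r in range(n)]
    return total


# Usage: a 4-channel layer decomposed for group sizes 1, 2 and 4.
weight = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0],
          [9.0, 10.0, 11.0, 12.0],
          [13.0, 14.0, 15.0, 16.0]]
terms = series_terms(weight, [1, 2, 4])
depthwise_like = partial_sum(terms, 1)  # only the diagonal survives
two_per_group = partial_sum(terms, 2)   # two 2x2 blocks survive
full = partial_sum(terms, 3)            # recovers the ungrouped weight
```

Under these assumptions, choosing how many terms to sum selects the effective group size from one shared set of parameters, which is the intuition behind training several group configurations at once. The differentiable selection mechanism described in the abstract is not modelled here.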