EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching from Scratch
This work addresses the problem of efficient neural architecture search for researchers and practitioners, though its contribution is incremental because it builds on existing genetic approaches.
The paper tackles the challenge of designing neural network structures with little prior knowledge of the task domain. It proposes an ecologically-inspired genetic approach (EIGEN) that uses concepts such as succession and mimicry to search from scratch, achieving similar or better performance than existing genetic approaches at dramatically reduced computational cost: for example, 78.1% test accuracy on CIFAR-100 in 120 GPU hours, compared to 77.0% in more than 65,536 GPU hours.
Designing the structure of neural networks is considered one of the most challenging tasks in deep learning, especially when there is little prior knowledge about the task domain. In this paper, we propose an Ecologically-Inspired GENetic (EIGEN) approach that uses the concepts of succession, extinction, mimicry, and gene duplication to search for neural network structures from scratch, starting from a poorly initialized simple network and imposing few constraints during the evolution, as we assume no prior knowledge about the task domain. Specifically, we first use primary succession to rapidly evolve a population of poorly initialized neural network structures into a more diverse population, followed by a secondary succession stage for fine-grained searching based on the networks from the primary succession. Extinction is applied in both stages to reduce computational cost. Mimicry is employed during the entire evolution process to help inferior networks imitate the behavior of a superior network, and gene duplication is utilized to duplicate the learned blocks of novel structures; both help to find better network structures. Experimental results show that our proposed approach can achieve similar or better performance compared to existing genetic approaches with dramatically reduced computational cost. For example, the network discovered by our approach on the CIFAR-100 dataset achieves 78.1% test accuracy in 120 GPU hours, compared to 77.0% test accuracy in more than 65,536 GPU hours in [35].
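The two-stage search described above can be illustrated with a minimal toy sketch. It is not the authors' implementation: the genome (a list of layer widths), the `TARGET` widths, the toy `fitness` function standing in for validation accuracy, the mutation step sizes, and the survival rate are all hypothetical choices made for illustration. Mimicry is omitted because it requires actually training networks.

```python
import random

TARGET = [32, 64, 64, 128]  # hypothetical "ideal" layer widths, toy only


def fitness(genome):
    # Toy stand-in for validation accuracy: closer to TARGET is better.
    length_penalty = 50 * abs(len(genome) - len(TARGET))
    width_penalty = sum(abs(a - b) for a, b in zip(genome, TARGET))
    return -(length_penalty + width_penalty)


def make_mutate(step):
    # Returns a mutation operator whose width changes are bounded by `step`.
    def mutate(genome, rng):
        child = list(genome)
        i = rng.randrange(len(child))
        if rng.random() < 0.3:
            # Gene duplication: copy an existing block next to itself.
            child.insert(i, child[i])
        else:
            # Perturb one layer width, keeping it at least 1.
            child[i] = max(1, child[i] + rng.randint(-step, step))
        return child
    return mutate


def succession(population, mutate, generations, survival_rate, rng):
    # One succession stage: produce offspring, rank, then apply extinction.
    for _ in range(generations):
        offspring = [mutate(g, rng) for g in population]
        ranked = sorted(population + offspring, key=fitness, reverse=True)
        # Extinction: only the top fraction survives to the next generation.
        keep = max(1, int(len(ranked) * survival_rate))
        population = ranked[:keep]
    return population


def search(seed=0):
    rng = random.Random(seed)
    # Poorly initialized population: single-layer networks of width 1.
    population = [[1] for _ in range(8)]
    # Primary succession: large mutation steps for rapid, coarse exploration.
    population = succession(population, make_mutate(32), 30, 0.5, rng)
    # Secondary succession: small steps for fine-grained search.
    population = succession(population, make_mutate(4), 30, 0.5, rng)
    return population[0]


best = search()
```

Because parents compete with their offspring and only the top half survives, the best fitness in this sketch never decreases from one generation to the next. In a real setting each fitness evaluation means training a network, which is why discarding weak candidates early reduces cost.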